Query Syntax
============

A querexfuzz query is a single string made up of optional clauses.
**Clauses are order-sensitive** and must appear in the sequence shown below.
All clauses are optional; an empty string returns all base columns for all rows.

.. code-block:: text

   [verbose] [recent] [top N | bottom N] [select ...] [field ~ regex | ! regex] [where expr] [order by cols] [@[field] date_spec] [# fuzzy_term]

The fuzzy clause (``#``) must always be **last**.

----

.. contents:: Clauses
   :local:
   :depth: 1

----

Flags: ``verbose`` and ``recent``
----------------------------------

``verbose``
    Prints query details to the logger. No effect on the returned DataFrame.

``recent``
    Sorts results by the configured ``recent_field`` (most recent first) after
    all filters are applied. Equivalent to ``order by -<recent_field>`` but uses
    the engine's configured default rather than naming a column explicitly.

.. code-block:: python

   df.querex("verbose recent top 5")

----

``top N`` / ``bottom N``
------------------------

Limits the result to the first or last *N* rows **after** sorting and filtering.

.. code-block:: python

   df.querex("top 10")
   df.querex("bottom 5")
   df.querex("recent top 3")       # 3 most recent
   df.querex("top 2 # python")     # 2 best fuzzy matches

----

``select``
----------

Controls which columns are returned. When no ``select`` clause is given, the
engine returns the configured ``base_cols``.

.. list-table::
   :header-rows: 1
   :widths: 30 70

   * - Syntax
     - Meaning
   * - *(omitted)*
     - ``base_cols`` (the configured default set)
   * - ``select *``
     - ``base_cols``
   * - ``select **``
     - All columns in the DataFrame
   * - ``select a, b, c``
     - Named columns only
   * - ``select *, a``
     - ``base_cols`` plus column ``a``
   * - ``select *, -a``
     - ``base_cols`` minus column ``a``
   * - ``select *, !a``
     - Same as ``select *, -a`` (``-`` and ``!`` are equivalent for exclusion)
   * - ``select **, -a``
     - All columns minus ``a``

.. code-block:: python

   df.querex("select name, salary")
   df.querex("select *, -salary")
   df.querex("where age > 30 select name, dept, salary")

----

Regex: ``!`` and ``~``
-----------------------

Filters rows using a regular expression. Matching is **case-insensitive**.

``! pattern``
    Applies the regex to the configured ``bang_field`` (the default regex
    target column).

``field ~ pattern``
    Applies the regex to the named column.

Patterns can be plain text or enclosed in slashes (``/pattern/``); both forms
are equivalent.

Multiple regex clauses can be chained with ``and``:

.. code-block:: python

   df.querex("! python")                  # bang_field contains 'python'
   df.querex("! /^Alice/")                # bang_field starts with 'Alice'
   df.querex("dept ~ ^Eng")               # dept starts with 'Eng'
   df.querex("! python and dept ~ Eng")   # both conditions

----

``where``
---------

SQL-style row filter, passed directly to :meth:`pandas.DataFrame.query`.
Supports all operators that ``df.query()`` accepts.

Supported comparison operators: ``==``, ``!=``, ``>``, ``>=``, ``<``, ``<=``

Logical operators: ``and``, ``or``

Parentheses for grouping are supported:

.. code-block:: python

   df.querex("where salary > 90000")
   df.querex("where dept == 'Engineering' and salary >= 95000")
   df.querex("where dept == 'Engineering' or dept == 'Finance'")
   df.querex("where (dept == 'Engineering' or dept == 'Finance') and salary > 85000")

String values must be quoted (single or double quotes).

----

``order by`` / ``sort by``
--------------------------

Sorts the result by one or more columns. ``sort by`` is an alias for
``order by``. Prefix a column name with ``-`` for descending order:

.. code-block:: python

   df.querex("order by salary")           # ascending
   df.querex("order by -salary")          # descending
   df.querex("sort by dept, -salary")     # dept asc, then salary desc
   df.querex("order by dept, name")       # multiple columns

When a fuzzy ``#`` clause is present and no explicit ``order by`` is given,
results are automatically sorted by fuzzy score descending.
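
The ``-`` prefix rule for sort columns can be illustrated outside the engine.
The sketch below is not querexfuzz's own parser: ``parse_order_by`` is a
hypothetical helper that assumes a comma-separated column list. It shows how
such a spec might map onto the ``by`` and ``ascending`` arguments of
:meth:`pandas.DataFrame.sort_values`.

.. code-block:: python

   def parse_order_by(spec):
       """Split a spec such as 'dept, -salary' into (columns, ascending flags).

       Hypothetical helper for illustration only; not part of querexfuzz.
       """
       columns, ascending = [], []
       for token in spec.split(","):
           token = token.strip()
           if not token:
               continue  # ignore empty entries such as a trailing comma
           descending = token.startswith("-")
           columns.append(token[1:].strip() if descending else token)
           ascending.append(not descending)
       return columns, ascending


   columns, ascending = parse_order_by("dept, -salary")
   # columns   -> ['dept', 'salary']
   # ascending -> [True, False]
   # With pandas, a comparable sort would be:
   #     df.sort_values(by=columns, ascending=ascending)
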

----

``@`` Date range
----------------

Filters rows whose date column falls within a relative date window. Uses the
configured ``default_date_field`` unless a field name is specified.

**Syntax:** ``@[field] unit[-start[:end]]``

**Date units:**

.. list-table::
   :header-rows: 1
   :widths: 15 85

   * - Unit
     - Meaning
   * - ``d``
     - Days
   * - ``w``
     - Weeks
   * - ``m``
     - Months (calendar months via ``dateutil.relativedelta``)
   * - ``q``
     - Quarters (3-month blocks)
   * - ``y``
     - Years
   * - ``h``
     - Hours
   * - ``c``
     - Calendar years (Jan 1 – Dec 31 boundaries)

**Forms:**

.. list-table::
   :header-rows: 1
   :widths: 25 75

   * - Form
     - Meaning
   * - ``@m``
     - Last 1 month (unit only → start defaults to 1, end = now)
   * - ``@m-3``
     - Last 3 months
   * - ``@y-2``
     - Last 2 years
   * - ``@y-5:1``
     - Between 5 years ago and 1 year ago
   * - ``@joined y-2``
     - Last 2 years on column ``joined`` (explicit field)

.. code-block:: python

   df.querex("@d-30")          # last 30 days
   df.querex("@m-3")           # last 3 calendar months
   df.querex("@y-1")           # last year
   df.querex("@y-5:1")         # 5 years ago to 1 year ago
   df.querex("@joined y-2")    # named field
   df.querex("@c-1")           # last calendar year (Jan 1 to Dec 31)

Multiple date clauses can be chained (one per field):

.. code-block:: python

   df.querex("@created y-1 @modified m-3")

----

``#`` Fuzzy search
------------------

Performs a fuzzy search across the configured ``fuzzy.fields`` using
skimmatch. **Must be the last clause.**

Results include a score column (default name ``score``) and are sorted by
score descending unless an explicit ``order by`` overrides this.

The fuzzy matcher is built once on the first call per DataFrame and cached for
the lifetime of the engine; subsequent calls reuse it at near-zero cost.

When combined with pre-filter clauses (``where``, regex, date), the matcher
still runs over the full DataFrame and the results are intersected with the
filtered rows using 5× over-fetch:

.. code-block:: python

   df.querex("# python")                                # fuzzy over all rows
   df.querex("top 5 # engineering")                     # top 5 matches
   df.querex("where dept == 'Engineering' # python")    # filter then fuzzy

----

Combining clauses
-----------------

All clauses can be combined freely, subject to the ordering constraint:

.. code-block:: python

   # Most recently hired Engineers, showing name and salary
   df.querex("recent top 5 where dept == 'Engineering' select name, salary")

   # Hired in last year, sorted by salary descending, fuzzy on 'python'
   df.querex("order by -salary @y-1 # python")

   # Complex: verbose, recent 3, filter, date, select, fuzzy
   df.querex("verbose recent top 3 where salary > 80000 @y-2 select name, dept # alice")
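
The date clauses in the examples above (``@y-1``, ``@y-2``) follow the
``@[field] unit[-start[:end]]`` grammar. As a rough illustration of how a
single date clause decomposes, the sketch below parses one with a regular
expression. ``DATE_SPEC`` and ``parse_date_spec`` are hypothetical names, not
part of querexfuzz, and representing "now" as an ``end`` of ``0`` is an
assumption made for this example only.

.. code-block:: python

   import re

   # Pattern for the documented form: @[field] unit[-start[:end]]
   DATE_SPEC = re.compile(
       r"^@(?:(?P<field>\w+)\s+)?"      # optional field name, then whitespace
       r"(?P<unit>[dwmqyhc])"           # one of the documented unit letters
       r"(?:-(?P<start>\d+)"            # optional start offset
       r"(?::(?P<end>\d+))?)?$"         # optional end offset
   )


   def parse_date_spec(spec):
       """Break a single date clause into field, unit, start and end.

       Hypothetical helper for illustration only; not part of querexfuzz.
       """
       match = DATE_SPEC.match(spec.strip())
       if match is None:
           raise ValueError(f"not a date spec: {spec!r}")
       return {
           "field": match["field"],            # None: use default_date_field
           "unit": match["unit"],
           "start": int(match["start"] or 1),  # unit-only form defaults to 1
           "end": int(match["end"] or 0),      # assumption: 0 stands for "now"
       }


   parse_date_spec("@y-5:1")
   # -> {'field': None, 'unit': 'y', 'start': 5, 'end': 1}
   parse_date_spec("@joined y-2")
   # -> {'field': 'joined', 'unit': 'y', 'start': 2, 'end': 0}
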