API Reference


Public API

These are the names exported by import querexfuzz.

querexfuzz.querexfuzz_from_df(df, *, base_cols=None, bang_field=None, default_date_field=None, recent_field=None, fuzzy_fields=None, score_col_name='score', attach=True)

Create a Querexfuzz by inspecting df.

By default, every column whose name does not start with _ is added to base_cols, and every date column is added to date_fields. If there is exactly one date column, it becomes both the default_date_field (the default for @ clauses) and the recent_field (the field used to sort for recent).

If fuzzy_fields is None, all object columns are selected as fuzzy fields. If there is exactly one object column, it becomes the bang field (the default for regex clauses); likewise, a single fuzzy field becomes bang_field when bang_field is omitted.

Highlight mode is enabled when len(fuzzy_fields) == 1.

Parameters

  • bang_field, default_date_field, recent_field – Optional overrides for the defaults inferred from df, as described above.

  • score_col_name – Name for the score column in fuzzy matches.

  • attach – Attach the querexfuzz method to df. Defaults to True.
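
The inference rules described for querexfuzz_from_df can be sketched in plain Python. This is an illustration of the documented behaviour only, not the library's implementation: infer_defaults, InferredDefaults, and the 'date' / 'object' kind labels are hypothetical names introduced here, and a dict of column kinds stands in for a DataFrame.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class InferredDefaults:
    """Hypothetical container for the values the docstring says are inferred."""
    base_cols: List[str] = field(default_factory=list)
    date_fields: List[str] = field(default_factory=list)
    default_date_field: Optional[str] = None
    recent_field: Optional[str] = None
    fuzzy_fields: List[str] = field(default_factory=list)
    bang_field: Optional[str] = None
    highlight: bool = False


def infer_defaults(kinds: Dict[str, str]) -> InferredDefaults:
    """kinds maps column name -> 'date', 'object', or any other label (assumed)."""
    out = InferredDefaults()
    # Columns whose names start with "_" are left out of base_cols.
    out.base_cols = [c for c in kinds if not c.startswith("_")]
    # Every date column goes into date_fields.
    out.date_fields = [c for c, k in kinds.items() if k == "date"]
    # A single date column becomes both the default date field and the recent field.
    if len(out.date_fields) == 1:
        out.default_date_field = out.recent_field = out.date_fields[0]
    # With fuzzy_fields left as None, all object columns are fuzzy fields.
    out.fuzzy_fields = [c for c, k in kinds.items() if k == "object"]
    # A single object column becomes the bang field.
    if len(out.fuzzy_fields) == 1:
        out.bang_field = out.fuzzy_fields[0]
    # Highlight mode follows len(fuzzy_fields) == 1.
    out.highlight = len(out.fuzzy_fields) == 1
    return out


single = infer_defaults({"title": "object", "created": "date", "_id": "int"})
multi = infer_defaults(
    {"title": "object", "author": "object", "opened": "date", "closed": "date"}
)
```

With one date column and one object column (single), every default is filled in. With two of each (multi), default_date_field, recent_field and bang_field stay None and must be passed explicitly.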

querexfuzz.querexfuzz_help() → str

Help on the grammar.


Querexfuzz class

class querexfuzz.Querexfuzz(*, config_path: str | Path | None = None, **kwargs)

Bases: object

Manages configuration and attachment of the .querex method.

static parse(expr)

Convenience method for testing parser.

attach_to(df: DataFrame, method_name: str | None = None, alias: str | None = 'q', mutable: bool = False) → DataFrame

Attaches the query method to a DataFrame instance.

Parameters:
  • df (pd.DataFrame) – The DataFrame to modify.

  • alias (str | None, optional) – A short alias for the query method. If None or ‘’, no alias is created. Defaults to ‘q’.

  • mutable (bool) – If True, the fuzzy matcher is rebuilt on every call instead of being cached. Use when the DataFrame’s contents change between queries. Defaults to False.

Returns:

The same DataFrame, now with the query method attached.

Return type:

pd.DataFrame
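
The alias and mutable parameters can be illustrated with a small stand-in. This sketch assumes nothing about the library's internals: Frame, build_index, the "querex" default method name, and the _builds counter are all hypothetical, and a set of lowercased strings stands in for the fuzzy matcher.

```python
import types


class Frame:
    """Hypothetical stand-in for a DataFrame: just a list of strings."""

    def __init__(self, rows):
        self.rows = list(rows)


def attach_to(obj, method_name="querex", alias="q", mutable=False):
    """Attach a query method (and optional alias) to one instance."""
    builds = {"count": 0}
    cache = {}

    def build_index(frame):
        # Stands in for building the fuzzy matcher.
        builds["count"] += 1
        return {row.lower() for row in frame.rows}

    def query(self, term):
        # mutable=True rebuilds on every call; otherwise the first index is reused.
        if mutable or "index" not in cache:
            cache["index"] = build_index(self)
        return term.lower() in cache["index"]

    bound = types.MethodType(query, obj)
    setattr(obj, method_name, bound)
    if alias:  # None or '' means no alias is created
        setattr(obj, alias, bound)
    obj._builds = builds  # exposed only so the sketch can be inspected
    return obj


cached = attach_to(Frame(["Alpha"]))
hit_before = cached.q("alpha")       # builds the index once
cached.rows.append("Beta")
stale = cached.querex("beta")        # cached index does not see the new row

fresh = attach_to(Frame(["Alpha"]), mutable=True)
fresh.rows.append("Beta")
seen = fresh.q("beta")               # index rebuilt, so the new row is found

no_alias = attach_to(Frame([]), alias=None)
```

The trade-off is the one the parameter description names: the cached form avoids rebuilding the matcher but can go stale if the data changes between queries.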


Configuration

class querexfuzz.config.QuerexfuzzConfig(*, base_cols: list[str] = <factory>, bang_field: str | None = None, date_fields: list[str] = <factory>, default_date_field: str | None = None, recent_field: str | None = None, fuzzy: FuzzyConfig = <factory>)

Bases: BaseModel

Main configuration for the Querexfuzz engine.

base_cols: list[str]
bang_field: str | None
date_fields: list[str]
default_date_field: str | None
recent_field: str | None
fuzzy: FuzzyConfig
classmethod from_yaml(path: Path)

Loads configuration from a YAML file.

model_config = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class querexfuzz.config.FuzzyConfig(*, fields: list[str] | Literal['all'] = 'all', limit: int = 100, score_col_name: str = 'score', highlight: bool = True)

Bases: BaseModel

Configuration for the fuzzy searcher.

fields: list[str] | Literal['all']
limit: int
score_col_name: str
highlight: bool
model_config = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
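
To show how the two configuration classes nest, here is a standard-library mirror of their fields and documented defaults. It is a sketch only: the real classes are pydantic models, the *Sketch names and from_mapping are hypothetical, empty lists are assumed for the <factory> defaults, and the nested-mapping layout (as might be loaded from a YAML file) is an assumption that simply follows the field names.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class FuzzyConfigSketch:
    fields: Union[List[str], str] = "all"   # a list of names, or the literal 'all'
    limit: int = 100
    score_col_name: str = "score"
    highlight: bool = True


@dataclass
class QuerexfuzzConfigSketch:
    base_cols: List[str] = field(default_factory=list)    # assumed empty default
    bang_field: Optional[str] = None
    date_fields: List[str] = field(default_factory=list)  # assumed empty default
    default_date_field: Optional[str] = None
    recent_field: Optional[str] = None
    fuzzy: FuzzyConfigSketch = field(default_factory=FuzzyConfigSketch)


def from_mapping(data: dict) -> QuerexfuzzConfigSketch:
    """Build the nested config from a plain mapping (hypothetical helper)."""
    data = dict(data)  # avoid mutating the caller's mapping
    fuzzy = FuzzyConfigSketch(**data.pop("fuzzy", {}))
    return QuerexfuzzConfigSketch(fuzzy=fuzzy, **data)


cfg = from_mapping(
    {
        "base_cols": ["title", "created"],
        "date_fields": ["created"],
        "default_date_field": "created",
        "fuzzy": {"fields": ["title"], "limit": 25},
    }
)
defaults = QuerexfuzzConfigSketch()
```

Keys omitted from the mapping fall back to the defaults shown in the class signatures, so cfg.fuzzy.score_col_name is still 'score' and cfg.recent_field is still None.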


Parser

querexfuzz.parser.parser(text: str) → dict

Parses a querexfuzz query string into a specification dictionary.


Engine

querexfuzz.engine.execute_query(df: DataFrame, spec: dict, config: QuerexfuzzConfig, engine=None) → DataFrame

Apply the parsed query specification to a DataFrame.

Filter operations (WHERE, regex, date range, sort, head/tail, column select) all return new DataFrames, so no upfront copy is needed. The only in-place write is date-column type coercion; a copy is made there, after prior filters have already reduced the DataFrame.

original_df is kept as a reference to the full unfiltered frame so the fuzzy matcher (built once on engine._fuzzy_matcher) always searches the complete data set; pre-filter results are intersected afterwards.
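
The "search the full data set, then intersect with the pre-filter results" order can be sketched without pandas. This is an illustration under stated assumptions, not the engine's code: the rows, fuzzy_ids, and the 0.6 cutoff are hypothetical, and difflib.SequenceMatcher stands in for whatever fuzzy matcher the engine actually uses.

```python
from difflib import SequenceMatcher

rows = [
    {"id": 0, "title": "Annual report", "year": 2021},
    {"id": 1, "title": "Anual reprot", "year": 2023},
    {"id": 2, "title": "Budget summary", "year": 2023},
]


def fuzzy_ids(all_rows, term, cutoff=0.6):
    """Score every row of the full data set; return {id: score} above the cutoff."""
    scores = {}
    for row in all_rows:
        ratio = SequenceMatcher(None, term.lower(), row["title"].lower()).ratio()
        if ratio >= cutoff:
            scores[row["id"]] = ratio
    return scores


# Pre-filter first (stands in for WHERE / date-range clauses).
prefiltered = [r for r in rows if r["year"] == 2023]

# The matcher still sees every row, not only the pre-filtered ones.
scores = fuzzy_ids(rows, "annual report")

# Intersect: keep the pre-filtered rows that also matched, with their score.
result = [dict(r, score=scores[r["id"]]) for r in prefiltered if r["id"] in scores]
```

Row 0 matches the search term but is dropped by the year filter, and row 2 passes the filter but does not match, so only row 1 survives the intersection. Because the matcher never depends on the filter, it can be built once and reused across queries.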

class querexfuzz.engine.QuerexfuzzConfigurationWarning

Bases: UserWarning

Warning issued when Querexfuzz configuration is unusual or suboptimal.
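
Because the class derives from UserWarning, the standard warnings machinery applies to it. The sketch below uses a local stand-in class with the same name and base so that it runs on its own; check_fuzzy_fields and the condition that triggers the warning are hypothetical.

```python
import warnings


class QuerexfuzzConfigurationWarning(UserWarning):
    """Local stand-in with the same name and base class as the documented warning."""


def check_fuzzy_fields(fuzzy_fields):
    # Hypothetical trigger: the condition is invented for illustration.
    if not fuzzy_fields:
        warnings.warn(
            "no fuzzy fields configured",
            QuerexfuzzConfigurationWarning,
            stacklevel=2,
        )
    return fuzzy_fields


# Record the warning instead of printing it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    check_fuzzy_fields([])

# Escalate this category to an exception, e.g. in a test suite.
with warnings.catch_warnings():
    warnings.simplefilter("error", QuerexfuzzConfigurationWarning)
    try:
        check_fuzzy_fields([])
        escalated = False
    except QuerexfuzzConfigurationWarning:
        escalated = True
```

The same filters work with the real class once it is imported from querexfuzz.engine in place of the stand-in.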


Date utilities

querexfuzz.dates.resolve_date_range(spec: dict) → tuple[datetime, datetime]

Converts a date spec from the parser into a (start, end) datetime tuple.