package

dframeio

Top-level package for dataframe-io.

module

dframeio.filter

Filter expressions for data reading operations with predicate pushdown.

This module translates filter expressions from a simplified SQL syntax into the formats understood by the various backends. This way the same language can be used to implement filtering regardless of the data source.

The grammar of the filter statements is the same as in a WHERE clause in SQL. Supported features:

  • Comparing column values to numbers, strings and another column's values using the operators > < = != >= <=, e.g. a.column < 5
  • Comparison against a set of values with IN and NOT IN, e.g. a.column IN (1, 2, 3)
  • Boolean combination of conditions with AND, OR and NOT
  • NULL comparison as in a IS NULL or b IS NOT NULL

Strings can be quoted with single quotes or double quotes. Column names may optionally be quoted with SQL quotes (backticks). E.g.:

`a.column` = "abc" AND b IS NOT NULL OR index < 50
Functions
  • to_prefix_notation(statement) (str) Parse a filter statement and return it in prefix notation.
  • to_psql(statement) (str) Convert a filter statement to Postgres SQL syntax.
  • to_pyarrow_dnf(statement) (Union(list of list of (str, str, any), list of (str, str, any), (str, str, any))) Convert a filter statement to the disjunctive normal form understood by pyarrow.
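To make the return type of to_pyarrow_dnf more concrete, here is a minimal sketch of how a list-of-lists disjunctive normal form is read: the outer list combines its entries with OR, each inner list combines its tuples with AND, and each tuple is (column, operator, value). The sketch does not call dframeio or pyarrow. The helper matches_dnf and the variable example_dnf are hypothetical names introduced only for illustration, and the DNF shown is written by hand as an assumption of what a statement like a > 5 AND b = 'x' OR c IN (1, 2, 3) would correspond to, not captured output of this module.

```python
import operator

# Comparison operators as they appear in pyarrow-style DNF filter tuples.
_OPS = {
    "=": operator.eq,
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    ">": operator.gt,
    "<=": operator.le,
    ">=": operator.ge,
    "in": lambda value, choices: value in choices,
    "not in": lambda value, choices: value not in choices,
}


def matches_dnf(row, dnf):
    """Return True if `row` (a dict) satisfies a list-of-lists DNF filter.

    Hypothetical helper for illustration only: the outer list is an OR
    over conjunctions, each inner list is an AND over
    (column, operator, value) tuples.
    """
    return any(
        all(_OPS[op](row[column], value) for column, op, value in conjunction)
        for conjunction in dnf
    )


# Hand-written DNF (an assumption, not library output) for:
#   a > 5 AND b = 'x' OR c IN (1, 2, 3)
# AND binds more tightly than OR, so this yields two conjunctions.
example_dnf = [
    [("a", ">", 5), ("b", "=", "x")],
    [("c", "in", {1, 2, 3})],
]
```

A row passes the filter as soon as one inner list is fully satisfied, which is why a backend can use such a structure to skip data that cannot match any of the conjunctions.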
module

dframeio.abstract

Abstract interfaces for all storage backends

Classes
module

dframeio.parquet

Access parquet datasets using pyarrow.

Classes
module

dframeio.postgres

Access PostgreSQL databases using psycopg3.

Classes