Sources

A data source is a repository or database that the program searches for particular entries. Each entry represents an individual object, record, or item in that source.

Data structures and functions

The everything function serves as the default filtering condition when retrieving entries from data sources. It accepts a single Position argument and always returns True. In the absence of a more specific predicate, every entry in the designated search area is therefore returned.
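A minimal sketch of such a default predicate follows. The Position class here is a hypothetical stand-in with made-up line and column fields, since the real Position type is not described in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    """Hypothetical stand-in for the library's Position type."""
    line: int
    column: int


def everything(position: Position) -> bool:
    """Default predicate: accept every entry, whatever its position."""
    return True


# Filtering with the default predicate keeps all candidates.
positions = [Position(1, 0), Position(2, 4), Position(7, 1)]
selected = [p for p in positions if everything(p)]
```

Because the predicate ignores its argument, filtering with it is a no-op: selected holds the same positions as the input list.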

The SourcePort protocol defines the interface for source adapter objects, which retrieve entries from data sources. An object implementing this protocol must be callable and accept a location (of type Location) and a predicate (of type Callable). The predicate is a function that decides, based on an entry's position, whether that entry is included in the results.

SourceAdapter is a type alias for the SourcePort protocol. It provides a uniform name for the protocol when implementing concrete source adapters.
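The protocol and its alias could look roughly like the sketch below. Several details are assumptions made for illustration: Location is modelled as a plain string, Position as an integer, entries as (position, text) pairs, and the predicate is given everything as its default. The in_memory_source adapter is hypothetical and is not one of the named sources.

```python
from typing import Callable, Iterable, Protocol, Tuple

# Hypothetical stand-ins; the real types are not shown in this section.
Location = str
Position = int
Entry = Tuple[Position, str]  # (position, content)


def everything(position: Position) -> bool:
    """Default predicate: accept every entry."""
    return True


class SourcePort(Protocol):
    """Callable interface that every source adapter is expected to provide."""

    def __call__(
        self,
        location: Location,
        predicate: Callable[[Position], bool] = everything,
    ) -> Iterable[Entry]:
        ...


# The alias gives adapters a uniform name for the protocol.
SourceAdapter = SourcePort


def in_memory_source(location, predicate=everything):
    """Toy adapter: treats the location string itself as newline-separated data."""
    for position, line in enumerate(location.splitlines()):
        if predicate(position):
            yield position, line


adapter: SourceAdapter = in_memory_source
all_entries = list(adapter("a\nb\nc"))
even_entries = list(adapter("a\nb\nc", lambda pos: pos % 2 == 0))
```

A plain function satisfies the protocol structurally, so an adapter does not need to inherit from SourcePort; it only needs a matching call signature.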

Finally, the SourceConfig dataclass holds source configuration. It is abstract and is meant to be subclassed by the concrete configuration class for each type of source. Every subclass must implement a location_type method that reports the type of location data the source works with.
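One way to express an abstract dataclass of this kind is to combine dataclasses with abc, as sketched below. Whether location_type is an instance method, class method, or property is not stated here, so the instance method is an assumption, and FileSourceConfig is a hypothetical subclass invented for the example.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from pathlib import Path


@dataclass
class SourceConfig(ABC):
    """Sketch of an abstract base for source configurations."""

    @abstractmethod
    def location_type(self) -> type:
        """Return the type of location this source reads from."""


@dataclass
class FileSourceConfig(SourceConfig):
    """Hypothetical concrete configuration for a file-backed source."""
    encoding: str = "utf-8"

    def location_type(self) -> type:
        return Path


config = FileSourceConfig()
```

With this arrangement, instantiating SourceConfig directly raises TypeError, while FileSourceConfig can be created because it supplies location_type.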

Catalog of named sources

jsonlike

Required Configuration

None defined

Optional Configuration

Name              Default
file_format       JSONLikeFormat.JSON
encoding          utf-8
errors            surrogateescape
tokenizer         delimiter
tokenizer_config  <class 'dict'>

tabular

Required Configuration

None defined

Optional Configuration

Name       Default
encoding   utf-8
errors     surrogateescape
delimiter  ,
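The three tabular defaults match parameters that Python's standard library already accepts. The sketch below assumes, without confirmation from this page, that the source passes encoding and errors to the text decoder and delimiter to csv.reader. The sample bytes are made up and include one byte that is invalid in UTF-8.

```python
import csv
import io

# Made-up sample data; \xff is not valid UTF-8.
raw = b"id,name\n1,caf\xc3\xa9\n2,bad\xffbyte\n"

# Decode with the documented defaults: utf-8 plus surrogateescape.
text = io.TextIOWrapper(
    io.BytesIO(raw),
    encoding="utf-8",
    errors="surrogateescape",
    newline="",
)

# Split each line on the default delimiter.
rows = list(csv.reader(text, delimiter=","))

# surrogateescape keeps undecodable bytes recoverable on re-encoding.
restored = rows[2][1].encode("utf-8", errors="surrogateescape")
```

With surrogateescape, an undecodable byte becomes a lone surrogate code point instead of raising UnicodeDecodeError, and encoding with the same error handler restores the original byte.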

textfile

Required Configuration

None defined

Optional Configuration

Name              Default
encoding          utf-8
errors            surrogateescape
tokenizer         delimiter
tokenizer_config  <class 'dict'>
token_separators  None
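A default shown as the dict class, rather than as a value, is what a dataclass field declared with default_factory=dict would typically display, so each configuration would start with its own empty dictionary. The sketch below mirrors the textfile options under that assumption. TextFileConfigSketch is hypothetical and is not the library's class, the field types are guesses, and the example_option key is a placeholder.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class TextFileConfigSketch:
    """Hypothetical mirror of the textfile options listed above."""
    encoding: str = "utf-8"
    errors: str = "surrogateescape"
    tokenizer: str = "delimiter"
    # default_factory gives every instance a fresh, empty dict.
    tokenizer_config: Dict[str, Any] = field(default_factory=dict)
    # The expected type of this option is not documented here.
    token_separators: Optional[Any] = None


default = TextFileConfigSketch()
custom = TextFileConfigSketch(
    encoding="latin-1",
    tokenizer_config={"example_option": 1},  # placeholder key, not a real option
)
```

Using a factory instead of a literal {} avoids sharing one mutable dictionary across all configuration instances.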