synthesized3.meta package#

class synthesized3.meta.Meta#

Bases: ABC, BaseModel

Abstract base class and Pydantic model that stores meta information about a single column of tabular data.

Parameters:
  • name (str) – Name of the column that this meta describes.

  • nature (Nature) – Whether the column's data is interpreted as categorical or continuous.

  • observed_dtype (ColumnType) – The storage data type of the column, converted to the internal ColumnType. For example, data stored in a database as VARCHAR(24) would be recorded here as STRING, or similar.

  • missing_value_meta (MissingValueMeta) – Optional MissingValueMeta sub meta, present only if one exists for this column.

name: str#
observed_dtype: ColumnType#
nature: Nature#
missing_value_meta: Meta | None#
get_sub_metas()#

Recursively get the sub metas.

Parameters:

meta (Meta) – Meta to get sub metas from.

Returns:

List of sub metas.

Return type:

List[Meta]
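The recursive collection described for get_sub_metas() can be illustrated with a minimal sketch. It uses a plain dataclass as a stand-in rather than the real Pydantic model, and SketchMeta and direct_sub_metas are hypothetical names introduced only for this example; the actual traversal order and implementation in synthesized3 may differ.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SketchMeta:
    """Hypothetical stand-in for Meta; not the synthesized3 class."""

    name: str
    missing_value_meta: Optional[SketchMeta] = None

    def direct_sub_metas(self) -> List[SketchMeta]:
        # Assumption: on the base class the only possible sub meta is the
        # optional missing-value meta.
        if self.missing_value_meta is None:
            return []
        return [self.missing_value_meta]

    def get_sub_metas(self) -> List[SketchMeta]:
        # Depth-first walk: each direct sub meta, followed by its own sub metas.
        collected: List[SketchMeta] = []
        for sub in self.direct_sub_metas():
            collected.append(sub)
            collected.extend(sub.get_sub_metas())
        return collected


# "age_nan" is an illustrative name for the missing-value sub meta.
age = SketchMeta(name="age", missing_value_meta=SketchMeta(name="age_nan"))
sub_meta_names = [m.name for m in age.get_sub_metas()]
```

A subclass with more sub metas would only need to override direct_sub_metas in this sketch; the recursion itself stays the same.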

class synthesized3.meta.MetaCollection#

Bases: Mapping[str, Meta]

A collection of Meta objects. The class is iterable and is indexed by column name. The keys refer to the columns of the original data_interface, not to those of any expanded or transformed data interface. Sub-columns derived from an original column (e.g. _nan columns for columns that contain NaNs, or _dow, _month, … columns for datetime columns) are stored in the submeta field of the original column’s Meta object.

__init__(metas: Sequence[Meta], num_rows: int)#
property metas: List[Meta]#
property num_rows: int#
classmethod from_data_interface(data_interface: DataInterface, meta_overrides: Mapping[str, Type[Meta]] | None = None, num_quantiles: PositiveInt = 100) MetaCollection#

Factory method that creates a MetaCollection from a DataInterface.
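To make the Mapping[str, Meta] behaviour concrete, here is a rough sketch of a read-only mapping keyed by column name. ColumnMetaSketch and MetaCollectionSketch are hypothetical stand-ins written for this example; they mirror the documented __init__, metas and num_rows members but are not the library's implementation, and from_data_interface is deliberately left out because its internals are not described here.

```python
from collections.abc import Mapping
from dataclasses import dataclass
from typing import Dict, Iterator, List, Sequence


@dataclass
class ColumnMetaSketch:
    """Hypothetical stand-in for Meta."""

    name: str


class MetaCollectionSketch(Mapping):
    """Hypothetical read-only mapping from column name to meta."""

    def __init__(self, metas: Sequence[ColumnMetaSketch], num_rows: int) -> None:
        # Assumption: each meta is keyed by the name of its original column.
        self._metas: Dict[str, ColumnMetaSketch] = {m.name: m for m in metas}
        self._num_rows = num_rows

    @property
    def metas(self) -> List[ColumnMetaSketch]:
        return list(self._metas.values())

    @property
    def num_rows(self) -> int:
        return self._num_rows

    # The three methods below are all that collections.abc.Mapping requires;
    # keys(), items(), values(), get() and "in" are derived from them.
    def __getitem__(self, key: str) -> ColumnMetaSketch:
        return self._metas[key]

    def __iter__(self) -> Iterator[str]:
        return iter(self._metas)

    def __len__(self) -> int:
        return len(self._metas)


collection = MetaCollectionSketch(
    [ColumnMetaSketch("age"), ColumnMetaSketch("city")], num_rows=3
)
column_names = list(collection)
```

Iterating the collection yields column names, and indexing with a name returns the corresponding meta, which matches the "iterable and indexed by column name" description above.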

class synthesized3.meta.BooleanMeta#

Bases: Meta

Meta class that stores information about columns treated as containing booleans.

categories: List[Any]#
frequencies: List[float]#
nature: Literal[Nature.CATEGORICAL]#
missing_value_meta: MissingValueMeta | None#
classmethod freqs_sum_to_one(values)#

class synthesized3.meta.CategoricalMeta#

Bases: Meta

Meta class that stores information about columns treated as containing categories.

categories: List[Any]#
frequencies: List[float]#
nature: Literal[Nature.CATEGORICAL]#
missing_value_meta: MissingValueMeta | None#
classmethod freqs_sum_to_one(values)#
classmethod len_freqs_categs_equal(values)#
property num_categories#
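
The validator names freqs_sum_to_one and len_freqs_categs_equal suggest two consistency checks on the categories and frequencies fields. The sketch below shows what such checks might look like as plain functions; the function names, the tolerance and the error messages are assumptions made for illustration, and the real validators are Pydantic classmethods whose exact behaviour is not documented here.

```python
import math
from typing import Any, List


def check_freqs_sum_to_one(frequencies: List[float], tol: float = 1e-6) -> None:
    # Assumption: frequencies are relative frequencies, so they should sum
    # to 1 up to a small floating-point tolerance.
    total = math.fsum(frequencies)
    if not math.isclose(total, 1.0, abs_tol=tol):
        raise ValueError(f"frequencies sum to {total}, expected 1.0")


def check_len_freqs_categs_equal(categories: List[Any], frequencies: List[float]) -> None:
    # Each category needs exactly one frequency.
    if len(categories) != len(frequencies):
        raise ValueError(
            f"{len(categories)} categories but {len(frequencies)} frequencies"
        )


categories = ["red", "green", "blue"]
frequencies = [0.5, 0.3, 0.2]
check_len_freqs_categs_equal(categories, frequencies)
check_freqs_sum_to_one(frequencies)

# Assumption: num_categories is simply the number of stored categories.
num_categories = len(categories)
```
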
class synthesized3.meta.ConstantMeta#

Bases: Meta

Meta class that stores information about columns containing a single non-missing value.

categories: List[Any]#
nature: Literal[Nature.CATEGORICAL]#
missing_value_meta: MissingValueMeta | None#

class synthesized3.meta.DatetimeMeta#

Bases: Meta

Meta class that stores information about columns treated as containing datetimes. For now, the nature is restricted to continuous. The unix_meta field can be an instance of DoubleMeta or LongMeta, depending on the required accuracy of the Unix timestamp.

unix_meta: DoubleMeta#
hour_meta: CategoricalMeta | ConstantMeta#
dow_meta: CategoricalMeta | ConstantMeta#
day_meta: CategoricalMeta | ConstantMeta#
month_meta: CategoricalMeta | ConstantMeta#
year_meta: CategoricalMeta | ConstantMeta#
missing_value_meta: MissingValueMeta | None#
nature: Literal[Nature.CONTINUOUS]#
classmethod check_categories_are_within_allowed_limits(values)#
get_sub_metas()#

Recursively get the sub metas.

Parameters:

meta (Meta) – Meta to get sub metas from.

Returns:

List of sub metas.

Return type:

List[Meta]
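The field names unix_meta, hour_meta, dow_meta, day_meta, month_meta and year_meta suggest that each datetime is broken into a Unix timestamp plus calendar components. The following sketch shows one plausible decomposition using only the standard library; decompose_datetime is a hypothetical helper, and the day-of-week convention (Monday as 0) is an assumption rather than something this page specifies.

```python
from datetime import datetime, timezone
from typing import Dict, Union


def decompose_datetime(value: datetime) -> Dict[str, Union[int, float]]:
    """Split a datetime into the components named by the DatetimeMeta fields."""
    return {
        "unix": value.timestamp(),  # continuous part, seconds since the epoch
        "hour": value.hour,
        "dow": value.weekday(),  # Monday == 0 (assumed convention)
        "day": value.day,
        "month": value.month,
        "year": value.year,
    }


parts = decompose_datetime(datetime(2024, 2, 29, 13, 30, tzinfo=timezone.utc))
```

In this reading, the unix component is the continuous value, while the remaining components each take a small set of values, which is consistent with their sub metas being CategoricalMeta or ConstantMeta.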

class synthesized3.meta.DoubleMeta#

Bases: Meta

Meta class that stores information about columns treated as containing floats.

quantiles: List[float]#
missing_value_meta: MissingValueMeta | None#
nature: Literal[Nature.CONTINUOUS]#

class synthesized3.meta.FloatMeta#

Bases: Meta

Meta class that stores information about columns treated as containing floats.

quantiles: List[float]#
nature: Literal[Nature.CONTINUOUS]#
missing_value_meta: MissingValueMeta | None#

class synthesized3.meta.IntegerMeta#

Bases: Meta

Meta class that stores information about columns treated as containing integers.

quantiles: List[float]#
nature: Literal[Nature.CONTINUOUS]#
missing_value_meta: MissingValueMeta | None#

class synthesized3.meta.LongMeta#

Bases: Meta

Meta class that stores information about columns treated as containing integers.

quantiles: List[float]#
missing_value_meta: MissingValueMeta | None#
nature: Literal[Nature.CONTINUOUS]#
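
The continuous meta classes above each carry a quantiles list, and from_data_interface accepts a num_quantiles argument. How the library computes these values is not stated on this page, so the sketch below is only one plausible reading: it summarises a numeric column by its minimum, the interior cut points returned by statistics.quantiles, and its maximum. The helper name summarise_quantiles and the choice of the "inclusive" method are assumptions.

```python
import statistics
from typing import List, Sequence


def summarise_quantiles(values: Sequence[float], num_quantiles: int = 100) -> List[float]:
    """Return num_quantiles + 1 values: min, interior cut points, max."""
    # statistics.quantiles returns the num_quantiles - 1 interior cut points
    # and needs at least two data points on Python versions before 3.13.
    cuts = statistics.quantiles(values, n=num_quantiles, method="inclusive")
    return [float(min(values)), *cuts, float(max(values))]


quantiles = summarise_quantiles([1, 2, 3, 4, 5], num_quantiles=4)
```

A summary of this kind keeps the size of the meta fixed regardless of the number of rows, which may be why a quantile count rather than the raw data is stored.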

Subpackages#

Submodules#

synthesized3.meta.meta module#

class synthesized3.meta.meta.Meta#

Bases: ABC, BaseModel

Abstract base class and Pydantic model that stores meta information about a single column of tabular data.

Parameters:
  • name (str) – Name of the column that this meta describes.

  • nature (Nature) – Whether the column's data is interpreted as categorical or continuous.

  • observed_dtype (ColumnType) – The storage data type of the column, converted to the internal ColumnType. For example, data stored in a database as VARCHAR(24) would be recorded here as STRING, or similar.

  • missing_value_meta (MissingValueMeta) – Optional MissingValueMeta sub meta, present only if one exists for this column.

name: str#
observed_dtype: ColumnType#
nature: Nature#
missing_value_meta: Meta | None#
get_sub_metas()#

Recursively get the sub metas.

Parameters:

meta (Meta) – Meta to get sub metas from.

Returns:

List of sub metas.

Return type:

List[Meta]

synthesized3.meta.meta_collection module#

class synthesized3.meta.meta_collection.MetaCollection#

Bases: Mapping[str, Meta]

A collection of Meta objects. The class is iterable and is indexed by column name. The keys refer to the columns of the original data_interface, not to those of any expanded or transformed data interface. Sub-columns derived from an original column (e.g. _nan columns for columns that contain NaNs, or _dow, _month, … columns for datetime columns) are stored in the submeta field of the original column’s Meta object.

__init__(metas: Sequence[Meta], num_rows: int)#
property metas: List[Meta]#
property num_rows: int#
classmethod from_data_interface(data_interface: DataInterface, meta_overrides: Mapping[str, Type[Meta]] | None = None, num_quantiles: PositiveInt = 100) MetaCollection#

Factory method that creates a MetaCollection from a DataInterface.

synthesized3.meta.meta_collection_test module#

synthesized3.meta.meta_factory module#

synthesized3.meta.meta_factory_test module#

synthesized3.meta.meta_test module#