src.cleaning

Module Contents

Classes

CleaningParam

Define the threshold and the condition for Cleaning.

Cleaning

Query the data from agreements that match the conditions. Conditions are set by instantiating CleaningParam.

class src.cleaning.CleaningParam

Bases: pydantic.BaseModel

Define the threshold and the condition for Cleaning.

Attributes:

threshold (float) – the value that the column is compared against.

keep – If it is “above”, the condition is that the column is greater than or equal to threshold. If it is “below”, the condition is that the column is less than or equal to threshold.

threshold: float
keep: Literal["above", "below"]
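
The comparison implied by keep can be sketched in plain Python. This is only an illustration of the semantics described above, not the module's implementation; the helper name `satisfies` is hypothetical.

```python
import operator
from typing import Literal


def satisfies(value: float, threshold: float, keep: Literal["above", "below"]) -> bool:
    """Hypothetical helper: apply the documented keep rule to a single value."""
    # "above" keeps values >= threshold; "below" keeps values <= threshold.
    compare = operator.ge if keep == "above" else operator.le
    return compare(value, threshold)
```

Both directions are inclusive, so a value exactly equal to threshold satisfies the condition whether keep is “above” or “below”.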
class src.cleaning.Cleaning

Bases: pydantic.BaseModel

Query the data from agreements that match the conditions. Conditions are set by instantiating CleaningParam.

Attributes:

stats – a dictionary that maps each column name to a CleaningParam object.

match – If it is “all”, Cleaning queries the data that match all conditions. If it is “any”, the queried data need match only one of the conditions.

stats: dict[str, CleaningParam]
match: Literal["all", "any"]
__call__(agreements: pandas.DataFrame) → pandas.DataFrame

Instances are callable: calling one with agreements returns the data that match the configured conditions.

Parameters:

agreements (DataFrame) – input dataframe

Returns:

the data from agreements that match the conditions

Return type:

DataFrame
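
The filtering behaviour described for `__call__` can be approximated with plain pandas. The sketch below is an assumption about the semantics, not the actual implementation: the function `filter_agreements`, the `(threshold, keep)` tuples standing in for CleaningParam objects, and the column names `kappa` and `std` are all hypothetical.

```python
import pandas as pd


def filter_agreements(agreements: pd.DataFrame, stats: dict, match: str = "all") -> pd.DataFrame:
    """Hypothetical stand-in for Cleaning.__call__.

    stats maps a column name to a (threshold, keep) tuple; the sketch
    assumes at least one condition is given.
    """
    masks = []
    for column, (threshold, keep) in stats.items():
        if keep == "above":
            masks.append(agreements[column] >= threshold)  # inclusive lower bound
        else:
            masks.append(agreements[column] <= threshold)  # inclusive upper bound
    combined = pd.concat(masks, axis=1)  # one boolean column per condition
    # "all": every condition must hold; "any": at least one must hold.
    selected = combined.all(axis=1) if match == "all" else combined.any(axis=1)
    return agreements[selected]


# Made-up data for illustration only.
agreements = pd.DataFrame({"kappa": [0.9, 0.5, 0.7], "std": [0.1, 0.4, 0.3]})
stats = {"kappa": (0.7, "above"), "std": (0.1, "below")}

strict = filter_agreements(agreements, stats, match="all")
loose = filter_agreements(agreements, stats, match="any")
```

With this data, only the first row meets both conditions, while the third row meets the `kappa` condition alone and so is kept only when match is “any”.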