ads.feature_engineering.feature_type.handler package

Submodules

ads.feature_engineering.feature_type.handler.feature_validator module

This module helps register custom validators for feature types and extend registered validators with dispatching based on specific arguments.

Classes

FeatureValidator

The Feature Validator class to manage custom validators.

FeatureValidatorMethod

The Feature Validator Method class. Extends methods that require dispatching based on specific arguments.

class ads.feature_engineering.feature_type.handler.feature_validator.FeatureValidator[source]

Bases: object

The Feature Validator class to manage custom validators.

register(self, name: str, handler: Callable, condition: Tuple | Dict[str, Any] = None, replace: bool = False) None[source]

Registers new validator.

unregister(self, name: str, condition: Tuple | Dict[str, Any] = None) None[source]

Unregisters validator.

registered(self) pd.DataFrame[source]

Gets the list of registered validators.

Examples

>>> series = pd.Series(['+1-202-555-0141', '+1-202-555-0142'], name='Phone Number')
>>> def phone_number_validator(data: pd.Series) -> pd.Series:
...    print("phone_number_validator")
...    return data
>>> def universal_phone_number_validator(data: pd.Series, country_code) -> pd.Series:
...    print("universal_phone_number_validator")
...    return data
>>> def us_phone_number_validator(data: pd.Series, country_code) -> pd.Series:
...    print("us_phone_number_validator")
...    return data
>>> PhoneNumber.validator.register(name="is_phone_number", handler=phone_number_validator, replace=True)
>>> PhoneNumber.validator.register(name="is_phone_number", handler=universal_phone_number_validator, condition = ('country_code',))
>>> PhoneNumber.validator.register(name="is_phone_number", handler=us_phone_number_validator, condition = {'country_code':'+1'})
>>> PhoneNumber.validator.is_phone_number(series)
    phone_number_validator
    0     +1-202-555-0141
    1     +1-202-555-0142
>>> PhoneNumber.validator.is_phone_number(series, country_code = '+7')
    universal_phone_number_validator
    0     +1-202-555-0141
    1     +1-202-555-0142
>>> PhoneNumber.validator.is_phone_number(series, country_code = '+1')
    us_phone_number_validator
    0     +1-202-555-0141
    1     +1-202-555-0142
>>> PhoneNumber.validator.registered()
               Validator                 Condition                            Handler
    ---------------------------------------------------------------------------------
    0    is_phone_number                        ()             phone_number_validator
    1    is_phone_number         ('country_code',)   universal_phone_number_validator
    2    is_phone_number    {'country_code': '+1'}          us_phone_number_validator
>>> series.ads.validator.is_phone_number()
    phone_number_validator
    0     +1-202-555-0141
    1     +1-202-555-0142
>>> series.ads.validator.is_phone_number(country_code = '+7')
    universal_phone_number_validator
    0     +1-202-555-0141
    1     +1-202-555-0142
>>> series.ads.validator.is_phone_number(country_code = '+1')
    us_phone_number_validator
    0     +1-202-555-0141
    1     +1-202-555-0142

Initializes the FeatureValidator.

register(name: str, handler: Callable, condition: Tuple | Dict[str, Any] | None = None, replace: bool = False) None[source]

Registers new validator.

Parameters:
  • name (str) – The validator name.

  • handler (callable) – The handler.

  • condition (Union[Tuple, Dict[str, Any]]) – The condition for the validator.

  • replace (bool) – The flag indicating if the registered validator should be replaced with the new one.

Returns:

Nothing.

Return type:

None

Raises:
  • ValueError – The name is empty or handler is not provided.

  • TypeError – The handler is not callable. The name of the validator is not a string.

  • ValidatorAlreadyExists – The validator is already registered.
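
The registration rules above can be illustrated with a small, dependency-free sketch. It is not the ads implementation: `MiniValidatorRegistry`, its `_key` helper, the order of the checks, and the choice to track duplicates per (name, condition) pair are all assumptions made for illustration, and the local `ValidatorAlreadyExists` class only stands in for the exception documented in this module.

```python
from typing import Any, Callable, Dict, Optional, Tuple, Union

Condition = Union[Tuple[str, ...], Dict[str, Any]]


class ValidatorAlreadyExists(ValueError):
    """Local stand-in for the exception of the same name documented here."""


class MiniValidatorRegistry:
    """Hypothetical registry that mirrors the documented Raises section."""

    def __init__(self) -> None:
        # Maps validator name -> {condition key -> handler}.
        self._validators: Dict[str, Dict[Any, Callable]] = {}

    @staticmethod
    def _key(condition: Optional[Condition]) -> Any:
        # Dicts are unhashable, so store them as a sorted tuple of items.
        if isinstance(condition, dict):
            return ("dict", tuple(sorted(condition.items())))
        return ("tuple", tuple(condition or ()))

    def register(
        self,
        name: str,
        handler: Callable,
        condition: Optional[Condition] = None,
        replace: bool = False,
    ) -> None:
        if not isinstance(name, str):
            raise TypeError("The validator name must be a string.")
        if not name:
            raise ValueError("The validator name is empty.")
        if handler is None:
            raise ValueError("The handler is not provided.")
        if not callable(handler):
            raise TypeError("The handler must be callable.")
        handlers = self._validators.setdefault(name, {})
        key = self._key(condition)
        # Assumption: a duplicate is the same name with the same condition.
        if key in handlers and not replace:
            raise ValidatorAlreadyExists(name)
        handlers[key] = handler
```

With this sketch, registering the same name twice without a condition raises `ValidatorAlreadyExists` unless `replace=True` is passed, while registering the same name under a different condition succeeds.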

registered() DataFrame[source]

Gets the list of registered validators.

Returns:

The list of registered validators.

Return type:

pd.DataFrame

unregister(name: str, condition: Tuple | Dict[str, Any] | None = None) None[source]

Unregisters validator.

Parameters:
  • name (str) – The name of the validator to be unregistered.

  • condition (Union[Tuple, Dict[str, Any]]) – The condition for the validator to be unregistered.

Returns:

Nothing.

Return type:

None

Raises:
  • TypeError – The name of the validator is not a string.

  • ValidatorNotFound – The validator is not found.

  • ValidatorWithConditionNotFound – The validator with the provided condition is not found.

class ads.feature_engineering.feature_type.handler.feature_validator.FeatureValidatorMethod(handler: Callable)[source]

Bases: object

The Feature Validator Method class.

Extends methods that require dispatching based on specific arguments.

register(self, condition: Tuple | Dict[str, Any], handler: Callable) None[source]

Registers new handler.

unregister(self, condition: Tuple | Dict[str, Any]) None[source]

Unregisters existing handler.

registered(self) pd.DataFrame[source]

Gets the list of registered handlers.

Initializes the Feature Validator Method.

Parameters:

handler (Callable) – The handler that will be called by default if a suitable one is not found.

register(condition: Tuple | Dict[str, Any], handler: Callable) None[source]

Registers new handler.

Parameters:
  • condition (Union[Tuple, Dict[str, Any]]) – The condition which will be used to register a new handler.

  • handler (Callable) – The handler to be registered.

Returns:

Nothing.

Return type:

None

Raises:

ValueError – If the condition is not provided or is provided in the wrong format. If the handler is not provided or has the wrong format.

registered() DataFrame[source]

Gets the list of registered handlers.

Returns:

The list of registered handlers.

Return type:

pd.DataFrame

unregister(condition: Tuple | Dict[str, Any]) None[source]

Unregisters existing handler.

Parameters:

condition (Union[Tuple, Dict[str, Any]]) – The condition which will be used to unregister a handler.

Returns:

Nothing.

Return type:

None

Raises:

ValueError – If the condition is not provided or is provided in the wrong format. If the condition is not registered.
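
The dispatching described for this class can be sketched as follows. This is a minimal illustration under stated assumptions, not the ads source: `MiniValidatorMethod` is a hypothetical name, and the precedence used here (a dict condition matching keyword values first, then a tuple condition matching keyword names, then the default handler) is inferred from the `is_phone_number` example rather than confirmed by the library.

```python
from typing import Any, Callable, Dict, List, Tuple, Union

Condition = Union[Tuple[str, ...], Dict[str, Any]]


class MiniValidatorMethod:
    """Hypothetical dispatcher; the matching order is an assumption."""

    def __init__(self, handler: Callable) -> None:
        self._default = handler
        self._by_value: List[Tuple[Dict[str, Any], Callable]] = []
        self._by_name: List[Tuple[Tuple[str, ...], Callable]] = []

    def register(self, condition: Condition, handler: Callable) -> None:
        if not callable(handler):
            raise ValueError("The handler must be callable.")
        if isinstance(condition, dict) and condition:
            self._by_value.append((dict(condition), handler))
        elif isinstance(condition, tuple) and condition:
            self._by_name.append((condition, handler))
        else:
            raise ValueError("The condition must be a non-empty tuple or dict.")

    def __call__(self, data: Any, **kwargs: Any) -> Any:
        # Most specific first: handlers registered for exact keyword values.
        for values, handler in self._by_value:
            if all(k in kwargs and kwargs[k] == v for k, v in values.items()):
                return handler(data, **kwargs)
        # Then handlers registered for a set of keyword names.
        for names, handler in self._by_name:
            if set(names) == set(kwargs):
                return handler(data, **kwargs)
        # Fall back to the default handler given at construction time.
        return self._default(data, **kwargs)


method = MiniValidatorMethod(lambda data: "default")
method.register(("country_code",), lambda data, country_code: "universal")
method.register({"country_code": "+1"}, lambda data, country_code: "us")
```

Calling `method(data)` reaches the default handler, `method(data, country_code="+7")` reaches the handler registered for the `country_code` name, and `method(data, country_code="+1")` reaches the handler registered for that exact value.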

exception ads.feature_engineering.feature_type.handler.feature_validator.ValidatorAlreadyExists(name: str)[source]

Bases: ValueError

exception ads.feature_engineering.feature_type.handler.feature_validator.ValidatorNotFound(name: str)[source]

Bases: ValueError

exception ads.feature_engineering.feature_type.handler.feature_validator.ValidatorWithConditionAlreadyExists(name: str)[source]

Bases: ValueError

exception ads.feature_engineering.feature_type.handler.feature_validator.ValidatorWithConditionNotFound(name: str)[source]

Bases: ValueError

exception ads.feature_engineering.feature_type.handler.feature_validator.WrongHandlerMethodSignature(handler_name: str, condition: str, handler_signature: str)[source]

Bases: ValueError

ads.feature_engineering.feature_type.handler.feature_warning module

The module that helps to register custom warnings for the feature types.

Classes

FeatureWarning

The Feature Warning class. Provides functionality to register warning handlers and invoke them.

Examples

>>> warning = FeatureWarning()
>>> def warning_handler_zeros_count(data):
...    return pd.DataFrame(
...        [['Zeros', 'Age has 38 zeros', 'Count', 38]],
...        columns=['Warning', 'Message', 'Metric', 'Value'])
>>> def warning_handler_zeros_percentage(data):
...    return pd.DataFrame(
...        [['Zeros', 'Age has 12.2% zeros', 'Percentage', '12.2%']],
...        columns=['Warning', 'Message', 'Metric', 'Value'])
>>> warning.register(name="zeros_count", handler=warning_handler_zeros_count)
>>> warning.register(name="zeros_percentage", handler=warning_handler_zeros_percentage)
>>> warning.registered()
                    Name                               Handler
    ----------------------------------------------------------
    0         zeros_count          warning_handler_zeros_count
    1    zeros_percentage     warning_handler_zeros_percentage
>>> warning.zeros_count(data_series)
             Warning               Message         Metric      Value
    ----------------------------------------------------------------
    0          Zeros      Age has 38 zeros          Count         38
>>> warning.zeros_percentage(data_series)
             Warning               Message         Metric      Value
    ----------------------------------------------------------------
    0          Zeros   Age has 12.2% zeros     Percentage      12.2%
>>> warning(data_series)
             Warning               Message         Metric      Value
    ----------------------------------------------------------------
    0          Zeros      Age has 38 zeros          Count         38
    1          Zeros   Age has 12.2% zeros     Percentage      12.2%
>>> warning.unregister('zeros_count')
>>> warning(data_series)
             Warning               Message         Metric      Value
    ----------------------------------------------------------------
    0          Zeros   Age has 12.2% zeros     Percentage      12.2%

class ads.feature_engineering.feature_type.handler.feature_warning.FeatureWarning[source]

Bases: object

The Feature Warning class.

Provides functionality to register warning handlers and invoke them.

register(self, name: str, handler: Callable) None[source]

Registers a new warning for the feature type.

unregister(self, name: str) None[source]

Unregisters warning.

registered(self) pd.DataFrame[source]

Gets the list of registered warnings.

Examples

>>> warning = FeatureWarning()
>>> def warning_handler_zeros_count(data):
...    return pd.DataFrame(
...        [['Zeros', 'Age has 38 zeros', 'Count', 38]],
...        columns=['Warning', 'Message', 'Metric', 'Value'])
>>> def warning_handler_zeros_percentage(data):
...    return pd.DataFrame(
...        [['Zeros', 'Age has 12.2% zeros', 'Percentage', '12.2%']],
...        columns=['Warning', 'Message', 'Metric', 'Value'])
>>> warning.register(name="zeros_count", handler=warning_handler_zeros_count)
>>> warning.register(name="zeros_percentage", handler=warning_handler_zeros_percentage)
>>> warning.registered()
                  Warning                              Handler
    ----------------------------------------------------------
    0         zeros_count          warning_handler_zeros_count
    1    zeros_percentage     warning_handler_zeros_percentage
>>> warning.zeros_count(data_series)
             Warning               Message         Metric      Value
    ----------------------------------------------------------------
    0          Zeros      Age has 38 zeros          Count         38
>>> warning.zeros_percentage(data_series)
             Warning               Message         Metric      Value
    ----------------------------------------------------------------
    0          Zeros   Age has 12.2% zeros     Percentage      12.2%
>>> warning(data_series)
             Warning               Message         Metric      Value
    ----------------------------------------------------------------
    0          Zeros      Age has 38 zeros          Count         38
    1          Zeros   Age has 12.2% zeros     Percentage      12.2%
>>> warning.unregister('zeros_count')
>>> warning(data_series)
             Warning               Message         Metric      Value
    ----------------------------------------------------------------
    0          Zeros   Age has 12.2% zeros     Percentage      12.2%

Initializes the FeatureWarning.
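
The register-then-invoke pattern shown in the examples can be sketched without pandas. Everything here is hypothetical: `MiniFeatureWarning` is not the ads class, rows are plain dictionaries rather than `pd.DataFrame` rows, and exposing registered names through `__getattr__` is only one possible way to get the `warning.zeros_count(...)` call style.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


class MiniFeatureWarning:
    """Hypothetical warning registry; not the ads implementation."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[Any], List[Row]]] = {}

    def register(self, name: str, handler: Callable, replace: bool = False) -> None:
        if name in self._handlers and not replace:
            raise ValueError(f"Warning {name!r} is already registered.")
        self._handlers[name] = handler

    def unregister(self, name: str) -> None:
        if name not in self._handlers:
            raise ValueError(f"Warning {name!r} is not registered.")
        del self._handlers[name]

    def __getattr__(self, name: str) -> Callable:
        # Only reached when normal attribute lookup fails, so registered
        # warning names behave like methods on the instance.
        handlers = self.__dict__.get("_handlers", {})
        if name in handlers:
            return handlers[name]
        raise AttributeError(name)

    def __call__(self, data: Any) -> List[Row]:
        # Run every registered handler and concatenate the rows they return.
        rows: List[Row] = []
        for handler in self._handlers.values():
            rows.extend(handler(data))
        return rows


def zeros_count(data: List[float]) -> List[Row]:
    count = sum(1 for x in data if x == 0)
    return [{"Warning": "Zeros", "Message": f"{count} zeros",
             "Metric": "Count", "Value": count}]


warning = MiniFeatureWarning()
warning.register("zeros_count", zeros_count)
```

Here `warning.zeros_count([0, 1, 0])` runs one handler, while `warning([0, 1, 0])` runs all registered handlers and returns their combined rows.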

register(name: str, handler: Callable, replace: bool = False) None[source]

Registers a new warning.

Parameters:
  • name (str) – The warning name.

  • handler (callable) – The handler associated with the warning.

  • replace (bool) – The flag indicating if the registered warning should be replaced with the new one.

Returns:

Nothing

Return type:

None

Raises:

registered() DataFrame[source]

Gets the list of registered warnings.

Returns:

The list of registered warnings in DataFrame format.

Return type:

pd.DataFrame

Examples

>>> warning.registered()
                     Name                               Handler
    -----------------------------------------------------------
    0         zeros_count           warning_handler_zeros_count
    1    zeros_percentage      warning_handler_zeros_percentage

unregister(name: str) None[source]

Unregisters warning.

Parameters:

name (str) – The name of warning to be unregistered.

Returns:

Nothing.

Return type:

None

Raises:

ads.feature_engineering.feature_type.handler.warnings module

The module with all default warnings provided to the user. They are registered to the relevant feature types directly in the feature type files themselves.

ads.feature_engineering.feature_type.handler.warnings.high_cardinality_handler(s: Series) DataFrame[source]

Warns if the number of unique values (including NaN) in the series is greater than or equal to 15.

Parameters:

s (pd.Series) – Pandas series - column of some feature type.

Returns:

DataFrame with 4 columns (‘Warning’, ‘Message’, ‘Metric’, ‘Value’) and 1 row, which lists the count of unique values.

Return type:

pd.DataFrame
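
The cardinality rule can be sketched in plain Python. This is an illustration only: `high_cardinality_sketch` is a hypothetical name, a list stands in for the `pd.Series`, rows are dictionaries instead of DataFrame rows, the message text is made up, and returning an empty result below the threshold is an assumption.

```python
import math
from typing import Any, Dict, List, Sequence

_NAN = object()  # Sentinel so that all NaN values collapse into one entry.


def high_cardinality_sketch(values: Sequence[Any], threshold: int = 15) -> List[Dict[str, Any]]:
    """Return one warning row when the unique count reaches the threshold."""
    # float("nan") != float("nan"), so a plain set would count each NaN
    # separately; map every NaN to the same sentinel first.
    unique = {
        _NAN if isinstance(v, float) and math.isnan(v) else v
        for v in values
    }
    count = len(unique)
    if count < threshold:
        return []  # Assumption: no rows when the rule does not trigger.
    return [{
        "Warning": "High-cardinality",
        "Message": f"{count} unique values",
        "Metric": "Count",
        "Value": count,
    }]
```

For example, fourteen distinct integers plus two NaN values count as 15 unique values and trigger the warning, while the fourteen integers alone do not.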

ads.feature_engineering.feature_type.handler.warnings.missing_values_handler(s: Series) DataFrame[source]

Warns if more than 5 percent of the values in the series are missing (NaN).

Parameters:

s (pd.Series) – Pandas series - column of some feature type.

Returns:

DataFrame with 4 columns (‘Warning’, ‘Message’, ‘Metric’, ‘Value’) and 2 rows, where the first row is the count of missing values and the second is the percentage of missing values.

Return type:

pd.DataFrame
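
A dependency-free sketch of the missing-values rule, under the same caveats: `missing_values_sketch` is a hypothetical name, a list replaces the `pd.Series`, `None` is treated as missing alongside NaN, and the empty result below the threshold is an assumption.

```python
import math
from typing import Any, Dict, List, Sequence


def _is_missing(value: Any) -> bool:
    return value is None or (isinstance(value, float) and math.isnan(value))


def missing_values_sketch(values: Sequence[Any], threshold_pct: float = 5.0) -> List[Dict[str, Any]]:
    """Return a count row and a percentage row when missing values exceed the threshold."""
    total = len(values)
    missing = sum(1 for v in values if _is_missing(v))
    pct = 100.0 * missing / total if total else 0.0
    if pct <= threshold_pct:
        return []  # Strictly greater than 5 percent is required.
    return [
        {"Warning": "Missing", "Message": f"{missing} missing values",
         "Metric": "Count", "Value": missing},
        {"Warning": "Missing", "Message": f"{pct:.1f}% missing values",
         "Metric": "Percentage", "Value": pct},
    ]
```

Because the comparison is strict, a series with exactly 5 percent missing values produces no rows in this sketch.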

ads.feature_engineering.feature_type.handler.warnings.skew_handler(s: Series) DataFrame[source]

Warns if the absolute value of the skew is greater than 1.

Parameters:

s (pd.Series) – Pandas series - column of some feature type, expects continuous values.

Returns:

DataFrame with 4 columns (‘Warning’, ‘Message’, ‘Metric’, ‘Value’) and 1 row, which lists the skew value of that column.

Return type:

pd.DataFrame
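
The skew rule can be worked through by hand. The sketch below assumes the bias-adjusted sample skewness, G1 = sqrt(n(n-1)) / (n-2) * m3 / m2^1.5, where m2 and m3 are the second and third central moments; whether the ads handler uses exactly this estimator is not stated in this document. `sample_skewness` and `skew_sketch` are hypothetical names, and a list stands in for the `pd.Series`.

```python
import math
from typing import Any, Dict, List, Sequence


def sample_skewness(values: Sequence[float]) -> float:
    """Bias-adjusted sample skewness; NaN when fewer than 3 values."""
    n = len(values)
    if n < 3:
        return float("nan")
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # Constant data has no asymmetry.
    return math.sqrt(n * (n - 1)) / (n - 2) * m3 / m2 ** 1.5


def skew_sketch(values: Sequence[float], threshold: float = 1.0) -> List[Dict[str, Any]]:
    """Return one warning row when |skew| exceeds the threshold."""
    skew = sample_skewness(values)
    if math.isnan(skew) or abs(skew) <= threshold:
        return []  # Assumption: no rows when the rule does not trigger.
    return [{"Warning": "Skew", "Message": f"Skew of {skew:.3f}",
             "Metric": "Skew", "Value": skew}]
```

For the values `[1, 1, 1, 1, 10]` the estimator works out to sqrt(5), roughly 2.236, so the warning triggers; a symmetric sample such as `[1, 2, 3, 4, 5]` has zero skew and yields no rows.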

ads.feature_engineering.feature_type.handler.warnings.zeros_handler(s: Series) DataFrame[source]

Warns if more than 10 percent of the values in the series are zeros.

Parameters:

s (pd.Series) – Pandas series - column of some feature type.

Returns:

DataFrame with 4 columns (‘Warning’, ‘Message’, ‘Metric’, ‘Value’) and 2 rows, where the first row is the count of zero values and the second is the percentage of zero values.

Return type:

pd.DataFrame
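
The zeros rule follows the same count-plus-percentage shape. As before, this is only a sketch: `zeros_sketch` is a hypothetical name, a list stands in for the `pd.Series`, the message text is invented, and the empty result at or below 10 percent is an assumption.

```python
from typing import Any, Dict, List, Sequence


def zeros_sketch(values: Sequence[float], threshold_pct: float = 10.0) -> List[Dict[str, Any]]:
    """Return a count row and a percentage row when zeros exceed the threshold."""
    total = len(values)
    zeros = sum(1 for v in values if v == 0)
    pct = 100.0 * zeros / total if total else 0.0
    if pct <= threshold_pct:
        return []  # Strictly greater than 10 percent is required.
    return [
        {"Warning": "Zeros", "Message": f"{zeros} zeros",
         "Metric": "Count", "Value": zeros},
        {"Warning": "Zeros", "Message": f"{pct:.1f}% zeros",
         "Metric": "Percentage", "Value": pct},
    ]
```

Two zeros in ten values (20 percent) produce both rows, whereas one zero in ten values sits exactly on the threshold and produces none.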

Module contents