ads.feature_engineering.feature_type.handler package#

Submodules#

ads.feature_engineering.feature_type.handler.feature_validator module#

This module helps register custom validators for feature types and extend registered validators with dispatching based on specific arguments.

Classes#

FeatureValidator

The Feature Validator class to manage custom validators.

FeatureValidatorMethod

The Feature Validator Method class. Extends methods that require dispatching based on specific arguments.

class ads.feature_engineering.feature_type.handler.feature_validator.FeatureValidator[source]#

Bases: object

The Feature Validator class to manage custom validators.

register(self, name: str, handler: Callable, condition: Tuple | Dict[str, Any] = None, replace: bool = False) None[source]#

Registers new validator.

unregister(self, name: str, condition: Tuple | Dict[str, Any] = None) None[source]#

Unregisters validator.

registered(self) pd.DataFrame[source]#

Gets the list of registered validators.

Examples

>>> series = pd.Series(['+1-202-555-0141', '+1-202-555-0142'], name='Phone Number')
>>> def phone_number_validator(data: pd.Series) -> pd.Series:
...    print("phone_number_validator")
...    return data
>>> def universal_phone_number_validator(data: pd.Series, country_code) -> pd.Series:
...    print("universal_phone_number_validator")
...    return data
>>> def us_phone_number_validator(data: pd.Series, country_code) -> pd.Series:
...    print("us_phone_number_validator")
...    return data
>>> PhoneNumber.validator.register(name="is_phone_number", handler=phone_number_validator, replace=True)
>>> PhoneNumber.validator.register(name="is_phone_number", handler=universal_phone_number_validator, condition = ('country_code',))
>>> PhoneNumber.validator.register(name="is_phone_number", handler=us_phone_number_validator, condition = {'country_code':'+1'})
>>> PhoneNumber.validator.is_phone_number(series)
    phone_number_validator
    0     +1-202-555-0141
    1     +1-202-555-0142
>>> PhoneNumber.validator.is_phone_number(series, country_code = '+7')
    universal_phone_number_validator
    0     +1-202-555-0141
    1     +1-202-555-0142
>>> PhoneNumber.validator.is_phone_number(series, country_code = '+1')
    us_phone_number_validator
    0     +1-202-555-0141
    1     +1-202-555-0142
>>> PhoneNumber.validator.registered()
               Validator                 Condition                            Handler
    ---------------------------------------------------------------------------------
    0    is_phone_number                        ()             phone_number_validator
    1    is_phone_number         ('country_code',)   universal_phone_number_validator
    2    is_phone_number    {'country_code': '+1'}          us_phone_number_validator
>>> series.ads.validator.is_phone_number()
    phone_number_validator
    0     +1-202-555-0141
    1     +1-202-555-0142
>>> series.ads.validator.is_phone_number(country_code = '+7')
    universal_phone_number_validator
    0     +1-202-555-0141
    1     +1-202-555-0142
>>> series.ads.validator.is_phone_number(country_code = '+1')
    us_phone_number_validator
    0     +1-202-555-0141
    1     +1-202-555-0142

Initializes the FeatureValidator.

register(name: str, handler: Callable, condition: Tuple | Dict[str, Any] | None = None, replace: bool = False) None[source]#

Registers new validator.

Parameters:
  • name (str) – The validator name.

  • handler (callable) – The handler.

  • condition (Union[Tuple, Dict[str, Any]]) – The condition for the validator.

  • replace (bool) – The flag indicating if the registered validator should be replaced with the new one.

Returns:

Nothing.

Return type:

None

Raises:
  • ValueError – The name is empty or handler is not provided.

  • TypeError – The handler is not callable. The name of the validator is not a string.

  • ValidatorAlreadyExists – The validator is already registered.
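The error conditions listed above can be illustrated with a small standalone sketch. `MiniValidatorRegistry` and `outcome` are hypothetical names used only for illustration, the local `ValidatorAlreadyExists` is a stand-in for the documented exception, and the order of the checks is an assumption rather than the library's actual implementation.

```python
from typing import Callable, Dict


class ValidatorAlreadyExists(ValueError):
    """Local stand-in for the documented exception of the same name."""


class MiniValidatorRegistry:
    """Hypothetical, simplified registry (not the ADS implementation)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable] = {}

    def register(self, name: str, handler: Callable, replace: bool = False) -> None:
        # Empty name or missing handler -> ValueError.
        if name is None or name == "":
            raise ValueError("Validator name is not provided.")
        if handler is None:
            raise ValueError("Handler is not provided.")
        # Wrong types -> TypeError.
        if not isinstance(name, str):
            raise TypeError("Validator name must be a string.")
        if not callable(handler):
            raise TypeError("Handler must be callable.")
        # Duplicate name without replace=True -> ValidatorAlreadyExists.
        if name in self._handlers and not replace:
            raise ValidatorAlreadyExists(name)
        self._handlers[name] = handler


def outcome(registry: MiniValidatorRegistry, *args, **kwargs) -> str:
    """Return 'ok' or the name of the exception raised by register()."""
    try:
        registry.register(*args, **kwargs)
    except (ValueError, TypeError) as exc:
        return type(exc).__name__
    return "ok"


registry = MiniValidatorRegistry()
print(outcome(registry, "is_phone_number", len))                # ok
print(outcome(registry, "is_phone_number", len))                # ValidatorAlreadyExists
print(outcome(registry, "is_phone_number", len, replace=True))  # ok
print(outcome(registry, "", len))                               # ValueError
print(outcome(registry, "is_email", "not callable"))            # TypeError
```

Because the stand-in exception subclasses `ValueError`, as the documented one does, a caller that catches `ValueError` also catches the duplicate-registration case.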

registered() DataFrame[source]#

Gets the list of registered validators.

Returns:

The list of registered validators.

Return type:

pd.DataFrame

unregister(name: str, condition: Tuple | Dict[str, Any] | None = None) None[source]#

Unregisters validator.

Parameters:
  • name (str) – The name of the validator to be unregistered.

  • condition (Union[Tuple, Dict[str, Any]]) – The condition for the validator to be unregistered.

Returns:

Nothing.

Return type:

None

Raises:
  • TypeError – The name of the validator is not a string.

  • ValidatorNotFound – The validator is not found.

  • ValidatorWithConditionNotFound – The validator with the provided condition is not found.

class ads.feature_engineering.feature_type.handler.feature_validator.FeatureValidatorMethod(handler: Callable)[source]#

Bases: object

The Feature Validator Method class.

Extends methods that require dispatching based on specific arguments.

register(self, condition: Tuple | Dict[str, Any], handler: Callable) None[source]#

Registers new handler.

unregister(self, condition: Tuple | Dict[str, Any]) None[source]#

Unregisters existing handler.

registered(self) pd.DataFrame[source]#

Gets the list of registered handlers.

Initializes the Feature Validator Method.

Parameters:

handler (Callable) – The handler that is called by default if no suitable handler is found.

register(condition: Tuple | Dict[str, Any], handler: Callable) None[source]#

Registers new handler.

Parameters:
  • condition (Union[Tuple, Dict[str, Any]]) – The condition which will be used to register a new handler.

  • handler (Callable) – The handler to be registered.

Returns:

Nothing.

Return type:

None

Raises:

ValueError – If the condition is not provided or is provided in the wrong format. If the handler is not provided or has the wrong format.
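The dispatching behaviour shown in the `is_phone_number` examples (a default handler, a tuple condition matching on argument names, and a dict condition matching on argument names and values) can be approximated with the sketch below. `MiniValidatorMethod` is a hypothetical class; the matching rules, including "dict conditions win over tuple conditions", are assumptions inferred from the example output and not the library's code.

```python
from typing import Any, Callable, Dict, List, Tuple, Union

Condition = Union[Tuple[str, ...], Dict[str, Any]]


class MiniValidatorMethod:
    """Hypothetical dispatcher; a simplified stand-in, not the ADS class."""

    def __init__(self, handler: Callable) -> None:
        self._default = handler
        self._by_names: List[Tuple[Tuple[str, ...], Callable]] = []
        self._by_values: List[Tuple[Dict[str, Any], Callable]] = []

    def register(self, condition: Condition, handler: Callable) -> None:
        if not condition or not isinstance(condition, (tuple, dict)):
            raise ValueError("Condition must be a non-empty tuple or dict.")
        if not callable(handler):
            raise ValueError("Handler must be callable.")
        if isinstance(condition, dict):
            self._by_values.append((dict(condition), handler))
        else:
            self._by_names.append((tuple(condition), handler))

    def __call__(self, data: Any, **kwargs: Any) -> Any:
        # Most specific first: a dict condition needs names AND values to match.
        for cond, handler in self._by_values:
            if cond.keys() == kwargs.keys() and all(
                kwargs[key] == value for key, value in cond.items()
            ):
                return handler(data, **kwargs)
        # A tuple condition only needs the argument names to match.
        for names, handler in self._by_names:
            if set(names) == set(kwargs):
                return handler(data, **kwargs)
        # Nothing matched: fall back to the default handler.
        return self._default(data, **kwargs)


def default_validator(data):
    return "default"


def any_country_validator(data, country_code):
    return f"any:{country_code}"


def us_validator(data, country_code):
    return "us"


is_phone_number = MiniValidatorMethod(default_validator)
is_phone_number.register(("country_code",), any_country_validator)
is_phone_number.register({"country_code": "+1"}, us_validator)

print(is_phone_number("+1-202-555-0141"))                     # default
print(is_phone_number("+1-202-555-0141", country_code="+7"))  # any:+7
print(is_phone_number("+1-202-555-0141", country_code="+1"))  # us
```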

registered() DataFrame[source]#

Gets the list of registered handlers.

Returns:

The list of registered handlers.

Return type:

pd.DataFrame

unregister(condition: Tuple | Dict[str, Any]) None[source]#

Unregisters existing handler.

Parameters:

condition (Union[Tuple, Dict[str, Any]]) – The condition which will be used to unregister a handler.

Returns:

Nothing.

Return type:

None

Raises:

ValueError – If the condition is not provided or is provided in the wrong format. If the condition is not registered.

exception ads.feature_engineering.feature_type.handler.feature_validator.ValidatorAlreadyExists(name: str)[source]#

Bases: ValueError

exception ads.feature_engineering.feature_type.handler.feature_validator.ValidatorNotFound(name: str)[source]#

Bases: ValueError

exception ads.feature_engineering.feature_type.handler.feature_validator.ValidatorWithConditionAlreadyExists(name: str)[source]#

Bases: ValueError

exception ads.feature_engineering.feature_type.handler.feature_validator.ValidatorWithConditionNotFound(name: str)[source]#

Bases: ValueError

exception ads.feature_engineering.feature_type.handler.feature_validator.WrongHandlerMethodSignature(handler_name: str, condition: str, handler_signature: str)[source]#

Bases: ValueError

ads.feature_engineering.feature_type.handler.feature_warning module#

The module that helps to register custom warnings for the feature types.

Classes#

FeatureWarning

The Feature Warning class. Provides functionality to register warning handlers and invoke them.

Examples

>>> warning = FeatureWarning()
>>> def warning_handler_zeros_count(data):
...    return pd.DataFrame(
...        [['Zeros', 'Age has 38 zeros', 'Count', 38]],
...        columns=['Warning', 'Message', 'Metric', 'Value'])
>>> def warning_handler_zeros_percentage(data):
...    return pd.DataFrame(
...        [['Zeros', 'Age has 12.2% zeros', 'Percentage', '12.2%']],
...        columns=['Warning', 'Message', 'Metric', 'Value'])
>>> warning.register(name="zeros_count", handler=warning_handler_zeros_count)
>>> warning.register(name="zeros_percentage", handler=warning_handler_zeros_percentage)
>>> warning.registered()
                    Name                               Handler
    ----------------------------------------------------------
    0         zeros_count          warning_handler_zeros_count
    1    zeros_percentage     warning_handler_zeros_percentage
>>> warning.zeros_count(data_series)
             Warning               Message         Metric      Value
    ----------------------------------------------------------------
    0          Zeros      Age has 38 zeros          Count         38
>>> warning.zeros_percentage(data_series)
             Warning               Message         Metric      Value
    ----------------------------------------------------------------
    0          Zeros   Age has 12.2% zeros     Percentage      12.2%
>>> warning(data_series)
             Warning               Message         Metric      Value
    ----------------------------------------------------------------
    0          Zeros      Age has 38 zeros          Count         38
    1          Zeros   Age has 12.2% zeros     Percentage      12.2%
>>> warning.unregister('zeros_count')
>>> warning(data_series)
             Warning               Message         Metric      Value
    ----------------------------------------------------------------
    0          Zeros   Age has 12.2% zeros     Percentage      12.2%

class ads.feature_engineering.feature_type.handler.feature_warning.FeatureWarning[source]#

Bases: object

The Feature Warning class.

Provides functionality to register warning handlers and invoke them.

register(self, name: str, handler: Callable) None[source]#

Registers a new warning for the feature type.

unregister(self, name: str) None[source]#

Unregisters warning.

registered(self) pd.DataFrame[source]#

Gets the list of registered warnings.

Examples

>>> warning = FeatureWarning()
>>> def warning_handler_zeros_count(data):
...    return pd.DataFrame(
...        [['Zeros', 'Age has 38 zeros', 'Count', 38]],
...        columns=['Warning', 'Message', 'Metric', 'Value'])
>>> def warning_handler_zeros_percentage(data):
...    return pd.DataFrame(
...        [['Zeros', 'Age has 12.2% zeros', 'Percentage', '12.2%']],
...        columns=['Warning', 'Message', 'Metric', 'Value'])
>>> warning.register(name="zeros_count", handler=warning_handler_zeros_count)
>>> warning.register(name="zeros_percentage", handler=warning_handler_zeros_percentage)
>>> warning.registered()
                  Warning                              Handler
    ----------------------------------------------------------
    0         zeros_count          warning_handler_zeros_count
    1    zeros_percentage     warning_handler_zeros_percentage
>>> warning.zeros_count(data_series)
             Warning               Message         Metric      Value
    ----------------------------------------------------------------
    0          Zeros      Age has 38 zeros          Count         38
>>> warning.zeros_percentage(data_series)
             Warning               Message         Metric      Value
    ----------------------------------------------------------------
    0          Zeros   Age has 12.2% zeros     Percentage      12.2%
>>> warning(data_series)
             Warning               Message         Metric      Value
    ----------------------------------------------------------------
    0          Zeros      Age has 38 zeros          Count         38
    1          Zeros   Age has 12.2% zeros     Percentage      12.2%
>>> warning.unregister('zeros_count')
>>> warning(data_series)
             Warning               Message         Metric      Value
    ----------------------------------------------------------------
    0          Zeros   Age has 12.2% zeros     Percentage      12.2%
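The register / invoke / unregister flow from the examples above can be mimicked without pandas. In this sketch, `MiniFeatureWarning` is a hypothetical stand-in whose handlers return lists of row dictionaries instead of DataFrames; the `ValueError` raised on duplicate or unknown names is an assumption, since the raised exceptions are not listed in this reference.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


class MiniFeatureWarning:
    """Hypothetical stand-in: rows are dicts, not DataFrame rows."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[Any], List[Row]]] = {}

    def register(self, name: str, handler: Callable, replace: bool = False) -> None:
        if name in self._handlers and not replace:
            raise ValueError(f"Warning {name!r} is already registered.")  # assumed
        self._handlers[name] = handler

    def unregister(self, name: str) -> None:
        if name not in self._handlers:
            raise ValueError(f"Warning {name!r} is not registered.")  # assumed
        del self._handlers[name]

    def registered(self) -> List[Row]:
        return [
            {"Warning": name, "Handler": handler.__name__}
            for name, handler in self._handlers.items()
        ]

    def __getattr__(self, name: str) -> Callable:
        # Called only when normal lookup fails, so warning.<name>(data) works.
        handlers = self.__dict__.get("_handlers", {})
        if name in handlers:
            return handlers[name]
        raise AttributeError(name)

    def __call__(self, data: Any) -> List[Row]:
        # Invoking the object runs every registered handler and joins the rows.
        rows: List[Row] = []
        for handler in self._handlers.values():
            rows.extend(handler(data))
        return rows


def zeros_count(data):
    count = sum(1 for value in data if value == 0)
    return [{"Warning": "Zeros", "Message": f"{count} zeros",
             "Metric": "Count", "Value": count}]


def zeros_percentage(data):
    pct = 100 * sum(1 for value in data if value == 0) / len(data)
    return [{"Warning": "Zeros", "Message": f"{pct:.1f}% zeros",
             "Metric": "Percentage", "Value": f"{pct:.1f}%"}]


warning = MiniFeatureWarning()
warning.register("zeros_count", zeros_count)
warning.register("zeros_percentage", zeros_percentage)

sample = [0, 1, 0, 3]
print(warning.zeros_count(sample))  # one row, Value 2
print(warning(sample))              # two rows, one per handler
```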

Initializes the FeatureWarning.

register(name: str, handler: Callable, replace: bool = False) None[source]#

Registers a new warning.

Parameters:
  • name (str) – The warning name.

  • handler (callable) – The handler associated with the warning.

  • replace (bool) – The flag indicating if the registered warning should be replaced with the new one.

Returns:

Nothing.

Return type:

None

Raises:

registered() DataFrame[source]#

Gets the list of registered warnings.

Return type:

pd.DataFrame

Examples

The list of registered warnings in DataFrame format:

>>> warning.registered()
                     Name                               Handler
    -----------------------------------------------------------
    0         zeros_count           warning_handler_zeros_count
    1    zeros_percentage      warning_handler_zeros_percentage

unregister(name: str) None[source]#

Unregisters warning.

Parameters:

name (str) – The name of warning to be unregistered.

Returns:

Nothing.

Return type:

None

Raises:

ads.feature_engineering.feature_type.handler.warnings module#

The module with all default warnings provided to the user. These are registered to the relevant feature types directly in the feature type files themselves.

ads.feature_engineering.feature_type.handler.warnings.high_cardinality_handler(s: Series) DataFrame[source]#

Warning if the number of unique values (including NaN) in the series is greater than or equal to 15.

Parameters:

s (pd.Series) – Pandas series - column of some feature type.

Returns:

DataFrame with 4 columns ‘Warning’, ‘Message’, ‘Metric’, ‘Value’ and 1 row, which lists the count of unique values.

Return type:

pd.DataFrame
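The cardinality rule can be sketched in plain Python. The function names, the message text, and the choice to return an empty list when the threshold is not reached are all assumptions; the real handler works on a `pd.Series` and returns a DataFrame. Only the threshold (15 or more unique values, with missing values counted as one value) comes from the description above.

```python
import math
from typing import Any, Dict, List, Sequence


def is_missing(value: Any) -> bool:
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def count_unique_including_nan(values: Sequence[Any]) -> int:
    # NaN != NaN, so missing values are counted once, separately from the rest.
    distinct = {value for value in values if not is_missing(value)}
    has_missing = any(is_missing(value) for value in values)
    return len(distinct) + (1 if has_missing else 0)


def high_cardinality_rows(values: Sequence[Any], threshold: int = 15) -> List[Dict[str, Any]]:
    n_unique = count_unique_including_nan(values)
    if n_unique < threshold:
        return []  # assumed behaviour when there is nothing to warn about
    return [{
        "Warning": "High Cardinality",  # hypothetical label
        "Message": f"{n_unique} unique values",
        "Metric": "Count",
        "Value": n_unique,
    }]


values = list(range(14)) + [float("nan"), float("nan")]
print(high_cardinality_rows(values))  # one row, Value 15
```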

ads.feature_engineering.feature_type.handler.warnings.missing_values_handler(s: Series) DataFrame[source]#

Warning if more than 5 percent of the values in the series are missing (NaN).

Parameters:

s (pd.Series) – Pandas series - column of some feature type.

Returns:

DataFrame with 4 columns ‘Warning’, ‘Message’, ‘Metric’, ‘Value’ and 2 rows, where the first row is the count of missing values and the second is the percentage of missing values.

Return type:

pd.DataFrame
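A minimal sketch of the missing-values rule, assuming a strict "greater than 5 percent" comparison as the description states. The function name, labels, and the empty result below the threshold are hypothetical, and plain lists stand in for `pd.Series`.

```python
import math
from typing import Any, Dict, List, Sequence


def is_missing(value: Any) -> bool:
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def missing_values_rows(values: Sequence[Any], threshold_pct: float = 5.0) -> List[Dict[str, Any]]:
    if not values:
        return []
    count = sum(1 for value in values if is_missing(value))
    pct = 100 * count / len(values)
    if pct <= threshold_pct:  # exactly 5 percent does not trigger the warning
        return []
    return [
        {"Warning": "Missing", "Message": f"{count} missing values",
         "Metric": "Count", "Value": count},
        {"Warning": "Missing", "Message": f"{pct:.1f}% missing values",
         "Metric": "Percentage", "Value": pct},
    ]


one_missing = [1.0] * 19 + [float("nan")]        # 5 percent: no warning
two_missing = [1.0] * 18 + [float("nan"), None]  # 10 percent: two rows
print(missing_values_rows(one_missing))
print(missing_values_rows(two_missing))
```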

ads.feature_engineering.feature_type.handler.warnings.skew_handler(s: Series) DataFrame[source]#

Warning if the absolute value of the skew is greater than 1.

Parameters:

s (pd.Series) – Pandas series - column of some feature type, expects continuous values.

Returns:

DataFrame with 4 columns ‘Warning’, ‘Message’, ‘Metric’, ‘Value’ and 1 row, which lists the skew value of that column.

Return type:

pd.DataFrame
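The skew check can be illustrated with a standalone calculation. This sketch assumes the adjusted Fisher-Pearson sample skewness (the bias-corrected form, which is believed to match what `pd.Series.skew` reports); whether the handler uses exactly that estimator is not stated here. Function names, labels, and the empty result below the threshold are hypothetical.

```python
import math
from typing import Any, Dict, List, Sequence


def sample_skew(values: Sequence[float]) -> float:
    """Adjusted Fisher-Pearson skewness: sqrt(n(n-1)) / (n-2) * m3 / m2**1.5."""
    n = len(values)
    if n < 3:
        return float("nan")  # not defined for fewer than three points
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # constant data has no skew
    return math.sqrt(n * (n - 1)) / (n - 2) * m3 / m2 ** 1.5


def skew_rows(values: Sequence[float], threshold: float = 1.0) -> List[Dict[str, Any]]:
    skew = sample_skew(values)
    if not abs(skew) > threshold:  # also false when skew is NaN
        return []
    return [{"Warning": "Skew", "Message": f"Skew of {skew:.3f}",
             "Metric": "Skew", "Value": skew}]


print(skew_rows([1, 2, 3, 4, 5]))   # symmetric data: no warning
print(skew_rows([1, 1, 1, 1, 10]))  # long right tail: one row
```

For `[1, 1, 1, 1, 10]` the moments work out to m2 = 12.96 and m3 = 69.984, so the uncorrected skewness is 1.5 and the corrected value is sqrt(20) / 3 × 1.5 = sqrt(5) ≈ 2.236, which exceeds the threshold of 1.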

ads.feature_engineering.feature_type.handler.warnings.zeros_handler(s: Series) DataFrame[source]#

Warning if more than 10 percent of the values in the series are zeros.

Parameters:

s (pd.Series) – Pandas series - column of some feature type.

Returns:

DataFrame with 4 columns ‘Warning’, ‘Message’, ‘Metric’, ‘Value’ and 2 rows, where the first row is the count of zero values and the second is the percentage of zero values.

Return type:

pd.DataFrame
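A short sketch of the zeros rule, again using plain lists in place of `pd.Series`. The strict "greater than 10 percent" comparison follows the description; the function name, labels, and the empty result below the threshold are assumptions.

```python
from typing import Any, Dict, List, Sequence


def zeros_rows(values: Sequence[Any], threshold_pct: float = 10.0) -> List[Dict[str, Any]]:
    if not values:
        return []
    count = sum(1 for value in values if value == 0)
    pct = 100 * count / len(values)
    if pct <= threshold_pct:  # exactly 10 percent does not trigger the warning
        return []
    # Row order follows the documented layout: count first, then percentage.
    return [
        {"Warning": "Zeros", "Message": f"{count} zeros",
         "Metric": "Count", "Value": count},
        {"Warning": "Zeros", "Message": f"{pct:.1f}% zeros",
         "Metric": "Percentage", "Value": pct},
    ]


print(zeros_rows([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]))  # 10 percent zeros: no warning
print(zeros_rows([0, 0, 1, 2, 3, 4, 5, 6, 7, 8]))  # 20 percent zeros: two rows
```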

Module contents#