ads.common package

Submodules

ads.common.card_identifier module

Credit card patterns; refer to https://en.wikipedia.org/wiki/Payment_card_number#Issuer_identification_number_(IIN)

Active and frequent card information:

  • American Express: 34, 37

  • Diners Club (US & Canada): 54, 55

  • Discover Card: 6011, 622126-622925, 624000-626999, 628200-628899, 64, 65

  • Master Card: 2221-2720, 51-55

  • Visa: 4

class ads.common.card_identifier.card_identify

Bases: object

identify_issue_network(card_number)

Returns the type of credit card based on its digits

Parameters

card_number (String) –

Returns

A string corresponding to the kind of credit card.

Return type

String
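The prefix matching described for this module can be sketched in plain Python. This is a hypothetical re-implementation for illustration, not the ads source: the function name identify_network, the rule table, and the "Unknown" fallback are assumptions. Because the Diners Club prefixes 54 and 55 overlap the Master Card range 51-55 in the list above, the sketch simply reports those as Master Card.

```python
# Hypothetical sketch; the prefix ranges come from the IIN list in this module's description.
IIN_RULES = [
    # (network, inclusive (low, high) prefix ranges)
    ("American Express", [(34, 34), (37, 37)]),
    ("Discover Card", [(6011, 6011), (622126, 622925), (624000, 626999),
                       (628200, 628899), (64, 64), (65, 65)]),
    ("Master Card", [(2221, 2720), (51, 55)]),  # 54/55 also listed for Diners Club
    ("Visa", [(4, 4)]),
]


def identify_network(card_number: str) -> str:
    """Return the issuing network name for a card number, or "Unknown"."""
    digits = "".join(ch for ch in card_number if ch.isdigit())
    for network, ranges in IIN_RULES:
        for low, high in ranges:
            width = len(str(low))          # number of leading digits to compare
            prefix = digits[:width]
            if len(prefix) == width and low <= int(prefix) <= high:
                return network
    return "Unknown"


visa = identify_network("4111 1111 1111 1111")
amex = identify_network("378282246310005")
```

Comparing integer ranges on a fixed-width prefix keeps ranges such as 622126-622925 readable without listing every prefix.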

ads.common.auth module

ads.common.auth.api_keys(oci_config: str = '/home/docs/.oci/config', profile: str = 'DEFAULT', client_kwargs: Optional[dict] = None) dict

Prepares authentication and extra arguments necessary for creating clients for different OCI services using API Keys.

Parameters
  • oci_config (str) – OCI authentication config file location. Default is $HOME/.oci/config.

  • profile (str) – Profile name to select from the config file. The default is DEFAULT.

  • client_kwargs (dict) – kwargs that are required to instantiate the Client if we need to override the defaults.

Returns

Contains keys - config, signer and client_kwargs.

  • The config contains the configuration loaded from oci_config.

  • The signer contains the signer object created from the api keys.

  • client_kwargs contains the client_kwargs that was passed in as input parameter.

Return type

dict

Examples

>>> from ads.common import auth as authutil
>>> from ads.common import oci_client as oc
>>> auth = authutil.api_keys(oci_config="/home/datascience/.oci/config", profile="TEST", client_kwargs={"timeout": 6000})
>>> oc.OCIClientFactory(**auth).object_storage # Creates Object storage client with timeout set to 6000 using API Key authentication
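As a rough illustration of why the returned dictionary can be unpacked with ** into a client factory, the sketch below uses stand-in objects only. fake_api_keys and FakeClientFactory are hypothetical names, nothing here calls OCI, and the real config and signer objects will differ.

```python
from typing import Optional


def fake_api_keys(profile: str = "DEFAULT", client_kwargs: Optional[dict] = None) -> dict:
    """Mimics the documented return shape: config, signer and client_kwargs."""
    return {
        "config": {"profile": profile},   # placeholder for the loaded configuration
        "signer": object(),               # placeholder for a signer object
        "client_kwargs": client_kwargs,   # passed through unchanged
    }


class FakeClientFactory:
    """Hypothetical factory that accepts the three documented keys as keyword arguments."""

    def __init__(self, config=None, signer=None, client_kwargs=None):
        self.config = config or {}
        self.signer = signer
        self.client_kwargs = client_kwargs or {}


auth = fake_api_keys(profile="TEST", client_kwargs={"timeout": 6000})
factory = FakeClientFactory(**auth)  # each dict key becomes a keyword argument
```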
ads.common.auth.default_signer(client_kwargs=None)

Prepares authentication and extra arguments necessary for creating clients for different OCI services based on the default authentication setting for the session. Refer to the ads.set_auth API for further reference.

Parameters

client_kwargs (dict) – kwargs that are required to instantiate the Client if we need to override the defaults.

Returns

Contains keys - config, signer and client_kwargs.

  • The config contains the configuration loaded from the default location if the default auth mode is API keys; otherwise it is an empty dictionary.

  • The signer contains the signer object created from default auth mode.

  • client_kwargs contains the client_kwargs that was passed in as input parameter.

Return type

dict

Examples

>>> from ads.common import auth as authutil
>>> from ads.common import oci_client as oc
>>> auth = authutil.default_signer()
>>> oc.OCIClientFactory(**auth).object_storage # Creates Object storage client
ads.common.auth.get_signer(oci_config=None, oci_profile=None, **client_kwargs)
ads.common.auth.resource_principal(client_kwargs=None)

Prepares authentication and extra arguments necessary for creating clients for different OCI services using Resource Principals.

Parameters

client_kwargs (dict) – kwargs that are required to instantiate the Client if we need to override the defaults.

Returns

Contains keys - config, signer and client_kwargs.

  • The config contains an empty dictionary.

  • The signer contains the signer object created from the resource principal.

  • client_kwargs contains the client_kwargs that was passed in as input parameter.

Return type

dict

Examples

>>> from ads.common import auth as authutil
>>> from ads.common import oci_client as oc
>>> auth = authutil.resource_principal({"timeout": 6000})
>>> oc.OCIClientFactory(**auth).object_storage # Creates Object Storage client with timeout set to 6000 seconds using resource principal authentication

ads.common.data module

class ads.common.data.ADSData(X=None, y=None, name='', dataset_type=None)

Bases: object

This class wraps the input dataframe to various models, evaluation, and explanation frameworks. Its primary purpose is to hold any metadata relevant to these tasks. This can include its:

  • X - the independent variables as some dataframe-like structure,

  • y - the dependent variable or target column as some array-like structure,

  • name - a string to name the data for user convenience,

  • dataset_type - the type of the X value.

As part of this initiative, ADSData knows how to turn itself into an onnxruntime compatible data structure with the method .to_onnxrt(), which takes an onnx session as input.

Parameters
  • X (Union[pandas.DataFrame, dask.DataFrame, numpy.ndarray, scipy.sparse.csr.csr_matrix]) – If str, URI for the dataset. The dataset could be read from local or network file system, hdfs, s3 and gcs. Should be None if X_train, y_train, X_test, Y_test are provided.

  • y (Union[str, pandas.DataFrame, dask.DataFrame, pandas.Series, dask.Series, numpy.ndarray]) – If str, name of the target in X, otherwise series of labels corresponding to X

  • name (str, optional) – Name to identify this data

  • dataset_type (ADSDataset, optional) – When this value is available, would be used to evaluate the ads task type

  • kwargs – Additional keyword arguments that would be passed to the underlying Pandas read API.

static build(X=None, y=None, name='', dataset_type=None, **kwargs)

Returns an ADSData object built from the (source, target) or (X,y)

Parameters
  • X (Union[pandas.DataFrame, dask.DataFrame, numpy.ndarray, scipy.sparse.csr.csr_matrix]) – If str, URI for the dataset. The dataset could be read from local or network file system, hdfs, s3 and gcs. Should be None if X_train, y_train, X_test, Y_test are provided.

  • y (Union[str, pandas.DataFrame, dask.DataFrame, pandas.Series, dask.Series, numpy.ndarray]) – If str, name of the target in X, otherwise series of labels corresponding to X

  • name (str, optional) – Name to identify this data

  • dataset_type (ADSDataset, optional) – When this value is available, would be used to evaluate the ads task type

  • kwargs – Additional keyword arguments that would be passed to the underlying Pandas read API.

Returns

ads_data – A built ADSData object

Return type

ads.common.data.ADSData

Examples

>>> data = open_csv("my.csv")
>>> data_ads = ADSData.build(data, 'target')
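The rule for y ("if str, name of the target in X, otherwise series of labels corresponding to X") can be pictured with a small pure-Python sketch. It uses a list of row dictionaries instead of a dataframe, and split_target is a hypothetical helper; in particular, removing the target column from X is an assumption of this sketch, not confirmed behavior of ADSData.

```python
def split_target(X, y):
    """X: list of row dicts; y: a column name (str) or a sequence of labels."""
    if isinstance(y, str):
        # y names a column in X: pull it out as the labels.
        labels = [row[y] for row in X]
        features = [{k: v for k, v in row.items() if k != y} for row in X]
        return features, labels
    # Otherwise y is already a sequence of labels aligned with the rows of X.
    if len(y) != len(X):
        raise ValueError("labels must match the number of rows in X")
    return X, list(y)


rows = [{"age": 31, "target": 1}, {"age": 45, "target": 0}]
features, labels = split_target(rows, "target")
```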
to_onnxrt(sess, idx_range=None, model=None, impute_values={}, **kwargs)

Returns itself formatted as an input for the onnxruntime session inputs passed in.

Parameters
  • sess (Session) – The session object

  • idx_range (Range) – The range of inputs to convert to onnx

  • model (SupportedModel) – A model that supports being serialized for the onnx runtime.

  • kwargs (additional keyword arguments) –

    • sess_inputs - Pass in the output from onnxruntime.InferenceSession(“model.onnx”).get_inputs()

    • input_dtypes (list) - If sess_inputs cannot be passed in, pass in the numpy dtypes of each input

    • input_shapes (list) - If sess_inputs cannot be passed in, pass in the shape of each input

    • input_names (list) -If sess_inputs cannot be passed in, pass in the name of each input

Returns

ort – array of inputs formatted for the given session.

Return type

Array

ads.common.model module

class ads.common.model.ADSModel(est, target=None, transformer_pipeline=None, client=None, booster=None, classes=None, name=None)

Bases: object

Construct an ADSModel

Parameters
  • est (fitted estimator object) – The estimator can be a standard sklearn estimator, a keras, lightgbm, or xgboost estimator, or any other object that implements methods from (BaseEstimator, RegressorMixin) for regression or (BaseEstimator, ClassifierMixin) for classification.

  • target (PandasSeries) – The target column you are using in your dataset, this is assigned as the “y” attribute.

  • transformer_pipeline (TransformerPipeline) – A custom transformer pipeline object.

  • client (Str) – Currently unused.

  • booster (Str) – Currently unused.

  • classes (list, optional) – List of target classes. Required for classification problem if the est does not contain classes attribute.

  • name (str, optional) – Name of the model.

static convert_dataframe_schema(df, drop=None)
feature_names(X=None)
static from_estimator(est, transformers=None, classes=None, name=None)

Build ADSModel from a fitted estimator

Parameters
  • est (fitted estimator object) – The estimator can be a standard sklearn estimator or any object that implements methods from (BaseEstimator, RegressorMixin) for regression or (BaseEstimator, ClassifierMixin) for classification.

  • transformers (a scalar or an iterable of objects implementing transform function, optional) – The transform function would be applied on data before calling predict and predict_proba on estimator.

  • classes (list, optional) – List of target classes. Required for classification problem if the est does not contain classes attribute.

  • name (str, optional) – Name of the model.

Returns

model

Return type

ads.common.model.ADSModel

Examples

>>> model = MyModelClass.train()
>>> model_ads = ADSModel.from_estimator(model)
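The transformers parameter is described as a scalar or an iterable whose transform function is applied to the data before predict is called. A minimal sketch of that flow follows; ModelWrapperSketch and the toy classes are hypothetical and only mirror the documented behavior, not the ADSModel implementation.

```python
class ModelWrapperSketch:
    """Hypothetical wrapper: applies transformers in order, then delegates to the estimator."""

    def __init__(self, est, transformers=None):
        self.est = est
        if transformers is None:
            transformers = []
        elif hasattr(transformers, "transform"):
            transformers = [transformers]  # a scalar transformer becomes a one-item list
        self.transformers = list(transformers)

    def transform(self, X):
        for transformer in self.transformers:
            X = transformer.transform(X)
        return X

    def predict(self, X):
        return self.est.predict(self.transform(X))


class AddOne:
    def transform(self, X):
        return [x + 1 for x in X]


class Double:
    def transform(self, X):
        return [x * 2 for x in X]


class ThresholdEstimator:
    def predict(self, X):
        return [int(x > 5) for x in X]


model = ModelWrapperSketch(ThresholdEstimator(), transformers=[AddOne(), Double()])
predictions = model.predict([1, 2, 3])  # [1, 2, 3] -> [2, 3, 4] -> [4, 6, 8] -> [0, 1, 1]
```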
static get_init_types(df, underlying_model=None)
is_classifier()

Returns True if ADS believes that the model is a classifier.

Returns

True if the model is a classifier, False otherwise.

Return type

Boolean

predict(X)

Runs the model's predict function on some data

Parameters

X (MLData) – A MLData object which holds the examples to be predicted on.

Returns

Usually a list or PandasSeries of predictions

Return type

Union[List, pandas.Series], depending on the estimator

predict_proba(X)

Runs the model's predict probabilities function on some data

Parameters

X (MLData) – A MLData object which holds the examples to be predicted on.

Returns

Usually a list or PandasSeries of predictions

Return type

Union[List, pandas.Series], depending on the estimator

prepare(target_dir=None, data_sample=None, X_sample=None, y_sample=None, include_data_sample=False, force_overwrite=False, fn_artifact_files_included=False, fn_name='model_api', inference_conda_env=None, data_science_env=False, ignore_deployment_error=False, use_case_type=None, inference_python_version=None, imputed_values={}, **kwargs)

Prepare model artifact directory to be published to model catalog

Parameters
  • target_dir (str, default: model.name[:12]) – Target directory under which the model artifact files need to be added

  • data_sample (ADSData) – Note: This format is preferable to X_sample and y_sample. A sample of the test data that will be provided to the predict() API of the scoring script. Used to generate schema_input.json and schema_output.json, which define the input and output formats.

  • X_sample (pandas.DataFrame) – A sample of input data that will be provided to the predict() API of the scoring script. Used to generate schema.json, which defines the input formats.

  • y_sample (pandas.Series) – A sample of output data that is expected to be returned by the predict() API of the scoring script, corresponding to X_sample. Used to generate schema_output.json, which defines the output formats.

  • force_overwrite (bool, default: False) – If True, overwrites the target directory if exists already

  • fn_artifact_files_included (bool, default: False) – If True, generates artifacts to export a model as a function without ads dependency

  • fn_name (str, default: 'model_api') – Required parameter if fn_artifact_files_included parameter is setup.

  • inference_conda_env (str, default: None) – Conda environment to use within the model deployment service for inferencing

  • data_science_env (bool, default: False) – If set to True, datascience environment represented by the slug in the training conda environment will be used.

  • ignore_deployment_error (bool, default: False) – If set to True, the prepare will ignore all the errors that may impact model deployment

  • use_case_type (str) – The use case type of the model. Use it through the UseCaseType class or a string provided in UseCaseType. For example, use_case_type=UseCaseType.BINARY_CLASSIFICATION or use_case_type="binary_classification". Check the UseCaseType class to see all supported types.

  • inference_python_version (str, default:None.) – If provided will be added to the generated runtime yaml

  • **kwargs – Additional keyword arguments:

    • max_col_num ((int, optional). Defaults to utils.DATA_SCHEMA_MAX_COL_NUM.) – The maximum number of columns in the data for which the schema can be generated automatically.

Returns

model_artifact – An instance of ModelArtifact that can be used to test the generated scoring script.

Return type

ModelArtifact

rename(name)

Changes the name of a model

Parameters

name (str) – A string which is supplied for naming a model.

score(X, y_true, score_fn=None)

Scores a model according to a custom score function

Parameters
  • X (MLData) – A MLData object which holds the examples to be predicted on.

  • y_true (MLData) – A MLData object which holds ground truth labels for the examples which are being predicted on.

  • score_fn (Scorer (callable)) – A callable object that returns a score, usually created with sklearn.metrics.make_scorer().

Returns

Almost always a scalar score (usually a float).

Return type

float, depending on the estimator
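Scorers created with sklearn.metrics.make_scorer() are callables invoked as scorer(estimator, X, y_true). The sketch below hand-writes a scorer with that call shape so it runs without scikit-learn. The score helper is a hypothetical stand-in; that ADSModel.score delegates to score_fn in exactly this way is an assumption.

```python
def accuracy_scorer(estimator, X, y_true):
    """Hand-written scorer using the (estimator, X, y_true) call shape."""
    y_pred = estimator.predict(X)
    correct = sum(1 for pred, truth in zip(y_pred, y_true) if pred == truth)
    return correct / len(y_true)


class SignEstimator:
    """Toy estimator: predicts 1 for non-negative inputs, 0 otherwise."""

    def predict(self, X):
        return [1 if x >= 0 else 0 for x in X]


def score(model, X, y_true, score_fn):
    """Hypothetical stand-in for the score() flow: delegate to the scorer."""
    return score_fn(model, X, y_true)


# Predictions are [0, 0, 1, 1]; three of the four match y_true.
result = score(SignEstimator(), [-2, -1, 3, 4], [0, 1, 1, 1], accuracy_scorer)
```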

show_in_notebook()

Describe the model by showing its properties

summary()

A summary of the ADSModel

transform(X)

Process some MLData through the selected ADSModel transformers

Parameters

X (MLData) – A MLData object which holds the examples to be transformed.

visualize_transforms()

A graph of the ADSModel transformer pipeline. It is only supported in JupyterLab notebooks.

ads.common.model_metadata module

class ads.common.model_metadata.ExtendedEnumMeta(name, bases, namespace, **kwargs)

Bases: ABCMeta

The helper metaclass to extend functionality of a general class.

values(cls) list

Gets the list of class attributes.

values() list

Gets the list of class attributes.

Returns

The list of class values.

Return type

list
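One way a metaclass can expose values() on the class itself is sketched below. EnumMetaSketch and Color are hypothetical names, and filtering on public, non-callable attributes is an assumption about how "class attributes" are selected; the actual ExtendedEnumMeta may differ.

```python
from abc import ABCMeta


class EnumMetaSketch(ABCMeta):
    """Hypothetical metaclass exposing the public attribute values of a class."""

    def values(cls) -> list:
        # vars(cls) keeps definition order; skip private names and methods.
        return [
            value
            for name, value in vars(cls).items()
            if not name.startswith("_") and not callable(value)
        ]


class Color(str, metaclass=EnumMetaSketch):
    RED = "red"
    GREEN = "green"


color_values = Color.values()
```

Because values is defined on the metaclass, it is called on the class (Color.values()) rather than on an instance, which matches how the string-constant classes in this module are used.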

class ads.common.model_metadata.Framework

Bases: str

BERT = 'bert'
CUML = 'cuml'
EMCEE = 'emcee'
ENSEMBLE = 'ensemble'
FLAIR = 'flair'
GENSIM = 'gensim'
H20 = 'h2o'
KERAS = 'keras'
LIGHT_GBM = 'lightgbm'
MXNET = 'mxnet'
NLTK = 'nltk'
ORACLE_AUTOML = 'oracle_automl'
OTHER = 'other'
PROPHET = 'prophet'
PYMC3 = 'pymc3'
PYOD = 'pyod'
PYSTAN = 'pystan'
PYTORCH = 'pytorch'
SCIKIT_LEARN = 'scikit-learn'
SKTIME = 'sktime'
SPACY = 'spacy'
STATSMODELS = 'statsmodels'
TENSORFLOW = 'tensorflow'
TRANSFORMERS = 'transformers'
WORD2VEC = 'word2vec'
XGBOOST = 'xgboost'
class ads.common.model_metadata.MetadataCustomCategory

Bases: str

OTHER = 'Other'
PERFORMANCE = 'Performance'
TRAINING_AND_VALIDATION_DATASETS = 'Training and Validation Datasets'
TRAINING_ENV = 'Training Environment'
TRAINING_PROFILE = 'Training Profile'
class ads.common.model_metadata.MetadataCustomKeys

Bases: str

CLIENT_LIBRARY = 'ClientLibrary'
CONDA_ENVIRONMENT = 'CondaEnvironment'
CONDA_ENVIRONMENT_PATH = 'CondaEnvironmentPath'
ENVIRONMENT_TYPE = 'EnvironmentType'
MODEL_ARTIFACTS = 'ModelArtifacts'
MODEL_SERIALIZATION_FORMAT = 'ModelSerializationFormat'
SLUG_NAME = 'SlugName'
TRAINING_DATASET = 'TrainingDataset'
TRAINING_DATASET_NUMBER_OF_COLS = 'TrainingDatasetNumberOfCols'
TRAINING_DATASET_NUMBER_OF_ROWS = 'TrainingDatasetNumberOfRows'
TRAINING_DATASET_SIZE = 'TrainingDatasetSize'
VALIDATION_DATASET = 'ValidationDataset'
VALIDATION_DATASET_NUMBER_OF_COLS = 'ValidationDataSetNumberOfCols'
VALIDATION_DATASET_NUMBER_OF_ROWS = 'ValidationDatasetNumberOfRows'
VALIDATION_DATASET_SIZE = 'ValidationDatasetSize'
class ads.common.model_metadata.MetadataCustomPrintColumns

Bases: str

CATEGORY = 'Category'
DESCRIPTION = 'Description'
KEY = 'Key'
VALUE = 'Value'
exception ads.common.model_metadata.MetadataDescriptionTooLong(key: str, length: int)

Bases: ValueError

Maximum allowed length of metadata description has been exceeded. See https://docs.oracle.com/en-us/iaas/data-science/using/models_saving_catalog.htm for more details.

exception ads.common.model_metadata.MetadataSizeTooLarge(size: int)

Bases: ValueError

Maximum allowed size for model metadata has been exceeded. See https://docs.oracle.com/en-us/iaas/data-science/using/models_saving_catalog.htm for more details.

class ads.common.model_metadata.MetadataTaxonomyKeys

Bases: str

ALGORITHM = 'Algorithm'
ARTIFACT_TEST_RESULT = 'ArtifactTestResults'
FRAMEWORK = 'Framework'
FRAMEWORK_VERSION = 'FrameworkVersion'
HYPERPARAMETERS = 'Hyperparameters'
USE_CASE_TYPE = 'UseCaseType'
class ads.common.model_metadata.MetadataTaxonomyPrintColumns

Bases: str

KEY = 'Key'
VALUE = 'Value'
exception ads.common.model_metadata.MetadataValueTooLong(key: str, length: int)

Bases: ValueError

Maximum allowed length of metadata value has been exceeded. See https://docs.oracle.com/en-us/iaas/data-science/using/models_saving_catalog.htm for more details.

class ads.common.model_metadata.ModelCustomMetadata

Bases: ModelMetadata

Class that represents Model Custom Metadata.

get(self, key: str) ModelCustomMetadataItem

Returns the model metadata item by provided key.

reset(self) None

Resets all model metadata items to empty values.

to_dataframe(self) pd.DataFrame

Returns the model metadata list in a data frame format.

size(self) int

Returns the size of the model metadata in bytes.

validate(self) bool

Validates metadata.

to_dict(self)

Serializes model metadata into a dictionary.

to_yaml(self)

Serializes model metadata into a YAML.

add(self, key: str, value: str, description: str = '', category: str = MetadataCustomCategory.OTHER, replace: bool = False) None

Adds a new model metadata item. Replaces existing one if replace flag is True.

remove(self, key: str) None

Removes a model metadata item by key.

clear(self) None

Removes all metadata items.

isempty(self) bool

Checks if metadata is empty.

to_json(self)

Serializes model metadata into a JSON.

to_json_file(self, file_path: str, storage_options: dict = None) None

Saves the metadata to a local file or object storage.

Examples

>>> metadata_custom = ModelCustomMetadata()
>>> metadata_custom.add(key="format", value="pickle")
>>> metadata_custom.add(key="note", value="important note", description="some description")
>>> metadata_custom["format"].description = "some description"
>>> metadata_custom.to_dataframe()
                    Key              Value         Description      Category
----------------------------------------------------------------------------
0                format             pickle    some description  user defined
1                  note     important note    some description  user defined
>>> metadata_custom
    metadata:
    - category: user defined
      description: some description
      key: format
      value: pickle
    - category: user defined
      description: some description
      key: note
      value: important note
>>> metadata_custom.remove("format")
>>> metadata_custom
    metadata:
    - category: user defined
      description: some description
      key: note
      value: important note
>>> metadata_custom.to_dict()
    {'metadata': [{
            'key': 'note',
            'value': 'important note',
            'category': 'user defined',
            'description': 'some description'
        }]}
>>> metadata_custom.reset()
>>> metadata_custom
    metadata:
    - category: None
      description: None
      key: note
      value: None
>>> metadata_custom.clear()
>>> metadata_custom.to_dataframe()
                    Key              Value         Description      Category
----------------------------------------------------------------------------

Initializes custom model metadata.

add(key: str, value: str, description: str = '', category: str = 'Other', replace: bool = False) None

Adds a new model metadata item. Overrides the existing one if replace flag is True.

Parameters
  • key (str) – The metadata item key.

  • value (str) – The metadata item value.

  • description (str) – The metadata item description.

  • category (str) – The metadata item category.

  • replace (bool) – Overrides the existing metadata item if replace flag is True.

Returns

Nothing.

Return type

None

Raises
  • TypeError – If provided key is not a string. If provided description is not a string.

  • ValueError – If provided key is empty. If provided value is empty. If provided value cannot be serialized to JSON. If item with provided key is already registered and replace flag is False. If provided category is not supported.

  • MetadataValueTooLong – If the length of provided value exceeds 255 characters.

  • MetadataDescriptionTooLong – If the length of provided description exceeds 255 characters.
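The checks listed under Raises can be sketched as a standalone function. Everything here is illustrative: check_item, ValueTooLong and DescriptionTooLong are hypothetical stand-ins, the category check is omitted, and measuring non-string values by their JSON form is an assumption.

```python
import json

MAX_LENGTH = 255  # limit quoted in the Raises list above


class ValueTooLong(ValueError):
    """Stand-in for MetadataValueTooLong."""


class DescriptionTooLong(ValueError):
    """Stand-in for MetadataDescriptionTooLong."""


def check_item(items: dict, key, value, description="", replace=False) -> None:
    """Validate one metadata item and store it in items."""
    if not isinstance(key, str):
        raise TypeError("key must be a string")
    if not isinstance(description, str):
        raise TypeError("description must be a string")
    if not key:
        raise ValueError("key must not be empty")
    if value is None or value == "":
        raise ValueError("value must not be empty")
    try:
        serialized = json.dumps(value)
    except TypeError as exc:
        raise ValueError("value cannot be serialized to JSON") from exc
    if key in items and not replace:
        raise ValueError(f"item '{key}' already exists; pass replace=True")
    text = value if isinstance(value, str) else serialized
    if len(text) > MAX_LENGTH:
        raise ValueTooLong(key)
    if len(description) > MAX_LENGTH:
        raise DescriptionTooLong(key)
    items[key] = {"value": value, "description": description}


items = {}
check_item(items, "format", "pickle")
check_item(items, "format", "json", replace=True)  # overrides the first item
```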

clear() None

Removes all metadata items.

Returns

Nothing.

Return type

None

isempty() bool

Checks if metadata is empty.

Returns

True if metadata is empty, False otherwise.

Return type

bool

remove(key: str) None

Removes a model metadata item.

Parameters

key (str) – The key of the metadata item that should be removed.

Returns

Nothing.

Return type

None

set_training_data(path: str, data_size: Optional[str] = None)

Adds training_data path and data size information into model custom metadata.

Parameters
  • path (str) – The path where the training_data is stored.

  • data_size (str) – The size of the training_data.

Returns

Nothing.

Return type

None

set_validation_data(path: str, data_size: Optional[str] = None)

Adds validation_data path and data size information into model custom metadata.

Parameters
  • path (str) – The path where the validation_data is stored.

  • data_size (str) – The size of the validation_data.

Returns

Nothing.

Return type

None

to_dataframe() DataFrame

Returns the model metadata list in a data frame format.

Returns

The model metadata in a dataframe format.

Return type

pandas.DataFrame

class ads.common.model_metadata.ModelCustomMetadataItem(key: str, value: Optional[str] = None, description: Optional[str] = None, category: Optional[str] = None)

Bases: ModelTaxonomyMetadataItem

Class that represents model custom metadata item.

key

The model metadata item key.

Type

str

value

The model metadata item value.

Type

str

description

The model metadata item description.

Type

str

category

The model metadata item category.

Type

str

reset(self) None

Resets model metadata item.

to_dict(self) dict

Serializes model metadata item to dictionary.

to_yaml(self)

Serializes model metadata item to YAML.

size(self) int

Returns the size of the metadata in bytes.

update(self, value: str = '', description: str = '', category: str = '') None

Updates metadata item information.

to_json(self) JSON

Serializes metadata item into a JSON.

to_json_file(self, file_path: str, storage_options: dict = None) None

Saves the metadata item value to a local file or object storage.

validate(self) bool

Validates metadata item.

property category: str
property description: str
reset() None

Resets model metadata item.

Resets value, description and category to None.

Returns

Nothing.

Return type

None

update(value: str, description: str, category: str) None

Updates metadata item.

Parameters
  • value (str) – The value of model metadata item.

  • description (str) – The description of model metadata item.

  • category (str) – The category of model metadata item.

Returns

Nothing.

Return type

None

validate() bool

Validates metadata item.

Returns

True if validation passed.

Return type

bool

Raises
  • ValueError – If invalid category provided.

  • MetadataValueTooLong – If value exceeds the length limit.

class ads.common.model_metadata.ModelMetadata

Bases: ABC

The base abstract class representing model metadata.

get(self, key: str) ModelMetadataItem

Returns the model metadata item by provided key.

reset(self) None

Resets all model metadata items to empty values.

to_dataframe(self) pd.DataFrame

Returns the model metadata list in a data frame format.

size(self) int

Returns the size of the model metadata in bytes.

validate(self) bool

Validates metadata.

to_dict(self)

Serializes model metadata into a dictionary.

to_yaml(self)

Serializes model metadata into a YAML.

to_json(self)

Serializes model metadata into a JSON.

to_json_file(self, file_path: str, storage_options: dict = None) None

Saves the metadata to a local file or object storage.

Initializes Model Metadata.

get(key: str) ModelMetadataItem

Returns the model metadata item by provided key.

Parameters

key (str) – The key of model metadata item.

Returns

The model metadata item.

Return type

ModelMetadataItem

Raises

ValueError – If provided key is empty or metadata item not found.

property keys: Tuple[str]

Returns all registered metadata keys.

Returns

The list of metadata keys.

Return type

Tuple[str]

reset() None

Resets all model metadata items to empty values.

Resets value, description and category to None for every metadata item.
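The difference between reset() here and clear() on ModelCustomMetadata is easy to miss: reset keeps every key and blanks its fields, while clear removes the items. A hypothetical sketch (ItemSketch and MetadataSketch are illustrative names, not the library classes):

```python
class ItemSketch:
    def __init__(self, key, value=None, description=None, category=None):
        self.key = key
        self.value = value
        self.description = description
        self.category = category

    def reset(self):
        # The key survives; every other field goes back to None.
        self.value = self.description = self.category = None


class MetadataSketch:
    def __init__(self):
        self._items = {}

    def add(self, item):
        self._items[item.key] = item

    @property
    def keys(self):
        return tuple(self._items)

    def reset(self):
        for item in self._items.values():
            item.reset()

    def clear(self):
        self._items.clear()


meta = MetadataSketch()
meta.add(ItemSketch("note", value="important note", category="Other"))

meta.reset()
keys_after_reset = meta.keys                    # the key is still registered
value_after_reset = meta._items["note"].value   # but its value is now None

meta.clear()
keys_after_clear = meta.keys                    # no items remain
```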

size() int

Returns the size of the model metadata in bytes.

Returns

The size of model metadata in bytes.

Return type

int

abstract to_dataframe() DataFrame

Returns the model metadata list in a data frame format.

Returns

The model metadata in a dataframe format.

Return type

pandas.DataFrame

to_dict()

Serializes model metadata into a dictionary.

Returns

The model metadata in a dictionary representation.

Return type

Dict

to_json()

Serializes model metadata into a JSON.

Returns

The model metadata in a JSON representation.

Return type

JSON

to_json_file(file_path: str, storage_options: Optional[dict] = None) None

Saves the metadata to a local file or object storage.

Parameters
  • file_path (str) – The file path to store the data, for example “oci://bucket_name@namespace/folder_name/”, “oci://bucket_name@namespace/folder_name/metadata.json”, “path/to/local/folder”, or “path/to/local/folder/metadata.json”.

  • storage_options (dict. Default None) – Parameters passed on to the backend filesystem class. Defaults to options set using DatasetFactory.set_default_storage().

Returns

Nothing.

Return type

None

Raises
  • ValueError – When file path is empty.

  • TypeError – When file path is not a string.

Examples

>>> metadata = ModelTaxonomyMetadata()
>>> storage_options = {"config": oci.config.from_file(os.path.join("~/.oci", "config"))}
>>> storage_options
{'log_requests': False,
    'additional_user_agent': '',
    'pass_phrase': None,
    'user': '<user-id>',
    'fingerprint': '05:15:2b:b1:46:8a:32:ec:e2:69:5b:32:01:**:**:**',
    'tenancy': '<tenancy-id>',
    'region': 'us-ashburn-1',
    'key_file': '/home/datascience/.oci/oci_api_key.pem'}
>>> metadata.to_json_file(file_path = 'oci://bucket_name@namespace/folder_name/metadata_taxonomy.json', storage_options=storage_options)
>>> metadata.to_json_file("path/to/local/folder/metadata_taxonomy.json")
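The file_path parameter accepts either a folder or a full file name. A local-only sketch of that idea follows. to_json_file_sketch is hypothetical, the rule "no .json suffix means folder" and the default file name are assumptions, and object storage paths are not handled.

```python
import json
import os
import tempfile


def to_json_file_sketch(data: dict, file_path: str, default_name: str = "metadata.json") -> str:
    """Write data as JSON to a local path and return the file actually written."""
    if not isinstance(file_path, str):
        raise TypeError("file_path must be a string")
    if not file_path:
        raise ValueError("file_path must not be empty")
    # Assumption: a path without a .json suffix is treated as a folder.
    if not file_path.endswith(".json"):
        file_path = os.path.join(file_path, default_name)
    os.makedirs(os.path.dirname(file_path) or ".", exist_ok=True)
    with open(file_path, "w", encoding="utf-8") as f:
        json.dump(data, f)
    return file_path


with tempfile.TemporaryDirectory() as folder:
    written = to_json_file_sketch({"key": "note"}, folder)
    written_name = os.path.basename(written)
    with open(written, encoding="utf-8") as f:
        round_trip = json.load(f)
```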
to_yaml()

Serializes model metadata into a YAML.

Returns

The model metadata in a YAML representation.

Return type

Yaml

validate() bool

Validates model metadata.

Returns

True if metadata is valid.

Return type

bool

validate_size() bool

Validates model metadata size.

Validates the size of metadata. Throws an error if the size of the metadata exceeds expected value.

Returns

True if metadata size is valid.

Return type

bool

Raises

MetadataSizeTooLarge – If the size of the metadata exceeds expected value.
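A sketch of how size() and validate_size() could fit together. The definition of size as the byte length of the UTF-8 encoded JSON form is an assumption, SIZE_LIMIT_BYTES is a made-up limit for this example, and SizeTooLarge is a stand-in for MetadataSizeTooLarge.

```python
import json

SIZE_LIMIT_BYTES = 1024  # hypothetical limit chosen for this sketch


class SizeTooLarge(ValueError):
    """Stand-in for MetadataSizeTooLarge(size)."""

    def __init__(self, size: int):
        super().__init__(f"metadata is {size} bytes; limit is {SIZE_LIMIT_BYTES}")
        self.size = size


def metadata_size(metadata: dict) -> int:
    """Size in bytes of the UTF-8 encoded JSON form of the metadata."""
    return len(json.dumps(metadata).encode("utf-8"))


def validate_size(metadata: dict) -> bool:
    """Return True when the metadata fits the limit, otherwise raise SizeTooLarge."""
    size = metadata_size(metadata)
    if size > SIZE_LIMIT_BYTES:
        raise SizeTooLarge(size)
    return True


small_ok = validate_size({"key": "note", "value": "important note"})
```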

class ads.common.model_metadata.ModelMetadataItem

Bases: ABC

The base abstract class representing model metadata item.

to_dict(self) dict

Serializes model metadata item to dictionary.

to_yaml(self)

Serializes model metadata item to YAML.

size(self) int

Returns the size of the metadata in bytes.

to_json(self) JSON

Serializes metadata item to JSON.

to_json_file(self, file_path: str, storage_options: dict = None) None

Saves the metadata item value to a local file or object storage.

validate(self) bool

Validates metadata item.

size() int

Returns the size of the model metadata in bytes.

Returns

The size of model metadata in bytes.

Return type

int

to_dict() dict

Serializes model metadata item to dictionary.

Returns

The dictionary representation of model metadata item.

Return type

dict

to_json()

Serializes metadata item into a JSON.

Returns

The metadata item in a JSON representation.

Return type

JSON

to_json_file(file_path: str, storage_options: Optional[dict] = None) None

Saves the metadata item value to a local file or object storage.

Parameters
  • file_path (str) – The file path to store the data, for example “oci://bucket_name@namespace/folder_name/”, “oci://bucket_name@namespace/folder_name/result.json”, “path/to/local/folder”, or “path/to/local/folder/result.json”.

  • storage_options (dict. Default None) – Parameters passed on to the backend filesystem class. Defaults to options set using DatasetFactory.set_default_storage().

Returns

Nothing.

Return type

None

Raises
  • ValueError – When file path is empty.

  • TypeError – When file path is not a string.

Examples

>>> metadata_item = ModelCustomMetadataItem(key="key1", value="value1")
>>> storage_options = {"config": oci.config.from_file(os.path.join("~/.oci", "config"))}
>>> storage_options
{'log_requests': False,
    'additional_user_agent': '',
    'pass_phrase': None,
    'user': '<user-id>',
    'fingerprint': '05:15:2b:b1:46:8a:32:ec:e2:69:5b:32:01:**:**:**',
    'tenancy': '<tenancy-id>',
    'region': 'us-ashburn-1',
    'key_file': '/home/datascience/.oci/oci_api_key.pem'}
>>> metadata_item.to_json_file(file_path = 'oci://bucket_name@namespace/folder_name/file.json', storage_options=storage_options)
>>> metadata_item.to_json_file("path/to/local/folder/file.json")
to_yaml()

Serializes model metadata item to YAML.

Returns

The model metadata item in a YAML representation.

Return type

Yaml

abstract validate() bool

Validates metadata item.

Returns

True if validation passed.

Return type

bool

class ads.common.model_metadata.ModelProvenanceMetadata(repo: Optional[str] = None, git_branch: Optional[str] = None, git_commit: Optional[str] = None, repository_url: Optional[str] = None, training_script_path: Optional[str] = None, training_id: Optional[str] = None, artifact_dir: Optional[str] = None)

Bases: DataClassSerializable

ModelProvenanceMetadata class.

Examples

>>> provenance_metadata = ModelProvenanceMetadata.fetch_training_code_details()
ModelProvenanceMetadata(repo=<git.repo.base.Repo '/home/datascience/.git'>, git_branch='master', git_commit='99ad04c31803f1d4ffcc3bf4afbd6bcf69a06af2', repository_url='file:///home/datascience', "", "")
>>> provenance_metadata.assert_path_not_dirty("your_path", ignore=False)
artifact_dir: str = None
assert_path_not_dirty(path: str, ignore: bool)

Checks if all the changes in this path have been committed.

Parameters
  • path (str) – The path to check.

  • ignore (bool) – Whether to ignore the changes or not.

Raises

ChangesNotCommitted – If there are changes that have not been committed.

Returns

Nothing.

Return type

None

classmethod fetch_training_code_details(training_script_path: Optional[str] = None, training_id: Optional[str] = None, artifact_dir: Optional[str] = None)

Fetches the training code details: repo, git_branch, git_commit, repository_url, training_script_path and training_id.

Parameters
  • training_script_path ((str, optional). Defaults to None.) – Training script path.

  • training_id ((str, optional). Defaults to None.) – The training OCID for model.

  • artifact_dir (str) – artifact directory to store the files needed for deployment.

Returns

A ModelProvenanceMetadata instance.

Return type

ModelProvenanceMetadata

git_branch: str = None
git_commit: str = None
repo: str = None
repository_url: str = None
training_id: str = None
training_script_path: str = None
class ads.common.model_metadata.ModelTaxonomyMetadata

Bases: ModelMetadata

Class that represents Model Taxonomy Metadata.

get(self, key: str) ModelTaxonomyMetadataItem

Returns the model metadata item by provided key.

reset(self) None

Resets all model metadata items to empty values.

to_dataframe(self) pd.DataFrame

Returns the model metadata list in a data frame format.

size(self) int

Returns the size of the model metadata in bytes.

validate(self) bool

Validates metadata.

to_dict(self)

Serializes model metadata into a dictionary.

to_yaml(self)

Serializes model metadata into a YAML.

to_json(self)

Serializes model metadata into a JSON.

to_json_file(self, file_path: str, storage_options: dict = None) None

Saves the metadata to a local file or object storage.

Examples

>>> metadata_taxonomy = ModelTaxonomyMetadata()
>>> metadata_taxonomy.to_dataframe()
                Key                   Value
--------------------------------------------
0        UseCaseType   binary_classification
1          Framework                 sklearn
2   FrameworkVersion                   0.2.2
3          Algorithm               algorithm
4    Hyperparameters                      {}
>>> metadata_taxonomy.reset()
>>> metadata_taxonomy.to_dataframe()
                Key                    Value
--------------------------------------------
0        UseCaseType                    None
1          Framework                    None
2   FrameworkVersion                    None
3          Algorithm                    None
4    Hyperparameters                    None
>>> metadata_taxonomy
    metadata:
    - key: UseCaseType
      category: None
      description: None
      value: None

Initializes Model Metadata.

to_dataframe() DataFrame

Returns the model metadata list in a data frame format.

Returns

The model metadata in a dataframe format.

Return type

pandas.DataFrame

class ads.common.model_metadata.ModelTaxonomyMetadataItem(key: str, value: Optional[str] = None)

Bases: ModelMetadataItem

Class that represents model taxonomy metadata item.

key

The model metadata item key.

Type

str

value

The model metadata item value.

Type

str

reset(self) None

Resets model metadata item.

to_dict(self) dict

Serializes model metadata item to dictionary.

to_yaml(self)

Serializes model metadata item to YAML.

size(self) int

Returns the size of the metadata in bytes.

update(self, value: str = '') None

Updates metadata item information.

to_json(self) JSON

Serializes metadata item into a JSON.

to_json_file(self, file_path: str, storage_options: dict = None) None

Saves the metadata item value to a local file or object storage.

validate(self) bool

Validates metadata item.

property key: str
reset() None

Resets model metadata item.

Resets value to None.

Returns

Nothing.

Return type

None

update(value: str) None

Updates metadata item value.

Parameters

value (str) – The value of model metadata item.

Returns

Nothing.

Return type

None

validate() bool

Validates metadata item.

Returns

True if validation passed.

Return type

bool

Raises

ValueError – If invalid UseCaseType provided. If invalid Framework provided.

property value: str
class ads.common.model_metadata.UseCaseType

Bases: str

ANOMALY_DETECTION = 'anomaly_detection'
BINARY_CLASSIFICATION = 'binary_classification'
CLUSTERING = 'clustering'
DIMENSIONALITY_REDUCTION = 'dimensionality_reduction/representation'
IMAGE_CLASSIFICATION = 'image_classification'
MULTINOMIAL_CLASSIFICATION = 'multinomial_classification'
NER = 'ner'
OBJECT_LOCALIZATION = 'object_localization'
OTHER = 'other'
RECOMMENDER = 'recommender'
REGRESSION = 'regression'
SENTIMENT_ANALYSIS = 'sentiment_analysis'
TIME_SERIES_FORECASTING = 'time_series_forecasting'
TOPIC_MODELING = 'topic_modeling'
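
The validate() method above raises ValueError when an invalid UseCaseType is provided. The following is a minimal sketch of how such a check against a set of string constants could look. UseCaseTypeSketch, values() and validate_use_case_type are hypothetical names used only for illustration, and only a subset of the constants is reproduced; this is not the library's implementation.

```python
class UseCaseTypeSketch(str):
    """Hypothetical stand-in holding a subset of the constants listed above."""

    BINARY_CLASSIFICATION = "binary_classification"
    REGRESSION = "regression"
    OTHER = "other"

    @classmethod
    def values(cls):
        # Collect the upper-case string constants declared on the class.
        return [v for k, v in vars(cls).items() if k.isupper() and isinstance(v, str)]


def validate_use_case_type(value):
    """Return True for a known use case type, otherwise raise ValueError."""
    allowed = UseCaseTypeSketch.values()
    if value not in allowed:
        raise ValueError(f"Invalid use case type {value!r}. Choose from: {allowed}")
    return True


print(validate_use_case_type(UseCaseTypeSketch.REGRESSION))
```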

ads.common.decorator.runtime_dependency module

The module that provides the decorator helping to add runtime dependencies in functions.

Examples

>>> @runtime_dependency(module="pandas", short_name="pd")
... def test_function():
...     print(pd)
>>> @runtime_dependency(module="pandas", object="DataFrame", short_name="df")
... def test_function():
...     print(df)
>>> @runtime_dependency(module="pandas", short_name="pd")
... @runtime_dependency(module="pandas", object="DataFrame", short_name="df")
... def test_function():
...     print(df)
...     print(pd)
>>> @runtime_dependency(module="pandas", object="DataFrame", short_name="df", install_from="ads[optional]")
... def test_function():
...     pass
>>> @runtime_dependency(module="pandas", object="DataFrame", short_name="df", err_msg="Custom error message.")
... def test_function():
...     pass
class ads.common.decorator.runtime_dependency.OptionalDependency

Bases: object

BDS = 'oracle-ads[bds]'
BOOSTED = 'oracle-ads[boosted]'
DATA = 'oracle-ads[data]'
GEO = 'oracle-ads[geo]'
LABS = 'oracle-ads[labs]'
MYSQL = 'oracle-ads[mysql]'
NOTEBOOK = 'oracle-ads[notebook]'
ONNX = 'oracle-ads[onnx]'
OPCTL = 'oracle-ads[opctl]'
OPTUNA = 'oracle-ads[optuna]'
PYTORCH = 'oracle-ads[torch]'
TENSORFLOW = 'oracle-ads[tensorflow]'
TEXT = 'oracle-ads[text]'
VIZ = 'oracle-ads[viz]'
ads.common.decorator.runtime_dependency.runtime_dependency(module: str, short_name: str = '', object: Optional[str] = None, install_from: Optional[str] = None, err_msg: str = '', is_for_notebook_only=False)

A decorator that adds runtime dependencies to functions.

Parameters
  • module (str) – The module name to be imported.

  • short_name ((str, optional). Defaults to empty string.) – The short name for the imported module.

  • object ((str, optional). Defaults to None.) – The name of the object to be imported. Can be a function or a class, or any variable provided by module.

  • install_from ((str, optional). Defaults to None.) – Indicates where the required dependency can be installed from.

  • err_msg ((str, optional). Defaults to empty string.) – The custom error message.

  • is_for_notebook_only ((bool, optional). Defaults to False.) – If this flag is set to True, the dependency is added only when the current environment is a Jupyter notebook.

Raises
  • ModuleNotFoundError – If the requested module is not found.

  • ImportError – If the object cannot be imported from the module.

Examples

>>> @runtime_dependency(module="pandas", short_name="pd")
... def test_function():
...     print(pd)
>>> @runtime_dependency(module="pandas", object="DataFrame", short_name="df")
... def test_function():
...     print(df)
>>> @runtime_dependency(module="pandas", short_name="pd")
... @runtime_dependency(module="pandas", object="DataFrame", short_name="df")
... def test_function():
...     print(df)
...     print(pd)
>>> @runtime_dependency(module="pandas", object="DataFrame", short_name="df", install_from="ads[optional]")
... def test_function():
...     pass
>>> @runtime_dependency(module="pandas", object="DataFrame", short_name="df", err_msg="Custom error message.")
... def test_function():
...     pass
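
To make the mechanism more concrete, here is a simplified sketch of a decorator in the same spirit: it imports the module only when the wrapped function is called and exposes it under the short name. runtime_dependency_sketch is a hypothetical stand-in written with the standard library; it omits install_from and is_for_notebook_only, and it should not be read as the actual ADS implementation.

```python
import functools
import importlib


def runtime_dependency_sketch(module, short_name="", object=None, err_msg=""):
    """Hypothetical stand-in: import `module` lazily when the function is called."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                imported = importlib.import_module(module)
            except ModuleNotFoundError as exc:
                message = err_msg or f"The `{module}` module was not found."
                raise ModuleNotFoundError(message) from exc
            target = imported
            if object is not None:
                try:
                    target = getattr(imported, object)
                except AttributeError as exc:
                    raise ImportError(f"Cannot import `{object}` from `{module}`.") from exc
            # Assumption: the dependency is made visible through the globals
            # of the module in which the decorated function was defined.
            func.__globals__[short_name or object or module] = target
            return func(*args, **kwargs)

        return wrapper

    return decorator


@runtime_dependency_sketch(module="json", short_name="js")
def dump_pair():
    return js.dumps({"a": 1})


@runtime_dependency_sketch(module="collections", object="OrderedDict", short_name="od")
def make_ordered():
    return od(a=1)


print(dump_pair())
```

The import cost is paid at call time rather than at module import time, which is why a missing optional package only surfaces when the decorated function is actually used.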

ads.common.decorator.deprecate module

class ads.common.decorator.deprecate.TARGET_TYPE(value)

Bases: Enum

An enumeration.

ATTRIBUTE = 'Attribute'
CLASS = 'Class'
METHOD = 'Method'
ads.common.decorator.deprecate.deprecated(deprecated_in: str, removed_in: Optional[str] = None, details: Optional[str] = None, target_type: Optional[str] = None)

This is a decorator which can be used to mark functions as deprecated. It will result in a warning being emitted when the function is used.

Parameters
  • deprecated_in (str) – Version of ADS in which this function was deprecated.

  • removed_in (str) – Future version where this function will be removed.

  • details (str) – More information to be shown.
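
As an illustration of the behavior described above, the following sketch emits a DeprecationWarning each time the wrapped function is called. deprecated_sketch, the warning text, and the version numbers are assumptions made for this example; the real decorator also accepts target_type, which is not modeled here.

```python
import functools
import warnings


def deprecated_sketch(deprecated_in, removed_in=None, details=None):
    """Hypothetical stand-in that warns whenever the wrapped function is used."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            message = f"{func.__name__} is deprecated as of version {deprecated_in}"
            if removed_in:
                message += f" and will be removed in version {removed_in}"
            message += "."
            if details:
                message += f" {details}"
            # stacklevel=2 points the warning at the caller, not at this wrapper.
            warnings.warn(message, DeprecationWarning, stacklevel=2)
            return func(*args, **kwargs)

        return wrapper

    return decorator


# The version numbers below are placeholders, not real ADS releases.
@deprecated_sketch("2.6.1", removed_in="3.0.0", details="Use new_function instead.")
def old_function(x):
    return x * 2
```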

ads.common.model_introspect module

The module that helps to minimize the number of errors in the model post-deployment process. The module provides a simple testing harness to ensure that model artifacts are thoroughly tested before being saved to the model catalog.

Classes

ModelIntrospect

Class to introspect model artifacts.

Examples

>>> model_introspect = ModelIntrospect(artifact=model_artifact)
>>> model_introspect()
... Test key         Test name            Result              Message
... ----------------------------------------------------------------------------
... test_key_1       test_name_1          Passed              test passed
... test_key_2       test_name_2          Not passed          some error occurred
>>> model_introspect.status
... Passed
class ads.common.model_introspect.Introspectable

Bases: ABC

Base class that represents an introspectable object.

exception ads.common.model_introspect.IntrospectionNotPassed

Bases: ValueError

class ads.common.model_introspect.ModelIntrospect(artifact: Introspectable)

Bases: object

Class to introspect model artifacts.

Parameters
  • status (str) – Returns the current status of model introspection. The possible variants: Passed, Not passed, Not tested.

  • failures (int) – Returns the number of failures of introspection result.

run(self) None

Invokes model artifacts introspection.

to_dataframe(self) pd.DataFrame

Serializes model introspection result into a DataFrame.

Examples

>>> model_introspect = ModelIntrospect(artifact=model_artifact)
>>> result = model_introspect()
... Test key         Test name            Result              Message
... ----------------------------------------------------------------------------
... test_key_1       test_name_1          Passed              test passed
... test_key_2       test_name_2          Not passed          some error occurred

Initializes the Model Introspect.

Parameters

artifact (Introspectable) – The instance of ModelArtifact object.

Raises
  • ValueError – If model artifact object not provided.

  • TypeError – If provided input parameter is not a ModelArtifact instance.

property failures: int

Calculates the number of failures.

Returns

The number of failures.

Return type

int

run() DataFrame

Invokes introspection.

Returns

The introspection result in a DataFrame format.

Return type

pd.DataFrame

property status: str

Gets the current status of model introspection.

to_dataframe() DataFrame

Serializes model introspection result into a DataFrame.

Returns

The model introspection result in a DataFrame representation.

Return type

pandas.DataFrame

class ads.common.model_introspect.PrintItem(key: str = '', case: str = '', result: str = '', message: str = '')

Bases: object

Class represents the model introspection print item.

case: str = ''
key: str = ''
message: str = ''
result: str = ''
to_list() List[str]

Converts instance to a list representation.

Returns

The instance in a list representation.

Return type

List[str]

class ads.common.model_introspect.TEST_STATUS

Bases: str

NOT_PASSED = 'Failed'
NOT_TESTED = 'Skipped'
PASSED = 'Passed'
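
The pieces above (a set of checks, one print item per check, a failure count, and an overall status) can be sketched as a small test harness. Everything here is hypothetical and standard-library only: PrintItemSketch, run_checks and the two check functions are illustrative names, and the status strings are simply the values listed under TEST_STATUS.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

PASSED, NOT_PASSED = "Passed", "Failed"  # values listed under TEST_STATUS


@dataclass
class PrintItemSketch:
    key: str = ""
    case: str = ""
    result: str = ""
    message: str = ""

    def to_list(self) -> List[str]:
        return [self.key, self.case, self.result, self.message]


def run_checks(checks: Dict[str, Callable[[], None]]) -> List[PrintItemSketch]:
    """Run each check and record whether it raised an exception."""
    items = []
    for key, check in checks.items():
        try:
            check()
            items.append(PrintItemSketch(key, check.__name__, PASSED, ""))
        except Exception as exc:
            items.append(PrintItemSketch(key, check.__name__, NOT_PASSED, str(exc)))
    return items


def check_score_py_exists():
    pass  # pretend this check succeeded


def check_runtime_yaml():
    raise ValueError("runtime.yaml is missing")


results = run_checks({"test_key_1": check_score_py_exists, "test_key_2": check_runtime_yaml})
failures = sum(item.result == NOT_PASSED for item in results)
status = PASSED if failures == 0 else NOT_PASSED
```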

ads.common.model_export_util module

class ads.common.model_export_util.ONNXTransformer

Bases: object

This is a transformer to convert X [pandas.DataFrame, pandas.Series] data into ONNX-readable dtypes and formats. It is serializable, so it can be reloaded at another time.

Examples

>>> from ads.common.model_export_util import ONNXTransformer
>>> onnx_data_transformer = ONNXTransformer()
>>> train_transformed = onnx_data_transformer.fit_transform(train.X, {"column_name1": "impute_value1", "column_name2": "impute_value2"})
>>> test_transformed = onnx_data_transformer.transform(test.X)
fit(X: Union[DataFrame, Series, ndarray, list], impute_values: Optional[Dict] = None)

Fits the OnnxTransformer on the dataset.

Parameters

X (Union[pandas.DataFrame, pandas.Series, np.ndarray, list]) – The Dataframe for the training data

Returns

Self – The fitted estimator

Return type

ads.Model

fit_transform(X: Union[DataFrame, Series], impute_values: Optional[Dict] = None)

Fits, then transforms the data.

Parameters

X (Union[pandas.DataFrame, pandas.Series]) – The Dataframe for the training data

Returns

The transformed X data

Return type

Union[pandas.DataFrame, pandas.Series]

static load(filename, **kwargs)

Loads the Onnx model from disk.

Parameters

filename (Str) – The filename location from which the model should be loaded

Returns

onnx_transformer – The loaded model

Return type

ONNXTransformer

save(filename, **kwargs)

Saves the Onnx model to disk.

Parameters

filename (Str) – The filename location where the model should be saved

Returns

filename – The filename where the model was saved

Return type

Str

transform(X: Union[DataFrame, Series, ndarray, list])

Transforms the data for the OnnxTransformer.

Parameters

X (Union[pandas.DataFrame, pandas.Series, np.ndarray, list]) – The Dataframe for the training data

Returns

The transformed X data

Return type

Union[pandas.DataFrame, pandas.Series, np.ndarray, list]

ads.common.model_export_util.prepare_generic_model(model_path: str, fn_artifact_files_included: bool = False, fn_name: str = 'model_api', force_overwrite: bool = False, model: Optional[Any] = None, data_sample: Optional[ADSData] = None, use_case_type=None, X_sample: Optional[Union[list, tuple, Series, ndarray, DataFrame]] = None, y_sample: Optional[Union[list, tuple, Series, ndarray, DataFrame]] = None, **kwargs) ModelArtifact

Generates template files to aid model deployment. The model could be accompanied by other artifacts, all of which can be dumped at model_path. The following files are generated:
  • func.yaml

  • func.py

  • requirements.txt

  • score.py

Parameters
  • model_path (str) – Path where the artifacts must be saved. The serialized model object and any other associated files/objects must be saved in the model_path directory

  • fn_artifact_files_included (bool) – Default is False, if turned off, function artifacts are not generated.

  • fn_name (str) – Optional parameter to specify the function name

  • force_overwrite (bool) – Optional parameter to specify if the model_artifact should overwrite the existing model_path (if it exists)

  • model ((Any, optional). Defaults to None.) – This is an optional model object which is only used to extract taxonomy metadata. Supported models: automl, keras, lightgbm, pytorch, sklearn, tensorflow, and xgboost. If the model is not under supported frameworks, then extracting taxonomy metadata will be skipped. The alternative way is using artifact.populate_metadata(model=model, usecase_type=UseCaseType.REGRESSION).

  • data_sample (ADSData) – A sample of the test data that will be provided to predict() API of scoring script. Used to generate schema_input and schema_output

  • use_case_type (str) – The use case type of the model

  • X_sample (Union[list, tuple, pd.Series, np.ndarray, pd.DataFrame, dask.dataframe.core.Series, dask.dataframe.core.DataFrame]) – A sample of input data that will be provided to predict() API of scoring script. Used to generate input schema.

  • y_sample (Union[list, tuple, pd.Series, np.ndarray, pd.DataFrame, dask.dataframe.core.Series, dask.dataframe.core.DataFrame]) – A sample of output data that is expected to be returned by predict() API of scoring script, corresponding to X_sample. Used to generate output schema.

  • **kwargs

  • data_science_env (bool, default: False) – If set to True, the datascience environment represented by the slug in the training conda environment will be used.

  • inference_conda_env (str, default: None) – Conda environment to use within the model deployment service for inferencing. For example, oci://bucketname@namespace/path/to/conda/env

  • ignore_deployment_error (bool, default: False) – If set to True, the prepare method will ignore all the errors that may impact model deployment.

  • underlying_model (str, default: 'UNKNOWN') – Underlying Model Type, could be “automl”, “sklearn”, “h2o”, “lightgbm”, “xgboost”, “torch”, “mxnet”, “tensorflow”, “keras”, “pyod”, etc.

  • model_libs (dict, default: {}) – Model required libraries where the key is the library name and the value is the library version. For example, {numpy: 1.21.1}.

  • progress (int, default: None) – max number of progress.

  • inference_python_version (str, default: None) – If provided, will be added to the generated runtime yaml

  • max_col_num ((int, optional). Defaults to utils.DATA_SCHEMA_MAX_COL_NUM.) – The maximum number of columns in the data for which the schema is auto generated.

Examples

>>> import cloudpickle
>>> import os
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.datasets import make_classification
>>> import ads
>>> from ads.common.model_export_util import prepare_generic_model
>>> import yaml
>>> import oci
>>>
>>> ads.set_auth('api_key', oci_config_location=oci.config.DEFAULT_LOCATION, profile='DEFAULT')
>>> model_artifact_location = os.path.expanduser('~/myusecase/model/')
>>> inference_conda_env="oci://my-bucket@namespace/conda_environments/cpu/Data_Exploration_and_Manipulation_for_CPU_Python_3.7/2.0/dataexpl_p37_cpu_v2"
>>> inference_python_version = "3.7"
>>> if not os.path.exists(model_artifact_location):
...     os.makedirs(model_artifact_location)
>>> X, y = make_classification(n_samples=100, n_features=20, n_classes=2)
>>> lrmodel = LogisticRegression().fit(X, y)
>>> with open(os.path.join(model_artifact_location, 'model.pkl'), "wb") as mfile:
...     cloudpickle.dump(lrmodel, mfile)
>>> modelartifact = prepare_generic_model(
...     model_artifact_location,
...     model = lrmodel,
...     force_overwrite=True,
...     inference_conda_env=inference_conda_env,
...     ignore_deployment_error=True,
...     inference_python_version=inference_python_version
... )
>>> modelartifact.reload() # Call reload to update the ModelArtifact object with the generated score.py
>>> assert len(modelartifact.predict(X[:5])['prediction']) == 5 #Test the generated score.py works. This may require customization.
>>> with open(os.path.join(model_artifact_location, "runtime.yaml")) as rf:
...     content = yaml.load(rf, Loader=yaml.FullLoader)
...     assert content['MODEL_DEPLOYMENT']['INFERENCE_CONDA_ENV']['INFERENCE_ENV_PATH'] == inference_conda_env
...     assert content['MODEL_DEPLOYMENT']['INFERENCE_CONDA_ENV']['INFERENCE_PYTHON_VERSION'] == inference_python_version
>>> # Save Model to model artifact
>>> ocimodel = modelartifact.save(
...     project_id="oci1......", # OCID of the project to which the model to be associated
...     compartment_id="oci1......", # OCID of the compartment where the model will reside
...     display_name="LRModel_01",
...     description="My Logistic Regression Model",
...     ignore_pending_changes=True,
...     timeout=100,
...     ignore_introspection=True,
... )
>>> print(f"The OCID of the model is: {ocimodel.id}")
Returns

model_artifact – A generic model artifact

Return type

ads.model_artifact.model_artifact

ads.common.model_export_util.serialize_model(model=None, target_dir=None, X=None, y=None, model_type=None, **kwargs)
Parameters
  • model (ads.Model) – A model to be serialized

  • target_dir (str, optional) – directory to output the serialized model

  • X (Union[pandas.DataFrame, pandas.Series]) – The X data

  • y (Union[list, pandas.DataFrame, pandas.Series]) – The y data

  • model_type (str, optional) – A string corresponding to the model type

Returns

model_kwargs – A dictionary of model kwargs for the serialized model

Return type

Dict

ads.common.function.fn_util module

ads.common.function.fn_util.generate_fn_artifacts(path: str, fn_name: Optional[str] = None, fn_attributes=None, artifact_type_generic=False, **kwargs)
Generates artifacts for fn (https://fnproject.io) at the provided path -
  • func.py

  • func.yaml

  • requirements.txt if it does not already exist. If it exists, fdk is appended to the file.

  • score.py

Parameters
  • path (str) – Target folder where the artifacts are placed.

  • fn_attributes (dict) – dictionary specifying all the function attributes as described in https://github.com/fnproject/docs/blob/master/fn/develop/func-file.md

  • artifact_type_generic (bool) – default is False. This attribute decides which template to pick for score.py. If True, it is assumed that the code to load is provided by the user.

ads.common.function.fn_util.get_function_config() dict

Returns dictionary loaded from func_conf.yaml

ads.common.function.fn_util.prepare_fn_attributes(func_name: str, schema_version=20180708, version=None, python_runtime=None, entry_point=None, memory=None) dict

Workaround for collections.namedtuples. The defaults are not supported.

ads.common.function.fn_util.write_score(path, **kwargs)

ads.common.utils module

exception ads.common.utils.FileOverwriteError

Bases: Exception

class ads.common.utils.JsonConverter(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)

Bases: JSONEncoder

Constructor for JSONEncoder, with sensible defaults.

If skipkeys is false, then it is a TypeError to attempt encoding of keys that are not str, int, float or None. If skipkeys is True, such items are simply skipped.

If ensure_ascii is true, the output is guaranteed to be str objects with all incoming non-ASCII characters escaped. If ensure_ascii is false, the output can contain non-ASCII characters.

If check_circular is true, then lists, dicts, and custom encoded objects will be checked for circular references during encoding to prevent an infinite recursion (which would cause an OverflowError). Otherwise, no such check takes place.

If allow_nan is true, then NaN, Infinity, and -Infinity will be encoded as such. This behavior is not JSON specification compliant, but is consistent with most JavaScript based encoders and decoders. Otherwise, it will be a ValueError to encode such floats.

If sort_keys is true, then the output of dictionaries will be sorted by key; this is useful for regression tests to ensure that JSON serializations can be compared on a day-to-day basis.

If indent is a non-negative integer, then JSON array elements and object members will be pretty-printed with that indent level. An indent level of 0 will only insert newlines. None is the most compact representation.

If specified, separators should be an (item_separator, key_separator) tuple. The default is (', ', ': ') if indent is None and (',', ': ') otherwise. To get the most compact JSON representation, you should specify (',', ':') to eliminate whitespace.

If specified, default is a function that gets called for objects that can’t otherwise be serialized. It should return a JSON encodable version of the object or raise a TypeError.

default(obj)

Converts an object to JSON based on its type

Parameters

obj (Object) – An object which is being converted to JSON. Supported types are pandas Timestamp, Series, DataFrame, or Categorical, and numpy ndarrays.

Returns

A JSON representation of the object.

Return type

Json
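
The general pattern behind a JSONEncoder subclass with a custom default() can be sketched with standard-library types. JsonConverterSketch is a hypothetical stand-in: it handles datetime and set values instead of the pandas and numpy types mentioned above, so that the example runs without third-party packages.

```python
import json
from datetime import datetime


class JsonConverterSketch(json.JSONEncoder):
    """Hypothetical encoder: converts types that json cannot serialize natively."""

    def default(self, obj):
        if isinstance(obj, datetime):
            return obj.isoformat()
        if isinstance(obj, (set, frozenset)):
            return sorted(obj)
        # Fall back to the base class, which raises TypeError.
        return super().default(obj)


payload = {"created": datetime(2022, 1, 2, 3, 4, 5), "tags": {"b", "a"}}
encoded = json.dumps(payload, cls=JsonConverterSketch, sort_keys=True)
compact = json.dumps({"a": [1, 2]}, separators=(",", ":"))
print(encoded)
print(compact)
```

default() is only called for objects the encoder does not already know how to serialize, so built-in types such as dict, list, str and int never reach it.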

ads.common.utils.copy_from_uri(uri: str, to_path: str, unpack: Optional[bool] = False, force_overwrite: Optional[bool] = False, auth: Optional[Dict] = None) None

Copies file(s) to local path. Can be a folder, archived folder or a separate file. The source files can be located in a local folder or in OCI Object Storage.

Parameters
  • uri (str) – The URI of the source file or directory, which can be local path or OCI object storage URI.

  • to_path (str) – The local destination path. If this is a directory, the source files will be placed under it.

  • unpack ((bool, optional). Defaults to False.) – Indicate if zip or tar.gz file specified by the uri should be unpacked. This option has no effect on other files.

  • force_overwrite ((bool, optional). Defaults to False.) – Whether to overwrite existing files or not.

  • auth ((Dict, optional). Defaults to None.) – The default authentication is set using ads.set_auth API. If you need to override the default, use the ads.common.auth.api_keys or ads.common.auth.resource_principal to create appropriate authentication signer and kwargs required to instantiate IdentityClient object.

Returns

Nothing

Return type

None

Raises

ValueError – If the destination path already exists and force_overwrite is set to False.

ads.common.utils.download_from_web(url: str, to_path: str) None

Downloads a single file from http/https/ftp.

Parameters
  • url (str) – The URL of the source file.

  • to_path (path-like object) – Local destination path.

Returns

Nothing

Return type

None

ads.common.utils.ellipsis_strings(raw, n=24)

Takes a string or a sequence of strings (list, tuple, or pd.Series) and truncates each one with an ellipsis at position n.

ads.common.utils.extract_lib_dependencies_from_model(model) dict

Extract a dictionary of library dependencies for a model

Parameters

model

Returns

A dictionary of library dependencies.

Return type

Dict

ads.common.utils.first_not_none(itr)

Returns the first non-None result from an iterable. Similar to any(), but returns the value itself rather than True/False.
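
A one-line sketch of this behavior is shown below. first_not_none_sketch is a hypothetical name, and returning None when every element is None is an assumption, since the description does not state what happens in that case.

```python
def first_not_none_sketch(itr):
    """Return the first element that is not None (assumption: None if there is none)."""
    return next((item for item in itr if item is not None), None)


print(first_not_none_sketch([None, 0, 5]))
```

Unlike any(), falsy values such as 0 or an empty string are returned, because only None is skipped.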

ads.common.utils.flatten(d, parent_key='')

Flattens nested dictionaries to a single layer dictionary

Parameters
  • d (dict) – The dictionary that needs to be flattened

  • parent_key (str) – Keys in the dictionary that are nested

Returns

a_dict – a single layer dictionary

Return type

dict
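
Flattening can be sketched recursively as follows. flatten_sketch is a hypothetical stand-in, and joining parent and child keys with an underscore (the sep argument) is an assumption made for this example; the description above does not specify how nested keys are combined.

```python
from collections.abc import Mapping


def flatten_sketch(d, parent_key="", sep="_"):
    """Flatten nested dictionaries into a single-layer dictionary."""
    items = {}
    for key, value in d.items():
        # Assumption: nested keys are prefixed with their parent key.
        new_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        if isinstance(value, Mapping):
            items.update(flatten_sketch(value, new_key, sep))
        else:
            items[new_key] = value
    return items


print(flatten_sketch({"a": 1, "b": {"c": 2, "d": {"e": 3}}}))
```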

ads.common.utils.generate_requirement_file(requirements: dict, file_path: str, file_name: str = 'requirements.txt')

Generate requirements file at file_path.

Parameters
  • requirements (dict) – Key is the library name and value is the version

  • file_path (str) – Directory to save requirements.txt

  • file_name (str) – Optional parameter to specify the file name
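
A minimal sketch of writing such a file from a dictionary follows. generate_requirement_file_sketch is a hypothetical stand-in; pinning with `==` and writing the bare library name when no version is given are assumptions about the output format, not documented behavior.

```python
import os
import tempfile


def generate_requirement_file_sketch(requirements, file_path, file_name="requirements.txt"):
    """Write one requirement per line and return the path of the written file."""
    target = os.path.join(file_path, file_name)
    with open(target, "w") as f:
        for library, version in requirements.items():
            # Assumption: pin the version when given, otherwise write the name only.
            f.write(f"{library}=={version}\n" if version else f"{library}\n")
    return target


with tempfile.TemporaryDirectory() as tmp_dir:
    path = generate_requirement_file_sketch({"numpy": "1.21.1", "fdk": None}, tmp_dir)
    with open(path) as f:
        written = f.read()
```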

ads.common.utils.get_base_modules(model)

Get the base modules from an ADS model

ads.common.utils.get_bootstrap_styles()

Returns HTML bootstrap style information

ads.common.utils.get_compute_accelerator_ncores()
ads.common.utils.get_cpu_count()

Returns the number of CPUs available on this machine

ads.common.utils.get_dataframe_styles(max_width=75)

Styles used for dataframe, example usage:

df.style.set_table_styles(utils.get_dataframe_styles()).set_table_attributes('class=table').render()

Returns

styles – A list of dataframe table styler styles.

Return type

array

ads.common.utils.get_files(directory: str)

List out all the file names under this directory.

Parameters

directory (str) – The directory to list out all the files from.

Returns

List of the files in the directory.

Return type

List

ads.common.utils.get_oci_config()

Returns the OCI config location, and the OCI config profile.

ads.common.utils.get_progress_bar(max_progress, description='Initializing')

this will return an instance of ProgressBar, sensitive to the runtime environment

ads.common.utils.get_sqlalchemy_engine(connection_url, *args, **kwargs)

The SQLAlchemy docs say to use a single engine per connection_url; this function will take care of that.

Parameters

connection_url (string) – The URL to connect to

Returns

engine – The engine on which SQLAlchemy commands can be run

Return type

SQLAlchemy engine

ads.common.utils.highlight_text(text)

Returns text with html highlights.

Parameters

text (String) – The text to be highlighted.

Returns

ht – The text with html highlight information.

Return type

String

ads.common.utils.horizontal_scrollable_div(html)

Wrap html with the necessary html to make horizontal scrolling possible.

Examples

display(HTML(utils.horizontal_scrollable_div(my_html)))

Parameters

html (str) – Your HTML to wrap.

Returns

Wrapped HTML.

Return type

str

ads.common.utils.inject_and_copy_kwargs(kwargs, **args)

Takes in a dictionary and returns a copy with the args injected

Examples

>>> foo(arg1, args, utils.inject_and_copy_kwargs(kwargs, arg3=12, arg4=42))
Parameters
  • kwargs (dict) – The original kwargs.

  • **args (type) – A series of arguments, foo=42, bar=12 etc

Returns

d – new dictionary object that you can use in place of kwargs

Return type

dict
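
The copy-then-inject behavior can be sketched in a few lines. inject_and_copy_kwargs_sketch is a hypothetical stand-in, and the use of a shallow copy is an assumption.

```python
def inject_and_copy_kwargs_sketch(kwargs, **args):
    """Return a copy of kwargs with the extra keyword arguments injected."""
    d = dict(kwargs)  # assumption: a shallow copy is sufficient
    d.update(args)
    return d


original = {"arg1": 1}
updated = inject_and_copy_kwargs_sketch(original, arg3=12, arg4=42)
print(updated)
```

Because the caller's dictionary is copied first, it is left unchanged and the result can be passed on with `**updated`.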

ads.common.utils.is_data_too_wide(data: Union[list, tuple, Series, ndarray, DataFrame], max_col_num: int) bool

Returns true if the data has too many columns.

Parameters
  • data (Union[list, tuple, pd.Series, np.ndarray, pd.DataFrame]) – A sample of data that will be used to generate schema.

  • max_col_num (int) – The maximum number of columns in the data for which the schema is auto generated.

ads.common.utils.is_debug_mode()

Returns true if ADS is in debug mode.

ads.common.utils.is_documentation_mode()

Returns true if ADS is in documentation mode.

ads.common.utils.is_notebook()

Returns true if the environment is a jupyter notebook.

ads.common.utils.is_resource_principal_mode()

Returns true if ADS is in resource principal mode.

ads.common.utils.is_same_class(obj, cls)

checks to see if object is the same class as cls

ads.common.utils.is_test()

Returns true if ADS is in test mode.

class ads.common.utils.ml_task_types(value)

Bases: Enum

An enumeration.

BINARY_CLASSIFICATION = 2
BINARY_TEXT_CLASSIFICATION = 4
MULTI_CLASS_CLASSIFICATION = 3
MULTI_CLASS_TEXT_CLASSIFICATION = 5
REGRESSION = 1
UNSUPPORTED = 6
ads.common.utils.numeric_pandas_dtypes()

Returns a list of the “numeric” pandas data types

ads.common.utils.oci_config_file()

Returns the OCI config file location

ads.common.utils.oci_config_location()

Returns oci configuration file location.

ads.common.utils.oci_config_profile()

Returns the OCI config profile location.

ads.common.utils.oci_key_location()

Returns the OCI key location

ads.common.utils.oci_key_profile()

Returns key profile value specified in oci configuration file.

ads.common.utils.print_user_message(msg, display_type='tip', see_also_links=None, title='Tip')

This method is deprecated and will be removed in future releases. Prints in html formatted block one of tip|info|warn type.

Parameters
  • msg (str or list) – The actual message to display. If display_type is "module", msg can be a list of [module name, module package name], e.g. ["automl", "ads[ml]"]

  • display_type (str (default 'tip')) – The type of user message.

  • see_also_links (list of tuples in the form of [('display_name', 'url')]) –

  • title (str (default 'Tip')) – The title of user message.

ads.common.utils.random_valid_ocid(prefix='ocid1.dataflowapplication.oc1.iad')

Generates a random valid ocid.

Parameters

prefix (str) – A prefix, corresponding to a region location.

Returns

ocid – a valid ocid with the given prefix.

Return type

str
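
One way to picture the result is a prefix followed by a random suffix, as in the sketch below. random_valid_ocid_sketch is a hypothetical stand-in; the suffix alphabet (lowercase letters) and its length of 60 characters are assumptions, not values taken from the library.

```python
import random
import string


def random_valid_ocid_sketch(prefix="ocid1.dataflowapplication.oc1.iad", length=60):
    """Build an OCID-shaped string: the prefix, a dot, and a random suffix."""
    # Assumption: the unique part is made of lowercase ASCII letters.
    suffix = "".join(random.choice(string.ascii_lowercase) for _ in range(length))
    return f"{prefix}.{suffix}"


ocid = random_valid_ocid_sketch()
print(ocid)
```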

ads.common.utils.replace_spaces(lst)

Replace all spaces with underscores for strings in the list.

Requires that the list contains strings for each element.

lst: list of strings
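
A minimal sketch of the described behavior, using a hypothetical function name and assuming a new list is returned rather than the input being modified in place:

```python
def replace_spaces_sketch(lst):
    """Return a new list with spaces replaced by underscores in every string."""
    return [s.replace(" ", "_") for s in lst]


print(replace_spaces_sketch(["first name", "age", "zip code"]))
```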

ads.common.utils.set_oci_config(oci_config_location, oci_config_profile)
Parameters
  • oci_config_location – location of the config file, for example, ~/.oci/config

  • oci_config_profile – The profile to load from the config file. Defaults to “DEFAULT”

ads.common.utils.split_data(X, y, random_state=42, test_size=0.3)

Splits data using Sklearn based on the input type of the data.

Parameters
  • X (a Pandas Dataframe) – The data points.

  • y (a Pandas Dataframe) – The labels.

  • random_state (int) – A random state for reproducibility.

  • test_size (int) – The number of elements that should be included in the test dataset.

ads.common.utils.to_dataframe(data: Union[list, tuple, Series, ndarray, DataFrame])

Convert to pandas DataFrame.

Parameters

data (Union[list, tuple, pd.Series, np.ndarray, pd.DataFrame]) – Convert data to pandas DataFrame.

Returns

pandas DataFrame.

Return type

pd.DataFrame

ads.common.utils.truncate_series_top_n(series, n=24)

Takes a series that can be interpreted as a dict (index as key), sorts it by value, and returns a new series containing the top n values.

ads.common.utils.wrap_lines(li, heading='')

Wraps the elements of iterable into multi line string of fixed width

Module contents

ads.common.model_metadata_mixin module

class ads.common.model_metadata_mixin.MetadataMixin

Bases: object

MetadataMixin class which populates the custom metadata, taxonomy metadata, input/output schema and provenance metadata.

populate_metadata(use_case_type: Optional[str] = None, data_sample: Optional[ADSData] = None, X_sample: Optional[Union[list, tuple, Series, ndarray, DataFrame]] = None, y_sample: Optional[Union[list, tuple, Series, ndarray, DataFrame]] = None, training_script_path: Optional[str] = None, training_id: Optional[str] = None, ignore_pending_changes: bool = True, max_col_num: int = 2000)

Populates input schema and output schema. If the schema exceeds the limit of 32 KB, it is saved as JSON files in the artifact directory.

Parameters
  • use_case_type ((str, optional). Defaults to None.) – The use case type of the model.

  • data_sample ((ADSData, optional). Defaults to None.) – A sample of the data that will be used to generate input_schema and output_schema.

  • X_sample (Union[list, tuple, pd.Series, np.ndarray, pd.DataFrame]. Defaults to None.) – A sample of input data that will be used to generate input schema.

  • y_sample (Union[list, tuple, pd.Series, np.ndarray, pd.DataFrame]. Defaults to None.) – A sample of output data that will be used to generate output schema.

  • training_script_path (str. Defaults to None.) – Training script path.

  • training_id ((str, optional). Defaults to None.) – The training model OCID.

  • ignore_pending_changes (bool. Defaults to True.) – Ignore the pending changes in git.

  • max_col_num ((int, optional). Defaults to utils.DATA_SCHEMA_MAX_COL_NUM.) – The maximum number of columns allowed in auto generated schema.

Returns

Nothing.

Return type

None
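
The size rule described for populate_metadata (keep a small schema inline, write a large one to the artifact directory) can be sketched as below. store_schema_sketch, the file name input_schema.json, and reading the limit as 32 * 1024 bytes are all assumptions made for illustration; this is not the ADS implementation.

```python
import json
import os
import tempfile

SCHEMA_LIMIT_BYTES = 32 * 1024  # assumption: the limit is read as 32 * 1024 bytes


def store_schema_sketch(schema, artifact_dir, file_name="input_schema.json"):
    """Return ("inline", json_text) or ("file", path) depending on the schema size."""
    payload = json.dumps(schema)
    if len(payload.encode("utf-8")) <= SCHEMA_LIMIT_BYTES:
        return "inline", payload
    target = os.path.join(artifact_dir, file_name)
    with open(target, "w") as f:
        f.write(payload)
    return "file", target


small = {"schema": [{"name": "x", "dtype": "float64"}]}
large = {"schema": [{"name": f"col_{i}", "dtype": "float64"} for i in range(2000)]}

with tempfile.TemporaryDirectory() as tmp_dir:
    small_kind, _ = store_schema_sketch(small, tmp_dir)
    large_kind, large_path = store_schema_sketch(large, tmp_dir)
    large_file_exists = os.path.exists(large_path)
```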