ads.common package
Submodules
ads.common.card_identifier module
Credit card patterns; refer to https://en.wikipedia.org/wiki/Payment_card_number#Issuer_identification_number_(IIN)
Active and frequent card information:
American Express: 34, 37
Diners Club (US & Canada): 54, 55
Discover Card: 6011, 622126 - 622925, 624000 - 626999, 628200 - 628899, 64, 65
Master Card: 2221 - 2720, 51 - 55
Visa: 4
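The prefix list above can be turned into a lookup along the following lines. This is only a sketch: `IIN_RANGES` and `matching_issuers` are hypothetical names and not the API of ads.common.card_identifier. Because 54 and 55 are listed under both Diners Club and Master Card, the helper returns every issuer whose range matches instead of picking one.

```python
# Hypothetical helper; not part of ads.common.card_identifier.
# Each issuer maps to inclusive (low, high) IIN prefix ranges taken from the list above.
IIN_RANGES = {
    "American Express": [(34, 34), (37, 37)],
    "Diners Club (US & Canada)": [(54, 55)],
    "Discover Card": [
        (6011, 6011),
        (622126, 622925),
        (624000, 626999),
        (628200, 628899),
        (64, 65),
    ],
    "Master Card": [(2221, 2720), (51, 55)],
    "Visa": [(4, 4)],
}


def matching_issuers(card_number: str) -> list:
    """Return the names of all issuers whose IIN ranges match the card number."""
    # Keep digits only, so spaces and dashes in the input are ignored.
    digits = "".join(ch for ch in card_number if ch.isdigit())
    matches = []
    for issuer, ranges in IIN_RANGES.items():
        for low, high in ranges:
            width = len(str(low))  # number of leading digits to compare
            prefix = digits[:width]
            if len(prefix) == width and low <= int(prefix) <= high:
                matches.append(issuer)
                break  # one matching range per issuer is enough
    return matches
```

Returning a list keeps the overlap between issuers visible to the caller; a real identifier would need a rule for resolving it.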
ads.common.auth module
- class ads.common.auth.APIKey(args: Optional[Dict] = None)
Bases:
AuthSignerGenerator
Creates an API keys auth instance. This signer is intended to be used when signing requests for a given user - it requires that user’s ID, their private key and certificate fingerprint. It prepares the extra arguments necessary for creating clients for a variety of OCI services.
The signer is created based on the args provided. If an argument is not provided, its current value is taken from the global state held by the AuthState class.
- Parameters:
args (dict) –
args that are required to create api key config and signer. Contains keys: oci_config, oci_config_location, oci_key_profile, client_kwargs.
oci_config is a configuration dict that can be used to create clients
oci_config_location - path to config file
oci_key_profile - the profile to load from config file
client_kwargs - optional parameters for OCI client creation in next steps
- create_signer() Dict
Creates the API keys configuration and signer with the extra arguments necessary for creating clients. The signer is constructed from the oci_config provided. If ‘oci_config’ is not provided, the configuration is constructed in place from ‘oci_config_location’ and ‘oci_key_profile’.
Returns
- dict
Contains keys - config, signer and client_kwargs.
config contains the configuration information
signer contains the signer object created. It is instantiated from signer_callable, or the signer provided in args is used, or it is instantiated in place
client_kwargs contains the client_kwargs that was passed in as an input parameter
Examples
>>> signer_args = dict(
...     client_kwargs=client_kwargs
... )
>>> signer_generator = AuthFactory().signerGenerator(AuthType.API_KEY)
>>> signer_generator(signer_args).create_signer()
- class ads.common.auth.AuthContext(**kwargs)
Bases:
object
AuthContext used in ‘with’ statement for properly managing global authentication type, signer, config and global configuration parameters.
Examples
>>> from ads import set_auth
>>> from ads.jobs import DataFlowRun
>>> with AuthContext(auth='resource_principal'):
...     df_run = DataFlowRun.from_ocid(run_id)
>>> from ads.model.framework.sklearn_model import SklearnModel
>>> model = SklearnModel.from_model_artifact(uri="model_artifact_path", artifact_dir="model_artifact_path")
>>> set_auth(auth='api_key', oci_config_location="~/.oci/config")
>>> with AuthContext(auth='api_key', oci_config_location="~/another_config_location/config"):
...     # upload model to Object Storage using config from another_config_location/config
...     model.upload_artifact(uri="oci://bucket@namespace/prefix/")
>>> # upload model to Object Storage using config from ~/.oci/config, which was set before 'with AuthContext():'
>>> model.upload_artifact(uri="oci://bucket@namespace/prefix/")
Initializes the AuthContext class and saves the global state of the authentication type, signer, config and global configuration parameters.
- Parameters:
**kwargs (optional, list of parameters passed to ads.set_auth() method, which can be:) –
- auth: Optional[str], default ‘api_key’
’api_key’, ‘resource_principal’ or ‘instance_principal’. Enable/disable resource principal identity, instance principal or keypair identity
- oci_config_location: Optional[str], default oci.config.DEFAULT_LOCATION, which is ‘~/.oci/config’
config file location
- profile: Optional[str], default is DEFAULT_PROFILE, which is ‘DEFAULT’
profile name for api keys config file
- config: Optional[Dict], default {}
created config dictionary
- signer: Optional[Any], default None
created signer, can be resource principals signer, instance principal signer or other
- signer_callable: Optional[Callable], default None
a callable object that returns signer
- signer_kwargs: Optional[Dict], default None
parameters accepted by the signer
- class ads.common.auth.AuthFactory
Bases:
object
AuthFactory class, which contains the list of registered signers and allows new signers to be registered. Check the documentation for more signers: https://docs.oracle.com/en-us/iaas/tools/python/latest/api/signing.html.
- Current signers:
APIKey
ResourcePrincipal
InstancePrincipal
- classes = {'api_key': <class 'ads.common.auth.APIKey'>, 'instance_principal': <class 'ads.common.auth.InstancePrincipal'>, 'resource_principal': <class 'ads.common.auth.ResourcePrincipal'>}
- classmethod register(signer_type: str, signer: Any) None
Registers a new signer.
- Parameters:
signer_type (str) – Signer type to be registered.
signer (Any) – A new signer class to be registered.
- Returns:
Nothing.
- Return type:
None
- signerGenerator(iam_type: Optional[str] = 'api_key')
Generates signer classes based on iam_type, which specifies one of the auth methods: ‘api_key’, ‘resource_principal’ or ‘instance_principal’.
- Parameters:
iam_type (str, default 'api_key') – type of auth provided in IAM_TYPE environment variable or set in parameters in ads.set_auth() method.
- Returns:
returns one of classes, which implements creation of signer of specified type
- Return type:
- Raises:
ValueError – If iam_type is not supported.
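The register-then-look-up behaviour described for AuthFactory follows a common registry pattern. The sketch below illustrates that pattern only; `SignerRegistry`, `signer_generator` and `DummyApiKey` are hypothetical names and the code is not taken from ADS.

```python
class SignerRegistry:
    """Hypothetical registry that maps an auth type name to a signer class."""

    classes = {}

    @classmethod
    def register(cls, signer_type, signer):
        # Store the signer class under its type name.
        cls.classes[signer_type] = signer

    def signer_generator(self, iam_type="api_key"):
        # Unknown types are rejected, mirroring the documented ValueError.
        if iam_type not in self.classes:
            raise ValueError(f"Unsupported iam_type: {iam_type!r}")
        return self.classes[iam_type]


class DummyApiKey:
    """Hypothetical signer generator that returns a dict shaped like the one documented."""

    def __init__(self, args=None):
        self.args = args or {}

    def create_signer(self):
        return {
            "config": {},
            "signer": None,
            "client_kwargs": self.args.get("client_kwargs"),
        }


SignerRegistry.register("api_key", DummyApiKey)
```

The lookup returns a class, not an instance, which matches the documented usage `signer_generator(signer_args).create_signer()`.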
- class ads.common.auth.AuthSignerGenerator
Bases:
object
Abstract class for auth configuration and signer creation.
- create_signer()
- class ads.common.auth.AuthState(*args, **kwargs)
Bases:
object
Class that stores the state of the variables specified for the auth method, configuration, configuration file location, profile name, signer or signer_callable. These are set by the user at any given time and can be provided by this class in any ADS module.
- oci_cli_auth: str = None
- oci_config: str = None
- oci_config_path: str = None
- oci_iam_type: str = None
- oci_key_profile: str = None
- oci_signer: str = None
- oci_signer_callable: str = None
- oci_signer_kwargs: str = None
- class ads.common.auth.AuthType
Bases:
str
- API_KEY = 'api_key'
- INSTANCE_PRINCIPAL = 'instance_principal'
- RESOURCE_PRINCIPAL = 'resource_principal'
- class ads.common.auth.InstancePrincipal(args: Optional[Dict] = None)
Bases:
AuthSignerGenerator
Creates an Instance Principal signer - a SecurityTokenSigner which uses a security token for an instance principal. It prepares the extra arguments necessary for creating clients for a variety of OCI services.
The signer is created based on the args provided. If an argument is not provided, its current value is taken from the global state held by the AuthState class.
- Parameters:
args (dict) –
args that are required to create Instance Principal signer. Contains keys: signer_kwargs, client_kwargs.
signer_kwargs - optional parameters required to instantiate instance principal signer
client_kwargs - optional parameters for OCI client creation in next steps
- create_signer() Dict
Creates an Instance Principal signer with the extra arguments necessary for creating clients. The signer is instantiated from the signer_callable or, if a signer is provided, that signer is returned by this method. If neither signer_callable nor signer is provided, a new signer is created in place.
Returns
- dict
Contains keys - config, signer and client_kwargs.
config contains the configuration information
signer contains the signer object created. It is instantiated from signer_callable, or the signer provided in args is used, or it is instantiated in place
client_kwargs contains the client_kwargs that was passed in as an input parameter
Examples
>>> signer_args = dict(signer_kwargs={"log_requests": True})
>>> signer_generator = AuthFactory().signerGenerator(AuthType.INSTANCE_PRINCIPAL)
>>> signer_generator(signer_args).create_signer()
- class ads.common.auth.OCIAuthContext(profile: str = None)
Bases:
object
OCIAuthContext used in ‘with’ statement for properly managing global authentication type and global configuration profile parameters.
Examples
>>> from ads.jobs import DataFlowRun
>>> with OCIAuthContext(profile='TEST'):
...     df_run = DataFlowRun.from_ocid(run_id)
Initializes the OCIAuthContext class and saves the global state of the authentication type and configuration profile.
- Parameters:
profile (str, default is None) – profile name for api keys config file
- class ads.common.auth.ResourcePrincipal(args: Optional[Dict] = None)
Bases:
AuthSignerGenerator
Creates a Resource Principal signer - a security token for a resource principal. It prepares the extra arguments necessary for creating clients for a variety of OCI services.
The signer is created based on the args provided. If an argument is not provided, its current value is taken from the global state held by the AuthState class.
- Parameters:
args (dict) –
args that are required to create Resource Principal signer. Contains keys: client_kwargs.
client_kwargs - optional parameters for OCI client creation in next steps
- create_signer() Dict
Creates Resource Principal signer with extra arguments necessary for creating clients.
Returns
- dict
Contains keys - config, signer and client_kwargs.
config contains the configuration information
signer contains the signer object created. It is instantiated from signer_callable, or the signer provided in args is used, or it is instantiated in place
client_kwargs contains the client_kwargs that was passed in as an input parameter
Examples
>>> signer_args = dict(
...     signer=oci.auth.signers.get_resource_principals_signer()
... )
>>> signer_generator = AuthFactory().signerGenerator(AuthType.RESOURCE_PRINCIPAL)
>>> signer_generator(signer_args).create_signer()
- class ads.common.auth.SingletonMeta
Bases:
type
- ads.common.auth.api_keys(oci_config: str = '/home/docs/.oci/config', profile: str = 'DEFAULT', client_kwargs: Optional[Dict] = None) Dict
Prepares authentication and extra arguments necessary for creating clients for different OCI services using API Keys.
- Parameters:
oci_config (Optional[str], default is $HOME/.oci/config) – OCI authentication config file location.
profile (Optional[str], default is DEFAULT_PROFILE, which is 'DEFAULT') – Profile name to select from the config file.
client_kwargs (Optional[Dict], default None) – kwargs that are required to instantiate the Client if we need to override the defaults.
- Returns:
Contains keys - config, signer and client_kwargs.
The config contains the configuration loaded from oci_config.
The signer contains the signer object created from the api keys.
client_kwargs contains the client_kwargs that was passed in as input parameter.
- Return type:
dict
Examples
>>> from ads.common import oci_client as oc
>>> auth = ads.auth.api_keys(oci_config="/home/datascience/.oci/config", profile="TEST", client_kwargs={"timeout": 6000})
>>> oc.OCIClientFactory(**auth).object_storage # Creates Object storage client with timeout set to 6000 using API Key authentication
- ads.common.auth.create_signer(auth_type: Optional[str] = 'api_key', oci_config_location: Optional[str] = '~/.oci/config', profile: Optional[str] = 'DEFAULT', config: Optional[Dict] = {}, signer: Optional[Any] = None, signer_callable: Optional[Callable] = None, signer_kwargs: Optional[Dict] = {}, client_kwargs: Optional[Dict] = None) Dict
Prepares authentication and extra arguments necessary for creating clients for different OCI services based on the provided parameters. If signer or signer_callable is provided, authentication with that signer will be created. If config is provided, api_key type of authentication will be created. Accepted values for auth_type: ‘api_key’ (default), ‘instance_principal’, ‘resource_principal’.
- Parameters:
auth_type (Optional[str], default 'api_key') –
‘api_key’, ‘resource_principal’ or ‘instance_principal’. Enable/disable resource principal identity, instance principal or keypair identity in a notebook session
oci_config_location (Optional[str], default oci.config.DEFAULT_LOCATION, which is '~/.oci/config') – config file location
profile (Optional[str], default is DEFAULT_PROFILE, which is 'DEFAULT') – profile name for api keys config file
config (Optional[Dict], default {}) – created config dictionary
signer (Optional[Any], default None) – created signer, can be resource principals signer, instance principal signer or other. Check documentation for more signers: https://docs.oracle.com/en-us/iaas/tools/python/latest/api/signing.html
signer_callable (Optional[Callable], default None) – a callable object that returns signer
signer_kwargs (Optional[Dict], default None) – parameters accepted by the signer. Check documentation: https://docs.oracle.com/en-us/iaas/tools/python/latest/api/signing.html
client_kwargs (dict) – kwargs that are required to instantiate the Client if we need to override the defaults
Examples
>>> import ads
>>> auth = ads.auth.create_signer() # api_key type of authentication dictionary created with default config location and default profile
>>> config = oci.config.from_file("other_config_location", "OTHER_PROFILE")
>>> auth = ads.auth.create_signer(config=config) # api_key type of authentication dictionary created based on provided config
>>> signer = oci.auth.signers.get_resource_principals_signer()
>>> auth = ads.auth.create_signer(config={}, signer=signer) # resource principals authentication dictionary created
>>> auth = ads.auth.create_signer(auth_type='instance_principal') # instance principals authentication dictionary created
>>> signer_callable = oci.auth.signers.InstancePrincipalsSecurityTokenSigner
>>> signer_kwargs = dict(log_requests=True) # will log the request url and response data when retrieving
>>> auth = ads.auth.create_signer(signer_callable=signer_callable, signer_kwargs=signer_kwargs) # instance principals authentication dictionary created based on callable with kwargs parameters
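The precedence described for create_signer (an explicit signer or signer_callable first, then a provided config, then auth_type) can be written out as a small decision function. This is a sketch of one reading of that description, not the ADS implementation: `resolve_auth_mode` and the label "custom_signer" are hypothetical, and checking the signer before the config is an assumption based on the example that passes both config={} and signer.

```python
def resolve_auth_mode(auth_type="api_key", config=None, signer=None, signer_callable=None):
    """Return a label for the kind of authentication that would be prepared."""
    # Assumption: an explicit signer or callable wins over everything else.
    if signer is not None or signer_callable is not None:
        return "custom_signer"
    # A non-empty config dictionary implies API key authentication.
    if config:
        return "api_key"
    # Otherwise fall back to the requested auth type.
    return auth_type
```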
- ads.common.auth.default_signer(client_kwargs: Optional[Dict] = None) Dict
Prepares authentication and extra arguments necessary for creating clients for different OCI services based on the default authentication setting for the session. Refer to the ads.set_auth API for further reference.
- Parameters:
client_kwargs (dict) – kwargs that are required to instantiate the Client if we need to override the defaults.
- Returns:
Contains keys - config, signer and client_kwargs.
The config contains the configuration loaded from the default location if the default auth mode is API keys; otherwise it is an empty dictionary.
The signer contains the signer object created from default auth mode.
client_kwargs contains the client_kwargs that was passed in as input parameter.
- Return type:
dict
Examples
>>> import ads
>>> from ads.common import oci_client as oc
>>> auth = ads.auth.default_signer()
>>> oc.OCIClientFactory(**auth).object_storage # Creates Object storage client
>>> ads.set_auth("resource_principal")
>>> auth = ads.auth.default_signer()
>>> oc.OCIClientFactory(**auth).object_storage # Creates Object storage client using resource principal authentication
>>> signer_callable = oci.auth.signers.InstancePrincipalsSecurityTokenSigner
>>> ads.set_auth(signer_callable=signer_callable) # Set instance principal callable
>>> auth = ads.auth.default_signer() # signer_callable instantiated
>>> oc.OCIClientFactory(**auth).object_storage # Creates Object storage client using instance principal authentication
- ads.common.auth.get_signer(oci_config: Optional[str] = None, oci_profile: Optional[str] = None, **client_kwargs) Dict
Provides config and signer based on the given parameters. If oci_config (api key config file location) and oci_profile are specified, a new signer will be generated. Otherwise, a signer of the type specified in the OCI_CLI_AUTH environment variable will be generated and returned. If OCI_CLI_AUTH is not set, a resource principal signer will be provided. Accepted values for OCI_CLI_AUTH: ‘api_key’, ‘instance_principal’, ‘resource_principal’.
- Parameters:
oci_config (Optional[str], default None) – Path to the config file
oci_profile (Optional[str], default None) – the profile to load from the config file
client_kwargs – kwargs that are required to instantiate the Client if we need to override the defaults
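The selection rule described for get_signer can be sketched as follows. `choose_auth` is a hypothetical helper written only to illustrate the documented fallback order; it returns the name of the auth type and does not create any signer. The `environ` parameter is an addition for testability and is not part of the get_signer signature.

```python
import os


def choose_auth(oci_config=None, oci_profile=None, environ=None):
    """Return the auth type name that the documented rule would select."""
    # Allow a mapping to be injected for testing; default to the process environment.
    environ = os.environ if environ is None else environ
    # Both the config file location and the profile must be given for API keys.
    if oci_config and oci_profile:
        return "api_key"
    # Otherwise use OCI_CLI_AUTH, falling back to resource principal when unset.
    return environ.get("OCI_CLI_AUTH", "resource_principal")
```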
- ads.common.auth.resource_principal(client_kwargs: Optional[Dict] = None) Dict
Prepares authentication and extra arguments necessary for creating clients for different OCI services using Resource Principals.
- Parameters:
client_kwargs (Dict, default None) – kwargs that are required to instantiate the Client if we need to override the defaults.
- Returns:
Contains keys - config, signer and client_kwargs.
The config contains an empty dictionary.
The signer contains the signer object created from the resource principal.
client_kwargs contains the client_kwargs that was passed in as input parameter.
- Return type:
dict
Examples
>>> from ads.common import oci_client as oc
>>> auth = ads.auth.resource_principal({"timeout": 6000})
>>> oc.OCIClientFactory(**auth).object_storage # Creates Object Storage client with timeout set to 6000 seconds using resource principal authentication
- ads.common.auth.set_auth(auth: Optional[str] = 'api_key', oci_config_location: Optional[str] = '~/.oci/config', profile: Optional[str] = 'DEFAULT', config: Optional[Dict] = {}, signer: Optional[Any] = None, signer_callable: Optional[Callable] = None, signer_kwargs: Optional[Dict] = {}) None
Save type of authentication, profile, config location, config (keypair identity) or signer, which will be used when actual creation of config or signer happens.
- Parameters:
auth (Optional[str], default 'api_key') –
‘api_key’, ‘resource_principal’ or ‘instance_principal’. Enable/disable resource principal identity, instance principal or keypair identity in a notebook session
oci_config_location (Optional[str], default oci.config.DEFAULT_LOCATION, which is '~/.oci/config') – config file location
profile (Optional[str], default is DEFAULT_PROFILE, which is 'DEFAULT') – profile name for api keys config file
config (Optional[Dict], default {}) – created config dictionary
signer (Optional[Any], default None) – created signer, can be resource principals signer, instance principal signer or other. Check documentation for more signers: https://docs.oracle.com/en-us/iaas/tools/python/latest/api/signing.html
signer_callable (Optional[Callable], default None) – a callable object that returns signer
signer_kwargs (Optional[Dict], default None) – parameters accepted by the signer. Check documentation: https://docs.oracle.com/en-us/iaas/tools/python/latest/api/signing.html
Examples
>>> ads.set_auth("api_key") # default signer is set to api keys
>>> ads.set_auth("api_key", profile = "TEST") # default signer is set to api keys and to use TEST profile
>>> ads.set_auth("api_key", oci_config_location = "other_config_location") # use non-default oci_config_location
>>> other_config = oci.config.from_file("other_config_location", "OTHER_PROFILE") # Create non-default config
>>> ads.set_auth(config=other_config) # Set api keys type of authentication based on provided config
>>> ads.set_auth("resource_principal") # Set resource principal authentication
>>> ads.set_auth("instance_principal") # Set instance principal authentication
>>> signer = oci.auth.signers.get_resource_principals_signer()
>>> ads.auth.create_signer(config={}, signer=signer) # resource principals authentication dictionary created
>>> signer_callable = oci.auth.signers.ResourcePrincipalsFederationSigner
>>> ads.set_auth(signer_callable=signer_callable) # Set resource principal federation signer callable
>>> signer_callable = oci.auth.signers.InstancePrincipalsSecurityTokenSigner
>>> signer_kwargs = dict(log_requests=True) # will log the request url and response data when retrieving
>>> # instance principals authentication dictionary created based on callable with kwargs parameters:
>>> ads.set_auth(signer_callable=signer_callable, signer_kwargs=signer_kwargs)
ads.common.data module
- class ads.common.data.ADSData(X=None, y=None, name='', dataset_type=None)
Bases:
object
This class wraps the input dataframe to various models, evaluation, and explanation frameworks. Its primary purpose is to hold any metadata relevant to these tasks. This can include its:
X - the independent variables as some dataframe-like structure,
y - the dependent variable or target column as some array-like structure,
name - a string to name the data for user convenience,
dataset_type - the type of the X value.
As part of this initiative, ADSData knows how to turn itself into an onnxruntime compatible data structure with the method .to_onnxrt(), which takes an onnx session as input.
- Parameters:
X (Union[pandas.DataFrame, dask.DataFrame, numpy.ndarray, scipy.sparse.csr.csr_matrix]) – If str, URI for the dataset. The dataset could be read from local or network file system, hdfs, s3 and gcs. Should be None if X_train, y_train, X_test, Y_test are provided.
y (Union[str, pandas.DataFrame, dask.DataFrame, pandas.Series, dask.Series, numpy.ndarray]) – If str, name of the target in X, otherwise series of labels corresponding to X
name (str, optional) – Name to identify this data
dataset_type (ADSDataset, optional) – When this value is available, would be used to evaluate the ads task type
kwargs – Additional keyword arguments that would be passed to the underlying Pandas read API.
- static build(X=None, y=None, name='', dataset_type=None, **kwargs)
Returns an ADSData object built from the (source, target) or (X,y)
- Parameters:
X (Union[pandas.DataFrame, dask.DataFrame, numpy.ndarray, scipy.sparse.csr.csr_matrix]) – If str, URI for the dataset. The dataset could be read from local or network file system, hdfs, s3 and gcs. Should be None if X_train, y_train, X_test, Y_test are provided.
y (Union[str, pandas.DataFrame, dask.DataFrame, pandas.Series, dask.Series, numpy.ndarray]) – If str, name of the target in X, otherwise series of labels corresponding to X
name (str, optional) – Name to identify this data
dataset_type (ADSDataset, optional) – When this value is available, would be used to evaluate the ads task type
kwargs – Additional keyword arguments that would be passed to the underlying Pandas read API.
- Returns:
ads_data – A built ADSData object
- Return type:
Examples
>>> data = open_csv("my.csv")
>>> data_ads = ADSData(data, 'target').build(data, 'target')
- to_onnxrt(sess, idx_range=None, model=None, impute_values={}, **kwargs)
Returns itself formatted as an input for the onnxruntime session inputs passed in.
- Parameters:
sess (Session) – The session object
idx_range (Range) – The range of inputs to convert to onnx
model (SupportedModel) – A model that supports being serialized for the onnx runtime.
kwargs (additional keyword arguments) –
sess_inputs - Pass in the output from onnxruntime.InferenceSession(“model.onnx”).get_inputs()
input_dtypes (list) - If sess_inputs cannot be passed in, pass in the numpy dtypes of each input
input_shapes (list) - If sess_inputs cannot be passed in, pass in the shape of each input
input_names (list) - If sess_inputs cannot be passed in, pass in the name of each input
- Returns:
ort – array of inputs formatted for the given session.
- Return type:
Array
ads.common.model module
- class ads.common.model.ADSModel(est, target=None, transformer_pipeline=None, client=None, booster=None, classes=None, name=None)
Bases:
object
Construct an ADSModel
- Parameters:
est (fitted estimator object) – The estimator can be a standard sklearn estimator, a keras, lightgbm, or xgboost estimator, or any other object that implements methods from (BaseEstimator, RegressorMixin) for regression or (BaseEstimator, ClassifierMixin) for classification.
target (PandasSeries) – The target column you are using in your dataset; it is assigned as the “y” attribute.
transformer_pipeline (TransformerPipeline) – A custom transformer pipeline object.
client (Str) – Currently unused.
booster (Str) – Currently unused.
classes (list, optional) – List of target classes. Required for classification problem if the est does not contain classes attribute.
name (str, optional) – Name of the model.
- static convert_dataframe_schema(df, drop=None)
- feature_names(X=None)
- static from_estimator(est, transformers=None, classes=None, name=None)
Build ADSModel from a fitted estimator
- Parameters:
est (fitted estimator object) – The estimator can be a standard sklearn estimator or any object that implements methods from (BaseEstimator, RegressorMixin) for regression or (BaseEstimator, ClassifierMixin) for classification.
transformers (a scalar or an iterable of objects implementing transform function, optional) – The transform function is applied to the data before calling predict and predict_proba on the estimator.
classes (list, optional) – List of target classes. Required for classification problem if the est does not contain classes attribute.
name (str, optional) – Name of the model.
- Returns:
model
- Return type:
Examples
>>> model = MyModelClass.train()
>>> model_ads = ADSModel.from_estimator(model)
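The way transformers are applied before predict can be illustrated without any ML library. The following is a sketch under assumptions: `PipelineModel`, `AddOne`, `Double` and `ThresholdEstimator` are hypothetical classes used only to show a scalar-or-iterable `transformers` argument being applied in order ahead of the estimator; ADSModel itself may work differently.

```python
from collections.abc import Iterable


class PipelineModel:
    """Hypothetical wrapper that applies transformers, then calls the estimator."""

    def __init__(self, est, transformers=None):
        self.est = est
        if transformers is None:
            self.transformers = []
        elif isinstance(transformers, Iterable):
            self.transformers = list(transformers)
        else:
            # A single (scalar) transformer is wrapped in a list.
            self.transformers = [transformers]

    def transform(self, X):
        # Apply each transformer in the order given.
        for transformer in self.transformers:
            X = transformer.transform(X)
        return X

    def predict(self, X):
        return self.est.predict(self.transform(X))


class AddOne:
    def transform(self, X):
        return [x + 1 for x in X]


class Double:
    def transform(self, X):
        return [x * 2 for x in X]


class ThresholdEstimator:
    """Toy estimator: predicts 1 for values above 5, else 0."""

    def predict(self, X):
        return [int(x > 5) for x in X]
```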
- static get_init_types(df, underlying_model=None)
- is_classifier()
Returns True if ADS believes that the model is a classifier
- Returns:
True if the model is a classifier, False otherwise.
- Return type:
bool
- predict(X)
Runs the model’s predict function on some data
- Parameters:
X (ADSData) – An ADSData object which holds the examples to be predicted on.
- Returns:
Usually a list or PandasSeries of predictions
- Return type:
Union[List, pandas.Series], depending on the estimator
- predict_proba(X)
Runs the model’s predict probabilities function on some data
- Parameters:
X (ADSData) – An ADSData object which holds the examples to be predicted on.
- Returns:
Usually a list or PandasSeries of predictions
- Return type:
Union[List, pandas.Series], depending on the estimator
- prepare(target_dir=None, data_sample=None, X_sample=None, y_sample=None, include_data_sample=False, force_overwrite=False, fn_artifact_files_included=False, fn_name='model_api', inference_conda_env=None, data_science_env=False, ignore_deployment_error=False, use_case_type=None, inference_python_version=None, imputed_values={}, **kwargs)
Prepare model artifact directory to be published to model catalog
- Parameters:
target_dir (str, default: model.name[:12]) – Target directory under which the model artifact files need to be added
data_sample (ADSData) – Note: This format is preferable to X_sample and y_sample. A sample of the test data that will be provided to the predict() API of the scoring script. Used to generate schema_input.json and schema_output.json, which define the input and output formats.
X_sample (pandas.DataFrame) – A sample of input data that will be provided to the predict() API of the scoring script. Used to generate schema.json, which defines the input formats.
y_sample (pandas.Series) – A sample of output data that is expected to be returned by the predict() API of the scoring script, corresponding to X_sample. Used to generate schema_output.json, which defines the output formats.
force_overwrite (bool, default: False) – If True, overwrites the target directory if exists already
fn_artifact_files_included (bool, default: False) – If True, generates artifacts to export a model as a function without ads dependency
fn_name (str, default: 'model_api') – Required parameter if fn_artifact_files_included parameter is setup.
inference_conda_env (str, default: None) – Conda environment to use within the model deployment service for inferencing
data_science_env (bool, default: False) – If set to True, datascience environment represented by the slug in the training conda environment will be used.
ignore_deployment_error (bool, default: False) – If set to True, the prepare will ignore all the errors that may impact model deployment
use_case_type (str) – The use case type of the model. Use it through the UseCaseType class or a string provided in UseCaseType. For example, use_case_type=UseCaseType.BINARY_CLASSIFICATION or use_case_type="binary_classification". Check the UseCaseType class to see all supported types.
inference_python_version (str, default:None.) – If provided will be added to the generated runtime yaml
**kwargs –
max_col_num ((int, optional). Defaults to utils.DATA_SCHEMA_MAX_COL_NUM.) – The maximum column size of the data that allows to auto generate schema.
- Returns:
model_artifact
- Return type:
an instance of ModelArtifact that can be used to test the generated scoring script
- rename(name)
Changes the name of a model
- Parameters:
name (str) – A string which is supplied for naming a model.
- score(X, y_true, score_fn=None)
Scores a model according to a custom score function
- Parameters:
X (ADSData) – An ADSData object which holds the examples to be predicted on.
y_true (ADSData) – An ADSData object which holds ground truth labels for the examples which are being predicted on.
score_fn (Scorer (callable)) – A callable object that returns a score, usually created with sklearn.metrics.make_scorer().
- Returns:
Almost always a scalar score (usually a float).
- Return type:
float, depending on the estimator
- show_in_notebook()
Describe the model by showing its properties
- summary()
A summary of the ADSModel
- transform(X)
Process some ADSData through the selected ADSModel transformers
- Parameters:
X (ADSData) – A ADSData object which holds the examples to be transformed.
- visualize_transforms()
A graph of the ADSModel transformer pipeline. It is only supported in JupyterLab notebooks.
ads.common.model_metadata module
This module was created for backward compatibility. The original model_metadata was moved to the ads.model package.
ads.common.decorator.runtime_dependency module
The module that provides the decorator helping to add runtime dependencies in functions.
Examples
>>> @runtime_dependency(module="pandas", short_name="pd")
... def test_function():
...     print(pd)
>>> @runtime_dependency(module="pandas", object="DataFrame", short_name="df")
... def test_function():
...     print(df)
>>> @runtime_dependency(module="pandas", short_name="pd")
... @runtime_dependency(module="pandas", object="DataFrame", short_name="df")
... def test_function():
...     print(df)
...     print(pd)
>>> @runtime_dependency(module="pandas", object="DataFrame", short_name="df", install_from="ads[optional]")
... def test_function():
...     pass
>>> @runtime_dependency(module="pandas", object="DataFrame", short_name="df", err_msg="Custom error message.")
... def test_function():
...     pass
- class ads.common.decorator.runtime_dependency.OptionalDependency
Bases:
object
- BDS = 'oracle-ads[bds]'
- BOOSTED = 'oracle-ads[boosted]'
- DATA = 'oracle-ads[data]'
- GEO = 'oracle-ads[geo]'
- LABS = 'oracle-ads[labs]'
- MYSQL = 'oracle-ads[mysql]'
- NOTEBOOK = 'oracle-ads[notebook]'
- ONNX = 'oracle-ads[onnx]'
- OPCTL = 'oracle-ads[opctl]'
- OPTUNA = 'oracle-ads[optuna]'
- PYTORCH = 'oracle-ads[torch]'
- SPARK = 'oracle-ads[spark]'
- TENSORFLOW = 'oracle-ads[tensorflow]'
- TEXT = 'oracle-ads[text]'
- VIZ = 'oracle-ads[viz]'
- ads.common.decorator.runtime_dependency.runtime_dependency(module: str, short_name: str = '', object: Optional[str] = None, install_from: Optional[str] = None, err_msg: str = '', is_for_notebook_only=False)
A decorator that helps add runtime dependencies to functions.
- Parameters:
module (str) – The module name to be imported.
short_name ((str, optional). Defaults to empty string.) – The short name for the imported module.
object ((str, optional). Defaults to None.) – The name of the object to be imported. Can be a function or a class, or any variable provided by module.
install_from ((str, optional). Defaults to None.) – Indicates where the required dependency can be installed from.
err_msg ((str, optional). Defaults to empty string.) – The custom error message.
is_for_notebook_only ((bool, optional). Defaults to False.) – If the value of this flag is set to True, the dependency will be added only in case when the current environment is a jupyter notebook.
- Raises:
ModuleNotFoundError – If the requested module is not found.
ImportError – If the object cannot be imported from the module.
Examples
>>> @runtime_dependency(module="pandas", short_name="pd")
... def test_function():
...     print(pd)
>>> @runtime_dependency(module="pandas", object="DataFrame", short_name="df")
... def test_function():
...     print(df)
>>> @runtime_dependency(module="pandas", short_name="pd")
... @runtime_dependency(module="pandas", object="DataFrame", short_name="df")
... def test_function():
...     print(df)
...     print(pd)
>>> @runtime_dependency(module="pandas", object="DataFrame", short_name="df", install_from="ads[optional]")
... def test_function():
...     pass
>>> @runtime_dependency(module="pandas", object="DataFrame", short_name="df", err_msg="Custom error message.")
... def test_function():
...     pass
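The general idea behind a decorator of this kind can be sketched as follows. This is a minimal illustration and not the ADS source: the name `lazy_dependency`, the parameter name `obj`, and the way the imported name is exposed through the function's globals are all assumptions made for the example.

```python
import importlib
from functools import wraps


def lazy_dependency(module, short_name="", obj=None, err_msg=""):
    """Hypothetical stand-in for runtime_dependency; not the ADS implementation."""

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # Import only when the decorated function is actually called.
            try:
                imported = importlib.import_module(module)
            except ModuleNotFoundError as exc:
                message = err_msg or f"The `{module}` module was not found."
                raise ModuleNotFoundError(message) from exc
            target = imported
            if obj is not None:
                try:
                    target = getattr(imported, obj)
                except AttributeError as exc:
                    raise ImportError(f"Cannot import `{obj}` from `{module}`.") from exc
            # Assumption: expose the import under its short name in the function's globals.
            func.__globals__[short_name or obj or module] = target
            return func(*args, **kwargs)

        return wrapper

    return decorator


@lazy_dependency(module="json", short_name="js")
def dump_pair():
    return js.dumps({"a": 1})


@lazy_dependency(module="no_such_module_xyz", err_msg="Custom error message.")
def needs_missing_module():
    pass
```

Calling `dump_pair()` imports `json` on first use, while `needs_missing_module()` raises `ModuleNotFoundError` carrying the custom message.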
ads.common.decorator.deprecate module
- class ads.common.decorator.deprecate.TARGET_TYPE(value)
Bases:
Enum
An enumeration.
- ATTRIBUTE = 'Attribute'
- CLASS = 'Class'
- METHOD = 'Method'
- ads.common.decorator.deprecate.deprecated(deprecated_in: str, removed_in: Optional[str] = None, details: Optional[str] = None, target_type: Optional[str] = None)
This is a decorator which can be used to mark functions as deprecated. It will result in a warning being emitted when the function is used.
- Parameters:
deprecated_in (str) – Version of ADS in which this function was deprecated.
removed_in (str) – Future version in which this function will be removed.
details (str) – More information to be shown.
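A deprecation decorator with these parameters could look roughly like the sketch below. It is an illustration only: the name `deprecated_sketch`, the message wording, and the version numbers are hypothetical, and `target_type` is omitted for brevity.

```python
import warnings
from functools import wraps


def deprecated_sketch(deprecated_in, removed_in=None, details=None):
    """Hypothetical illustration of a deprecation decorator; not the ADS source."""

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            message = f"{func.__name__} is deprecated in {deprecated_in}"
            if removed_in:
                message += f" and will be removed in {removed_in}"
            message += "."
            if details:
                message += f" {details}"
            # stacklevel=2 points the warning at the caller, not at this wrapper.
            warnings.warn(message, DeprecationWarning, stacklevel=2)
            return func(*args, **kwargs)

        return wrapper

    return decorator


# The version strings below are made up for the example.
@deprecated_sketch("2.6.0", removed_in="3.0.0", details="Use new_add instead.")
def old_add(a, b):
    return a + b
```

The decorated function still returns its normal result; the warning is emitted on every call.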
ads.common.model_introspect module
The module that helps to minimize the number of errors in the model post-deployment process. The module provides a simple testing harness to ensure that model artifacts are thoroughly tested before being saved to the model catalog.
Classes
- ModelIntrospect
Class to introspect model artifacts.
Examples
>>> model_introspect = ModelIntrospect(artifact=model_artifact)
>>> model_introspect()
... Test key Test name Result Message
... ----------------------------------------------------------------------------
... test_key_1 test_name_1 Passed test passed
... test_key_2 test_name_2 Not passed some error occurred
>>> model_introspect.status
... Passed
- class ads.common.model_introspect.Introspectable
Bases:
ABC
Base class that represents an introspectable object.
- exception ads.common.model_introspect.IntrospectionNotPassed
Bases:
ValueError
- class ads.common.model_introspect.ModelIntrospect(artifact: Introspectable)
Bases:
object
Class to introspect model artifacts.
- Parameters:
status (str) – Returns the current status of model introspection. The possible variants: Passed, Not passed, Not tested.
failures (int) – Returns the number of failures of introspection result.
- run(self) None
Invokes model artifacts introspection.
- to_dataframe(self) pd.DataFrame
Serializes model introspection result into a DataFrame.
Examples
>>> model_introspect = ModelIntrospect(artifact=model_artifact)
>>> result = model_introspect()
... Test key Test name Result Message
... ----------------------------------------------------------------------------
... test_key_1 test_name_1 Passed test passed
... test_key_2 test_name_2 Not passed some error occurred
Initializes the Model Introspect.
- Parameters:
artifact (Introspectable) – The instance of ModelArtifact object.
- Raises:
ValueError – If the model artifact object is not provided.
TypeError – If the provided input parameter is not a ModelArtifact instance.
- property failures: int
Calculates the number of failures.
- Returns:
The number of failures.
- Return type:
int
- run() DataFrame
Invokes introspection.
- Returns:
The introspection result in a DataFrame format.
- Return type:
pd.DataFrame
- property status: str
Gets the current status of model introspection.
- to_dataframe() DataFrame
Serializes model introspection result into a DataFrame.
- Returns:
The model introspection result in a DataFrame representation.
- Return type:
pandas.DataFrame
- class ads.common.model_introspect.PrintItem(key: str = '', case: str = '', result: str = '', message: str = '')
Bases:
object
Class represents the model introspection print item.
- case: str = ''
- key: str = ''
- message: str = ''
- result: str = ''
- to_list() List[str]
Converts instance to a list representation.
- Returns:
The instance in a list representation.
- Return type:
List[str]
ads.common.model_export_util module
- ads.common.model_export_util.prepare_generic_model(model_path: str, fn_artifact_files_included: bool = False, fn_name: str = 'model_api', force_overwrite: bool = False, model: Any = None, data_sample: ADSData = None, use_case_type=None, X_sample: Union[list, tuple, Series, ndarray, DataFrame] = None, y_sample: Union[list, tuple, Series, ndarray, DataFrame] = None, **kwargs) ModelArtifact
Generates template files to aid model deployment. The model can be accompanied by other artifacts, all of which can be dumped at model_path. The following files are generated:
* func.yaml
* func.py
* requirements.txt
* score.py
- Parameters:
model_path (str) – Path where the artifacts must be saved. The serialized model object and any other associated files/objects must be saved in the model_path directory
fn_artifact_files_included (bool) – Default is False, if turned off, function artifacts are not generated.
fn_name (str) – Optional parameter to specify the function name.
force_overwrite (bool) – Optional parameter to specify whether the model_artifact should overwrite the existing model_path (if it exists).
model ((Any, optional). Defaults to None.) – This is an optional model object which is only used to extract taxonomy metadata. Supported models: automl, keras, lightgbm, pytorch, sklearn, tensorflow, and xgboost. If the model's framework is not supported, extracting taxonomy metadata will be skipped. The alternative is to use artifact.populate_metadata(model=model, use_case_type=UseCaseType.REGRESSION).
data_sample (ADSData) – A sample of the test data that will be provided to the predict() API of the scoring script. Used to generate schema_input and schema_output.
use_case_type (str) – The use case type of the model
X_sample (Union[list, tuple, pd.Series, np.ndarray, pd.DataFrame, dask.dataframe.core.Series, dask.dataframe.core.DataFrame]) – A sample of input data that will be provided to the predict() API of the scoring script. Used to generate the input schema.
y_sample (Union[list, tuple, pd.Series, np.ndarray, pd.DataFrame, dask.dataframe.core.Series, dask.dataframe.core.DataFrame]) – A sample of output data that is expected to be returned by the predict() API of the scoring script, corresponding to X_sample. Used to generate the output schema.
**kwargs –
data_science_env (bool, default: False) – If set to True, the datascience environment represented by the slug in the training conda environment will be used.
inference_conda_env (str, default: None) – Conda environment to use within the model deployment service for inferencing. For example, oci://bucketname@namespace/path/to/conda/env
ignore_deployment_error (bool, default: False) – If set to True, the prepare method will ignore all the errors that may impact model deployment.
underlying_model (str, default: 'UNKNOWN') – Underlying model type. Can be “automl”, “sklearn”, “h2o”, “lightgbm”, “xgboost”, “torch”, “mxnet”, “tensorflow”, “keras”, “pyod”, etc.
model_libs (dict, default: {}) – Model required libraries where the key is the library names and the value is the library versions. For example, {numpy: 1.21.1}.
progress (int, default: None) – max number of progress.
inference_python_version (str, default: None) – If provided, it will be added to the generated runtime yaml.
max_col_num ((int, optional). Defaults to utils.DATA_SCHEMA_MAX_COL_NUM.) – The maximum number of columns the data may have for a schema to be auto-generated.
Examples
>>> import cloudpickle
>>> import os
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.datasets import make_classification
>>> import ads
>>> from ads.common.model_export_util import prepare_generic_model
>>> import yaml
>>> import oci
>>>
>>> ads.set_auth('api_key', oci_config_location=oci.config.DEFAULT_LOCATION, profile='DEFAULT')
>>> model_artifact_location = os.path.expanduser('~/myusecase/model/')
>>> inference_conda_env="oci://my-bucket@namespace/conda_environments/cpu/Data_Exploration_and_Manipulation_for_CPU_Python_3.7/2.0/dataexpl_p37_cpu_v2"
>>> inference_python_version = "3.7"
>>> if not os.path.exists(model_artifact_location):
...     os.makedirs(model_artifact_location)
>>> X, y = make_classification(n_samples=100, n_features=20, n_classes=2)
>>> lrmodel = LogisticRegression().fit(X, y)
>>> with open(os.path.join(model_artifact_location, 'model.pkl'), "wb") as mfile:
...     cloudpickle.dump(lrmodel, mfile)
>>> modelartifact = prepare_generic_model(
...     model_artifact_location,
...     model = lrmodel,
...     force_overwrite=True,
...     inference_conda_env=inference_conda_env,
...     ignore_deployment_error=True,
...     inference_python_version=inference_python_version
... )
>>> modelartifact.reload() # Call reload to update the ModelArtifact object with the generated score.py
>>> assert len(modelartifact.predict(X[:5])['prediction']) == 5 # Test the generated score.py works. This may require customization.
>>> with open(os.path.join(model_artifact_location, "runtime.yaml")) as rf:
...     content = yaml.load(rf, Loader=yaml.FullLoader)
...     assert content['MODEL_DEPLOYMENT']['INFERENCE_CONDA_ENV']['INFERENCE_ENV_PATH'] == inference_conda_env
...     assert content['MODEL_DEPLOYMENT']['INFERENCE_CONDA_ENV']['INFERENCE_PYTHON_VERSION'] == inference_python_version
>>> # Save Model to model artifact
>>> ocimodel = modelartifact.save(
...     project_id="oci1......", # OCID of the project with which the model is to be associated
...     compartment_id="oci1......", # OCID of the compartment where the model will reside
...     display_name="LRModel_01",
...     description="My Logistic Regression Model",
...     ignore_pending_changes=True,
...     timeout=100,
...     ignore_introspection=True,
... )
>>> print(f"The OCID of the model is: {ocimodel.id}")
- Returns:
model_artifact – A generic model artifact
- Return type:
ads.model_artifact.model_artifact
- ads.common.model_export_util.serialize_model(model=None, target_dir=None, X=None, y=None, model_type=None, **kwargs)
- Parameters:
model (ads.Model) – A model to be serialized
target_dir (str, optional) – directory to output the serialized model
X (Union[pandas.DataFrame, pandas.Series]) – The X data
y (Union[list, pandas.DataFrame, pandas.Series]) – The y data
model_type (str, optional) – A string corresponding to the model type
- Returns:
model_kwargs – A dictionary of model kwargs for the serialized model
- Return type:
Dict
ads.common.function.fn_util module
- ads.common.function.fn_util.generate_fn_artifacts(path: str, fn_name: Optional[str] = None, fn_attributes=None, artifact_type_generic=False, **kwargs)
Generates artifacts for fn (https://fnproject.io) at the provided path:
func.py
func.yaml
requirements.txt, if not already there. If it exists, fdk is appended to the file.
score.py
- Parameters:
path (str) – Target folder where the artifacts are placed.
fn_attributes (dict) – dictionary specifying all the function attributes as described in https://github.com/fnproject/docs/blob/master/fn/develop/func-file.md
artifact_type_generic (bool) – default is False. This attribute decides which template to pick for score.py. If True, it is assumed that the code to load is provided by the user.
- ads.common.function.fn_util.get_function_config() dict
Returns dictionary loaded from func_conf.yaml
- ads.common.function.fn_util.prepare_fn_attributes(func_name: str, schema_version=20180708, version=None, python_runtime=None, entry_point=None, memory=None) dict
Workaround for collections.namedtuples. The defaults are not supported.
- ads.common.function.fn_util.write_score(path, **kwargs)
ads.common.utils module
- exception ads.common.utils.FileOverwriteError
Bases:
Exception
- class ads.common.utils.JsonConverter(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)
Bases:
JSONEncoder
Constructor for JSONEncoder, with sensible defaults.
If skipkeys is false, then it is a TypeError to attempt encoding of keys that are not str, int, float or None. If skipkeys is True, such items are simply skipped.
If ensure_ascii is true, the output is guaranteed to be str objects with all incoming non-ASCII characters escaped. If ensure_ascii is false, the output can contain non-ASCII characters.
If check_circular is true, then lists, dicts, and custom encoded objects will be checked for circular references during encoding to prevent an infinite recursion (which would cause an OverflowError). Otherwise, no such check takes place.
If allow_nan is true, then NaN, Infinity, and -Infinity will be encoded as such. This behavior is not JSON specification compliant, but is consistent with most JavaScript based encoders and decoders. Otherwise, it will be a ValueError to encode such floats.
If sort_keys is true, then the output of dictionaries will be sorted by key; this is useful for regression tests to ensure that JSON serializations can be compared on a day-to-day basis.
If indent is a non-negative integer, then JSON array elements and object members will be pretty-printed with that indent level. An indent level of 0 will only insert newlines. None is the most compact representation.
If specified, separators should be an (item_separator, key_separator) tuple. The default is (', ', ': ') if indent is None and (',', ': ') otherwise. To get the most compact JSON representation, you should specify (',', ':') to eliminate whitespace.
If specified, default is a function that gets called for objects that can’t otherwise be serialized. It should return a JSON encodable version of the object or raise a TypeError.
- ads.common.utils.batch_convert_case(spec: dict, to_fmt: str) Dict
Convert the case of a dictionary of spec from camel to snake or vice versa.
- Parameters:
spec (dict) – dictionary of spec to convert
to_fmt (str) – format to convert to, can be “camel” or “snake”
- Returns:
dictionary of converted spec
- Return type:
dict
- ads.common.utils.camel_to_snake(name: str) str
Converts the camel case string to the snake representation.
- Parameters:
name (str) – The name to convert.
- Returns:
The name converted to the snake representation.
- Return type:
str
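A camel-to-snake conversion can be sketched with a single regular expression. The helper name `camel_to_snake_sketch` is hypothetical and the real ADS function may treat edge cases differently; for instance, this sketch splits every capital letter, so an acronym such as "HTTPServer" becomes "h_t_t_p_server".

```python
import re


def camel_to_snake_sketch(name: str) -> str:
    """Illustrative only; not the ADS implementation."""
    # Insert "_" before every uppercase letter except one at the very start,
    # then lowercase the whole string.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


snake_a = camel_to_snake_sketch("displayName")
snake_b = camel_to_snake_sketch("CompartmentId")
```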
- ads.common.utils.copy_file(uri_src: str, uri_dst: str, force_overwrite: Optional[bool] = False, auth: Optional[Dict] = None, chunk_size: Optional[int] = 8192, progressbar_description: Optional[str] = 'Copying `{uri_src}` to `{uri_dst}`') str
Copies file from uri_src to uri_dst. If uri_dst specifies a directory, the file will be copied into uri_dst using the base filename from uri_src. Returns the path to the newly created file.
- Parameters:
uri_src (str) – The URI of the source file, which can be local path or OCI object storage URI.
uri_dst (str) – The URI of the destination file, which can be local path or OCI object storage URI.
force_overwrite ((bool, optional). Defaults to False.) – Whether to overwrite existing files or not.
auth ((Dict, optional). Defaults to None.) – The default authentication is set using the ads.set_auth API. If you need to override the default, use ads.common.auth.api_keys or ads.common.auth.resource_principal to create the appropriate authentication signer and kwargs required to instantiate an IdentityClient object.
chunk_size ((int, optional). Defaults to DEFAULT_BUFFER_SIZE) – How much data can be copied in one iteration.
- Returns:
The path to the newly created file.
- Return type:
str
- Raises:
FileExistsError – If a destination file exists and force_overwrite set to False.
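The documented behaviour (directory destinations keep the source's base filename, copying proceeds in chunks, and an existing destination is refused unless force_overwrite is set) can be illustrated for local paths only. The sketch below is an assumption-laden stand-in named `copy_file_sketch`; it does not handle OCI Object Storage URIs, authentication, or progress bars.

```python
import os
import tempfile


def copy_file_sketch(src, dst, force_overwrite=False, chunk_size=8192):
    """Local-path illustration only; not the ADS implementation."""
    # A directory destination keeps the source's base filename.
    if os.path.isdir(dst):
        dst = os.path.join(dst, os.path.basename(src))
    if os.path.exists(dst) and not force_overwrite:
        raise FileExistsError(f"`{dst}` already exists; pass force_overwrite=True to replace it.")
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        # Read and write at most chunk_size bytes per iteration.
        for chunk in iter(lambda: fin.read(chunk_size), b""):
            fout.write(chunk)
    return dst


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "data.bin")
    with open(src, "wb") as f:
        f.write(b"x" * 20000)  # larger than one chunk
    target_dir = os.path.join(tmp, "out")
    os.makedirs(target_dir)
    copied = copy_file_sketch(src, target_dir)
    copied_name = os.path.basename(copied)
    copied_size = os.path.getsize(copied)
    try:
        copy_file_sketch(src, target_dir)  # destination now exists
        second_copy_refused = False
    except FileExistsError:
        second_copy_refused = True
```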
- ads.common.utils.copy_from_uri(uri: str, to_path: str, unpack: Optional[bool] = False, force_overwrite: Optional[bool] = False, auth: Optional[Dict] = None) None
Copies file(s) to local path. Can be a folder, archived folder or a separate file. The source files can be located in a local folder or in OCI Object Storage.
- Parameters:
uri (str) – The URI of the source file or directory, which can be local path or OCI object storage URI.
to_path (str) – The local destination path. If this is a directory, the source files will be placed under it.
unpack ((bool, optional). Defaults to False.) – Indicate if zip or tar.gz file specified by the uri should be unpacked. This option has no effect on other files.
force_overwrite ((bool, optional). Defaults to False.) – Whether to overwrite existing files or not.
auth ((Dict, optional). Defaults to None.) – The default authentication is set using the ads.set_auth API. If you need to override the default, use ads.common.auth.api_keys or ads.common.auth.resource_principal to create the appropriate authentication signer and kwargs required to instantiate an IdentityClient object.
- Returns:
Nothing
- Return type:
None
- Raises:
ValueError – If the destination path already exists and force_overwrite is set to False.
- ads.common.utils.download_from_web(url: str, to_path: str) None
Downloads a single file from http/https/ftp.
- Parameters:
url (str) – The URL of the source file.
to_path (path-like object) – Local destination path.
- Returns:
Nothing
- Return type:
None
- ads.common.utils.ellipsis_strings(raw, n=24)
Takes a sequence (<string>, list(<string>), tuple(<string>), pd.Series(<string>)) and ellipsizes each string at position n.
- ads.common.utils.extract_lib_dependencies_from_model(model) dict
Extract a dictionary of library dependencies for a model
- Parameters:
model –
- Returns:
A dictionary of library dependencies.
- Return type:
Dict
- ads.common.utils.first_not_none(itr)
Returns the first non-None result from an iterable. Similar to any(), but returns the value itself rather than True/False.
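The difference from any() is easiest to see with a falsy value such as 0. The following sketch uses the hypothetical name `first_not_none_sketch`, and returning None when nothing qualifies is an assumption about the behaviour.

```python
def first_not_none_sketch(itr):
    """Illustrative only; not the ADS implementation."""
    # Assumption: return None when every item is None.
    return next((item for item in itr if item is not None), None)


values = [None, 0, 5]
first_value = first_not_none_sketch(values)  # 0 is kept because it is not None
any_result = any(values)                     # any() only reports truthiness
```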
- ads.common.utils.flatten(d, parent_key='')
Flattens nested dictionaries to a single layer dictionary
- Parameters:
d (dict) – The dictionary that needs to be flattened
parent_key (str) – Keys in the dictionary that are nested
- Returns:
a_dict – a single layer dictionary
- Return type:
dict
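Flattening of this kind is usually done recursively. The sketch below is not the ADS implementation: the name `flatten_sketch` and the "." key separator are assumptions, since the separator is not specified above.

```python
def flatten_sketch(d, parent_key="", sep="."):
    """Illustrative only; the "." separator is an assumption."""
    items = {}
    for key, value in d.items():
        # Prefix nested keys with the path of their parents.
        new_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        if isinstance(value, dict):
            items.update(flatten_sketch(value, new_key, sep))
        else:
            items[new_key] = value
    return items


flat = flatten_sketch({"a": {"b": 1, "c": {"d": 2}}, "e": 3})
```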
- ads.common.utils.folder_size(path: str) int
Recursively calculates the size of the folder at path.
- Parameters:
path (str) – Path to the folder.
- Returns:
The size of the folder in bytes.
- Return type:
int
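A recursive size calculation can be sketched with os.walk. The helper name `folder_size_sketch` is hypothetical, and skipping symbolic links is a choice made for this example rather than documented ADS behaviour.

```python
import os
import tempfile


def folder_size_sketch(path):
    """Illustrative only; not the ADS implementation."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            # Assumption: symbolic links are not counted.
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "nested"))
    with open(os.path.join(tmp, "a.txt"), "wb") as f:
        f.write(b"0123456789")  # 10 bytes
    with open(os.path.join(tmp, "nested", "b.txt"), "wb") as f:
        f.write(b"01234")  # 5 bytes
    total_bytes = folder_size_sketch(tmp)
```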
- ads.common.utils.generate_requirement_file(requirements: dict, file_path: str, file_name: str = 'requirements.txt')
Generate requirements file at file_path.
- Parameters:
requirements (dict) – Key is the library name and value is the version
file_path (str) – Directory to save requirements.txt
file_name (str) – Optional parameter to specify the file name
- ads.common.utils.get_base_modules(model)
Get the base modules from an ADS model
- ads.common.utils.get_bootstrap_styles()
Returns HTML bootstrap style information
- ads.common.utils.get_compute_accelerator_ncores()
- ads.common.utils.get_cpu_count()
Returns the number of CPUs available on this machine
- ads.common.utils.get_dataframe_styles(max_width=75)
Styles used for dataframes. Example usage:
df.style.set_table_styles(utils.get_dataframe_styles()).set_table_attributes('class=table').render()
- Returns:
styles – A list of dataframe table styler styles.
- Return type:
array
- ads.common.utils.get_files(directory: str)
List out all the file names under this directory.
- Parameters:
directory (str) – The directory to list out all the files from.
- Returns:
List of the files in the directory.
- Return type:
List
- ads.common.utils.get_oci_config()
Returns the OCI config location, and the OCI config profile.
- ads.common.utils.get_progress_bar(max_progress, description='Initializing')
Returns an instance of ProgressBar suited to the runtime environment.
- ads.common.utils.get_random_name_for_resource() str
Returns a randomly generated, easy-to-remember name. It consists of one adjective and one animal word, followed by a UTC timestamp (joined with ‘-‘). This is the ADS default resource name generated for models, jobs, jobruns, model deployments, and pipelines.
- Returns:
Randomly generated easy to remember name for oci resources - models, jobs, jobruns, model deployments, pipelines. Example: polite-panther-2022-08-17-21:15.46; strange-spider-2022-08-17-23:55.02
- Return type:
str
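The naming scheme can be reproduced in a few lines. In this sketch the word lists and the name `random_resource_name_sketch` are made up; only the overall shape (adjective, animal, UTC timestamp in the format of the examples above) is taken from the description.

```python
import random
import re
from datetime import datetime, timezone

# Hypothetical word lists; the lists used by ADS are not shown in this document.
ADJECTIVES = ["polite", "strange", "gentle"]
ANIMALS = ["panther", "spider", "otter"]

# Shape of the documented examples, e.g. polite-panther-2022-08-17-21:15.46
NAME_PATTERN = re.compile(r"^[a-z]+-[a-z]+-\d{4}-\d{2}-\d{2}-\d{2}:\d{2}\.\d{2}$")


def random_resource_name_sketch(rng=random):
    """Illustrative only; not the ADS implementation."""
    timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%d-%H:%M.%S")
    return "-".join([rng.choice(ADJECTIVES), rng.choice(ANIMALS), timestamp])


example_name = random_resource_name_sketch()
```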
- ads.common.utils.get_sqlalchemy_engine(connection_url, *args, **kwargs)
The SQLAlchemy docs say to use a single engine per connection_url; this function takes care of that.
- Parameters:
connection_url (string) – The URL to connect to
- Returns:
engine – The engine on which SQLAlchemy commands can be run
- Return type:
SQLAlchemy engine
- ads.common.utils.get_value(obj, attr, default=None)
Gets a copy of the value from a nested dictionary or an object with nested attributes.
- Parameters:
obj – An object or a dictionary
attr – Attributes as a string separated by dots (.)
default – Default value to be returned if attribute is not found.
- Returns:
A copy of the attribute value. For dict or list, a deepcopy will be returned.
- Return type:
Any
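A dotted-path lookup that works across both dictionaries and attribute-bearing objects can be sketched as below. The name `get_value_sketch` and the sample data are hypothetical, and the real function may resolve paths differently.

```python
import copy
from types import SimpleNamespace


def get_value_sketch(obj, attr, default=None):
    """Illustrative only; not the ADS implementation."""
    current = obj
    for part in attr.split("."):
        if isinstance(current, dict):
            if part not in current:
                return default
            current = current[part]
        elif hasattr(current, part):
            current = getattr(current, part)
        else:
            return default
    # Return a deep copy so callers cannot mutate the original structure.
    return copy.deepcopy(current)


# Made-up sample object mixing attributes and nested dictionaries.
spec = SimpleNamespace(runtime={"conda": {"slug": "example_slug"}})
slug = get_value_sketch(spec, "runtime.conda.slug")
missing = get_value_sketch(spec, "runtime.image", default="n/a")
conda_copy = get_value_sketch(spec, "runtime.conda")
conda_copy["slug"] = "changed"  # does not affect spec
```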
- ads.common.utils.highlight_text(text)
Returns text with html highlights.
- Parameters:
text (String) – The text to be highlighted.
- Returns:
ht – The text with html highlight information.
- Return type:
- ads.common.utils.horizontal_scrollable_div(html)
Wrap html with the necessary html to make horizontal scrolling possible.
Examples
display(HTML(utils.horizontal_scrollable_div(my_html)))
- Parameters:
html (str) – Your HTML to wrap.
- Returns:
Wrapped HTML.
- Return type:
str
- ads.common.utils.human_size(num_bytes: int, precision: Optional[int] = 2) str
Converts bytes size to a string representing its value in B, KB, MB and GB.
- Parameters:
num_bytes (int) – The size in bytes.
precision ((int, optional). Defaults to 2.) – The precision of converting the bytes value.
- Returns:
A string representing the size in B, KB, MB and GB.
- Return type:
str
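One way such a conversion might be written is shown below. The 1024 base, the exact output format, and the name `human_size_sketch` are assumptions; the real function's formatting is not specified in this document.

```python
def human_size_sketch(num_bytes, precision=2):
    """Illustrative only; base 1024 and the output format are assumptions."""
    units = ["B", "KB", "MB", "GB"]
    size = float(num_bytes)
    for unit in units[:-1]:
        if size < 1024:
            return f"{size:.{precision}f}{unit}"
        size /= 1024
    # Anything at or above 1024 MB is reported in GB.
    return f"{size:.{precision}f}{units[-1]}"


small = human_size_sketch(512)
medium = human_size_sketch(1536)
large = human_size_sketch(5 * 1024 ** 3, precision=1)
```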
- ads.common.utils.inject_and_copy_kwargs(kwargs, **args)
Takes in a dictionary and returns a copy with the args injected
Examples
>>> foo(arg1, args, utils.inject_and_copy_kwargs(kwargs, arg3=12, arg4=42))
- Parameters:
kwargs (dict) – The original kwargs.
**args (type) – A series of arguments, foo=42, bar=12 etc
- Returns:
d – new dictionary object that you can use in place of kwargs
- Return type:
dict
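The copy-then-inject pattern keeps the caller's dictionary untouched. This sketch assumes a shallow copy, which the description above does not confirm; `inject_and_copy_kwargs_sketch` and `plot` are hypothetical names.

```python
def inject_and_copy_kwargs_sketch(kwargs, **args):
    """Illustrative only; a shallow copy is an assumption."""
    merged = dict(kwargs)  # copy, so the original dictionary is not modified
    merged.update(args)    # injected values win over existing keys
    return merged


def plot(**kwargs):
    # Hypothetical consumer that simply echoes its keyword arguments.
    return kwargs


original = {"color": "red"}
result = plot(**inject_and_copy_kwargs_sketch(original, arg3=12, arg4=42))
```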
- ads.common.utils.is_data_too_wide(data: Union[list, tuple, Series, ndarray, DataFrame], max_col_num: int) bool
Returns true if the data has too many columns.
- Parameters:
data (Union[list, tuple, pd.Series, np.ndarray, pd.DataFrame]) – A sample of data that will be used to generate schema.
max_col_num (int) – The maximum number of columns the data may have for a schema to be auto-generated.
- ads.common.utils.is_debug_mode()
Returns true if ADS is in debug mode.
- ads.common.utils.is_documentation_mode()
Returns true if ADS is in documentation mode.
- ads.common.utils.is_notebook()
Returns true if the environment is a jupyter notebook.
- ads.common.utils.is_resource_principal_mode()
Returns true if ADS is in resource principal mode.
- ads.common.utils.is_same_class(obj, cls)
checks to see if object is the same class as cls
- ads.common.utils.is_test()
Returns true if ADS is in test mode.
- class ads.common.utils.ml_task_types(value)
Bases:
Enum
An enumeration.
- BINARY_CLASSIFICATION = 2
- BINARY_TEXT_CLASSIFICATION = 4
- MULTI_CLASS_CLASSIFICATION = 3
- MULTI_CLASS_TEXT_CLASSIFICATION = 5
- REGRESSION = 1
- UNSUPPORTED = 6
- ads.common.utils.numeric_pandas_dtypes()
Returns a list of the “numeric” pandas data types
- ads.common.utils.oci_config_file()
Returns the OCI config file location
- ads.common.utils.oci_config_location()
Returns oci configuration file location.
- ads.common.utils.oci_config_profile()
Returns the OCI config profile location.
- ads.common.utils.oci_key_location()
Returns the OCI key location
- ads.common.utils.oci_key_profile()
Returns key profile value specified in oci configuration file.
- ads.common.utils.print_user_message(msg, display_type='tip', see_also_links=None, title='Tip')
This method is deprecated and will be removed in future releases. Prints in html formatted block one of tip|info|warn type.
- Parameters:
msg (str or list) – The actual message to display. If display_type is “module”, msg can be a list of [module name, module package name], e.g. [“automl”, “ads[ml]”]
display_type (str (default 'tip')) – The type of user message.
see_also_links (list of tuples in the form of [('display_name', 'url')]) –
title (str (default 'tip')) – The title of user message.
- ads.common.utils.random_valid_ocid(prefix='ocid1.dataflowapplication.oc1.iad')
Generates a random valid ocid.
- Parameters:
prefix (str) – A prefix, corresponding to a region location.
- Returns:
ocid – a valid ocid with the given prefix.
- Return type:
str
- ads.common.utils.remove_file(file_path: str, auth: Optional[Dict] = None) None
Removes a file.
- Parameters:
file_path (str) – The path of the source file, which can be local path or OCI object storage URI.
auth ((Dict, optional). Defaults to None.) – The default authentication is set using the ads.set_auth API. If you need to override the default, use ads.common.auth.api_keys or ads.common.auth.resource_principal to create the appropriate authentication signer and kwargs required to instantiate an IdentityClient object.
- Returns:
Nothing.
- Return type:
None
- ads.common.utils.replace_spaces(lst)
Replace all spaces with underscores for strings in the list.
Requires that the list contains strings for each element.
lst: list of strings
- ads.common.utils.set_oci_config(oci_config_location, oci_config_profile)
- Parameters:
oci_config_location – location of the config file, for example, ~/.oci/config
oci_config_profile – The profile to load from the config file. Defaults to “DEFAULT”
- ads.common.utils.snake_to_camel(name: str, capitalized_first_token: Optional[bool] = False) str
Converts the snake case string to the camel representation.
- Parameters:
name (str) – The name to convert.
capitalized_first_token ((bool, optional). Defaults to False.) – Whether the first token needs to be capitalized or not.
- Returns:
The name converted to the camel representation.
- Return type:
str
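The effect of capitalized_first_token is shown in the sketch below. It is an illustration under the assumption that tokens are split on underscores; `snake_to_camel_sketch` is a hypothetical name, not the ADS function.

```python
def snake_to_camel_sketch(name, capitalized_first_token=False):
    """Illustrative only; not the ADS implementation."""
    tokens = name.split("_")
    # The first token stays lowercase unless capitalization is requested.
    first = tokens[0].capitalize() if capitalized_first_token else tokens[0]
    return first + "".join(token.capitalize() for token in tokens[1:])


camel = snake_to_camel_sketch("display_name")
pascal = snake_to_camel_sketch("display_name", capitalized_first_token=True)
```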
- ads.common.utils.split_data(X, y, random_state=42, test_size=0.3)
Splits data using Sklearn based on the input type of the data.
- Parameters:
X (a Pandas Dataframe) – The data points.
y (a Pandas Dataframe) – The labels.
random_state (int) – A random state for reproducibility.
test_size (int) – The number of elements that should be included in the test dataset.
- ads.common.utils.to_dataframe(data: Union[list, tuple, Series, ndarray, DataFrame])
Convert to pandas DataFrame.
- Parameters:
data (Union[list, tuple, pd.Series, np.ndarray, pd.DataFrame]) – Convert data to pandas DataFrame.
- Returns:
pandas DataFrame.
- Return type:
pd.DataFrame
- ads.common.utils.truncate_series_top_n(series, n=24)
Takes a series that can be interpreted as a dict (index=key), sorts it by value, and returns a new series containing the top-n values.
- ads.common.utils.wrap_lines(li, heading='')
Wraps the elements of iterable into multi line string of fixed width
Module contents
ads.common.model_metadata_mixin module
- class ads.common.model_metadata_mixin.MetadataMixin
Bases:
object
MetadataMixin class which populates the custom metadata, taxonomy metadata, input/output schema and provenance metadata.
- populate_metadata(use_case_type: Optional[str] = None, data_sample: Optional[ADSData] = None, X_sample: Optional[Union[list, tuple, DataFrame, Series, ndarray]] = None, y_sample: Optional[Union[list, tuple, DataFrame, Series, ndarray]] = None, training_script_path: Optional[str] = None, training_id: Optional[str] = None, ignore_pending_changes: bool = True, max_col_num: int = 2000)
Populates input schema and output schema. If the schema exceeds the limit of 32kb, save as json files to the artifact directory.
- Parameters:
use_case_type ((str, optional). Defaults to None.) – The use case type of the model.
data_sample ((ADSData, optional). Defaults to None.) – A sample of the data that will be used to generate input_schema and output_schema.
X_sample (Union[list, tuple, pd.Series, np.ndarray, pd.DataFrame]. Defaults to None.) – A sample of input data that will be used to generate input schema.
y_sample (Union[list, tuple, pd.Series, np.ndarray, pd.DataFrame]. Defaults to None.) – A sample of output data that will be used to generate output schema.
training_script_path (str. Defaults to None.) – Training script path.
training_id ((str, optional). Defaults to None.) – The training model OCID.
ignore_pending_changes (bool. Defaults to True.) – Ignore the pending changes in git.
max_col_num ((int, optional). Defaults to utils.DATA_SCHEMA_MAX_COL_NUM.) – The maximum number of columns allowed in auto generated schema.
- Returns:
Nothing.
- Return type:
None
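The size rule mentioned above (schemas over 32kb are written to JSON files in the artifact directory) might be sketched as follows. Everything here is an assumption made for illustration: the limit is taken as 32 * 1024 bytes of UTF-8 JSON, and the helper name, file name, and return convention are hypothetical.

```python
import json
import os
import tempfile

SCHEMA_LIMIT_BYTES = 32 * 1024  # assumption: "32kb" read as 32 * 1024 bytes


def store_schema_sketch(schema, artifact_dir, file_name="input_schema.json"):
    """Illustrative only; returns (inline_json, file_path), one of which is None."""
    payload = json.dumps(schema)
    if len(payload.encode("utf-8")) <= SCHEMA_LIMIT_BYTES:
        return payload, None  # small enough to keep inline
    path = os.path.join(artifact_dir, file_name)
    with open(path, "w") as f:
        f.write(payload)
    return None, path


small_schema = {"schema": [{"name": "col_0", "dtype": "float64"}]}
large_schema = {
    "schema": [
        {"name": f"col_{i}", "dtype": "float64", "description": "x" * 20}
        for i in range(2000)
    ]
}

with tempfile.TemporaryDirectory() as tmp:
    small_inline, small_path = store_schema_sketch(small_schema, tmp)
    large_inline, large_path = store_schema_sketch(large_schema, tmp)
    large_file_written = large_path is not None and os.path.exists(large_path)
```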
- populate_schema(data_sample: Optional[ADSData] = None, X_sample: Optional[Union[List, Tuple, DataFrame, Series, ndarray]] = None, y_sample: Optional[Union[List, Tuple, DataFrame, Series, ndarray]] = None, max_col_num: int = 2000)
Populate input and output schemas. If the schema exceeds the limit of 32kb, save as json files to the artifact dir.
- Parameters:
data_sample (ADSData) – A sample of the data that will be used to generate input_schema and output_schema.
X_sample (Union[list, tuple, pd.Series, np.ndarray, pd.DataFrame]) – A sample of input data that will be used to generate the input schema.
y_sample (Union[list, tuple, pd.Series, np.ndarray, pd.DataFrame]) – A sample of output data that will be used to generate the output schema.
max_col_num ((int, optional). Defaults to utils.DATA_SCHEMA_MAX_COL_NUM.) – The maximum number of columns allowed in auto generated schema.