PyTorchModel¶
Overview¶
The PyTorchModel
class in ADS is designed to allow you to rapidly get a PyTorch model into production. The .prepare()
method creates the model artifacts that are needed to deploy a functioning model without you having to configure it or write code. However, you can customize the required score.py
file.
The .verify()
method simulates a model deployment by calling the load_model()
and predict()
methods in the score.py
file. With the .verify()
method, you can debug your score.py
file without deploying any models. The .save() method saves a model artifact to the model catalog. The .deploy() method deploys a model to a REST endpoint.
The following steps take your trained PyTorch
model and deploy it into production with a few lines of code.
Create a PyTorch Model
Load a ResNet18 model and put it into evaluation mode.
import torch
import torchvision
model = torchvision.models.resnet18(pretrained=True)
model.eval()
Initialize¶
Instantiate a PyTorchModel() object with a PyTorch model. Each instance accepts the following parameters (a minimal instantiation sketch follows the list):

- artifact_dir: str: Artifact directory to store the files needed for deployment.
- auth: (Dict, optional): Defaults to None. The default authentication is set using the ads.set_auth API. To override the default, use ads.common.auth.api_keys() or ads.common.auth.resource_principal() and create the appropriate authentication signer and the **kwargs required to instantiate the IdentityClient object.
- estimator: Callable: Any model object generated by the PyTorch framework.
- properties: (ModelProperties, optional): Defaults to None. The ModelProperties object required to save and deploy the model.
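For example, a minimal instantiation using the ResNet18 model created above and a temporary artifact directory, mirroring the Example at the end of this page:

import tempfile
from ads.model.framework.pytorch_model import PyTorchModel

# Wrap the trained PyTorch model; artifacts are written to a temp directory.
pytorch_model = PyTorchModel(model, artifact_dir=tempfile.mkdtemp())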
The properties parameter is an instance of the ModelProperties class, which has the following predefined fields:

- bucket_uri (str)
- compartment_id (str)
- deployment_access_log_id (str)
- deployment_bandwidth_mbps (int)
- deployment_instance_count (int)
- deployment_instance_shape (str)
- deployment_log_group_id (str)
- deployment_predict_log_id (str)
- inference_conda_env (str)
- inference_python_version (str)
- overwrite_existing_artifact (bool)
- project_id (str)
- remove_existing_artifact (bool)
- training_conda_env (str)
- training_id (str)
- training_python_version (str)
- training_resource_id (str)
- training_script_path (str)
By default, properties is populated from environment variables when it is not specified. For example, in notebook sessions the environment variables for the project id (PROJECT_OCID) and compartment id (NB_SESSION_COMPARTMENT_OCID) are preset. properties populates itself from these environment variables and uses the values in methods such as .save() and .deploy(). Pass in values to overwrite the defaults. When you use a method that accepts an instance of properties, it records the values that you pass in. For example, when you pass inference_conda_env into the .prepare() method, properties records the value. To reuse properties in different places, export it to a YAML file using the .to_yaml() method, and then reload it on a different machine using the .from_yaml() method.
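A hedged sketch of that round trip follows; the uri keyword and the ModelProperties import path are assumptions rather than confirmed API details:

# Export the populated properties to a YAML file (uri keyword assumed).
pytorch_model.properties.to_yaml(uri="properties.yaml")

# On another machine, reload them (import path assumed).
from ads.model.model_properties import ModelProperties
properties = ModelProperties.from_yaml(uri="properties.yaml")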
Summary Status¶
You can call the .summary_status()
method after a model serialization instance such as GenericModel
, SklearnModel
, TensorFlowModel
, or PyTorchModel
is created. The .summary_status()
method returns a Pandas dataframe that guides you through the entire workflow. It shows which methods are available to call and which ones aren’t. Plus it outlines what each method does. If extra actions are required, it also shows those actions.
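For example, you can display the status table at any point in the workflow:

pytorch_model.summary_status()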
The following image displays an example of the summary status table created after a user initiates a model instance. The table's Status column displays Done for the initiate step, and the Details column explains what the initiate step did, such as generating a score.py file. The Step column also lists the prepare(), verify(), save(), deploy(), and predict() methods for the model. The Status column shows which method is available next. After the initiate step, the prepare() method is available, so the next step is to call the prepare() method.

[Image: example summary status table after initializing a model instance]
Model Deployment¶
Prepare¶
The prepare step is performed by the .prepare()
method. It creates several customized files used to run the model after it is deployed. These files include:
- input_schema.json: A JSON file that defines the nature of the features of the X_sample data. It includes metadata such as the data type, name, constraints, summary statistics, feature type, and more.
- model.pt: This is the default filename of the serialized model. It can be changed with the model_file_name attribute. By default, the model is stored in a PyTorch file. The as_onnx parameter can be used to save it in the ONNX format.
- output_schema.json: A JSON file that defines the nature of the dependent variable in the y_sample data. It includes metadata such as the data type, name, constraints, summary statistics, feature type, and more.
- runtime.yaml: This file contains information that is needed to set up the runtime environment on the deployment server. It has information about which conda environment was used to train the model, and what environment should be used to deploy the model. The file also specifies what version of Python should be used.
- score.py: This script contains the load_model() and predict() functions. The load_model() function understands the format the model file was saved in and loads it into memory. The predict() function is used to make inferences in a deployed model. There are also hooks that allow you to perform operations before and after inference. You can modify this script to fit your specific needs.
To create the model artifacts, use the .prepare()
method. The .prepare()
method includes parameters for storing model provenance information. The PyTorch framework serialization only saves the model parameters. Thus, you must update the score.py
file to construct the model class instance first before loading model parameters in the predict()
function of score.py
.
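For example, for the ResNet18 model above, the following lines can be prepended to the generated score.py so that the model class instance exists before its saved parameters are loaded (the Example section below applies this same edit programmatically):

import torchvision
the_model = torchvision.models.resnet18()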
The .prepare() method prepares and saves the score.py file, serializes the model, and generates the runtime.yaml file. It takes the following parameters (an ONNX serialization sketch follows the list):
- as_onnx: (bool, optional): Defaults to False. If True, it will serialize as an ONNX model.
- force_overwrite: (bool, optional): Defaults to False. If True, it will overwrite existing files.
- ignore_pending_changes: bool: Defaults to False. If True, it will ignore the pending changes in Git.
- inference_conda_env: (str, optional): Defaults to None. Can be either a slug or the Object Storage path of the conda environment. You can only pass in a slug if the conda environment is a Data Science service environment.
- inference_python_version: (str, optional): Defaults to None. The version of Python to use in the model deployment.
- max_col_num: (int, optional): Defaults to utils.DATA_SCHEMA_MAX_COL_NUM. The input schema is not automatically generated if the input data has more than this number of features.
- model_file_name: (str): Name of the serialized model file.
- namespace: (str, optional): Namespace of the OCI region. This is used to identify which region the service environment is from when you provide a slug to the inference_conda_env or training_conda_env parameters.
- training_conda_env: (str, optional): Defaults to None. Can be either a slug or the Object Storage path of the conda environment that was used to train the model. You can only pass in a slug if the conda environment is a Data Science service environment.
- training_id: (str, optional): Defaults to a value from the environment variables. The training OCID for the model. Can be a notebook session or job OCID.
- training_python_version: (str, optional): Defaults to None. The version of Python used to train the model.
- training_script_path: str: Defaults to None. The training script path.
- use_case_type: str: The use case type of the model. Use it with the UseCaseType class or the string provided in UseCaseType. For example, use_case_type=UseCaseType.BINARY_CLASSIFICATION or use_case_type="binary_classification". See the UseCaseType class for all supported types.
- X_sample: Union[list, tuple, pd.Series, np.ndarray, pd.DataFrame]: Defaults to None. A sample of the input data. It is used to generate the input schema.
- y_sample: Union[list, tuple, pd.Series, np.ndarray, pd.DataFrame]: Defaults to None. A sample of the output data. It is used to generate the output schema.
- **kwargs:
  - dynamic_axes: (dict, optional): Defaults to None. Optional in ONNX serialization. Specify axes of tensors as dynamic (i.e., known only at run-time).
  - input_names: (List[str], optional): Defaults to ["input"]. Optional in ONNX serialization. An ordered list of names to assign to the input nodes of the graph.
  - onnx_args: (tuple or torch.Tensor, optional): Required when as_onnx=True in ONNX serialization. Contains model inputs such that onnx_model(onnx_args) is a valid invocation of the model.
  - output_names: (List[str], optional): Defaults to ["output"]. Optional in ONNX serialization. An ordered list of names to assign to the output nodes of the graph.
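For instance, a sketch of an ONNX serialization call; the conda environment slug comes from the Example section below, and the sample input shape is an assumption based on ResNet18's expected input:

import torch

pytorch_model.prepare(
    inference_conda_env="generalml_p37_cpu_v1",
    as_onnx=True,
    onnx_args=torch.rand(1, 3, 224, 224),  # sample model input used for tracing
    force_overwrite=True,
)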
Verify¶
If you update the score.py file included in a model artifact, you can verify your changes without deploying the model. With the .verify() method, you can debug your code without having to save the model to the model catalog and then deploy it. The .verify() method takes a set of test parameters and performs the prediction by calling the predict() function in score.py. It also runs the load_model() function to load the model.
The verify() method tests whether the .predict() API works in the local environment. It takes the following parameter:

- data: Any: Data expected by the predict API in the score.py file. For the PyTorch serialization method, data can be of type dict, str, list, np.ndarray, or torch.tensor. For the ONNX serialization method, data has to be JSON serializable or an np.ndarray.
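For example, a local test call; the input shape is an assumption for the ResNet18 model above:

import torch

pytorch_model.verify(torch.rand(1, 3, 224, 224))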
Save¶
After you are satisfied with the performance of your model and have verified that the score.py
file is working, use the .save()
method to save the model to the model catalog. The .save()
method bundles up the model artifacts, stores them in the model catalog, and returns the model OCID.
The .save() method takes the following parameters:
- bucket_uri (str, optional): Defaults to None. The OCI Object Storage URI where model artifacts are copied to. The bucket_uri is only necessary for uploading large artifacts with size greater than 2 GB. For example, oci://<bucket_name>@<namespace>/prefix/.
- defined_tags (Dict(str, dict(str, object)), optional): Defaults to None. Defined tags for the model.
- description (str, optional): Defaults to None. The description of the model.
- display_name (str, optional): Defaults to None. The name of the model.
- freeform_tags (Dict(str, str), optional): Defaults to None. Free form tags for the model.
- ignore_introspection (bool, optional): Defaults to None. Determines whether to ignore the result of model introspection or not. If set to True, then .save() ignores all model introspection errors.
- overwrite_existing_artifact (bool, optional): Defaults to True. Overwrite the target bucket artifact if it exists.
- remove_existing_artifact (bool, optional): Defaults to True. Whether artifacts uploaded to the Object Storage bucket are removed or not.
- **kwargs:
  - compartment_id (str, optional): Compartment OCID. If not specified, the value is taken either from the environment variables or model properties.
  - project_id (str, optional): Project OCID. If not specified, the value is taken either from the environment variables or model properties.
  - timeout (int, optional): Defaults to 10 seconds. The connection timeout in seconds for the client.
The .save() method reloads the score.py and runtime.yaml files from disk to pick up any changes that have been made to them. If ignore_introspection=False, it conducts an introspection test to determine whether the model deployment might have issues. If potential problems are detected, it suggests possible remedies. Lastly, it uploads the artifacts to the model catalog and returns the model OCID. You can also call .introspect() to conduct the test any time after you call .prepare().
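For example, a minimal save call with an illustrative display name:

model_id = pytorch_model.save(display_name="PyTorch ResNet18 model")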
Deploy¶
You can use the .deploy()
method to deploy a model. You must first save the model to the model catalog, and then deploy it.
The .deploy() method returns a ModelDeployment object. Specify deployment attributes such as display name, instance shape, number of instances, maximum router bandwidth, and logging groups. The API takes the following parameters (a deployment sketch follows the list):
- deployment_access_log_id (str, optional): Defaults to None. The access log OCID for the access logs, see logging.
- deployment_bandwidth_mbps (int, optional): Defaults to 10. The bandwidth limit on the load balancer in Mbps.
- deployment_instance_count (int, optional): Defaults to 1. The number of instances used for deployment.
- deployment_instance_shape (str, optional): Defaults to VM.Standard2.1. The shape of the instance used for deployment.
- deployment_log_group_id (str, optional): Defaults to None. The OCI logging group OCID. The access log and predict log share the same log group.
- deployment_predict_log_id (str, optional): Defaults to None. The predict log OCID for the predict logs, see logging.
- description (str, optional): Defaults to None. The description of the model deployment.
- display_name (str, optional): Defaults to None. The name of the model deployment.
- wait_for_completion (bool, optional): Defaults to True. Set to wait for the deployment to complete before proceeding.
- **kwargs:
  - compartment_id (str, optional): Compartment OCID. If not specified, the value is taken from the environment variables.
  - max_wait_time (int, optional): Defaults to 1200 seconds. The maximum amount of time to wait, in seconds. A negative value implies an infinite wait time.
  - poll_interval (int, optional): Defaults to 60 seconds. Poll interval in seconds.
  - project_id (str, optional): Project OCID. If not specified, the value is taken from the environment variables.
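For example, a sketch of a deployment call using the documented parameters; the display name is illustrative:

deployment = pytorch_model.deploy(
    display_name="ResNet18 Model Deployment",
    deployment_instance_shape="VM.Standard2.1",
    deployment_instance_count=1,
)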
Predict¶
To get a prediction for your model, after your model deployment is active, call the .predict()
method. The .predict()
method sends a request to the deployed endpoint, and computes the inference values based on the data that you input in the .predict()
method.
The .predict() method returns a prediction of input data that is run against the model deployment endpoint. It takes the following parameter:

- data: Any: Data expected by the predict API in the score.py file. For the PyTorch serialization method, data can be of type dict, str, list, np.ndarray, or torch.tensor. For the ONNX serialization method, data has to be JSON serializable or an np.ndarray.
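For example, using a random tensor with ResNet18's assumed input shape:

import torch

prediction = pytorch_model.predict(torch.rand(1, 3, 224, 224))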
Load¶
You can restore serialization models from model artifacts, from model deployments, or from models in the model catalog. This section provides details on how to restore serialization models.
Model Artifact¶
A model artifact is a collection of files used to create a model deployment. Some example files included in a model artifact are the serialized model, score.py, and runtime.yaml. You can store your model artifact in a local directory, or as a ZIP or TAR file. Then use the .from_model_artifact() method to import the model artifact into the serialization model class. The .from_model_artifact() method takes the following parameters:
- artifact_dir (str): Artifact directory to store the files needed for deployment.
- auth (Dict, optional): Defaults to None. The default authentication is set using the ads.set_auth API. To override the default, use ads.common.auth.api_keys() or ads.common.auth.resource_principal() and create the appropriate authentication signer and the **kwargs required to instantiate the IdentityClient object.
- force_overwrite (bool, optional): Defaults to False. If True, it will overwrite existing files.
- model_file_name (str): The serialized model file name.
- properties (ModelProperties, optional): Defaults to None. The ModelProperties object required to save and deploy the model.
- uri (str): The path to the folder, ZIP, or TAR file that contains the model artifact. The model artifact must contain the serialized model, the score.py, runtime.yaml, and other files needed for deployment. The content of the URI is copied to the artifact_dir folder.
from ads.model.framework.pytorch_model import PyTorchModel
model = PyTorchModel.from_model_artifact(
    uri="/folder_to_your/artifact.zip",
    model_file_name="model.pt",
    artifact_dir="/folder_store_artifact",
)
Model Catalog¶
To populate a serialization model object from a model stored in the model catalog, call the .from_model_catalog()
method. This method uses the model OCID to download the model artifacts, write them to the artifact_dir
, and update the serialization model object. The .from_model_catalog()
method takes the following parameters:
- artifact_dir (str): Artifact directory to store the files needed for deployment.
- auth (Dict, optional): Defaults to None. The default authentication is set using the ads.set_auth API. To override the default, use ads.common.auth.api_keys() or ads.common.auth.resource_principal() and create the appropriate authentication signer and the **kwargs required to instantiate the IdentityClient object.
- bucket_uri (str, optional): Defaults to None. The OCI Object Storage URI where model artifacts are copied to. The bucket_uri is only necessary for uploading large artifacts with size greater than 2 GB. For example, oci://<bucket_name>@<namespace>/prefix/.
- force_overwrite (bool, optional): Defaults to False. If True, it will overwrite existing files.
- model_id (str): The model OCID.
- model_file_name (str): The serialized model file name.
- overwrite_existing_artifact (bool, optional): Defaults to True. Overwrite the target bucket artifact if it exists.
- properties (ModelProperties, optional): Defaults to None. Define the properties to save and deploy the model.
- **kwargs:
  - compartment_id (str, optional): Compartment OCID. If not specified, the value will be taken from the environment variables.
  - timeout (int, optional): Defaults to 10 seconds. The connection timeout in seconds for the client.
import tempfile
from ads.model.framework.pytorch_model import PyTorchModel

model = PyTorchModel.from_model_catalog(
    model_id="<model_id>",
    model_file_name="model.pt",
    artifact_dir=tempfile.mkdtemp(),
)
Model Deployment¶
Added in version 2.6.2.
To populate a serialization model object from a model deployment, call the .from_model_deployment()
method. This method accepts a model deployment OCID. It downloads the model artifacts, writes them to the model artifact directory (artifact_dir
), and updates the serialization model object. The .from_model_deployment()
method takes the following parameters:
- artifact_dir (str): Artifact directory to store the files needed for deployment.
- auth (Dict, optional): Defaults to None. The default authentication is set using the ads.set_auth API. To override the default, use ads.common.auth.api_keys() or ads.common.auth.resource_principal(). Supply the appropriate authentication signer and the **kwargs required to instantiate an IdentityClient object.
- bucket_uri (str, optional): Defaults to None. The OCI Object Storage URI where model artifacts are copied to. The bucket_uri is only necessary for uploading large artifacts with size greater than 2 GB. For example, oci://<bucket_name>@<namespace>/prefix/.
- force_overwrite (bool, optional): Defaults to False. If True, it will overwrite existing files in the artifact directory.
- model_deployment_id (str): The model deployment OCID.
- model_file_name (str): The serialized model file name.
- overwrite_existing_artifact (bool, optional): Defaults to True. Overwrite the target bucket artifact if it exists.
- properties (ModelProperties, optional): Defaults to None. Define the properties to save and deploy the model.
- **kwargs:
  - compartment_id (str, optional): Compartment OCID. If not specified, the value will be taken from the environment variables.
  - timeout (int, optional): Defaults to 10 seconds. The connection timeout in seconds for the client.
import tempfile
from ads.model.framework.pytorch_model import PyTorchModel

model = PyTorchModel.from_model_deployment(
    model_deployment_id="<model_deployment_id>",
    model_file_name="model.pt",
    artifact_dir=tempfile.mkdtemp(),
)
Delete a Deployment¶
Use the .delete_deployment()
method on the serialization model object to delete a model deployment. You must delete a model deployment before deleting its associated model from the model catalog.
Each time you call the .deploy()
method, it creates a new deployment. Only the most recent deployment is attached to the object.
The .delete_deployment()
method deletes the most recent deployment and takes the following optional parameter:
- wait_for_completion: (bool, optional): Defaults to False, and the process runs in the background. If set to True, the method returns when the model deployment is deleted.
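For example:

pytorch_model.delete_deployment(wait_for_completion=True)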
Example¶
import tempfile
import torch
import torchvision
from ads.common.model_metadata import UseCaseType
from ads.model.framework.pytorch_model import PyTorchModel
# Load the PyTorch Model
model = torchvision.models.resnet18(pretrained=True)
model.eval()
# Prepare the model
artifact_dir = tempfile.mkdtemp()
pytorch_model = PyTorchModel(model, artifact_dir=artifact_dir)
pytorch_model.prepare(
    inference_conda_env="generalml_p37_cpu_v1",
    training_conda_env="generalml_p37_cpu_v1",
    use_case_type=UseCaseType.IMAGE_CLASSIFICATION,
    as_onnx=False,
    force_overwrite=True,
)
# Update ``score.py`` by constructing the model class instance first.
added_line = """
import torchvision
the_model = torchvision.models.resnet18()
"""
with open(artifact_dir + "/score.py", 'r+') as f:
    content = f.read()
    f.seek(0, 0)
    f.write(added_line.rstrip('\r\n') + '\n' + content)
# test_data will need to be defined based on the image requirements of ResNet18
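# ResNet18 expects a batch of 3-channel, 224x224 images, so a random
# tensor of that shape serves as a placeholder test input here.
test_data = torch.rand(1, 3, 224, 224)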
# Deploy the model, test it and clean up.
pytorch_model.verify(test_data)
model_id = pytorch_model.save()
pytorch_model.deploy()
pytorch_model.predict(test_data)
pytorch_model.delete_deployment(wait_for_completion=True)
pytorch_model.delete()