TensorFlowModel#

See API Documentation

Overview#

The ads.model.framework.tensorflow_model.TensorFlowModel class in ADS is designed to allow you to rapidly get a TensorFlow model into production. The .prepare() method creates the model artifacts that are needed to deploy a functioning model without you having to configure it or write code. However, you can customize the required score.py file.

The .verify() method simulates a model deployment by calling the load_model() and predict() methods in the score.py file. With the .verify() method, you can debug your score.py file without deploying any models. The .save() method saves the model artifact to the model catalog. The .deploy() method deploys a model to a REST endpoint.

The following steps take your trained TensorFlow model and deploy it into production with a few lines of code.

Create a TensorFlow Model#

import tensorflow as tf

mnist = tf.keras.datasets.mnist
(trainx, trainy), (testx, testy) = mnist.load_data()
trainx, testx = trainx / 255.0, testx / 255.0

model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10),
    ]
)
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
model.compile(optimizer="adam", loss=loss_fn, metrics=["accuracy"])
model.fit(trainx, trainy, epochs=1)

Prepare Model Artifact#

from ads.common.model_metadata import UseCaseType
from ads.model.framework.tensorflow_model import TensorFlowModel
from uuid import uuid4

tensorflow_model = TensorFlowModel(estimator=model, artifact_dir=f"./model-artifact-{str(uuid4())}")
tensorflow_model.prepare(
    inference_conda_env="tensorflow28_p38_cpu_v1",
    training_conda_env="tensorflow28_p38_cpu_v1",
    X_sample=trainx,
    y_sample=trainy,
    use_case_type=UseCaseType.MULTINOMIAL_CLASSIFICATION,
)

Instantiate an ads.model.framework.tensorflow_model.TensorFlowModel() object with a TensorFlow model. Each instance accepts the following parameters:

  • artifact_dir: str: Artifact directory to store the files needed for deployment.

  • auth: (Dict, optional): Defaults to None. The default authentication is set using the ads.set_auth API. To override the default, use ads.common.auth.api_keys() or ads.common.auth.resource_principal() to create the appropriate authentication signer and the **kwargs required to instantiate the IdentityClient object (see the sketch after this list).

  • estimator: Callable: Any model object generated by the TensorFlow framework.

  • properties: (ModelProperties, optional): Defaults to None. The ModelProperties object required to save and deploy a model.
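
For example, a minimal sketch of both authentication approaches (the artifact directory name here is arbitrary, and the API-key option assumes a configured ~/.oci/config file):

import ads
from ads.common.auth import api_keys

# Option 1: set the default authentication once; ADS picks it up for later calls.
ads.set_auth(auth="resource_principal")

# Option 2: pass an explicit API-key signer to this model instance only.
tensorflow_model = TensorFlowModel(
    estimator=model,
    artifact_dir="./model-artifact",
    auth=api_keys(),
)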

The properties parameter is an instance of the ModelProperties class, which has the following predefined fields:

  • bucket_uri: str

  • compartment_id: str

  • deployment_access_log_id: str

  • deployment_bandwidth_mbps: int

  • deployment_instance_count: int

  • deployment_instance_shape: str

  • deployment_log_group_id: str

  • deployment_predict_log_id: str

  • deployment_memory_in_gbs: Union[float, int]

  • deployment_ocpus: Union[float, int]

  • inference_conda_env: str

  • inference_python_version: str

  • overwrite_existing_artifact: bool

  • project_id: str

  • remove_existing_artifact: bool

  • training_conda_env: str

  • training_id: str

  • training_python_version: str

  • training_resource_id: str

  • training_script_path: str

By default, properties is populated from environment variables when it is not specified. For example, in notebook sessions the environment variables for the project id (PROJECT_OCID) and compartment id (NB_SESSION_COMPARTMENT_OCID) are preset. So properties is populated from these environment variables, and the values are used in methods such as .save() and .deploy(). Pass in values to overwrite the defaults. When you use a method that includes an instance of properties, then properties records the values that you pass in. For example, when you pass inference_conda_env into the .prepare() method, properties records the value. To reuse the properties in different places, export them with the .to_yaml() method and then reload them on a different machine with the .from_yaml() method.
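
For illustration, a minimal sketch of that round trip, assuming .to_yaml() returns the YAML string, .from_yaml() accepts a uri keyword, and the ModelProperties import path shown below:

from ads.model.model_properties import ModelProperties

# Export the populated properties to a YAML file (the file name is arbitrary).
with open("model_properties.yaml", "w") as f:
    f.write(tensorflow_model.properties.to_yaml())

# On a different machine, reload the saved values into a ModelProperties object
# and pass it to a new model instance through the properties parameter.
properties = ModelProperties.from_yaml(uri="model_properties.yaml")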

Summary Status#

You can call the .summary_status() method after a model serialization instance such as GenericModel, SklearnModel, TensorFlowModel, or PyTorchModel is created. The .summary_status() method returns a Pandas dataframe that guides you through the entire workflow. It shows which methods are available to call and which ones aren’t, and it outlines what each method does. If extra actions are required, it also shows those actions.

The following image displays an example summary status table created after a user initiates a model instance. The table’s Step column lists the initiate step along with the prepare(), verify(), save(), deploy(), and predict() methods for the model. The Status column displays Done for the initiate step, and the Details column explains what the initiate step did, such as generating a score.py file. The Status column also shows which method is available next. After the initiate step, the prepare() method is available, so the next step is to call the prepare() method.

[Image: summary_status.png — example summary status table]
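
You can display this table at any point in the workflow:

# Returns a Pandas dataframe summarizing the workflow steps and their status.
tensorflow_model.summary_status()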

In TensorFlowModel, data serialization is supported for JSON-serializable objects, as well as for dictionaries, strings, lists, np.ndarrays, and tf.python.framework.ops.EagerTensor objects. Not all of these objects are JSON serializable, but they are automatically serialized and deserialized for you.
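
For example, the following payloads are all handled by the same serialization logic; a short sketch (the list and tensor forms assume the default score.py converts them back, as described above):

import tensorflow as tf

# Each payload is serialized automatically before being passed to score.py.
tensorflow_model.verify(testx[:3])["prediction"]                        # np.ndarray
tensorflow_model.verify(testx[:3].tolist())["prediction"]               # list
tensorflow_model.verify(tf.convert_to_tensor(testx[:3]))["prediction"]  # EagerTensor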

Register Model#

>>> # Register the model
>>> model_id = tensorflow_model.save()

Start loading model.h5 from model directory /tmp/tmpapjjzeol ...
Model is successfully loaded.
['runtime.yaml', 'model.h5', 'score.py']

'ocid1.datasciencemodel.oc1.xxx.xxxxx'

Deploy and Generate Endpoint#

# Deploy and create an endpoint for the TensorFlow model
tensorflow_model.deploy(
    display_name="TensorFlow Model For Classification",
    deployment_log_group_id="ocid1.loggroup.oc1.xxx.xxxxx",
    deployment_access_log_id="ocid1.log.oc1.xxx.xxxxx",
    deployment_predict_log_id="ocid1.log.oc1.xxx.xxxxx",
    # Shape config details mandatory for flexible shapes:
    # deployment_instance_shape="VM.Standard.E4.Flex",
    # deployment_ocpus=<number>,
    # deployment_memory_in_gbs=<number>,
)
print(f"Endpoint: {tensorflow_model.model_deployment.url}")
# Output: "Endpoint: https://modeldeployment.{region}.oci.customer-oci.com/ocid1.datasciencemodeldeployment.oc1.xxx.xxxxx"

Run Prediction against Endpoint#

# Generate prediction by invoking the deployed endpoint
tensorflow_model.predict(testx[:3])['prediction']
[[-2.9461750984191895, -5.293642997741699, 0.4030594229698181, 3.0270071029663086, -6.470805644989014, -2.07453989982605, -9.646402359008789, 9.256569862365723, -2.6433541774749756, -0.8167083263397217],
[-3.4297854900360107, 2.4863781929016113, 8.968724250793457, 3.162344217300415, -11.153030395507812, 0.15335027873516083, -0.5451826453208923, -7.817524433135986, -1.0585914850234985, -10.736929893493652],
[-4.420501232147217, 5.841022491455078, -0.17066864669322968, -1.0071465969085693, -2.261953592300415, -3.0983355045318604, -2.0874621868133545, 1.0745809078216553, -1.2511857748031616, -2.273810625076294]]

Predict with Image#

New in version 2.6.7.

Predict a single image by passing either a uri of the image, which can be http(s), a local path, or another URL (e.g. starting with “oci://”, “s3://”, or “gcs://”), or a PIL.Image.Image object to the image argument of predict(). The image is converted to a tensor and then serialized so it can be passed to the endpoint. You can catch the tensor in score.py to perform further transformations.

A common example of a model deployment prediction with an image:

# Generate prediction by invoking the deployed endpoint
prediction = tensorflow_model.predict(image=<uri>)['prediction']

See the “Predict with Image” example here.
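
For instance, a hedged sketch using a local file and a PIL.Image.Image object (the file path is a placeholder):

from PIL import Image

# Predict from a local image file (placeholder path).
prediction = tensorflow_model.predict(image="./images/sample_digit.png")["prediction"]

# Or pass an already loaded PIL image.
img = Image.open("./images/sample_digit.png")
prediction = tensorflow_model.predict(image=img)["prediction"]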

Run Prediction with oci raw-request command#

Model deployment endpoints can be invoked with the OCI CLI. This example invokes a model deployment using the CLI with a numpy.ndarray payload:

numpy.ndarray payload example#

>>> # Prepare data sample for prediction and save it to file 'data-payload'
>>> from io import BytesIO
>>> import base64
>>> import numpy as np

>>> data = testx[:3]
>>> np_bytes = BytesIO()
>>> np.save(np_bytes, data, allow_pickle=True)
>>> data = base64.b64encode(np_bytes.getvalue()).decode("utf-8")
>>> with open('data-payload', 'w') as f:
...     f.write('{"data": "' + data + '", "data_type": "numpy.ndarray"}')

The data-payload file contains this information:

{"data": "k05VTVBZAQB2AHsnZGVzY3InOiAnPGY4JywgJ2ZvcnRyYW5fb3JkZXInOi...
.......................................................................
...................AAAAAAAAAAAAAAAAAAA=", "data_type": "numpy.ndarray"}

Use the data-payload file and the model deployment endpoint to invoke a prediction with the raw-request command in a terminal:

export uri=https://modeldeployment.{region}.oci.customer-oci.com/ocid1.datasciencemodeldeployment.oc1.xxx.xxxxx/predict
oci raw-request \
    --http-method POST \
    --target-uri $uri \
    --request-body file://data-payload

Expected output of raw-request command#

{
  "data": {
    "prediction": [
      [
        -1.8818638324737549,
        -6.04175329208374,
        ..................,
        -1.1257498264312744
      ],
      [
        1.2170600891113281,
        1.6379727125167847,
        ..................,
        -9.877771377563477
      ],
      [
        -4.255424499511719,
        5.320354461669922,
        ..................,
        -2.858555555343628
      ]
    ]
  },
  "headers": {
    "Connection": "keep-alive",
    "Content-Length": "594",
    "Content-Type": "application/json",
    "Date": "Thu, 08 Dec 2022 23:25:47 GMT",
    "X-Content-Type-Options": "nosniff",
    "opc-request-id": "E70002DAA3024F46B074F9B53DB6BEBB/421B34FCB12CF33F23C85D5619A62926/CAABF4A269C63B112482B2E57463CA13",
    "server": "uvicorn"
  },
  "status": "200 OK"
}

Example#

from ads.common.model_metadata import UseCaseType
from ads.model.framework.tensorflow_model import TensorFlowModel

import tensorflow as tf
from uuid import uuid4

# Load MNIST Data
mnist = tf.keras.datasets.mnist
(trainx, trainy), (testx, testy) = mnist.load_data()
trainx, testx = trainx / 255.0, testx / 255.0

# Train TensorFlow model
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10),
    ]
)
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
model.compile(optimizer="adam", loss=loss_fn, metrics=["accuracy"])
model.fit(trainx, trainy, epochs=1)

# Prepare Model Artifact for TensorFlow model
tensorflow_model = TensorFlowModel(estimator=model, artifact_dir=f"./model-artifact-{str(uuid4())}")
tensorflow_model.prepare(
    inference_conda_env="tensorflow28_p38_cpu_v1",
    training_conda_env="tensorflow28_p38_cpu_v1",
    X_sample=trainx,
    y_sample=trainy,
    use_case_type=UseCaseType.MULTINOMIAL_CLASSIFICATION,
)

# Check if the artifacts are generated correctly.
# The verify method invokes the ``predict`` function defined inside ``score.py`` in the artifact_dir
tensorflow_model.verify(testx[:10])["prediction"]

# Register the model
model_id = tensorflow_model.save(display_name="TensorFlow Model")

# Deploy and create an endpoint for the TensorFlow model
tensorflow_model.deploy(
    display_name="TensorFlow Model For Classification",
    deployment_log_group_id="ocid1.loggroup.oc1.xxx.xxxxx",
    deployment_access_log_id="ocid1.log.oc1.xxx.xxxxx",
    deployment_predict_log_id="ocid1.log.oc1.xxx.xxxxx",
)

# Generate prediction by invoking the deployed endpoint
tensorflow_model.predict(testx)["prediction"]

# To delete the deployed endpoint uncomment the line below
# tensorflow_model.delete_deployment(wait_for_completion=True)