Predict
Predictions can be made by calling the HTTP endpoint associated with the model deployment. The ModelDeployment
object url
attribute specifies the endpoint. You could also use the ModelDeployment
object with the .predict()
method. The format of the data that is passed to the HTTP endpoint depends on the setup of the model artifact. The default setup is to pass in a Python dictionary that has been converted to a JSON data structure. The first level defines the feature names. The second level uses an identifier for the observation (for example, row in the dataframe), and the value associated with it. Assuming the model has features F1, F2, F3, F4, and F5, then the observations are identified by the values 0, 1, and 2 and the data would look like this:
Index |
F1 |
F2 |
F3 |
F4 |
F5 |
---|---|---|---|---|---|
0 |
11 |
12 |
13 |
14 |
15 |
1 |
21 |
22 |
23 |
24 |
25 |
2 |
31 |
32 |
33 |
34 |
35 |
The Python dictionary representation would be:
test = {
'F1': { 0: 11, 1: 21, 2: 31},
'F2': { 0: 12, 1: 22, 2: 32},
'F3': { 0: 13, 1: 23, 2: 33},
'F4': { 0: 14, 1: 24, 2: 34},
'F5': { 0: 15, 1: 25, 2: 35}
}
You can use the ModelDeployment
object to call the HTTP endpoint. The returned
result is the predictions for the three observations.
deployment.predict(test)
{'prediction': [0, 2, 0]}
Model Deploy now supports binary payloads. You no longer need to convert binary images to Base64 encoded strings when making inferences.
Example
The following example shows how to use predict() with image bytes: The score.py file does not provide default deserialization for bytes input. You need to provide your own implementations. The model used in this example has its raw training data normalized. The next cell reproduces these transformations. The original image is 256x384 pixels and the training data is 224x224. Therefore, the image is resized and cropped. The color variation in the image is also adjusted to match the training data. The image is converted to a Tensor object. This object is a four-dimensional tensor and the first dimension has only a single level. This dimension is removed using the .unsqueeze() method.
Load data
from PIL import Image
im = Image.open('<image_path>')
im.convert("RGB").save("<image_path>")
with open('<image_path>', 'rb') as f:
byte_im = f.read()
Example model
# load the pre-trained model.
model = resnet18(pretrained=True)
# set the model to inference mode
_ = model.eval()
Model framework serialization
artifact_dir = "<directory>"
pytorch_model = PyTorchModel(estimator=model, artifact_dir=artifact_dir)
conda_env = 'computervision_p37_cpu_v1'
# Create a sample of the y values.
y_sample = [0] * len(prediction_not_normalized)
y_sample[prediction_normalized.index(max_value)] = 1
pytorch_model.prepare(
inference_conda_env=conda_env,
training_conda_env=conda_env,
use_case_type=UseCaseType.IMAGE_CLASSIFICATION,
X_sample=image_tensor,
y_sample=y_sample,
training_id=None,
force_overwrite=True
)
pytorch_model.verify(byte_im)['prediction'][0][:10]
model_id = pytorch_model.save(display_name='Test PyTorchModel model Bytes Input', timeout=600)
deploy = pytorch_model.deploy(display_name='Test PyTorchModel deployment')
pytorch_model.predict(byte_im)['prediction'][0][:10]
pytorch_model.delete_deployment(wait_for_completion=True)
ModelCatalog(compartment_id=os.environ['NB_SESSION_COMPARTMENT_OCID']).delete_model(model_id)
The change needed in score.py:
def deserialize(data):
if isinstance(data, bytes):
return data
...
def pre_inference(data):
data = deserialize(data)
import base64
import io
import torchvision.transforms as transforms
from PIL import Image
img_bytes = io.BytesIO(data)
image = Image.open(img_bytes)
# preprocess the data to make it accepted by the model
preprocess = transforms.Compose([
transforms.Resize(256),
transforms.CenterCrop(224),
transforms.ToTensor(),
transforms.Normalize(
mean=[0.485, 0.456, 0.406],
std=[0.229, 0.224, 0.225]
),
])
input_tensor = preprocess(image)
input_batch = input_tensor.unsqueeze(0)
return input_batch
def post_inference(yhat):
if isinstance(yhat, torch.Tensor):
from torch.nn import Softmax
softmax = Softmax(dim=1)
return softmax(yhat).tolist()
return yhat