Upload a Model Group artifact (homogeneous)¶
A Model Group is a logical construct used to encapsulate several machine learning models into a single, version-controlled unit.
With a model Group, you can group deployments, share resources, and perform live updates while maintaining immutability and reproducibility.
In ADS, a Model Group is represented by ads.model.datascience_model_group.DataScienceModelGroup.
Model Group types¶
Model Groups can be created in different forms depending on your deployment pattern:
Homogeneous: A group of member models that share the same runtime and can be deployed together. For homogeneous model groups, ADS supports attaching a deployment runtime artifact (
score.py+runtime.yaml) to the group.Stacked: A group with a designated base model (
base_model_id) plus additional member models. This is commonly used for stacked deployments.
Common operations¶
The DataScienceModelGroup API supports standard lifecycle operations:
create()/update()/delete()activate()/deactivate()from_id(<model_group_ocid>)list(...)
Member models¶
Member models are provided via with_member_models(...) as a list of dictionaries:
model_id: The model OCID.inference_key: A short name used to identify the model within the group.
Deployment types and runtime types¶
In OCI Data Science, deployments can be created for:
A single model (
ModelDeploymentContainerRuntime.with_model_uri(<model_ocid>)), orA model group (
ModelDeploymentContainerRuntime.with_model_group_id(<model_group_ocid>)).
ADS supports different runtime options for model deployments. Two common patterns are:
Conda-based runtime using a standard model artifact (
score.py+runtime.yaml).Container runtime (BYOC) where you specify the container image and runtime configuration.
The examples below focus on container runtime deployment using a model group.
Artifact requirements¶
The artifact must be either:
A directory containing the required files at the top level, or
A .zip file containing the same structure.
At minimum, the following files must exist:
score.pyruntime.yaml
Example usage¶
Homogeneous model group (with runtime artifact)¶
import os
from ads.model.datascience_model_group import DataScienceModelGroup
from ads.model.model_metadata import ModelCustomMetadata
# Path to a model deployment runtime artifact directory.
# Example layout:
# ./group_runtime_artifact/
# score.py
# runtime.yaml
artifact_dir = "./group_runtime_artifact"
custom_metadata = ModelCustomMetadata()
custom_metadata.add(
key="test_key",
value="test_value",
description="test_description",
category="other"
)
model_group = (
DataScienceModelGroup()
.with_compartment_id(os.environ.get("NB_SESSION_COMPARTMENT_OCID"))
.with_project_id(os.environ.get("PROJECT_OCID"))
.with_display_name("test-model-group")
.with_description("Homogeneous model group with runtime artifact")
.with_custom_metadata_list(custom_metadata)
.with_member_models(
[
{"inference_key": "model_a", "model_id": "<model_ocid_a>"},
{"inference_key": "model_b", "model_id": "<model_ocid_b>"},
]
)
.with_artifact(artifact_dir)
)
# For homogeneous model groups, `create()` uploads the artifact after the group is created.
model_group.create()
Stacked model group (no group artifact)¶
For stacked model groups, you provide a base_model_id.
The model group artifact upload is only applicable for homogeneous model groups.
import os
from ads.model.datascience_model_group import DataScienceModelGroup
from ads.model.model_metadata import ModelCustomMetadata
custom_metadata = ModelCustomMetadata()
custom_metadata.add(
key="test_key",
value="test_value",
description="test_description",
category="other",
)
base_model_id = "<base_model_ocid>"
stacked_group = (
DataScienceModelGroup()
.with_compartment_id(os.environ.get("NB_SESSION_COMPARTMENT_OCID"))
.with_project_id(os.environ.get("PROJECT_OCID"))
.with_display_name("test-stacked-model-group")
.with_description("Stacked model group")
.with_custom_metadata_list(custom_metadata)
.with_base_model_id(base_model_id)
.with_member_models(
[
{"inference_key": "base", "model_id": base_model_id},
{"inference_key": "adapter_1", "model_id": "<adapter_model_ocid_1>"},
]
)
)
stacked_group.create()
Deploy a Model Group using container runtime¶
The following example shows how to create a model group and deploy it using a custom container runtime.
from ads.model.datascience_model_group import DataScienceModelGroup
from ads.model.model_metadata import ModelCustomMetadata
from ads.model.deployment import (
ModelDeployment,
ModelDeploymentInfrastructure,
ModelDeploymentContainerRuntime,
)
custom_metadata = ModelCustomMetadata()
custom_metadata.add(
key="test_key",
value="test_value",
description="test_description",
category="other",
)
model_group = (
DataScienceModelGroup()
.with_display_name("test_create_model_group")
.with_description("test create model group description")
.with_freeform_tags(**{"test_key": "test_value"})
.with_custom_metadata_list(custom_metadata)
.with_member_models(
[
{
"inference_key": "meta-llama/Llama-2-7b-hf",
"model_id": "ocid1.datasciencemodel.oc1.<region>.<unique_id>",
},
{
"inference_key": "gemma-2b-gov-ext",
"model_id": "ocid1.datasciencemodel.oc1.<region>.<unique_id>",
},
]
)
)
model_group.create()
# Configure model deployment infrastructure
infrastructure = (
ModelDeploymentInfrastructure()
.with_project_id("<PROJECT_OCID>")
.with_compartment_id("<COMPARTMENT_OCID>")
.with_shape_name("VM.Standard.E4.Flex")
.with_shape_config_details(ocpus=1, memory_in_gbs=16)
.with_replica(1)
.with_bandwidth_mbps(10)
.with_web_concurrency(10)
.with_access_log(
log_group_id="<ACCESS_LOG_GROUP_OCID>",
log_id="<ACCESS_LOG_OCID>",
)
.with_predict_log(
log_group_id="<PREDICT_LOG_GROUP_OCID>",
log_id="<PREDICT_LOG_OCID>",
)
.with_subnet_id("<SUBNET_OCID>")
)
# Configure model deployment runtime
container_runtime = (
ModelDeploymentContainerRuntime()
.with_image("<region>.ocir.io/<namespace>/<image>:<tag>")
.with_image_digest("<IMAGE_DIGEST>")
.with_entrypoint(["python", "/opt/ds/model/deployed_model/api.py"])
.with_server_port(5000)
.with_health_check_port(5000)
.with_env({"key": "value"})
.with_deployment_mode("HTTPS_ONLY")
.with_model_group_id(model_group.id)
)
# Configure model deployment
deployment = (
ModelDeployment()
.with_display_name("Model Deployment Demo using ADS")
.with_description("The model deployment description")
.with_freeform_tags(**{"key1": "value1"})
.with_infrastructure(infrastructure)
.with_runtime(container_runtime)
)
# Deploy
deployment.deploy()
# Invoke endpoint
deployment.predict(data=<data>)