============================================
Upload a Model Group artifact (homogeneous)
============================================

A Model Group is a logical construct used to encapsulate several machine learning models into a single, version-controlled unit. With a Model Group, you can group deployments, share resources, and perform live updates while maintaining immutability and reproducibility.

In ADS, a Model Group is represented by :py:class:`ads.model.datascience_model_group.DataScienceModelGroup`.

Model Group types
=================

Model Groups can be created in different forms depending on your deployment pattern:

* **Homogeneous**: A group of member models that share the same runtime and can be deployed together. For homogeneous model groups, ADS supports attaching a **deployment runtime artifact** (``score.py`` + ``runtime.yaml``) to the group.
* **Stacked**: A group with a designated **base model** (``base_model_id``) plus additional member models. This is commonly used for stacked deployments.

Common operations
=================

The :py:class:`~ads.model.datascience_model_group.DataScienceModelGroup` API supports standard lifecycle operations:

* ``create()`` / ``update()`` / ``delete()``
* ``activate()`` / ``deactivate()``
* ``from_id()``
* ``list(...)``

Member models
=============

Member models are provided via ``with_member_models(...)`` as a list of dictionaries, each with the following keys:

* ``model_id``: The model OCID.
* ``inference_key``: A short name used to identify the model within the group.

Deployment types and runtime types
==================================

In OCI Data Science, deployments can be created for:

* A **single model** (``ModelDeploymentContainerRuntime.with_model_uri()``), or
* A **model group** (``ModelDeploymentContainerRuntime.with_model_group_id()``).

ADS supports different runtime options for model deployments. Two common patterns are:

* **Conda-based runtime** using a standard model artifact (``score.py`` + ``runtime.yaml``).
* **Container runtime (BYOC)** where you specify the container image and runtime configuration.

The examples below focus on container runtime deployment using a model group.

Artifact requirements
=====================

The artifact must be either:

* A **directory** containing the required files at the **top level**, or
* A **.zip** file containing the same structure.

At minimum, the following files must exist:

* ``score.py``
* ``runtime.yaml``

Example usage
=============

Homogeneous model group (with runtime artifact)
-----------------------------------------------

.. code-block:: python

    import os

    from ads.model.datascience_model_group import DataScienceModelGroup
    from ads.model.model_metadata import ModelCustomMetadata

    # Path to a model deployment runtime artifact directory.
    # Example layout:
    #   ./group_runtime_artifact/
    #       score.py
    #       runtime.yaml
    artifact_dir = "./group_runtime_artifact"

    custom_metadata = ModelCustomMetadata()
    custom_metadata.add(
        key="test_key",
        value="test_value",
        description="test_description",
        category="other",
    )

    model_group = (
        DataScienceModelGroup()
        .with_compartment_id(os.environ.get("NB_SESSION_COMPARTMENT_OCID"))
        .with_project_id(os.environ.get("PROJECT_OCID"))
        .with_display_name("test-model-group")
        .with_description("Homogeneous model group with runtime artifact")
        .with_custom_metadata_list(custom_metadata)
        .with_member_models(
            [
                # Set each "model_id" to the OCID of an existing model.
                {"inference_key": "model_a", "model_id": ""},
                {"inference_key": "model_b", "model_id": ""},
            ]
        )
        .with_artifact(artifact_dir)
    )

    # For homogeneous model groups, `create()` uploads the artifact after the group is created.
    model_group.create()

Stacked model group (no group artifact)
---------------------------------------

For stacked model groups, you provide a ``base_model_id``. Uploading a model group artifact applies **only** to homogeneous model groups, so this example does not call ``with_artifact(...)``.

.. code-block:: python

    import os

    from ads.model.datascience_model_group import DataScienceModelGroup
    from ads.model.model_metadata import ModelCustomMetadata

    custom_metadata = ModelCustomMetadata()
    custom_metadata.add(
        key="test_key",
        value="test_value",
        description="test_description",
        category="other",
    )

    # Set to the OCID of the base model.
    base_model_id = ""

    stacked_group = (
        DataScienceModelGroup()
        .with_compartment_id(os.environ.get("NB_SESSION_COMPARTMENT_OCID"))
        .with_project_id(os.environ.get("PROJECT_OCID"))
        .with_display_name("test-stacked-model-group")
        .with_description("Stacked model group")
        .with_custom_metadata_list(custom_metadata)
        .with_base_model_id(base_model_id)
        .with_member_models(
            [
                {"inference_key": "base", "model_id": base_model_id},
                {"inference_key": "adapter_1", "model_id": ""},
            ]
        )
    )

    stacked_group.create()

Deploy a Model Group using container runtime
============================================

The following example shows how to create a model group and deploy it using a **custom container runtime**.

.. code-block:: python

    from ads.model.datascience_model_group import DataScienceModelGroup
    from ads.model.model_metadata import ModelCustomMetadata
    from ads.model.deployment import (
        ModelDeployment,
        ModelDeploymentInfrastructure,
        ModelDeploymentContainerRuntime,
    )

    custom_metadata = ModelCustomMetadata()
    custom_metadata.add(
        key="test_key",
        value="test_value",
        description="test_description",
        category="other",
    )

    model_group = (
        DataScienceModelGroup()
        .with_display_name("test_create_model_group")
        .with_description("test create model group description")
        .with_freeform_tags(**{"test_key": "test_value"})
        .with_custom_metadata_list(custom_metadata)
        .with_member_models(
            [
                {
                    "inference_key": "meta-llama/Llama-2-7b-hf",
                    "model_id": "ocid1.datasciencemodel.oc1..",
                },
                {
                    "inference_key": "gemma-2b-gov-ext",
                    "model_id": "ocid1.datasciencemodel.oc1..",
                },
            ]
        )
    )
    model_group.create()

    # Configure model deployment infrastructure.
    # The empty strings are placeholders; replace them with your own OCIDs.
    infrastructure = (
        ModelDeploymentInfrastructure()
        .with_project_id("")
        .with_compartment_id("")
        .with_shape_name("VM.Standard.E4.Flex")
        .with_shape_config_details(ocpus=1, memory_in_gbs=16)
        .with_replica(1)
        .with_bandwidth_mbps(10)
        .with_web_concurrency(10)
        .with_access_log(
            log_group_id="",
            log_id="",
        )
        .with_predict_log(
            log_group_id="",
            log_id="",
        )
        .with_subnet_id("")
    )

    # Configure model deployment runtime.
    # The image path and digest are placeholders; replace them with your own values.
    container_runtime = (
        ModelDeploymentContainerRuntime()
        .with_image(".ocir.io//:")
        .with_image_digest("")
        .with_entrypoint(["python", "/opt/ds/model/deployed_model/api.py"])
        .with_server_port(5000)
        .with_health_check_port(5000)
        .with_env({"key": "value"})
        .with_deployment_mode("HTTPS_ONLY")
        .with_model_group_id(model_group.id)
    )

    # Configure model deployment
    deployment = (
        ModelDeployment()
        .with_display_name("Model Deployment Demo using ADS")
        .with_description("The model deployment description")
        .with_freeform_tags(**{"key1": "value1"})
        .with_infrastructure(infrastructure)
        .with_runtime(container_runtime)
    )

    # Deploy
    deployment.deploy()

    # Invoke endpoint (replace ... with your request payload)
    deployment.predict(data=...)
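
Check the artifact before upload (sketch)
=========================================

The artifact requirements listed earlier (``score.py`` and ``runtime.yaml`` at the top level of a directory or ``.zip`` file) can be checked locally before calling ``with_artifact(...)``. The snippet below is a minimal sketch using only the Python standard library. The helper name ``missing_artifact_files`` is hypothetical and is not part of ADS, and ADS may perform its own validation during ``create()``.

```python
import zipfile
from pathlib import Path

# Files this page lists as the minimum required content of the artifact.
REQUIRED_FILES = ("score.py", "runtime.yaml")


def missing_artifact_files(artifact_path):
    """Return the required files that are absent from the artifact's top level.

    Hypothetical helper for illustration only; it is not an ADS API.
    """
    path = Path(artifact_path)
    if path.is_dir():
        # Only regular files directly inside the directory count as top level.
        present = {p.name for p in path.iterdir() if p.is_file()}
    elif path.is_file() and zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as zf:
            # Top-level archive entries have no "/" in their name.
            present = {name for name in zf.namelist() if "/" not in name}
    else:
        raise ValueError(f"Not a directory or .zip file: {artifact_path}")
    return [name for name in REQUIRED_FILES if name not in present]
```

An empty list means both required files were found at the top level. A non-empty list names what is missing, which may be easier to act on locally than an error raised during upload.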