Run a Script¶
This section shows how to create a job to run a script.
The ScriptRuntime
is designed for you to define job artifacts and configurations natively supported by OCI
Data Science Jobs. It can be used with any script type supported by OCI Data Science Jobs,
including shell scripts and Python scripts.
The source code can be a single script, files in a folder, or a zip/tar file.
See also: Preparing Job Artifacts.
Here is an example:
from ads.jobs import Job, DataScienceJob, ScriptRuntime

job = (
    Job(name="My Job")
    .with_infrastructure(
        DataScienceJob()
        # Configure logging for getting the job run outputs.
        .with_log_group_id("<log_group_ocid>")
        # Log resource will be auto-generated if log ID is not specified.
        .with_log_id("<log_ocid>")
        # If you are in an OCI data science notebook session,
        # the following configurations are not required.
        # Configurations from the notebook session will be used as defaults.
        .with_compartment_id("<compartment_ocid>")
        .with_project_id("<project_ocid>")
        .with_subnet_id("<subnet_ocid>")
        .with_shape_name("VM.Standard.E3.Flex")
        # Shape config details are applicable only for the flexible shapes.
        .with_shape_config_details(memory_in_gbs=16, ocpus=1)
        # Minimum/Default block storage size is 50 (GB).
        .with_block_storage_size(50)
    )
    .with_runtime(
        ScriptRuntime()
        # Specify the service conda environment by slug name.
        .with_service_conda("pytorch110_p38_cpu_v1")
        # The job artifact can be a single Python script, a directory, or a zip file.
        .with_source("local/path/to/code_dir")
        # Environment variable
        .with_environment_variable(NAME="Welcome to OCI Data Science.")
        # Command line argument
        .with_argument("100 linux \"hi there\"")
        # The entrypoint is applicable only when the source is a directory or zip file.
        # The entrypoint should be a path relative to the working dir.
        # Here my_script.sh is a file in the code_dir/my_package directory.
        .with_entrypoint("my_package/my_script.sh")
    )
)
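The single string passed to with_argument() is shell-split before it reaches the script, so the entrypoint sees three separate command line arguments rather than one. A minimal sketch of that splitting behavior using Python's shlex module (this illustrates POSIX-style splitting; it is not part of the job API):

```python
import shlex

# The argument string "100 linux \"hi there\"" is split on whitespace,
# with the quoted portion kept together as a single argument.
args = shlex.split('100 linux "hi there"')
print(args)  # ['100', 'linux', 'hi there']
```

Inside the job run, the entrypoint script would receive these as positional arguments, e.g. `$1`, `$2`, `$3` in a shell script or `sys.argv[1:]` in a Python script.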
The same job can also be defined in YAML:

kind: job
spec:
  name: "My Job"
  infrastructure:
    kind: infrastructure
    type: dataScienceJob
    spec:
      blockStorageSize: 50
      compartmentId: <compartment_ocid>
      jobInfrastructureType: STANDALONE
      logGroupId: <log_group_ocid>
      logId: <log_ocid>
      projectId: <project_ocid>
      shapeConfigDetails:
        memoryInGBs: 16
        ocpus: 1
      shapeName: VM.Standard.E3.Flex
      subnetId: <subnet_ocid>
  runtime:
    kind: runtime
    type: script
    spec:
      args:
      - 100 linux "hi there"
      conda:
        slug: pytorch110_p38_cpu_v1
        type: service
      entrypoint: my_package/my_script.sh
      env:
      - name: NAME
        value: Welcome to OCI Data Science.
      scriptPathURI: local/path/to/code_dir
# Create the job on OCI Data Science
job.create()
# Start a job run
run = job.run()
# Stream the job run outputs
run.watch()
An example script is available in the Data Science AI Samples GitHub repository.
Working Directory¶
The working directory is the parent directory where the job artifacts are decompressed,
for example /home/datascience/decompressed_artifact/.
When the source code is a compressed archive (zip/tar) or a folder, you also need to specify
the entrypoint using with_entrypoint(). The path of the entrypoint should be relative to
the working directory. Note that the working directory cannot be changed when using ScriptRuntime.
See Python Runtime Working Directory for more details.
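As a sketch of how the entrypoint is resolved against the working directory, assuming the artifact was decompressed under /home/datascience/decompressed_artifact/ (the paths here are illustrative):

```python
from pathlib import Path

# The working directory is where the job artifact is decompressed.
working_dir = Path("/home/datascience/decompressed_artifact")

# The entrypoint configured with with_entrypoint() is relative to it.
entrypoint = "my_package/my_script.sh"

# The job run would execute the file at the joined path.
resolved = working_dir / entrypoint
print(resolved)  # /home/datascience/decompressed_artifact/my_package/my_script.sh
```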