YAML Schema
The distributed training workload is defined in YAML and can be launched by invoking the `ads opctl run -f path/to/yaml` command.
`distributed` schema
Key | Type | Description |
---|---|---|
kind | string | Must be `distributed` |
apiVersion | string | |
spec | dict | See `distributed.spec` schema. |
`distributed.spec` schema
Key | Type | Description |
---|---|---|
infrastructure | dict | See `distributed.spec.infrastructure` schema. |
cluster | dict | See `distributed.spec.cluster` schema. |
runtime | dict | See `distributed.spec.runtime` schema. |
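As a rough sketch of how these top-level keys nest, a skeleton document might look like the following. The `apiVersion` value is an illustrative assumption (the schema only requires a string), and the empty mappings stand in for the three sections described in the tables that follow.

```yaml
# Skeleton only: values marked "assumed" are illustrative placeholders.
kind: distributed          # the only value the schema allows here
apiVersion: v1.0           # assumed value; the schema only requires a string
spec:
  infrastructure: {}       # filled in per distributed.spec.infrastructure
  cluster: {}              # filled in per distributed.spec.cluster
  runtime: {}              # filled in per distributed.spec.runtime
```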
`distributed.spec.infrastructure` schema
Key | Type | Description |
---|---|---|
kind | string | Must be `infrastructure` |
type | string | Must be `dataScienceJob` |
apiVersion | string | |
spec | dict | See `distributed.spec.infrastructure.spec` schema. |
`distributed.spec.infrastructure.spec` schema
Key | Type | Description |
---|---|---|
displayName | string | |
compartmentId | string | |
projectId | string | |
logGroupId | string | |
logId | string | |
subnetId | string | |
shapeName | string | |
blockStorageSize | integer | Minimum: 50 |
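To make the two infrastructure tables concrete, here is a minimal sketch of an `infrastructure` section that fits them. The OCIDs are placeholders, and the display name, shape, and `apiVersion` value are assumptions chosen for illustration, not values prescribed by the schema.

```yaml
# Illustrative infrastructure section; every OCID below is a placeholder.
infrastructure:
  kind: infrastructure                   # required literal
  type: dataScienceJob                   # required literal
  apiVersion: v1.0                       # assumed value
  spec:
    displayName: my-distributed-job      # hypothetical name
    compartmentId: "<compartment_ocid>"
    projectId: "<project_ocid>"
    logGroupId: "<log_group_ocid>"
    logId: "<log_ocid>"
    subnetId: "<subnet_ocid>"
    shapeName: VM.Standard2.4            # example shape; pick one available to you
    blockStorageSize: 50                 # integer, must be at least 50
```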
`distributed.spec.cluster` schema
Key | Type | Description |
---|---|---|
kind | string | Options: `PYTORCH`, `DASK`, `HOROVOD`, `dask`, `pytorch`, `horovod` |
apiVersion | string | |
spec | dict | See `distributed.spec.cluster.spec` schema. |
`distributed.spec.cluster.spec` schema
Key | Type | Description |
---|---|---|
image | string | |
workDir | string | |
name | string | |
config | dict | See `distributed.spec.cluster.spec.config` schema. |
main | dict | See `distributed.spec.cluster.spec.main` schema. |
worker | dict | See `distributed.spec.cluster.spec.worker` schema. |
`distributed.spec.cluster.spec.config` schema
Key | Type | Description |
---|---|---|
startOptions | list | List of string items. |
env | list | List of dict items. See `env` schema. |
`distributed.spec.cluster.spec.main` schema
Key | Type | Description |
---|---|---|
name | string | |
replicas | integer | |
config | dict | See `distributed.spec.cluster.spec.config` schema. |
`distributed.spec.cluster.spec.worker` schema
Key | Type | Description |
---|---|---|
name | string | |
replicas | integer | |
config | dict | See `distributed.spec.cluster.spec.config` schema. |
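The cluster tables above can be read together as one nested block. The following is a sketch under stated assumptions: the image reference, working directory, node names, replica counts, environment variable, and `apiVersion` value are all hypothetical placeholders, and only the key names and the `kind` option come from the schema.

```yaml
# Illustrative cluster section; names and counts are hypothetical.
cluster:
  kind: PYTORCH                                # one of the allowed options
  apiVersion: v1.0                             # assumed value
  spec:
    image: "<registry>/<repository>:<tag>"     # placeholder container image
    workDir: "<working_directory_uri>"         # placeholder location
    name: pytorch-cluster                      # hypothetical cluster name
    config:
      env:                                     # list of name/value entries
        - name: EXAMPLE_VARIABLE               # hypothetical variable
          value: "1"
    main:
      name: main                               # hypothetical node name
      replicas: 1
    worker:
      name: worker                             # hypothetical node name
      replicas: 2
```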
`distributed.spec.runtime` schema
Key | Type | Description |
---|---|---|
kind | string | |
apiVersion | string | |
spec | dict | See `distributed.spec.runtime.spec` schema. |
`distributed.spec.runtime.spec` schema
Key | Type | Description |
---|---|---|
entryPoint | string | |
kwargs | string | |
args | list | List of number or string items. |
env | list | List of dict items. See `env` schema. |
`env` schema
Key | Type | Description |
---|---|---|
name | string | |
value | number, string | |
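As an illustration of the runtime and `env` tables, the sketch below shows a `runtime` section with a mixed `args` list and both string and numeric `env` values. The `kind` and `apiVersion` values, the script name, the argument values, and the variable names are assumptions made for the example. The schema itself only constrains their types.

```yaml
# Illustrative runtime section; script, arguments and variables are hypothetical.
runtime:
  kind: python                   # assumed value; the schema accepts any string
  apiVersion: v1.0               # assumed value
  spec:
    entryPoint: train.py         # hypothetical script name
    kwargs: "--epochs 2"         # hypothetical; the schema expects a single string
    args:                        # numbers and strings may be mixed
      - 128
      - "adam"
    env:
      - name: LOG_LEVEL          # string value
        value: "INFO"
      - name: NUM_RETRIES        # number value
        value: 3
```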
The following schema, itself written in YAML, is used to validate the workload definition with Cerberus:
kind:
  type: string
  allowed:
    - distributed
apiVersion:
  type: string
spec:
  type: dict
  schema:
    infrastructure:
      type: dict
      schema:
        kind:
          type: string
          allowed:
            - infrastructure
        type:
          type: string
          allowed:
            - dataScienceJob
        apiVersion:
          type: string
        spec:
          type: dict
          schema:
            displayName:
              type: string
            compartmentId:
              type: string
            projectId:
              type: string
            logGroupId:
              type: string
            logId:
              type: string
            subnetId:
              type: string
            shapeName:
              type: string
            blockStorageSize:
              type: integer
              min: 50
    cluster:
      type: dict
      schema:
        kind:
          type: string
          allowed:
            - PYTORCH
            - DASK
            - HOROVOD
            - dask
            - pytorch
            - horovod
        apiVersion:
          type: string
        spec:
          type: dict
          schema:
            image:
              type: string
            workDir:
              type: string
            name:
              type: string
            config:
              type: dict
              nullable: true
              schema:
                startOptions:
                  type: list
                  schema:
                    type: string
                env:
                  type: list
                  nullable: true
                  schema:
                    type: dict
                    schema:
                      name:
                        type: string
                      value:
                        type:
                          - number
                          - string
            main:
              type: dict
              schema:
                name:
                  type: string
                replicas:
                  type: integer
                config:
                  type: dict
                  nullable: true
                  schema:
                    env:
                      type: list
                      nullable: true
                      schema:
                        type: dict
                        schema:
                          name:
                            type: string
                          value:
                            type:
                              - number
                              - string
            worker:
              type: dict
              schema:
                name:
                  type: string
                replicas:
                  type: integer
                config:
                  type: dict
                  nullable: true
                  schema:
                    env:
                      type: list
                      nullable: true
                      schema:
                        type: dict
                        schema:
                          name:
                            type: string
                          value:
                            type:
                              - number
                              - string
    runtime:
      type: dict
      schema:
        kind:
          type: string
        apiVersion:
          type: string
        spec:
          type: dict
          schema:
            entryPoint:
              type: string
            kwargs:
              type: string
            args:
              type: list
              schema:
                type:
                  - number
                  - string
            env:
              type: list
              nullable: true
              schema:
                type: dict
                schema:
                  name:
                    type: string
                  value:
                    type:
                      - number
                      - string