forked from common-workflow-language/cwl-v1.3
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Base.yml
498 lines (445 loc) · 20.1 KB
/
Base.yml
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
$base: "https://w3id.org/cwl/cwl#"
$namespaces:
cwl: "https://w3id.org/cwl/cwl#"
sld: "https://w3id.org/cwl/salad#"
$graph:
- name: CWLType
type: enum
extends: "sld:PrimitiveType"
symbols:
- cwl:File
- cwl:Directory
doc:
- "Extends primitive types with the concept of a file and directory as a builtin type."
- "File: A File object"
- "Directory: A Directory object"
- name: CWLArraySchema
type: record
extends: "sld:ArraySchema"
fields:
items:
type:
- PrimitiveType
- CWLRecordSchema
- EnumSchema
- CWLArraySchema
- string
- type: array
items:
- PrimitiveType
- CWLRecordSchema
- EnumSchema
- CWLArraySchema
- string
jsonldPredicate:
_id: "sld:items"
_type: "@vocab"
refScope: 2
doc: "Defines the type of the array elements."
- name: CWLRecordField
type: record
extends: "sld:RecordField"
fields:
- name: type
type:
- PrimitiveType
- CWLRecordSchema
- EnumSchema
- CWLArraySchema
- string
- type: array
items:
- PrimitiveType
- CWLRecordSchema
- EnumSchema
- CWLArraySchema
- string
jsonldPredicate:
_id: sld:type
_type: "@vocab"
typeDSL: true
refScope: 2
doc: |
The field type
- name: CWLRecordSchema
type: record
extends: "sld:RecordSchema"
fields:
fields:
type: CWLRecordField[]?
jsonldPredicate:
_id: sld:fields
mapSubject: name
mapPredicate: type
doc: "Defines the fields of the record."
- name: File
type: record
docParent: "#CWLType"
doc: |
Represents a file (or group of files when `secondaryFiles` is provided) that
will be accessible by tools using standard POSIX file system call API such as
open(2) and read(2).
Files are represented as objects with `class` of `File`. File objects have
a number of properties that provide metadata about the file.
The `location` property of a File is a IRI that uniquely identifies the
file. Implementations must support the `file://` IRI scheme and may support
other schemes such as `http://` and `https://`. The value of `location` may also be a
relative reference, in which case it must be resolved relative to the IRI
of the document it appears in. Alternately to `location`, implementations
must also accept the `path` property on File, which must be a filesystem
path available on the same host as the CWL runner (for inputs) or the
runtime environment of a command line tool execution (for command line tool
outputs).
If no `location` or `path` is specified, a file object must specify
`contents` with the UTF-8 text content of the file. This is a "file
literal". File literals do not correspond to external resources, but are
created on disk with `contents` with when needed for executing a tool.
Where appropriate, expressions can return file literals to define new files
on a runtime. The maximum size of `contents` is 64 kilobytes.
The `basename` property defines the filename on disk where the file is
staged. This may differ from the resource name. If not provided,
`basename` must be computed from the last path part of `location` and made
available to expressions.
The `secondaryFiles` property is a list of File or Directory objects that
must be staged in the same directory as the primary file. It is an error
for file names to be duplicated in `secondaryFiles`.
The `size` property is the size in bytes of the File. It must be computed
from the resource and made available to expressions. The `checksum` field
contains a cryptographic hash of the file content for use it verifying file
contents. Implementations may, at user option, enable or disable
computation of the `checksum` field for performance or other reasons.
However, the ability to compute output checksums is required to pass the
CWL conformance test suite.
When executing a CommandLineTool, the files and secondary files may be
staged to an arbitrary directory, but must use the value of `basename` for
the filename. The `path` property must be file path in the context of the
tool execution runtime (local to the compute node, or within the executing
container). All computed properties should be available to expressions.
File literals also must be staged and `path` must be set.
When collecting CommandLineTool outputs, `glob` matching returns file paths
(with the `path` property) and the derived properties. This can all be
modified by `outputEval`. Alternately, if the file `cwl.output.json` is
present in the output, `outputBinding` is ignored.
File objects in the output must provide either a `location` IRI or a `path`
property in the context of the tool execution runtime (local to the compute
node, or within the executing container).
When evaluating an ExpressionTool, file objects must be referenced via
`location` (the expression tool does not have access to files on disk so
`path` is meaningless) or as file literals. It is legal to return a file
object with an existing `location` but a different `basename`. The
`loadContents` field of ExpressionTool inputs behaves the same as on
CommandLineTool inputs, however it is not meaningful on the outputs.
An ExpressionTool may forward file references from input to output by using
the same value for `location`.
fields:
- name: class
type:
type: enum
name: File_class
symbols:
- cwl:File
jsonldPredicate:
_id: "@type"
_type: "@vocab"
doc: Must be `File` to indicate this object describes a file.
- name: location
type: string?
doc: |
An IRI that identifies the file resource. This may be a relative
reference, in which case it must be resolved using the base IRI of the
document. The location may refer to a local or remote resource; the
implementation must use the IRI to retrieve file content. If an
implementation is unable to retrieve the file content stored at a
remote resource (due to unsupported protocol, access denied, or other
issue) it must signal an error.
If the `location` field is not provided, the `contents` field must be
provided. The implementation must assign a unique identifier for
the `location` field.
If the `path` field is provided but the `location` field is not, an
implementation may assign the value of the `path` field to `location`,
then follow the rules above.
jsonldPredicate:
_id: "@id"
_type: "@id"
- name: path
type: string?
doc: |
The local host path where the File is available when a CommandLineTool is
executed. This field must be set by the implementation. The final
path component must match the value of `basename`. This field
must not be used in any other context. The command line tool being
executed must be able to access the file at `path` using the POSIX
`open(2)` syscall.
As a special case, if the `path` field is provided but the `location`
field is not, an implementation may assign the value of the `path`
field to `location`, and remove the `path` field.
If the `path` contains [POSIX shell metacharacters](http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_02)
(`|`,`&`, `;`, `<`, `>`, `(`,`)`, `$`,`` ` ``, `\`, `"`, `'`,
`<space>`, `<tab>`, and `<newline>`) or characters
[not allowed](http://www.iana.org/assignments/idna-tables-6.3.0/idna-tables-6.3.0.xhtml)
for [Internationalized Domain Names for Applications](https://tools.ietf.org/html/rfc6452)
then implementations may terminate the process with a
`permanentFailure`.
jsonldPredicate:
"_id": "cwl:path"
"_type": "@id"
- name: basename
type: string?
doc: |
The base name of the file, that is, the name of the file without any
leading directory path. The base name must not contain a slash `/`.
If not provided, the implementation must set this field based on the
`location` field by taking the final path component after parsing
`location` as an IRI. If `basename` is provided, it is not required to
match the value from `location`.
When this file is made available to a CommandLineTool, it must be named
with `basename`, i.e. the final component of the `path` field must match
`basename`.
jsonldPredicate: "cwl:basename"
- name: dirname
type: string?
doc: |
The name of the directory containing file, that is, the path leading up
to the final slash in the path such that `dirname + '/' + basename ==
path`.
The implementation must set this field based on the value of `path`
prior to evaluating parameter references or expressions in a
CommandLineTool document. This field must not be used in any other
context.
- name: nameroot
type: string?
doc: |
The basename root such that `nameroot + nameext == basename`, and
`nameext` is empty or begins with a period and contains at most one
period. For the purposes of path splitting leading periods on the
basename are ignored; a basename of `.cshrc` will have a nameroot of
`.cshrc`.
The implementation must set this field automatically based on the value
of `basename` prior to evaluating parameter references or expressions.
- name: nameext
type: string?
doc: |
The basename extension such that `nameroot + nameext == basename`, and
`nameext` is empty or begins with a period and contains at most one
period. Leading periods on the basename are ignored; a basename of
`.cshrc` will have an empty `nameext`.
The implementation must set this field automatically based on the value
of `basename` prior to evaluating parameter references or expressions.
- name: checksum
type: string?
doc: |
Optional hash code for validating file integrity. Currently, must be in the form
"sha1$ + hexadecimal string" using the SHA-1 algorithm.
- name: size
type:
- "null"
- int
- long
doc: Optional file size (in bytes)
- name: "secondaryFiles"
type:
- "null"
- type: array
items: [File, Directory]
jsonldPredicate:
_id: "cwl:secondaryFiles"
secondaryFilesDSL: true
doc: |
A list of additional files or directories that are associated with the
primary file and must be transferred alongside the primary file.
Examples include indexes of the primary file, or external references
which must be included when loading primary document. A file object
listed in `secondaryFiles` may itself include `secondaryFiles` for
which the same rules apply.
- name: format
type: string?
jsonldPredicate:
_id: cwl:format
_type: "@id"
identity: true
noLinkCheck: true
doc: |
The format of the file: this must be an IRI of a concept node that
represents the file format, preferably defined within an ontology.
If no ontology is available, file formats may be tested by exact match.
Reasoning about format compatibility must be done by checking that an
input file format is the same, `owl:equivalentClass` or
`rdfs:subClassOf` the format required by the input parameter.
`owl:equivalentClass` is transitive with `rdfs:subClassOf`, e.g. if
`<B> owl:equivalentClass <C>` and `<B> owl:subclassOf <A>` then infer
`<C> owl:subclassOf <A>`.
File format ontologies may be provided in the "$schemas" metadata at the
root of the document. If no ontologies are specified in `$schemas`, the
runtime may perform exact file format matches.
- name: contents
type: string?
doc: |
File contents literal.
If neither `location` nor `path` is provided, `contents` must be
non-null. The implementation must assign a unique identifier for the
`location` field. When the file is staged as input to CommandLineTool,
the value of `contents` must be written to a file.
If `contents` is set as a result of a Javascript expression,
an `entry` in `InitialWorkDirRequirement`, or read in from
`cwl.output.json`, there is no specified upper limit on the
size of `contents`. Implementations may have practical limits
on the size of `contents` based on memory and storage
available to the workflow runner or other factors.
If the `loadContents` field of an `InputParameter` or
`OutputParameter` is true, and the input or output File object
`location` is valid, the file must be a UTF-8 text file 64 KiB
or smaller, and the implementation must read the entire
contents of the file and place it in the `contents` field. If
the size of the file is greater than 64 KiB, the
implementation must raise a fatal error.
- name: Directory
type: record
docAfter: "#File"
doc: |
Represents a directory to present to a command line tool.
Directories are represented as objects with `class` of `Directory`. Directory objects have
a number of properties that provide metadata about the directory.
The `location` property of a Directory is a IRI that uniquely identifies
the directory. Implementations must support the file:// IRI scheme and may
support other schemes such as http://. Alternately to `location`,
implementations must also accept the `path` property on Directory, which
must be a filesystem path available on the same host as the CWL runner (for
inputs) or the runtime environment of a command line tool execution (for
command line tool outputs).
A Directory object may have a `listing` field. This is a list of File and
Directory objects that are contained in the Directory. For each entry in
`listing`, the `basename` property defines the name of the File or
Subdirectory when staged to disk. If `listing` is not provided, the
implementation must have some way of fetching the Directory listing at
runtime based on the `location` field.
If a Directory does not have `location`, it is a Directory literal. A
Directory literal must provide `listing`. Directory literals must be
created on disk at runtime as needed.
The resources in a Directory literal do not need to have any implied
relationship in their `location`. For example, a Directory listing may
contain two files located on different hosts. It is the responsibility of
the runtime to ensure that those files are staged to disk appropriately.
Secondary files associated with files in `listing` must also be staged to
the same Directory.
When executing a CommandLineTool, Directories must be recursively staged
first and have local values of `path` assigned.
Directory objects in CommandLineTool output must provide either a
`location` IRI or a `path` property in the context of the tool execution
runtime (local to the compute node, or within the executing container).
An ExpressionTool may forward file references from input to output by using
the same value for `location`.
Name conflicts (the same `basename` appearing multiple times in `listing`
or in any entry in `secondaryFiles` in the listing) is a fatal error.
fields:
- name: class
type:
type: enum
name: Directory_class
symbols:
- cwl:Directory
jsonldPredicate:
_id: "@type"
_type: "@vocab"
doc: Must be `Directory` to indicate this object describes a Directory.
- name: location
type: string?
doc: |
An IRI that identifies the directory resource. This may be a relative
reference, in which case it must be resolved using the base IRI of the
document. The location may refer to a local or remote resource. If
the `listing` field is not set, the implementation must use the
location IRI to retrieve directory listing. If an implementation is
unable to retrieve the directory listing stored at a remote resource (due to
unsupported protocol, access denied, or other issue) it must signal an
error.
If the `location` field is not provided, the `listing` field must be
provided. The implementation must assign a unique identifier for
the `location` field.
If the `path` field is provided but the `location` field is not, an
implementation may assign the value of the `path` field to `location`,
then follow the rules above.
jsonldPredicate:
_id: "@id"
_type: "@id"
- name: path
type: string?
doc: |
The local path where the Directory is made available prior to executing a
CommandLineTool. This must be set by the implementation. This field
must not be used in any other context. The command line tool being
executed must be able to access the directory at `path` using the POSIX
`opendir(2)` syscall.
If the `path` contains [POSIX shell metacharacters](http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_02)
(`|`,`&`, `;`, `<`, `>`, `(`,`)`, `$`,`` ` ``, `\`, `"`, `'`,
`<space>`, `<tab>`, and `<newline>`) or characters
[not allowed](http://www.iana.org/assignments/idna-tables-6.3.0/idna-tables-6.3.0.xhtml)
for [Internationalized Domain Names for Applications](https://tools.ietf.org/html/rfc6452)
then implementations may terminate the process with a
`permanentFailure`.
jsonldPredicate:
_id: "cwl:path"
_type: "@id"
- name: basename
type: string?
doc: |
The base name of the directory, that is, the name of the file without any
leading directory path. The base name must not contain a slash `/`.
If not provided, the implementation must set this field based on the
`location` field by taking the final path component after parsing
`location` as an IRI. If `basename` is provided, it is not required to
match the value from `location`.
When this file is made available to a CommandLineTool, it must be named
with `basename`, i.e. the final component of the `path` field must match
`basename`.
jsonldPredicate: "cwl:basename"
- name: listing
type:
- "null"
- type: array
items: [File, Directory]
doc: |
List of files or subdirectories contained in this directory. The name
of each file or subdirectory is determined by the `basename` field of
each `File` or `Directory` object. It is an error if a `File` shares a
`basename` with any other entry in `listing`. If two or more
`Directory` object share the same `basename`, this must be treated as
equivalent to a single subdirectory with the listings recursively
merged.
jsonldPredicate:
_id: "cwl:listing"
- name: CWLObjectType
type: union
names:
- boolean
- int
- long
- float
- double
- string
- File
- Directory
- type: array
items:
- "null"
- CWLObjectType
- type: map
values:
- "null"
- CWLObjectType
doc: |
Generic type representing a valid CWL object. It is used to represent
`default` values passed to CWL `InputParameter` and `WorkflowStepInput`
record fields.
- name: CWLInputFile
type: map
values:
- "null"
- type: array
items: ProcessRequirement
- CWLObjectType
doc: |
Type representing a valid CWL input file as a `map<string, union<array<ProcessRequirement>, CWLObjectType>>`.
jsonldPredicate:
_id: "cwl:inputfile"
_container: "@list"
noLinkCheck: true