Skip to content
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
90 changes: 89 additions & 1 deletion docs/process-typed.md
Original file line number Diff line number Diff line change
Expand Up @@ -65,6 +65,8 @@ All {ref}`standard types <stdlib-types>` except for the dataflow types (`Channel

Nextflow automatically stages `Path` inputs and `Path` collections (such as `Set<Path>`) into the task directory.

### Nullable inputs

By default, tasks fail if any input receives a `null` value. To allow `null` values, add `?` to the type annotation:

```nextflow
Expand All @@ -85,10 +87,58 @@ process cat_opt {
}
```

### Stage directives
### Record inputs

Inputs with type `Record` can declare the name and type of each record field:

```nextflow
process fastqc {
input:
(id: String, fastq: Path): Record

script:
"""
echo 'id: ${id}`
echo 'fastq: ${fastq}'
"""
}
```

This pattern is called *record destructuring*. Each record field is staged into the task the same way as an individual input.

When the process is invoked, the incoming record should contain the specified fields, or else the run will fail. If the record has additional fields not declared by the process input, they are ignored.

:::{tip}
Record inputs are a useful way to select a subset of fields from a larger record. This way, the process only stages what it needs, allowing you to keep related data together in your workflow logic.
:::

### Tuple inputs

Inputs with type `Tuple` can declare the name of each tuple component:

```nextflow
process fastqc {
input:
(id, fastq): Tuple<String,Path>

script:
"""
echo 'id: ${id}`
echo 'fastq: ${fastq}'
"""
}
```

This pattern is called *tuple destructuring*. Each tuple component is staged into the task the same way as an individual input.

The generic types inside the `Tuple<...>` annotation specify the type of each tuple compomnent and should match the component names. In the above example, `id` has type `String` and `fastq` has type `Path`.

## Stage directives

The `stage:` section defines custom staging behavior using *stage directives*. It should be specified after the `input:` section. These directives serve the same purpose as input qualifiers such as `env` and `stdin` in the legacy syntax.

### Environment variables

The `env` directive declares an environment variable in terms of task inputs:

```nextflow
Expand All @@ -106,6 +156,8 @@ process echo_env {
}
```

### Standard input (stdin)

The `stdin` directive defines the standard input of the task script:

```nextflow
Expand All @@ -123,6 +175,8 @@ process cat {
}
```

### Custom file staging

The `stageAs` directive stages an input file (or files) under a custom file pattern:

```nextflow
Expand Down Expand Up @@ -222,6 +276,40 @@ process foo {
}
```

### Structured outputs

Whereas legacy process outputs could only be structured using specific qualifiers like `val` and `tuple`, typed process outputs are regular values.

The `record()` standard library function can be used to create a record:

```nextflow
process fastqc {
input:
(id: String, fastq: Path): Record

output:
record(id: id, fastqc: file('fastqc_logs'))

script:
// ...
}
```

The `tuple()` standard library function can be used to create a tuple:

```nextflow
process fastqc {
input:
(id, fastq): Tuple<String,Path>

output:
tuple(id, file('fastqc_logs'))

script:
// ...
}
```

## Topics

The `topic:` section emits values to {ref}`topic channels <channel-topic>`. A topic emission consists of an output value and a topic name:
Expand Down
5 changes: 4 additions & 1 deletion docs/reference/stdlib-namespaces.md
Original file line number Diff line number Diff line change
Expand Up @@ -108,8 +108,11 @@ The global namespace contains globally available constants and functions.
`sleep( milliseconds: long )`
: Sleep for the given number of milliseconds.

`record( [options] ) -> Record`
: Create a record from the given named arguments.

`tuple( args... ) -> Tuple`
: Create a tuple object from the given arguments.
: Create a tuple from the given arguments.

(stdlib-namespaces-channel)=

Expand Down
32 changes: 32 additions & 0 deletions docs/reference/stdlib-types.md
Original file line number Diff line number Diff line change
Expand Up @@ -767,6 +767,38 @@ The following methods are available for splitting and counting the records in fi
`splitText() -> List<String>`
: Splits a text file into a list of lines. See the {ref}`operator-splittext` operator for available options.

(stdlib-types-record)=

## Record

A record is an immutable map of fields to values (i.e., `Map<String,?>`). Each value can have its own type.

A record can be created using the `record` function:

```nextflow
sample = record(id: '1', fastq_1: file('1_1.fastq'), fastq_2: file('1_2.fastq'))
```

Record fields can be accessed as properties:

```nextflow
sample.id
// -> '1'
```

The following operations are supported for records:

`+ : (Record, Record) -> Record`
: Given two records, returns a new record containing the fields and values of both records. When a field is present in both records, the value of the right-hand record takes precedence.

`- : (Record, Iterable<String>) -> Record`
: Given a record and a collection of strings, returns a copy of the record with the given fields removed.

The following methods are available for a record:

`subMap( keys: Iterable<String> ) -> Record`
: Returns a new record containing only the given fields.

(stdlib-types-set)=

## Set\<E\>
Expand Down
15 changes: 12 additions & 3 deletions docs/reference/syntax.md
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,7 @@ A Nextflow script may contain the following top-level declarations:
- Process definitions
- Function definitions
- Enum types
- Record types
- Output block

Script declarations are in turn composed of statements and expressions.
Expand Down Expand Up @@ -360,9 +361,17 @@ enum Day {

Enum values in the above example can be accessed as `Day.MONDAY`, `Day.TUESDAY`, and so on.

:::{note}
Enum types cannot be included across modules at this time.
:::
### Record type

A record type declaration consists of a name and a body. The body consists of one or more fields, where each field has a name and a type:

```nextflow
record FastqPair {
id: String
fastq_1: Path
fastq_2: Path
}
```

### Output block

Expand Down
9 changes: 9 additions & 0 deletions modules/nextflow/src/main/groovy/nextflow/Nextflow.groovy
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,7 @@ import nextflow.splitter.FastaSplitter
import nextflow.splitter.FastqSplitter
import nextflow.util.ArrayTuple
import nextflow.util.CacheHelper
import nextflow.util.RecordMap
import org.slf4j.Logger
import org.slf4j.LoggerFactory
/**
Expand Down Expand Up @@ -150,6 +151,14 @@ class Nextflow {
return result instanceof Collection ? result : [result]
}

/**
* Creates a {@link RecordMap} from the given named arguments.
*
* @param props
*/
static RecordMap record(Map<String,?> props) {
return new RecordMap(props)
}

/**
* Creates a {@link ArrayTuple} object with the given open array items
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -1769,7 +1769,7 @@ class TaskProcessor {
if( value == null && !param.optional )
throw new ProcessUnrecoverableException("[${safeTaskName(task)}] input at index ${i} cannot be null -- append `?` to the type annotation to mark it as nullable")
if( param instanceof ProcessTupleInput )
assignTaskTupleInput(task, param, value, i)
assignTaskStructuredInput(task, param, value, i)
else
assignTaskInput(task, param, value, i)
}
Expand Down Expand Up @@ -1806,17 +1806,26 @@ class TaskProcessor {
}

@CompileStatic
private void assignTaskTupleInput(TaskRun task, ProcessTupleInput param, Object value, int index) {
if( value !instanceof List ) {
throw new ProcessUnrecoverableException("[${safeTaskName(task)}] input at index ${index} expected a tuple but received: ${value} [${value.class.simpleName}]")
private void assignTaskStructuredInput(TaskRun task, ProcessTupleInput param, Object value, int index) {
if( value instanceof List ) {
final tupleParams = param.getComponents()
final tupleValues = value as List
if( tupleParams.size() != tupleValues.size() ) {
throw new ProcessUnrecoverableException("[${safeTaskName(task)}] input at index ${index} expected a tuple with ${tupleParams.size()} elements but received ${tupleValues.size()} -- offending value: $tupleValues")
}
for( int i = 0; i < tupleParams.size(); i++ ) {
assignTaskInput(task, tupleParams[i], tupleValues[i], index)
}
}
final tupleParams = param.getComponents()
final tupleValues = value as List
if( tupleParams.size() != tupleValues.size() ) {
throw new ProcessUnrecoverableException("[${safeTaskName(task)}] input at index ${index} expected a tuple with ${tupleParams.size()} elements but received ${tupleValues.size()} -- offending value: $tupleValues")
else if( value instanceof Map ) {
final record = value as Map
for( final fieldParam : param.getComponents() ) {
final fieldName = fieldParam.getName()
assignTaskInput(task, fieldParam, record[fieldName], index)
}
}
for( int i = 0; i < tupleParams.size(); i++ ) {
assignTaskInput(task, tupleParams[i], tupleValues[i], index)
else {
throw new ProcessUnrecoverableException("[${safeTaskName(task)}] input at index ${index} expected a record or tuple but received: ${value} [${value.class.simpleName}]")
}
}

Expand Down
26 changes: 13 additions & 13 deletions modules/nf-commons/src/main/nextflow/util/HashBuilder.java
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,7 @@
import java.nio.file.attribute.BasicFileAttributes;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.UUID;
Expand All @@ -46,7 +47,6 @@
import nextflow.extension.Bolts;
import nextflow.extension.FilesEx;
import nextflow.io.SerializableMarker;
import nextflow.script.types.Bag;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import static nextflow.Const.DEFAULT_ROOT;
Expand Down Expand Up @@ -141,28 +141,31 @@ else if( value instanceof CharSequence )
else if( value instanceof byte[] )
hasher.putBytes( (byte[])value );

else if( value instanceof Object[])
else if( value instanceof Object[]) {
for( Object item : ((Object[])value) )
with(item);
}

// note: should map be order invariant as Set ?
else if( value instanceof Map )
for( Object item : ((Map)value).values() )
else if( value instanceof CacheFunnel )
((CacheFunnel)value).funnel(hasher, mode);

else if( value instanceof List ) {
for( Object item : ((List)value) )
with(item);
}

else if( value instanceof Map )
hashUnorderedCollection(hasher, ((Map) value).entrySet(), mode);

else if( value instanceof Map.Entry ) {
Map.Entry entry = (Map.Entry)value;
with(entry.getKey());
with(entry.getValue());
}

else if( value instanceof Bag || value instanceof Set )
else if( value instanceof Collection )
hashUnorderedCollection(hasher, (Collection) value, mode);

else if( value instanceof Collection)
for( Object item : ((Collection)value) )
with(item);

else if( value instanceof Path )
hashFile(hasher, (Path)value, mode, basePath);

Expand All @@ -180,9 +183,6 @@ else if( value instanceof VersionNumber )
else if( value instanceof SerializableMarker)
hasher.putInt( value.hashCode() );

else if( value instanceof CacheFunnel )
((CacheFunnel)value).funnel(hasher, mode);

else if( value instanceof Enum )
hasher.putUnencodedChars( value.getClass().getName() + "." + value );

Expand Down
Loading
Loading