5.7 Workflow and log
As it stands, the code will not produce anything: another part is needed to actually call the process and connect it to the input channel.
This part is called a workflow.
Let’s add a workflow to our code:
#!/usr/bin/env nextflow
nextflow.enable.dsl=2
str = Channel.from('hello', 'hola', 'bonjour')
process printHello {
tag { "${str_in}" }
input:
val str_in
output:
stdout
script:
"""
echo ${str_in} in Italian is ciao
"""
}
/*
* A workflow consists of a number of invocations of processes
* where they are fed with the expected input channels
* as if they were custom functions. You can only invoke a process once per workflow.
*/
workflow {
result = printHello(str)
result.view()
}
This time we can run the script in the background (with the -bg option) and save the log in the file log.txt:
nextflow run test1.nf -bg > log.txt
5.7.1 Nextflow log
Let’s now inspect the log file:
cat log.txt
N E X T F L O W ~ version 20.07.1
Launching `test1.nf` [high_fermat] - revision: b129d66e57
[6a/2dfcaf] Submitted process > printHello (hola)
[24/a286da] Submitted process > printHello (hello)
[04/e733db] Submitted process > printHello (bonjour)
hola in Italian is ciao
hello in Italian is ciao
bonjour in Italian is ciao
The tag allows us to see that the process printHello was launched three times on the hola, hello and bonjour values contained in the input channel.
At the start of each row, there is an alphanumeric code:
[6a/2dfcaf] Submitted process > printHello (hola)
This code is the beginning of the path, inside the work directory, of the folder in which the process is “isolated” and where the corresponding temporary files are kept.
IMPORTANT: Nextflow generates the names of these temporary folders for each execution, so they will be named differently in your run!
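Since the folder names change from run to run, it can help to recover them from the log itself. The following is a minimal sketch, assuming the log rows have the `[xx/yyyyyy] Submitted process > name (tag)` shape shown above and that tasks live under `work/`; `list_task_dirs` is a hypothetical helper name, not a Nextflow command.

```shell
# Hypothetical helper (not part of Nextflow): for every "Submitted process" row
# in a log file, print the task name and the matching folder under the work directory.
list_task_dirs() (
  logfile="$1"
  workdir="${2:-work}"
  # Keep only rows such as: [6a/2dfcaf] Submitted process > printHello (hola)
  sed -n 's|^\[\([0-9a-f]*/[0-9a-f]*\)] Submitted process > \(.*\)$|\1 \2|p' "$logfile" |
  while read -r code name; do
    # The log shows only a prefix of the folder name, so expand it with a glob.
    for dir in "$workdir"/"$code"*; do
      if [ -d "$dir" ]; then
        printf '%s\t%s\n' "$name" "$dir"
      fi
    done
  done
)
```

It could be called as `list_task_dirs log.txt work`, which prints one tab-separated row per task whose folder still exists.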
Let’s have a look inside that folder:
# Show the folder's full name
echo work/6a/2dfcaf*
work/6a/2dfcafc01350f475c60b2696047a87
# List what is inside the folder
ls -alht work/6a/2dfcaf*
total 40
-rw-r--r-- 1 lcozzuto staff 1B Oct 7 13:39 .exitcode
drwxr-xr-x 9 lcozzuto staff 288B Oct 7 13:39 .
-rw-r--r-- 1 lcozzuto staff 24B Oct 7 13:39 .command.log
-rw-r--r-- 1 lcozzuto staff 24B Oct 7 13:39 .command.out
-rw-r--r-- 1 lcozzuto staff 0B Oct 7 13:39 .command.err
-rw-r--r-- 1 lcozzuto staff 0B Oct 7 13:39 .command.begin
-rw-r--r-- 1 lcozzuto staff 45B Oct 7 13:39 .command.sh
-rw-r--r-- 1 lcozzuto staff 2.5K Oct 7 13:39 .command.run
drwxr-xr-x 3 lcozzuto staff 96B Oct 7 13:39 ..
You see a lot of “hidden” files:
- .exitcode, contains 0 if everything is ok, another value if there was a problem.
- .command.log, contains the log of the command execution. It is often identical to .command.out
- .command.out, contains the standard output of the command execution
- .command.err, contains the standard error of the command execution
- .command.begin, contains what has to be executed before .command.sh
- .command.sh, contains the block of code indicated in the process
- .command.run, contains the code made by Nextflow for the execution of .command.sh, and contains environment variables, any invocations of Linux containers, etc.
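For instance, the content of .command.sh is:
cat work/6a/2dfcaf*/.command.sh
#!/bin/bash -ue
echo hola in Italian is ciao
And the content of .command.out is:
cat work/6a/2dfcaf*/.command.out
hola in Italian is ciao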
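The .exitcode file described above lends itself to a quick check across all task folders at once. The sketch below assumes the two-level `work/xx/hash` layout shown in this section; `report_exit_codes` is a hypothetical helper name, not something Nextflow provides.

```shell
# Hypothetical helper (not part of Nextflow): print the exit code recorded
# for every task folder found under a work directory.
report_exit_codes() (
  workdir="${1:-work}"
  for f in "$workdir"/*/*/.exitcode; do
    # Skip the unexpanded pattern when no task folder has an .exitcode file yet.
    [ -e "$f" ] || continue
    printf '%s\t%s\n' "$(dirname "$f")" "$(cat "$f")"
  done
)
```

Running `report_exit_codes work` prints one row per task folder, followed by its recorded exit code; any value other than 0 points to a task worth inspecting through its .command.err and .command.log files.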
You can also give a name to workflows, so that you can combine them in the main workflow. For instance we can write:
#!/usr/bin/env nextflow
nextflow.enable.dsl=2
str = Channel.from('hello', 'hola', 'bonjour')
process printHello {
tag { "${str_in}" }
input:
val str_in
output:
stdout
script:
"""
echo ${str_in} in Italian is ciao
"""
}
/*
* A workflow can be named as a function and receive an input using the take keyword
*/
workflow first_pipeline {
take: str_input
main:
printHello(str_input).view()
}
/*
* You can re-use the previous processes and combine as you prefer
*/
workflow second_pipeline {
take: str_input
main:
printHello(str_input).collect().view()
}
/*
* You can then invoke the different named workflows in this way
* passing the same input channel `str` to both
*/
workflow {
first_pipeline(str)
second_pipeline(str)
}
As the previous code shows, two workflows can execute the same process.
In the second workflow we add the collect operator, which gathers the outputs of the different executions and returns the resulting list as a single emission.
Let’s run the code:
nextflow run test1.nf -bg > log2
cat log2
N E X T F L O W ~ version 20.07.1
Launching `test1.nf` [irreverent_davinci] - revision: 25a5511d1d
[de/105b97] Submitted process > first_pipeline:printHello (hello)
[ba/051c23] Submitted process > first_pipeline:printHello (bonjour)
[1f/9b41b2] Submitted process > second_pipeline:printHello (hello)
[8d/270d93] Submitted process > first_pipeline:printHello (hola)
[18/7b84c3] Submitted process > second_pipeline:printHello (hola)
hello in Italian is ciao
bonjour in Italian is ciao
[0f/f78baf] Submitted process > second_pipeline:printHello (bonjour)
hola in Italian is ciao
['hello in Italian is ciao\n', 'hola in Italian is ciao\n', 'bonjour in Italian is ciao\n']
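Because the submission rows and the process output are interleaved in the log, a short summary can make it easier to confirm that each named workflow ran the process the expected number of times. This is only a sketch, assuming the `Submitted process > workflow:process (tag)` row format shown above; `count_submitted` is a hypothetical helper name.

```shell
# Hypothetical helper (not part of Nextflow): count submitted tasks
# per (workflow:)process name in a log file.
count_submitted() (
  logfile="$1"
  # Extract the name that follows "Submitted process > " and drop the tag in parentheses.
  sed -n 's/^.*Submitted process > \([^ ]*\).*$/\1/p' "$logfile" |
  sort | uniq -c |
  while read -r n name; do
    printf '%s\t%s\n' "$name" "$n"
  done
)
```

Called as `count_submitted log2`, it would report three tasks for first_pipeline:printHello and three for second_pipeline:printHello for the log above.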