5.10 Directives

Directives are declarations that provide optional settings for a process. They are placed at the top of the process body, before the input, output, and script blocks.
For instance, they can control how a process stages its input and output files (stageInMode and stageOutMode), or mark which files are final results and specify the folder where they should be published (publishDir).
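
As a minimal sketch of the staging directives mentioned above, the process below copies its input into the task directory instead of linking it. The process name stagingExample and the output name seq.copy are hypothetical and used only for illustration. Note that stageOutMode takes effect when the task runs in a scratch directory.

```nextflow
// Hypothetical process, for illustration only
process stagingExample {
    stageInMode 'copy'    // copy input files into the task directory (the default is a symbolic link)
    stageOutMode 'move'   // move outputs from the scratch directory to the work directory

    input:
    path seq

    output:
    path "seq.copy"

    script:
    """
    cat ${seq} > seq.copy
    """
}
```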

We can add the directive publishDir to our previous example:

/*
 * Simply reverse the sequences
 */

process reverseSequence {
    tag "$seq" // label each task with the input file name in the execution log
    publishDir "output"

    input:
    path seq 

    output:
    path "all.rev"

    script:
    """
    cat ${seq} | awk '{if (\$1~">") {print \$0} else system("echo " \$0 " |rev")}' > all.rev
    """
}
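
By default, publishDir creates symbolic links that point to the files in the work directory. The variant below is a sketch of the same process that uses the optional mode and pattern settings to publish real copies of the matching files. The values chosen here are one possible configuration, not a requirement of the example above.

```nextflow
process reverseSequence {
    tag "$seq"
    // mode: 'copy' publishes real copies instead of symbolic links;
    // pattern restricts publishing to output files matching the glob
    publishDir "output", mode: 'copy', pattern: '*.rev'

    input:
    path seq

    output:
    path "all.rev"

    script:
    """
    cat ${seq} | awk '{if (\$1~">") {print \$0} else system("echo " \$0 " |rev")}' > all.rev
    """
}
```

Copying is usually the safer choice when the work directory is going to be cleaned up after the run.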

We can also use the storeDir directive when we want a permanent cache.

The process is executed only if the output files do not exist in the folder specified by storeDir.
When the output files exist, the process execution is skipped and these files are used as the actual process result.

For example, this is useful when we do not want to regenerate indexes on every run and prefer to reuse them.
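
A minimal sketch of this use case follows. It assumes that samtools is available in the execution environment. The process name indexGenome and the folder name indexes are hypothetical. Because storeDir looks for the declared output files by name, the output name must be predictable from the inputs.

```nextflow
// Hypothetical process, assuming samtools is installed
process indexGenome {
    // The index is kept in this folder and reused by later runs
    storeDir "indexes"

    input:
    path genome

    output:
    path "${genome}.fai"

    script:
    """
    samtools faidx ${genome}
    """
}
```

On the first run the index is created and stored in the indexes folder. On later runs, if the file is already there, the task is skipped and the stored file is emitted as the process output.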

We can also specify what to do when a process fails.

By default, the pipeline stops and raises an error. Alternatively, we can ignore the failure and let the rest of the pipeline continue by using the errorStrategy directive:

errorStrategy 'ignore'

Alternatively, we can retry the process a given number of times, increasing resources such as the available memory or the maximum execution time on each attempt.
This requires several directives:

    memory { 1.GB * task.attempt }
    time { 1.hour * task.attempt }

    errorStrategy 'retry' 
    maxRetries 3
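
Placed in the context of a full process, these directives could look like the sketch below. It reuses the reverseSequence process from earlier in this section. Whether the requested memory and time are actually enforced depends on the executor in use.

```nextflow
process reverseSequence {
    tag "$seq"

    // Resources scale with the attempt number:
    // 1 GB and 1 hour on the first attempt, 2 GB and 2 hours on the second, and so on
    memory { 1.GB * task.attempt }
    time { 1.hour * task.attempt }

    // Resubmit a failed task up to 3 times before reporting the error
    errorStrategy 'retry'
    maxRetries 3

    input:
    path seq

    output:
    path "all.rev"

    script:
    """
    cat ${seq} | awk '{if (\$1~">") {print \$0} else system("echo " \$0 " |rev")}' > all.rev
    """
}
```

The curly brackets make the memory and time values dynamic, so they are evaluated again for every attempt with the current value of task.attempt.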