5.11 Resuming your pipeline
After modifying the code, you can resume the execution with the option -resume.
Because the inputs of each process have not changed, Nextflow reuses the cached results instead of running the tasks again, as the "Cached process" lines in the output show.
nextflow run test1.nf -bg -resume
N E X T F L O W ~ version 20.07.1
Launching `test1.nf` [determined_celsius] - revision: eaf5b4d673
[bd/f4e9a6] Cached process > flow1:splitSequences
[37/d790ab] Cached process > flow2:splitSequences
[93/c7b1c6] Cached process > flow2:reverseSequence (seq_3)
[45/86dd83] Cached process > flow2:reverseSequence (seq_1)
[87/54bfe8] Cached process > flow2:reverseSequence (seq_2)
[33/a6fc72] Cached process > flow1:reverseSequence ([seq_1, seq_2, seq_3])
/home/ec2-user/git/CoursesCRG_Containers_Nextflow_May_2021/nextflow/nextflow/work/33/a6fc72786d042cacf733034d501691/all.rev
IMPORTANT: Nextflow options take one hyphen (-resume), while pipeline parameters take two (--inputfile).
Sometimes you might want to resume an earlier run of your pipeline rather than the latest one.
To do so, you need the session ID of that run, which you can look up with the command nextflow log:
nextflow log
TIMESTAMP DURATION RUN NAME STATUS REVISION ID SESSION ID COMMAND
2020-10-06 14:49:09 2s agitated_avogadro OK 61a595c5bf 4a7a8a4b-9bdb-4b15-9cc6-1b2cabe9a938 nextflow run test1.nf
2020-10-08 19:14:38 2.8s sick_edison OK 82e66714e4 4fabb863-2038-47b4-bac0-19e71f93f284 nextflow run test1.nf -bg
2020-10-08 19:16:03 3s sad_newton OK 82e66714e4 2d13e9f8-1ba6-422d-9087-5c6c9731a795 nextflow run test1.nf -bg
2020-10-08 19:30:59 2.3s disturbed_wozniak OK d33befe154 0a19b60d-d5fe-4a26-9e01-7a63d0a1d300 nextflow run test1.nf -bg
2020-10-08 19:35:52 2.5s insane_plateau OK d33befe154 b359f32c-254f-4271-95bb-6a91b281dc6d nextflow run test1.nf -bg
2020-10-08 19:56:30 2.8s determined_celsius OK eaf5b4d673 b359f32c-254f-4271-95bb-6a91b281dc6d nextflow run test1.nf -bg -resume
You can then resume the state of that execution by passing its SESSION ID to -resume.
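As a minimal sketch, the resume command for the disturbed_wozniak run in the listing above could be assembled as follows. The two sample rows are copied from that listing; in practice you would read them from nextflow log directly. The variable names LOG_ROWS, RUN_NAME and SESSION_ID are only illustrative.

```shell
# Sample rows copied from the `nextflow log` listing above.
LOG_ROWS='2020-10-08 19:30:59 2.3s disturbed_wozniak OK d33befe154 0a19b60d-d5fe-4a26-9e01-7a63d0a1d300 nextflow run test1.nf -bg
2020-10-08 19:35:52 2.5s insane_plateau OK d33befe154 b359f32c-254f-4271-95bb-6a91b281dc6d nextflow run test1.nf -bg'

# Run we want to resume (value of the RUN NAME column).
RUN_NAME="disturbed_wozniak"

# The session ID is the only UUID on the row, so match it by pattern
# rather than by column position.
SESSION_ID=$(printf '%s\n' "$LOG_ROWS" | grep -w "$RUN_NAME" \
  | grep -oE '[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}')

# Print the command that resumes that session.
echo "nextflow run test1.nf -resume $SESSION_ID"
```

Running the printed command resumes test1.nf from the cached results of that session.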
Nextflow’s cache can be disabled for a specific process by setting the directive cache to false. You can also choose among three caching methods:
cache = true // (default) Cache keys are created indexing input files meta-data information (name, size and last update timestamp attributes).
cache = 'deep' // Cache keys are created indexing input files content.
cache = 'lenient' // (Best in HPC and shared file systems) Cache keys are created indexing input files path and size attributes
IMPORTANT: On some shared file systems, file timestamps can be inconsistent. Because the 'lenient' mode ignores timestamps, it prevents unwanted re-execution of cached processes.
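The cache settings above could be collected in a small configuration file, sketched here under a few assumptions: the file name cache.config is arbitrary, the process name reverseSequence is taken from the run output earlier in this section, and the withName selector is one possible way to target a single process.

```shell
# Write a small Nextflow config file; the name cache.config is arbitrary.
cat > cache.config <<'EOF'
process {
    // Default for every process: build cache keys from input file path and size only.
    cache = 'lenient'

    // Disable caching for one process, selected by name (illustrative choice).
    withName: reverseSequence {
        cache = false
    }
}
EOF

# To use it, add the file to the run's configuration with -c:
#   nextflow run test1.nf -c cache.config -resume
```

With this file, reverseSequence is always executed again on resume, while the other processes are cached in lenient mode.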