6.7 Deployment in the AWS cloud

The final profile is for running the pipeline on the Amazon cloud, known as Amazon Web Services or AWS. In particular, we will use AWS Batch, a service that allows the execution of containerised workloads on the Amazon cloud infrastructure.


   cloud {
       workDir = 's3://class-bucket-1/work'                    // intermediate files are stored on S3
       aws.region = 'eu-central-1'
       aws.batch.cliPath = '/home/ec2-user/miniconda/bin/aws'  // AWS CLI on the compute instances

       process {
           // run the container as the current user when the engine is Docker
           containerOptions = { workflow.containerEngine == "docker" ? '-u $(id -u):$(id -g)' : null }
           executor = 'awsbatch'
           queue = 'spot'
           memory = '1G'
           cpus = 1
           time = '6h'

           withLabel: 'twocpus' {
               memory = '0.6G'
               cpus = 2
           }
       }
   }
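The withLabel selector above applies only to processes that declare a matching label. As a minimal sketch (the process name and script are hypothetical, not taken from the course pipeline), a process opts into those resources like this:

   process alignReads {
       label 'twocpus'    // matches withLabel: 'twocpus', so it gets 2 cpus and 0.6G memory

       script:
       """
       echo "running on ${task.cpus} cpus"
       """
   }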

We set some AWS-specific parameters (the region and the path to the AWS command-line tool on the compute instances) and select awsbatch as the executor.
We also point the working directory, which is normally written locally, to an S3 bucket. This is mandatory when running Nextflow with AWS Batch.
We can now launch the pipeline specifying -profile cloud:

nextflow run test3.nf -bg -with-docker -profile cloud > log
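Since -bg sends Nextflow to the background and its standard output is redirected to the log file, progress can be followed with tail; .nextflow.log is always written in the launch directory:

tail -f log            # execution progress, as redirected above
tail -f .nextflow.log  # detailed Nextflow runtime log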

Note that there is no longer a local work folder: intermediate files are written to the S3 bucket defined by workDir, while the published output is still copied to the local machine.
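You can verify this by listing the bucket with the AWS command-line tool (assuming your credentials have access to class-bucket-1, the bucket used in this profile):

aws s3 ls s3://class-bucket-1/work/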

Sometimes the Nextflow process itself is very memory intensive and the main node can run out of memory. To avoid this, you can cap the memory used by the Nextflow runtime by setting an environment variable that passes heap-size options to the underlying Java virtual machine (-Xms sets the initial heap size, -Xmx the maximum):

export NXF_OPTS="-Xms50m -Xmx500m"

As before, we could copy the output files to the bucket ourselves, but we can also tell Nextflow to copy them directly to the S3 bucket: to do so, change the outdir parameter in the params file:

outdir = "s3://class-bucket-1/results"
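This works because the pipeline presumably publishes its results through the publishDir directive pointing at params.outdir, and publishDir accepts an s3:// path just like a local one. A minimal sketch, with a hypothetical process and output file:

   process collectResults {
       publishDir params.outdir, mode: 'copy'   // works with local paths and s3:// URIs alike

       output:
       file 'summary.txt'                       // hypothetical output file

       script:
       """
       echo "pipeline finished" > summary.txt
       """
   }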