High-throughput Nextflow workflow using HyperQueue

The Slurm executor of Nextflow is not suitable for large workflows involving tens of thousands of processes on HPC systems. This is due to the usage of individual jobs/job steps that bloat the Slurm log and degrade the performance of the batch job scheduler. To maximize throughput and performance, we recommend the Nextflow workflow tasks to be run using the HyperQueue meta-scheduler as the executor. This page provides an example batch script for this purpose.

Note

Whenever you're unsure how to run your workflow efficiently, don't hesitate to contact CSC Service Desk.

Example batch script

#!/bin/bash
#SBATCH --partition=medium
#SBATCH --account=project_2001659 
#SBATCH --nodes=3
#SBATCH --exclusive
#SBATCH --time=00:10:00

# Load the required modules
module load hyperqueue
module load nextflow

# Create a per job directory
wrkdir=$PWD/WRKDIR-$SLURM_JOB_ID
mkdir -p $wrkdir/.hq-server

# Set the directory which hyperqueue will use 
export HQ_SERVER_DIR=$wrkdir/.hq-server

# Make sure nextflow uses the right executor and
# knows how much it can submit.
echo "executor {
  queueSize = $(( 128*SLURM_NNODES ))
  name = 'hq'
  cpus = $(( 128*SLURM_NNODES )) 
}" > $wrkdir/nextflow.config

cp main.nf $wrkdir
hq server start &
srun --cpu-bind=none --hint=nomultithread --mpi=none -N $SLURM_NNODES -n $SLURM_NNODES -c 128 hq worker start --cpus=128 &

num_up=$(hq worker list | grep RUNNING | wc -l)
while true; do

    echo "Checking if workers have started"
    if [[ $num_up -eq $SLURM_NNODES ]];then
        echo "Workers started"
        break
    fi
    echo "$num_up/$SLURM_NNODES workers have started"
    sleep 1
    num_up=$(hq worker list | grep RUNNING | wc -l)

done

cd $wrkdir
nextflow run main.nf

# Make sure we exit cleanly once nextflow is done
hq worker stop all
hq server stop

Where main.nf would be the nextflow script you want to run. Note that this batch script creates a per job directory and copies the nextflow script there before starting.

$ ls
example_jobscript.sh main.nf 
$ sbatch example_jobscript.sh
Submitted batch job 137
$ ls
example_jobscript.sh main.nf WRKDIR-137
$ ls WRKDIR-137
main.nf work

More information

Last edited Thu Jul 14 2022