How to run a large number of small jobs in Mahti and Puhti
In many cases, a computational analysis job contains a number of similar independent subtasks. A user may have several datasets that are analyzed in the same way, or the same simulation code is executed with a number of different parameters. These kinds of tasks are often called farming or embarrassingly parallel jobs, as the work can in principle be distributed to as many processors as there are subtasks to run.
In Mahti, these kinds of task sets can be executed with the GREASY metascheduler and the sbatch-greasy automatic submission command. GREASY enables Mahti to be used effectively for non-MPI tasks, too. However, the task set to be executed should be large enough to utilize the full capacity of at least one Mahti node (128 cores).
In Puhti, GREASY can be used as an alternative to array jobs. GREASY is the recommended option in cases where individual tasks are very short. Further, GREASY allows you to define dependencies between tasks, which is not possible in array jobs.
GREASY was originally developed at BSC. At CSC we use the GREASY version that includes the extensions developed at CSCS. For detailed documentation, please check the GREASY user guide.
Note
You should not use GREASY to run MPI parallel tasks. GREASY is not able to manage MPI jobs effectively.
GREASY task lists
Job scheduling with GREASY is based on task lists that have one task (command) per row. In the simplest case, the task list is just a file containing the commands to be executed. For example, analyzing 200 input files with the program my_prog could be described with a task list containing 200 rows:
my_prog < input1.txt > output1.txt
my_prog < input2.txt > output2.txt
my_prog < input3.txt > output3.txt
...
my_prog < input200.txt > output200.txt
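If the input files follow a regular naming scheme, the task list itself is easiest to generate with a short shell loop. A minimal sketch, assuming input files named input1.txt ... input200.txt as above:

# write one command per input file into the task list
for i in $(seq 1 200); do
    echo "my_prog < input${i}.txt > output${i}.txt"
done > tasklist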
If needed, you can define dependencies between tasks with the [#line number#] syntax. For example, if you would like to merge the output files of the previous example into one file, you could add one more row to the task list:
my_prog < input1.txt > output1.txt
my_prog < input2.txt > output2.txt
my_prog < input3.txt > output3.txt
...
my_prog < input200.txt > output200.txt
[#1-200#] cat output* > all_output
By default, all tasks are executed in the directory where the task list processing is launched, but you can also set task-specific execution directories in the task list.
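For example, to run the third task of the previous example in its own directory, the row could look like this (the [@ ... @] prefix is GREASY's execution directory syntax; the path is illustrative):

[@ /scratch/project_2012345/task3/ @] my_prog < input3.txt > output3.txt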
Note
You should not include srun in your tasks, as GREASY adds it when it executes the tasks. Please check the GREASY user guide for a more detailed description of the task list syntax.
Executing a task list
To use GREASY in Mahti or Puhti, load the GREASY module:
module load greasy
The easiest way to execute a task list is to use the sbatch-greasy command, which automatically creates a batch job file for your task list and submits it:

sbatch-greasy tasklist

By default, this command asks you to define:

1. number of cores used by one task (-c)
2. estimated average duration for one task (-t)
3. number of nodes used to execute the tasks (-N)
4. accounting project (-A)
5. estimated memory usage for one task (-m) (this parameter is not in use in Mahti)
Alternatively, you can define some or all of these parameters on the command line:
sbatch-greasy tasklist -c 1 -t 15:00 -N 1 -A project_2012345
With the option -f filename you can make sbatch-greasy save the GREASY batch job file without submitting it for execution. This batch job file can then be further edited according to your needs, e.g. if you need to set additional SLURM parameters.
Submit it normally with:
sbatch filename
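Put together, a minimal sketch of this workflow (the file name greasy_job.sh is arbitrary):

# create the batch job file without submitting it
sbatch-greasy tasklist -c 1 -t 15:00 -N 1 -A project_2012345 -f greasy_job.sh
# edit greasy_job.sh as needed, then submit it
sbatch greasy_job.sh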
Caveats
Performance of threaded (OpenMP) jobs can be sensitive to thread binding. If your job is parallelized with OpenMP, make sure that the performance of the individual subjobs has not suffered. A single subjob must fit in one node, but a job of that size could also be run as an array job, so GREASY is best suited for jobs (much) smaller than one node.
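As a rough check, you can time one subjob on its own and compare it to how long the same task takes within the GREASY run. A sketch with a placeholder program, input, and thread count:

# run a single 4-thread task interactively and time it
export OMP_NUM_THREADS=4
time my_prog < input1.txt > output1.txt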
While GREASY only creates one batch job, it creates a job step for each task it runs. A huge number of job steps in a single batch job is problematic. If you need to run hundreds or thousands of job steps, please contact servicedesk@csc.fi to look for alternatives.