Extended instructions for using Maestro at CSC
Please first read the main CSC Maestro page and then consult the power user and special case instructions below. Further down there are steps to help solve or diagnose issues and to prepare data for support requests.
Standalone jobs on Puhti
All Maestro jobs must be run on compute nodes via the queuing system. Don't run any Maestro jobs, including the GUI, on the login nodes. Maestro jobs on the login node will be terminated without warning.
The recommended way to run Maestro jobs on Puhti is to create the input files on
your local computer and instead of running them, write them to disk. The procedure
is shown in a video on our
main Maestro page. Use e.g.
scp on your local machine to copy the inputs to Puhti
(edit your username and project accordingly):
scp -r my_job <your username>@puhti.csc.fi:/scratch/<your project>
Once the folder
my_job containing all input files has been copied, 1) SSH to
Puhti, 2) load the Maestro module and 3) go to the input directory:
ssh <your username>@puhti.csc.fi
module load maestro
cd /scratch/<your project>/my_job
The job is submitted to a compute node(s) by running the
written out by Maestro. It will formulate the task(s) as Slurm batch job(s) and
request resources according to the selected HOST in your
schrodinger.hosts file in
Once the simulation has finished, copy the outputs back to your local computer
for analysis. On your local machine, run e.g.
scp -r <your username>@puhti.csc.fi:/scratch/<your project>/my_job .
Note that you can also use e.g. the Puhti web interface for copying files between Puhti and your local computer.
Another, more advanced option is to use e.g. the
pipeline tool, which allows
you to bypass some of the Schrödinger jobcontrol machinery, but requires you to
write the job script yourself. This may be useful in case some of your subjobs
terminate unexpectedly. In this case, please make note of those JobIds and
The remainder of this article explains some implementation details on Puhti and helps you set up efficient simulation workflows.
This file specifies the resources your jobs can get either locally
or from the queuing system. To use the recommended procedure
you need to edit the local (on your computer)
schrodinger.hosts file to
include the same HOSTs that you want to use on Puhti. On Windows, this will
require admin privileges.
On Puhti, Maestro complains about the location of this file; you can safely ignore
the warning. The file is created by a script (echoed on your screen when you run
module load maestro) that you need to run if the file does not exist.
As the script requests, select the computing project that will be used for
CPU/GPU usage and scratch storage. You can find the actual Slurm options
in the HOST descriptions in the
schrodinger.hosts file. If your jobs require
resources that are not satisfied by any of the predefined HOST descriptions,
feel free to edit the file.
On Puhti, you can take a look at the
schrodinger.hosts file with:
On your local computer this file will be in the Maestro installation directory,
e.g. on Windows in
After the longish header and the
localhost entry, you should see the
Puhti HOST entries as something like:
name: test
queue: SLURM2.1
qargs: -p test -t 00:10:00 --mem-per-cpu=2000 --account=project_2042424
host: puhti-login11
processors: 4
For example, this HOST entry, available for Schrödinger jobs as test (from
will use the Slurm partition test (from
-p test), allocate a maximum of 10 minutes of time and
2 GB of memory per core, and consume resources from project_2042424. If you need different resources you
can edit this file e.g. by adding a new entry. The requests must be within the
schrodinger.hosts file on Puhti does not have the
delete the file and rerun the script to create it (
module load maestro will print out the
path to the script, copy/paste it to the command line). You don't need to have the
--account= option set in your local
schrodinger.hosts file. In your local file,
it's enough that the different HOST entries exist (and the GPU-ones have GPUs specified).
Note that the HOST entries and Slurm partitions (or queues) are two different things. The HOST entries define resources using Slurm partitions.
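To see at a glance which Slurm options each HOST entry carries, the file can be read with a few lines of Python. The following is only a minimal sketch: it assumes that every entry starts with a name: line, that settings are written as key: value pairs, and that # starts a comment. The sample text and the function name parse_hosts are illustrative and not part of Maestro.

```python
# Hypothetical excerpt modelled on the "test" entry shown above; a real
# schrodinger.hosts file has a longer header and more entries.
SAMPLE_HOSTS = """\
# comment lines are ignored
name: localhost

name: test
queue: SLURM2.1
qargs: -p test -t 00:10:00 --mem-per-cpu=2000 --account=project_2042424
host: puhti-login11
processors: 4
"""


def parse_hosts(text):
    """Split hosts-file text into one dict per HOST entry (keys lowercased)."""
    entries = []
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and padding
        if ":" not in line:
            continue
        # Split on the first colon only, so "-t 00:10:00" stays intact.
        key, value = (part.strip() for part in line.split(":", 1))
        if key.lower() == "name":  # assumption: "name:" opens a new entry
            entries.append({})
        if entries:
            entries[-1][key.lower()] = value
    return entries


for entry in parse_hosts(SAMPLE_HOSTS):
    print(entry["name"], "->", entry.get("qargs", "(no Slurm options)"))
```

To inspect your own file, read it into a string and pass it to parse_hosts instead of the sample text.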
How to speed up simulations?
Apart from Jaguar and Quantum Espresso, which can run "real" parallel jobs, all Maestro modules run serial jobs. Don't choose a "parallel" HOST for any other job type. Instead of running MPI-parallel jobs, Maestro modules typically split the workload into multiple parts, each of which can be run independently of the others. The Maestro documentation has an excellent section on this topic (requires signing in). In the documentation, go to "Getting started" > "Download and Installation" > "Installation and Jobs" > "Running Distributed Schrödinger Jobs".
It is typical to process a lot of molecules as a part of a particular workload. If you have enough molecules, you can split the full set into smaller subsets and process each of the subsets as a separate job. The Maestro modules have easy-to-use options for defining the number of subjobs. However, you must know in advance how many subjobs to launch. In principle, this requires knowing how long one molecule takes, or testing for each different use case.
When you start working with a new system/dataset, don't test whether you got the syntax right with 1000000 molecules and 1000 subjobs. Instead, start out with e.g. 50 molecules and 2 subjobs. Learn how long it takes per molecule, confirm that your submit syntax is correct, adjust your parameters if needed and only then scale up.
If you're using the GUI to set up your job script, specify how many (sub)jobs (processors) you want to use. You can easily edit this later in the submit script if you change your mind.
The "default" submit script will work "as is" for small jobs. Just make sure you don't ask for too many (sub)jobs. As a rule of thumb, each subjob should last at least 1 hour, and for very large jobs preferably 24 hours. Running a lot of very short jobs is inefficient in many ways and may degrade the performance of the system for all users, see our high-throughput computing guidelines. For large workflows, you'll need to edit your scripts, see below.
Running multinode jobs using the "parallel" HOST works well with Quantum Espresso when appropriate
parallelization flags are carefully specified. The default parallelization is over plane waves if
no other options are specified. To improve on this, k-points (if more than one) can be partitioned
into "pools" using the
-npools flag. Also, when running on several hundred cores, the scalability
can be further extended by dividing each pool into "task groups" which distributes the workload
associated with Fast Fourier Transforms (FFTs) on the Kohn-Sham states. This is done using the
-ntg flag. In order to have good load balancing among MPI processes, the number of k-point pools
should be an integer divisor of the number of k-points and the number of processors for FFT
parallelization should be an integer divisor of the third dimension of the smooth FFT grid (this
can be checked from the output file,
grep "Smooth grid" *.out). Further parallelization levels are
presented in the QE documentation.
The QE parallelization options can be specified in the Job Settings dialog of the QE calculations
panel of the Maestro GUI. Running a job using 160 cores on Puhti (4 nodes) could be parallelized
for example with
-npools 4 -ntg 4 so that each k-point pool is given 40 cores, which are further
divided into 4 task groups of 10 cores each.
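The arithmetic behind such a choice can be checked before submitting. The sketch below is a hypothetical helper, not part of Maestro or QE: it only divides the cores as described above and flags the two load-balancing conditions. It assumes that the "number of processors for FFT parallelization" equals the number of cores in one task group; fft_nr3 stands for the third dimension of the smooth FFT grid read from the output file.

```python
def qe_layout(total_cores, n_kpoints, npools, ntg, fft_nr3=None):
    """Return (cores_per_pool, cores_per_task_group, warnings)."""
    if total_cores % npools != 0:
        raise ValueError("total_cores must be divisible by npools")
    cores_per_pool = total_cores // npools
    if cores_per_pool % ntg != 0:
        raise ValueError("cores per pool must be divisible by ntg")
    cores_per_group = cores_per_pool // ntg

    warnings = []
    # Load balancing: npools should divide the number of k-points.
    if n_kpoints % npools != 0:
        warnings.append("npools is not a divisor of the number of k-points")
    # Assumption: the FFT processor count is the size of one task group.
    if fft_nr3 is not None and fft_nr3 % cores_per_group != 0:
        warnings.append("FFT processor count does not divide the FFT grid")
    return cores_per_pool, cores_per_group, warnings


# The example from the text: 160 cores, 4 k-points, -npools 4 -ntg 4
print(qe_layout(total_cores=160, n_kpoints=4, npools=4, ntg=4))
```

For the example in the text this gives 40 cores per pool and 10 cores per task group with no warnings.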
Using full nodes
When running Maestro modules such as Quantum Espresso on multiple nodes, remember to explicitly
request the appropriate number of nodes by editing the
schrodinger.hosts file with the
--nodes=<number of nodes> flag. Requesting full nodes prevents fragmenting of the system and
decreases the amount of unnecessary communication between surplus nodes. For large subjobs you
may also need to tune the time and memory requested in the
schrodinger.hosts file to suit your
A single core job as required by the driver process cannot be run on the
large partition on
Puhti. To run multinode subjobs you need to modify the submission script generated by the GUI
by specifying a separate driver HOST (e.g.
-DRIVERHOST interactive -SUBHOST parallel, see also
The following figures show the time to solution and scaling of the PSIWAT benchmark (2552 electrons, 4 k-points, Maestro 2021.3, pure MPI).
- Scaling is almost ideal up to 4 nodes when using
-npools 4 -ntg 4.
- For this system and QE binary the performance does not scale beyond 320 cores.
- Always confirm the appropriate scaling of your system before running large multinode jobs (minimum 1.5 times speedup when doubling the number of cores).
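The 1.5 times rule is easy to apply to your own timings. The following is a minimal sketch with a hypothetical function name; the wall times in the example are made up for illustration and are not the PSIWAT benchmark results.

```python
def largest_efficient_core_count(timings, min_speedup=1.5):
    """Walk up in doublings from the smallest core count and stop when
    doubling no longer gives at least `min_speedup`.

    timings maps number of cores -> wall time (any consistent unit).
    """
    cores = min(timings)
    while 2 * cores in timings:
        speedup = timings[cores] / timings[2 * cores]
        if speedup < min_speedup:
            break
        cores *= 2
    return cores


# Made-up wall times in seconds, for illustration only
example = {40: 1000.0, 80: 520.0, 160: 280.0, 320: 160.0, 640: 150.0}
print(largest_efficient_core_count(example))
```

With these invented numbers the last doubling (320 to 640 cores) gives a speedup of only about 1.07, so the function stops at 320 cores.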
Additional flags for Maestro modules
Different modules have different options. You can set some of them in
the GUI, but you may find more options with the
-h flag, e.g.
glide would be the Maestro module you want to run, like
The Maestro documentation has a nice summary of the different options for different modules. In the documentation, select: "Getting Started" > "Download and Installation" > "Installation and Jobs" > "Running Distributed Schrödinger Jobs".
Simple HOST selection
For jobs that finish within about two days and run 10 subjobs, just use:
or if they all finish within 14 days, use:
If you have a workflow that will last longer, read on.
Advanced HOST selection
The general aim is to have the "driver process" running on a "HOST"
that will be alive for the whole duration of the workflow. Good
options are interactive and longrun if you estimate the complete
workflow to take more than 3 days (queuing included). A "driver process"
that is not using a lot of CPU is also allowed on a login node, but
a subjob is not. Never submit jobs on Puhti login nodes with
-HOST localhost. It's ok if you create your own batch script and
use localhost on a compute node, but that's for special cases only
and not discussed on this page.
Set the "driver" or "master" to run on a HOST that allows for long run times (if it's a big calculation). The driver needs to be alive for the whole duration of the workflow (otherwise, your subjob likely ends up fizzled). You can use "interactive" which allows for 7 days for one core, or "longrun" which allows for 14 days. If you need to run multiple workflows at the same time, choose "longrun" for the next drivers. In both cases select some "normal" HOST (i.e. "small" Slurm partition) for the (sub)jobs. Suitable splitting will reduce your queuing time. Asking for the longrun HOST "just in case" is not dangerous, but may lead to unnecessary queuing.
You may be able to set the number of subjobs already in the GUI. Typically, the GUI sets the "number of processors", which in many drivers equals the number of subjobs. Some drivers also let you set the number of subjobs separately. This enables you to limit the number of simultaneous jobs with the "processor count" (so that you and others won't run out of licenses) while keeping each subjob at a suitable size. Please have a look at the help text of your driver via the Help path described above.
In summary, for a large workflow, edit the GUI-generated script along these lines: change
-HOST "normal_72h:10" to
-HOST "longrun:1 normal_72h:9" or e.g.
-HOST "normal_72h" to
-HOST "interactive:1 normal_72h:9". Another
alternative is to use explicit flags,
-DRIVERHOST interactive -SUBHOST normal_72h.
Note that you can have only one single core job running in the interactive HOST.
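The rewrite of the -HOST value described above can be expressed as a small helper. This is only an illustrative sketch: split_host is a hypothetical name, and it assumes the original value has the form "name:count" with a count of at least 2, where one slot is moved to the driver HOST.

```python
def split_host(host_spec, driver_host="longrun"):
    """Turn e.g. 'normal_72h:10' into 'longrun:1 normal_72h:9'."""
    name, sep, count = host_spec.partition(":")
    if not sep:
        raise ValueError("expected a value of the form 'name:count'")
    n_slots = int(count)
    if n_slots < 2:
        raise ValueError("need at least 2 slots to split off a driver")
    # One slot goes to the long-lived driver, the rest stay for subjobs.
    return f"{driver_host}:1 {name}:{n_slots - 1}"


print(split_host("normal_72h:10"))
print(split_host("normal_72h:10", driver_host="interactive"))
```

The result is the string to place inside the quotes after -HOST in the GUI-generated script.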
Desmond jobs can have the
-HOST gpu flag as set by the GUI, but Windows users
need to change the forward slash "/" to backward slash "\" in the binary name.
Authoritative job control instructions from the manual
A more detailed discussion on advanced jobs can be found in the Maestro documentation via (from the GUI or the Schrödinger website):
- "Getting started" > "Download and Installation" > "Installation and Jobs" > "Job Control Guide" > "Running jobs" > "Running Jobs from the Command Line" > "The HOST, DRIVERHOST, and SUBHOST Options"
and a table of driver process conventions from
- "Getting started" > "Download and Installation" > "Installation and Jobs" > "Running Distributed Schrödinger Jobs"
Setting number of subjobs or molecules per subjob
If you don't know how long your full workflow will take, don't ask
for more than 10 subjobs and/or
NJOBS. More is not always better.
If you have very large cases, don't exceed 50 simultaneous (sub)jobs.
As an example, the "run settings dialog" of
glide offers three options:
- Recommended number of subjobs
- Exactly (fill in here) subjobs
- Subjobs with no more than (fill in here) ligands each
Aim for such numbers that an average subjob takes 1-24 hours to run. This ensures that the overhead per subjob remains small while offering efficient parallelization, i.e. you get your results quickly and each subjob (as well as the master job) has time to finish.
Don't run subjobs that complete in less than 15 minutes. You can check the subjob duration afterwards with seff and use this info in your following jobs:
If time runs out for a subjob, search for "restart" in the
Schrödinger Knowledge Base
for your module, and/or look again for the options of your driver with
-h flag. Most jobs are restartable, so you don't lose completed
work or used resources.
If you choose too many subjobs, Maestro can get confused by the Slurm messages, and sorting out the issue can be difficult. Also, running too many subjobs at a time can exhaust the licenses, in which case the time spent waiting in the queue will have been for nothing.
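The sizing guidance in this section can be turned into a quick estimate once a small test run has told you roughly how long one molecule takes. The sketch below uses a hypothetical function, plan_subjobs, and assumes that the run time scales linearly with the number of molecules. It picks the largest number of subjobs that stays within the cap (10 by default, as recommended above) while keeping each subjob at or above the minimum duration.

```python
def plan_subjobs(n_molecules, seconds_per_molecule,
                 min_hours=1.0, max_subjobs=10):
    """Return (number_of_subjobs, estimated_hours_per_subjob)."""
    total_hours = n_molecules * seconds_per_molecule / 3600
    # As many subjobs as possible without any of them dropping below
    # min_hours, but never more than max_subjobs and never fewer than 1.
    n_subjobs = max(1, min(max_subjobs, int(total_hours // min_hours)))
    return n_subjobs, total_hours / n_subjobs


# Illustrative numbers: 50 000 molecules at 30 s each
for cap in (10, 50):
    n, hours = plan_subjobs(50_000, 30, max_subjobs=cap)
    print(f"cap {cap}: {n} subjobs of about {hours:.1f} h each")
```

In this made-up case, 10 subjobs would run for more than 24 hours each, so the workload counts as a very large case where raising the cap towards 50 brings each subjob into the 1-24 hour range.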
Optimal disk usage
The Schrödinger HOSTs in Puhti have not been configured to use the NVMe local disk, which is available only on some of the compute nodes. Since most jobs don't gain a speed advantage from the NVMe disk, you'll likely queue less by not asking for it. If your job requires a lot of random I/O, please contact CSC Service Desk on how to request it. The only disk available for the jobs is the same where your input files already are. Hence, it does not make sense to copy the files to a "temporary" location at the start of the job. However, at the moment there doesn't seem to be a way to prevent that in the latest versions (2020.1 onwards).
Copying files to and from local computer
There are detailed instructions on how to accomplish this. One
efficient method using the command line is
scp, which works even in Windows
In Windows, start PowerShell by searching for it in the bottom left
menu. In Linux or Mac, open a terminal.
cd to the directory that contains
the directory of input files recently written out by the GUI (here named
scp -r glide-dock_1 <your username>@puhti.csc.fi:/scratch/project_2001234
will copy the whole directory (note
-r) into your Puhti scratch folder.
In a terminal in Puhti, run the job, and once it has completed, you can
copy it back (give the command on your local computer):
scp -r <your username>@puhti.csc.fi:/scratch/project_2001234/glide-dock_1 .
You might be interested in some additional SSH tips, which will save you from typing your password every time.
Running the Maestro GUI on Puhti
This is not recommended. Running the GUI remotely is slow and prone to glitches. Please run the GUI locally, and only submit the jobs (run the script) on Puhti. If this is not possible, and you have to run the GUI on Puhti, use the Puhti web interface remote desktop.
Availability of licenses
The CSC Maestro license has a fixed number of tokens that are shared by all users. First, Maestro uses module-specific tokens, of which there are many for each module. If they run out, more jobs of the same type can still be run with "general tokens", but once those run out too, no more jobs of that type (or any new jobs that need a general token) can be run by anyone. Therefore, this should be avoided. Once a job ends, its tokens are released and become available to everyone.
You can check the currently available licenses with:
and currently used licenses with:
Note that some Maestro tools or workflows use multiple modules and hence licenses or tokens from multiple modules. Typically, one running instance of a module (a job or a subjob) requires several tokens. For example, Desmond and Glide jobs take 8 tokens each.
CPU time (billing units) is a different resource and has nothing to do with license tokens. When you run out of billing units, you or your project manager can apply for more via the My CSC portal.
Sometimes the jobs are launched but don't finish. The state of the job as
jobcontrol, see below, is fizzled. This might be due to a
number of reasons but cleaning up and restarting the jobcontrol service
might help. When you don't have any Maestro jobs running (in Puhti), give:
$SCHRODINGER/utilities/jserver -cleanall
$SCHRODINGER/utilities/jserver -shutdown
jserver -cleanall command will not work because the program thinks some jobs are
still running. To force purge these jobs, run
$SCHRODINGER/jobcontrol -delete -force <jobid>
before the above
<jobid> should be replaced by the ID of your stranded job,
Another reason for jobs ending up fizzled is too many simultaneous jobs. Please have a look at the error files for suggestions, and if this is the case, ask for fewer subjobs.
Run a test job to help diagnose problems
Run one of the test jobs that come with Maestro to narrow down potential issues. In your scratch directory on Puhti, give
installation_check -test test
to try running a test job using the "test" HOST. If the test succeeds, the problem is likely in your input. In this case, please proceed to the postmortem step below.
Asking for support
Maestro has a tool called postmortem that can be used to create a zip file containing the details of a failed job and the Maestro environment. Please add that to your support request to help us analyse your issue. On Puhti, first use
to find the right JobId (something like
Then, check the right flags for
and create the postmortem file with
$SCHRODINGER/utilities/postmortem <your schrodinger jobid>
The file may be large, so instead of sending it as an email attachment, consider using a-flip and just sending a link instead. Also, see the previous recommendation to try with small systems, as this will also enable you to use the test HOST and avoid queueing.
Please have a look at our instructions on making support requests. An efficient support request will help us to solve your issue faster.
Recap of Maestro usage on Puhti
- Always test your workflow with a small sample first
- Note that you can have only one job running at a time in the
- Note that you can have only one job running at a time in the
- Don't run the Maestro GUI on the login node (use the Puhti web interface if you must run the GUI on Puhti)
- Don't specify too many subjobs - an optimal subjob takes 1-24 hours
- Don't specify too many subjobs - there are many researchers using the same license
- Don't run a heavy "driver process" on the login node (if it's heavy, for 10 simultaneous jobs use
-HOST "longrun:1 normal_72h:9")
- Never run anything in parallel on the login node (localhost should not be in your script)
- Submit all jobs from your /scratch area
- If your local computer uses Windows, edit
/in your script
- Use the same version of Maestro locally and on Puhti
If you have suggestions on how to improve this guide, e.g. to give examples of efficient workflows, fork a copy (top right pen icon), edit and propose merge, or send your suggestion to CSC Service Desk.