Roihu disk areas
Roihu provides three main shared disk areas: home, projappl, and scratch. In addition, each compute node provides a local temporary disk area that is available only during a job or interactive session on that node. Please familiarize yourself with the areas and their specific purposes.
Roihu users can also apply for separate dataset projects.
These provide access to a dedicated disk area, dataset, intended for sharing
datasets between multiple projects. Unlike computational projects, dataset projects
do not include scratch or projappl directories.
These directories are shared across the login and compute nodes on the system, and are based on the Lustre filesystem. See a more technical description of the Lustre filesystem on CSC supercomputers.
CSC does not back up your data!
None of the disk areas are automatically backed up by CSC! Deleted files cannot be recovered. To avoid unintended data loss, make sure to perform regular backups to, for example, Allas. See also the allas-backup tool.
| Disk area | Owner | Environment variable | Path | Cleaning | Automatic backup |
|---|---|---|---|---|---|
| home | Personal | ${HOME} | /users/<user-name> | No | No |
| projappl | Project | Not defined | /projappl/<project> | No | No |
| scratch | Project | Not defined | /scratch/<project> | 180 days | No |
| dataset | Project | Not defined | /dataset/<project> | No | No |
These disk areas have quotas for both the amount of data and total number of files:
| Disk area | Capacity | Number of files | Notes |
|---|---|---|---|
| home | 15 GiB | 150 000 files | |
| projappl | 15 GiB | 150 000 files | |
| scratch | 250 GiB | 500 000 files | |
| dataset | 0 GiB | 0 files | Must be applied for separately |
LUE
To easily check the amount of data and number of files within a given folder on
the parallel file system, please consider using the LUE
tool. This tool is significantly faster than tools like stat or du and causes
much less load on the file system.
Quotas and cleaning
While it is possible to apply for increased quotas, we recommend that you always first ensure that the data you have stored on the shared file system is really needed and in active use. Unused data should be deleted or moved to e.g. Allas. A general tutorial on managing and cleaning data on Puhti and Mahti disks is also available.
Home directory
Each user has a home directory ($HOME) that can contain up to 15 GiB of data on Roihu.
The home directory is the default location after logging in. However, it is not intended for data analysis or running jobs. Its purpose is to store configuration files and other minor personal data. Keep an eye on the remaining quota in your home directory: a home directory that exceeds its capacity can cause various account problems.
The home directory is the only user-specific directory on the supercomputers. All other directories
are project-specific. If you are a member of several projects, you also have access to several
scratch and projappl directories, but still have only one home directory.
For all computing work, you should use your project's scratch directory.
Scratch directory
Each project on Roihu has, by default, 250 GiB of scratch disk space in the
directory /scratch/<project>.
The scratch directory is a fast parallel filesystem intended for temporary storage of
data used in computation, and should contain, for example, the input and output files of your
programs.
You should aim to run your jobs on the supercomputer in this scratch directory.
The scratch directory is not intended for long-term storage. Files that have not been accessed for a long time may be automatically removed to free up space. The current policy on Roihu is to remove files that have not been accessed for more than 180 days.
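To get a rough idea of which files would fall under this policy, one could list files whose last access time is older than 180 days. The sketch below is only an illustration and not CSC's cleaning tool: the function name `list_stale_files` and the example path are hypothetical, and the check relies on the `-atime` test of `find`. Running `find` over a large directory tree loads the parallel file system, so it is best pointed at a limited subdirectory.

```shell
# list_stale_files DIR [DAYS]
# Print regular files under DIR that were last accessed more than DAYS
# full 24-hour periods ago. Hypothetical helper; DAYS defaults to 180.
list_stale_files() {
    local dir="$1"
    local days="${2:-180}"
    find "$dir" -type f -atime +"$days" -print
}

# Example (hypothetical path):
# list_stale_files /scratch/project_2000123/old_runs
```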
Make sure to consult our tutorial for tips and guidelines on how to
manage your data on scratch.
Projappl directory
Each project on Roihu also has 15 GiB of project application disk space
in the directory /projappl/<project>.
Use the projappl area for storing compiled software binaries, source code, libraries, scripts and small-scale reference data that are shared within a project. It is not a personal storage space, as it is shared with all members of a project. Files in projappl are not automatically removed, but the quota is limited.
Please do not submit jobs from or write
large-scale data to your project's projappl directory, but use scratch
instead for this purpose. Note that any self-installed applications you run
can and should still be stored in projappl.
Using scratch and projappl directories
An overview of your directories on the supercomputer you are currently logged in to can be displayed with the csc-workspaces command.
The above command displays all scratch and projappl directories you have access to.
It also displays which of your projects are subject to the 180-day scratch cleaning cycle.
For example, if you are a member in two projects, with unix groups project_2000123
and project_2001234, then you have access to two scratch and projappl directories:
[kkayttaj@roihu-login11 ~]$ csc-workspaces
Disk area Capacity(used/max) Files(used/max) Cleanup
----------------------------------------------------------------------
Personal home folder
/users/kkayttaj 4.4G/15G 24K/150K n/a
----------------------------------------------------------------------
Project: project_2000123 "Project X"
/projappl/project_2000123 24G/15G 36K/150K n/a
/scratch/project_2000123 103G/250G 389K/500k 180d
----------------------------------------------------------------------
Project: project_2001234 "Project Y"
/projappl/project_2001234 25G/100G 282K/1.0M n/a
/scratch/project_2001234 7.2/10TB 2.1M/2.5M 180d
----------------------------------------------------------------------
To move to the scratch directory of project_2000123, run cd /scratch/project_2000123.
Note that not all CSC projects have Roihu access, so you may not
necessarily find a scratch or projappl directory for all your CSC projects.
Note
The scratch and projappl directories are shared by all the members of the
project. All new files and directories are also fully accessible for other
group members (including read, write and execution permissions) by default.
If you need to restrict access from your group members, you can reset the permissions
with the chmod command as usual. In general, we recommend that you allow the group
members access, but use a subdirectory named after your username for your data, for example
/scratch/<project>/$USER.
This way the data is accessible to other group members in case of long vacations, etc., but the ownership is still clear and organized. Note that some programs change the file permissions from the defaults, which may restrict access for group members.
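The personal subdirectory described above could be created along these lines. This is a minimal sketch: the helper name `make_user_dir` and the project in the example are hypothetical, and the optional `chmod` call is only one way to restrict access.

```shell
# make_user_dir PROJECT_DIR
# Create a subdirectory named after the current user inside PROJECT_DIR
# and print its path. Hypothetical helper for illustration.
make_user_dir() {
    local project_dir="$1"
    local user_dir="$project_dir/$(id -un)"
    mkdir -p "$user_dir"
    echo "$user_dir"
}

# Example (hypothetical project):
# my_dir="$(make_user_dir /scratch/project_2000123)"
#
# Optionally remove write access for other group members while keeping
# read access and directory traversal:
# chmod -R g-w,g+rX "$my_dir"
```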
As mentioned earlier, the scratch directory is only intended for processing data.
Any data that should be preserved for a longer time should be copied to the Allas
object storage server. Instructions for backing up files from CSC supercomputers to
Allas can be found in the Allas guide.
Dataset directory
Roihu users can apply for separate dataset projects, which provide access to a shared disk
area under /dataset/<project>, but no computational resources.
Dataset project access begins in early August
You can already apply for a dataset project in MyCSC. Based on the applications, the first dataset projects will be approved and granted access in early August 2026.
Unlike normal computational projects, dataset projects do not include scratch or projappl directories. Instead, they are designed specifically for sharing data between multiple projects.
Write access to a dataset directory is restricted to a single project, while multiple other projects can be granted read access to this disk area.
See details for how to apply for a dataset project in MyCSC.
Note
Dataset projects are intended for data sharing and active use, not long-term storage. For long-term storage, consider using Allas.
Moving data between supercomputers
Data can be moved directly between supercomputers using the rsync command.
See our data migration guide for migrating data from Puhti/Mahti to Roihu.
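As a rough illustration of such a transfer, the sketch below wraps `rsync` in a small function. The function name `sync_dir` is hypothetical, the remote user and host in the example are placeholders to be replaced with real values, and the chosen options (`-a`, `--partial`) are one reasonable combination rather than a CSC recommendation.

```shell
# sync_dir SRC DEST
# Copy SRC to DEST with rsync, preserving permissions and timestamps (-a)
# and keeping partially transferred files (--partial) so that an
# interrupted transfer can be continued by re-running the same command.
sync_dir() {
    rsync -a --partial "$1" "$2"
}

# Example with placeholders for the remote user and host:
# sync_dir /scratch/project_2000123/results "<user>@<host>:/scratch/project_2000123/"
```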
Increasing quotas
You can use the MyCSC portal to manage quotas of the scratch and projappl
directories.
Remember that even after the quota is increased, the planned automatic cleaning process
will continue removing idle files from the scratch directory. Data that is not under
active computing should be stored in the Allas storage service.
Quota increases are limited. If your workflow requires storing very large numbers of files (e.g. millions), you should reconsider your data workflow, as this can lead to performance issues on the whole filesystem.
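One common way to reduce the file count is to bundle many small files into a single archive and unpack it only where it is needed. The sketch below assumes plain `tar` is suitable for your data; the function name `pack_dir` and the example paths are hypothetical.

```shell
# pack_dir DIR ARCHIVE
# Bundle the contents of DIR into a single uncompressed tar ARCHIVE,
# so that the shared file system stores one file instead of many.
# ARCHIVE should be located outside DIR.
pack_dir() {
    tar -cf "$2" -C "$1" .
}

# Example (hypothetical paths):
# pack_dir /scratch/project_2000123/many_small_files /scratch/project_2000123/small_files.tar
```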
Info
To find out how much data/files you have on the disk, please use our LUE
tool which is much more performant than standard
tools such as stat or du.
Temporary local disk areas
Roihu compute nodes provide fast local disk storage that can significantly improve performance for I/O-intensive workloads.
This storage is available via the environment variable $TMPDIR, which many
applications use automatically for temporary files.
Local disk is node-specific and available on the login node, as well as in a job or interactive session. It is intended for temporary files that do not need to be shared between nodes.
Login nodes
Each login node on both Roihu-CPU and Roihu-GPU provides 80 GB of local storage under $TMPDIR.
The local storage is intended for compiling applications and performing pre- and post-processing that require heavy I/O operations, for example packing and unpacking archive files.
Note
The local storage is meant for temporary storage and is cleaned frequently. Remember to move your data to a shared disk area after completing your task.
Compute nodes
All compute nodes in Roihu provide fast NVMe local storage.
These local disk areas are designed to support I/O-intensive computing tasks and cases where you need to process large numbers (over 100 000) of small files.
Data in local storage is removed when the job finishes. You must copy any results you want to
keep to scratch or Allas before the job ends.
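The pattern of unpacking data onto the local disk, working there, and copying results back could look roughly like the sketch below. The function name `run_on_local_disk`, the directory layout, and the example paths are hypothetical, and counting files merely stands in for the real computation. The fallback to /tmp is only there so that the snippet can be tried outside a job.

```shell
# run_on_local_disk INPUT_TAR RESULT_DIR
# Unpack INPUT_TAR onto node-local disk, process it there, and copy the
# results back to RESULT_DIR (e.g. under scratch) before returning.
run_on_local_disk() {
    local input_tar="$1"
    local result_dir="$2"
    local workdir
    # Falls back to /tmp when $TMPDIR is unset.
    workdir="$(mktemp -d "${TMPDIR:-/tmp}/job.XXXXXX")"

    mkdir -p "$workdir/input" "$workdir/output"
    tar -xf "$input_tar" -C "$workdir/input"

    # Placeholder for the actual I/O-intensive work.
    find "$workdir/input" -type f | wc -l > "$workdir/output/file_count.txt"

    # Copy results to shared storage; local data is removed when the job ends.
    mkdir -p "$result_dir"
    cp -r "$workdir/output/." "$result_dir/"
    rm -rf "$workdir"
}

# Example (hypothetical paths):
# run_on_local_disk /scratch/project_2000123/input.tar /scratch/project_2000123/results
```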
Based on your Slurm job reservation type, you will have access to the following amount of local disk space:
Automatic local temporary storage
For all allocation types listed below, local temporary storage is available under $TMPDIR.
| Allocation type | Path | Quota per user |
|---|---|---|
| R (Shared nodes) | $TMPDIR | 20 GiB |
| N (Full nodes) | $TMPDIR | 600 GiB |
| G (GPU nodes) | $TMPDIR | 150 GiB |
| XL (Hugemem nodes) | $TMPDIR | 1.6 TiB |
| VIZ (Visualization nodes) | $TMPDIR | 6.5 TiB |
The disk space can be accessed under $TMPDIR, and does not need to be separately reserved in
your job script to be usable. Using the local disk does not consume billing units.
Reserved local scratch storage
Roihu's hugemem (XL) and visualization (Viz) nodes provide some local disk storage under $TMPDIR.
On top of this, they provide local scratch storage under $LOCAL_SCRATCH for larger temporary storage needs.
This storage is not available automatically. You must reserve it in your Slurm job script using the appropriate GRES option.
Reserved $LOCAL_SCRATCH storage consumes billing units.
Local scratch support will be added later
The local scratch feature on XL and Visualization nodes is not yet
implemented. Use $TMPDIR for your local storage needs until this feature is added.
| Allocation type | Path | Maximum reservable local scratch |
|---|---|---|
| XL (Hugemem nodes) | $LOCAL_SCRATCH | TBA |
| VIZ (Visualization nodes) | $LOCAL_SCRATCH | TBA |
See the Roihu billing section for information on the storage billing units that local scratch usage consumes.
Disaggregated storage
It is also possible to request local disk mounts from a centralised pool of fast storage resources. This fast storage capacity is provided over the network and will appear as local scratch from within a Slurm job. The total capacity of the disaggregated NVMe resource is 307.2 TB, allowing you to get larger capacity fast storage for your jobs.
Requesting storage from Slurm
You must request resources in conjunction with --exclusive
At present you can only request this storage for jobs that use full nodes,
i.e. jobs that are submitted with the --exclusive flag. If you do not specify this flag,
your job will currently fail and be marked "CANCELLED by 350", without any stdout or stderr
logs. This should be resolved once support for shared node jobs arrives in Q3 2026.
To request flash storage to be mounted in an sbatch job, add a directive of the form
#BB_LUA SBF storagesize=<size> path=<path> to the resource request block of your script,
where storagesize specifies the amount of storage you need and path the location where the
storage will be mounted.
You can also request resources directly on the command line with the --bb flag:
srun -p small --exclusive --nodes 1 --mem 20G --account <project> --bb="#BB_LUA SBF storagesize=10G path=/run/sbb/<user>" --pty bash -i
Alternatively you can pass the request in a file using the --bbf flag, for example:
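A minimal sketch of such a request file, assuming the file holds the same #BB_LUA directive that is otherwise passed inline with --bb. The file name `bb_request.txt` is hypothetical, and `<user>` and `<project>` are placeholders.

```shell
# Write the storage request to a file (hypothetical file name).
cat > bb_request.txt <<'EOF'
#BB_LUA SBF storagesize=10G path=/run/sbb/<user>
EOF

# Then pass the file instead of the inline --bb string, for example:
# srun -p small --exclusive --nodes 1 --mem 20G --account <project> --bbf=bb_request.txt --pty bash -i
```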
Steps must use srun!
When running a multinode job with sbatch, if each step is expected to run with the disaggregated disk, then the steps must be started with srun. Otherwise, only the compute node that runs the sbatch script will be able to use the storage.
Remember to move your data!
Move any data you need off the flash storage before your job completes, i.e. within your sbatch script.
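Put together, a batch job using the disaggregated storage might look roughly like the sketch below. It is an illustration under several assumptions: the file name `bb_job.sh`, the project project_2000123, the user kkayttaj, the time limit, and the placeholder workload are all hypothetical, and the directive is assumed to take the same form as the --bb string. The snippet writes the script to a file and checks its syntax, so it can be inspected before being submitted with sbatch.

```shell
# Write a sketch of a batch script to a file for inspection.
cat > bb_job.sh <<'EOF'
#!/bin/bash
#SBATCH --account=project_2000123
#SBATCH --partition=small
#SBATCH --nodes=1
#SBATCH --exclusive
#SBATCH --time=00:30:00
#BB_LUA SBF storagesize=10G path=/run/sbb/kkayttaj

FLASH_DIR=/run/sbb/kkayttaj
SCRATCH_DIR=/scratch/project_2000123/kkayttaj

# Start each step with srun so that it can use the mounted storage.
# Placeholder for the real work: write one small file to the flash storage.
srun bash -c 'hostname > "$1/node.txt"' _ "$FLASH_DIR"

# Copy results back to scratch before the job ends.
mkdir -p "$SCRATCH_DIR/results"
srun cp -r "$FLASH_DIR/." "$SCRATCH_DIR/results/"
EOF

# Check the syntax locally; submit with: sbatch bb_job.sh
bash -n bb_job.sh
```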