Disk areas

Puhti has three main disk areas: home, projappl and scratch. Please familiarize yourself with the areas and their specific purposes before using Puhti.

Disk area   Owner      Environment variable   Path                  Cleaning
home        Personal   $HOME                  /users/<user-name>    No
projappl    Project    Not available          /projappl/<project>   No
scratch     Project    Not available          /scratch/<project>    Yes - 90 days

These disk areas have quotas for both the amount of data and total number of files:

Disk area   Capacity   Number of files
home        10 GiB     100 000 files
projappl    50 GiB     100 000 files
scratch     1 TiB      1 000 000 files

See Increasing Quotas for instructions on how to apply for increased quota.

Home directory

Each Puhti user has a home directory ($HOME) that can contain up to 10 GiB of data.

The home directory is the default directory where you begin after logging in to Puhti. However, you should normally change to your project's scratch directory when working on Puhti, because the home directory is not intended for data analysis or computing. Its purpose is to store configuration files and other minor personal data. Exceeding the quota of the home directory causes various account problems.
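
To keep an eye on the home directory limits, a small helper along the following lines can be used. This is only a sketch: home_usage is a hypothetical name, the limits are copied from the quota table above, and the file count is an approximation because the quota accounting may also count directories.

```bash
# Sketch: summarize how much data and how many files a directory holds.
# home_usage is a hypothetical helper name; the limits come from the quota table.
home_usage() {
    local dir="${1:-$HOME}"
    local used_kib files
    # du -sk prints the total size in KiB followed by the directory name.
    used_kib=$(du -sk "$dir" | cut -f1)
    # Counts regular files only; the quota accounting itself may also count directories.
    files=$(find "$dir" -type f | wc -l | tr -d ' ')
    echo "data: ${used_kib} KiB used (limit 10485760 KiB)"
    echo "files: ${files} used (limit 100000)"
}
```

Running home_usage without arguments inspects $HOME. Passing a path as the first argument inspects that directory instead.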

The home directory is the only user-specific directory in Puhti. All other directories are project-specific. If you are a member of several projects, you also have access to several scratch or projappl directories, but still have only one home directory.

Note

The home directory is not automatically backed up by CSC (the same applies to all directories in Puhti), which means that data accidentally deleted by the user cannot be recovered.

Scratch directory

By default, each project has 1 TiB of scratch disk space in the directory /scratch/<project>.

This fast parallel scratch space is intended as temporary storage for data that is being used on Puhti. The scratch directory is not intended for long-term data storage, and any files that have not been used for 90 days are removed automatically.
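
To see which files might be affected by the cleaning, you can list files that have not been accessed for a while. The sketch below assumes that "not used" can be approximated by the file access time. The actual cleaning process may use a different criterion, and list_idle_files is a hypothetical helper name.

```bash
# Sketch: list files whose last access is older than a given number of days.
# Assumption: "not used" is approximated by the access time (find -atime);
# the actual cleaning process may use a different criterion.
list_idle_files() {
    local dir="$1"
    local days="${2:-90}"
    # find rounds the age down to whole days, so +90 matches files
    # that were last accessed at least 91 days ago.
    find "$dir" -type f -atime +"$days" -print
}
```

For example, list_idle_files /scratch/<project> prints the candidates in a project's scratch directory, and a second argument changes the age limit in days.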

ProjAppl directory

Each project also has 50 GiB of project application disk space in the directory /projappl/<project>.

It is intended for storing applications you have compiled yourself, as well as libraries and other files that you share within the project. It is not a personal storage space: it is shared with all members of the project.

It is not intended for running applications, so please run them in scratch instead.

Using Scratch and ProjAppl directories

An overview of your directories in Puhti can be displayed with:

csc-workspaces 

The above command displays all scratch and projappl directories you have access to within active projects with Puhti access.

For example, if you are a member of two projects, with the Unix groups project_2012345 and project_3587167, then you have access to two scratch and two projappl directories:

[kkayttaj@puhti ~]$ csc-workspaces 
Disk area               Capacity(used/max)  Files(used/max)  Project description  
----------------------------------------------------------------------------------
Personal home folder
----------------------------------------------------------------------------------
/users/kkayttaj                2.05G/10G       23.24k/100k

Project applications 
----------------------------------------------------------------------------------
/projappl/project_2012345     3.056G/50G       23.99k/100k   Ortotopology modeling
/projappl/project_3587167     10.34G/50G       2.45/100k     Metaphysics methods

Project scratch 
----------------------------------------------------------------------------------
/scratch/project_2012345        56G/1T         150.53k/1000k Ortotopology modeling
/scratch/project_3587167       324G/1T         5.53k/1000k   Metaphysics methods

Moving to the scratch directory of project_2012345:

cd /scratch/project_2012345
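
Because no environment variable is defined for the scratch and projappl areas, it can be convenient to build the paths from the project name. The helper below is a minimal sketch. The name workspace_path is hypothetical and the function only assembles the path layout shown above, without checking that the directory exists.

```bash
# Sketch: print the path of a project's scratch or projappl directory.
# workspace_path is a hypothetical helper name.
workspace_path() {
    local area="$1" project="$2"
    case "$area" in
        scratch|projappl)
            printf '/%s/%s\n' "$area" "$project"
            ;;
        *)
            echo "unknown disk area: $area" >&2
            return 1
            ;;
    esac
}
```

With this function defined, cd "$(workspace_path scratch project_2012345)" moves to the same directory as the command above.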

Please note that not all CSC projects have Puhti access, so you may not necessarily find a scratch or projappl directory for all your CSC projects.

The scratch and projappl directories are shared by all the members of the project. By default, all new files and directories are fully accessible to the other group members (read, write and execute permissions). If you want to restrict the access of the other group members, you can change the permissions with the chmod command.

Setting read-only permissions for your group members for the directory my_directory:

chmod -R g-w my_directory
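
The same approach can remove group access completely. The following sketch wraps both variants in one function. restrict_group is a hypothetical helper name, and it uses only standard chmod symbolic modes.

```bash
# Sketch: restrict what other project members can do with a directory tree.
# restrict_group is a hypothetical helper; it only wraps standard chmod modes.
restrict_group() {
    local mode="$1" dir="$2"
    case "$mode" in
        read-only)
            # Group members can still read, but can no longer modify.
            chmod -R g-w "$dir"
            ;;
        private)
            # Group members lose all access.
            chmod -R g-rwx "$dir"
            ;;
        *)
            echo "usage: restrict_group read-only|private <directory>" >&2
            return 1
            ;;
    esac
}
```

For example, restrict_group private my_directory removes all group permissions from my_directory and everything below it. You can check the result with ls -ld my_directory.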

As mentioned earlier, the scratch directory is only intended for processing data. Any data that should be preserved for a longer time should be copied to the Allas storage server. Instructions for backing up files from Puhti to Allas will be included in this guide as soon as the Allas storage service is available.

Increasing Quotas

You can use the MyCSC portal to manage the quotas of the scratch and projappl directories.

Remember that even after the quota is increased, the automatic cleaning process will continue to remove idle files from the scratch directory. Data that is not actively being used in computing should be stored in the Allas storage service.

Remember also that these values can be increased only to a limited extent. This applies especially to the number of files: if your data workflow requires storing tens of millions of files in the scratch area, you should reconsider the workflow.
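
One common way to reduce the number of files is to pack directories of small files into a single archive once they are no longer being processed. The following is a minimal sketch of that idea and not a CSC-provided tool. pack_dir is a hypothetical name, and the function deletes the original directory after the archive has been read back successfully.

```bash
# Sketch: replace a directory of many small files with one tar archive.
# pack_dir is a hypothetical helper name.
pack_dir() {
    local dir="${1%/}"
    local archive="${dir}.tar"
    # -C keeps the paths inside the archive relative to the parent directory.
    tar -cf "$archive" -C "$(dirname "$dir")" "$(basename "$dir")" || return 1
    # Remove the originals only after the archive can be listed without errors.
    tar -tf "$archive" > /dev/null || return 1
    rm -rf "$dir"
    echo "$archive"
}
```

After pack_dir results, the directory results is replaced by the single file results.tar, which counts as one file against the quota. The contents can be restored with tar -xf results.tar.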

Additional disk areas

Login nodes

All of the login nodes have 2900 GiB of fast local storage. The storage is located under $TMPDIR and is separate for each login node.

The local storage is well suited for compiling applications and for pre- and post-processing that requires heavy I/O operations, for example packing and unpacking archive files.

Note

The local storage is meant for temporary storage and is cleaned frequently. Remember to move your data to a shared disk area after completing your task.
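
As an illustration, the sketch below unpacks an archive on the local disk and then copies the result to a shared disk area. The function name unpack_via_tmpdir is hypothetical, the processing step is left as a placeholder, and the destination is whichever shared directory you choose.

```bash
# Sketch: unpack an archive on the fast local disk, then copy the result
# to a shared disk area. unpack_via_tmpdir is a hypothetical helper name.
unpack_via_tmpdir() {
    local archive="$1" dest="$2"
    local workdir status
    # Create a private working directory under $TMPDIR (falls back to /tmp).
    workdir=$(mktemp -d "${TMPDIR:-/tmp}/unpack.XXXXXX") || return 1
    tar -xf "$archive" -C "$workdir" || { rm -rf "$workdir"; return 1; }
    # ... I/O-heavy pre- or post-processing of "$workdir" would go here ...
    # Copy the contents to the shared area before the local disk is cleaned.
    mkdir -p "$dest" && cp -R "$workdir"/. "$dest"/
    status=$?
    rm -rf "$workdir"
    return $status
}
```

For example, unpack_via_tmpdir data.tar /scratch/<project>/data leaves the unpacked files in the project's scratch directory and removes the temporary copy from the local disk.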

Compute nodes

Interactive batch jobs, as well as jobs running on the I/O and GPU nodes, have fast local storage available. In interactive batch jobs this local disk area is defined by the environment variable $TMPDIR, and in normal batch jobs by $LOCAL_SCRATCH. The size of this storage space is defined in the batch job resource request (max. 3600 GB).

These local disk areas are designed to support I/O-intensive computing tasks and cases where you need to process a large number of small files (over 100 000 files). These directories are cleaned once the batch job finishes. Thus, at the end of a batch job you must copy all the data that you want to preserve from these temporary disk areas to the scratch directory or to Allas.
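
The resulting pattern is stage in, compute, stage out. The sketch below shows it for a normal batch job, assuming that $LOCAL_SCRATCH has been set by the batch system. In an interactive batch job $TMPDIR would take its place. The function name and the placeholder computation are illustrative only, and the resource request for the local disk is not shown.

```bash
# Sketch of the stage-in / compute / stage-out pattern for node-local disk.
# Assumes it runs inside a batch job where $LOCAL_SCRATCH is set; the function
# name and the placeholder processing step are illustrative only.
run_on_local_disk() {
    local input_dir="$1" result_dir="$2"
    local work
    if [ -z "${LOCAL_SCRATCH:-}" ]; then
        echo "LOCAL_SCRATCH is not set" >&2
        return 1
    fi
    work="$LOCAL_SCRATCH/work"
    mkdir -p "$work/in" "$work/out" || return 1
    # Stage in: copy the input data to the node-local disk.
    cp -R "$input_dir"/. "$work/in"/ || return 1
    # Placeholder for the real computation: count the input files.
    find "$work/in" -type f | wc -l | tr -d ' ' > "$work/out/file_count.txt"
    # Stage out: the local disk is cleaned when the job ends,
    # so copy the results to a shared disk area.
    mkdir -p "$result_dir" && cp -R "$work/out"/. "$result_dir"/
}
```

In a job script this could be called as run_on_local_disk /scratch/<project>/input /scratch/<project>/results, so that only the stage-in and stage-out steps touch the shared scratch area.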

For more information see: creating job scripts.