Machine learning guide

This guide aims to help users who wish to do machine learning using CSC's computing resources.

What CSC service to use?

CSC offers several services which might be relevant for machine learning users:

  • Supercomputers: Puhti and Mahti are multi-user clusters that offer the highest computing performance, including GPU acceleration, in a centrally controlled software environment,

  • Pouta offers your own virtual server with full control of the software environment, but restricted computing performance compared to supercomputers,

  • Rahti offers a more automated, container-based cloud environment, useful in particular for deploying web services.

We recommend CSC's Puhti or Mahti supercomputers unless you need a very complicated software environment or work with sensitive data. In those cases Pouta might be the right choice; we also offer the ePouta variant, which is suited for cases with high security requirements.

If you are developing a service, for example deploying a trained model as a web service, then Pouta or Rahti is probably the most relevant option for you.

If you are unsure about the right service to use, don't hesitate to contact our service desk and explain your computing needs.

CSC's supercomputers

For most machine learning needs, CSC's supercomputers, Puhti and Mahti, are the way to go. Both are clusters of hundreds of computers, some of which offer GPU acceleration. The supercomputers are multi-user systems, so individual users have limited rights to install software. As with any shared resource, everyone must follow the usage policy so that the service remains usable for all.

If you are a new user, please read how to access Puhti and Mahti, and how to submit computing jobs.

New users may be particularly interested in Puhti's web interface, which can be accessed at www.puhti.csc.fi. Via the web interface, one can easily launch, for example, a Jupyter Notebook session with TensorFlow or PyTorch. Please note that Puhti's web interface is still in a beta stage and is being continuously improved.
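Once a notebook session is running, it can be useful to confirm which framework is available and whether a GPU is visible to it. The following is a minimal sketch, assuming a PyTorch environment; the helper name `describe_accelerators` is hypothetical and not part of any CSC tooling. If PyTorch is not installed, the function simply reports that instead of failing.

```python
import importlib.util


def describe_accelerators():
    """Report whether PyTorch is importable and whether it can see a GPU.

    Hypothetical helper for illustration only.
    """
    report = {"torch_installed": False, "cuda_available": False, "gpu_names": []}

    # Only import torch if it is actually installed in this environment.
    if importlib.util.find_spec("torch") is None:
        return report

    import torch

    report["torch_installed"] = True
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        # List the name of every GPU visible to this session.
        report["gpu_names"] = [
            torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())
        ]
    return report


if __name__ == "__main__":
    print(describe_accelerators())
```

An empty `gpu_names` list in a session that was supposed to have a GPU usually means the session was started without GPU resources, so it is worth checking the session settings before starting a long training run.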

Also check the subsections related to efficient GPU utilization, how to work with big data sets, and multi-GPU and multi-node jobs.

Cloud services

Pouta

There are some use cases where the supercomputers are not the right solution, and you may need a virtual server on Pouta. Typical examples include:

  • very complex software environment,
  • need for root access,
  • computation involving sensitive data.

With Pouta you get your own virtual server, where you have root or administrator access. HPC and GPU flavors are available for heavy computing needs, but the computing resources will always be smaller than those of a supercomputer.

For computation involving highly sensitive data we also offer the ePouta variant, which is suited for cases with high security requirements. With ePouta, the virtual server is integrated into your existing network infrastructure.

See our Pouta documentation pages on how to apply for access.

Rahti

For model deployment, the Rahti container cloud service can be used. However, it currently doesn't offer GPU support, so inference has to run on CPUs.
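To make the idea of serving a model from a container more concrete, here is a minimal sketch of a CPU-only prediction service using only the Python standard library. Everything in it is an assumption made for illustration: the `predict` function is a hypothetical stand-in for a real trained model, and the port (8080) and JSON request format are arbitrary choices, not Rahti requirements.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    """Hypothetical stand-in for a trained model: returns the mean of the inputs."""
    return sum(features) / len(features)


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read and parse the JSON request body, e.g. {"features": [1.0, 2.0]}.
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
            result = {"prediction": predict(payload["features"])}
            status = 200
        except (ValueError, KeyError, TypeError, ZeroDivisionError):
            # Malformed JSON, missing key, wrong types, or an empty feature list.
            result = {"error": "expected JSON like {\"features\": [1.0, 2.0]}"}
            status = 400

        body = json.dumps(result).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(host="0.0.0.0", port=8080):
    # Port 8080 is an assumed, unprivileged port chosen for this sketch.
    return HTTPServer((host, port), PredictHandler)


if __name__ == "__main__":
    make_server().serve_forever()
```

In a real deployment, `predict` would load the trained model once at startup and run inference on it, and a production-grade web framework would normally replace `http.server`, which is intended for simple use cases.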


Last update: April 11, 2022