Machine learning guide
This guide aims to help users who wish to do machine learning using CSC's computing resources.
Machine learning guide subsections
In addition to this page, this guide contains the following subsections:
- Getting started with machine learning at CSC
- GPU-accelerated machine learning
- Data storage for machine learning
- Multi-GPU and multi-node machine learning
- Hyperparameter search
- Managing machine learning workflows on CSC's supercomputers
What CSC service to use?
CSC offers several services which might be relevant for machine learning users:
- Pouta offers your own virtual server with full control of the software environment, but with restricted computing performance compared to the supercomputers.
- Rahti offers a more automated container-based cloud environment, useful in particular for deploying web services.
We recommend using CSC's supercomputers unless you need a very complicated software environment or work with sensitive data. In those cases Pouta might be the right choice, and for cases with high security requirements we also offer the ePouta variant.
If you are developing a service, for example if you want to deploy a trained model as a service, then Pouta or Rahti might be the most relevant for you.
If you are unsure about the right service to use, don't hesitate to contact our service desk and explain your computing needs.
Supercomputers
For most machine learning needs, CSC's supercomputers are the way to go. These are clusters of hundreds (or thousands) of computers, some of which offer GPU acceleration. The supercomputers are multi-user systems, so individual users have limited rights to install software, and as with any shared resource, one must follow the usage policy so that the service remains usable for everyone.
CSC hosts two national supercomputers, Puhti and Mahti, as well as the European LUMI supercomputer. If you are unsure which supercomputer to choose, read the discussion here.
Puhti and Mahti each have a web interface, available at www.puhti.csc.fi and www.mahti.csc.fi, respectively. Via the web interface, one can easily launch, for example, a Jupyter Notebook session with TensorFlow or PyTorch.
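Once such a notebook session is running, a quick check like the one below can confirm whether a GPU is actually visible to the framework. This is a minimal sketch, assuming PyTorch; the helper name `describe_accelerator` is hypothetical, and the function falls back to a plain message if PyTorch is not installed in the session.

```python
import importlib.util


def describe_accelerator() -> str:
    """Return a one-line summary of the GPU visible to PyTorch, if any.

    Hypothetical helper for illustration; not part of any CSC tooling.
    """
    # Avoid an ImportError in environments without PyTorch.
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed in this environment"

    import torch

    # torch.cuda.is_available() reports whether PyTorch can use a GPU.
    if torch.cuda.is_available():
        return f"GPU available: {torch.cuda.get_device_name(0)}"
    return "PyTorch is installed, but no GPU is visible"


print(describe_accelerator())
```

If the message reports that no GPU is visible, the session was most likely started without GPU resources.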
Pouta
There are some use cases where the supercomputers are not the right solution, and you may need a virtual server on Pouta. Typical examples include:
- very complex software environment,
- need for root access,
- computation involving sensitive data.
With Pouta you get your own virtual server, where you have root or administrator access. HPC and GPU flavors are available for heavy computing needs. However, the computing resources will always be smaller than in a supercomputer.
For computation involving highly sensitive data we also offer the ePouta variant which is suited for cases with high security requirements. With ePouta the virtual server will be integrated into your existing network infrastructure.
Rahti
Below are a few examples of how Rahti can be used for machine learning tasks:
- Setting up an MLflow server on Rahti to store results and models: how-to video and GitHub repository
- How to deploy machine learning models on Rahti
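As an illustration of the first item, a training script could send its results to an MLflow tracking server roughly as follows. This is a sketch, not the setup from the linked material: the function name `log_example_run`, the experiment name, and the server URL in the usage comment are all hypothetical, and the `mlflow` package is assumed to be installed wherever the function is called.

```python
def log_example_run(tracking_uri: str, learning_rate: float, accuracy: float) -> str:
    """Log one parameter and one metric to an MLflow server; return the run ID.

    Hypothetical helper for illustration only.
    """
    # Imported lazily so that defining the function does not require mlflow.
    import mlflow

    # Point the client at the tracking server (for example one hosted on Rahti).
    mlflow.set_tracking_uri(tracking_uri)
    mlflow.set_experiment("csc-demo")  # hypothetical experiment name

    with mlflow.start_run() as run:
        mlflow.log_param("learning_rate", learning_rate)
        mlflow.log_metric("accuracy", accuracy)
        return run.info.run_id


# Example call; the URL is a placeholder, replace it with your own server address:
# run_id = log_example_run("https://my-mlflow.example.org", 0.001, 0.93)
```

Passing the server address as an argument keeps the same script usable against a local test server and a deployed one.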