TensorFlow

Deep learning framework for Python.

News

5.5.2022 Due to Mahti's update to Red Hat Enterprise Linux 8 (RHEL8), the number of fully supported TensorFlow versions has been reduced. Please contact our servicedesk if you really need access to other versions.

4.2.2022 All old TensorFlow versions which were based on direct Conda installations have been deprecated, and we encourage users to move to newer versions. Read more on our separate Conda deprecation page.

Available

Currently supported TensorFlow versions:

Version | Module | Puhti | Mahti | Environ. | Horovod | Notes
------- | ------ | ----- | ----- | -------- | ------- | -----
2.8.0 | tensorflow/2.8 | X | X | Sing. | X | default version
2.7.0 | tensorflow/2.7 | X | (X) | Sing. | X |
2.6.0 | tensorflow/2.6 | X | (X) | Sing. | X |
2.5.0 | tensorflow/2.5 | X | (X) | Sing. | X |
2.4.1 | tensorflow/2.4 | X | (X) | Sing. | X |
2.4.0 | tensorflow/2.4-hvd | X | - | Conda | X | deprecated
2.4.0 | tensorflow/2.4-sng | X | - | Sing. | - |
2.3.1 | tensorflow/nvidia-20.12-tf2-py3 | X | - | Sing. | - |
2.3.0 | tensorflow/2.3 | X | - | Sing. | - |
2.2.0 | tensorflow/nvidia-20.07-tf2-py3 | X | - | Sing. | X | experimental Horovod support
2.2.0 | tensorflow/2.2-hvd | X | - | Conda | X | deprecated
2.2.0 | tensorflow/2.2 | X | - | Sing. | - |
2.1.0 | tensorflow/nvidia-20.03-tf2-py3 | X | - | Sing. | - |
2.1.0 | tensorflow/nvidia-20.02-tf2-py3 | X | - | Sing. | - |
2.0.0 | tensorflow/nvidia-19.11-tf2-py3 | X | - | Sing. | - |
2.0.0 | tensorflow/2.0.0 | X | - | Conda | - | deprecated
2.0.0 | tensorflow/2.0.0-hvd | X | - | Conda | X | deprecated
1.15.5 | tensorflow/1.15 | X | - | Sing. | X |
1.15.0 | tensorflow/1.15-hvd | X | - | Conda | X | deprecated
1.14.0 | tensorflow/1.14.0 | X | - | Conda | - | deprecated
1.14.0 | tensorflow/1.14.0-cpu | X | - | Conda | - | deprecated, optimized for CPU
1.13.1 | tensorflow/1.13.1 | X | - | Conda | - | deprecated
1.13.1 | tensorflow/1.13.1-hvd | X | - | Conda | X | deprecated

Includes TensorFlow and Keras with GPU support via CUDA.

Versions marked with "(X)" on Mahti are based on old Red Hat Enterprise Linux 7 (RHEL7) images and are no longer fully supported. In particular, MPI and Horovod no longer work on Mahti with these modules. If you still wish to access these versions, you need to enable the old RHEL7 modules by running module use /appl/soft/ai/rhel7/modulefiles/.
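
For example, to use one of the RHEL7-based versions on Mahti (tensorflow/2.7 here is just one of the "(X)" versions from the table above):

# Make the old RHEL7-based modules visible (Mahti only)
module use /appl/soft/ai/rhel7/modulefiles/
# Then load the RHEL7-era version you need
module load tensorflow/2.7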

Modules starting with nvidia are based on NVIDIA's optimized container images from NGC, with some CSC-specific additions. See NVIDIA's TensorFlow container release notes for more information on the software versions provided.

If you find that some package is missing, you can often install it yourself with pip install --user. See our Python documentation for more information on how to install packages yourself. If you think that some important TensorFlow-related package should be included in the module provided by CSC, please contact our servicedesk.
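
As a rough sketch, installing an extra package on top of a loaded module could look like this (some-package is only a placeholder for the package you actually need):

module load tensorflow/2.8
# Installs into your home directory (~/.local) instead of the module itself
pip install --user some-package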

Some modules are Singularity-based (indicated in the "Environ." column in the table above). Wrapper scripts have been provided so that common commands such as python, python3, pip and pip3 should work as normal. For more information, see CSC's general instructions on how to run Singularity containers.
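
For example, after loading a Singularity-based module you can call the wrappers directly; the one-liner below is just a quick way to confirm which TensorFlow version the container provides:

module load tensorflow/2.8
# The python3 wrapper runs transparently inside the container
python3 -c "import tensorflow as tf; print(tf.__version__)"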

Some modules support Horovod, which is our recommended framework for multi-node jobs, i.e., jobs needing more than 4 GPUs. Horovod can also be used with single-node jobs for 2-4 GPUs. For more information, read the Multi-GPU section in our machine learning guide.
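
As a rough sketch (not a substitute for the guide), a two-node, eight-GPU Horovod job on Puhti could be reserved along these lines, with one task per GPU:

#!/bin/bash
#SBATCH --account=<project>
#SBATCH --partition=gpu
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4      # one MPI task per GPU
#SBATCH --cpus-per-task=10
#SBATCH --mem=256G               # roughly four times the single-GPU example below
#SBATCH --time=1:00:00
#SBATCH --gres=gpu:v100:4        # all four GPUs of each node

module load tensorflow/2.8
srun python3 myprog.py <options>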

License

TensorFlow is licensed under Apache License 2.0.

Usage

To use this software on Puhti or Mahti, initialize it with:

module load tensorflow

to access the default version, or if you wish to have a specific version (see above for available versions):

module load tensorflow/2.4

Please note that the module already includes CUDA and cuDNN libraries, so there is no need to load cuda and cudnn modules separately!
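
As a quick sanity check (run inside a GPU job, not on a login node), you can verify that TensorFlow sees the GPU without loading any extra CUDA modules:

module load tensorflow
# Should print a non-empty list of GPU devices inside a GPU job
python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"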

This command will also show all available versions:

module avail tensorflow

To check the exact packages and versions included in the loaded module, run:

list-packages
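
For example, to see only the lines of that listing mentioning Keras or Horovod:

module load tensorflow/2.8
list-packages | grep -iE "keras|horovod"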

Note

Note that Puhti login nodes are not intended for heavy computing; please use Slurm batch jobs instead. See our instructions on how to use the batch job system.

Example batch script

Example batch scripts for reserving one GPU and 1/4 of the available CPU cores in a single node.

Puhti:

#!/bin/bash
#SBATCH --account=<project>
#SBATCH --partition=gpu
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=10
#SBATCH --mem=64G
#SBATCH --time=1:00:00
#SBATCH --gres=gpu:v100:1

module load tensorflow/2.8
srun python3 myprog.py <options>

Mahti:

#!/bin/bash
#SBATCH --account=<project>
#SBATCH --partition=gpusmall
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=32
#SBATCH --time=1:00:00
#SBATCH --gres=gpu:a100:1

module load tensorflow/2.8
srun python3 myprog.py <options>
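
The scripts are submitted with sbatch as usual, for example (myjob.sh and <jobid> are placeholders):

sbatch myjob.sh      # submit the batch script
squeue -u $USER      # check the state of your queued and running jobs
seff <jobid>         # after the job has finished, check its resource usage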

Note

Please do not read a huge number of files from the shared file system; use the fast local disk or package your data into larger files instead! See the Data storage section in our machine learning guide for more details.
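
As a sketch of the fast local disk approach on Puhti (the nvme resource request and the $LOCAL_SCRATCH variable are Puhti conventions; dataset.tar and the --data_dir option are placeholders), the Puhti script above could be extended like this:

#SBATCH --gres=gpu:v100:1,nvme:100      # also request 100 GB of node-local NVMe disk

# Unpack the dataset from the shared file system to the fast local disk once,
# then read the individual files from there during training
tar xf /scratch/<project>/dataset.tar -C $LOCAL_SCRATCH
srun python3 myprog.py --data_dir=$LOCAL_SCRATCH/dataset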

Big datasets, multi-GPU and multi-node jobs

Please see our Machine learning guide, which covers more advanced topics, including efficient GPU utilization, working with big datasets, and multi-GPU and multi-node jobs.
