Using PyTorch on Berzelius

Introduction

PyTorch is an open-source machine learning library for Python that provides a flexible and efficient framework for building and training deep neural networks. It is widely used for tasks such as natural language processing (NLP), computer vision, and reinforcement learning.

You can install PyTorch on Berzelius in several ways: load a pre-installed module, install it yourself with Conda/Mamba or pip following the official installation instructions, or run it inside an Apptainer container.

Loading PyTorch as a Module

module load PyTorch/2.3.0-python-3.10-hpc1

To check if PyTorch detects the GPU:

python -c "import torch; print('GPU available: ' + str(torch.cuda.is_available()))"

Installing PyTorch via Conda/Mamba

We recommend following PyTorch’s official installation instructions. It’s a good practice to specify the versions of both the main package (PyTorch, in this case) and Python during installation to ensure compatibility.

module load Mambaforge/23.3.1-1-hpc1-bdist
mamba create --name pytorch-2.5.0 python=3.10
mamba activate pytorch-2.5.0
mamba install pytorch==2.5.0 torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
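
Once the environment is activated, the same GPU check as for the module can be used to verify the installation (run it on a node with an allocated GPU):

python -c "import torch; print(torch.__version__); print('GPU available: ' + str(torch.cuda.is_available()))"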

Installation instructions for previous versions of PyTorch can be found here.

Installing PyTorch via pip

module load Mambaforge/23.3.1-1-hpc1-bdist
mamba create --name pytorch-2.5.0 python=3.10
mamba activate pytorch-2.5.0
pip3 install torch==2.5.0 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

Installing PyTorch via Apptainer

We can build an Apptainer image using the following definition file, pytorch_2.0.1.def. To learn more, refer to the Apptainer User Guide.

Bootstrap: docker
From: nvidia/cuda:11.7.1-cudnn8-devel-ubuntu22.04

%environment

# Make Mambaforge available in the container and ignore user site-packages in ~/.local
export PATH=/opt/mambaforge/bin:$PATH
export PYTHONNOUSERSITE=True

%post

# Install basic tools for downloading and editing files
apt-get update && apt-get install -y --no-install-recommends \
git \
nano \
wget \
curl

# Install Mambaforge
cd /tmp
curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-$(uname)-$(uname -m).sh"
bash Mambaforge-$(uname)-$(uname -m).sh -fp /opt/mambaforge -b
rm Mambaforge*sh

export PATH=/opt/mambaforge/bin:$PATH

# Install PyTorch 2.0.1 built against CUDA 11.7 from the pytorch and nvidia channels
mamba install python=3.10 pytorch==2.0.1 torchvision torchaudio torchdata torchtext pytorch-cuda=11.7 -c pytorch -c nvidia -y

# Pin the PyTorch version so later installs cannot change it
cat <<EOT > /opt/mambaforge/conda-meta/pinned
pytorch==2.0.1
EOT

# Extra packages for plotting and Jupyter notebooks
mamba install matplotlib jupyterlab -y

We build the image from the definition file:

apptainer build pytorch_2.0.1 pytorch_2.0.1.def
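
Once built, the image can be run with the --nv flag so that the host's NVIDIA driver and GPUs are available inside the container. For example, on a node with an allocated GPU:

apptainer exec --nv pytorch_2.0.1 python -c "import torch; print('GPU available: ' + str(torch.cuda.is_available()))"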

The Apptainer image can be easily extended with more packages and software by modifying the definition file and rebuilding the image.
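
For example, a package could be added with an extra line in the %post section and a rebuild (transformers here is only an illustrative choice):

pip install transformers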

Performance Optimization

We expect jobs to make proper use of the GPUs on Berzelius. Particularly inefficient jobs will be terminated automatically. Please read the Berzelius GPU Usage Efficiency Policy for more details.

There are many profiling tools that let you analyze the runtime behavior of your Python code, identify bottlenecks, and optimize performance.

One example workflow of code optimization is as follows.

  1. Use line_profiler to identify the bottleneck (a sketch follows this list).

  2. Locate the most inefficient part of your code and optimize it.

  3. Rerun the code and profile again to confirm the improvement.
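
As a minimal sketch of step 1 (assuming line_profiler is installed in the active environment, e.g. with pip install line_profiler, and that train_step.py is a hypothetical script of your own), decorate the function you want to measure and run it through kernprof:

# train_step.py -- hypothetical example script
import torch

@profile  # 'profile' is injected by kernprof at runtime; no import is needed
def train_step(model, batch):
    output = model(batch)        # forward pass
    loss = output.pow(2).mean()  # dummy loss
    loss.backward()              # backward pass
    return loss.item()

if __name__ == "__main__":
    model = torch.nn.Linear(512, 512).cuda()
    batch = torch.randn(64, 512, device="cuda")
    for _ in range(100):
        model.zero_grad()
        train_step(model, batch)

kernprof -l -v train_step.py

kernprof prints per-line timings for the decorated function. Keep in mind that CUDA kernels run asynchronously, so time tends to be attributed to the first synchronizing call (here .item()); inserting torch.cuda.synchronize() calls gives more faithful per-line numbers.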

Please read the PyTorch Performance Tuning Guide for optimizations that can accelerate training and inference of deep learning models in PyTorch.
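
As an illustration of the kind of changes that guide recommends (a sketch with a placeholder model, dataset, and hyperparameters, not a drop-in recipe), cuDNN autotuning, asynchronous data loading, and automatic mixed precision are common starting points:

import torch
from torch.utils.data import DataLoader, TensorDataset

torch.backends.cudnn.benchmark = True  # let cuDNN pick the fastest kernels for fixed input shapes

# Placeholder data: 10,000 random feature vectors with labels from 10 classes
dataset = TensorDataset(torch.randn(10_000, 512), torch.randint(0, 10, (10_000,)))
loader = DataLoader(dataset, batch_size=256, num_workers=4, pin_memory=True)  # overlap data loading with compute

model = torch.nn.Linear(512, 10).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.cuda.amp.GradScaler()  # keeps fp16 gradients in a representable range

for inputs, targets in loader:
    inputs = inputs.cuda(non_blocking=True)    # asynchronous host-to-device copies
    targets = targets.cuda(non_blocking=True)  # (enabled by pin_memory=True)
    optimizer.zero_grad(set_to_none=True)      # cheaper than zeroing gradient tensors
    with torch.cuda.amp.autocast():            # run the forward pass in mixed precision
        loss = torch.nn.functional.cross_entropy(model(inputs), targets)
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()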

