TensorFlow is an open-source machine learning library for Python that provides a flexible and efficient framework for building and training deep neural networks. It is widely used across machine learning tasks such as natural language processing (NLP), computer vision, and reinforcement learning.
On Berzelius, TensorFlow can be installed in several ways, including with Conda/Mamba, with pip, or via Apptainer containers. The official TensorFlow installation guide recommends pip as the primary method.
It’s considered good practice to specify the version of the main package to install—in this case, TensorFlow—to ensure compatibility and reproducibility.
module load Miniforge3/24.7.1-2-hpc1-bdist
mamba create -n tensorflow-2.17.0-python-3.10 python=3.10
mamba activate tensorflow-2.17.0-python-3.10
mamba install tensorflow==2.17.0 "cuda-version=12.0"
To check if TensorFlow detects the GPU:
python -c "import tensorflow as tf; print('GPU available: ' + str(tf.config.list_physical_devices('GPU')))"
module load Miniforge3/24.7.1-2-hpc1-bdist
mamba create -n tensorflow-2.17.0-python-3.10 python=3.10
mamba activate tensorflow-2.17.0-python-3.10
pip install tensorflow[and-cuda]==2.17.0
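As with the Conda/Mamba environment, it is worth verifying that the pip-installed TensorFlow can see the GPUs and place a computation on them. A minimal sanity check (a sketch, not part of the official instructions) could look like this:
import tensorflow as tf

# List visible GPUs; an empty list means TensorFlow only sees the CPU.
print("GPUs:", tf.config.list_physical_devices("GPU"))

# Run a small matrix multiplication; with a GPU available it is placed there by default.
a = tf.random.normal((1000, 1000))
b = tf.random.normal((1000, 1000))
print("Result device:", tf.matmul(a, b).device)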
We can build an Apptainer image using the following definition file, tensorflow_2.11.1.def. To learn more, refer to the Apptainer User Guide.
Bootstrap: docker
From: nvidia/cuda:11.2.1-cudnn8-devel-ubuntu20.04
%environment
export PATH=/opt/mambaforge/bin:$PATH
export PYTHONNOUSERSITE=True
%post
apt-get update && apt-get install -y --no-install-recommends \
git \
nano \
wget \
curl
# Install Mambaforge
cd /tmp
curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-$(uname)-$(uname -m).sh"
bash Mambaforge-$(uname)-$(uname -m).sh -fp /opt/mambaforge -b
rm Mambaforge*sh
export PATH=/opt/mambaforge/bin:$PATH
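# CONDA_OVERRIDE_CUDA makes conda-forge treat CUDA 11.2 as available, since no GPU is visible during the image build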
CONDA_OVERRIDE_CUDA="11.2" mamba install tensorflow==2.11.1 cudatoolkit=11.2 -c conda-forge -y
# Pin packages
cat <<EOT > /opt/mambaforge/conda-meta/pinned
tensorflow==2.11.1
EOT
mamba install matplotlib jupyterlab -y
We build the image from the definition file:
apptainer build tensorflow_2.11.1.sif tensorflow_2.11.1.def
The Apptainer image can be easily extended with more packages and software by modifying the definition file and rebuilding the image.
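Once the image is built, GPU support inside the container is enabled with Apptainer's --nv flag, which binds the host NVIDIA driver into the container. A hypothetical run (train.py is a placeholder for your own script) could look like:
apptainer exec --nv tensorflow_2.11.1.sif python train.py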
We expect jobs to make efficient use of the GPUs on Berzelius; particularly inefficient jobs will be terminated automatically. Please read the Berzelius GPU Usage Efficiency Policy for more details.
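A simple way to check whether a running job actually uses the GPUs is to watch the utilization with nvidia-smi on the allocated compute node, for example sampling every 10 seconds:
nvidia-smi --query-gpu=index,utilization.gpu,memory.used --format=csv -l 10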
There are many profiling tools that allow you to analyze the runtime behavior of your Python code, identify bottlenecks, and optimize performance.
One example workflow for code optimization is as follows:
1. Use line_profiler to identify the bottleneck (see the example after this list).
2. Locate the most inefficient part of your code and optimize it.
3. Rerun the code.
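As an illustration of the first step, line_profiler reports the time spent on every line of the functions you decorate with @profile; the decorator is injected by kernprof at run time, and the function below is only a placeholder:
# example.py
@profile  # provided by kernprof at run time; do not import it
def slow_function(data):
    squares = [x * x for x in data]  # placeholder for real work
    return sum(squares)

if __name__ == "__main__":
    slow_function(range(1_000_000))
Running kernprof -l -v example.py then prints a per-line timing report that points to the most expensive lines.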
Please read the TensorFlow Performance best practices guide for optimizations that can accelerate training and inference of deep learning models in TensorFlow.
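As an illustration of two optimizations from that guide, the sketch below enables mixed precision and overlaps input preparation with training by prefetching in the tf.data pipeline; the dataset and model are placeholders, not part of the official example:
import tensorflow as tf

# Mixed precision: compute in float16 on the GPU while keeping float32 variables.
tf.keras.mixed_precision.set_global_policy("mixed_float16")

# A placeholder in-memory dataset; replace with your real input pipeline.
features = tf.random.normal((1024, 32))
labels = tf.random.uniform((1024,), maxval=10, dtype=tf.int32)
dataset = (
    tf.data.Dataset.from_tensor_slices((features, labels))
    .batch(64)
    .prefetch(tf.data.AUTOTUNE)  # overlap data preparation with training
)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, dtype="float32"),  # keep the output layer in float32
])
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
model.fit(dataset, epochs=1)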