Python installations at NSC

NSC's clusters have the CentOS standard Python 2 installed. That is the Python version you will get by default when you log in to one of our clusters:

$ which python
/usr/bin/python

On CentOS 7, it is Python 2.7.5, while on CentOS 6 it is Python 2.6.6. We also have Python 3 from the EPEL repositories, at the moment 3.6.8 on CentOS 7 and 3.4.10 on CentOS 6, installed as /usr/bin/python3.

We do not attempt to install a complete set of scientific computing packages for the CentOS/EPEL system Python installations.

To get access to a recent version of Python together with recent versions of the usual scientific libraries such as NumPy, SciPy, Matplotlib and Pandas, we recommend that you load one of our Python modules. For example, on Tetralith, we currently have:

$ module avail Python
...
Python/2.7.14-anaconda-5.0.1-nsc1
Python/2.7.14-nsc1-gcc-2018a-eb
Python/2.7.14-nsc1-intel-2018a-eb
Python/2.7.15-anaconda-5.3.0-extras-nsc1
Python/2.7.15-env-nsc1-gcc-2018a-eb
Python/3.6.3-anaconda-5.0.1-nsc1
Python/3.6.4-nsc1-intel-2018a-eb
Python/3.6.4-nsc2-intel-2018a-eb
Python/3.6.7-env-nsc1-gcc-2018a-eb
Python/3.7.0-anaconda-5.3.0-extras-nsc1
...

After loading a suitable module, you will have a new Python installation in your PATH, where things like NumPy will work:

$ module load Python/3.7.0-anaconda-5.3.0-extras-nsc1
$ python
Python 3.7.0 (default, Jun 28 2018, 13:15:42)
[GCC 7.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> import numpy
>>> numpy.linspace(0, 2, 9)
array([0.  , 0.25, 0.5 , 0.75, 1.  , 1.25, 1.5 , 1.75, 2.  ])
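To check from inside a script which Python is actually running (the system one or one from a module), a small helper along these lines can be used. This is a minimal sketch, not an NSC tool: the function name interpreter_summary is hypothetical, and it assumes that the system Python is the one under /usr/bin, as in the "which python" output above.

```python
import sys


def interpreter_summary():
    """Describe the Python that is running this script."""
    version = "%d.%d.%d" % sys.version_info[:3]
    # Assumption: the system Python lives under /usr/bin, as shown by
    # "which python" before any module is loaded.
    is_system = sys.executable.startswith("/usr/bin/")
    return {"executable": sys.executable, "version": version, "system": is_system}


if __name__ == "__main__":
    summary = interpreter_summary()
    print("Running Python %(version)s from %(executable)s" % summary)
    if summary["system"]:
        print("This looks like the system Python; no module seems to be in use.")
```

Running this once before and once after "module load" should show the executable path change.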

The modules that contain "anaconda" in the name are based on the popular Anaconda Python distribution while the modules that contain "-eb" are built using the EasyBuild build system that is used for many other software packages on NSC CentOS 7 clusters.

Check available packages in a module

If you are looking for a specific Python package, then first load a Python module. Some guidelines to help you choose a module:

  1. There are generally more packages included in the Anaconda installations
  2. If there are modules that only differ in the -nsc build/installation tag, then choose the one with the highest integer (e.g. nsc2 rather than nsc1)

Anaconda modules

To list the installed packages in an Anaconda Python installation, simply load the module and run conda list. If you are looking for a specific package, then pipe the output from conda list to grep:

$ module load Python/3.6.3-anaconda-5.0.1-nsc1
$ conda list | grep -i scipy
scipy                     0.19.1           py36h9976243_3

More info: conda, conda list

NSC build modules

To list the installed packages in an NSC build Python installation, just load the module and run pip list. If you are looking for a specific package, then pipe the output from pip list to grep:

$ module load Python/3.6.4-nsc2-intel-2018a-eb
$ pip list --format=legacy | grep -i scipy
scipy (1.0.0)

More info: pip, pip list
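For either kind of module, you can also ask Python itself whether a package can be imported, without parsing the output of conda list or pip list. The following is a minimal sketch that assumes Python 3.4 or later; the helper name find_packages and the package names in the example are illustrative only. Note that it takes the import name, which can differ from the name used when installing (for example, python-hostlist is imported as hostlist).

```python
import importlib.util


def find_packages(names):
    """Map each top-level import name to True if this Python can import it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # The package names are examples only; replace them with the ones you need.
    for name, found in find_packages(["numpy", "scipy", "pandas"]).items():
        print("%-10s %s" % (name, "available" if found else "missing"))
```

The check only looks the package up and does not import it, so it is quick even for large libraries.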

Customizing your Python using conda

The NSC modules typically provide a set of Python packages that might be difficult for users to install themselves, such as optimized versions of NumPy and SciPy, but for technical reasons, we cannot install all the packages that everyone needs in the same installation. Instead, we recommend that you install extra packages in your own home directory.

If you use one of the Anaconda-based modules (contains "anaconda" in the name), you can use "conda create" to create a customized Python environment with exactly the packages (and versions) you need. A basic example, assuming you want the latest Python 3 together with the pandas and seaborn packages:

$ module load Python/3.7.0-anaconda-5.3.0-extras-nsc1
$ conda create -n myownenv python=3 pandas seaborn
$ source activate myownenv
$ which python
~/.conda/envs/myownenv/bin/python
$ python
Python 3.7.3 (default, Mar 27 2019, 22:11:17)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas
>>> pandas.__version__
'0.24.2'

The next time you log in, you do not have to create the environment again, but can activate it using:

$ module load Python/3.7.0-anaconda-5.3.0-extras-nsc1
$ source activate myownenv

You can install additional packages using "conda install" inside your environment, for example:

$ conda install cython

If you cannot install a particular package using "conda install", it is still possible to use "pip install" in your environment. As you already have an environment you can write to, you should not use "--user" in that case. An example:

$ pip install python-hostlist
$ python
...
>>> import hostlist
>>> hostlist.__file__
'/home/x_abcde/.conda/envs/myownenv/lib/python3.7/site-packages/hostlist.py'
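To confirm from within Python that the interpreter in use really belongs to your activated environment, you can compare sys.prefix with the environment's location. The sketch below assumes that conda sets the CONDA_PREFIX and CONDA_DEFAULT_ENV environment variables on activation, which may depend on the conda version in the module; the function names are hypothetical.

```python
import os
import sys


def active_conda_env():
    """Return (name, prefix) of the activated conda environment, or None."""
    # Assumption: conda exports CONDA_PREFIX when an environment is activated.
    prefix = os.environ.get("CONDA_PREFIX")
    if not prefix:
        return None
    name = os.environ.get("CONDA_DEFAULT_ENV", os.path.basename(prefix))
    return name, prefix


def running_inside(prefix):
    """True if the running interpreter belongs to the environment at prefix."""
    return os.path.realpath(sys.prefix) == os.path.realpath(prefix)


if __name__ == "__main__":
    env = active_conda_env()
    if env is None:
        print("No conda environment is activated.")
    else:
        name, prefix = env
        print("Activated environment: %s (%s)" % (name, prefix))
        print("Interpreter belongs to it: %s" % running_inside(prefix))
```

If the last line reports False, the script was probably started with a different python than the one in the environment, for example through a hard-coded first line.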

Conda environments outside of the home directory

You might want to create a conda environment outside of your home directory (the default when using "-n NAME" is to create it as $HOME/.conda/envs/NAME).

One reason for this could be to save space on the /home filesystem where the quota is more restrictive. Another reason could be that you want to curate a Python environment used by a group of fellow users, and want to make it less dependent on your home directory.

To accomplish this, use "-p PREFIX" instead of "-n NAME" when creating your environment and use the full prefix when activating. So, the start of the example above would become

$ module load Python/3.7.0-anaconda-5.3.0-extras-nsc1
$ conda create -p /proj/ourprojname/pythonenvs/test1 python=3 pandas seaborn
$ source activate /proj/ourprojname/pythonenvs/test1

assuming you want your environment in /proj/ourprojname/pythonenvs/test1 (and also assuming for this example that /proj/ourprojname/pythonenvs has been created and you have write access).

Customizing your Python using virtualenv

You can also use the Python standard "virtualenv" mechanism to customize your Python and install additional packages. This works for all Python modules as well as the system Python. In this example, we base our virtual environment on Python/3.7.0-anaconda-5.3.0-extras-nsc1 and use "--system-site-packages" to have access to all the packages installed there instead of starting from scratch:

$ module load Python/3.7.0-anaconda-5.3.0-extras-nsc1
$ virtualenv --system-site-packages myownvirtualenv
$ source myownvirtualenv/bin/activate
$ pip install python-hostlist
$ python
Python 3.7.0 (default, Jun 28 2018, 13:15:42)
[GCC 7.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import hostlist
>>> hostlist.__file__
'/home/kent/myownvirtualenv/lib/python3.7/site-packages/hostlist.py'

The next time you log in, you do not have to create the environment again, but can activate it using:

$ source myownvirtualenv/bin/activate
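If you are unsure whether a script runs inside the virtual environment, Python can tell you. The following is a minimal sketch with a hypothetical function name; it relies on the sys.real_prefix attribute set by older virtualenv releases and on sys.base_prefix, which newer virtualenv releases and the standard venv module use. It does not detect conda environments.

```python
import sys


def in_virtual_environment():
    """True if this interpreter runs inside a virtualenv or venv environment."""
    # Older virtualenv releases set sys.real_prefix; newer virtualenv and the
    # standard-library venv instead make sys.base_prefix differ from sys.prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    print("Inside a virtual environment: %s" % in_virtual_environment())
    print("Packages are installed under: %s" % sys.prefix)
```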

When not to use conda environments

If you need to install a Python package that requires compiling, then you should not use a conda environment!

In this case, you can try to use a virtual environment based on one of the NSC build Python modules and if that fails, then contact support for help.

How do I control which version of Python my scripts use?

If a script demands the system Python using a first line like this

#!/usr/bin/python

you can change it to

#!/usr/bin/env python

to make the script pick up the Python from the currently loaded module (or your activated environment, if you have one).

If you want to "lock" a script to a specific Python version, figure out the full path to the desired python binary (use "which python") and use that instead. For example, using one of the Anaconda modules:

$ module load Python/3.7.0-anaconda-5.3.0-extras-nsc1
$ which python
/software/sse/easybuild/prefix/software/Anaconda3/5.3.0-extras-nsc1/bin-wrapped/python

Thus, to always use this Python for the script, make the first line read

#!/software/sse/easybuild/prefix/software/Anaconda3/5.3.0-extras-nsc1/bin-wrapped/python
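The lookup can also be done from Python. The sketch below builds such a first line from whatever "python" is found first in PATH, using shutil.which (available from Python 3.3), which mirrors "which python" in the shell. The function name locked_shebang is hypothetical.

```python
import shutil


def locked_shebang(command="python"):
    """Build a first line that pins a script to the command found first in PATH."""
    path = shutil.which(command)  # mirrors "which python" in the shell
    if path is None:
        raise RuntimeError("no %r found in PATH" % command)
    return "#!" + path


if __name__ == "__main__":
    try:
        print(locked_shebang())
    except RuntimeError as error:
        print(error)
```

Run it after loading the module (or activating the environment) you want to lock the script to, then paste the printed line at the top of the script.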

Why doesn't my Python program write to slurm.out?

To see the output from a Python script in a running job in real time, you have to instruct Python not to buffer its output. Otherwise, the output from your script is written to the slurm.out file only when the buffer is full or the job has finished. To get the expected behavior, add the -u command line flag when you start the script to request unbuffered mode:

python -u myscript.py

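For an executable script, you can add the flag to the first line, but only when that line gives the full path to the Python binary. On Linux, everything after the interpreter path is passed as a single argument, so "#!/usr/bin/env python -u" makes env look for a program called "python -u" and fail. For example, using the system Python:

#!/usr/bin/python -u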
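If you would rather not change how the script is started, you can flush the output explicitly from within the script. This is a minimal sketch for Python 3 (the flush argument to print was added in Python 3.3); the helper name report and the loop standing in for real work are illustrative only.

```python
import sys
import time


def report(message, stream=None):
    """Write one progress line and flush it so it reaches the job output at once."""
    stream = sys.stdout if stream is None else stream
    print(message, file=stream, flush=True)


if __name__ == "__main__":
    for step in range(3):
        report("finished step %d" % step)
        time.sleep(0.1)  # stands in for real work
```

Only the lines written through report are flushed immediately; output from ordinary print calls elsewhere in the script is still buffered unless -u is used.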
