This is not the final version of this page. If you cannot find the information you are looking for, please check back later or ask NSC Support.

What has happened?

Between November 13th and January 8th, the Tetralith login and compute nodes were upgraded from CentOS 7 to Rocky Linux 9.

The changes were kept to a minimum. You can still use SSH or Thinlinc to log in, you can still submit your jobs to Slurm, etc.

Storage (/proj, /home) is not affected (i.e. there is no need to move your data).

What will happen?

  • During early 2024 (February to April): a 1-2 day stop to upgrade the cluster system servers (job scheduler, etc.) and do some other required maintenance. Date not decided yet.

What do I need to do?

To continue using Tetralith without interruption, you should move your jobs to the upgraded part of Tetralith sometime between November 13th and January 8th.

You yourself choose when to do this, but we recommend that you do it sooner rather than later to leave as much time as possible for you (and possibly NSC) to fix any problems you may encounter.

Typically, it will look something like this:

  1. Make some basic tests to see what you need to change and verify that at least one job seems to start correctly in the upgraded part of Tetralith (i.e. when submitted from tetralith-el9.nsc.liu.se (hostname "tetralith2")).
  2. Leave existing running and queued jobs as-is (they can run in the non-upgraded part until January 8th). You may of course cancel queued jobs and resubmit them in the new part of the system if you prefer.
  3. Submit any new jobs from tetralith-el9.nsc.liu.se (hostname "tetralith2") with possibly modified settings/job scripts/modules/...
  4. Monitor the first few new jobs and once they have completed, check that the results are as expected.
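The basic test in step 1 might look something like this (the username, job script name and time limit below are placeholders; adjust them to your own setup):

```shell
# Log in to the upgraded login node (replace "x_user" with your username)
ssh x_user@tetralith-el9.nsc.liu.se

# Check whether the modules you need are available on Rocky Linux 9
module avail

# Submit a short test job and check that it starts
sbatch --time=00:10:00 test_job.sh
squeue -u $USER
```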

Once your jobs are working in the upgraded part of Tetralith, please stop using the non-upgraded part. NSC will monitor the demand for nodes in both parts of the system and adjust the number of nodes in each part accordingly, i.e. it will not be easier to run jobs in the old part.

Software migration guide

This section covers how to get software you have been running on CentOS 7 to run on Rocky Linux 9.

Applications provided by NSC

Most applications provided by NSC will continue to work.

Initially (i.e on November 13th) we will provide the latest and most commonly used versions of the most common applications.

Additional applications and versions will be added later, but only if requested via NSC Support.

If you don't find your application/version in the output from module avail (on upgraded nodes) and it's not mentioned on this page already, please contact NSC Support and ask about it.

Some of the NSC-provided applications will run natively in the new operating system, some will have been recompiled and others will be run in a CentOS 7 container using Apptainer. We aim to make all this as transparent to the user as possible, but minor adjustments to job scripts may be needed (and documented on this page).

Compiled applications that you or your project have installed yourself on CentOS 7 to run on compute nodes

If you or someone in your project has built or installed your own application, you will need to choose a suitable way to run the application in the new environment.

There are several ways to do this (recompile, run as-is, run in a container). Which one to use depends on the application. NSC Support can assist you with this. Some documentation is provided below, more will follow later.

Inside a supercomputer job, please try the following steps (i.e., inside a submit script, or after starting an interactive session with interactive). Note that it is a good idea to verify not just that the application starts, but also to run an example job and check that it performs as expected.

  1. Run your application as usual, i.e., as mpprun <application>. Some applications just work without any additional steps. However, you may see error messages about "missing symbols" or "missing libraries". In that case, go to the next step.

  2. Try mpprun --compat el7 <application>. This runs your application inside a "compatibility configuration" designed to mimic CentOS 7 as closely as possible: your application runs, while still leveraging modern OS features, inside an NSC-provided container image of the old CentOS 7 OS that mimics the old software environment with very high fidelity. (Note: if you have already recompiled your binary on Rocky 9, you should NOT run it with the --compat el7 flag.)

  3. If step 2 also fails, the next step is to rebuild the application with one of the build environments provided in the Rocky Linux 9 environment. Load an appropriate buildenv-<something> module and follow the usual instructions for how to install software. Note: make sure to completely rebuild the application, i.e., run make distclean, make clean, or the equivalent as the first step so that all components are rebuilt from scratch.
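As a sketch, the first two steps above could be tried in a job script along these lines (the job name, resource settings, application name and input file are placeholders for your own):

```shell
#!/bin/bash
#SBATCH -J migration-test
#SBATCH -t 00:30:00
#SBATCH -n 32

# Step 1: run as before; many applications work unchanged
mpprun ./my_app input.dat

# Step 2: on "missing symbols"/"missing libraries" errors,
# retry inside the CentOS 7 compatibility container instead
mpprun --compat el7 ./my_app input.dat
```

For step 3, a rebuild would start from something like "module load buildenv-gcc/<version>" (pick a version from module avail) followed by make distclean and a fresh build.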

Compiled applications that you or your project have installed yourself on CentOS 7 to run on the login nodes

If you or your project have installed a software application that you would like to use on the login node for data analysis etc., please try the following steps:

  1. Try to run your application as usual, e.g., as ./<application>. Some applications just work without any additional steps. However, you may see error messages about "missing symbols" or "missing libraries". In that case, go to the next step.

  2. Try hpc_compat_el7 <application>. This runs your application inside a "compatibility configuration" designed to mimic CentOS 7 as closely as possible. The compatibility solution is the same as that used for mpprun --compat el7. (Note: if you have already recompiled your binary on Rocky 9, you should NOT run it via hpc_compat_el7.)
  3. If step 2 also fails, the next step is to rebuild the application with one of the build environments provided in the Rocky Linux 9 environment. Load an appropriate buildenv-<something> module and follow the usual instructions for how to install software. Note: make sure to completely rebuild the application, i.e., run make distclean, make clean, or the equivalent as the first step so that all components are rebuilt from scratch.
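On the login node, the first two steps can be tried directly from the shell (the program name below is a placeholder):

```shell
# Step 1: run the application as usual
./my_tool

# Step 2: if it fails with missing symbols or libraries,
# retry via the CentOS 7 compatibility wrapper
hpc_compat_el7 ./my_tool
```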

If your application includes a GUI:

  • 2D graphics: test using the "compatibility configuration" (step 2 above), if this fails please try to use an interactive session. If this also fails you may have to rebuild your application (step 3 above).

  • 3D graphics: your application will have to be rebuilt (step 3 above).

Python applications

The Rocky Linux 9 environment provides Anaconda and Condaforge modules that work the same way as those available in CentOS 7. You should be able to load these modules and just keep using the conda environments you installed under CentOS 7.

If you instead used the modules provided under Python/, the Rocky Linux 9 environment provides a similar Python 3 module that you can try. However, if changes in the dependency libraries compared to the CentOS 7 modules cause issues, please set up the necessary environments using the Anaconda or Condaforge modules instead.
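For example, reusing an existing conda environment might look like this (the module version and environment name are placeholders; check module avail for the actual versions):

```shell
module load Anaconda/<version>
conda activate my-env          # an environment created under CentOS 7
python -c "import numpy"       # quick sanity check that imports still work
</imports>
```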

Updated mpprun

Note that the mpprun application differs substantially on Rocky Linux 9 compared to CentOS 7. Use mpprun -h to see the help, and the option mpprun -i <binary> can be helpful for power users to understand more precisely how mpprun will launch a binary through the various MPI-specific launchers.
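For example (the binary name is a placeholder):

```shell
mpprun -h            # show the updated help text
mpprun -i ./my_app   # show how mpprun would launch this binary
```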

Old build environments

The old CentOS 7 build environments, buildenv-<something>/<old-version>, will not be carried over to EL9, but newer replacement modules corresponding to them will be available. If you cannot make use of these refreshed modules, please contact NSC Support for assistance with your migration to the new environments.

Application performance

NSC has run applications on the upgraded part of Tetralith (recompiled, run as-is, and run through a CentOS 7 container) without seeing any significant negative performance impact. In fact, some applications even run faster.

If you see a significant performance loss in the upgraded part of Tetralith, please let NSC Support know as soon as possible.

If you cannot get your application working on Rocky 9

  1. Ask NSC Support for help. Do this early, do not wait until January 8th!
  2. If the application cannot be made to work even with assistance from NSC, we have the option to leave some non-upgraded compute nodes running for a while (but no longer than June 30th, 2024). However, only users that have asked for help and that NSC has not been able to help will be allowed to use such nodes. The number of such nodes will likely be very limited.

Some more technical details

Users can choose which part to use by logging in to the corresponding login node (using SSH or Thinlinc):

  • Upgraded/Rocky 9: tetralith-el9.nsc.liu.se (a.k.a tetralith2, available from November 13th)
  • Non-upgraded/CentOS 7: tetralith-el7.nsc.liu.se (a.k.a tetralith1, available until January 8th)

The "devel" reservation will contain CentOS 7 nodes but will be removed on December 12th. By the halfway point of the migration window there should be little or no need to test and develop CentOS 7 jobs.

For Rocky 9, nodes for test and development will be available in a reservation named "now". The same policy as for CentOS 7 (test and development only, max wall time 1 h, max 64 cores per user) applies, but might be relaxed later on.
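For example, a short development session in the "now" reservation could be requested like this (core count and time chosen within the stated limits):

```shell
# Interactive test/development session: 4 cores, 30 minutes
interactive --reservation=now -n 4 -t 00:30:00
```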

The "lsda" nodes were upgraded to EL9 on November 13th.

The number of upgraded compute nodes will gradually increase during the upgrade window. The rate at which this happens will be adjusted based on how fast users are moving their jobs.

Why are we doing this upgrade?

The current operating system CentOS 7 (which is based on RedHat Enterprise Linux 7) will not receive any security updates after 2024-06-30.

As Tetralith is planned to continue operating longer than this, possibly until late 2025, we need to upgrade the operating system.

What about Sigma?

The Sigma cluster was upgraded December 8th.

  • Known issues regarding the Tetralith OS upgrade to EL9
