During March-April 2024, the new OS can be tested by connecting to the stratus2 and cirrus2 login nodes. Those login nodes run the new OS, and jobs started from them will run in a small reservation of the cluster that has upgraded compute nodes.
During May, staggered upgrades of the entire system will be done. Final dates for each system will be agreed between the representatives and NSC, but software you want to run should be tested and ready by April 30.
During the first week of May, Cirrus will have a few hours of downtime during which all compute nodes are switched and the “cirrus” login node alias is redirected to point to a login node with the new OS by default.
During the second week of May Stratus will also change over in the same manner as Cirrus.
Some time in the August-September timeframe there will be a whole day of downtime for each system to finalize the switchover. Until then we will retain the capacity to wholly or partially switch back to the old OS fairly quickly in an emergency situation.
Some groups have been testing the upgrade for some time now on a limited number of nodes.
At the production follow-up between Metcoop representatives on 2024-02-29, we decided on the final plan, which calls for a complete switch of OS for both Cirrus and Stratus before May 1, 2024.
To continue using Stratus and Cirrus without interruption, you should immediately make sure that your jobs run on the upgraded parts of Stratus and Cirrus, and sort out any problems before April 14.
You yourself choose when to do this, but we recommend that you do it sooner rather than later to leave as much time as possible for you (and possibly NSC) to fix any problems you may encounter.
Typically, it will look something like this:
Note that the number of compute nodes available to you with the new OS is currently very limited, to avoid impacting running forecasts. If you can’t test your software stack with the number of nodes available, please let Rafael Grote and Lars Berggren know so they can coordinate such tests without disrupting production.
Storage (/nobackup, /home) is not affected (i.e., there is no need to move your data).
In short: Test your software today, and don’t hesitate to contact NSC Support if you need help.
This section covers how to get software you have been running on CentOS 7 to run on Rocky Linux 9.
Unlike most clusters managed by NSC, there is not a lot of software provided directly by NSC since users prefer and are expected to build their own software stacks.
There is a basic set of compilers and some niceties like Anaconda and Mambaforge.
Additional applications and versions can be added, but only if requested via NSC Support.
If you don’t find your application/version in the output from module avail (on upgraded nodes) and it’s not mentioned on this page already, please contact NSC Support and ask about it.
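As a rough illustration of the check described above, the sketch below filters the output of module avail for a name. The helper name find_module and the example application name are hypothetical, and it assumes the module command is available in your shell (as it normally is in a login session). Many module implementations print the listing to stderr, so the sketch merges stderr into stdout before filtering.

```shell
# List available modules whose name matches a pattern (case-insensitive).
# "module avail" often writes to stderr, so merge it into stdout first.
# find_module is a hypothetical helper name, not something NSC provides.
find_module() {
    module avail 2>&1 | grep -i -- "$1"
}

# Example with a hypothetical application name, run on an upgraded node:
#   find_module anaconda
```

The function returns a non-zero exit status when nothing matches, which is the case where contacting NSC Support is the next step.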
If you or someone in your project has built or installed your own application, you will need to choose a suitable way to run the application in the new environment.
There are several ways to do this (recompile, run as-is). Which one to use depends on the application. NSC Support can assist you with this. Some documentation is provided below, more will follow later.
Inside a supercomputer job, please try the following steps (i.e., inside a submit script, or after issuing interactive).
Note that it is a good idea to verify not just that the application
starts, but also to try an example job to check that it performs as expected.
Run your application as usual, i.e., as mpprun <application>.
Some applications just work without any additional steps.
However, you may see error messages about “missing symbols” or
“missing libraries”. In that case, go to the next step.
The next step is to try to rebuild the application with one of the build environments provided in the Rocky Linux 9 environment. Load an appropriate buildenv-<something> module and follow the usual instructions for how to install software.
Note: make sure to completely rebuild the application, i.e., run make distclean, make clean, or equivalent as the first step, so that all components are rebuilt from scratch.
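Before rebuilding, it can help to see exactly which shared libraries a binary fails to find. The sketch below is one possible way to do that with the standard ldd tool; it is not a procedure prescribed by NSC, and the helper name missing_libs and the binary path are hypothetical.

```shell
# Print the names of shared libraries that ldd reports as "not found".
# Reads ldd output on stdin; lines look like "libfoo.so.1 => not found".
# missing_libs is a hypothetical helper name used for illustration.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Example with a hypothetical binary path, run on an upgraded node:
#   ldd ./my_application | missing_libs
```

An empty result suggests that all dynamic dependencies resolve, although the application should still be tested with a real example job.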
If you or your project have installed a software application that you would like to use on the login node for data analysis etc., please try the following steps:
Try to run your application as usual, e.g., as ./<application>.
Some applications just work without any additional steps.
However, you may see error messages about “missing symbols” or
“missing libraries”. In that case, go to the next step.
The next step is to try to rebuild the application with one of the build environments provided in the Rocky Linux 9 environment. Load an appropriate buildenv-<something> module and follow the usual instructions for how to install software.
Note: make sure to completely rebuild the application, i.e., run make distclean, make clean, or equivalent as the first step, so that all components are rebuilt from scratch.
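The rebuild-from-scratch advice above can be sketched as a small helper for Makefile-based projects. This is only an illustration under assumptions: the helper name clean_rebuild is hypothetical, the project is assumed to provide a distclean or clean target, and the buildenv module name is left as the same placeholder used on this page.

```shell
# Rebuild a Makefile-based project from scratch in the current directory.
# Tries the "distclean" target first and falls back to "clean";
# if neither works, it stops instead of building on stale objects.
# clean_rebuild is a hypothetical helper name used for illustration.
clean_rebuild() {
    make distclean 2>/dev/null || make clean || return 1
    make "$@"
}

# Hypothetical usage on an upgraded node (replace the placeholder with a
# real build environment module listed by "module avail"):
#   module load buildenv-<something>
#   cd /path/to/source && clean_rebuild
```

Projects using other build systems (CMake, Meson, and so on) need their own equivalent, typically removing the build directory and configuring again.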
Note that the mpprun application differs substantially on Rocky Linux 9 compared to CentOS 7. Use mpprun -h to see the help. The option mpprun -i <binary> can be helpful for power users who want to understand more precisely how mpprun will launch a binary through the various MPI-specific launchers.
The old build environments of CentOS 7, buildenv-<something>/<old-version>, will not be carried over to EL9, but corresponding replacement modules with newer versions will be available. If you cannot make use of these refreshed modules, please contact NSC Support for assistance with your migration to these environments.
If you see a significant performance loss in the upgraded part of Stratus and Cirrus, please let NSC Support know as soon as possible.
Users can choose which part to use by logging in to the corresponding login node (using SSH or Thinlinc):
The current operating system, CentOS 7 (which is based on Red Hat Enterprise Linux 7), will not receive any security updates after 2024-06-30.
As Stratus and Cirrus are planned to continue operating longer than this, we need to upgrade the operating system.