Cirrus and Stratus use simple first-in-first-out scheduling (with backfill, so that shorter jobs that fit can run as long as they do not delay the start of jobs with higher priority).
Cirrus has 256 nodes (8 fat and 248 thin) while Stratus has 520 nodes (12 fat and 508 thin).
There will be node limits between groups later, but during the initial test period there are none.
If you are a member of more than one group, you should always use an option like -A ha, -A metcoop etc. to sbatch/interactive to tell Slurm which account to run under.
If you are only part of one group, you do not need to use the -A option for normal job submission. You might have to use it under special circumstances, such as cron jobs.
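For example, a member of both groups could submit under either account like this (job.sh is a hypothetical script name; the account names are the example names above):

    sbatch -A metcoop job.sh
    interactive -A ha -n 1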
There are 8 fat nodes on Cirrus and 12 fat nodes on Stratus with more memory (384 GiB). To use them, add -C fat to sbatch/interactive etc.
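For example, to run a batch job on a fat node (job.sh again being a placeholder for your own script):

    sbatch -C fat job.sh

or, equivalently, put the option in the script itself:

    #SBATCH -C fat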
Node sharing is available on Cirrus and Stratus. The idea behind node sharing is that you do not have to allocate a full compute node in order to run a small job using, say, 1 or 2 cores. Thus, if you request a job like sbatch -n 1 ..., the job may share the node with other jobs smaller than one node. Jobs using a full node or more will not experience this (e.g. we will not pack two 48-core jobs into 3 nodes). You can turn off node sharing for otherwise eligible jobs using the --exclusive flag.
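As a sketch of both cases (small_job.sh is a placeholder):

    # May share its node with other small jobs:
    sbatch -n 2 small_job.sh

    # Same core count, but no other jobs allowed on the node:
    sbatch -n 2 --exclusive small_job.sh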
Warning: If you do not include -n, -N or --exclusive in commands like sbatch and interactive, you will get a single core, not a full node.
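So, to be sure of getting a full node to yourself, request it explicitly, for example (job.sh being a placeholder):

    sbatch -N 1 --exclusive job.sh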
When you allocate less than a full node, you get a proportional share of the node’s memory. On a thin node with 96 GiB, that means that you get 1.5 GiB per allocated hyperthread which is the same as 3 GiB per allocated core.
If you need more memory, you must declare that using an option like --mem-per-cpu=MEM, where MEM is the memory in MiB per hyperthread (even if you do not allocate your tasks at the hyperthread level). Example: to run a process that needs approximately 32 GiB on one core, you can use -n1 --mem-per-cpu=16000. As you have not turned on hyperthreading, you allocate a whole core, but the memory is still specified per hyperthread.
As a comparison, -n2 --ntasks-per-core=2 --mem-per-cpu=16000 allocates two hyperthreads (on a single core). Together, they will also have approximately 32 GiB of memory to share.
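As a sketch, the one-core example above could be written as a batch script like this (my_program is a placeholder):

    #!/bin/bash
    #SBATCH -n 1
    #SBATCH --mem-per-cpu=16000

    # 16000 MiB per hyperthread x 2 hyperthreads on the core = approximately 32 GiB
    ./my_program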
Note: you cannot request a fat node on Cirrus or Stratus by passing a --mem or --mem-per-cpu option too large for the thin nodes. You need to use the -C fat option discussed above.
Each compute node has a local hard disk with approximately 210 GiB (on thin nodes; 870 GiB on fat nodes) available for user files. The environment variable $SNIC_TMP in the job script environment points to a writable directory on the local disk that you can use. A difference on Cirrus and Stratus compared with older NSC systems (like Byvind) is that each job has private copies of the following directories used for temporary storage:
/scratch/local ($SNIC_TMP)
/tmp
/var/tmp
This means that one job cannot read files written by another job running on the same node. This applies even if it is two of your own jobs running on the same node!
Please note that anything stored on the local disk is deleted when your job ends. If any temporary or output files stored there need to be preserved, copy them to project storage at the end of your job script.
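A minimal job script sketch using $SNIC_TMP (the /proj paths and my_program are placeholders):

    #!/bin/bash
    #SBATCH -n 1
    #SBATCH -t 01:00:00

    # Stage input data to the fast node-local disk:
    cp /proj/myproj/input.dat $SNIC_TMP/
    cd $SNIC_TMP

    ./my_program input.dat > output.dat

    # The local disk is wiped when the job ends, so save results first:
    cp output.dat /proj/myproj/results/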