All new Medium and Small compute projects on Tetralith, and LiU local compute projects on Sigma, are automatically allocated a small (currently 500 GiB) project storage area that is accessible by all project members.
The limit of 500 GiB per project is set to a level which we believe will be enough for most small projects.
We are well aware that 500 GiB will not be enough for many projects. If your project needs to store more data, you are welcome to use the Storage Rounds in SUPR to apply for more space. See below for details on how to do this.
We aim to provide enough storage space to store all active data belonging to a project. However, we do not provide storage for archive purposes.
You must have an active computing project (or have applied for one) on Tetralith or Sigma in order to apply for storage allocations.
All Medium, Small and LiU local compute projects are given a default allocation that allows them to store up to 500 GiB of data and one million files. If your project needs to store more data, the project PI should submit a proposal for a storage project (which will then replace the default allocation and use the same directory name).
The storage proposal should be submitted in SUPR. Note that you can clone a previous proposal, and that the PI can assign a Proxy to handle the proposal in SUPR.
It is possible for a storage project to be used for more than one compute project. If you want such a setup, just mention all the compute projects when you apply for the storage project, and link all the compute projects to the storage project using the “Link Compute Project” feature (see below).
If your storage will predominantly be used by NAISS projects, use one of the NAISS storage rounds. If your storage will predominantly be used by LiU projects, use the LiU Local Storage round.
There are four types of storage rounds to choose from:
Use the round NAISS Small Storage for small allocations (up to a maximum of 5 000 GiB or 5 million files in total) connected to a NAISS compute project.
Such proposals are usually processed within 1-2 working days, and are normally granted.
If you already have a NAISS Small Storage project with storage granted on Centre Storage at NSC, you can apply for an increase in its storage allocation (up to the limit for Small Storage project allocations on Centre Storage at NSC, i.e. 5 000 GiB or 5 million files). To apply, send an email to support@nsc.liu.se that includes how much space and how many files you need in total, and a motivation for the increase.
Use the round NAISS Medium Storage for medium allocations (up to a maximum of 50 000 GiB or 50 million files in total) connected to a NAISS compute project.
Such proposals will usually be processed monthly, to make sure they fit into our long-term storage plan.
If you already have a NAISS Medium Storage project, you can apply for an increase in the storage allocation at NSC (up to 2x the original allocation, but not exceeding the limits for a Medium Storage project at NSC, i.e. 50 000 GiB or 50 million files). To apply, send an email to support@nsc.liu.se that includes how much space and how many files you need in total, and a motivation for the increase.
To increase the allocation to more than 2x the original amount, a new proposal is needed (in SUPR).
Use the round NAISS Large Storage for large allocations (more than 50 000 GiB or more than 50 million files in total) connected to a NAISS compute project.
Such proposals will be processed twice a year by NAISS.
A granted NAISS Large Storage allocation at NSC can be increased, but a new application is always needed. Since this process is new, please contact support@nsc.liu.se before applying for an increase to a Large Storage project.
Note: the “DCS” project type has been replaced by NAISS Large Storage.
Use the round LiU Local Storage for all sizes of storage allocations connected to a LiU Local compute project.
Such proposals will be handled in the same way as similarly sized proposals for NAISS storage. The procedure for applying for an increased allocation is also the same as for a correspondingly sized NAISS storage project.
There is no maximum size for LiU projects, but the amount of LiU storage is limited, so it is unlikely that proposals for storage above 50 000 GiB or 50 million files will be approved.
To allow NSC to quickly process (and hopefully approve) your application, please make sure you include all relevant information when applying in SUPR or via email, e.g. how much storage space and how many files you need in total, and a motivation for your request.
In some cases we might get back to you and request additional details, or suggest improvements that can be made to how you store data. The more storage you request, the more information NSC will want from you, to ensure storage resources are used efficiently.
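When estimating how much space and how many files to request, it helps to measure what you currently use. Below is a minimal Python sketch that counts both in one pass over a directory tree; it reports apparent file sizes, so quota accounting (which counts allocated blocks) may differ slightly. Standard tools such as `du -sh DIR` and `find DIR -type f | wc -l` give the same information.

```python
#!/usr/bin/env python3
"""Count the files and total size of a directory tree, e.g. to
estimate how much space and how many files to request."""
import os
import sys

def usage(root):
    total_bytes = 0
    total_files = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                # lstat: count symlinks themselves, don't follow them
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # unreadable or vanished file: skip it
            total_files += 1
            total_bytes += st.st_size
    return total_bytes, total_files

if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    nbytes, nfiles = usage(root)
    print(f"{root}: {nbytes / 2**30:.1f} GiB in {nfiles} files")
```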
Once your storage project has been approved, you can link it to one or more compute projects so that all members of those compute projects can automatically access the storage.
In SUPR, go to the storage project, then under “Project Members”, look for a button “Link Compute Project”. Use it to link the storage project to one or more compute projects. The storage will then become accessible to all members of all linked compute projects.
If you do not do this, you need to manage membership in the storage project manually (in SUPR).
There are several reasons why storing a large number of files is problematic: many common file operations become slower [1], each file operation has a fixed overhead [2], deleting snapshots takes time proportional to the total number of files [3], and more files means a larger file system structure that can be damaged [4].
NSC is well aware that certain applications and entire projects have a greater need to store a large number of files than others.
To be able to allow those projects to store as many files as they need, we need to limit other projects.
Therefore, before we grant an increase in the files quota/limit we want you to make a reasonable effort (and explain it to us) to keep the number of files down.
What is a reasonable effort depends on the circumstances [5].
Things to consider:
Are there alternative file formats that your application can use with little or no change? E.g. a single output file rather than one file per record, or one per MPI rank (see the first sketch after this list).
Package large sets of output files into archive files (e.g. tar, zip) if they are not going to be used for a while (see the second sketch after this list).
Package large sets of input files into archives, and unpack them to the local disk in each compute node at the start of the job; the second sketch below also shows this. This can sometimes give a decent performance increase, and might be worth considering regardless of the files limit.
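As an illustration of the first point, here is a minimal, hypothetical Python sketch. The record data and file names are invented, but the pattern of writing one combined file (here JSON Lines) instead of one tiny file per record applies broadly.

```python
import json

records = ({"step": i, "energy": -1.0 * i} for i in range(1_000_000))

# Anti-pattern: one tiny file per record -> a million files, e.g.
#   with open(f"out/record_{i}.json", "w") as f: json.dump(rec, f)

# Better: a single output file with one JSON document per line
# ("JSON Lines") -- the same information in one file.
with open("results.jsonl", "w") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")
```

And a sketch of the two archiving points, using Python's standard tarfile module. The archive and directory names are illustrative, and the SNIC_TMP environment variable is an assumption about where the node-local scratch directory is found; check NSC's batch job documentation for the correct location on your system.

```python
import os
import tarfile

# Pack many output files into one archive file on project storage.
with tarfile.open("results.tar.gz", "w:gz") as tar:
    tar.add("output_dir", arcname="output_dir")

# At job start: unpack one input archive to fast node-local disk.
scratch = os.environ.get("SNIC_TMP", "/tmp")  # assumed variable; adjust
with tarfile.open("inputs.tar.gz", "r:gz") as tar:
    tar.extractall(path=scratch)
```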
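Packing the archive once before the job, and unpacking to node-local disk at the start of each job, trades one sequential read of a large file for many small metadata operations, which is usually the faster option on a shared parallel file system.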
If you need help with any of this, or if you want to discuss what a reasonable effort is for you, contact NSC Support.
[1] E.g. copying files to, from, and within the cluster using scp, rsync, etc., but also things like checking the size of a directory tree using “du”. Even reading and writing data from your application can sometimes be significantly slower if you use many files.
[2] There is a certain overhead associated with e.g. copying a file that is the same regardless of the size of the file. This has a cost in memory, CPU usage, disk and memory bandwidth, etc., and it affects both the node where the operation is performed (login node or compute node) and the storage system.
[3] We have to delete snapshots since they use a lot of disk space. Deleting a snapshot requires the storage system to scan all files, check which snapshots each file is a member of, and delete the file if the deleted snapshot was the last one it was part of. This means that deleting snapshots takes time (and system resources) proportional to the total number of files in the system. Since we have to be able to delete as many snapshots per day as we create, this puts an upper limit on how many snapshots we can have. Deleting snapshots is also quite I/O-intensive, so if the system spends many hours per day doing it, accessing data is slower for all users during those times.
[4] The file system structure consists of hundreds of millions of pointers that tell the system where the pieces of a file are physically located on the physical disks and where in the directory tree it is located. It is not often damaged, but it has happened several times at NSC in the last 10 years, so it’s something we must assume will happen again. Damage to the file system structure has historically usually been caused by storage system software bugs, so it’s not something that e.g. RAID will protect against.
[5] Doing significant manual work before and after each job to save a few thousand files is probably not a reasonable effort. Spending a few hours to automate a process that saves a few million files definitely is. You’re always welcome to discuss this balance with NSC Support.