Please note that the way you apply for additional project (/proj) storage has changed. All projects that need more than the default (500 GiB / 1 million files) allocation will now need to make a separate application in SUPR for a storage project.

Overview

All new Medium and Small compute projects on Tetralith, and LiU local compute projects on Sigma, are automatically allocated a small (currently 500 GiB) project storage area that is accessible to all project members.

The limit of 500 GiB per project is set to a level which we believe will be enough for most small projects.

We are well aware that 500 GiB will not be enough for many projects. If your project needs to store more data, you are welcome to use the Storage Rounds in SUPR to apply for more space. See below for details on how to do this.

We aim to provide enough storage space to store all active data belonging to a project. However, we do not provide storage for archive purposes.

You must have an active computing project (or have applied for one) on Tetralith or Sigma in order to apply for storage allocations.

Applying for storage (space and number of files)

All Medium, Small and LiU local compute projects are given a default allocation that allows them to store up to 500 GiB of data and one million files. If your project needs to store more data, the project PI should submit a proposal for a storage project (which will then replace the default allocation and use the same directory name).

The storage proposal should be submitted in SUPR. Note that you can clone a previous proposal in SUPR, and that the PI can assign a Proxy to handle the proposal.

A storage project can be used by more than one compute project. If you want such a setup, mention all the compute projects when you apply for the storage project, then link them to the storage project using the "Link Compute Project" feature (see below).

If your storage will predominantly be used by NAISS projects, use one of the NAISS storage rounds. If your storage will predominantly be used by LiU projects, use the LiU Local Storage round.

There are four types of storage rounds to choose from:

NAISS Small Storage (up to 5 000 GiB and 5 million files)

Use the round NAISS Small Storage for small allocations (up to a maximum of 5 000 GiB or 5 million files in total) connected to a NAISS compute project.

Such proposals are usually processed within 1-2 working days, and are usually granted.

If you already have a NAISS Small Storage project with storage granted on Centre Storage at NSC, you can apply for an increase in its storage allocation (up to the limit for Small Storage allocations on Centre Storage at NSC, i.e. 5 000 GiB or 5 million files). To apply, send an email to support@nsc.liu.se stating how much space and how many files you need in total, and a motivation for the increase.

NAISS Medium Storage (up to 50 000 GiB and 50 million files)

Use the round NAISS Medium Storage for medium allocations (up to a maximum of 50 000 GiB or 50 million files in total) connected to a NAISS compute project.

Such proposals are usually processed monthly, so that we can make sure each proposal fits into our long-term storage plan.

If you already have a NAISS Medium Storage project, you can apply for an increase in the storage allocation at NSC (up to 2x the original allocation, but not exceeding the limits for a Medium Storage project at NSC, i.e. 50 000 GiB or 50 million files). To apply, send an email to support@nsc.liu.se stating how much space and how many files you need in total, and a motivation for the increase.

To increase the allocation to more than 2x the original amount, a new proposal is needed (in SUPR).

NAISS Large Storage (more than 50 000 GiB or more than 50 million files)

Use the round NAISS Large Storage for large allocations (more than 50 000 GiB or more than 50 million files in total) connected to a NAISS compute project.

Such proposals will be processed twice a year by NAISS.

A granted NAISS Large Storage allocation at NSC can be increased, but a new application is always needed. Since this process is new, please contact support@nsc.liu.se before applying for an increase to a Large Storage project.

Note: the "DCS" project type has been replaced by NAISS Large Storage.

LiU Local Storage

Use the round LiU Local Storage for all sizes of storage allocations connected to a LiU Local compute project.

Such proposals are handled in the same way as NAISS storage proposals of similar size.

The procedure for applying for an increased allocation is the same as for correspondingly sized NAISS storage projects.

There is no maximum size for LiU storage projects, but the total amount of LiU storage is limited, so proposals for more than 50 000 GiB or 50 million files are unlikely to be approved.

Information to include in your proposal

To allow NSC to quickly process (and hopefully approve) your application, please make sure you include all relevant information when applying in SUPR or via email, including:

  • The name(s) of your existing compute project(s) (e.g. "NAISS 2023/17-42", "LiU-2019-24", ...).
  • How much storage space you need in total on /proj (e.g. 1 500 GiB).
  • How many files you need to store in total on /proj (e.g. 3 million files). If you need to store fewer than the default limit of one million files, you do not need to include this.
  • If your needs will vary significantly over time, please indicate this (e.g. 2 000 GiB from January to June, then 6 000 GiB until the end of the project).
  • How you arrived at the amounts of space and files you request (a short description is enough, e.g. "we need to save the output of 100 jobs, each 100 GiB in size").
  • A statement that you have verified that you are using your current storage allocation in a reasonable way (e.g. you have reviewed what is actually being stored, deleted unnecessary files, and use sensible data formats); see the example below for how to check your current usage.
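
If you are unsure how much you currently store, standard Linux tools can give you the totals before you apply. A minimal sketch (the project directory path below is a hypothetical example; substitute your own):

    # Total space used in your project directory (path is an example):
    du -sh /proj/ourproject

    # Total number of files stored there:
    find /proj/ourproject -type f | wc -l

Note that both commands walk the entire directory tree, so they can take a long time for large projects (itself an example of the per-file overhead discussed below).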

In some cases we might get back to you and request additional details, or suggest improvements that can be made to how you store data. The more storage you request, the more information NSC will want from you, to ensure storage resources are used efficiently.

Once your storage project has been approved, you can link it to one or more compute projects so that all members of those compute projects automatically get access to the storage.

In SUPR, go to the storage project, then under "Project Members", look for a button "Link Compute Project". Use it to link the storage project to one or more compute projects. The storage will then become accessible to all members of all linked compute projects.

If you do not do this, you need to manage membership in the storage project manually (in SUPR).

Why we are limiting both data volume (GiB) and the number of files

There are several reasons why storing a large number of files is problematic.

  • Managing data [1] spread over many files is slower and uses more system resources [2] than if the same volume of data is stored in fewer, larger files (see the illustration after this list).
  • When you access files, memory is used both on the nodes where the file is accessed and on the storage servers to keep track of changes to the files, file locking, etc. The amount of memory needed is proportional to the number of open files.
  • The "snapshot" feature that lets you recover deleted files gets more expensive the more files we have in total3. With too many files stored, we will not be able to take snapshots as often, or store them for as long as we do today. This makes it less likely that users can recover a deleted file.
  • If the file system structure is damaged, the system has to be taken offline for repairs. The time it takes to scan the file system and repair any damage can be significant (many hours to several days) and depends largely on the number of files stored [4].
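
As a rough, NSC-agnostic illustration of the first point, the sketch below creates the same volume of data as 10 000 small files and as one file, and times copying each; the many-file copy is typically much slower, since every file adds its own open/close and metadata operations:

    # Create 10 000 files of 1 KiB each (about 10 MiB in total):
    mkdir -p small big
    for i in $(seq 1 10000); do head -c 1024 /dev/zero > small/f$i; done

    # Create a single file of the same total size:
    head -c $((10*1024*1024)) /dev/zero > big/one_file

    # Copy the same volume of data both ways and compare the timings:
    time cp -r small /tmp/small_copy
    time cp -r big /tmp/big_copy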

NSC is well aware that certain applications and entire projects have a greater need to store a large number of files than others.

In order to allow those projects to store as many files as they need, we need to limit other projects.

Therefore, before we grant an increase in the file quota, we want you to make a reasonable effort (and explain it to us) to keep the number of files down.

What is a reasonable effort depends on the circumstances [5].

Things to consider:

  • Are there alternative file formats that your application can use with little or no change? E.g. a single output file rather than one file per record, or one file per MPI rank.

  • Package large sets of output files into archive files (e.g. tar, zip) if they are not going to be used for a while (see the sketch after this list).

  • Package large sets of input files into archives, and unpack them to the local disk on each compute node at the start of the job. This can sometimes give a decent performance increase, and might be worth considering regardless of the file limit.
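
A minimal sketch of the archiving approach from the last two points (the directory names are made up, and $SNIC_TMP is assumed to point to the node-local scratch disk; check the documentation for the system you use):

    # Pack a finished output directory into one compressed archive,
    # removing the original tree only if tar succeeds:
    tar czf results_batch1.tar.gz results_batch1/ && rm -rf results_batch1/

    # In a job script: unpack an input archive to node-local scratch
    # and let the job read from the local copy:
    tar xzf /proj/ourproject/inputs.tar.gz -C "$SNIC_TMP"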

If you need help with any of this, or if you want to discuss what a reasonable effort is for you, contact NSC Support.

Existing compute projects created before November 2019

If your compute project was created before the November 2019 policy change, the following applies:

  • You can keep the current storage allocation (GiB and number of files) until the end of the current project.
  • Any increase in the storage allocation size (GiB and number of files) or the length of the project means you will have to apply for a separate storage project.

  1. E.g. copying files to, from and within the cluster using scp, rsync, etc., but also things like checking the size of a directory tree using "du". Even reading and writing data from your application can sometimes be significantly slower if you use many files.

  2. There is a certain overhead associated with e.g. copying a file that is the same regardless of the size of the file. This has a cost in memory, CPU usage, disk and memory bandwidth, etc. This affects both the node where the operation is performed (login node or compute node) and the storage system.

  3. We have to delete snapshots since they use a lot of disk space. Deleting a snapshot requires the storage system to scan all files, check which snapshots they are a member of, and delete the file if the deleted snapshot was the last snapshot the file was a part of. This means that deleting snapshots will take time (and system resources) proportional to the total number of files in the system. Since we have to be able to delete as many snapshots per day as we create, this puts an upper limit on how many snapshots we can have. Also, deleting snapshots is quite I/O-intensive, so if the system is spending many hours per day deleting snapshots it means accessing the data is slower for all users during those times.

  4. The file system structure consists of hundreds of millions of pointers that tell the system where the pieces of a file are located on the physical disks and where in the directory tree the file belongs. It is not often damaged, but it has happened several times at NSC in the last 10 years, so it's something we must assume will happen again. Damage to the file system structure has historically usually been caused by storage system software bugs, so it's not something that e.g. RAID will protect against.

  5. Doing significant manual work before and after each job to save a few thousand files is probably not a reasonable effort. Spending a few hours to automate a process that saves a few million files is definitely a reasonable effort. You're always welcome to discuss this balance with NSC Support.

