Overview of NSC Centre Storage

All user and project data on Tetralith and Sigma is stored on the NSC Centre Storage system.

It is intended for short- and medium-term storage associated with one or more ongoing compute projects.

It is not an archive system for long-term storage of data not in active use.

There are three basic types of disk storage available to Tetralith and Sigma users:

  • Your home directory (e.g /home/x_abcde). All users have one. This is a small area (20 GiB) where you can store application settings and small personal files not related to any project.
  • Project storage directories (e.g /proj/myproject). All projects have access to project storage (at least 500 GiB, but projects can apply for more space at any time if needed).
  • The local disk in each compute node. This is where you should store temporary files that are only needed during a compute job and that do not need to be shared between compute nodes [1].

There are limits to how much data you can store in each location. On /home and /proj, a quota system limits how much you can use. On the local disk of a compute node (/scratch/local), you are limited by the physical size of that disk.

The command storagequota will show how much space is available in each location and how much is used.

The command storagereport shows more detailed information, such as how old the data is and which subdirectories of a storage directory hold the most data or the most files.

Do not store large amounts of data in other writable locations on the login nodes (e.g on /tmp, /var/tmp, /dev/shm), since the space there is very limited and shared by all users.

Your home directory (/home/YOUR_USERNAME, e.g /home/x_abcde)

Your home directory is intended for storage of small amounts of personal data, e.g:

  • Application settings (usually stored in hidden files/directories in your home directory)
  • Scripts, source code etc that do not belong to any one project.

By default, your home directory is limited to 20 GiB of data (quota). You can store up to 30 GiB for a week (limit). There is also a limit of one million files per user.
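The default limits above can be checked roughly by hand. The following is only a sketch, assuming GNU du and find; home_usage is a hypothetical helper name, not an NSC tool, and the quota system may count data and files differently. The storagequota command remains the authoritative source.

```shell
# Sketch: compare a directory's size and file count with the default
# home-directory limits described above (20 GiB, one million files).
# Assumes GNU du and find. home_usage is a hypothetical helper name.
home_usage() {
    dir=${1:-$HOME}
    quota_bytes=$((20 * 1024 * 1024 * 1024))   # 20 GiB default quota
    max_files=1000000                          # one million files per user

    used_bytes=$(du -sb -- "$dir" | cut -f1)   # apparent size in bytes
    n_files=$(find "$dir" -type f | wc -l)     # regular files only

    echo "bytes: $used_bytes of $quota_bytes"
    echo "files: $n_files of $max_files"

    # Exit status is non-zero if either default limit is exceeded.
    [ "$used_bytes" -le "$quota_bytes" ] && [ "$n_files" -le "$max_files" ]
}
```

Calling home_usage without arguments inspects $HOME. Walking a large directory tree puts load on the file system, so use it sparingly.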

Your login account at NSC is locked (i.e you cannot log in) 30 days after your last active project ends. Please transfer any valuable data from your home directory before the account is closed.

NSC will try (but there’s no guarantee) to keep the contents of your home directory available for a year after your account is closed. If you need your account temporarily reopened to access data, or if you want us to delete your home directory immediately, please contact NSC Support.

Even after deleting your home directory from disk, its contents may still be stored on backup tapes. If you want such data deleted or if you want it restored, please contact NSC Support.

By default, your home directory and its contents are only accessible by you. If you feel comfortable using Unix file permissions, you may change the permissions on your home directory to allow others to read data. However, if you need to share data with another user it’s usually better to do this in the project directory.

Project storage directories (/proj/PROJECTDIR, e.g /proj/sensibleshoes)

Each project that has been allocated computing time on Tetralith and Sigma normally has a directory under /proj where the project members can store their data associated with that project.

The name of the directory is decided by the project Principal Investigator (“PI”) when applying for the project in SUPR. Most projects choose to use either the project name as the directory name (e.g /proj/snic2019-1-123) or a name that describes the project or the group using it (e.g /proj/metaphysics101).

If you don’t know the location of your project directory, you can:

  • Ask the project PI
  • Look for the “Project storage directories available to you” message when you log in to the cluster using SSH.
  • Run the storagequota command. It shows the project directories available to you and how much you can store in them. If you run storagequota -a you can also see how much data each project member is using.
  • Ask NSC Support

If you can find the project directory for a project you believe you are a member of, but cannot access it, try logging out and back in again. Group memberships in Linux are only initialized when you log in. If you are using ThinLinc, make sure you log out rather than just disconnect your session. If logging in again does not help, contact NSC Support.
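Whether the current session already carries the right group can be checked before logging out. This is a minimal sketch, assuming GNU stat and assuming that access to the project directory is controlled by the Unix group that owns it; session_has_group_of is a hypothetical helper name, not an NSC command.

```shell
# Sketch: check whether the current login session carries the Unix group
# that owns a directory. Assumes GNU stat.
# session_has_group_of is a hypothetical helper name.
session_has_group_of() {
    dir_gid=$(stat -c %g -- "$1") || return 2   # numeric ID of the owning group
    for gid in $(id -G); do                     # group IDs of the current session
        [ "$gid" = "$dir_gid" ] && return 0
    done
    return 1
}
```

For example, session_has_group_of /proj/sensibleshoes || echo "log out and back in" prints the hint only when the group is missing from the session.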

Note: despite the lack of the word “nobackup” in the directory name /proj, we do not make tape backups of /proj data! Read the “Is my data safe?” page for more information.

The amount of data and the number of files a project can store in the project directory is limited. Both limits can be raised. In most cases all that is needed is an electronic application in SUPR, or sometimes just an email to NSC. See this page.

When you run storagequota, you will see how much data is used (“Used”), what the long-term limit is (“Quota”) and what the absolute limit is (“Limit”). The long-term limit can be exceeded for up to 30 days (“Grace” time).

Exceeding the absolute limit (“Limit”) or the 30-day Grace time has a significant impact on your running jobs (they will almost certainly fail), so you should avoid both. It is better to ask for more storage space than to risk hitting your limit and having jobs fail.

Your personal area inside a project storage directory

Unless the project PI has decided otherwise, each member of the project is automatically given a personal directory in the “users” subdirectory of the project storage directory (e.g /proj/snic2019-1-123/users/x_abcde). This directory is created when the user logs in for the first time after becoming a member of the project.

This directory is by default not accessible by any other user (not even other project members).

It is intended as a personal work area where you put temporary files, job work directories etc.

If you want to open up your personal area to others you can do so using the chmod command. If you do, please be aware that, unless you also change your “umask” and the permissions on existing files, this will make it possible for other project members to delete or change your files (intentionally or by mistake).
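One way to open a personal area for reading without allowing changes is sketched below. It assumes that the directory's owning group is the project group; share_read_only is a hypothetical helper name, and the right permissions for your project may differ.

```shell
# Sketch: open a directory tree to group members for reading only.
# share_read_only is a hypothetical helper name.
share_read_only() {
    dir=$1
    # g+r : the group may read files and list directories
    # g+X : the group may enter directories (and run files that are
    #       already executable)
    # g-w : the group may not change files, or create or delete entries
    chmod -R g+rX,g-w -- "$dir"
}
```

This only affects files that already exist. Files created later get their permissions from your umask; with umask 027, for instance, new files are created without group write permission. Note that a umask set in a shell startup file also applies to files you create in shared project areas.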

Non-personal areas in a project storage directory

If you want to store data or applications that should be accessible by all project members, we recommend that you create directories outside your personal area.

All directories and files you create there will by default be readable and writable by all project members.

If the project is going to have lots of shared data, we recommend that the PI decide on a suitable directory structure, e.g

/proj/snic2019-1-123/users/...            - personal areas
/proj/snic2019-1-123/datasets/YYYY/MM/DD  - shared datasets
/proj/snic2019-1-123/scripts              - useful scripts
/proj/snic2019-1-123/pkg/someapp-X.Y      - applications
[...]
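A layout like the one above can be created in one step. This is only a sketch: make_project_layout is a hypothetical helper name, the subdirectory names are taken from the example, and the users directory is left out because personal areas are created automatically.

```shell
# Sketch: create the example layout under a project directory.
# make_project_layout is a hypothetical helper name; the subdirectory
# names come from the example above.
make_project_layout() {
    proj=$1
    today=$(date +%Y/%m/%d)   # YYYY/MM/DD directory for today's datasets
    mkdir -p -- "$proj/datasets/$today" \
                "$proj/scripts" \
                "$proj/pkg"
}
```

For example, make_project_layout /proj/snic2019-1-123 creates any of these directories that are missing and leaves existing ones untouched.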

What types of data to store in project storage directories

You should use the project storage directory for all data associated with the project, except for temporary files that are only used from a single compute node during a job (such files should be stored on the local disk in the compute node, see this page). Project data includes:

  • Input files
  • Output files
  • Job scripts
  • Any applications installed by project members
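For the temporary files excluded above, a common pattern is to stage input onto the local disk, work there, and copy only the results back to project storage. The following is a sketch under stated assumptions: run_on_local_disk is a hypothetical helper name, the scratch location is passed as a parameter rather than assumed, and the command is expected to write its results into a results subdirectory of its work directory.

```shell
# Sketch of the local-disk pattern: stage input, work locally, copy results back.
# run_on_local_disk is a hypothetical helper name. Arguments:
#   $1     scratch root on the local disk (e.g /scratch/local)
#   $2     input file in project storage
#   $3     output directory in project storage
#   $4...  command to run; it should write its results into ./results
run_on_local_disk() {
    scratch_root=$1; input=$2; outdir=$3
    shift 3
    workdir=$(mktemp -d "$scratch_root/job.XXXXXX") || return 1
    mkdir -- "$workdir/results"
    cp -- "$input" "$workdir/" || { rm -rf -- "$workdir"; return 1; }

    ( cd "$workdir" && "$@" )     # temporary files stay on the local disk
    status=$?

    if [ "$status" -eq 0 ]; then
        # Copy only the contents of results/ back to project storage.
        cp -R -- "$workdir/results/." "$outdir/" || status=1
    fi
    rm -rf -- "$workdir"          # clean up the local disk
    return "$status"
}
```

See the node-local storage documentation for the directory that should actually be used as the scratch root inside a job.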

If you want extra protection for small-volume, high-value data such as source code or scripts, you can store it on /home (or keep an extra copy there or outside NSC).

Inactive/cold data tool: storagereport

It is NAISS policy that Centre Storage should only be used for data that is actively used during a project. Inactive/cold data should be stored elsewhere, e.g at your home university (which is responsible for storing research results even if the research was conducted on a NAISS system).

To aid in identifying cold data, we provide the tool storagereport.

If you run e.g storagereport /proj/<your project directory> it will tell you, among other things:

  • How many files and bytes were created more than 7, 30, 90, … days ago.
  • How many files and bytes were modified more than 7, 30, 90, … days ago.
  • The largest subdirectories by data volume and number of files
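For a quick look at a small subdirectory, the modification-age figures can be approximated with standard tools. This is a rough sketch, assuming GNU find; cold_files is a hypothetical helper name and is not part of storagereport, whose own figures may be computed differently.

```shell
# Sketch: count regular files under a directory that were last modified more
# than a given number of days ago, and sum their sizes in bytes.
# Assumes GNU find (for -printf). cold_files is a hypothetical helper name.
cold_files() {
    dir=$1; days=$2
    # find rounds a file's age down to whole days, so -mtime +N matches
    # files that are at least N+1 days old.
    find "$dir" -type f -mtime +"$days" -printf '%s\n' |
        awk '{ n++; bytes += $1 } END { printf "%d files, %.0f bytes\n", n, bytes }'
}
```

For example, cold_files /proj/<your project directory>/users/x_abcde 90 reports files untouched for more than 90 days. Scanning a large tree this way puts load on the storage system, so prefer storagereport for whole project directories.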

The tool can tell you a few more things, see storagereport --help.

Try e.g storagereport --subdir-limit=9999 /proj/<your project directory> or storagereport --user=x_examp /proj/<your project directory>

Extracting this data puts a certain load on the storage system. For this reason, we only re-scan /proj weekly. By default, storagereport shows the most recent data, but you can access older data using the --list and --date options.

This tool is (as of 2024-12-13) brand new. If you find problems with it or have suggestions for other things you would like it to do, please email support@nsc.liu.se.

If you are curious as to exactly what data is available, you can look in the JSON file that the storagereport script reads (storagereport will tell you the location of the JSON file when you run it). E.g jq . < /proj/.usage/...json | less

The storagereport data (as it contains some directory names) is only accessible by project members.

  1. The local disk is technically not a part of Centre Storage, but it makes sense to mention it here anyway. See node-local storage and job-local storage for more information about this. 

