Introduction to batch jobs

What is a batch job?

A batch job is a non-interactive way (no user input is possible) of running an application in a pre-determined manner. What happens during the batch job is controlled by the job script (sometimes known as a "submit script"). When a batch job is submitted to the system, it is put in a queue, and is then started at a later time (sometimes immediately). An obvious advantage of this approach is that you can queue many batch jobs at the same time, and they will start automatically once resources are available, i.e. you do not need to sit in front of the computer in order to start calculations.

Running a batch job

Preparing a batch job:

  • Copy any needed input files to the shared file system on the login node (see the example after this list).
  • Write the job script (some examples are included below).
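
For example, input files can be copied from your own computer using scp. The username, login node address and target directory below are placeholders; use the details for your own account and cluster:

scp input.dat x_user@tetralith.nsc.liu.se:~/myjobdir/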

Submitting a batch job:

  1. Load any modules needed to run your job. The environment in the shell where you run "sbatch" will be saved and recreated when the job starts. This includes the current working directory. You can also place the "module load" commands in your job script, and they will then be run automatically when the job starts.

  2. Submit the job to the queue (e.g. "sbatch myjob.sh").
  • Job options (e.g. the amount of memory reserved, the number of CPU cores reserved, the maximum wall time, etc.) can either be set in the job script (by adding "#SBATCH" lines) or by giving the same options to sbatch on the command line. You can put options in both locations. If an option is present in both places, the option given to sbatch is used (see the example after this list).
  • The environment (current directory, loaded modules, $PATH and other environment variables) is recorded by sbatch and will be restored when the job starts.

  3. The job is now in the queue.
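
For example, assuming a job script myjob.sh that contains the line "#SBATCH -t 00:30:00", the following command submits the job with a one-hour time limit instead, since the option given on the sbatch command line takes precedence:

sbatch -t 01:00:00 myjob.sh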

Once in the queue, the job might be started immediately (if enough idle compute resources are available) or it might have to wait in the queue for a while (minutes, hours, days or in extreme cases even longer).

Different NSC systems have very different scheduling policies and utilization, so queue times vary significantly between systems and projects. See the system documentation for more details.

If you don't understand why your job won't start, please contact NSC Support.

Monitoring a batch job:

You can monitor all your jobs, both batch and interactive, using the "squeue" command (e.g. squeue -u $USER to see your jobs).

When the job has started, the standard output and standard error from the job script (which will contain output from your application if you have not redirected it elsewhere) will be written to a file named slurm-NNNNN.out in the directory where you submitted the job (NNNNN is replaced with the job ID).

If you need all the details about a pending or running job, use scontrol show job JOBID. Use squeue to find the job ID you need.
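
For example (the job ID below is just a placeholder; use the job IDs reported by squeue for your own jobs):

squeue -u $USER                # list all your queued and running jobs
squeue -u $USER --start        # show estimated start times for pending jobs
scontrol show job 12345        # show all the details for one job
tail -f slurm-12345.out        # follow the output of a running job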

Ending a queued or running job

If you want to cancel (end) a queued or running job, use the scancel command and provide the job ID (e.g. scancel 12345).
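
For example:

scancel 12345        # cancel the job with job ID 12345
scancel -u $USER     # cancel all your own jobs (use with care)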

What happens when a job starts?

  1. The environment (current working directory and environment variables such as $PATH) that was set when you submitted the job is recreated on the node where the job will be started.

  2. The job script starts executing on the first node allocated to the job. If you have requested more than one node, your job script is responsible for starting your processes on all nodes in the job, e.g. by using srun, ssh or an MPI launcher.

  3. The job ends when your job script ends. All processes started by the job will be terminated if they are still running. The resources allocated to the job are now free to use for other jobs.
    • Note: if you run applications in the background ("application &") from your job script, you have to make sure that the job script does not end until all background applications have ended. This can be accomplished by adding a "wait" line to the script. The wait command will cause the script to stop executing on that line until all background applications have finished (see the sketch after this list).
    • Note: if your job runs for longer than the time you requested (sbatch -t HH:MM:SS), the job will be killed automatically.
  4. You can now fetch the output files generated by your job.
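
A minimal sketch of the background/wait pattern mentioned above (the application name "mytask" and the input and output file names are placeholders):

#!/bin/bash
#SBATCH -J background-example
#SBATCH -t 00:30:00
#SBATCH -n 2
#
# Start two independent tasks in the background, one per requested core.
./mytask input1 > output1.log 2>&1 &
./mytask input2 > output2.log 2>&1 &
#
# Do not let the job script end until both background tasks have finished.
wait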

Sample job script: run an MPI application "mympiapp" on two "exclusive" (not shared with others) nodes

#!/bin/bash
#
#SBATCH -J myjobname
#SBATCH -t 00:30:00
#SBATCH -N 2
#SBATCH --exclusive
#
mpprun ./mympiapp

# Script ends here

Sample job script: run a single-threaded application on a single core and allocate 2 GB RAM (the node might be shared with other jobs). Also send an email when the job starts and ends. Note! Replace the string "put-your-email-address-here" with the real email address you want the notifications to be sent to.

#!/bin/bash
#
#SBATCH -J myjobname
#SBATCH -t 00:30:00
#SBATCH --mem=2000
#SBATCH -n 1
#SBATCH --mail-type=ALL
#SBATCH --mail-user=put-your-email-address-here
#
# Run a single task in the foreground.
./myapp --flags
#
# Script ends here

Developing and testing your batch job

Hint: most of our clusters have a few nodes reserved for test and development (see the system documentation for details). Use these nodes to quickly check your job script before submitting it to the normal queue (where you might need to wait for hours or days before your job starts, only to find out that you made a simple error in the job script).

The reservation name can vary: on Tetralith it is "now", on most other clusters it is "devel".
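
For example, to submit a short test job to the test reservation on Tetralith (adjust the reservation name to match the cluster you are using):

sbatch --reservation=now -t 00:10:00 -n 1 myjob.sh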

You can also use the interactive command to get an interactive login session on a compute node. From there you can test your application and job script interactively in an environment that is almost 100% identical with the environment the real batch job will run in.

interactive takes the same command line options (e.g. -t, -N) as sbatch.

The advantage of testing batch jobs in an interactive session is that you can quickly fix a bug, re-run the script, find another bug, fix it, ... This can speed up the process of debugging job scripts significantly compared to submitting them normally.

Example:

[x_makro@tetralith1 ~]$ interactive -t 00:10:00 -n2 --reservation=now
Waiting for JOBID 1817147 to start

[x_makro@n1 ~]$ bash myjob.sh 
myjob.sh: line 2: badspell: command not found

Now I edit myjob.sh and fix the problem, and run it again:

[x_makro@n1 ~]$ bash myjob.sh 
1
2
3

Here I press Control-C to stop the job, as it seems to be working now.

[x_makro@n1 ~]$

Great, now all that remains is to end the interactive session (type exit) and submit the job normally:

[x_makro@n1 ~]$ exit
[x_makro@tetralith1 ~]$ sbatch -t 3-00:00:00 -N 128 --exclusive myjob.sh
Submitted batch job 1817151
[x_makro@tetralith1 ~]$ 

Choosing a time limit for your job

The "wall time" limit (set with the -t D-HH:MM:SS option to sbatch/interactive) determines how long your job may run (in actual hours, not core hours) before it is terminated by the system.

If your job ends before the time limit is up, your project will only be charged for the actual time used.

However, there are a few reasons for not always asking for the maximum allowed time:

  • Short jobs can often be started faster than long jobs (due to "backfill" scheduling).
  • In the days before planned system maintenance, only jobs short enough to finish before the start of the maintenance period can start.
  • If everyone always asks for the maximum possible wall time, it becomes impossible to estimate when queued jobs will start.

We recommend adding a margin to the wall time setting to prevent jobs from failing if they run slightly slower than expected for some reason (e.g. due to high load on the disk storage system).
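
As an illustration (the times below are only an example), if a job is expected to run for roughly 20 hours, requesting 24 hours leaves a reasonable margin:

#SBATCH -t 1-00:00:00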

More information about sinfo, sbatch, scancel

Please read the man pages (e.g. run "man sbatch") on the cluster or read them online.

