Introduction to SLURM and Job Submission¶
Now that you are connected to the cluster, it's time to learn about SLURM (Simple Linux Utility for Resource Management), which is the workload manager used on the cluster. SLURM handles job scheduling, resource allocation, and job monitoring.
Login vs. Compute Nodes¶
In general, cluster users are expected to submit most computations to the job scheduler to be run on the dedicated compute nodes. The login nodes are meant for tasks like editing source/command files and running short test programs that do not use much memory or time and only need one or two CPUs. Some rough guidelines: the test will run for less than five minutes, will use less than 5 GB of memory, and will not use more than two CPUs.
Specifying the computing resources¶
When you submit a job to Slurm, you must specify the resources needed for that job. The following options are used to specify the resources per physical computer (node).
--partition Which group of machines (nodes) the ones for your job should be selected from.
--nodes How many physical machines should be used for the job.
--ntasks-per-node How many copies of the program should be started on each node when using mpirun or srun.
--cpus-per-task How many CPUs (processors, cores) each task should have.
--mem Memory available to all tasks/processes on each node.
--mem-per-cpu How much memory per CPU. This option should generally be avoided because specifying --mem gives the system more flexibility in memory assignment.
--time How long the job may run before Slurm stops it.
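For example, these options can be combined on a single sbatch command line. The sketch below requests one node, one task, 4 CPUs for that task, 8 GB of memory, and a one-hour limit on the general partition used in the examples on this page; the script name my_script.sh is a placeholder for your own job script.
# my_script.sh is a placeholder for your own job script
$ sbatch --partition=general --nodes=1 --ntasks-per-node=1 --cpus-per-task=4 --mem=8G --time=1:00:00 my_script.sh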
Tasks versus CPUs per task
Clusters are often used to run software that can communicate among many nodes during a calculation. That is done by starting many copies of the same program, and each copy can then determine what it should do. Each of those copies is called a task. The --ntasks-per-node option should only be used if you know your software uses MPI.
More software can use multiple CPUs on the same node than can use multiple nodes. For such software, a single copy of the program is started. To specify that it can access multiple CPUs, you use --cpus-per-task. For most software, you will want to specify one node with one task and then request --cpus-per-task larger than 1.
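As an illustration (the node, task, and CPU counts here are arbitrary examples), the two styles of request look like this in a batch script:
# MPI software: several tasks per node, one CPU per task
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=8
#SBATCH --cpus-per-task=1
# Multithreaded software on a single node: one task with several CPUs
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8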
Submitting a Job with sbatch¶
You can submit a job by writing a job script. It's a simple text file that contains both the resource requirements and the commands you want to execute.
Let's create our first job script. (You can use the editor of your choice, e.g. emacs, joe, nano, vim, etc.)
$ nano test_job.sh
We need to have a shebang line at the beginning of the script to specify that the file is a shell script.
#!/bin/sh
Slurm lets you specify options directly in a batch script through lines called Slurm “directives.” These directives provide job setup information used by Slurm, including resource requests, email options, and more. This information is then followed by the commands to be executed to do the computational work of your job.
Slurm directives must precede the executable section in your script.
# Run on the general partition
#SBATCH --partition=general
# Request one node
#SBATCH --nodes=1
# Request one task
#SBATCH --ntasks=1
# Request 4GB of RAM
#SBATCH --mem=4G
# Run for a maximum of 5 minutes
#SBATCH --time=5:00
# Name of the job
#SBATCH --job-name=testjob
# Name the output file
#SBATCH --output=%x_%j.out
# Specify when Slurm should send you e-mail. You may choose from
# BEGIN, END, FAIL to receive mail, or NONE to skip mail entirely.
#SBATCH --mail-type=NONE
Below the job script’s directives is the section of code that Slurm will execute. This section is equivalent to running a Bash script in the command line – it’ll go through and sequentially run each command that you include. When there are no more commands to run, the job will stop.
For example, these commands go to jshmoe's home directory and execute a python program.
# go to jshmoe's home directory
cd /gpfs1/home/j/s/jshmoe
# in that directory, run test.py
python test.py
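Putting the pieces together, the complete test_job.sh looks like this (the home directory path belongs to the example user jshmoe; substitute your own):
#!/bin/sh
#SBATCH --partition=general
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --mem=4G
#SBATCH --time=5:00
#SBATCH --job-name=testjob
#SBATCH --output=%x_%j.out
#SBATCH --mail-type=NONE
# go to jshmoe's home directory
cd /gpfs1/home/j/s/jshmoe
# in that directory, run test.py
python test.py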
When you are done editing your file, save and exit.
To submit the job we use the sbatch command.
$ sbatch test_job.sh
Submitted batch job 123456
Your job will be submitted and run once the requested resources are available. Jobs that request fewer resources will generally start sooner: although jobs submitted before yours are further ahead in the queue, the Slurm scheduler looks for jobs that can fit into the gaps between larger jobs, as long as they do not delay them. This means that being conservative in your resource requests will result in your jobs running sooner.
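When the job finishes, its output is written to the file named by the --output directive. With --output=%x_%j.out, %x expands to the job name and %j to the job ID, so the output of the example above could be viewed with something like the following (the job ID will differ for your job):
$ cat testjob_123456.out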
Running an Interactive Job with srun¶
In addition to batch jobs, you can run interactive jobs on the cluster using SLURM. An interactive job gives you direct access to a compute node, allowing you to run commands interactively as if you were logged into that node. This is useful for tasks like debugging and testing code.
To start an interactive session, use the srun command. Here's an example:
$ srun --partition=general --nodes=1 --ntasks=1 --mem=4G --time=30:00 --pty /bin/bash
In this command, the options are the same as they would be in a job script, except for the --pty option, which tells Slurm that you wish to start /bin/bash as a shell at which you can type commands and have them run on the node assigned to the job. An interactive job is like a login session, but the resources on the computer match those you requested.
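For example, inside the session you might run something like the following (test.py is the same example script used above):
# confirm you are on a compute node rather than the login node
$ hostname
# run the program and watch its output in real time
$ python test.py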
To end the interactive session, simply type:
$ exit
Interactive jobs are ideal for real-time experimentation and testing, complementing the batch job process.
Job Constraints¶
When a job has specific hardware requirements, you can use constraints to select the appropriate nodes. For example, to limit your job to a node with an Infiniband network card, you might use
#SBATCH --constraint=ib
or add --constraint=ib to your srun or salloc command.
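For example, the earlier interactive srun command could be limited to Infiniband nodes like this:
$ srun --partition=general --nodes=1 --ntasks=1 --mem=4G --time=30:00 --constraint=ib --pty /bin/bash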
The constraints in the table below were available when this page was last updated. For the most current available constraints, you can run
$ show_node_constraints
| Constraint | Description |
|---|---|
| intel | Nodes with Intel processors |
| v100 | Nodes with V100 GPUs |
| a100 | Nodes with A100 GPUs |
| h100 | Nodes with H100 GPUs |
| h200 | Nodes with H200 GPUs |
| noib | Nodes without Infiniband |
| ib | Infiniband nodes, all types |
| ib1 | Infiniband nodes, group 1 |
| ib2 | Infiniband nodes, group 2 |
| 10g | 10 Gig Ethernet |
| hc | High clockspeed nodes |
| cascadelake | Nodes with cascadelake generation processors |