Scheduling policy on Nebula

Nebula scheduling is based on first-in-first-out (with backfill so that shorter jobs that fit can run as long as they do not delay the start of jobs with higher priority), but with a few additions:

- A single user may use at most 50 nodes at the same time.
  - Attempting to submit a job that uses more than 50 nodes will fail.
  - Jobs in the queue that would make a user exceed the limit are held in the queue with the reason AssocGrpNodeLimit until enough of that user’s jobs have finished.
- Jobs with a time limit of more than 6 hours are collectively limited to 68 nodes.
  - Attempting to submit a job with a time limit of more than 6 hours using more than 68 nodes will fail.
  - New jobs with a time limit of more than 6 hours that would cause the limit to be exceeded are held in the queue with the reason QOSGrpNodeLimit until enough long jobs have finished.
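
The submission-time part of these rules can be restated as a small shell check. This is only an illustration of the limits listed above, not how the scheduler itself is implemented; the function name `check_request` and the whole-hour time limit are assumptions made for the sketch.

```shell
#!/bin/bash
# Illustration only: restates the node limits from this page as a shell check.

MAX_NODES_PER_USER=50    # per-user node limit
MAX_NODES_LONG_JOBS=68   # shared node limit for jobs longer than 6 hours
LONG_JOB_HOURS=6

# check_request NODES HOURS   (hypothetical helper, whole hours only)
# Prints "rejected: <why>" if the request on its own breaks a limit,
# otherwise "accepted". An accepted job may still be held in the queue.
check_request() {
    local nodes=$1 hours=$2
    if [ "$hours" -gt "$LONG_JOB_HOURS" ] && [ "$nodes" -gt "$MAX_NODES_LONG_JOBS" ]; then
        echo "rejected: long job using more than $MAX_NODES_LONG_JOBS nodes"
    elif [ "$nodes" -gt "$MAX_NODES_PER_USER" ]; then
        echo "rejected: more than $MAX_NODES_PER_USER nodes for one user"
    else
        echo "accepted"
    fi
}

check_request 40 12   # within both limits
check_request 60 1    # short job, but above the per-user limit
```

A job reported as accepted here can still wait with the reason AssocGrpNodeLimit or QOSGrpNodeLimit, since those depend on what else is running at the time.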

**The maximum wall time for a job is 7 days.** The default time limit (if you do not use a "-t" flag) is 2 hours.
Please use the "-t" flag to set a time limit that is appropriate for each job!
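
As a minimal sketch, a job script that sets an explicit time limit could look like the following. The job name and the payload are placeholders; the `#SBATCH` lines use the standard Slurm short options for job name, task count and time limit.

```shell
#!/bin/bash
#SBATCH -J example-job      # hypothetical job name
#SBATCH -n 1                # one task on one core
#SBATCH -t 00:30:00         # 30 minutes instead of the 2 hour default

# Placeholder payload; replace with your own program.
# SLURM_JOB_ID is set by Slurm inside a job; "none" is shown when run outside one.
JOB_LABEL="job ${SLURM_JOB_ID:-none} on $(uname -n)"
echo "Running $JOB_LABEL"
```

The time limit is given here as hours:minutes:seconds. Slurm also accepts a days-hours:minutes:seconds form, so the full 7 days would be written as -t 7-00:00:00.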

Requesting nodes with more memory

There are 12 fat nodes with more memory (384 GiB each). To use them, add -C fat to commands such as sbatch and interactive.
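
In a job script, the same request can be written as an `#SBATCH` line. The sketch below assumes a one-node job with a hypothetical one hour time limit; the payload only reports the memory that Linux sees on the node, as a way to confirm where the job landed.

```shell
#!/bin/bash
#SBATCH -N 1                # one full node
#SBATCH -C fat              # only run on a fat node
#SBATCH -t 01:00:00         # hypothetical time limit

# Placeholder payload: report the node's total memory as seen by Linux.
# /proc/meminfo labels the value "kB", which is KiB in practice.
MEM_KIB=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
echo "Total memory on $(uname -n): ${MEM_KIB} KiB"
```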

Node sharing is available on Nebula. The idea behind node sharing is that you do not have to allocate a full compute node in order to run a small job using, say, 1 or 2 cores. Thus, if you submit a job with something like sbatch -n 1 ..., the job may share the node with other jobs smaller than 1 node. Jobs using a full node or more will not experience this (e.g. we will not pack two 48-core jobs into 3 nodes). You can turn off node sharing for otherwise eligible jobs using the --exclusive flag.

Warning: If you do not pass -n, -N or --exclusive to commands like sbatch and interactive, you will get a single core, not a full node.

When you allocate less than a full node, you get a proportional share of the node’s memory. On a thin node with 96 GiB, that means that you get 1.5 GiB per allocated hyperthread, which is the same as 3 GiB per allocated core.

If you need more memory, you must declare that using an option like --mem-per-cpu=MEM, where MEM is the memory in MiB per hyperthread (even if you do not allocate your tasks on the hyperthread level).

Example: to run a process that needs approximately 32 GiB on one core, you can use -n1 --mem-per-cpu=16000. As you have not turned on hyperthreading, you allocate a whole core, but the memory is still specified per hyperthread.

As a comparison, -n2 --ntasks-per-core=2 --mem-per-cpu=16000 allocates two hyperthreads (on a core). Together, they will also have approximately 32 GiB of memory to share.
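
The arithmetic behind these examples can be sketched as a small helper. The function name `mem_per_cpu_mib` is hypothetical, and the sketch assumes 2 hyperthreads per core, as implied by the 1.5 GiB and 3 GiB figures above.

```shell
#!/bin/bash
# mem_per_cpu_mib TOTAL_GIB CORES   (hypothetical helper)
# Prints the --mem-per-cpu value (MiB per hyperthread) that gives a job
# TOTAL_GIB of memory when it allocates CORES whole cores.
# Assumes 2 hyperthreads per core; integer division rounds down.
mem_per_cpu_mib() {
    local total_gib=$1 cores=$2
    local hyperthreads=$(( cores * 2 ))
    echo $(( total_gib * 1024 / hyperthreads ))
}

mem_per_cpu_mib 32 1    # prints 16384
mem_per_cpu_mib 64 4    # prints 8192
```

The exact value for 32 GiB on one core is 16384 MiB; the 16000 used in the examples above is a round number that gives approximately the same amount.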

Note: you cannot request a fat node on Nebula by passing a --mem or --mem-per-cpu option too large for thin nodes. You need to use the -C fat option discussed above.

Job private directories

Each compute node has a local hard disk with approximately 210 GiB on thin nodes (870 GiB on fat nodes) available for user files. The environment variable $SNIC_TMP in the job script environment points to a writable directory on the local disk that you can use. A difference on Nebula compared to older systems is that each job has private copies of the following directories used for temporary storage:

/scratch/local (`$SNIC_TMP`)
/tmp
/var/tmp

This means that one job cannot read files written by another job running on the same node. This applies even to two of your own jobs running on the same node!

Please note that anything stored on the local disk is deleted when your job ends. If any temporary or output files stored there need to be preserved, copy them to project storage at the end of your job script.
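
A job script following this pattern might look like the sketch below. The destination `RESULT_DIR` and the file name `output.txt` are hypothetical, and the `mktemp` fallback exists only so that the sketch can also be tried outside a job, where $SNIC_TMP is not set.

```shell
#!/bin/bash
#SBATCH -n 1
#SBATCH -t 00:30:00

# Hypothetical destination; point this at a directory in your project storage.
RESULT_DIR="${RESULT_DIR:-$PWD/results}"

# Inside a job, $SNIC_TMP points to the job's private local disk directory.
# The mktemp fallback is only for trying the sketch outside a job.
WORK_DIR="${SNIC_TMP:-$(mktemp -d)}"

# Placeholder for the real computation: write output to the local disk.
echo "example output" > "$WORK_DIR/output.txt"

# Copy what must be kept before the script ends, because the local
# disk is cleaned when the job finishes.
mkdir -p "$RESULT_DIR"
cp "$WORK_DIR/output.txt" "$RESULT_DIR/"
```

Keeping the copy step as the last part of the script means it only runs if the job reaches that point, so a job that hits its time limit will lose whatever is still on the local disk.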
