Fair-share scheduling

How does fairshare scheduling work on Tetralith?

Fairshare scheduling on Tetralith attempts to give each project (not user) a “fair share” of the available computing time of the system over time.

A “fair share” is not an equal share. A project’s share is the time NAISS allocated to the project (e.g., 100,000 core hours per month) divided by the total capacity of the system (43.5 million core hours per month).
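
As a rough illustration of that division, the following sketch computes the share for the example figures quoted above. The function and constant names are made up for this example and do not come from Slurm or NSC tooling.

```python
# Figures quoted in the text above (core hours per month).
SYSTEM_CAPACITY_CORE_HOURS = 43_500_000  # total capacity of the system
ALLOCATION_CORE_HOURS = 100_000          # example project allocation


def project_share(allocation: float, capacity: float) -> float:
    """Return the project's share of the system as a fraction between 0 and 1."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return allocation / capacity


share = project_share(ALLOCATION_CORE_HOURS, SYSTEM_CAPACITY_CORE_HOURS)
print(f"Share of the system: {share:.4%}")
```

With these numbers the example project's share is about 0.23% of the system.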

A project that makes a reasonable effort to use its allocated time (i.e., keeps some jobs in the queue most of the time) can expect to be able to run approximately as many core hours as allocated by NAISS, or more.

The fairshare scheduler tries to achieve this by adjusting the priority of queued jobs. Since the queue is continuously re-sorted by priority, this generally results in short queue times for jobs submitted by projects with high priority, and long queue times for jobs submitted by projects with low priority.

The priority of a queued job is determined by how much the project has run recently compared to its allocation. The higher the usage is (as a percentage of the allocation), the lower the priority is.

In other words, the more your project runs, the longer its next job is likely to wait in the queue.
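
What counts as “recently” is governed by a usage half-life (PriorityDecayHalfLife, set to 21 days on Tetralith). The sketch below is a simplified model of that decay, assuming usage is summed per day and that each day's usage is halved for every 21 days of age. Slurm applies the decay at a finer granularity internally, and the function names here are hypothetical.

```python
from typing import Sequence

HALF_LIFE_DAYS = 21.0  # PriorityDecayHalfLife = 21-00:00:00


def decay_weight(age_days: float, half_life_days: float = HALF_LIFE_DAYS) -> float:
    """Weight given to usage that happened age_days ago (1.0 for usage right now)."""
    return 0.5 ** (age_days / half_life_days)


def effective_usage(daily_core_hours: Sequence[float]) -> float:
    """Sum daily usage with decay applied; index 0 is today, index 1 is yesterday, etc."""
    return sum(
        hours * decay_weight(age_days)
        for age_days, hours in enumerate(daily_core_hours)
    )


# Usage from 21 days ago counts half as much as usage from today,
# and usage from 42 days ago counts a quarter as much.
print(decay_weight(0), decay_weight(21), decay_weight(42))
```

In this model a burst of heavy usage lowers a project's priority for several weeks, but its influence fades steadily rather than disappearing all at once.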

If you are interested in the gory details of how this is implemented: we use the Slurm multifactor priority plugin.

The running configuration settings for the multifactor plugin can be seen by running “scontrol show config”. As of 2019-04-10, the most interesting ones are:

PriorityDecayHalfLife   = 21-00:00:00
PriorityWeightFairShare = 1000000
PriorityWeightAge       = 1000

This means that the job priority is almost entirely determined by the FairShare of the project. The PriorityWeightAge is so much smaller that the age of a job will never affect the ordering of projects, only the ordering of jobs belonging to the same project.
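
The effect of the two weights can be sketched as follows. The multifactor plugin computes job priority as a weighted sum of several factors, each normalized to the range 0 to 1. This example keeps only the two factors whose weights are listed above, and the project names and factor values are invented for illustration.

```python
PRIORITY_WEIGHT_FAIRSHARE = 1_000_000
PRIORITY_WEIGHT_AGE = 1_000


def job_priority(fairshare_factor: float, age_factor: float) -> float:
    """Weighted sum of the fair-share and age factors (both between 0 and 1)."""
    return (
        PRIORITY_WEIGHT_FAIRSHARE * fairshare_factor
        + PRIORITY_WEIGHT_AGE * age_factor
    )


# (project, job, fairshare_factor, age_factor); all values are hypothetical.
jobs = [
    ("project_a", "old job", 0.30, 1.0),  # has waited the maximum time
    ("project_a", "new job", 0.30, 0.0),  # just submitted
    ("project_b", "new job", 0.31, 0.0),  # just submitted, slightly better fair share
]

ranked = sorted(jobs, key=lambda j: job_priority(j[2], j[3]), reverse=True)
for project, job, fs, age in ranked:
    print(f"{project:10s} {job:8s} priority={job_priority(fs, age):.0f}")
```

Here project_b's new job is ranked first even though project_a's old job has the maximum age factor, while age does decide the order of project_a's two jobs. In this simplified model, age could only change the order between projects if their fair-share factors differed by less than 0.001.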
