Scheduling policy and limits

  • Jobs are primarily started in priority order. The priority of a job is determined by how much the project associated with the job has run recently in relation to its allocated computing time (“fair-share scheduling”). [1]
  • Lower-priority jobs are sometimes started using “backfill” scheduling if this can be done without affecting higher-priority jobs. [2]
  • Once started, a job is never terminated/preempted to allow a higher-priority job to start.
  • A project can have at most its monthly allocation of GPU hours scheduled at the same time. Exceeding this limit results in new jobs being prevented from being scheduled, with the reason AssocGrpGRESRunMinutes shown in squeue, until the project is once again under its limit.
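
As a rough illustration of the GPU-hour limit above, the check can be pictured as summing, over a project's running jobs, the number of GPUs multiplied by the remaining time limit, and comparing the total with the monthly allocation. The Python sketch below assumes this accounting. The class, the function names and the numbers are hypothetical, and this is not the scheduler's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class RunningJob:
    gpus: int               # GPUs allocated to the job
    remaining_hours: float  # time limit minus elapsed time (assumed accounting)


def committed_gpu_hours(jobs):
    """GPU hours the project's running jobs could still consume."""
    return sum(job.gpus * job.remaining_hours for job in jobs)


def can_start(jobs, new_gpus, new_time_limit_hours, monthly_gpu_hours):
    """True if starting the new job keeps the project within its limit."""
    requested = new_gpus * new_time_limit_hours
    return committed_gpu_hours(jobs) + requested <= monthly_gpu_hours


# Hypothetical project with a 1000 GPU-hour monthly allocation.
running = [
    RunningJob(gpus=8, remaining_hours=48.0),   # 384 GPU hours
    RunningJob(gpus=4, remaining_hours=100.0),  # 400 GPU hours
]

print(can_start(running, 4, 24.0, 1000.0))  # 784 + 96 = 880, within the limit
print(can_start(running, 8, 72.0, 1000.0))  # 784 + 576 = 1360, over the limit
```

Under this assumption, requesting a realistic time limit rather than the maximum leaves more room for further jobs from the same project.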

Additional details

  1. Within a project, fair-share is also used, so that users who have run a lot recently have lower priority than users who have run little recently.

  2. Backfilling is the process of scheduling jobs on compute nodes that would otherwise be idle while waiting for enough nodes to become available to start a large job. If there are idle nodes available and a lower-priority job can be started without affecting the estimated start time of the highest-priority job, the lower-priority job is started. If more than one lower-priority job could be started using backfill, the highest-priority one is selected. In general, jobs shorter than a few hours have a good chance of being started using backfill. However, please don't make your jobs too short; see this page for why.
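
The backfill rule in note 2 can be sketched in a few lines of Python. This is a deliberately simplified model, not the scheduler's real algorithm: it assumes a waiting job may be backfilled only if it fits on the currently idle nodes and its time limit ends before the estimated start of the highest-priority job. All names and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    priority: int            # higher value = higher priority
    nodes: int               # nodes requested
    time_limit_hours: float  # requested wall time


def pick_backfill_job(queue, idle_nodes, hours_until_top_job_starts):
    """Return the highest-priority waiting job that fits on the idle nodes
    and ends before the top job's estimated start, or None if none fits."""
    candidates = [
        job for job in queue
        if job.nodes <= idle_nodes
        and job.time_limit_hours <= hours_until_top_job_starts
    ]
    return max(candidates, key=lambda job: job.priority, default=None)


# Hypothetical situation: 4 nodes are idle, and the highest-priority job
# (not in this list) is estimated to start in 6 hours.
waiting = [
    Job("long", priority=50, nodes=2, time_limit_hours=24.0),
    Job("wide", priority=40, nodes=8, time_limit_hours=1.0),
    Job("short-a", priority=30, nodes=2, time_limit_hours=3.0),
    Job("short-b", priority=20, nodes=1, time_limit_hours=2.0),
]

chosen = pick_backfill_job(waiting, idle_nodes=4, hours_until_top_job_starts=6.0)
print(chosen.name if chosen else "nothing to backfill")
```

In this example the job "long" has the highest priority among the waiting jobs but cannot be backfilled, because its 24-hour time limit would delay the top job. The shorter job "short-a" is selected instead, which illustrates why a tight time limit improves the chance of an early start.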

