Key differences between Triolith and previous systems

16 cores per node

If you have hard-coded the number of cores per node, please note that Triolith has 16 cores per compute node, not 8 as on the previous generation (e.g Neolith/Kappa/Matter).

NOTE: It is not recommended to hard-code the number of cores in this way. It will break your jobs if you run on e.g the "huge" Kappa nodes. Please use the relevant SLURM environment variables instead, e.g $SLURM_JOB_CPUS_ON_NODE. For more information, read the sbatch man page.
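
For example, a job script along these lines picks up the core count from SLURM instead of hard-coding it (a minimal sketch; "myapp" and its --threads option are placeholders for your own application):

#!/bin/bash
#SBATCH -t 00:10:00
#SBATCH -N1
#SBATCH --exclusive
#
# Use the number of cores SLURM allocated on this node instead of a hard-coded 8 or 16.
./myapp --threads=$SLURM_JOB_CPUS_ON_NODE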

Using the local disk in compute nodes

There is still a local scratch disk available on each node, but you can no longer write files directly to /scratch/local. Instead use the environment variable $SNIC_TMP, which will be set to a directory that will be created for each job (and deleted when the job ends).

E.g: if your old job script looks like this

#!/bin/bash
#SBATCH -t 00:10:00
#
./myapp --tempdir=/scratch/local

then change it to

#!/bin/bash
#SBATCH -t 00:10:00
#
./myapp --tempdir=$SNIC_TMP

Note: --tempdir is only used in this example; it is not a magic option that you can use with any application to control where it writes temporary files!
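
If your application cannot be told where to put its temporary files at all, a common pattern is to copy the input to $SNIC_TMP, run the job there, and copy the results back before the job ends (a sketch with placeholder file names; $SLURM_SUBMIT_DIR is the standard SLURM variable pointing to the directory you submitted from):

#!/bin/bash
#SBATCH -t 00:10:00
#
# Run in the node-local job directory and save the results before
# $SNIC_TMP is deleted at the end of the job.
cp input.dat $SNIC_TMP/
cd $SNIC_TMP
$SLURM_SUBMIT_DIR/myapp input.dat
cp results.dat $SLURM_SUBMIT_DIR/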

This change is being done to enable sharing of nodes between jobs.

By the way: there are other standardized SNIC environment variables that you can use to make your job script more portable across all SNIC HPC sites:

SNIC standardized environment variables:

SNIC_SITE - the SNIC site you are running at. On Triolith: nsc
SNIC_RESOURCE - the compute resource you are running on at the SNIC site $SNIC_SITE (see above). On Triolith: triolith
SNIC_BACKUP - shared directory with tape backup. On Triolith: /home/$USERNAME
SNIC_NOBACKUP - shared directory without tape backup. On Triolith: /proj/PROJECT/users/USERNAME (if such a directory exists)
SNIC_TMP - recommended directory for best performance during a job (local disk on nodes, if applicable). On Triolith: set for each individual job, e.g /scratch/local/12345
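
A minimal job script that just prints these variables can be used to check what they point to on the site where the job runs:

#!/bin/bash
#SBATCH -t 00:01:00
#
# Print the standardized SNIC variables for the current site and job.
echo "Site:               $SNIC_SITE"
echo "Resource:           $SNIC_RESOURCE"
echo "Backed-up storage:  $SNIC_BACKUP"
echo "No-backup storage:  $SNIC_NOBACKUP"
echo "Job-local scratch:  $SNIC_TMP"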

The "module" command can be used in job scripts

Sometimes it can be useful to use the "module" command in job scripts. By doing this, you do not have to remember to load certain modules before submitting the job.

Example:

#!/bin/bash 

#SBATCH --time=10:00:00
#SBATCH --nodes=2
#SBATCH --exclusive

module load someapp/1.2.3

someapp my_input_file.dat

If you write your job scripts using /bin/bash, /bin/csh or /bin/tcsh, the "module" command is automatically available from your script.

If you use /bin/ksh or /bin/sh, the module command is not available.

Batch jobs and scheduling

Take a look at the Triolith specific batch job and scheduling information.

Optimizing your code for Triolith

Recompile your own applications! If you have previously compiled your own software we definitely recommend recompiling it on Triolith. See instructions on this page for how to build your applications on Triolith.

There are two compiler suites supported by NSC on Triolith: Intel (intel) and the GNU Compiler Collection (gcc). Other compilers may be installed and made accessible in due course, but support will be restricted to intel and gcc for the foreseeable future. The optimization instructions below only refer to these.

Intel Compilers

Support for the Sandybridge type of processor in Triolith is enabled with the -xAVX switch, which applies to all Intel compilers (icc, ifort and icpc). Alternatively, if you compile on Triolith you can use the -xHost switch, which targets the highest instruction set available on the compilation host, effectively -xAVX on Triolith.

Code compiled this way on Triolith can only be run on Intel processors supporting the AVX instruction set; at the time of writing, only Intel Sandybridge and Ivybridge based CPUs (the latter presently consumer-level processors) support AVX. If support for generic x86 processors (earlier Intel CPUs and AMD CPUs) is desired, the -axAVX switch can be attempted. There may be a performance penalty on Triolith when using this option, which has to be checked on a case-by-case basis.

Regarding the global optimization level switch -O<X>, where X is between 0 and 3 for the Intel compilers, it is tempting to turn this all the way up to 3. However, this will not always yield better-performing binaries; they often perform worse than those built with the default -O2, and it will invariably lead to more compilation trouble for any decent-sized code. If you still choose to try the -O3 switch, it is good practice to also add the -no-ipo switch, which removes many problems related to the use of -O3.

If you have OpenMP code to compile, you also need to add the -openmp switch to enable OpenMP in the Intel compilers.

Examples:

ifort -O2 -xAVX -o mybinary mycode.f90

icc -O3 -no-ipo -xAVX -o mybinary mycode.c

icpc -xHost -openmp -o mybinary mycode.cpp #default global optimization level is "-O2"

GNU Compiler Collection

The GNU compilers shipped with CentOS 6 (the operating system on Triolith) were released well before the Intel Sandybridge line of processors. The support for AVX is therefore not as well developed on these compilers as in the Intel compilers. There is some support however and later compiler releases can be expected to produce better performing binaries.

A good choice of GCC compiler flags on Triolith is -O3 -mavx -march=native for any installed version of GCC, either those shipped with CentOS 6 or those installed by NSC and accessible via the module system. A binary built this way will run exclusively on AVX-capable CPUs. The choice of -O3 is safe for the GCC compilers in general, as the developers are more conservative with respect to numerically less precise code generation.

If you instead desire a binary capable of running on generic x86 CPUs while retaining some tuning, similar to the Intel -axAVX switch, you could consider the switches -O3 -mtune=native -msse<X> with a suitable value for <X>; e.g. -msse3 should let the binary run on the vast majority of current HPC CPUs from both AMD and Intel. If your code uses OpenMP, use the -fopenmp switch to make use of this feature.

Examples:

gfortran -O3 -mavx -march=native -o mybinary mycode.f90

gcc -O3 -mtune=native -msse3 -o mybinary mycode.c

g++ -O3 -mavx -march=native -fopenmp -o mybinary mycode.cpp

The normal NSC compiler wrappers and mpprun are available, so to build and run an MPI application you only need to load a module containing an MPI (e.g build-environment/nsc-recommended) and add the -Nmpi flag when compiling.

Example:

module add build-environment/nsc-recommended
icc -Nmpi -o myapp myapp.c

To run such an application you only need to use mpprun to start it, e.g

mpprun ./myapp

The compiler wrapper and mpprun will handle the compiler options needed to build against the loaded MPI version (Intel MPI in this case; it is part of build-environment/nsc-recommended) and how to launch an MPI application built against that MPI.

MPI

Currently both Intel MPI and OpenMPI are installed and supported on Triolith.

NSC recommends Intel MPI, as it has shown the best performance for most applications. However, if your application does not work/compile with Intel MPI or gets better performance when using OpenMPI, please use that instead.

You can see which versions of Intel MPI and OpenMPI are installed by running "module avail" (look for "impi" and "openmpi").

Intel MPI is loaded in the build-environment/nsc-recommended module. To use Intel MPI with the Intel compilers, just load build-environment/nsc-recommended. To use OpenMPI, load build-environment/nsc-recommended and then load an openmpi module (which will then unload the impi module).
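
As a sketch, switching to OpenMPI could look like this (the exact openmpi module name and version differ over time; replace VERSION with one of the versions listed by "module avail"):

module load build-environment/nsc-recommended
# Loading an openmpi module unloads the impi module that nsc-recommended loaded.
module load openmpi/VERSION
icc -Nmpi -o myapp myapp.c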

The recommended way to build MPI binaries at NSC is to load an MPI module (e.g module load build-environment/nsc-recommended), then compile your application normally.

Our compiler wrappers will figure out how to link your application against the version of MPI corresponding to the module that you loaded.

Once built, we recommend that you use our MPI launcher mpprun. When launching the binary using mpprun, you do not need to specify how many ranks to start or which MPI should be used, mpprun will figure that out from the binary and the job environment.
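
Putting the pieces together, a complete MPI job script could look something like this (a sketch; myapp is a placeholder for your own binary):

#!/bin/bash
#SBATCH --time=10:00:00
#SBATCH --nodes=2
#SBATCH --exclusive
#
# mpprun figures out which MPI the binary was built against and how many
# ranks to start from the job environment.
mpprun ./myapp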

Important note regarding OpenMPI performance

The currently (2012-10-24) installed OpenMPI version is close to Intel MPI performance-wise only if you set the core binding yourself using extra flags to mpprun (unlike Intel MPI, where this is done by default). We are working on incorporating this into mpprun, but there are many corner cases to work out. Until this is done, we recommend that you launch your OpenMPI applications like this:

mpprun --pass="--bind-to-core --bysocket" /software/apps/vasp/5.3.2-13Sep12/openmpi/vasp-gamma

MKL

The Intel Math Kernel Library (MKL) is available, and we strongly recommend using it. Several versions of MKL may exist; you can see which versions are available with the "module avail" command.
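
For example, with the Intel compilers one convenient way to link against MKL is the -mkl flag (shown as a sketch; see the Math libraries page for the link lines recommended at NSC):

# Link against MKL's BLAS/LAPACK using the Intel compiler's -mkl convenience flag.
ifort -O2 -xAVX -mkl -o mybinary mycode.f90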

See Math libraries for more information.

Common problems

Not specifying how many cores you want

Note: if you have used the "-N" option (e.g -N2) on e.g Neolith just to get a number of full nodes, on Triolith you also need to add --exclusive (or some other means of specifying the number of cores to allocate).

sbatch -N2 will give you a total of two cores spread out over two nodes, which is probably not what you want.
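
For example, to get two full nodes (2 x 16 = 32 cores on Triolith), add --exclusive (myjob.sh being your job script):

sbatch -N2 --exclusive myjob.sh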

Cannot submit jobs when you are a member of multiple projects?

If you are a member of a single project, the scheduler assumes that you will run all your jobs using that project.

When you are a member of multiple projects, however, the scheduler cannot decide for you which project to use, so you need to specify a project for each job.

If you don't specify a project when you're a member of multiple projects, you will get the following error:

sbatch: error: Batch job submission failed: Invalid account or account/partition combination specified

Note: since the PI of a project may add new members at any time, you might suddenly find yourself a member of multiple projects, so if you want to be on the safe side, always specify which project to use, even if you are currently a member of just one.

You can specify which project to use for a job in several ways:

  1. Add -A PROJECT_NAME or --account=PROJECT_NAME as an option to sbatch or interactive on the command line (e.g interactive -A snic-123-456)
  2. Add #SBATCH -A PROJECT_NAME or #SBATCH --account=PROJECT_NAME to your job script
  3. Set the environment variable $SBATCH_ACCOUNT, e.g export SBATCH_ACCOUNT=PROJECT_NAME

Note: replace PROJECT_NAME in the examples with your actual project name, which you can find in NSC Express (the Resource Manager Name line when you view a project) or by running projinfo on Triolith.
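
For example, a job script that always charges its jobs to a specific project could start like this (again, replace snic-123-456 with your actual project name):

#!/bin/bash
#SBATCH -A snic-123-456
#SBATCH -t 00:10:00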