Gimle User Guide
Short Description
Bore/Gimle is a Linux-based cluster with 140 HP ProLiant DL160 G5 and 128 HP ProLiant DL170h G6 compute servers with a combined peak performance of 20 Tflops. Each DL160 compute server is equipped with two quad-core Intel® Xeon® E5462 processors, while each DL170h compute server contains two quad-core Intel® Xeon® E5520 processors. The installation also includes 7 ProLiant DL380 G5 system servers that handle cluster storage and administration tasks. In total, the cluster has over 4.5 TiB of main memory. The compute nodes communicate over a high-speed network based on InfiniBand equipment from Cisco and Voltaire.

The compute servers are split between the Bore and Gimle parts of the cluster. The Bore part is dedicated to weather forecast production. The Gimle part, the topic of the rest of this guide, is used for research and development. As of May 30, 2011, the Gimle part has 108+120 nodes, with the rest assigned to Bore.

The environment on Gimle is based on the modern environment developed for the SNIC cluster Neolith. Please take your time and learn more about Gimle from the information in this user guide.

Hardware
Software
Quickstart Guide
Security and Accessing the System

Accessing the System

Log in to Gimle with ssh
To log in to the system, use the username provided to you by NSC and issue:
$ ssh username@gimle.nsc.liu.se
Unix: OpenSSH
Windows: PuTTY

Both OpenSSH and PuTTY can be used for "X forwarding": with ssh, add the command-line flag -X; with PuTTY, toggle "Enable X11 forwarding" in the preferences. Note that X forwarding may require additional configuration of your local machine, e.g. you will need an X server. Please consult your local system administrator if you run into trouble.

File transfer is available using scp, sftp, or sshfs.
Security
When a system is compromised and passwords are stolen, the greatest damage occurs when a stolen password also works on other systems. A user who has accounts on many different computers and shares one password between them allows intruders, once that password is stolen, to easily cross administrative domains and compromise further systems.
Logging into a system and then traversing from that system to another one in a chain (as illustrated below) should be avoided.

[Figure: logging in from one system to another in a chain]

When logging into a system, please check the "last login" information shown. If you cannot verify the information, contact smhi-support@nsc.liu.se as soon as possible.
SSH Public-key Authentication
There is an alternative to traditional passwords, known as key-pair or public-key authentication. While a password is simple to understand (the secret is in your head until you give it to the ssh server, which grants or denies access), a key pair is somewhat more complicated. As the name suggests, a key pair consists of two cryptographic keys: the private key, which should be kept as secure as possible and protected with a good pass phrase, and the public key, which can be passed around freely. After you have created the pair, you have to copy the public key to every system to which you wish to establish an ssh connection. On your laptop/workstation you use a key agent to hold the private key while you work.

Benefits and drawbacks:
Short description of SSH public-key authentication (see also Chapter 4 in SSH tips, tricks & protocol tutorial by Damien Miller):
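The workflow described above (create a key pair, copy the public key to the remote system, hold the private key in a key agent) can be sketched with the standard OpenSSH tools. This is only a sketch under stated assumptions, not an official NSC procedure: the function names, the key type and size, and the example key file name are chosen for illustration; only the host name comes from this guide.

```shell
#!/bin/bash
# Sketch of public-key setup with standard OpenSSH tools.
# Assumptions (not from this guide): function names, key type/size,
# and the example key file name used in the comments.

GIMLE_HOST="gimle.nsc.liu.se"

# The public key is stored next to the private key with a .pub suffix.
public_key_path() {
    printf '%s.pub\n' "$1"
}

# Step 1: create a key pair. ssh-keygen prompts for the pass phrase
# that protects the private key.
create_keypair() {
    local keyfile="$1"            # e.g. ~/.ssh/gimle_key (hypothetical name)
    ssh-keygen -t rsa -b 4096 -f "$keyfile"
}

# Step 2: copy the PUBLIC key (never the private one) to the remote system.
install_public_key() {
    local keyfile="$1" user="$2"
    ssh-copy-id -i "$(public_key_path "$keyfile")" "${user}@${GIMLE_HOST}"
}

# Step 3: start a key agent in this shell and let it hold the private key,
# so the pass phrase is typed only once per session.
load_key_into_agent() {
    local keyfile="$1"
    eval "$(ssh-agent -s)"
    ssh-add "$keyfile"
}
```

On your own workstation you would call the three functions in order, for example create_keypair ~/.ssh/gimle_key, then install_public_key ~/.ssh/gimle_key username, then load_key_into_agent ~/.ssh/gimle_key.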
Storage

Available file systems
Users have access to different file systems on Gimle. Below is a list of available file systems and their respective total sizes. Note, however, that the available size per user may be limited by quotas. Use the command
$ quota -s
to see your own quotas.
/home, used for important data
The home file system is mounted at /home on each machine in the cluster and is backed up on a daily basis. Each user has their own home directory (see the environment variable HOME).

/nobackup, used for scratch data
The nobackup file systems are mounted on subdirectories of /nobackup/ on each machine in the cluster and are not backed up. Each user has their own directories /nobackup/filesystem/$USER (where $USER is the username of the user in question).

/scratch/local, used for local scratch data
On each compute node, there is a node-local file system mounted at /scratch/local. This can be useful for certain applications.

/software, contains applications
Common applications installed by NSC are found on the /software file system, which is accessible from every machine in the cluster. This file system is not user writable.

Publishing data to non-Gimle users
Gimle is connected to the SMHI Publisher system, which allows Gimle users to copy data to a publishing server, from where it can be downloaded by users without the need for a Gimle account. Please read the Publisher User Guide for more information.

Environment
We use cmod (module) to handle the environment when several versions of the same software are installed. This application sets up the correct paths to the binaries, man pages, libraries, etc. for the currently selected module. The correct environment is set up by using the module command. A list of some subcommands to module includes:
A default environment is automatically declared when you log in. The default modules are:

[username@gimle ~]$ module list
Currently loaded modules:
  1) ifort
  2) icc
  3) idb
  4) dotmodules
  5) base-config
  6) default

To find out which version of the compiler the module ifort refers to, you may list all modules:

[username@gimle ~]$ module avail
In directory /etc/cmod/modulefiles:
  -base-config/1 (def)      -ifort/9.1
  -base-config/default      -ifort/9.1.052
  +default                  -ifort/default
  +dotmodules               -intel/10.1
  -icc/10.1 (def)           -intel/9.1
  -icc/10.1.011             -intel/default
  -icc/10.1.017             -mkl/10.0.3.020 (def)
  -icc/9.1                  -mkl/9.1.023
  -icc/9.1.052              -mkl/default
  -icc/default              -openmpi/1.2.3-g411
  -idb/10.1 (def)           -openmpi/1.2.3-i100025
  -idb/10.1.011             -openmpi/1.2.4-i100026
  -idb/10.1.017             -openmpi/1.2.5-i101011 (def)
  -idb/9.1                  -openmpi/default
  -idb/9.1.052              -pyenv/default
  -idb/default              -pyenv/nsc1 (def)
  -ifort/10.1 (def)         -scampi/3.12.0-1 (def)
  -ifort/10.1.011           -scampi/default
  -ifort/10.1.017

The note "(def)" indicates which version is the default; for the Fortran compiler it is thus version 10.1. Please note, however, that the choice of default module may change over time. Therefore, if you wish to re-compile part of a program and link a new executable, you may need to ensure that you are using the same version of the compiler that you had at the time of the first build. You can switch to another version of the compiler as follows:

[username@gimle ~]$ module list
Currently loaded modules:
  1) ifort
  2) icc
  3) idb
  4) dotmodules
  5) base-config
  6) default
[username@gimle ~]$ module unload ifort
[username@gimle ~]$ module list
Currently loaded modules:
  1) icc
  2) idb
  3) dotmodules
  4) base-config
  5) default
[username@gimle ~]$ module load ifort/9.1.052
[username@gimle ~]$ module list
Currently loaded modules:
  1) icc
  2) idb
  3) dotmodules
  4) base-config
  5) default
  6) ifort/9.1.052

Hint: The environment is specified in the files located under /etc/cmod/modulefiles.
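Because the default module may change over time, a build script can pin the compiler version explicitly instead of relying on the default. The following is a minimal sketch that uses only the module subcommands shown above (unload, load, list); the function name use_module_version is a hypothetical helper, and it assumes that the module command is available in the shell that runs the script.

```shell
#!/bin/bash
# Hypothetical helper: replace whatever version of a module is currently
# loaded with an explicitly named version, then show the resulting list.
use_module_version() {
    local name="$1" version="$2"
    module unload "$name"               # drop the currently loaded version
    module load "${name}/${version}"    # load the pinned version
    module list                         # show what is loaded now
}

# Example, to be run on Gimle before compiling:
#   use_module_version ifort 9.1.052
```

Calling the helper at the top of a build script records the compiler version in the script itself, so a later rebuild uses the same compiler even if the default has moved on.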
Resource Name Environment Variable
If you are using several NSC resources and copying scripts between them, it can be useful for a script to have a way of knowing what resource it is running on. You can use the NSC_RESOURCE_NAME variable for that:

[username@gimle ~]$ echo "Running on $NSC_RESOURCE_NAME"
Running on gimle

Compiling
We recommend using the Intel compilers: ifort (Fortran), icc (C), and icpc (C++).

Compiling OpenMP Applications
Example: compiling the OpenMP program openmp.f with ifort:
$ ifort -openmp openmp.f
Example: compiling the OpenMP-program, openmp.c with icc:
$ icc -openmp openmp.c
Compiling MPI Applications
Before compiling an MPI application you should load an MPI module. We recommend Scali MPI, which is added to your environment with the command:
$ module add scampi
Example: compiling the MPI-program, mpiprog.f with ifort:
$ ifort -Nmpi mpiprog.f
where mpiprog.f is:
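      program mpiprog
      implicit none
      include "mpif.h"
C     error: MPI status code, rank: this process, size: process count
C     (mpi_comm_world is provided by mpif.h and is not declared here)
      integer error, rank, size
C     Start MPI, then query the rank and the number of processes
      call mpi_init(error)
      call mpi_comm_rank(mpi_comm_world,rank,error)
      call mpi_comm_size(mpi_comm_world,size,error)
C
      print *, "Rank number", rank, " of total", size, "."
C     Shut down MPI
      call mpi_finalize(error)
C
      end program mpiprog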
Example: compiling the MPI-program, mpiprog.c with icc:
$ icc -Nmpi mpiprog.c
Compiler Wrappers
When you invoke any of the Intel compilers (icc, ifort, or icpc), a wrapper script looks for Gimle-specific options. Options starting with -N are used by the wrapper to affect the compilation and/or linking processes, but these options are not passed to the compiler itself.
For example:

$ module load mkl
$ ifort -Nverbose -Nmkl -o example example.F -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_lapack -lmkl_core -openmp -lpthread
ifort INFO: Linking with MKL mkl/10.0.3.020.
ifort INFO: -Nmkl resolved to: -I/software/intel/mkl/10.0.3.020/include -L/software/intel/mkl/10.0.3.020/lib/em64t -Wl,--rpath,/software/intel/mkl/10.0.3.020/lib/em64t

The wrappers add tags to the executables with information regarding the compilation and linking. You may use the dumptag command to get a list of these labels:

[user@gimle ~]$ dumptag mpiprog.x
-- NSC-tag ----------------------------------------------------------
File name: /home/kent/mpiprog.x
Properly tagged: yes
Tag version: 4
Build date: 080702
Build time: 142958
Built with MPI: scampi 3_12_0_1
Built with MKL: no (or build in an unsupported way)
Linked with: ifort 10_1_011
---------------------------------------------------------------------
[user@gimle ~]$

Useful Options for the Intel Compilers
Below is a short list of useful compiler options.

Optimization
There are three different optimization levels in Intel's compilers and then some more knobs to turn:
Recommended optimization options
Debugging
Profiling
Options that only apply to Fortran programs
Miscellaneous
Little endian to big endian conversion in Fortran is done through the F_UFMTENDIAN environment variable. When set, the following operations are done:
Math libraries

MKL, Intel Math Kernel Library
The Intel Math Kernel Library (MKL) is available, and we strongly recommend using it. Several versions of MKL may exist; you can see which versions are available with the "module avail" command. The instructions here are valid for MKL 10.0 and newer; older versions worked differently. The library includes the following groups of routines:
Full documentation can be found online at http://www.intel.com/software/products/mkl/ and in ${MKL_ROOT}/doc on Gimle.

Library structure
The Intel MKL is located in the /software/intel/mkl/ directory. The MKL consists of two parts: a linear algebra package and processor-specific kernels. The former part contains LAPACK and ScaLAPACK routines and drivers that were optimized without regard to processor, so that they can be used effectively on different processors. The latter part contains processor-specific kernels such as BLAS, FFT, BLACS, and VML that were optimized for the specific processor.

Linking with MKL
To use LAPACK and BLAS software you must link several libraries: MKL LAPACK and the threaded or sequential kernel. The required MKL path is added automatically by the compiler wrapper if the option -Nmkl is given and the appropriate MKL module is loaded. This table lists the most common MKL link options. See the following chapter for examples.
MKL and threading
Whether threaded or sequential MKL gives the best performance varies between applications. MPI applications typically launch one MPI rank on each processor core of each node; in this case threads are not needed, as all cores are already in use. However, if you use threaded MKL you can start fewer ranks per node and increase the number of threads per rank accordingly. The threading of MKL can be controlled at run time through the use of a few special environment variables.
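The trade-off between ranks per node and threads per rank is simple arithmetic, sketched below. The sketch assumes 8 cores per node (two quad-core processors, as described in this guide) and uses MKL_NUM_THREADS and OMP_NUM_THREADS, which are the standard Intel MKL and OpenMP thread-count variables; whether these are exactly the variables the list above refers to is an assumption. The function name threads_per_rank is hypothetical, and how to start fewer ranks per node is not shown here.

```shell
#!/bin/bash
CORES_PER_NODE=8   # two quad-core processors per compute node

# Hypothetical helper: print how many MKL threads each MPI rank can use
# when the given number of ranks shares one node.
threads_per_rank() {
    local ranks_per_node="$1"
    if [ "$ranks_per_node" -lt 1 ] || [ "$ranks_per_node" -gt "$CORES_PER_NODE" ]; then
        echo "ranks per node must be between 1 and $CORES_PER_NODE" >&2
        return 1
    fi
    echo $(( CORES_PER_NODE / ranks_per_node ))
}

# Example: with 2 ranks per node, each rank gets 8 / 2 = 4 threads.
MKL_NUM_THREADS="$(threads_per_rank 2)"
export MKL_NUM_THREADS
export OMP_NUM_THREADS="$MKL_NUM_THREADS"
echo "MKL_NUM_THREADS=$MKL_NUM_THREADS"
```

With one rank per core (8 ranks per node) the helper yields 1 thread per rank, which corresponds to the case where threads are not needed.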
Example, dynamic linking using ifort and LAPACK
Use MKL LAPACK and threaded MKL:

$ module load mkl
$ ifort -Nmkl -o example example.o -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_lapack -lmkl_core -openmp -lpthread
ifort INFO: Linking with MKL mkl/10.0.3.020.

Use MKL LAPACK and sequential MKL:

$ module load mkl
$ ifort -Nmkl -o example example.o -lmkl_intel_lp64 -lmkl_sequential -lmkl_lapack -lmkl_core
ifort INFO: Linking with MKL mkl/10.0.3.020.

Example, linking with MKL ScaLAPACK and OpenMPI
ScaLAPACK depends on BLACS, LAPACK, and BLAS (in that order), where the BLACS library also depends on an underlying MPI. Therefore, it is important to choose the correct combination of libraries in the right order when linking a program with ScaLAPACK. MKL is shipped with BLACS libraries that are precompiled for OpenMPI and IntelMPI (the latter is not installed on Gimle). To link a program with ScaLAPACK and OpenMPI:

$ module load mkl
$ module load openmpi
$ ifort -Nmkl -Nmpi -o my_binary my_code.f90 -lmkl_scalapack_lp64 -lmkl_blacs_openmpi_lp64 \
  -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_lapack -lmkl_core -openmp -lpthread
ifort INFO: Linking with MPI openmpi/1.2.5-i101011.
ifort INFO: Linking with MKL mkl/10.0.2.018.

Example, linking with ScaLAPACK, alternatives to MKL and OpenMPI
By default we recommend the above combination (OpenMPI + MKL), but there are alternatives. Both mvapich2 and IntelMPI are derived from the same code base (mpich2), and mvapich2 can (usually) be used as a drop-in replacement for IntelMPI. Compared to the OpenMPI + MKL example above, use blacs_intelmpi instead of blacs_openmpi, i.e.:

$ module load mkl
$ module load mvapich2
$ ifort -Nmkl -Nmpi -o my_binary my_code.f90 -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64 \
  -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_lapack -lmkl_core -openmp -lpthread
ifort INFO: Linking with MPI mvapich2/1.0.2-i101011.
ifort INFO: Linking with MKL mkl/10.0.2.018.

It is also possible to use ScaliMPI by using the "vanilla" netlib ScaLAPACK and BLACS and linking them against your LAPACK/BLAS of choice. If your choice of LAPACK/BLAS is MKL (generally the best choice):

$ module load mkl
$ module load scampi
$ sppath=/software/libs/scalapack/1.8.0/i101011
$ blpath=/software/libs/BLACS/i101011/LIB-scamp
$ ifort -Nmkl -Nmpi -o my_binary my_code.f90 $sppath/libscalapack.a \
  $blpath/blacsF77init_MPI-Gimle-0.a $blpath/blacs_MPI-Gimle-0.a \
  -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_lapack -lmkl_core -openmp -lpthread

Executing Parallel Jobs
There are two main alternatives for developing program code that can be executed on multiple processor cores: OpenMP and MPI. OpenMP can be used to parallelize code that is to run within a single node (with up to 8 cores), whereas MPI is used to parallelize code that can run on a single node as well as on multiple nodes. The two types of applications are executed differently.

Executing an MPI application
An MPI application is started with the command:

$ mpprun mpiprog.x

Use "mpprun --help" to get a list of options and a brief description.

Note:
Executing an OpenMP application
The number of threads to be used by the application must be defined, and should be less than or equal to eight. You can set the number of threads in two ways: either by defining a shell environment variable before starting the application, or by calling an OpenMP library routine in the serial portion of the code.
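The first of the two ways, defining a shell environment variable, can be sketched as follows. OMP_NUM_THREADS is the standard OpenMP variable for the thread count; the helper clamp_threads and the executable name in the final comment are hypothetical and only illustrate the limit of eight threads per node.

```shell
#!/bin/bash
MAX_THREADS=8   # cores per compute node, per this guide

# Hypothetical helper: clamp a requested thread count to 1..MAX_THREADS.
clamp_threads() {
    local requested="$1"
    if [ "$requested" -gt "$MAX_THREADS" ]; then
        echo "$MAX_THREADS"
    elif [ "$requested" -lt 1 ]; then
        echo 1
    else
        echo "$requested"
    fi
}

# Ask for 12 threads; the helper reduces this to the 8 cores of a node.
OMP_NUM_THREADS="$(clamp_threads 12)"
export OMP_NUM_THREADS
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"

# Start the OpenMP executable afterwards, e.g.:
#   ./openmp.x     (hypothetical name of the compiled program)
```

The variable must be exported before the application starts, since the OpenMP runtime reads it at program start-up.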
Submitting Jobs
The batch queue system consists of two parts: (i) the SLURM resource manager and (ii) the Moab scheduler. There are two ways to submit jobs to the batch queue system: as an interactive job or as a batch job. Interactive jobs are most useful for debugging, as you get interactive access to the input and output of the job while it is running. The normal way to run applications, however, is to submit them as batch jobs.

Interactive job submission
Interactive access to the compute nodes is provided by the command interactive. This command accepts the same options as the sbatch command described below. To start an interactive job allocating 2 nodes and 10 cores for 10 minutes, type:

$ interactive -N 2 -n 10 -t 00:10:00

Note: If you leave out the "-n 10" argument, you will by default be given all available cores (in this case 16).

Once your interactive job has started, you are logged in to the first node in the list of nodes assigned to the job. An environment has been created for you that, in addition to the ordinary variables, also contains a number of SLURM environment variables:

[user@n212 ~]$ env | grep -i slurm
SLURM_NODELIST=n[212-213]
SLURMD_NODENAME=n212
SLURM_PRIO_PROCESS=0
SLURM_NNODES=2
SLURM_JOBID=5341
SLURM_TASKS_PER_NODE=8(x2)
STY=1755.slurm5341
SLURM_JOB_ID=5341
SLURM_UMASK=0022
SLURM_NODEID=0
SLURM_TASK_PID=1755
SLURM_NPROCS=10
SLURM_PROCID=0
SLURM_JOB_NODELIST=n[212-213]
SLURM_LOCALID=0
SLURM_JOB_CPUS_PER_NODE=8(x2)
SLURM_GTIDS=0
SLURM_JOB_NUM_NODES=2
[user@n212 ~]$

Let us now run the trivial MPI Fortran application given above [mpiprog.f]:

[user@n212 ~]$ mpprun mpiprog.x
mpprun: INFO: using job specified number of tasks
mpprun: INFO: starting scampi run on 2 nodes (10 tasks)
Taking nodenames from "/tmp/tmp.hIniRn1821", number of nodes specified by -np
/opt/scali/bin/mpimon -stdin all mpiprog.x -- n212 5 n213 5
 Rank number 8 of total 10 .
 Rank number 1 of total 10 .
 Rank number 5 of total 10 .
 Rank number 6 of total 10 .
 Rank number 3 of total 10 .
 Rank number 7 of total 10 .
 Rank number 9 of total 10 .
 Rank number 0 of total 10 .
 Rank number 2 of total 10 .
 Rank number 4 of total 10 .
[user@n212 ~]$

Batch job submission
The two main commands for handling job submissions are:
Batch jobs are submitted to the queue system with the command sbatch:

$ sbatch -J jobname submit.sh

A minimal submit script that requests 2 nodes and 16 cores for 10 minutes may look like:

#!/bin/bash
#SBATCH -N 2
#SBATCH -t 00:10:00
mpprun ./mpiprog.x
# End of script

Note the use of "#SBATCH" lines in the script. This is an alternative way of specifying options to the sbatch command. We could thus have specified the job name in the script with an additional line reading

#SBATCH -J jobname

Let us submit the above script:

[user@gimle ~]$ sbatch -J mpiprog submit.sh
sbatch: Submitted batch job 5351
[user@gimle ~]$

After the job has completed, the output to standard out and standard error (if not re-directed) is returned from the system in a file called slurm-JOBID.out. In this case, this is where we find the output from our program:

[user@gimle paralllel_program_test]$ cat slurm-5351.out
mpprun: INFO: number of tasks set to all cores on allocated nodes
mpprun: INFO: starting scampi run on 2 nodes (16 tasks)
Taking nodenames from "/tmp/tmp.IieKGI4556", number of nodes specified by -np
/opt/scali/bin/mpimon -stdin all ./mpiprog.x -- n212 8 n213 8
 Rank number 8 of total 16 .
 Rank number 11 of total 16 .
 Rank number 13 of total 16 .
 Rank number 10 of total 16 .
 Rank number 14 of total 16 .
 Rank number 15 of total 16 .
 Rank number 2 of total 16 .
 Rank number 12 of total 16 .
 Rank number 9 of total 16 .
 Rank number 1 of total 16 .
 Rank number 0 of total 16 .
 Rank number 4 of total 16 .
 Rank number 3 of total 16 .
 Rank number 6 of total 16 .
 Rank number 5 of total 16 .
 Rank number 7 of total 16 .
[user@gimle paralllel_program_test]$

All options to sbatch are described in the manual page:

$ man sbatch

The most useful options are listed below. They work for the interactive command too.
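A submit script like the one shown earlier in this section can also be generated from a small helper, which is convenient when the same job is submitted with different node counts or time limits. This is a minimal sketch that uses only options appearing in this guide (-J, -N, -t) and the mpprun launcher; the function name make_submit_script is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print a submit script for an MPI program.
# Arguments: job name, number of nodes, wall time (hh:mm:ss), program.
make_submit_script() {
    local jobname="$1" nodes="$2" walltime="$3" program="$4"
    cat <<EOF
#!/bin/bash
#SBATCH -J ${jobname}
#SBATCH -N ${nodes}
#SBATCH -t ${walltime}

mpprun ${program}
EOF
}

# Print the script for the example used in this section.
make_submit_script mpiprog 2 00:10:00 ./mpiprog.x
```

On Gimle, the output would be redirected to a file (for instance submit.sh) and passed to sbatch.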
Partitions
As Gimle now contains two types of nodes (which are also connected to separate InfiniBand interconnects), the two types must be kept apart. This is done using the SLURM partition concept. To run on the "old" nodes, specify partition harpertown (the codename of that processor generation). To run on the "new" nodes, specify partition nehalem (the codename of that processor generation). As of December 2009, different groups at SMHI are assigned to either the "old" or the "new" nodes. We try to set the SBATCH_PARTITION environment variable automatically to make sure that your jobs end up in the right partition. If that does not work, please set SBATCH_PARTITION yourself or use the -p flag to the sbatch and interactive commands.

Opportunistic jobs ("riskjobb")
Sometimes not all nodes of the system are running regular jobs, because of project restrictions or system reservations. To fill them up, you may use opportunistic jobs (our Swedish translation is "riskjobb"). These are able to bypass project and system restrictions but have two drawbacks:
When using a requeueable opportunistic job, please note that it may be interrupted at any point during execution and later rerun from the start. This works for many applications and scripts, but not for all. You will have to save restart information repeatedly within your job script, and you must be aware that the script might be cancelled in the middle of saving. An opportunistic job that is cancelled by the system will get a line like the one below in the SLURM output file:

*** JOB 297014 CANCELLED AT 08/28-09:12:40 ***

(You will get the same kind of message if you cancel the job yourself.)

Supervising Jobs
In many cases it is desirable to supervise your running and scheduled jobs in order to find out whether jobs have started or completed, how much of the allocated wall clock time remains, whether a job produces sensible results, whether a job makes efficient use of the cores, etc.

Get a Quick Overview via the Web
If you need a quick overview of the scheduling status of the cluster, please look at the Scheduling Status for Gimle web page.

Monitor the queue
Useful commands to monitor the queue are:
User-selective information is obtained with the "squeue" command:

[user@gimle ~]$ squeue -u panor
  JOBID PARTITION     NAME     USER  ST   TIME  NODES NODELIST(REASON)
   5351     gimle  mpiprog    panor   R   0:01      2 n[91-92]
[user@gimle ~]$

Note that the output from "squeue" includes information about which nodes your application is running on. This information (plus other details) is also available with the "checkjob" command:

[user@gimle ~]$ checkjob 28905
job 28905
AName: "cf3cl"
State: Running
Creds: user:panor group:nsc account:nsc class:slabanja qos:Normal
WallTime: 12:11:25:22 of 20:16:00:00
SubmitTime: Mon Feb 18 13:54:07
  (Time Queued Total: 1:21:50:49 Eligible: 3:30:40)
StartTime: Wed Feb 20 11:44:56
Total Requested Tasks: 8
Req[0] TaskCount: 8 Partition: slurm
Memory >= 1M Disk >= 1M Swap >= 0
Opsys: --- Arch: --- Features: ---
NodeCount: 1
Allocated Nodes:
[n650:8]
StartCount: 3
Partition Mask: [slurm]
StartPriority: 7730245
Reservation '28905' ( - -13days -> 7:13:26:52 Duration: 20:16:00:00)
[user@gimle ~]$

Monitor a running job
Applications have various ways to return output from the calculations; some write to standard output (which may be re-directed), whereas others write specific output files that often reside in the scratch directory. In order to list the output of a running calculation in the latter case, you may need to access the local file systems of the compute nodes, named "/scratch/local/".
This is possible since you are allowed to log in with "ssh" to all compute nodes where you have running applications:

[user@gimle ~]$ ssh n650
Last login: Mon Mar 3 10:28:03 2008 from l1
[user@n650 ~]$ df -m
Filesystem    1M-blocks     Used Available Use% Mounted on
/dev/sda1          9844     1496      7848  17% /
tmpfs              8028        0      8028   0% /dev/shm
/dev/sda3        226365    36184    190181  16% /scratch/local
d1:/home        4194172  1602713   2591460  39% /home
s1:/software      95834    10259     85575  11% /software
[user@n650 ~]$

Once logged in to a compute node with a running application, you may monitor the performance of your application with e.g. the "top" command:

[user@n650 ~]$ top -u panor
top - 14:35:09 up 14 days, 23:56, 1 user, load average: 1.73, 1.69, 1.60
Tasks: 170 total, 2 running, 168 sleeping, 0 stopped, 0 zombie
Cpu(s): 9.2%us, 3.4%sy, 0.0%ni, 87.3%id, 0.1%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 16439708k total, 16353084k used, 86624k free, 880k buffers
Swap: 2047840k total, 180k used, 2047660k free, 14840652k cached

  PID USER   PR NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
 7615 panor  25  0 1855m 928m 7768 R   99  5.8  6661:50 dalton.x
 3350 panor  15  0 12712 1164  832 R    0  0.0  0:00.09 top
 3249 panor  15  0 87504 1668  964 S    0  0.0  0:00.00 sshd
 3250 panor  16  0 68240 1768 1312 S    0  0.0  0:00.03 bash
 7596 panor  17  0 65872 1192 1004 S    0  0.0  0:00.00 script
 7597 panor  23  0 65876 1288 1056 S    0  0.0  0:00.00 dalton

You can also run a command on each node in a job using srun from the login node, as shown in the example below (where uptime is run on every node belonging to job 22684):

[user@gimle ~]$ srun --jobid=22684 uptime
16:11:32 up 23:35, 0 users, load average: 7.74, 7.45, 7.42
16:11:32 up 23:35, 0 users, load average: 7.74, 7.44, 7.41
16:11:32 up 23:35, 0 users, load average: 7.74, 7.43, 7.40
16:11:32 up 23:35, 0 users, load average: 7.75, 7.46, 7.43
16:11:32 up 23:35, 0 users, load average: 7.79, 7.54, 7.48
16:11:32 up 23:35, 0 users, load average: 7.75, 7.46, 7.41
16:11:32 up 23:35, 0 users, load average: 7.79, 7.57, 7.50
16:11:32 up 23:35, 0 users, load average: 7.74, 7.45, 7.41

Job Scheduling
The priority of your queued job is calculated as the number of minutes your job has been eligible/idle in the queue, ready to run. "An early bird catches the worm."

Debugging and tracing
Standard debugging tools like the GNU debugger gdb and the Intel debugger idb are installed on Gimle. There are also a few special programs available to help trace and debug parallel applications.

Intel Trace Analyzer and Collector
This tool was previously named Vampir. It can be used to trace the communication patterns of an MPI application. This is accomplished by recompiling your application and linking it against trace libraries. The application then writes trace files when it is executed. These files can then be analyzed using the graphical trace analyzer from the login node. ITAC has several features not described here; full documentation is available in the directory /software/intel/itac/7.1/doc
How to use:

1. Load the Intel MPI module:
$ module add impi

2. Load the Intel Trace Analyzer module:
$ module add itac

3. Compile and link the MPI program with the extra flags "-lVT -I$VT_ROOT/include -L$VT_LIB_DIR $VT_ADD_LIBS":
$ icc mpiprog.c -o mpiprog -Nmpi -lVT -I$VT_ROOT/include -L$VT_LIB_DIR $VT_ADD_LIBS

4. Run the program with mpprun as usual. This will write trace files in the work directory.
$ mpprun ./mpiprog

5. Open the trace files using the trace analyzer on the login node.
[faxen@gimle ~]$ traceanalyzer mpiprog-0(mpi:24646@n8).stf

TotalView Parallel Debugger
Full documentation for TotalView, including a User Guide, is available in the directory /software/apps/toolworks/totalview.8.7.0-7/doc/pdf or at the vendor's website.

License information: There is currently only a single license for TotalView installed. If you encounter license availability problems, please contact support@nsc.liu.se so we can consider purchasing more licenses.

Recipe for running TotalView:

1. Make sure that you can run X11 applications on the login node (start an xterm or something similar to verify).

2. Load the MPI module you use. At the moment, Scali MPI, Intel MPI and OpenMPI (version 1.4.1 and higher) work:
$ module add scampi

3. Load the TotalView module:
$ module add totalview

4. Compile your application with -Nmpi -g to get MPI support and debug information in the binary (use ifort instead of icc if your program is written in Fortran):
$ icc -Nmpi -g -o myapp myapp.c

5. Start an interactive job:
$ interactive -N 1 -t 01:00:00

6. Launch the MPI program with TotalView in the interactive job shell by adding --totalview to the rest of the flags you use with mpprun:
$ mpprun --totalview ./myapp

7. Quick Start:
List of Acronyms
GiB     gibibyte, 1024**3 bytes
MiB     mebibyte, 1024**2 bytes
MKL     Math Kernel Library
MPI     Message Passing Interface
OpenMP  Open Multi-Processing
scp     secure copy
SLURM   Simple Linux Utility for Resource Management
ssh     secure shell
TiB     tebibyte, 1024**4 bytes

Frequently Asked Questions
This part will be filled as needed.