Xeon Phi installation


Some nodes in Triolith are equipped with Xeon Phi co-processor SE10/7120 series cards. The Intel® Manycore Platform Software Stack (MPSS) is necessary to run the Intel® Xeon Phi co-processor.


Installation of MPSS-3.5.1

The basic steps for Xeon Phi installation are described below.

  • Get the MPSS tar file
cd installdir
wget http://registrationcenter.intel.com/irc_nas/7661/mpss-3.5.1-linux.tar
  • The compute node lacks some necessary dependencies. Install them:
sudo yum -y install elfutils.x86_64
sudo yum -y install rpm-build.x86_64
sudo yum -y install "kernel-devel-uname-r == $(uname -r)"
  • Untar the mpss archive
tar -xf mpss-3.5.1-linux.tar
cd mpss-3.5.1
  • Rebuild the kernel module rpms from source. This is necessary so that the rpm matches the kernel version that is currently running
rpmbuild --define "_topdir $(pwd)/kmod_nsc" --rebuild src/mpss-modules*.src.rpm
cp kmod_nsc/RPMS/x86_64/mpss-modules-* modules/
cp modules/*"$(uname -r)"*.rpm .
  • Install the rpms:
sudo yum -y install *.rpm
  • Load the mic module into the Linux kernel
sudo modprobe mic
  • Initialize the Xeon Phi card. Do not add any users to the card (except root and phiuser)
sudo micctrl --initdefaults --users=none
  • Update mic flash to the current version
sudo micflash -update -device all

After the mic flash update, the node must be rebooted for the changes to take effect.
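
The copy step above relies on a shell glob that matches nothing if no rebuilt package carries the running kernel's version string in its file name. A minimal sketch of a guard for that step, assuming the rebuilt rpms sit in modules/ as in the steps above; the function name find_kmod_rpms is hypothetical:

```shell
# Hypothetical helper: list rpms in a directory whose file name contains
# the given kernel release (defaults to the running kernel).
# Returns non-zero if no such rpm is found.
find_kmod_rpms() {
    dir=$1
    kernel=${2:-$(uname -r)}
    found=1
    for f in "$dir"/*"$kernel"*.rpm; do
        # An unmatched glob stays literal, so check that the file exists.
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
            found=0
        fi
    done
    return "$found"
}
```

It could be called before the install step, for example as find_kmod_rpms modules || echo "no module rpm for $(uname -r)" >&2, so that a failed rebuild is noticed before yum runs.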


Starting the Xeon Phi card

To start using the Xeon Phi, perform the following steps:

  • Add the user to the Xeon Phi card
sudo micctrl --useradd=user

The user's public ssh key will be added to the mic card. For passwordless login, this key should have no passphrase. Other methods of passwordless login to the Xeon Phi card may be explored later.

  • Start the service
sudo service mpss start

Once the service has started, the specified user can use the Xeon Phi card in both offload mode and native mode.

  • To log in to the Xeon Phi card from the compute (host) node
ssh 172.31.1.1
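
The steps above ask for an ssh key without a passphrase. A minimal sketch of a pre-check, assuming OpenSSH's ssh-keygen is installed on the host; the function name key_is_passphrase_less is hypothetical, and which key file micctrl picks up is not stated here, so any path passed to it is an assumption as well:

```shell
# Hypothetical helper: succeed only if the private key file can be read
# with an empty passphrase.
# ssh-keygen -y derives the public key from a private key file;
# -P "" supplies an empty passphrase so that no prompt appears.
key_is_passphrase_less() {
    keyfile=$1
    [ -r "$keyfile" ] || return 1
    ssh-keygen -y -P "" -f "$keyfile" >/dev/null 2>&1
}
```

For example, key_is_passphrase_less ~/.ssh/id_rsa could be run as the user before sudo micctrl --useradd=user, with ~/.ssh/id_rsa standing in for whichever key is actually used.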

Stopping the Xeon Phi card

To stop using the Xeon Phi, perform the following steps:

  • Stop the service
sudo service mpss stop
  • Delete the user from the Xeon Phi card. This frees the RAM occupied by the user on the card.
sudo micctrl --userdel=user
  • If everything needs to be cleaned up, the following command can be used
sudo micctrl --cleanconfig

NFS mounting on Xeon Phi card

The Xeon Phi card has no persistent storage attached to it. Host node file systems can be NFS mounted on the Xeon Phi card. From the host node run:

  • Start NFS service
chkconfig nfs on
service nfs start
  • Export /scratch/local/exported_folder to 172.31.1.1 (mic card) with read and write permission
exportfs -o rw 172.31.1.1:/scratch/local/exported_folder
  • Mount /scratch/local/exported_folder on the mic card as /scratch/local/exported_folder:
micctrl --addnfs=$(hostname -i):/scratch/local/exported_folder --dir=/scratch/local/exported_folder

NFS unmounting on Xeon Phi card

From the host node run:

micctrl --remnfs=/scratch/local/exported_folder
exportfs -u 172.31.1.1:/scratch/local/exported_folder
service nfs stop
chkconfig nfs off
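
The export/mount and unmount/unexport sequences above can be wrapped for an arbitrary directory. This is only a sketch: the function names, the MIC_IP variable and the DRY_RUN switch are hypothetical, while the exportfs and micctrl invocations are the ones shown above. Starting and stopping the NFS service is left out.

```shell
# Hypothetical wrapper: print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Address of the mic card as seen from the host (value taken from the text).
MIC_IP=${MIC_IP:-172.31.1.1}

# Export a host directory to the card and register the mount with micctrl.
# The second argument overrides the host address reported by hostname -i.
mic_nfs_add() {
    dir=$1
    host_ip=${2:-$(hostname -i)}
    run exportfs -o rw "$MIC_IP:$dir" &&
    run micctrl --addnfs="$host_ip:$dir" --dir="$dir"
}

# Reverse of mic_nfs_add: remove the mount, then withdraw the export.
mic_nfs_remove() {
    dir=$1
    run micctrl --remnfs="$dir" &&
    run exportfs -u "$MIC_IP:$dir"
}
```

With DRY_RUN=1 set, mic_nfs_add /scratch/local/exported_folder only prints the two commands, which is a cheap way to check the arguments before running them as root.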

Special considerations for Triolith

On Triolith, compute node images are refreshed with the default OS image every time a node is rebooted. Hence the above installation procedure is done by a script, which is invoked when the node is brought online. The user account on the Xeon Phi card is created when the user gets a SLURM allocation for the Xeon Phi node, and deleted when the allocation ends.
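
How the account handling is hooked into SLURM is not described here. One possible arrangement, shown purely as an illustration and not as the actual Triolith scripts, is a pair of functions called from a SLURM prolog and epilog. The sketch assumes that these run as root and that SLURM sets SLURM_JOB_USER for them; the function names and the DRY_RUN switch are hypothetical.

```shell
# Hypothetical wrapper: print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Intended for a SLURM prolog: add the job owner to the Xeon Phi card.
mic_prolog() {
    user=${SLURM_JOB_USER:?SLURM_JOB_USER is not set}
    run micctrl --useradd="$user"
}

# Intended for a SLURM epilog: remove the job owner from the card again.
mic_epilog() {
    user=${SLURM_JOB_USER:?SLURM_JOB_USER is not set}
    run micctrl --userdel="$user"
}
```

Whether the mpss service also has to be stopped and started around these calls, as in the manual steps above, is left open in this sketch.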


References

  1. MPSS user guide
  2. System Administration for the Xeon Phi Coprocessor

Chandan Basu - cbasu@nsc.liu.se
Published - 2015/08/01