Xenomorph

 

Please read the Updates section below if you get an error message when logging in to the mic0 or mic1 cards from the xenomorph host node.

 

Specification

Xenomorph comprises one server hosting two Intel Xeon Phi 7120P (aka MIC / Knights Corner) co-processor cards.

The host node (xenomorph):

  • Two six-core Intel Sandy Bridge CPUs
  • 32 GB RAM

Each card (mic0 and mic1):

  • 61 CPU cores, each with four threads, giving 244 logical cores (Intel x86)
  • 16 GB RAM
  • 16-element wide vector units (MIC instruction set)
  • Update 13-Feb-2015: MPSS 3.3.3, driver 3.3.3-1 now installed.

Intel Software:

  • Composer XE 2015 3.187 (aka v15.0.3)
  • Composer XE 2013 SP1.3.174 (aka v14.0.3)
  • Composer XE 2013 SP1.0.080 (aka v14.0.0) – uninstalled
  • VTune Amplifier XE 2015
  • VTune Amplifier XE 2013

The following filesystems are available on both the host (xenomorph) and the MIC cards – hence there is no need to copy files between the host and the MICs:

  • $HOME (your home directory)
  • /opt/intel (Intel compilers, libraries, etc)
  • /opt/gridware (other applications)

All users are recommended to read the Intel developer documentation for the Xeon Phi.

Updates

The MPSS software stack has been upgraded to version 3.3.3 (the last available version supporting Scientific Linux 6.2).

Existing users: when logging in to the Phi cards (mic0 and mic1) from the xenomorph host to run native applications, you may receive the following error message:

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that the RSA host key has just been changed.

To fix this, remove any lines beginning with mic0 or mic1 from the file ~/.ssh/known_hosts. You can edit this file with a text editor, or simply run the following command on the xenomorph host:

sed -i '/^mic/d' ~/.ssh/known_hosts

You can then run:

ssh mic0

and you will be asked whether you want to continue connecting:

The authenticity of host 'mic0 (172.31.1.1)' can't be established.
RSA key fingerprint is ba:6d:cd:a8:.......:a7:37:1b:7f.
Are you sure you want to continue connecting (yes/no)?

You can answer yes to this question. You will then be logged in to the MIC card and should not be asked for a password. If you are asked for a password, please email the ITS RI team at its-ri-team@manchester.ac.uk.

Getting Access to Xenomorph

To gain access to Xenomorph, please email the ITS RI team at its-ri-team@manchester.ac.uk.

Restrictions on Access

Priority is given to those who funded the system, but other University of Manchester academics and computational researchers may gain access for evaluation and pump-priming purposes.

Accessing the Host Node (Xenomorph)

Those who have been given access to Xenomorph can log in to the node using qrsh from the zrek login node:

# Reserve one (of the two) Xeon Phi (MIC) cards:
qrsh -l xeonphi bash

# Reserve both Xeon Phi (MIC) cards:
qrsh -l xeonphiduo bash

No password is required when connecting from the zrek login node.

Any references to the host in the instructions below refer to this node (xenomorph). Most of your commands will be run on the host, but it is possible to log in to the Phi cards themselves (details below). Ensure you are aware of which node you are currently logged in to (xenomorph, mic0 or mic1) and hence where your commands will be executed.

Card Micro OS

Each Xeon Phi (aka MIC) card runs a “micro OS” — a very basic version of Linux. Once logged into the host node, users can log in to a card’s micro OS using SSH. This allows you to run native executables (compiled with -mmic) that run entirely on the MIC cards. The alternative is to compile and run offload executables, which run on the host CPU (xenomorph) but automatically launch sections of their code on the MIC card(s) – see below for compilation instructions.

One or both MIC cards will have been reserved for your session. To see which, run the command:

echo $OFFLOAD_DEVICES

It will display 0, 1 or 0,1. In the commands below, when referring to mic0 or mic1, you should only log in to the MIC card(s) that have been reserved for you:

ssh  mic0
ssh  mic1

or, using the IP addresses

ssh  172.31.1.1
ssh  172.31.2.1

Authentication is by passwordless SSH key — keys are generated for each user when accounts are created.

Note: If you are asked for a password when logging in to the mic0 or mic1 cards then something is wrong with your key. Please email its-ri-team@manchester.ac.uk to report the error so that we can fix it.

Shared Filesystems

Your home directory on mic0 and mic1 is the same as on the host (xenomorph). Hence there is no need to manually copy compiled executables to the MIC cards if running native code directly on the MIC cards.

The /opt filesystem is also visible on the MIC cards. Hence all Intel compiler libraries are visible on the MIC cards – there is no need to transfer any shared objects (.so files) to the MIC cards if running native code directly on them.

Running Executables

Executables can be compiled to run:

  • Directly on the MIC cards (log in to them, run your executable). These are native executables.
  • On the host (xenomorph) with some portions of the code automatically run on the MICs. These are offload executables.

We now give details on running these executables. See below for compilation details.

Running Native Code on the MIC

Native code is an executable that has been compiled to run directly on the MIC cards. See below for more info on compiling your code.

Once logged in to a MIC card you should set the LD_LIBRARY_PATH environment variable on the MIC to pick up the MIC-specific libraries in the compiler installation. You can do this manually using:

# We have ssh'd in to either mic0 or mic1. Uses the latest version of the Intel compiler libraries:
export LD_LIBRARY_PATH=/opt/intel/composerxe/compiler/lib/mic:/opt/intel/composerxe/mkl/lib/mic

# OR use a specific version of the compiler libraries (choose one)
# export IVER=composer_xe_2013_sp1.3.174
export IVER=composer_xe_2015.3.187
export LD_LIBRARY_PATH=/opt/intel/$IVER/compiler/lib/mic:/opt/intel/$IVER/mkl/lib/mic

or alternatively you can add this to a file in your home directory named .bash_profile (notice the dot at the start of the filename):

# Only set variables when logging in to a MIC
if [ "`hostname | grep -c mic`" -gt 0 ]; then
  # Use latest compiler libraries (see above if you require a specific version)
  export LD_LIBRARY_PATH=/opt/intel/composerxe/compiler/lib/mic:/opt/intel/composerxe/mkl/lib/mic
fi

Note that we only set the variable when logging in to a MIC card. This is because home directories on the central Isilon storage are often shared between several systems (e.g., the CSF, iCSF and zrek/xenomorph), so the .bash_profile file is read when logging in to several different systems; hence we only apply the MIC settings when logging in to a MIC.

If you forget to set the LD_LIBRARY_PATH when logging in to a MIC to run native code you’ll see an error similar to the following:

# We have ssh'd in to either mic0 or mic1
./my_native.exe: error while loading shared libraries: libiomp5.so: \
  cannot open shared object file: No such file or directory

This indicates that the native executable needs the OpenMP runtime library on the MIC but LD_LIBRARY_PATH has not been set. Set it as described above.

Running Offload Code on the MIC

Offload applications are compiled to run on both the xenomorph host CPU and the MIC cards (some portions of the code run on the MIC, some on the host CPU). Once you have compiled the offload code (see below for compilation instructions) simply run your offload application on the xenomorph host:

./my_offload.exe

You must be on the xenomorph host when you run this type of application.

Compilation

Compilation must be performed on the xenomorph host, not the MIC cards. Generally you either compile offload executables, in which some of the code runs on the host and some is automatically offloaded to the MIC cards, or native executables, which run entirely on the MIC cards (log in to a card and run the executable there). In all cases the compilation is done on the host (xenomorph); you never compile on the MIC cards.

Please see the RAC Team’s Xeon Phi Getting Started Guide for in-depth advice on compiling and running code on this system.

To set up the compiler, load one of the following modulefiles:

# Choose one of the following:
module load compilers/intel/15.0.3
module load compilers/intel/14.0.3

or alternatively source the usual Intel environment script(s):

# Loads the latest version:
source /opt/intel/bin/compilervars.sh intel64

# Loads a specific version (choose one):
source /opt/intel/composer_xe_2015.3.187/bin/compilervars.sh
source /opt/intel/composer_xe_2013_sp1.3.174/bin/compilervars.sh

Compiler Basics

A brief introduction to compilation is now given. We recommend you read the RAC Team’s Xeon Phi Getting Started Guide for in-depth advice on compiling and running code on this system.

Code can be compiled to run:

  • In offload mode, i.e., parts of your code run on the host and other parts run on a MIC card (a fat binary)
  • In native mode, i.e., to be run entirely on a MIC card (a native binary)
  • Or to be run entirely on the host as usual (usually your first step before porting code to run using one of the above two methods).

But in all cases the compilation is performed on the host (xenomorph). Compiler flags (and source code directives) determine where you can run your code.

# Choose one of the following:
module load compilers/intel/15.0.3
module load compilers/intel/14.0.3

Compiling Offload-mode Code

Offload mode compiles a fat binary: some parts run on the Xeon (host CPU) and some parts on the Phi card. Directives in your code determine which parts run where. If you don’t have any directives in the code you simply get an ordinary host CPU executable.

icc -openmp my_openmp_app.c -O3 -o my_openmp_app_offload.exe
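
For illustration, here is a minimal sketch of what an offload directive looks like in source code (the array size and variable names are our own, not taken from this page; error handling is omitted). The block after the #pragma offload is copied to and executed on MIC card 0; without such directives the same source simply builds an ordinary host executable.

#include <stdio.h>
#include <omp.h>

#define N 1000000

int main(void)
{
    static float a[N], b[N], c[N];
    int i;

    for (i = 0; i < N; i++) { a[i] = (float)i; b[i] = 2.0f; }

    /* This loop is offloaded to MIC card 0; the in/out clauses copy
       the (statically sized) arrays to and from the card. */
    #pragma offload target(mic:0) in(a, b) out(c)
    #pragma omp parallel for
    for (i = 0; i < N; i++)
        c[i] = a[i] + b[i];

    printf("c[42] = %f\n", c[42]);
    return 0;
}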

Compiling Native-mode Code

Native mode compiles an executable to be run directly on a Phi card. Compilation is done on the xenomorph host; you then run the compiled executable by ssh-ing to one of the Phi cards and running it directly (see above about LD_LIBRARY_PATH).

icc -mmic -openmp my_openmp_app.c -O3 -o my_openmp_app_native.exe
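
As a minimal, self-contained sketch (the program below is our own illustration, not code from this page), the following OpenMP source could be compiled with the native-mode command above and then run on mic0 or mic1 after setting LD_LIBRARY_PATH:

#include <stdio.h>
#include <omp.h>

int main(void)
{
    /* On a 7120P card, up to 244 hardware threads can report in. */
    #pragma omp parallel
    {
        printf("Hello from thread %d of %d\n",
               omp_get_thread_num(), omp_get_num_threads());
    }
    return 0;
}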

OpenCL

OpenCL applications can be developed on xenomorph to run on both the host and the MICs. OpenCL may be familiar to GPU programmers: it is somewhat similar to CUDA but can run on many different devices and architectures. A complete OpenCL tutorial is beyond the scope of this page, so we only give brief notes on how to compile and run OpenCL code.

To set up for OpenCL usage load the following modulefile:

module load libs/intel/opencl/4.4.0

You should load an Intel Compiler modulefile yourself.

The Intel OpenCL SDK supports OpenCL 1.2 (the 4.x.y number is Intel’s SDK version number).

The OpenCL modulefiles will set the $INTELOCLSDKROOT variable in your environment which gives the location of the Intel OpenCL tools and sample codes.

OpenCL Samples

There are five OpenCL sample codes which you can copy and edit. These are located in the directory:

$INTELOCLSDKROOT/samples/

For example, to run the matrix multiply (GEMM) sample:

# Take a copy of the code:
mkdir ~/ocl
cd ~/ocl
cp -r $INTELOCLSDKROOT/samples/GEMM .
cp -r $INTELOCLSDKROOT/samples/common .
cd GEMM

# The code is already compiled (run 'make' if you edit it) or
# run 'make clean' to remove all compiled files then 'make' to
# recompile it.

# Run the code on the CPU then the MIC we have reserved
./gemm.exe -t cpu
./gemm.exe -t acc

# Notice the difference in speed / GFLOPS when running on the MIC

OpenCL Host Compiler

Your host code should be compiled with the usual Intel C/C++ compilers. It is also possible to use gcc/g++ if your host code does not make use of Intel-specific features such as the MKL. However, we recommend using the Intel compilers to stay within that toolset.

The following command will compile C++ OpenCL host code:

icpc myoclapp.cpp -o myoclapp.exe  -lOpenCL

Notice that we do not need to specify -Ipath or -Lpath flags to indicate where the OpenCL header and library files are located – they are in the standard system locations.

In the above example we are only compiling C++ host code. It is assumed that the host code will, when executed, read a kernel source file (or kernel binary file – see below) and compile the kernel code using the usual OpenCL functions (e.g., clBuildProgram()). If you pre-compile the kernel code externally (see below) then you can also read in that compiled kernel from your host code. The point is, at this stage, we have only compiled host code with the Intel compiler and this is potentially the only code you need to compile on the xenomorph command-line.

To select the Xeon Phi accelerator device from your host code, you should request a device of type CL_DEVICE_TYPE_ACCELERATOR (if converting GPU code you will need to change any references to CL_DEVICE_TYPE_GPU). For example:

clGetDeviceIDs(platform, CL_DEVICE_TYPE_ACCELERATOR, 1, &device_id, NULL);
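
Putting these pieces together, the following is a minimal sketch of host code that selects the accelerator device and builds a kernel from source at runtime. The kernel filename mykernel.cl and kernel name my_kernel are illustrative, it is assumed the Intel platform is the first one reported, and error checking is omitted for brevity.

#include <stdio.h>
#include <stdlib.h>
#include <CL/cl.h>

int main(void)
{
    cl_platform_id platform;
    cl_device_id device_id;
    cl_int err;

    /* Assumes the Intel OpenCL platform is the first (or only) platform */
    clGetPlatformIDs(1, &platform, NULL);

    /* CL_DEVICE_TYPE_ACCELERATOR selects a Xeon Phi rather than the host CPU */
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_ACCELERATOR, 1, &device_id, NULL);

    cl_context ctx = clCreateContext(NULL, 1, &device_id, NULL, NULL, &err);
    cl_command_queue queue = clCreateCommandQueue(ctx, device_id, 0, &err);

    /* Read the kernel source file (mykernel.cl is an illustrative name) */
    FILE *f = fopen("mykernel.cl", "r");
    fseek(f, 0, SEEK_END);
    long len = ftell(f);
    rewind(f);
    char *src = malloc(len + 1);
    fread(src, 1, len, f);
    src[len] = '\0';
    fclose(f);

    /* Compile the kernel for the accelerator at runtime */
    cl_program prog = clCreateProgramWithSource(ctx, 1, (const char **)&src, NULL, &err);
    clBuildProgram(prog, 1, &device_id, NULL, NULL, NULL);
    cl_kernel kernel = clCreateKernel(prog, "my_kernel", &err);

    /* ... set kernel arguments and enqueue with clEnqueueNDRangeKernel() ... */

    clReleaseKernel(kernel);
    clReleaseProgram(prog);
    clReleaseCommandQueue(queue);
    clReleaseContext(ctx);
    free(src);
    return 0;
}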

OpenCL Kernel Compiler

As demonstrated in the samples, OpenCL kernel code can be compiled at runtime by the OpenCL driver – i.e., when you run your host code it first reads the kernel source file (usually a .cl file) and compiles it before launching the kernel on the MIC device. An offline compiler is also available in your PATH (once you have loaded the OpenCL modulefile) and can be used to compile (and debug) kernel source before running the host code (which may otherwise do a lot of work – e.g., reading in a large input file – before reaching a kernel compile error). Run the following:

ioc64 -help

to see the command-line compiler options.

A graphical OpenCL kernel code compiler and debugger is available by running:

KernelBuilder64

If you pre-compile kernels this way, the host code in your OpenCL application should then read the binary kernel file (rather than a source kernel file) before launching the kernels on the MIC devices.
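
A minimal sketch of loading a pre-compiled kernel binary from host code is shown below. The filename mykernel.bin and the helper function name are our own illustration; context and device setup are as in the earlier example, and error checking is omitted for brevity.

#include <stdio.h>
#include <stdlib.h>
#include <CL/cl.h>

cl_program load_binary_program(cl_context ctx, cl_device_id dev, const char *path)
{
    /* Read the whole binary kernel file into memory */
    FILE *f = fopen(path, "rb");
    fseek(f, 0, SEEK_END);
    size_t len = (size_t)ftell(f);
    rewind(f);
    unsigned char *bin = malloc(len);
    fread(bin, 1, len, f);
    fclose(f);

    cl_int err, binstatus;
    cl_program prog = clCreateProgramWithBinary(ctx, 1, &dev, &len,
                                                (const unsigned char **)&bin,
                                                &binstatus, &err);
    /* The binary still needs a (fast) build step before kernels can be created */
    clBuildProgram(prog, 1, &dev, NULL, NULL, NULL);
    free(bin);
    return prog;
}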

Further information is available in the Intel OpenCL SDK Users Guide.

Intel’s Parallel Universe magazine contains an article on using OpenCL on Intel Hardware: Leverage Your OpenCL™ Investment on Intel® Architectures (go to page 42).

VTune Profiling/Debugging

Please read the zCSF VTune application page for information on using the Intel VTune Profiler on Xenomorph and the Xeon Phi cards.

Load Monitor

A graphical utility showing the load on the Phi cards is available by running the following on the host:

micsmc-gui

From the Cards pulldown menu select Show All.

Further Info
