Xenomorph
Please read the Updates section below if you get an error message when logging in to the mic0 or mic1 cards from the xenomorph host node.
Specification
Xenomorph comprises one server hosting two Intel 7120P “Xeon Phi” (aka MIC / Knights Corner) co-processor cards.
The host node (xenomorph):
- Two six-core Intel Sandybridge CPUs
- 32 GB RAM
Each card (mic0 and mic1):
- 61 CPU cores, each with four threads, giving 244 logical cores (Intel x86)
- 16 GB RAM
- 16-element wide vector units (MIC instruction set)
- Update 13-Feb-2015: MPSS 3.3.3, driver 3.3.3-1 now installed.
Intel Software:
- Composer XE 2015 3.187 (aka v15.0.3)
- Composer XE 2013 SP1.3.174 (aka v14.0.3)
- Composer XE 2013 SP1.0.080 (aka v14.0.0) – uninstalled
- VTune Amplifier XE 2015
- VTune Amplifier XE 2013
Filesystems are available on the host (xenomorph) and MIC cards – hence no need to copy files between host and MICs:
- $HOME (your home directory)
- /opt/intel (Intel compilers, libraries, etc)
- /opt/gridware (other applications)
All users are recommended to read the Intel Xeon Phi developer documentation available from Intel.
Updates
The MPSS software stack has been upgraded to version 3.3.3 (the last available supporting Scientific Linux 6.2).
Existing users: when logging in to the Phi cards (mic0 and mic1) from the xenomorph host to run native applications you may receive the following error message:
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that the RSA host key has just been changed.
To fix this, remove any lines beginning mic0 or mic1 from the file ~/.ssh/known_hosts. You can edit this file with a text editor or simply run the following command on the xenomorph host:
sed -i '/^mic/d' ~/.ssh/known_hosts
You can then run:
ssh mic0
and you will be asked if you want to continue connecting:
The authenticity of host 'mic0 (172.31.1.1)' can't be established.
RSA key fingerprint is ba:6d:cd:a8:.......:a7:37:1b:7f.
Are you sure you want to continue connecting (yes/no)?
You can answer yes to this question. You will then be logged in to the MIC card and should not be asked for a password. If you are asked for a password please email the ITS RI team at its-ri-team@manchester.ac.uk.
Getting Access to Xenomorph
To gain access to Xenomorph, please email the ITS RI team at its-ri-team@manchester.ac.uk.
Restrictions on Access
Priority is given to those who funded the system, but other University of Manchester academics and computational researchers may be given access for evaluation and pump-priming purposes.
Accessing the Host Node (Xenomorph)
Those who have been given access to Xenomorph can log in to the node using qrsh from the zrek login node:
# Reserve one (of the two) Xeon Phi (MIC) cards:
qrsh -l xeonphi bash

# Reserve both Xeon Phi (MIC) cards:
qrsh -l xeonphiduo bash
No password is required when connecting from the zrek login node.
Any references to the host in the instructions below are referring to this node (xenomorph). Most of the commands you run will be run on the host but it is possible to log in to the Phi cards themselves (details below). Ensure you are aware of which node you are currently logged in to (xenomorph, mic0, mic1) and hence where your commands will be executed.
Card Micro OS
Each Xeon Phi (aka MIC) card runs a “micro OS” — a very basic version of Linux. Once logged into the host node, users can log in to a card’s micro OS using SSH. This allows you to run native executables (compiled with -mmic) that run entirely on the MIC cards. The alternative is to compile and run offload executables, which you run on the host CPU (xenomorph) but which automatically launch sections of their code on the MIC card(s) – see below for compilation instructions.
One or both MIC cards will have been reserved for your session. To see which, run the command:
echo $OFFLOAD_DEVICES
It will display either 0, 1 or 0,1. In the commands below, when referring to mic0 or mic1, you should only log in to the MIC card(s) that have been reserved for you:
ssh mic0
ssh mic1
or, using the IP addresses
ssh 172.31.1.1
ssh 172.31.2.1
Authentication is by passwordless SSH key — keys are generated for each user when accounts are created.
Note: If you are asked for a password when logging in to the mic0 or mic1 cards then something is wrong with your key. Please email its-ri-team@manchester.ac.uk to report the error so that we can fix it.
Your home directory on mic0 and mic1 is the same as on the host (xenomorph). Hence there is no need to manually copy compiled executables to the MIC cards if running native code directly on the MIC cards.
The /opt filesystem is also visible on the MIC cards. Hence all Intel compiler libraries are visible on the MIC cards – there is no need to transfer any shared objects (.so files) to the MIC cards if running native code directly on the MIC cards.
Running Executables
Executables can be compiled to run:
- Directly on the MIC cards (log in to them, run your executable). These are native executables.
- On the host (xenomorph) with some portions of the code automatically run on the MICs. These are offload executables.
We now give details on running these executables. See below for compilation details.
Running Native Code on the MIC
Native code is an executable that has been compiled to run directly on the MIC cards. See below for more info on compiling your code.
Once logged in to a MIC card you should set the LD_LIBRARY_PATH environment variable on the MIC to pick up the MIC-specific libraries in the compiler installation. You can do this manually using:
# We have ssh'd in to either mic0 or mic1. Uses the latest of the Intel compiler libraries:
export LD_LIBRARY_PATH=/opt/intel/composerxe/compiler/lib/mic:/opt/intel/composerxe/mkl/lib/mic

# OR use a specific version of the compiler libraries (choose one)
# export IVER=composer_xe_2013_sp1.3.174
export IVER=composer_xe_2015.3.187
export LD_LIBRARY_PATH=/opt/intel/$IVER/compiler/lib/mic:/opt/intel/$IVER/mkl/lib/mic
or alternatively you can add these commands to a startup file in your home directory (notice the dot at the start of the filename). The shell on the MIC cards reads .profile (but not .bash_profile if you have one). Hence a suitable .profile could be as follows:
# Only set variables when logging in to a MIC
if [ "`hostname | grep -c mic`" -gt 0 ]; then
  # Use latest compiler libraries (see above if you require a specific version)
  export LD_LIBRARY_PATH=/opt/intel/composerxe/compiler/lib/mic:/opt/intel/composerxe/mkl/lib/mic
fi
Note that we only set the variable when logging in to a MIC card. This is because home directories on the central Isilon storage are often shared between several systems (e.g., the CSF, iCSF and zrek/xenomorph), so this startup file will be read when logging in to several different systems; the hostname test ensures the MIC-specific settings are only applied on a MIC card.
If you forget to set LD_LIBRARY_PATH when logging in to a MIC to run native code you’ll see an error similar to the following:
# We have ssh'd in to either mic0 or mic1
./my_native.exe: error while loading shared libraries: libiomp5.so: \
cannot open shared object file: No such file or directory
This indicates a native executable needs the OpenMP library on the MIC but LD_LIBRARY_PATH has not been set. Set it as above.
Running Offload Code on the MIC
Offload applications are compiled to run on both the xenomorph host CPU and the MIC cards (some portions of the code run on the MIC, some on the host CPU). Once you have compiled the offload code (see below for compilation instructions) simply run your offload application on the xenomorph host:
./my_offload.exe
You must be on the xenomorph host when you run this type of application.
Compilation
Compilation must be performed on the xenomorph host, not the MIC cards. Generally you either compile offload executables, which compile some of the code to run on the host and some to be automatically offloaded to the MIC cards, or you compile native executables, which run entirely on the MIC cards (log in to a card and run the executable there). In all cases the compilation is done on the host (xenomorph); you never compile on the MIC cards.
Please see the RAC Team’s Xeon Phi Getting Started Guide for in-depth advice on compiling and running code on this system.
To set up the compiler, load one of the following modulefiles:
# Choose one of the following:
module load compilers/intel/15.0.3
module load compilers/intel/14.0.3
or alternatively use the usual Intel dot file(s):
# Loads the latest version:
source /opt/intel/bin/compilervars.sh intel64

# Loads a specific version (choose one):
source /opt/intel/composer_xe_2015.3.187/bin/compilervars.sh intel64
source /opt/intel/composer_xe_2013_sp1.3.174/bin/compilervars.sh intel64
Compiler Basics
A brief introduction to compilation is now given. We recommend you read the RAC Team’s Xeon Phi Getting Started Guide for in-depth advice on compiling and running code on this system.
Code can be compiled to run:
- In offload mode, i.e., parts of your code run on the host and other parts run on a MIC card (a fat binary)
- In native mode, i.e., to be run entirely on a MIC card (a native binary)
- Or to be run entirely on the host as usual (usually your first step before porting code to run using one of the above two methods).
But in all cases the compilation is performed on the host (xenomorph). Compiler flags (and source code directives) determine where you can run your code.
Remember to load one of the Intel compiler modulefiles first:

# Choose one of the following:
module load compilers/intel/15.0.3
module load compilers/intel/14.0.3
Compiling Offload-mode Code
Offload mode compiles a fat binary that runs some parts on the Xeon host CPU and other parts on a Phi card. Directives in your code determine which parts run where; if you don’t have any directives in the code you simply get an ordinary host CPU executable.
icc -openmp my_openmp_app.c -O3 -o my_openmp_app_offload.exe
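For illustration only, a minimal my_openmp_app.c using Intel's offload directives might look like the sketch below; the array size, the in/out clauses and the computation itself are assumptions for this sketch rather than a site-provided example.

/* my_openmp_app.c - a minimal, illustrative offload sketch (not a site-provided code) */
#include <stdio.h>

#define N 1000000

static float a[N], b[N], c[N];

int main(void)
{
    int i;
    for (i = 0; i < N; i++) { a[i] = (float)i; b[i] = 2.0f * i; }

    /* The next loop is offloaded to a MIC card; the in/out clauses copy the
       arrays to and from the card (by default it should fall back to the host
       CPU if no card is available). */
    #pragma offload target(mic) in(a, b) out(c)
    #pragma omp parallel for
    for (i = 0; i < N; i++)
        c[i] = a[i] + b[i];

    printf("c[%d] = %f\n", N - 1, c[N - 1]);
    return 0;
}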
Compiling Native-mode Code
Native mode compiles an executable to be run directly on a Phi card. Compilation is done on the xenomorph host; you then run the compiled executable by ssh-ing to one of the Phi cards and running it there (see above about LD_LIBRARY_PATH).
icc -mmic -openmp my_openmp_app.c -O3 -o my_openmp_app_native.exe
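For a native build any ordinary OpenMP code will do; the short program below is an illustrative stand-in for my_openmp_app.c (the file contents and output message are assumptions, not a site-provided example). Run natively on a 7120P it should report an OpenMP thread count of up to 244.

/* A minimal, illustrative OpenMP program suitable for native compilation */
#include <stdio.h>
#include <omp.h>

int main(void)
{
    #pragma omp parallel
    {
        /* Only one thread prints the total thread count */
        #pragma omp single
        printf("Running with %d OpenMP threads\n", omp_get_num_threads());
    }
    return 0;
}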
OpenCL
OpenCL applications can be developed on xenomorph to run on both the host and the MICs. OpenCL may be familiar to GPU programmers. OpenCL is somewhat similar to CUDA but can run on many different devices and architectures. A complete OpenCL tutorial is beyond the scope of this page so we only give brief notes on how to compile and run OpenCL code.
To set up for OpenCL usage load the following modulefile:
module load libs/intel/opencl/4.4.0
You should load an Intel Compiler modulefile yourself.
The Intel OpenCL SDK supports OpenCL 1.2 (the 4.x.y number is Intel’s SDK version number).
The OpenCL modulefiles will set the $INTELOCLSDKROOT variable in your environment, which gives the location of the Intel OpenCL tools and sample codes.
OpenCL Samples
There are five OpenCL sample codes which you can copy and edit. These are located in the directory:
$INTELOCLSDKROOT/samples/
For example, to run the sample matrix multiply example:
# Take a copy of the code:
mkdir ~/ocl
cd ~/ocl
cp -r $INTELOCLSDKROOT/samples/GEMM .
cp -r $INTELOCLSDKROOT/samples/common .
cd GEMM

# The code is already compiled (run 'make' if you edit it) or
# run 'make clean' to remove all compiled files then 'make' to
# recompile it.

# Run the code on the CPU then the MIC we have reserved
./gemm.exe -t cpu
./gemm.exe -t acc

# Notice the difference in speed / GFLOPS when running on the MIC
OpenCL Host Compiler
Your host code should be compiled with the usual Intel C/C++ compilers. It is also possible to use gcc/g++ if your host code does not make use of Intel-specific features such as the MKL. However, we recommend using the Intel compilers to stay within that toolset.
The following command will compile C++ OpenCL host code:
icpc myoclapp.cpp -o myoclapp.exe -lOpenCL
Notice that we do not need to specify -Ipath or -Lpath flags to indicate where the OpenCL header and library files are located – they are in the standard system locations.
In the above example we are only compiling C++ host code. It is assumed that the host code will, when executed, read a kernel source file (or kernel binary file – see below) and compile the kernel code using the usual OpenCL functions (e.g., clBuildProgram()). If you pre-compile the kernel code externally (see below) then you can also read in that compiled kernel from your host code. The point is that, at this stage, we have only compiled host code with the Intel compiler, and this is potentially the only code you need to compile on the xenomorph command-line.
To select the Xeon Phi accelerator device from your host code, you should request a device of type CL_DEVICE_TYPE_ACCELERATOR (if converting GPU code you will need to change any references to CL_DEVICE_TYPE_GPU). For example:
clGetDeviceIDs(platform, CL_DEVICE_TYPE_ACCELERATOR, 1, &device_id, NULL);
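As a rough sketch only (most error checking omitted; the trivial inline kernel and variable names are purely illustrative), C host code that selects the accelerator and builds a kernel with clBuildProgram() might look like this:

/* Illustrative OpenCL host code: select the Xeon Phi accelerator and build a kernel */
#include <stdio.h>
#include <CL/cl.h>

int main(void)
{
    cl_platform_id platform;
    cl_device_id device;
    cl_int err;

    clGetPlatformIDs(1, &platform, NULL);

    /* Request an accelerator device; GPU code would have used CL_DEVICE_TYPE_GPU here */
    err = clGetDeviceIDs(platform, CL_DEVICE_TYPE_ACCELERATOR, 1, &device, NULL);
    if (err != CL_SUCCESS) { fprintf(stderr, "No accelerator device found\n"); return 1; }

    cl_context context = clCreateContext(NULL, 1, &device, NULL, NULL, &err);
    cl_command_queue queue = clCreateCommandQueue(context, device, 0, &err);

    /* The kernel source would normally be read from a .cl file; a trivial kernel is inlined here */
    const char *src = "__kernel void scale(__global float *x) { x[get_global_id(0)] *= 2.0f; }";
    cl_program program = clCreateProgramWithSource(context, 1, &src, NULL, &err);
    if (clBuildProgram(program, 1, &device, NULL, NULL, NULL) != CL_SUCCESS) {
        fprintf(stderr, "Kernel build failed\n");
        return 1;
    }

    /* ... create kernels and buffers, enqueue work on 'queue' ... */

    clReleaseProgram(program);
    clReleaseCommandQueue(queue);
    clReleaseContext(context);
    return 0;
}

A C source file like this could be compiled in the same way as the C++ example above, using icc rather than icpc and still linking with -lOpenCL.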
OpenCL Kernel Compiler
As demonstrated in the samples, OpenCL code can be compiled at runtime by the OpenCL driver – i.e., when you run your host code it first reads the kernel source file (usually a .cl file) and compiles it before launching the kernel on the MIC device. An offline compiler is also available in your PATH (once you have loaded the OpenCL modulefile) and can be used to compile (and debug) kernel source before running the host code (which may do a lot of work – e.g., reading in a large input file). Run the following:
ioc64 -help
to see the command-line compiler options.
A graphical OpenCL kernel code compiler and debugger is available by running:
KernelBuilder64
The host code in your OpenCL application should then read a binary kernel file (rather than a source kernel file) before launching it on the MIC devices.
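A sketch of that binary route is shown below, assuming the pre-compiled kernel has been written to a file; the helper name, file handling and lack of full error reporting are illustrative assumptions rather than part of the Intel tools.

/* Illustrative helper: create an OpenCL program from a pre-compiled kernel binary file */
#include <stdio.h>
#include <stdlib.h>
#include <CL/cl.h>

cl_program load_binary_program(cl_context context, cl_device_id device, const char *path)
{
    FILE *f = fopen(path, "rb");
    if (!f) { fprintf(stderr, "Cannot open %s\n", path); return NULL; }
    fseek(f, 0, SEEK_END);
    size_t len = (size_t)ftell(f);
    fseek(f, 0, SEEK_SET);

    unsigned char *bin = malloc(len);
    if (fread(bin, 1, len, f) != len) { fclose(f); free(bin); return NULL; }
    fclose(f);

    cl_int err, binary_status;
    const unsigned char *bins[1] = { bin };
    cl_program program = clCreateProgramWithBinary(context, 1, &device, &len,
                                                   bins, &binary_status, &err);
    free(bin);
    if (err != CL_SUCCESS) return NULL;

    /* clBuildProgram() is still required to finalise the program for the device */
    if (clBuildProgram(program, 1, &device, NULL, NULL, NULL) != CL_SUCCESS) return NULL;
    return program;
}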
Further information is available in the Intel OpenCL SDK Users Guide.
Intel’s Parallel Universe magazine contains an article on using OpenCL on Intel Hardware: Leverage Your OpenCL™ Investment on Intel® Architectures (go to page 42).
VTune Profiling/Debugging
Please read the zCSF VTune application page for information on using the Intel VTune Profiler on Xenomorph and the Xeon Phi cards.
Load Monitor
A graphical utility showing load on the Phi cards is available by running on the host:
micsmc-gui
From the Cards pulldown menu select Show All.