Intel Architecture Flags
Do I need to use Architecture Flags?
In short, no. Most jobs will run perfectly well without you needing to know anything about architecture flags; they are not a requirement in your jobscripts.
If you are interested in the available Intel CPU architectures in the CSF, or wish to target a specific architecture for testing or job profiling purposes, read on.
Why use Architecture Flags?
All CSF3 CPU-only compute nodes use Intel CPUs (some users from CSF2 may recall it used AMD CPUs in addition to Intel CPUs). Jobs running on the CSF3 Intel hardware can run on Intel Haswell, Broadwell or Skylake CPUs (Skylake being the newest type in CSF3). These names are Intel codenames used to refer to processor microarchitectures.
You can choose to restrict your job to a specific architecture or allow the system to choose for you:
- Using a specific architecture is needed when timing code. For example, when doing several runs of the code with increasing numbers of cores, the same architecture should be used for each run so that the timings are comparable.
- Using a specific architecture will however mean the scheduler has a smaller number of nodes on which your job can run. Hence your job may wait longer in the batch queue.
- By not specifying an architecture (leaving it to the system to choose) the scheduler has many more nodes on which it can look for free cores to run your job.
- If the number of cores requested by your job is only available on a specific node architecture (e.g., 32 core jobs can only run on Skylake CPUs currently) then the system will automatically choose the correct architecture – you do not need to specify the architecture.
In general, if your job doesn’t care which architecture it runs on, do not restrict it to a specific architecture – your job may spend less time in the queue. See the CSF’s current configuration for the number of compute nodes of each architecture.
See below for other advantages of running on Haswell, Skylake or Broadwell CPUs.
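As an illustration of the timing case described above, the following is a minimal sketch that prints the qsub commands for a scaling study in which every run is pinned to the same architecture. The jobscript name scaling_job.sh, the job names and the chosen core counts are hypothetical placeholders. The options on the command line take the place of the corresponding "#$" lines in a jobscript.

```shell
#!/bin/bash
# Sketch: print the qsub commands for a scaling study pinned to one architecture.
# ARCH, CORE_COUNTS and scaling_job.sh are hypothetical placeholders.
ARCH=skylake
CORE_COUNTS="2 4 8 16"

scaling_commands() {
    # $1 = architecture flag, remaining arguments = core counts
    arch=$1
    shift
    for n in "$@"; do
        # Every run requests the same architecture; only the core count changes
        echo "qsub -cwd -pe smp.pe $n -l $arch -N scale_${arch}_${n} scaling_job.sh"
    done
}

# Print the commands for review (nothing is submitted by this script)
scaling_commands "$ARCH" $CORE_COUNTS
```

The script only prints the commands so they can be checked first. Bear in mind that pinning every run to one architecture may lengthen the queue wait, as noted above.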
Number of Cores Available in each Architecture
The number of cores in CSF compute nodes of the various architecture types is given below:
- Haswell provides 24 cores per node
- Broadwell provides 28 cores per node
- Skylake provides 32 cores per node
The number of cores you request in a job may restrict the architectures that the job can use. For example, a job requesting 26 cores will only ever run on Broadwell or Skylake CPUs, because Haswell nodes provide only 24 cores.
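The relationship between a requested core count and the architectures that can host it can be sketched as a small helper. The function name eligible_archs is hypothetical (it is not a CSF command), and the per-node core counts are simply those listed above. It considers core counts only and ignores any other scheduler rules.

```shell
# Sketch: list the architectures whose nodes have enough cores for a job.
# eligible_archs is a hypothetical helper; core counts come from the list above.
eligible_archs() {
    cores=$1
    result=""
    for entry in haswell:24 broadwell:28 skylake:32; do
        name=${entry%%:*}   # text before the colon
        max=${entry##*:}    # text after the colon
        if [ "$cores" -le "$max" ]; then
            result="${result:+$result }$name"
        fi
    done
    echo "${result:-none}"
}

eligible_archs 20   # prints: haswell broadwell skylake
eligible_archs 26   # prints: broadwell skylake
eligible_archs 32   # prints: skylake
```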
Compiling Code for an Architecture
Your job may run faster on these nodes depending on whether the application software it runs has been compiled with the Intel Compiler and AVX, AVX2 or AVX512 flags to take advantage of the architecture. Please see the CSF Intel Compiler documentation for further info.
Note that applications that have been compiled by the Research Infrastructure Team usually have been compiled with these AVX flags enabled to create an optimized executable.
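If you compile your own code, the choice of flag could look something like the sketch below. It assumes the Intel compiler's -xCORE-AVX2 and -xCORE-AVX512 options; the helper name intel_flag_for and the source file my_app.c are hypothetical, and the flags recommended on the CSF may differ, so treat the CSF Intel Compiler documentation as the authority.

```shell
# Sketch: map a CSF architecture name to an Intel compiler -x option.
# intel_flag_for and my_app.c are hypothetical; the mapping is an assumption.
intel_flag_for() {
    case "$1" in
        haswell|broadwell) echo "-xCORE-AVX2" ;;    # AVX2-capable CPUs
        skylake)           echo "-xCORE-AVX512" ;;  # AVX512-capable CPUs
        *)                 echo "unknown architecture: $1" >&2; return 1 ;;
    esac
}

# Print (rather than run) the compile command for a chosen architecture
echo "icc -O2 $(intel_flag_for skylake) -o my_app my_app.c"
```

An executable built this way for one architecture will generally not run on CPUs that lack those instructions, so a job using it would need the matching architecture flag. The Intel compiler's -ax option, which builds several code paths into one executable, is one way to avoid that restriction.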
Not all Jobs can Use Architecture Flags
Some types of jobs do not accept architecture flags. The qsub command will attempt to check your job’s resource request (combination of architecture, time-limit, parallel environment flags) to determine if you have specified a valid combination. The serial jobs and parallel jobs pages list exactly which flags can be used.
If you do not want to specify an architecture (instead letting the system choose) simply ignore the flags below.
Intel CPU-name Flags
To request Haswell cores (max 24 cores, not valid for serial jobs) for your jobs add the following in your job script:
#$ -l haswell
To request Broadwell cores (max 28 cores, valid for serial jobs) for your jobs add the following in your job script:
#$ -l broadwell
To request Skylake cores (max 32 cores, not valid for serial jobs) for your jobs add the following in your job script:
#$ -l skylake
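The limits quoted for each CPU-name flag above can be collected into a small check to run before submitting. This is only a sketch: check_request is a hypothetical helper, a serial job is taken to mean a request for one core, and qsub performs its own, more complete, validation.

```shell
# Sketch: check a CPU-name flag and core count against the limits listed above.
# check_request is a hypothetical helper, not part of the batch system.
check_request() {
    arch=$1
    cores=$2
    case "$arch" in
        haswell)   max=24; serial_ok=no ;;
        broadwell) max=28; serial_ok=yes ;;
        skylake)   max=32; serial_ok=no ;;
        *) echo "unknown flag: $arch"; return 1 ;;
    esac
    if [ "$cores" -gt "$max" ]; then
        echo "$arch: $cores cores exceeds the ${max}-core maximum"; return 1
    fi
    if [ "$cores" -eq 1 ] && [ "$serial_ok" = no ]; then
        echo "$arch: not valid for serial jobs"; return 1
    fi
    echo "$arch: $cores cores ok"
}

check_request broadwell 1          # accepted
check_request skylake 32           # accepted
check_request haswell 28 || true   # rejected: more than 24 cores
```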
Intel AVX Flags
The Intel Advanced Vector Extensions (AVX) features of the Intel CPUs can improve the performance of applications if they have been compiled to use these extensions. A discussion of the AVX, AVX2 and AVX512 features is beyond the scope of this page. However, if you are compiling your own software, please see the CSF’s Intel compiler page for how to compile your code for these architectures.
If you wish to run your job on a compute node with a minimum AVX capability, the following flags can be used as an alternative to the CPU names given above. You should specify only one of the following flags in your job:
#$ -l avx       # Job can run on Haswell, Broadwell, Skylake CPUs
#$ -l avx2      # Job can run on Haswell, Broadwell, Skylake CPUs
#$ -l avx512    # Job can run on Skylake CPUs
Limiting the available types of CPUs for your job may mean a longer wait in the job queue for those nodes to become free. We generally recommend not targeting specific CPUs. If the Research Infrastructure team have compiled the software using the Intel compiler then it will usually make best use of the AVX capabilities for the particular node the job lands on.
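If you let the system choose the node, it can be useful to record in the job output which AVX level the node actually provides. The sketch below assumes a Linux compute node, where /proc/cpuinfo lists the CPU flags avx, avx2 and avx512f; the helper name highest_avx is hypothetical.

```shell
# Sketch: report the highest AVX level found in a CPU flags string.
# highest_avx is a hypothetical helper; avx, avx2 and avx512f are the
# flag names that Linux reports in /proc/cpuinfo.
highest_avx() {
    case " $1 " in
        *" avx512f "*) echo avx512 ;;
        *" avx2 "*)    echo avx2 ;;
        *" avx "*)     echo avx ;;
        *)             echo none ;;
    esac
}

# Inside a jobscript: read the flags of the node the job landed on (Linux only)
if [ -r /proc/cpuinfo ]; then
    node_flags=$(sed -n '/^flags/{s/^[^:]*: *//p;q;}' /proc/cpuinfo)
    echo "Highest AVX level on $(uname -n): $(highest_avx "$node_flags")"
fi
```

Logging this alongside any timings makes it easier to explain run-to-run differences when no architecture was requested.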
Example Jobscripts
A parallel jobscript that will run on a Haswell compute node.
#!/bin/bash --login
#$ -cwd
#$ -pe smp.pe 6     # A parallel job. Remove this line for serial jobs.
#$ -l haswell       # Ensure the job always lands on a Haswell node.

# Load any required modulefiles for your app. For example:
module load compilers/intel/17.0.7

# Run your app
./my_parallel_app
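Without the -pe flag you will be given one core for a serial job. Note that, according to the list of CPU-name flags above, the haswell flag is not valid for serial jobs, so a serial job would also need to change or drop the architecture flag.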
A parallel jobscript that will run on any compute node that supports AVX2 instructions (i.e., Haswell, Broadwell or Skylake); the CSF will choose:
#!/bin/bash --login
#$ -cwd
#$ -pe smp.pe 6
#$ -l avx2          # Ensure the job always lands on an AVX2-capable compute node.

# Load any required modulefiles for your app. For example:
module load compilers/intel/17.0.7

# Run your app
./my_parallel_app