The CSF2 has been replaced by the CSF3 - please use that system! This documentation may be out of date. Please read the CSF3 documentation instead.

PGI Compilers

Overview

The Portland Compiler Suite contains Fortran 95, Fortran 90, Fortran 77, C, C++, HPF and CUDA Accelerator compilers for machines running 64-bit Linux.

Version 13.6 is the most recent version installed on the CSF.

Note: the PGI CUDA Accelerator compilers support OpenACC directives. This support is available in the PGI C compiler (pgcc) from version 12.10, and in the PGI C++ and Fortran compilers (pgcpp and pgf90) from version 13.6.

The PGI compiler can be used to compile for Intel (Westmere, Sandybridge), AMD (Magny-Cours, Bulldozer) and Nvidia CUDA GPU architectures.

Restrictions on use

There are only two network licenses available site-wide for each of Fortran and C/C++. If you get a license-related error, it is almost certainly because all of our licenses are in use, and you should try again later. For further licensing information see the following webpage:

http://www.applications.itservices.manchester.ac.uk/show_product.php?id=321&tab=licensing

Set up procedure

To gain access to these compilers, run one of the following commands after logging in to the login node.

module load compilers/PGI/13.6
module load compilers/PGI/12.10
module load compilers/PGI/12.8

If your code uses the AMD Core Math Library (ACML) for routines such as BLAS, LAPACK or FFT, and you are compiling for AMD Bulldozer nodes, a version of the compiler containing an ACML optimized for Bulldozer's Fused Multiply-Add (FMA4) instructions is available by loading one of the following:

module load compilers/PGI/14.10-acml-fma4
module load compilers/PGI/13.6-acml-fma4

Basic compilation commands

Basic compilation commands are as follows:

C

pgcc hello.c -o hello

C++

pgCC hello.cpp -o hello
pgcpp hello.cpp -o hello

Note: use pgc++ if link compatibility with the GNU C++ compiler is required (see man pgc++).

Fortran

pgf90 hello.f90 -o hello
pgf95 hello.f90 -o hello
pgfortran hello.f90 -o hello

High Performance Fortran

pghpf hello.hpf -o hello

Target Architectures

To compile for a specific architecture, such as Intel Sandybridge or AMD Bulldozer, use the -tp=architecture flag. For example:

pgcc -tp=sandybridge hello.c -o hello_sb
pgcc -tp=bulldozer hello.c -o hello_bd

Further optimization flags may be appropriate for different architectures. For example, on AMD Bulldozer it is recommended to use the following:

pgcc -tp=bulldozer -O3 -fast hello.c -o hello_bd

See below for compiling for NVIDIA GPUs, which uses the -ta=accelerator flag to specify NVIDIA accelerators.

Sample code

Sample code can be found in three subfolders of $PGI/linux86-64/$PGIVER/etc/samples:

/accel
/cudafor
/openacc

Each subfolder contains makefiles to compile the sample code. If you copy these folders and see an error such as "/usr/bin/ld: cannot open output file c1.exe: Permission denied" when compiling, ensure you have permission to write files to your copies of the folders.

Compiling and submitting a test CUDA accelerator application in C

Note: this method uses PGI-specific directives, not OpenACC directives. We recommend using OpenACC for portability (see below).

It is not necessary to be logged into a node hosting a GPU in order to compile CUDA executables. Copy one of the samples to your current working directory:

cp $PGI/linux86-64/$PGIVER/etc/samples/accel/c1.c .

Compile it for NVIDIA CUDA either with the makefile provided in /accel or with the following command:

pgcc -ta=nvidia -Minfo=accel -fast c1.c -o c1.exe

The following SGE script, pgi.sge, is suitable for submitting the above executable:

#!/bin/bash
#$-cwd
#$-S /bin/bash
#$-l nvidia
./c1.exe > out.txt

Submit this as a batch job using:

qsub pgi.sge

If successful, the output file out.txt will contain the following:

100000 iterations completed

Source code for an example application

Here is the source code for the example referred to above:

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

int main( int argc, char* argv[] )
{
    int n;      /* size of the vector */
    float *restrict a;  /* the vector */
    float *restrict r;  /* the results */
    float *restrict e;  /* expected results */
    int i;
    if( argc > 1 )
        n = atoi( argv[1] );
    else
        n = 100000;
    if( n <= 0 ) n = 100000;

    a = (float*)malloc(n*sizeof(float));
    r = (float*)malloc(n*sizeof(float));
    e = (float*)malloc(n*sizeof(float));
    for( i = 0; i < n; ++i ) a[i] = (float)(i+1);

    #pragma acc region
    {
        for( i = 0; i < n; ++i ) r[i] = a[i]*2.0f;
    }
    /* compute on the host to compare */
    for( i = 0; i < n; ++i ) e[i] = a[i]*2.0f;
    /* check the results */
    for( i = 0; i < n; ++i )
        assert( r[i] == e[i] );
    printf( "%d iterations completed\n", n );
    return 0;
}

Note that the only addition needed to run the loop on the GPU is the line:

#pragma acc region

Compiling and submitting OpenACC C code

Copy one of the samples to your current working directory:

cp $PGI/linux86-64/$PGIVER/etc/samples/openacc/acc_c1.c .

Compile it either with the makefile provided in /openacc or with the following command:

pgcc -o acc_c1.exe acc_c1.c -acc -Minfo=accel -fast   # note the additional -acc flag

The code in this example is very similar to that given above. However, the OpenACC pragma is slightly different, as it introduces the kernels keyword. Examine the acc_c1.c file carefully.

#pragma acc kernels loop