gprMax

Overview

gprMax is open source software that simulates electromagnetic wave propagation. It uses Yee’s algorithm to solve Maxwell’s equations in 3D using the Finite-Difference Time-Domain (FDTD) method. The finite-difference expressions for the spatial and temporal derivatives are central differences and second-order accurate. It is designed for simulating Ground Penetrating Radar (GPR) and can be used to model electromagnetic wave propagation in fields such as engineering, geophysics, archaeology, and medicine. It supports multi-core (OpenMP) and multi-node (MPI, via OpenMPI) parallelism and can also be used on the Nvidia GPU nodes.
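The central-difference leapfrog update at the heart of the FDTD method can be illustrated with a minimal 1D sketch in normalised units. This is illustrative only – it is not gprMax code, and real gprMax simulations use the full 3D Yee scheme:

```python
# Minimal 1D FDTD (Yee) leapfrog sketch in normalised units.
# Illustrative only -- this is NOT gprMax code.
import math

nz, nt = 200, 300        # grid cells, time steps
c = 0.5                  # Courant number (stability requires c <= 1 in 1D)
ez = [0.0] * nz          # E-field samples at integer grid points
hy = [0.0] * (nz - 1)    # H-field samples, staggered half a cell/step

for t in range(nt):
    # Second-order central differences: each field is updated from the
    # spatial difference of the other, offset half a step in time.
    for k in range(nz - 1):
        hy[k] += c * (ez[k + 1] - ez[k])
    for k in range(1, nz - 1):
        ez[k] += c * (hy[k] - hy[k - 1])
    ez[nz // 2] += math.exp(-((t - 30) ** 2) / 100.0)  # soft Gaussian source

print(max(abs(v) for v in ez))
```

With the Courant number at or below 1 the scheme stays stable; the injected Gaussian pulse propagates outwards and reflects off the (perfectly conducting) grid edges.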

Please note, this software is not installed centrally because it requires you to install it in an Anaconda Python conda environment, which will go into your CSF home directory.

However, we provide an installer script to do the install and generate some example jobscripts and a helper modulefile. Complete instructions are provided below.

Restrictions on use

There are no restrictions on accessing this software on the CSF. It is released under the GNU GPL v3 license and all usage must adhere to that license.

Installation Procedure

The following procedure will install gprMax into an Anaconda Python conda environment in your CSF home directory. It will also create a private modulefile (a modulefile in your CSF home area) to help when you run the application in your jobs, and some example jobscripts in your scratch area.

The installer will download the latest version (the master branch) from GitHub. Hence the version installed will be whatever is in the GitHub repo at the time of installation.

# First start an interactive session
qrsh -l short

# Then, when you have been logged in to a node:
module load apps/python/gprmax-installer/1.0
gprMax-install.sh
   #
   # NOTE: If it detects you already have an install of gprMax
   # it will ask you to add some extra flags to say whether to
   # update your installation or leave it as is. So this script
   # will not do any harm.
   #
   # To see the various options available run:
   # gprMax-install.sh -h

# If everything installs OK, go back to the login node
exit

After installing you should be able to do the following to see what has been installed:

ls ~/software/gprMax            # The git repo
ls ~/scratch/cylinder-*.qsub    # Example jobscripts
ls ~/privatemodules/gprMax      # A local modulefile
conda list -n gprMax            # A local conda environment (lists a lot of python packages)

Remember to exit any interactive session to return to the login node before submitting batch jobs. It isn’t possible to submit jobs with qsub while you are in an interactive session started with the qrsh command.

Manual Installation

NOTE: You do NOT need to do this step if you ran the gprMax-install.sh installer script.

If you prefer to see what is happening at each stage of the install, the commands used by the above script to install the software are shown below. You can run these commands instead of the script, although this will not create the private modulefile or the example jobscripts.

qrsh -l short

#### Create area for git repo named ~/software/gprMax (change as required)
mkdir ~/software
cd ~/software

#### Clone the developers' git repo
module load tools/env/proxy
module load apps/git/2.19.0
git clone https://github.com/gprMax/gprMax.git

#### Now create a conda environment containing the prerequisite packages
module load apps/anaconda3/5.2.0
module load libs/cuda/10.1.168
module load mpi/gcc/openmpi/3.1.4         # Use our central OpenMPI library
module load compilers/gcc/6.4.0           # Need to use this compiler with our OpenMPI

cd ~/software/gprMax
conda env create -f conda_env.yml         # Builds to ~/.conda/envs/gprMax

#### Now build/package the repo source into the conda env
source activate gprMax
cd ~/software/gprMax
python setup.py build 2>&1 | tee python-setup-py-build.log
python setup.py install 2>&1 | tee python-setup-py-install.log
pip install --log pip-mpi4py.log mpi4py
pip install --log pip-pycuda.log pycuda

# That's it! You still have your conda env activated. To deactivate:
source deactivate

# Exit from the interactive session back to the login node
exit

Private Modulefile

NOTE: You do NOT need to do this step if you ran the gprMax-install.sh installer script. The private modulefile is automatically written for you.

If you did not use our installer script, you may wish to create the modulefile yourself:

To help you use gprMax it is convenient to create a private modulefile. These modulefiles exist in (a special folder in) your home directory and behave just like the central modulefiles you have probably used for centrally installed software. Run the following commands on the login node to create the private modulefile. After that we show how to load the private modulefile:

# Create the special directory where private modules are kept (it must use this name)
mkdir -p ~/privatemodules

Now create a modulefile named gprMax which will load the central modulefiles used by gprMax. This makes it easy to use the gprMax that you installed into your home directory earlier. You can copy-and-paste the following lines in one go as a command to run on the login node – it will create the modulefile for you without you needing to use a text editor (ensure you copy all the way to, and including, the EOF line):

cat > ~/privatemodules/gprMax << 'EOF'
#%Module
#### Load the modulefiles needed for gprMax and set an env var
module load apps/anaconda3/5.2.0
module load libs/cuda/10.1.168
module load mpi/gcc/openmpi/3.1.4
module load compilers/gcc/6.4.0

# Needed to stop gprMax and HDF5 complaining about running in scratch
setenv HDF5_USE_FILE_LOCKING FALSE

# Convenience (modulefiles are Tcl, so use $env(HOME) rather than ~,
# which would not be expanded when jobscripts later use $GPRMAXDIR)
setenv GPRMAXDIR $env(HOME)/software/gprMax

if { [ module-info mode load ] } {
  puts stderr "You should now activate the gprMax conda environment using:"
  puts stderr "source activate gprMax"
}
EOF

To use the private modulefile you must load a special modulefile named use.own (which makes the module command look for modulefiles within your ~/privatemodules folder). Then load the gprMax modulefile. This can be done in one command using:

module load use.own gprMax

Set up procedure

We now recommend loading modulefiles within your jobscript so that you have a full record of how the job was run. See the example jobscript below for how to do this. Alternatively, you may load modulefiles on the login node and let the job inherit these settings.

Load one of the following modulefiles:

# This assumes you have followed the installation instructions above
module load use.own gprMax

Running the application

Please do not run gprMax on the login node. Jobs should be submitted to the compute nodes via batch.

Serial batch or multi-core (OpenMP) CPU-only A-scan job submission

NOTE: This jobscript will already exist in your scratch area if you ran the gprMax-install.sh installer script.

This is not a GPU job – it is for CPUs only. See below for a GPU job (that jobscript has some extra lines to allow gprMax to use the GPU correctly).

The job below runs a single simulation using multiple CPU cores. The gprMax documentation refers to this as an A-scan job, where only a single model run is performed (see later for B-scan jobs, where multiple runs are performed).

Create a batch submission script (which will load the modulefile in the jobscript), for example:

#!/bin/bash --login
#$ -cwd             # Job will run from the current directory
                    # NO -V line - we load modulefiles in the jobscript

### To run a multi-core (parallel CPU) job add the line:
#$ -pe smp.pe   8   # Where 8 is the number of cores (can be 2--32)

# Inform gprMax how many cores it can use
export OMP_NUM_THREADS=$NSLOTS

# Load the private modulefiles
module load use.own gprMax

# Create a local scratch dir for this job and copy the .in file there
mkdir ~/scratch/gprmax_job_${JOB_ID}
cd ~/scratch/gprmax_job_${JOB_ID}

# As an example, we use a sample .in file from the installation.
# The $GPRMAXDIR is the install area (set by the private modulefile)
cp $GPRMAXDIR/user_models/cylinder_Ascan_2D.in .

# Activate the conda env and run python. Note, in jobscripts
# use 'source activate' instead of 'conda activate'.
source activate gprMax
python -m gprMax cylinder_Ascan_2D.in
source deactivate

Submit the jobscript using:

qsub scriptname

where scriptname is the name of your jobscript.
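Once the job completes, the A-scan results are written to an HDF5 .out file (cylinder_Ascan_2D.out in this example). Below is a minimal sketch of inspecting such a file with h5py, assuming the receiver data sits under a rxs/rx1 group with a dt attribute at the file root (this layout follows the gprMax output-file description but treat the exact paths as assumptions – check your own file with h5ls). A tiny stand-in file is created first so the sketch is self-contained; on the CSF you would open your real .out file instead:

```python
# Sketch: inspect a gprMax A-scan .out file with h5py.
# NOTE: the rxs/rx1/Ez path and the 'dt' attribute are assumptions based on
# the gprMax output-file description -- verify against your own file.
import h5py
import numpy as np

# Build a tiny stand-in file (replace with your real cylinder_Ascan_2D.out)
with h5py.File("example_Ascan.out", "w") as f:
    f.attrs["dt"] = 1e-12                              # time step (s)
    f.create_dataset("rxs/rx1/Ez", data=np.zeros(100)) # Ez at receiver 1

with h5py.File("example_Ascan.out", "r") as f:
    dt = f.attrs["dt"]
    ez = f["rxs/rx1/Ez"][:]
    print(len(ez), dt)
```

The same approach works for the other field components (Ex, Ey, Hx, Hy, Hz) recorded at each receiver.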

Single GPU batch A-scan job submission

NOTE: This jobscript will already exist in your scratch area if you ran the gprMax-install.sh installer script.

As with the previous CPU-only example, this is what gprMax refers to as an A-scan job – a single model run performed on the GPU (and also with multiple CPU cores).

There are some extra lines in this jobscript to make gprMax work correctly on our multi-GPU compute nodes. Please ensure your jobscript is correct.

#!/bin/bash --login
#$ -cwd             # Job will run from the current directory
#$ -l v100=1        # Use one Nvidia v100 (Volta) GPU

### GPU jobs can use up to 8 cores per GPU. To use multiple CPU cores, add:
#$ -pe smp.pe   8   # Where 8 is the number of cores

# Inform gprMax how many CPU cores it can use
export OMP_NUM_THREADS=$NSLOTS

# Load the private modulefiles
module load use.own gprMax

# Create a local scratch dir for this job and copy the .in file there
mkdir ~/scratch/gprmax_job_${JOB_ID}
cd ~/scratch/gprmax_job_${JOB_ID}

# As an example, we use a sample .in file from the installation.
# The $GPRMAXDIR is the install area (set by the private modulefile)
cp $GPRMAXDIR/user_models/cylinder_Ascan_2D.in .

# Fix a problem with how gprMax and pyCUDA use CUDA_VISIBLE_DEVICES
export MYGPU=$CUDA_VISIBLE_DEVICES
unset CUDA_VISIBLE_DEVICES

# Activate the conda env and run python. Note, in jobscripts
# use 'source activate' instead of 'conda activate'.
source activate gprMax
python -m gprMax cylinder_Ascan_2D.in -gpu $MYGPU
source deactivate

Submit the jobscript using:

qsub scriptname

where scriptname is the name of your jobscript.

Small parallel (MPI) batch CPU-only B-scan job submission

This is not a GPU job – it is for CPUs only. See above for a GPU job (that jobscript has some extra lines to allow gprMax to use the GPU correctly).

The MPI method of running parallel programs allows gprMax to do task farming. This is where a number of simulations (tasks) are run. The gprMax documentation refers to this as a B-scan job, where a B-scan consists of a number of A-scan jobs which can be run independently.

For example, the B-scan job below runs 7 models by running 7 MPI processes, each with 4 CPU cores. Note that gprMax uses an extra MPI process as a master task to coordinate the work. Therefore we run 8 MPI processes in total, each using 4 CPU cores, giving a total of 32 cores in use by the job.
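The core/process arithmetic above can be sanity-checked with a quick sketch (hypothetical hard-coded values; in a real job the jobscript derives these from $NSLOTS, which the batch system sets from the -pe request):

```python
# Check the B-scan core/process arithmetic (values are hypothetical here;
# a real jobscript reads NSLOTS from the batch system).
NSLOTS = 32               # total cores requested via '#$ -pe smp.pe 32'
OMP_NUM_THREADS = 4       # cores given to each MPI process

num_mpi_procs = NSLOTS // OMP_NUM_THREADS  # MPI processes that fit in the job
num_models = num_mpi_procs - 1             # one process acts as the master task

print(num_mpi_procs, num_models)  # -> 8 7
```

So a 32-core request with 4 threads per process yields 8 MPI processes: 1 master plus 7 workers, i.e. 7 models.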

Create a batch submission script (which will load the modulefile in the jobscript), for example:

#!/bin/bash --login
#$ -cwd             # Job will run from the current directory
#$ -pe smp.pe 32    # Number of cores for a single-node job (2--32)

# Load the modulefiles
module load use.own gprMax

# Do a gprMax 'B-scan' job (multiple A-scans) using an MPI job where each
# MPI process does a multi-core A-scan.
#
# For example: Run 8 MPI processes, each using 4 CPU cores = 32 cores in total

# We choose how many cores each MPI process running gprMax can use
export OMP_NUM_THREADS=4

# Calculate how many MPI processes we can run and how many models
export NUM_MPI_PROCS=$((NSLOTS/OMP_NUM_THREADS))
export NUM_MODELS=$((NUM_MPI_PROCS-1))

echo "Processing $NUM_MODELS models by running $NUM_MPI_PROCS MPI procs"
echo "(one extra for the master task, using MPI spawn), each using $OMP_NUM_THREADS cores"

# Create a local scratch dir for this job and copy the .in file there
mkdir ~/scratch/gprmax_job_${JOB_ID}
cd ~/scratch/gprmax_job_${JOB_ID}

# As an example, we use a sample .in file from the installation.
# The $GPRMAXDIR is the install area (set by the private modulefile)
cp $GPRMAXDIR/user_models/cylinder_Bscan_2D.in .

# Activate the conda env and run python. Note, in jobscripts
# use 'source activate' instead of 'conda activate'.
source activate gprMax
# Run gprMax as an MPI job (this will use the gprMax MPI spawn method)
python -m gprMax cylinder_Bscan_2D.in -n $NUM_MODELS -mpi $NUM_MPI_PROCS
source deactivate

Submit the jobscript using:

qsub scriptname

where scriptname is the name of your jobscript.

Large parallel (multi-node MPI) batch CPU-only B-scan job submission

This is not a GPU job – it is for CPUs only. See above for a GPU job (that jobscript has some extra lines to allow gprMax to use the GPU correctly).

The MPI method of running parallel programs allows gprMax to do task farming. This is where a number of simulations (tasks) are run. The gprMax documentation refers to this as a B-scan job, where a B-scan consists of a number of A-scan jobs which can be run independently.

This is a multi-node MPI job which spans two (or more) compute nodes, allowing you to process a large number of models in a B-scan job. For example, the B-scan job below runs 7 models by running 7 MPI processes, each with 6 CPU cores. Note that gprMax uses an extra MPI process as a master task to coordinate the work. Therefore we run 8 MPI processes in total, 4 MPI processes on each of two compute nodes (which have 24 CPU cores each), and each MPI process uses 6 CPU cores, giving a total of 48 cores in use by the job.

Create a batch submission script (which will load the modulefile in the jobscript), for example:

#!/bin/bash --login
#$ -cwd
#$ -pe mpi-24-ib.pe 48            # Multiple compute-node MPI job

### This does not use the gprMax MPI spawn method so that
### jobs can correctly span multiple nodes.

# Do a gprMax 'B-scan' job (multiple A-scans) using an MPI job
# where each MPI process does a multi-core A-scan.

# Inform each MPI process running gprMax how many cores it can use.
# Each CSF compute node used in multinode jobs has 24 cores. This
# example will run 4 MPI process on *each* compute node, where each MPI
# process will use 6 cores, and there will be two compute nodes in use.
# We choose how many CPU cores each MPI process will use
export OMP_NUM_THREADS=6

# Calculate how many MPI processes to run
export NUM_MPI_PROCS=$((NSLOTS/OMP_NUM_THREADS))
export NUM_MODELS=$((NUM_MPI_PROCS-1))
echo "Processing $NUM_MODELS models by running $NUM_MPI_PROCS MPI procs"
echo "(one extra for the master task, without MPI spawn), each using $OMP_NUM_THREADS cores"

# Load the modulefiles
module load use.own gprMax

# Create a local scratch dir for this job and copy the .in file there
mkdir ~/scratch/gprmax_job_${JOB_ID}
cd ~/scratch/gprmax_job_${JOB_ID}
cp $GPRMAXDIR/user_models/cylinder_Bscan_2D.in .

# Activate the conda env and run python. Note, in jobscripts
# use 'source activate' instead of 'conda activate'.
source activate gprMax
# Use the usual mpirun MPI command (which does not use MPI spawn) when doing multi-node
mpirun -n $NUM_MPI_PROCS --map-by node python -m gprMax cylinder_Bscan_2D.in -n $NUM_MODELS --mpi-no-spawn
source deactivate

Submit the jobscript using:

qsub scriptname

where scriptname is the name of your jobscript.

Further info

Updates

None.

Last modified on February 7, 2024 at 4:04 pm by George Leaver