Basics of running uncompiled MATLAB code

October 2023: It is no longer necessary to compile your MATLAB code. Here we show how to run un-compiled MATLAB code as a batch job.

Please note: all MATLAB jobs must be submitted to the batch system whether you compile your MATLAB code or not. Users must NOT run MATLAB on the CSF login nodes!

Controlling the number of cores used by MATLAB jobs

Before we show how to run your un-compiled MATLAB code, we must discuss how many CPU cores you wish your MATLAB jobs to use.

It is now possible to finely control the number of cores used by MATLAB jobs. There are three options available for MATLAB code:

  • Use a single CPU core on the compute node where the job runs, or
  • Use the parpool MATLAB command in your code for explicit parallelisation, or
  • Rely on the implicit parallelisation of many MATLAB functions – they may automatically use multiple cores, particularly when processing large arrays or matrices.

Hence you can run your code as a single-core (serial) application or a multi-core (parallel) application.

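Some older documents on the web say that the maxNumCompThreads function is unreliable and due to be removed from MATLAB. It is nevertheless the function used in the implicit parallelism example further down this page to limit MATLAB to the cores your job requested, so check the documentation for your MATLAB version if in doubt.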

Many MATLAB functions are multi-threaded by default. An extensive list of these functions can be found in the following blog post written by Dr. Mike Croucher: http://www.walkingrandomly.com/?p=1894. Hence if you run your MATLAB code as a multi-core application, it may automatically use multiple CPU cores without you having to explicitly write parallel MATLAB code.

In addition to the implicit, multi-threaded computation offered by these functions, you may wish to investigate use of the Parallel Computing Toolbox.  This allows for explicit multi-core computation as well as GPU computing.

Running MATLAB

Un-compiled MATLAB requires a different command in the jobscript to run the application compared to compiled MATLAB jobs.

You need to turn off use of the GUI, and then give MATLAB a command to run. It is common for the command to have the same name as one of your .m files. This will allow MATLAB to find the file.

For example:

% Write a MATLAB file named 'my_analysis.m' containing the code to run:
% My matlab code
myVar = ones(5);
save('exampleoutput.mat', 'myVar');
fprintf('Hello from the CSF!\n');
fprintf('Be sure to save any variables you need to a mat file.\n');
exit

When running the above file as an un-compiled MATLAB code you should use the command:

# The command to use in your jobscripts to run the my_analysis.m file un-compiled
matlab -nojvm -batch "my_analysis"
                          #
                          # NO .m added - just use the name without the .m
                          # The -nojvm flag disables Java and makes start-up faster.
                          # If you use features of MATLAB that use Java, remove this flag.

MATLAB will look for a file named my_analysis.m and run the commands it contains.
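A jobscript can check that the file exists before starting MATLAB, so that a misspelt name fails immediately with a clear message. The following is a minimal sketch: the helper name require_mfile is hypothetical and is not part of MATLAB or the batch system. MATLAB also searches its own path, so this check only covers the common case where the .m file is in the directory the job runs in.

```shell
#!/bin/bash
# Hypothetical helper (not part of MATLAB or the CSF): succeed only if NAME.m
# exists in the current directory. Pass the name WITHOUT the .m extension.
require_mfile() {
    local name="$1"
    if [ ! -f "${name}.m" ]; then
        echo "Cannot find ${name}.m in $(pwd)" >&2
        return 1
    fi
}

# Sketch of use in a jobscript:
#   require_mfile my_analysis && matlab -nojvm -batch "my_analysis"
```

Because the helper returns a non-zero status when the file is missing, the && in the usage line stops MATLAB from being started at all in that case.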

We must also distinguish between serial (1-core) and parallel (multicore) jobs when running un-compiled MATLAB code.

Serial (1-core) Jobs

By default MATLAB will try to make use of multiple CPU cores. If you want to use only one CPU core, you must tell MATLAB so by adding the -singleCompThread flag to the matlab command.

An example jobscript to run un-compiled MATLAB code, using 1 core:

#!/bin/bash --login
#$ -cwd
# Load the version you require
module load apps/binapps/matlab/R2022a

# Run MATLAB without the GUI or any Java, with only 1 core
matlab -singleCompThread -nojvm -batch "my_analysis"
                                           #
                                           # MATLAB will look for a file named my_analysis.m

Submit the jobscript to the batch system using:

qsub jobscript

where jobscript is the name of your jobscript.

Parallel Jobs

If you don’t force MATLAB to use a single core (see above) then you must say how many cores MATLAB can use for parallel computation. By default it will try to use all cores on the compute node. If your jobscript doesn’t actually request all of the cores then you will be trying to use the cores in use by other users’ jobs. This will slow down your own job and also their jobs!

Users found to be running MATLAB jobs that are overloading the compute nodes (using more cores than the jobscript requests) may have the number of jobs they can run severely limited by the sysadmins.
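One way to reduce the risk of overloading a node is to let the jobscript choose the MATLAB flag from the number of cores the batch system actually granted. The sketch below assumes NSLOTS is 1 (or unset) for serial jobs; the helper name matlab_thread_flag is hypothetical. It only prints the command it would run.

```shell
#!/bin/bash
# Hypothetical helper: print -singleCompThread when the job has only one core,
# and print nothing otherwise.
# Assumption: the batch system sets NSLOTS; an unset NSLOTS is treated as 1 core.
matlab_thread_flag() {
    local slots="${NSLOTS:-1}"
    if [ "$slots" -le 1 ]; then
        echo "-singleCompThread"
    fi
}

# Dry run: show the command a jobscript would use for the granted core count.
# ${flag:+$flag } adds the flag and a space only when the flag is non-empty.
flag="$(matlab_thread_flag)"
echo "Would run: matlab ${flag:+$flag }-nojvm -batch \"my_analysis\""
```

For jobs with more than one core the flag is omitted, and the MATLAB code itself must still limit how many cores it uses, as the explicit and implicit examples on this page do.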

Two examples of parallel MATLAB code are given here – explicit and implicit parallelism. Many MATLAB functions use implicit parallelism so you may benefit from running that type of job if nothing else. First we discuss explicit parallel jobs.

Explicit Parallel Jobs

An example of explicit parallelism where you explicitly request the parallel functionality in MATLAB using parpool (e.g., in a file named par_explicit.m):

% Explicit parallelism using 'parpool'. We read the number of cores
% to use from the SGE batch system NSLOTS environment variable.
nslots = str2double(getenv('NSLOTS'));

% Start a parpool (will use the 'local' profile by default)
parpool(nslots);
pp = gcp;
fprintf('Parpool size %d\n', pp.NumWorkers);

% Note: see the MATLAB manual for spmd or parfor commands
spmd
    fprintf('Hello from worker %d\n', labindex);
end

% Shutdown the parpool  
delete(gcp);
exit

A suitable jobscript is as follows:

#!/bin/bash --login
#$ -cwd
#$ -pe smp.pe 8      # Can be 2--32 cores
# Load the version you require
module load apps/binapps/matlab/R2022a
matlab -batch "par_explicit"
                     #
                     # MATLAB will look for a file named par_explicit.m
                     # Note: You CANNOT use the -nojvm flag with parpool.

Submit the jobscript to the batch system using:

qsub jobscript

where jobscript is the name of your jobscript.

Implicit Parallel Jobs

Here we assume MATLAB will be using multiple CPU cores internally in its own functions (e.g., when processing large matrices). You don’t request any parallel features, but you still need to control how many cores MATLAB uses so that it matches the number of cores requested in the jobscript. If you fail to do this, MATLAB may try to use all of the cores on a compute node, and some of those may be in use by other jobs.

Some example MATLAB code for implicit parallelism (e.g., in a file named par_implicit.m):

% Implicit parallelism using maxNumCompThreads
nslots = str2double(getenv('NSLOTS'));
maxNumCompThreads(nslots);

% ... create some matrices ...
A = rand(1500);
B = rand(1500);

% Now, operations that use implicit parallelism such as matrix multiplication
% will use the required number of threads. We time the operation with tic ... toc.
% Try submitting this job with a different number of cores.
tic
C = A * B;                  % matrix multiplication
toc
exit

A suitable jobscript is as follows:

#!/bin/bash --login
#$ -cwd
#$ -pe smp.pe 8      # Can be 2--32 cores
# Load the version you require
module load apps/binapps/matlab/R2022a
matlab -nojvm -batch "par_implicit"
                        #
                        # MATLAB will look for a file named par_implicit.m

Submit the jobscript to the batch system using:

qsub jobscript

where jobscript is the name of your jobscript.
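The comment in par_implicit.m suggests trying the job with different numbers of cores. As a rough sketch, and assuming that a -pe option given on the qsub command line takes precedence over the one embedded in the jobscript (check the qsub man page on your system), the submissions could be generated in a loop. The commands are only printed here, and the helper name print_submissions is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print (not run) one qsub command per core count.
# Assumption: a command-line -pe overrides the '#$ -pe' line in the jobscript.
# Usage: print_submissions JOBSCRIPT CORES [CORES ...]
print_submissions() {
    local jobscript="$1"
    shift
    local cores
    for cores in "$@"; do
        echo "qsub -pe smp.pe ${cores} ${jobscript}"
    done
}

# 'jobscript' is a placeholder for the name of your jobscript file.
print_submissions jobscript 2 4 8
```

Each job then reports its own tic ... toc time, so the output files can be compared to see how the matrix multiplication scales.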

Hints, Tips and Code Samples

Passing Command-line Args to Un-compiled Code

You may wish to pass command-line parameters to your un-compiled MATLAB code. For example, suppose you wish to pass in a couple of numbers and a string representing settings to be used by your code. The jobscript will look something like:

#!/bin/bash --login
#$ -cwd
#$ -pe smp.pe 4
module load apps/binapps/matlab/R2022a

## If you need to run the same code with many different parameters see the next tip about job arrays.

# Pass in two numbers used by the MATLAB code: 500 and 10000 (for example)
# and a string 'experiment_name' (for example), which must be enclosed in single quotes.
matlab -nojvm -batch "my_analysis_function(500, 10000, 'experiment_name')"
                          #
                          # MATLAB will look for a file named my_analysis_function.m

You will need to write your my_analysis_function.m file to read these args. The entire code must be run from a top-level function:

% my_analysis_function.m source code
function exitcode = my_analysis_function(xparam, iters, xp_name)
  % Args come in as numbers or, if supplied in quotes, as a string.
  fprintf( 'Args supplied: xparam = %d, iters = %d, xp_name = %s\n', xparam, iters, xp_name );

  % ...your code...

  % Set dummy function return val
  exitcode = 0;
end

In the above sample note that

  • The code is wrapped in a function and so will need an exit value – we set a dummy value at the end.
  • The args are presented to your code as numbers, unless an arg is passed inside quotes, in which case it is treated as a string.
  • To pass a string from an environment variable as an arg (e.g., $JOB_NAME), you must still use single quotes:
    # The $JOB_NAME environment variable contains the name of your batch job - pass in as a string
    matlab -nojvm -batch "my_analysis_function(500, 10000, '$JOB_NAME')"
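The quoting works because the shell expands $JOB_NAME inside the double quotes before MATLAB sees the command, leaving the single quotes for MATLAB to read. The sketch below prints the resulting string rather than running MATLAB. The helper matlab_quote is hypothetical: it doubles any single quote in the value, which is how MATLAB escapes a quote inside a character vector.

```shell
#!/bin/bash
# Hypothetical helper: escape a value for use inside a MATLAB '...' string
# by doubling every single quote it contains.
matlab_quote() {
    local value="$1"
    local sq="'"
    printf '%s' "${value//$sq/$sq$sq}"
}

# JOB_NAME is set by the batch system; a fallback value is used for this demo.
name="$(matlab_quote "${JOB_NAME:-my_job}")"

# Print the command string that would be given to: matlab -nojvm -batch "..."
echo "my_analysis_function(500, 10000, '${name}')"
```

Most job names contain no quotes, in which case the helper returns the value unchanged.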
    

Using the Job Array task ID as a Command-line Arg

The above method could be used within an SGE job array. This is a single job that runs the same MATLAB code multiple times as individual tasks, but with different input parameters. More than one task can run at the same time. Each task uses a special SGE variable ($SGE_TASK_ID) to pass a different parameter to your MATLAB code. This is a great alternative to using a for loop because, rather than running each iteration sequentially, the job array tasks can run at the same time, i.e., in parallel.

In the example jobscript below, the $SGE_TASK_ID value is passed to the MATLAB code. This can then do something unique with that value. For example:

#!/bin/bash --login
#$ -cwd
### Jobarray with 1000 tasks numbered 1,2,...,1000
#$ -t 1-1000

module load apps/binapps/matlab/R2022a
# Include the next two lines to avoid a common problem with MATLAB job arrays (described in the next tip)
export MCR_CACHE_ROOT=$TMPDIR/mcrCache
mkdir -p $MCR_CACHE_ROOT

# Run our matlab code giving it the current task id on the command-line
matlab -nojvm -batch "my_analysis_function($SGE_TASK_ID)"
                          #
                          # MATLAB will look for a file named my_analysis_function.m

The MATLAB code will then read the first arg as shown earlier:

% my_analysis_function.m source code - receives a number parameter
function exitcode = my_analysis_function(task_id_arg)
  % Args come in as numbers:
  fprintf( 'Executing my code with SGE_TASK_ID = %d\n', task_id_arg);

  % ...your code...

  % Set dummy function return val
  exitcode = 0;
end

Our batch system documentation contains lots of examples of how to use job arrays.
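One common way to do something unique with the task ID is to use it as a line number in a text file of parameters, so that task 1 reads line 1, task 2 reads line 2, and so on. This is a minimal sketch: the parameter file, its contents and the helper param_for_task are all hypothetical, and the MATLAB command is printed rather than run.

```shell
#!/bin/bash
# Hypothetical helper: print line number TASK_ID of a parameter file.
# Usage: param_for_task FILE TASK_ID
param_for_task() {
    local file="$1" task_id="$2"
    sed -n "${task_id}p" "$file"
}

# Demo with a throwaway parameter file (a real job would use its own file).
params="$(mktemp)"
printf '%s\n' 0.1 0.5 2.0 > "$params"

# SGE_TASK_ID is set by the batch system in job arrays; default to 2 for the demo.
value="$(param_for_task "$params" "${SGE_TASK_ID:-2}")"

# Print the command string that would be given to: matlab -nojvm -batch "..."
echo "my_analysis_function(${value})"
rm -f "$params"
```

With this approach the number of lines in the parameter file should match the upper limit given to the -t option in the jobscript.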

MATLAB Job Array Error and the Fix

If using job arrays to run multiple instances of MATLAB (similar to condor), you may receive an error message about accessing a lock file:

terminate called after throwing an instance of 'dsFileBasedLockError'
what(): \
 Tried to obtain a lock on a directory without write permission: \
/mnt/iusers01/xy01/mabcxyz12/.mcrCache7.17/.deploy_lock.27

Or you may receive an error message of the form:

Could not access the MATLAB Runtime component cache. Details: Some error has occurred in the file: mcr_cache/mclComponentCache.cpp, at line: 328.
 The error message is: 
dsFileAccessError exception
; component cache root:/mnt/iusers01/xy01/mabcxyz12/.mcrCache9.6; componentname: example

This occurs when many job array tasks run concurrently and all try to access the same temporary directory used to store the lock file. The solution is to force MATLAB to create the temporary lock files on the nodes where the tasks are running rather than in your home or scratch space. Add the following lines to your jobscript before the matlab -nojvm ... line:

# Extra settings in your jobscript to make job-arrays run correctly
export MCR_CACHE_ROOT=$TMPDIR/mcrCache
mkdir -p $MCR_CACHE_ROOT

# Run matlab as normal
matlab -nojvm -batch ...

The $TMPDIR variable specifies a directory local to the compute node where the job is running and private to your job. It is automatically set by the batch system.
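The same setting can be written a little more defensively. The sketch below is a variation, not a CSF requirement: it quotes the path, creates a uniquely named cache directory with mktemp, falls back to /tmp if TMPDIR is not set (for example when testing outside the batch system), and stops early if the directory is not writable.

```shell
#!/bin/bash
# Create a unique cache directory under the job's node-local TMPDIR.
# Assumption: fall back to /tmp when TMPDIR is unset (e.g. outside a batch job).
MCR_CACHE_ROOT="$(mktemp -d "${TMPDIR:-/tmp}/mcrCache.XXXXXX")" || exit 1
export MCR_CACHE_ROOT

# Stop early if the cache directory cannot be written to.
if [ ! -w "$MCR_CACHE_ROOT" ]; then
    echo "Cannot write to $MCR_CACHE_ROOT" >&2
    exit 1
fi

echo "MATLAB cache directory: $MCR_CACHE_ROOT"
```

Inside a batch job $TMPDIR is already private to the job, so the unique suffix mainly helps when the script is tried out elsewhere.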

Further information

  • Some functions cannot be compiled (Everything from the symbolic toolbox for example). A full list of restrictions and exclusions can be found at
  • It is often possible to speed up MATLAB code significantly using techniques such as vectorisation, mex files, parallelisation and more. If you would like advice on how to optimise your MATLAB application please get in touch.

Last modified on November 8, 2023 at 1:22 pm by George Leaver