{"id":941,"date":"2018-11-22T15:28:11","date_gmt":"2018-11-22T15:28:11","guid":{"rendered":"http:\/\/ri.itservices.manchester.ac.uk\/csf3\/?page_id=941"},"modified":"2026-04-07T17:43:15","modified_gmt":"2026-04-07T16:43:15","slug":"r","status":"publish","type":"page","link":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/software\/applications\/r\/","title":{"rendered":"R &#038; bioconductor"},"content":{"rendered":"<div class=\"warning\">\nJune 2023: The proxy is <strong>no longer available<\/strong>.<\/p>\n<p>To download data from external sites, please do so from a batch job or use an <em>interactive session<\/em> on a backend node by running <code>qrsh -l short<\/code>. You DO NOT then need to load the proxy modulefiles. Please see the <a href=\"\/csf3\/batch\/qrsh\/\">qrsh notes<\/a> for more information on interactive use.<\/div>\n<h2>Overview<\/h2>\n<p><a href=\"https:\/\/www.r-project.org\/\">R<\/a> is a free software environment for statistical computing and graphics.<\/p>\n<p>See modulefile section below for list of available versions.<\/p>\n<p><a href=\"http:\/\/www.bioconductor.org\/\">Bioconductor<\/a> is available via a separate modulefile &#8211; see <a href=\"#bioc\">below<\/a>.<\/p>\n<p>You can install packages to your own home directory using the <a href=\"#addpack\">Adding Packages<\/a> instructions below (and in conjunction with information from the bioconductor website.)<\/p>\n<h2>Restrictions on use<\/h2>\n<p>There are no restrictions on access to R as it is a free piece of software released under a GNU license. All users should familiarise themselves with the licensing information available via the <a href=\"http:\/\/www.r-project.org\/\">R website<\/a>.<\/p>\n<p>All R jobs, aside from very short test jobs (e.g. those lasting less than one minute) must be submitted to the batch system.<\/p>\n<h2>Set up procedure<\/h2>\n<p>We now recommend loading modulefiles within your jobscript so that you have a full record of how the job was run. 
See the example jobscript below for how to do this. <\/p>\n<p>Load <strong>one<\/strong> of the following modulefiles:<\/p>\n<p>Standard open source R:<\/p>\n<pre>\r\nmodule load apps\/gcc\/R\/4.5.0   # CSF3_SLURM includes default gcc 11.4.1\r\nmodule load apps\/gcc\/R\/4.5.0   # CSF3_SGE includes default gcc 11.2.0\r\n\r\nmodule load apps\/gcc\/R\/4.4.2   # Includes gcc 14.2.0 (helps package installs)\r\n\r\n### Versions <em>above<\/em> here are only available on the upgraded CSF3 (Slurm)   ###\r\n### Versions <em>below<\/em> here are available on both CSF3s (SGE and Slurm)       ###\r\n\r\nmodule load apps\/gcc\/R\/4.4.1   # Includes gcc 13.3.0 (helps package installs)\r\n                               # For BioConductor see below notes\r\n\r\nmodule load apps\/gcc\/R\/4.4.0   # Includes gcc 12.2.0 (helps package installs)\r\n                               # For BioConductor see below notes\r\n\r\nmodule load apps\/gcc\/R\/4.3.1   # Includes BioConductor 3.16, gcc 9.3 (helps package installs)  \r\nmodule load apps\/gcc\/R\/4.2.2-gcc14.2   # Includes gcc 14.2 (helps package installs)  \r\nmodule load apps\/gcc\/R\/4.1.2   # Includes BioConductor 3.14, gcc 9.3 (helps package installs)\r\nmodule load apps\/gcc\/R\/4.1.0   # Includes BioConductor 3.13, gcc 8.2 (helps package installs)\r\nmodule load apps\/gcc\/R\/4.0.2   # Includes BioConductor 3.11, gcc 8.2 (helps package installs)\r\nmodule load apps\/gcc\/R\/3.6.2   # Includes BioConductor 3.10, gcc 8.2 (helps package installs)\r\nmodule load apps\/gcc\/R\/3.6.1   # Includes BioConductor 3.9, gcc 8.2 (helps package installs)\r\nmodule load apps\/gcc\/R\/3.6.0   # Includes BioConductor 3.9, gcc 8.2\r\n\r\nmodule load apps\/R\/3.5.2    # Does not include BioConductor - see modulefile below\r\nmodule load apps\/R\/3.4.2    # Does not include BioConductor - see modulefile below\r\n<\/pre>\n<h3 id=\"bioc\">BioConductor Installation<\/h3>\n<p>The central installations of R versions 4.4.0, 4.4.1 and later, do not include 
BioConductor.<\/p>\n<p>Users can install BioConductor themselves by running the following commands in an <a href=\"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/batch\/qrsh\/\" target=\"_blank\" rel=\"noopener\">interactive session<\/a>: <\/p>\n<pre>\r\n# On the CSF login node, start an interactive session\r\nsrun -p interactive -t 0-1 --pty bash\r\n\r\n# You'll now be on a compute node. You can run your commands directly, and R\r\n# will be able to download packages from the outside world:\r\nmodule load apps\/gcc\/R\/4.4.1      # Choose your required version\r\nR\r\ninstall.packages(\"BiocManager\")\r\nq();\r\n\r\n# To return to the login node, exit from the interactive (srun) session\r\nexit\r\n<\/pre>\n<p>Batch jobs can now use your local installation of BioConductor. For more information on using BioConductor, please see the <a href=\"https:\/\/www.bioconductor.org\/install\/\">BioConductor installation documentation<\/a>.<\/p>\n<p>Older versions of R require a separate BioConductor modulefile. To use BioConductor, load:<\/p>\n<pre># This is NOT needed for R 3.6 and newer!\r\nmodule load libs\/bioconductor\/3.4\r\n<\/pre>\n<p>This will load the R modulefile if not already loaded.<\/p>\n<p><a name=\"running\"><\/a><\/p>\n<h2>Running the application<\/h2>\n<p>Note that using <code>R CMD BATCH<\/code>, as below, may save and restore your workspace, which may not be what you want. Using <code>Rscript<\/code> instead avoids that.<\/p>\n<h3>Serial Batch job<\/h3>\n<p>Write a submission script, for example:<\/p>\n<pre>\r\n#!\/bin\/bash --login\r\n#SBATCH -p serial        # (or --partition=) Run on the nodes dedicated to 1-core jobs\r\n#SBATCH -t 2-0           # Wallclock time limit (2-0 is 2 days, max permitted is 7-0)\r\n\r\n## We now recommend loading the modulefile in the jobscript. 
Change the version as needed.\r\nmodule purge\r\nmodule load apps\/R\/3.4.2\r\n\r\nR CMD BATCH --no-restore <em>my_test.R<\/em>  <em>my_test.R<\/em>.$SLURM_JOB_ID\r\n   #                #                   #\r\n   #                #                   # The final argument, \"<em>my_test.R<\/em>.$SLURM_JOB_ID\", tells R to send\r\n   #                #                   #  output to a file with this name unique to the current job.\r\n   #                #\r\n   #                # Do not restore any previously saved objects. Ensures you don't load in possibly\r\n   #                # large objects from previous runs of R. If jobs are failing due to lack of memory\r\n   #                # please add this flag or alternatively use --vanilla which applies the following:\r\n   #                # --no-save, --no-restore, --no-site-file, --no-init-file and --no-environ\r\n   #\r\n   # R must be called with both the \"CMD\" and \"BATCH\" options which tell it\r\n   # to run an R <em>program<\/em>, in this case <em>my_test.R<\/em>, instead of presenting\r\n   # an interactive prompt\r\n<\/pre>\n<p>Submit the job using<\/p>\n<pre>sbatch <em>runmyRjob.slurm<\/em>\r\n<\/pre>\n<p>where <code><em>runmyRjob.slurm<\/em><\/code> is the name of your job script.<\/p>\n<p>By default, graphical output from batch jobs is sent to a file called <code>Rplots.pdf<\/code>. See <a href=\"#plotting\">below<\/a> for more info on plotting in to an image file.<\/p>\n<h3>Parallel Batch Job (single node, multi-core)<\/h3>\n<p>Please note that your R code must be parallelised (usually with the &#8216;parallel&#8217; library) before you submit to more than 1 core. 
Asking for more than 1 core does not mean your code will automatically use them.<\/p>\n<pre>\r\n#!\/bin\/bash --login\r\n#SBATCH -p multicore     # (or --partition=) Run on the nodes dedicated to multi-core jobs\r\n#SBATCH -n 8             # (or --ntasks=) Number of cores\r\n#SBATCH -t 2-0           # Wallclock time limit (2-0 is 2 days, max permitted is 7-0)\r\n\r\nmodule purge\r\nmodule load apps\/R\/3.4.2\r\n\r\nR CMD BATCH --no-restore my_test.R my_test.R.$SLURM_JOB_ID\r\n               #\r\n               # See the serial jobscript example above for a description\r\n               # of the command-line flags.\r\n<\/pre>\n<ul>\n<li>Then submit your job to the batch system<\/li>\n<\/ul>\n<pre>sbatch <em>runmyRjob.slurm<\/em>\r\n<\/pre>\n<p>where <code><em>runmyRjob.slurm<\/em><\/code> is the name of your job script.<\/p>\n<p>The various libraries for performing parallel computation in R each have their own way of setting the number of cores to use within R. This will sometimes default to the total number of cores on the node. You need to make sure that your code is using no more than the number of cores you&#8217;ve requested in your job script, otherwise your job is liable to be killed without warning.<\/p>\n<p>You can return the number of cores you requested in your jobscript as a variable, using the code:<\/p>\n<pre># Sys.getenv() returns a character string, so convert it to an integer\r\nnumCoresAllowed &lt;- as.integer(Sys.getenv(\"SLURM_NTASKS\", unset=\"1\"))\r\n<\/pre>\n<p>(If you&#8217;re running the job interactively or on your local machine, the value specified in &#8220;unset&#8221; will be returned.)<\/p>\n<p>You should use this value when you set the number of cores. For example, if you&#8217;re using the &#8220;doMC&#8221; package, you&#8217;d use:<\/p>\n<pre>registerDoMC(cores = numCoresAllowed)\r\n<\/pre>\n<p>Some libraries, e.g. the &#8220;parallel&#8221; library, will take the number of cores to use from an environment variable (e.g. <code>MC_CORES<\/code>) directly. 
You can set the environment variable in your job script:<\/p>\n<pre>export MC_CORES=$SLURM_NTASKS\r\n<\/pre>\n<p>Add this to your jobscript before the <code>R CMD BATCH ...<\/code> line.<\/p>\n<h2>Running R interactively<\/h2>\n<p>It is expected that most use of R on the CSF will be in batch mode, i.e., computational jobs will be submitted to the batch system and there will be no subsequent user interaction. However, if required, R can be run on the CSF using either the R command line, or GUI.<\/p>\n<p><em>Do not simply login to the CSF and start R<\/em> \u2014 your jobs will be killed by the system administrator! The only exception to this is when installing a package from a mirror in R (<a href=\"#addpack\">see below<\/a>).<\/p>\n<p>To run R jobs interactively on the CSF, make use of the <a href=\"\/csf3\/batch-slurm\/srun\/\">srun<\/a> facility, which literally queues interactive jobs. To start the R command line type<\/p>\n<pre>\r\n# Load the modulefile on the login node (use your required version)\r\n# If you loaded other modulefiles to do a package installation (e.g., nlopt)\r\n# you should also load them here.\r\nmodule purge\r\nmodule load apps\/gcc\/R\/4.4.2\r\n\r\n# Start R in an interactive session on a compute node (text-mode only)\r\nsrun -p interactive -t 0-1 --pty R --vanilla\r\n                                      # \r\n                                      # \r\n                                      #\r\n                                      # Can be: \"--save\", \"--no-save\" or \"--vanilla\"...\r\n\r\n<\/pre>\n<p><a name=\"addpack\"><\/a><\/p>\n<h2>Adding packages<\/h2>\n<p>You may wish to use a particular package (library) in your code. The central installations of R <em>may<\/em> already have that package installed. 
If not, you can install it yourself (it will go into a folder in your home directory).<\/p>\n<p>In this section we provide details of the standard method of installing packages in R, using the <code>install.packages()<\/code> command. This will install packages within your <em>home directory<\/em> area, but does not allow you to install packages on a project-by-project basis. <\/p>\n<div class=\"hint\">If you prefer a more project-oriented installation method (similar to Python&#8217;s virtual environments), where the packages you install for one project can be kept separate from those of another, please see the <a href=\"#renv\">Renv method described further down<\/a>. <strong>We recommend this method<\/strong>.<\/div>\n<h3>Check if a package is already installed<\/h3>\n<p>To determine if a package is already installed, simply try loading it in R. For example, on the login node:<\/p>\n<pre>R\r\n&gt; library(<em>thing<\/em>)\r\nError in library(<em>thing<\/em>) : there is no package called \u2018<em>thing<\/em>\u2019\r\n\r\n# (if you get no output it usually means the library is already installed!)\r\n<\/pre>\n<p>This tells us we need to install a package\/library named &#8216;thing&#8217;. See below for how to do that. Installing BioConductor packages is also possible and is covered below.<\/p>\n<p>Note: For the purposes of adding packages you can run R on the login node. But this is the <strong>only<\/strong> time you should run R on the login node. 
All data processing, development and testing must be run in batch jobs or in an interactive session on a compute node (see above for how to run R).<\/p>\n<h3>Install a package by Automatically Downloading from CRAN (the default repo)<\/h3>\n<p>To add packages to your personal R package directory (<code>~\/R\/<em>platform<\/em>\/<em>version<\/em><\/code>), downloading from CRAN:<\/p>\n<p>Note: You can do R package installations on the login node.<\/p>\n<pre>\r\nmodule purge\r\nmodule load apps\/gcc\/R\/4.4.2\r\n  #\r\n  # Note: you may need to load other modulefiles to complete a package installation.\r\n  # If your install fails, look at the errors. You can exit from R, load some more\r\n  # modulefiles, then run R again and try the install. Common packages are nlopt and\r\n  # cmake - see sections below for more details.\r\n\r\n# Note: you may have old proxy settings in an ~\/.Renviron file. You'll need to remove these:\r\ncat ~\/.Renviron\r\n  #\r\n  # <strong>If you see the following, you do not need to do anything!<\/strong>\r\n  cat: .Renviron: No such file or directory\r\n\r\n  # <strong>If you see some lines containing<\/strong>\r\n  http_proxy=http:\/\/proxy.man.ac.uk:3128\r\n  https_proxy=https:\/\/proxy.man.ac.uk:3128\r\n    #\r\n    # Delete these lines or place a # at the start of each line.\r\n\r\n  # <strong>If your ~\/.Renviron file contains <em>only<\/em> the above proxy lines<\/strong>\r\n  # <strong>you can delete the file<\/strong>\r\n  rm ~\/.Renviron\r\n<\/pre>\n<p>Now start R in the usual way: <\/p>\n<pre>\r\nR\r\n<\/pre>\n<p>Now ask R to install the required package and answer <code>y<\/code> when asked if you wish to create a personal library:<\/p>\n<pre>\r\n&gt; <strong>install.packages(\"thing\")<\/strong>\r\nWarning in install.packages(\"thing\") :\r\n'lib = \"\/opt\/apps\/apps\/gcc\/R\/4.4.2\/lib64\/R\/library\"' is not writeable\r\nWould you like to use a personal library instead?  
(y\/n) <strong>y<\/strong>      # Answer '<strong>y<\/strong>'\r\nWould you like to create a personal library                     # (if first ever package!)\r\n~\/R\/x86_64-pc-linux-gnu-library\/4.4\r\nto install packages into?  (y\/n) <strong>y<\/strong>                              # Answer '<strong>y<\/strong>'\r\n<\/pre>\n<p>Select a UK mirror when prompted (e.g., UK Bristol, which is near the bottom of the list).<\/p>\n<p>Once the package is installed, check that it has installed correctly by loading the library:<\/p>\n<pre>\r\nlibrary(<em>thing<\/em>)\r\n  #\r\n  # No output (or some library-specific info) means it is installed correctly.\r\n<\/pre>\n<p>You can now install more libraries by repeating the above steps, or exit R (and then exit from your interactive session, if you started one):<\/p>\n<pre>\r\nq()\r\n<\/pre>\n<p>Please remember that your usual R usage, to run scripts and process data, <em>must<\/em> be done in batch or via srun (interactively) on a compute node (<a href=\"#running\">see above<\/a>). 
<strong>Do not<\/strong> continue to run computational work on the login node!<\/p>\n<p>In the above instructions replace the <code>module load<\/code> command with the one appropriate to the R version you wish to use.<\/p>\n<p>If you wish to specify a mirror in the <code>install.packages<\/code> command instead of selecting it from a menu, try:<\/p>\n<pre>\r\ninstall.packages('<em>thing<\/em>', repos='http:\/\/www.stats.bris.ac.uk\/R')\r\n<\/pre>\n<h3>Installing a Library from a source package<\/h3>\n<p>If you&#8217;ve downloaded an R library source file you can add it to your personal R library using the following commands (which assume the source package is in your home directory on the CSF):<\/p>\n<p>Run R with extra command-line args (choose the version of R you require):<\/p>\n<pre>module load apps\/gcc\/R\/4.4.2\r\nR CMD INSTALL <em>thing.x.y.z.tar.gz<\/em>\r\n\r\n* installing to library \u2018\/mnt\/iusers01\/support\/<em>mabcxyz1<\/em>\/R\/x86_64-unknown-linux-gnu-library\/4.4\u2019\r\n* installing *source* package \u2018thing\u2019 ...\r\n** package \u2018thing\u2019 successfully unpacked and MD5 sums checked\r\n** R\r\n** data\r\n** demo\r\n** preparing package for lazy loading\r\n** help\r\n*** installing help indices\r\n** building package indices\r\n** testing if installed package can be loaded\r\n* DONE (thing)\r\n<\/pre>\n<h3>Using an installed package<\/h3>\n<p>Now run R and test that the library can be loaded:<\/p>\n<pre>\r\nR\r\n&gt; library(thing)\r\n<\/pre>\n<p>The compiled library files will be saved in a directory named <code>R<\/code> in your home directory. It contains a subdirectory for each version of R, so if you want to use the library in different versions of R you will have to repeat the above commands for each version.<\/p>\n<h3>nloptr dependency<\/h3>\n<p>Some packages fail to install because they depend on the <code>nloptr<\/code> R package. 
Trying to install that specific package often fails due to a dependency on the <code>nlopt<\/code> library, which R fails to compile. We have therefore provided this library as a separate modulefile. For example:<\/p>\n<pre>\r\n# <strong>This will fail due to a failure to compile nloptr<\/strong>\r\nmodule load apps\/gcc\/R\/4.3.2\r\nR\r\ninstall.packages(\"nloptr\")           # Other R packages that depend on this one will also fail\r\nq()\r\n\r\n# <strong>The solution is to load an extra modulefile:<\/strong>\r\nmodule load apps\/gcc\/R\/4.3.2\r\nmodule load libs\/gcc\/nlopt\/2.6.2\r\nR\r\ninstall.packages(\"nloptr\")\r\n<\/pre>\n<p><strong>Note:<\/strong> you will also need to load the nlopt modulefile in your jobscript when submitting jobs to the batch system.<\/p>\n<h3>cmake dependency<\/h3>\n<p>If your package requires <code>cmake<\/code> to complete its installation, load the cmake modulefile before running R so that R can find it:<\/p>\n<pre>\r\nmodule load apps\/gcc\/R\/4.3.2\r\nmodule load tools\/gcc\/cmake\/3.25.1      # Other versions of cmake are available\r\nR\r\ninstall.packages(\"mice\")\r\n<\/pre>\n<p><strong>Note:<\/strong> you will likely <strong>NOT<\/strong> need to load the cmake modulefile in your jobscript when submitting jobs to the batch system. cmake is usually only used during the installation, not when you run R.<\/p>\n<p>Please see the <a href=\"..\/cmake\">cmake page<\/a> for available versions.<\/p>\n<h3>Adding BioConductor Packages &#8211; R 3.6.0 and newer<\/h3>\n<p>Note: This is NOT the method used for older versions of R (3.5 and older). See below for that.<\/p>\n<p>The &#8216;manager&#8217; for BioConductor changed in R version 3.6.0. 
Details are given here on how to install BioConductor packages in R 3.6.0 (and up).<\/p>\n<pre># Check the BioConductor version in use\r\nBiocManager::version()\r\n\r\n# List the packages available to install\r\nBiocManager::available()\r\n\r\n# Install a package to your home directory\r\nBiocManager::install(c(\"esATAC\"))\r\n     ## In this case esATAC (replace that with the package you are interested in)\r\n     ## You will be prompted to install to a local (your home) directory as below\r\n\r\nBioconductor version 3.9 (BiocManager 1.30.4), R 3.6.0 (2019-04-26)\r\nInstalling package(s) 'esATAC'\r\nWarning in install.packages(pkgs = doing, lib = lib, repos = repos, ...) :\r\n  'lib = \"\/opt\/apps\/apps\/gcc\/R\/3.6.0\/lib64\/R\/library\"' is not writable\r\nWould you like to use a personal library instead? (yes\/No\/cancel) <strong>y<\/strong>   # Answer 'y'\r\nWould you like to create a personal library\r\n\u2018~\/R\/x86_64-pc-linux-gnu-library\/3.6\u2019\r\nto install packages into? (yes\/No\/cancel) <strong>y<\/strong>                           # Answer 'y'\r\n<\/pre>\n<h3>Adding BioConductor Packages &#8211; R 3.5.2 and R 3.4.2<\/h3>\n<p>Note: This is NOT the method used for newer versions of R (3.6 and newer). See above for that.<\/p>\n<p>BioConductor packages can be installed into your local R library (in your home directory) as follows:<\/p>\n<pre># This will automatically load the R modulefile as well\r\nmodule load libs\/bioconductor\/3.4\r\nR\r\nsource(\"https:\/\/bioconductor.org\/biocLite.R\")\r\nbiocLite(\"<em>packagename<\/em>\")             # Give a BioConductor package name, e.g. \"S4Vectors\"\r\n\r\n# You will see some output then be asked to install locally:\r\n\r\n  'lib = \"\/opt\/gridware\/apps\/R\/3.4.2\/lib64\/R\/library\"' is not writable\r\nWould you like to use a personal library instead?  (y\/n) <strong>y<\/strong>      # Answer 'y'\r\nWould you like to create a personal library\r\n~\/R\/x86_64-pc-linux-gnu-library\/3.4\r\nto install packages into?  
(y\/n) <strong>y<\/strong>                              # Answer 'y'\r\n<\/pre>\n<p>The package will be downloaded and installed into your local R library.<\/p>\n<h3>Using BioConductor Packages<\/h3>\n<p>Once installed, BioConductor packages have to be loaded like any other package. For example, assuming you have installed a BioConductor package named <code>bioThing<\/code>, load it in your code with:<\/p>\n<pre># Load\/use a BioConductor package named 'bioThing' previously installed\r\nlibrary(bioThing)\r\n<\/pre>\n<h3>Adding rjags and related packages<\/h3>\n<p><a href=\"https:\/\/cran.r-project.org\/web\/packages\/rjags\/index.html\">rjags<\/a> is a popular package for working with Bayesian graphical models using MCMC. It is also used by other packages such as <a href=\"https:\/\/cran.r-project.org\/web\/packages\/JMbayes\/index.html\">JMbayes<\/a>. The <code>rjags<\/code> package relies on a library named <code>JAGS<\/code>. This is already installed on the CSF so you can make it available to R by loading its modulefile. This will allow you to then install <code>rjags<\/code> and related packages such as <code>JMbayes<\/code>. If you are using R 3.6.2 or later you must load the JAGS modulefile that is compatible with the GCC 8.2.0 compiler (which was used to install R 3.6.2). 
Here is a complete example of installing <code>JMbayes<\/code>, which will install <code>rjags<\/code> in your local R directory in your home directory:<\/p>\n<pre>\r\n# Packages can be installed while you are on the login node\r\nmodule purge\r\nmodule load apps\/gcc\/R\/3.6.2                # Uses GCC 8.2.0\r\nmodule load apps\/gcc\/jags\/4.3.0-gcc-8.2.0   # Use the gcc-8.2.0 compatible version\r\nR\r\ninstall.packages(\"JMbayes\")\r\nlibrary(JMbayes)\r\n<\/pre>\n<p>You will be asked to select a <em>mirror<\/em> site from which to download the JMbayes packages (we typically use the Bristol UK mirror).<\/p>\n<p>Once the package has been installed, the <code>library(JMbayes)<\/code> command should be used each time you wish to use the package. You will also need to load the <code>jags<\/code> modulefile, as well as the R modulefile, in your jobscripts.<\/p>\n<h3>Adding RStan <\/h3>\n<p><a href=\"https:\/\/mc-stan.org\/rstan\/\">RStan<\/a> is the R interface to <a href=\"https:\/\/mc-stan.org\/\">Stan<\/a>, a popular software package for Bayesian data analysis. To install <code>RStan<\/code> you need to load a couple of additional libraries as modules. You also have to load these modules every time you run <code>RStan<\/code>, e.g. in a batch job.<br \/>\nHere is an example of installing <code>RStan<\/code> in your local R directory in your home directory:<\/p>\n<pre>\r\n# Request an interactive job on a compute node for 1hr.\r\nsrun -p interactive -t 0-1 --pty bash\r\n\r\n\r\n# When srun finds a node for you, load the required modules\r\n# <strong>You need to load these modules every time you use RStan (e.g. 
in batch jobs)<\/strong>\r\nmodule purge\r\nmodule load apps\/gcc\/R\/4.5.0\r\nmodule load libs\/gcc\/flexiblas\/3.4.5\r\nmodule load libs\/gcc\/glpk\/5.0\r\n\r\n# Now run R at the CSF command-prompt:\r\nR\r\n# Type below commands inside the R shell\r\nSys.setenv(DOWNLOAD_STATIC_LIBV8 = 1)\r\ninstall.packages(\"rstan\", repos = c(\"https:\/\/mc-stan.org\/r-packages\/\", getOption(\"repos\")), dependencies = TRUE)\r\n  #\r\n  # choose a mirror that is close to you, e.g. 64: Bristol\r\n  # answer 'yes' if asked: Would you like to use a personal library instead?\r\n  # Then 'yes' again to confirm the personal library path.\r\n  # It should take around 30 mins to download and compile the packages\r\n  # When done quit and restart R\r\nq()\r\nn\r\n### Now run R again at the CSF command-prompt:\r\nR\r\n# check that rstan is installed\r\nlibrary(\"rstan\")\r\n# Run the example in the rstan documentation to test (optional)\r\n&gt; example(stan_model, package = \"rstan\", run.dontrun = TRUE)\r\n<\/pre>\n<h3>Listing Packages<\/h3>\n<p>To list the installed packages run:<\/p>\n<pre>installed.packages();\r\n<\/pre>\n<p>To list loaded packages run:<\/p>\n<pre>(.packages())\r\n<\/pre>\n<h2>Removing Packages<\/h2>\n<p>Should you need to delete an installed package:<\/p>\n<pre>\r\nmodule load apps\/gcc\/R\/<em>version<\/em>\r\nR\r\nremove.packages('<em>thing<\/em>')\r\n<\/pre>\n<p>This will remove it from your local library of packages, for the version of R you are currently using. If you&#8217;ve used several versions of R over time and have installed the package with each one, you would need to load the modulefile for each version and remove the package from each one in turn.<\/p>\n<p><a name=\"R_Project_Environments_renv\"><\/a><\/p>\n<h2 id=\"renv\">R Project Environments<\/h2>\n<p>We advise using <a href=\"https:\/\/rstudio.github.io\/renv\/\" target=\"_blank\" rel=\"noopener\">renv<\/a> to install R packages that you need. 
The <strong>renv<\/strong> package helps you create conflict-free reproducible environments for your R projects. Using renv you can maintain separate project folders, each with its own set of R packages, which will not conflict with the R packages in other renv project folders or R modules. This is somewhat similar to using conda virtual environments. The benefits of using renv include:<\/p>\n<ol>\n<li>\n<strong>Isolation:<\/strong> Installing a new or updated package for one pipeline will not break your other projects\/pipelines, and vice versa. That\u2019s because renv gives each project its own private library. You can have a separate, isolated project directory for each piece of work or pipeline, with its own set of packages.\n<\/li>\n<p><\/p>\n<li>\n<strong>Portability:<\/strong> You can easily transport your projects from one cluster\/computer to another, even across different platforms. renv makes it easy to install the packages your project depends on in the new environment.\n<\/li>\n<p><\/p>\n<li>\n<strong>Reproducibility:<\/strong> renv records the exact package versions you have installed in a project. This helps ensure those exact versions are installed wherever you move your work to.\n<\/li>\n<p>\n<\/ol>\n<p>The main steps involved are:<\/p>\n<ol>\n<li>\nCreate a separate directory for each of your projects\/pipelines and move (cd) to that directory. <\/p>\n<p>*If you are accessing multiple servers\/clusters from the same home directory and you are using different R modules for them, it is important that you use renv for your projects\/pipelines to avoid conflicts.\n<\/li>\n<p><\/p>\n<li>\nLoad the R module you require. <\/p>\n<p>*It is advisable to use the latest R module available for your system as packages of older versions are not always maintained in the CRAN repositories.\n<\/li>\n<p><\/p>\n<li>\nStart R.\n<\/li>\n<p><\/p>\n<li>\nInstall the &#8216;renv&#8217; package.\n<\/li>\n<p><\/p>\n<li>\nInitialize renv. 
<\/p>\n<p>This will set up a project library. The following files and folder, which record the packages and the metadata needed to reinstall them, are created in that directory at initialization: <strong>renv.lock, .Rprofile<\/strong> &#038; <strong>renv<\/strong>.<br \/>\nThese files and folders should not be altered. You can see them when you exit R and run the command <code><strong>ls -al<\/strong><\/code> from that directory later.\n<\/li>\n<p><\/p>\n<li>\nQuit\/Exit R. <\/p>\n<p>*This is IMPORTANT: you need to quit\/exit R after you have initialized renv for the first time after installation.\n<\/li>\n<p><\/p>\n<li>\nStart R again from the same directory.\n<\/li>\n<p><\/p>\n<li>\nInstall required R package(s).\n<\/li>\n<p><\/p>\n<li>\nCreate a snapshot of the installation.\n<\/li>\n<p><\/p>\n<li>\nCreate a text file inside the project folder recording, for your future reference, information such as the platform\/server on which the project was created and the R and other modules that were used.\n<\/li>\n<p><\/p>\n<li>\nIf needed, repeat the same steps in a separate directory for a different project\/pipeline requiring a different set of packages.\n<\/li>\n<p><\/p>\n<li>\nTo run a job using a package installed within a project created like this, please see the <a href=\"#sample_jobscripts_renv\">sample jobscripts below<\/a>.\n<\/li>\n<\/ol>\n<p>Here are the commands needed to perform the steps described above.<br \/>\nIn this example we will install only the &#8216;BiocManager&#8217; package in this renv R Project folder, but you can install as many packages as you need in a project.<\/p>\n<pre>\r\nmkdir ~\/MyFirstRProject\r\ncd ~\/MyFirstRProject\r\nmodule load apps\/gcc\/R\/4.4.2\r\nR\r\ninstall.packages(\"renv\")\r\n# Select the preferred CRAN mirror from the presented list\r\nrenv::init()\r\nq()\r\nn\r\nls -al\r\nR\r\ninstall.packages(\"BiocManager\")\r\n# Run commands to install additional R packages if needed\r\nY\r\nrenv::snapshot()\r\nq()\r\nn\r\n\r\ncat 
README.txt\r\n------------------------------------------------------------\r\n| Project Platform: CSF3_EL9                               |\r\n| Project directory location: ~\/MyFirstRProject            |\r\n| Modules used: apps\/gcc\/R\/4.4.2                           |\r\n| Packages Installed: BiocManager                          |\r\n------------------------------------------------------------\r\n<\/pre>\n<p><a name=\"sample_jobscripts_renv\"><\/a><\/p>\n<h3>Sample Jobscripts to run jobs using packages installed within a specific project created using &#8216;renv&#8217;<\/h3>\n<pre class=\"slurm\">\r\n#!\/bin\/bash --login\r\n#SBATCH -p serial   # Partition name is required (serial will default to 1 core)\r\n#SBATCH -t 4-0      # Job \"wallclock\" limit is required. Max permitted is 7 days (7-0)\r\n\r\nmodule load apps\/gcc\/R\/4.4.2\r\ncd ~\/MyFirstRProject      # You need to cd to the renv project folder first.\r\nR CMD BATCH --no-restore input.R\r\n<\/pre>\n<p><strong>For more information on &#8216;renv&#8217; please visit this <a href=\"https:\/\/rstudio.github.io\/renv\/articles\/renv.html\" target=\"_blank\" rel=\"noopener\">link<\/a>.<\/strong><\/p>\n<p><a name=\"plotting\"><\/a><\/p>\n<h2>Plotting<\/h2>\n<p>If you wish to plot graphs, for example, to image files, you will need to use the <code>cairo<\/code> plotting device.<\/p>\n<p>The following example generates a histogram and plots it to a <code>.png<\/code> file and a <code>.jpg<\/code> file. 
The jobscript is:<\/p>\n<pre>\r\n#!\/bin\/bash --login\r\n#$ -cwd\r\n#$ -l short\r\nmodule load apps\/gcc\/R\/4.4.0\r\nR CMD BATCH --no-restore plot.R\r\n<\/pre>\n<p>The R-code is:<\/p>\n<pre>\r\n# R script to demonstrate plotting to image files on the CSF\r\n\r\n# Enable cairo device <strong>(needed to prevent 'X11 not available' errors)<\/strong>\r\noptions(bitmapType='cairo')\r\n\r\n# Initialize some data to plot\r\nx = rnorm(100)\r\n\r\n# Save a png plot\r\npng(file=\"hist.png\")\r\nhist(x)\r\nrug(x,side=1)\r\ndev.off()\r\n\r\n# How about jpg\r\njpeg(file=\"hist.jpg\")\r\nhist(x)\r\nrug(x,side=1)\r\ndev.off()\r\n\r\n# R 3.6.1 (and later) can also do tiff\r\ntiff(file=\"hist.tif\")\r\nhist(x)\r\nrug(x,side=1)\r\ndev.off()\r\n<\/pre>\n<p>Now to view your images while on the CSF, use the <em>eye of gnome<\/em> (<code>eog<\/code>) image viewer:<\/p>\n<pre># List the image files created by the above example\r\nls hist.*\r\nhist.jpg  hist.png  hist.tif\r\n\r\n# Use the image viewer name 'eog' (Eye of Gnome) on the CSF login node\r\neog hist.png\r\n<\/pre>\n<p>If you need other image file formats you can then convert your PNG file using the <code>convert<\/code> command-line tool, available on the login node or can be run in your jobscript (note that <code>convert<\/code> is a Linux command-line program, not an R function):<\/p>\n<pre>\r\n# Using the hist.png example file from the above R script, convert it to another format:\r\nconvert hist.png hist.tif      # R 3.6.1 can write tif files directly (see above) but older versions can't\r\n\r\n# How about a .pdf\r\nconvert hist.png hist.pdf\r\n\r\n# Now view a .pdf on the login node\r\nevince hist.pdf\r\n<\/pre>\n<h2>Installing a package from source<\/h2>\n<p>The following example shows how to install an R package from source. This allows us to modify the source to resolve a C++ problem with the <code>std::isnan<\/code> method. 
Here we install the <code>igraph<\/code> package.<\/p>\n<pre>\r\nmodule purge\r\nmodule load apps\/gcc\/R\/4.4.2   # Use whichever version you require\r\nmkdir -p ~\/software\r\ncd ~\/software\r\n# Download igraph v2.1.4\r\nwget https:\/\/www.stats.bris.ac.uk\/R\/src\/contrib\/igraph_2.1.4.tar.gz\r\ntar xzf igraph_2.1.4.tar.gz\r\n# Modify the source to comment out the \"using std::isnan\" declaration.\r\ngrep -lr '^using std::isnan' igraph | xargs sed -i 's@^using std::isnan@\/\/\\0@'\r\n# Install the package from source\r\nR -e \"install.packages('$PWD\/igraph\/', repos=NULL, type='source')\"\r\n<\/pre>\n<p>You can then use the library in the usual manner:<\/p>\n<pre>\r\nR\r\nlibrary(igraph)\r\n<\/pre>\n<h2>Error in socketAccept<\/h2>\n<p>If you get an error message like the following while running an R package that you have installed:<\/p>\n<pre> \r\nError in socketAccept(socket = socket, blocking = TRUE, open = \"a+b\",  :\r\n  all 128 connections are in use\r\n....\r\nExecution halted\r\n<\/pre>\n<p>this may indicate that code in the installed package is trying to run across all CPU cores on the node instead of the number of CPU cores that you requested for the job. 
In such a case, refer to the package's documentation to find out how to set the number of CPU cores it uses, then submit your job requesting that many CPU cores.<\/p>\n<h2>Further Info<\/h2>\n<ul>\n<li><a href=\"https:\/\/www.r-project.org\/\">R website<\/a><\/li>\n<li><a href=\"http:\/\/www.bioconductor.org\/\">Bioconductor website<\/a><\/li>\n<li>There is a <a href=\"https:\/\/github.com\/RUMgroup\/Home\">University R user group<\/a> and an external <a href=\"http:\/\/www.rmanchester.org\/\">Manchester R group<\/a>.<\/li>\n<li><a href=\"https:\/\/r-openresearch-reproducibility.netlify.app\/\">R, Open Research, and Reproducibility<\/a> course materials by Andrew Stewart.<\/li>\n<\/ul>\n<h2>Updates<\/h2>\n<p>3.6.1 was installed October 2019.<br \/>\n3.6.0 was installed June 2019.<br \/>\n3.5.2 was installed Feb 2019.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>June 2023: The proxy is no longer available. To download data from external sites, please do so from a batch job or use an interactive session on a backend node by running qrsh -l short. You DO NOT then need to load the proxy modulefiles. Please see the qrsh notes for more information on interactive use. Overview R is a free software environment for statistical computing and graphics. See modulefile section below for list of.. 
<a href=\"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/software\/applications\/r\/\">Read more &raquo;<\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"parent":86,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-941","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/pages\/941","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/comments?post=941"}],"version-history":[{"count":24,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/pages\/941\/revisions"}],"predecessor-version":[{"id":12241,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/pages\/941\/revisions\/12241"}],"up":[{"embeddable":true,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/pages\/86"}],"wp:attachment":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf3\/wp-json\/wp\/v2\/media?parent=941"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}