{"id":335,"date":"2013-04-26T10:12:35","date_gmt":"2013-04-26T10:12:35","guid":{"rendered":"http:\/\/ri.itservices.manchester.ac.uk\/csf-apps\/?page_id=335"},"modified":"2017-06-21T09:52:20","modified_gmt":"2017-06-21T09:52:20","slug":"openmp","status":"publish","type":"page","link":"https:\/\/ri.itservices.manchester.ac.uk\/csf-apps\/software\/applications\/openmp\/","title":{"rendered":"OpenMP"},"content":{"rendered":"<h2>Overview<\/h2>\n<p>OpenMP (Open Multi-Processing) is a specification for shared memory parallelism.<\/p>\n<p>Programming using OpenMP is currently beyond the scope of this webpage. IT Services for Research run <a href=\"http:\/\/www.staffnet.manchester.ac.uk\/staff-learning-and-development\/academicandresearch\/practical-skills-and-knowledge\/it-skills\/research-computing\/research-courses\/\n\">training courses<\/a> in parallel programming.<\/p>\n<h2>Restrictions on use<\/h2>\n<p>You will need to use an OpenMP-compliant compiler to produce a threaded executable. Threaded executables required shared memory which, on CSF, means you can only run on a single node (see <a href=\"#batch\">examples<\/a>).<\/p>\n<h2>Set up procedure<\/h2>\n<p>Accessible via the compiler being used. 
For details on how to access compilers on the CSF please see:<\/p>\n<ul>\n<li><a href=\"\/csf-apps\/software\/applications\/compilersintel\">Intel Compiler<\/a><\/li>\n<li><a href=\"\/csf-apps\/software\/applications\/compilersgnu\">GNU Compiler<\/a><\/li>\n<li><a href=\"\/csf-apps\/software\/applications\/pgi\">PGI Compiler<\/a><\/li>\n<li><a href=\"\/csf-apps\/software\/applications\/compilersamd\">Open64 (AMD Bulldozer) Compiler<\/a><\/li>\n<\/ul>\n<h2>Compiling<\/h2>\n<p>The compilers all use different flags to turn on OpenMP compilation, as follows:<\/p>\n<ul>\n<li>Intel compiler: <code>-openmp<\/code><\/li>\n<li>GNU compiler: <code>-fopenmp<\/code><\/li>\n<li>PGI compiler: <code>-mp<\/code><\/li>\n<li>Open64 compiler: <code>-openmp<\/code><\/li>\n<\/ul>\n<p>Your code will also need to include the relevant OpenMP header and use OpenMP directives. The behaviour of your code (e.g., how many threads to run) will be determined by OpenMP environment variables, which you can query using the OpenMP runtime library functions.<\/p>\n<p>Example compilations, all of which produce an executable called &#8216;omp_hello&#8217;:<\/p>\n<ul>\n<li>Intel Fortran:\n<pre>\r\nifort omp_hello.f -o omp_hello -openmp\r\n<\/pre>\n<\/li>\n<li>gfortran:\n<pre>\r\ngfortran omp_hello.f -o omp_hello -fopenmp\r\n<\/pre>\n<\/li>\n<li>Intel C:\n<pre>\r\nicc omp_hello.c -o omp_hello -openmp\r\n<\/pre>\n<\/li>\n<li>GCC:\n<pre>\r\ngcc omp_hello.c -o omp_hello -fopenmp\r\n<\/pre>\n<\/li>\n<li>Open64:\n<pre>\r\nopencc omp_hello.c -o omp_hello -march=bdver1 -openmp\r\n<\/pre>\n<\/li>\n<\/ul>\n<p><a name=\"ompjobscript\"><\/a><\/p>\n<h2>Running the application<\/h2>\n<p>You must inform your application how many cores to use, and this <strong>must match<\/strong> the number of cores you request from the batch system. The batch system does not automatically run your program with the number of cores reserved for your job. 
You need to reserve the required number of cores in your jobscript <strong>AND<\/strong> inform your application to use that number of cores.<\/p>\n<p>The preferred method of informing your application how many cores to use is to set the <code>OMP_NUM_THREADS<\/code> environment variable to the number required. This is normally done in your jobscript. Alternatively, use the <code>omp_set_num_threads()<\/code> runtime library function in your code (but this is not the preferred method).<\/p>\n<p>If you forget to set <code>OMP_NUM_THREADS<\/code>, your application will typically use <strong>all<\/strong> cores on the node where it runs. This may be more than you have told the batch system you will be using and may slow down other users&#8217; jobs on that node. Jobs that are found to be using more cores than they have requested from the batch system will be killed by the system administrators.<\/p>\n<p><a name=\"batch\"><\/a><\/p>\n<h3>Example Parallel batch submission scripts (Intel nodes)<\/h3>\n<ul>\n<li>Setting OMP_NUM_THREADS within the submission script (<strong>the preferred method<\/strong>). 
Here we use the value of the variable <code>$NSLOTS<\/code> which is automatically set by the batch system to the number of cores you request in your batch submission script:\n<pre>\r\n#!\/bin\/bash\r\n## SGE Stuff\r\n#$ -cwd\r\n#$ -V\r\n#$ -pe smp.pe 8              # Example: request 8 cores\r\n\r\n## The variable NSLOTS is automatically set to the number specified after smp.pe above.\r\n## We use it to set OMP_NUM_THREADS so that our code will use that many cores.\r\nexport OMP_NUM_THREADS=$NSLOTS\r\n\r\n.\/omp_hello\r\n<\/pre>\n<\/li>\n<li>If you have set OMP_NUM_THREADS on the command line using\n<pre>export OMP_NUM_THREADS=8<\/pre>\n<p> (for example) before submitting your job:<\/p>\n<pre>\r\n#!\/bin\/bash\r\n## SGE Stuff\r\n#$ -cwd\r\n## Must use -V to inherit the OMP_NUM_THREADS set outside of the jobscript\r\n#$ -V\r\n## The number on the pe line (number of cores) **must** match the number you set for OMP_NUM_THREADS \r\n#$ -pe smp.pe 8\r\n\r\n.\/omp_hello\r\n<\/pre>\n<\/li>\n<\/ul>\n<p>In both cases submit the jobscript using <code>qsub <em>jobscript<\/em><\/code> where <em>jobscript<\/em> is the name of your file.<\/p>\n<h3>Example Parallel batch submission scripts (AMD Bulldozer nodes)<\/h3>\n<ul>\n<li>Setting OMP_NUM_THREADS within the submission script (<strong>the preferred method<\/strong>). 
Here we use the value of the variable <code>$NSLOTS<\/code> which is automatically set by the batch system to the number of cores you request in your batch submission script:\n<pre>\r\n#!\/bin\/bash\r\n## SGE Stuff\r\n#$ -cwd\r\n#$ -V\r\n#$ -pe smp-64bd.pe 64              # Example: request all 64 cores\r\n\r\n## The variable NSLOTS is automatically set to the number specified after smp-64bd.pe above.\r\n## We use it to set OMP_NUM_THREADS so that our code will use that many cores.\r\nexport OMP_NUM_THREADS=$NSLOTS\r\n\r\n.\/omp_hello\r\n<\/pre>\n<\/li>\n<\/ul>\n<p>Submit the jobscript using <code>qsub <em>jobscript<\/em><\/code> where <em>jobscript<\/em> is the name of your file.<\/p>\n<h3>Example Parallel batch submission scripts (AMD Magny-Cours nodes)<\/h3>\n<ul>\n<li>Setting OMP_NUM_THREADS within the submission script (<strong>the preferred method<\/strong>). Here we use the value of the variable <code>$NSLOTS<\/code> which is automatically set by the batch system to the number of cores you request in your batch submission script:\n<pre>\r\n#!\/bin\/bash\r\n## SGE Stuff\r\n#$ -cwd\r\n#$ -V\r\n#$ -pe smp-32mc.pe 32              # Example: request all 32 cores\r\n\r\n## The variable NSLOTS is automatically set to the number specified after smp-32mc.pe above.\r\n## We use it to set OMP_NUM_THREADS so that our code will use that many cores.\r\nexport OMP_NUM_THREADS=$NSLOTS\r\n\r\n.\/omp_hello\r\n<\/pre>\n<\/li>\n<\/ul>\n<p>Submit the jobscript using <code>qsub <em>jobscript<\/em><\/code> where <em>jobscript<\/em> is the name of your file.<\/p>\n<h2>Further info<\/h2>\n<ul>\n<li><a href=\"http:\/\/www.openmp.org\">The official OpenMP site<\/a><\/li>\n<li><a href=\"http:\/\/www.staffnet.manchester.ac.uk\/staff-learning-and-development\/academicandresearch\/practical-skills-and-knowledge\/it-skills\/research-computing\/research-courses\/\">Training Courses<\/a> run by IT Services for Research.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Overview 
OpenMP (Open Multi-Processing) is a specification for shared memory parallelism. Programming using OpenMP is currently beyond the scope of this webpage. IT Services for Research run training courses in parallel programming. Restrictions on use You will need to use an OpenMP-compliant compiler to produce a threaded executable. Threaded executables required shared memory which, on CSF, means you can only run on a single node (see examples). Set up procedure Accessible via the compiler being.. <a href=\"https:\/\/ri.itservices.manchester.ac.uk\/csf-apps\/software\/applications\/openmp\/\">Read more &raquo;<\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"parent":31,"menu_order":0,"comment_status":"open","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-335","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf-apps\/wp-json\/wp\/v2\/pages\/335","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf-apps\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf-apps\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf-apps\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf-apps\/wp-json\/wp\/v2\/comments?post=335"}],"version-history":[{"count":14,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf-apps\/wp-json\/wp\/v2\/pages\/335\/revisions"}],"predecessor-version":[{"id":3960,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf-apps\/wp-json\/wp\/v2\/pages\/335\/revisions\/3960"}],"up":[{"embeddable":true,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf-apps\/wp-json\/wp\/v2\/pages\/31"}],"wp:attachment":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf-apps\/wp-json\/wp\/v2\/media?parent=335"}],"curies":[{"name":"wp","href":"https:\/\/api.w.o
rg\/{rel}","templated":true}]}}