{"id":1213,"date":"2024-06-07T15:49:56","date_gmt":"2024-06-07T14:49:56","guid":{"rendered":"https:\/\/ri.itservices.manchester.ac.uk\/csf4\/?page_id=1213"},"modified":"2024-06-20T12:36:51","modified_gmt":"2024-06-20T11:36:51","slug":"may-2024-upgrade","status":"publish","type":"page","link":"https:\/\/ri.itservices.manchester.ac.uk\/csf4\/may-2024-upgrade\/","title":{"rendered":"Known Issues and Workarounds &#8211; May 2024 upgrade"},"content":{"rendered":"<p>As part of the May 2024 CSF4 upgrade the Operating System was upgraded from CentOS 7.9 to Red Hat Enterprise Linux 9.3. As a result, there&#8217;s a possibility certain pieces of software will require re-installing, or perhaps additional modulefiles will need to be loaded in your jobscript.<\/p>\n<p>We list below known problems and work-arounds discovered so far.<\/p>\n<p>Your own testing of your jobs will help us. If you discover any such things in your own jobs, please do let us know!<\/p>\n<h2>How to report an issue<\/h2>\n<p>If you would like to report a problem with a piece of software following the May 2024 upgrade, please do <strong>ONE<\/strong> of the following (no need to do both).<\/p>\n<ol>\n<li>Fill out the following <a href=\"https:\/\/manchester.saasiteu.com\/Modules\/SelfService\/#serviceCatalog\/request\/366AD8FC5ADE46A881E87E238F84EF52\">request form<\/a> and do include the following.\n<ul>\n<li>Request Type = <strong>Request access to \/install software on HPC\/HTC<\/strong><\/li>\n<li>System in Use = <strong>CSF4<\/strong><\/li>\n<li>Software (name &amp; version) = <strong>Name of Software inc. 
Version<\/strong><\/li>\n<li>Additional Information = In the body of the text, please outline the error and be sure to include any error messages, the location of job scripts, relevant locations\/directories, etc.<\/li>\n<\/ul>\n<p><strong>OR<\/strong><\/li>\n<li>Send an email to <a href=\"&#109;&#x61;&#105;&#x6c;t&#x6f;:&#105;&#x74;&#115;&#x2d;&#114;&#x69;-&#x74;e&#97;&#x6d;&#64;&#x6d;a&#x6e;c&#x68;&#x65;&#115;&#x74;&#101;&#x72;&#46;&#x61;c&#46;&#x75;&#107;\">&#105;&#x74;&#115;&#x2d;r&#105;&#x2d;&#116;&#x65;a&#x6d;&#64;&#109;&#x61;&#110;&#x63;h&#x65;&#x73;&#116;&#x65;r&#x2e;a&#99;&#x2e;&#117;&#x6b;<\/a>. Please include the following:\n<ul>\n<li>Subject &#8211; <strong>CSF4 &#8211; Software May2024 upgrade &#8211; <em>nameOfSoftware + version<\/em><\/strong>.<\/li>\n<li>Email body &#8211; In the body of the text, please outline the error and be sure to include any error messages, the location of job scripts, relevant locations, etc.<\/li>\n<\/ul>\n<\/li>\n<\/ol>\n<h2>Known Issues and Workarounds<\/h2>\n<h3>sbatch: error: Batch job submission failed: Node count specification invalid<\/h3>\n<p>If you submit a multinode job that specifies only the total number of cores, for example:<\/p>\n<pre>#!\/bin\/bash --login\r\n#SBATCH -p multinode\r\n#SBATCH -n 80            # 80 cores in total = 2 x 40-core compute nodes\r\nmodule load ......\r\nmpirun -n $SLURM_NTASKS <em>someapp.exe<\/em>\r\n\r\n######## <strong>THIS JOB WILL BE REJECTED<\/strong> #########\r\n<\/pre>\n<p>then the submission will fail with the following error:<\/p>\n<pre>sbatch myjobscript\r\nsbatch: error: Batch job submission failed: Node count specification invalid\r\n<\/pre>\n<p><strong>The solution<\/strong> is to ALSO specify the number of nodes:<\/p>\n<pre>#!\/bin\/bash --login\r\n#SBATCH -p multinode\r\n<strong>#SBATCH -N 2             # 2 nodes<\/strong> \r\n#SBATCH -n 80            # 80 cores in total = 2 x 40-core compute nodes\r\nmodule load ......\r\nmpirun -n $SLURM_NTASKS <em>someapp.exe<\/em>\r\n\r\n######## <strong>THIS JOB WILL BE 
ACCEPTED<\/strong> #########\r\n<\/pre>\n<p>We are looking into this behaviour change in SLURM.<\/p>\n<h3>version `XZ_5.2' not found<\/h3>\n<pre>\r\nflatpak: \/opt\/software\/RI\/apps\/XZ\/5.2.5-GCCcore-10.3.0\/lib\/liblzma.so.5: version `XZ_5.2' not found (required by \/lib64\/libarchive.so.13)\r\nflatpak: \/opt\/software\/RI\/apps\/XZ\/5.2.5-GCCcore-10.3.0\/lib\/liblzma.so.5: version `XZ_5.2' not found (required by \/lib64\/librpmio.so.9)\r\n<\/pre>\n<p>Known applications affected: <strong>GROMACS<\/strong><\/p>\n<p><strong>Solution<\/strong>: Please load the following modulefile in any jobscript that reports the above error.<\/p>\n<pre>module load xz\/5.2.5-gcccore-10.3.0\r\n<\/pre>\n<h3>Job email notifications not working<\/h3>\n<p>If you add lines such as the following to your jobscript, you won&#8217;t actually receive any email:<\/p>\n<pre>#SBATCH --mail-type=ALL\r\n#SBATCH --mail-user=<em>firstname.lastname<\/em>@manchester.ac.uk\r\n<\/pre>\n<p>There is no workaround &#8211; we will address this issue next week.<\/p>\n<h3>Gaussview gview.exe: libGLU.so.1: cannot open shared object file<\/h3>\n<p>gview.exe: error while loading shared libraries: libGLU.so.1: cannot open shared object file: No such file or directory<\/p>\n<p><strong>Solution<\/strong>: <del datetime=\"2024-06-14T10:11:10+00:00\">Please load the following module to prevent this error:<\/del><br \/>\n14\/06\/2024: The libglu libraries are now installed on the login and compute nodes, so there is no longer any need to load the following modulefile:<\/p>\n<pre>module load libglu\/9.0.1-gcccore-10.3.0<\/pre>\n<h3>-bash: nano: command not found<\/h3>\n<p><strong>Solution<\/strong>: The <code>nano<\/code> editor has now been installed on the login nodes.<\/p>\n<h3>Licensing errors<\/h3>\n<p>Known applications affected: <strong>MATLAB, StarCCM; other applications that contact on-campus license servers are also likely to be affected.<\/strong><br \/>\nExample error:<\/p>\n<pre>\r\nMATLAB\r\nLicense checkout 
failed.\r\nLicense Manager Error -15\r\nUnable to connect to the license server. \r\nCheck that the network license manager has been started, and that the client machine can communicate\r\nwith the license server.\r\n<\/pre>\n<p><del datetime=\"2024-06-11T16:29:09+00:00\">Currently under investigation<\/del><br \/>\n<del datetime=\"2024-06-16T16:45:56+00:00\">11\/06\/24 &#8211; root cause has been identified and a solution is being worked on.<\/del><br \/>\n16\/06\/24 &#8211; this issue has now been resolved &#8211; applications that use the campus license servers will now run.<br \/>\n<strong>Solution<\/strong>: Please just submit your jobs as usual &#8211; no changes are required to your jobscripts.<\/p>\n<h3>Error Running Paraview Macro<\/h3>\n<pre>\r\nHYDU_create_process (utils\/launch\/launch.c:73): execvp error on file srun (No such file or directory)\r\n<\/pre>\n<p><del datetime=\"2024-06-17T09:44:07+00:00\">Currently under investigation<\/del><br \/>\n<strong>Solution<\/strong>: Please just submit your jobs as usual &#8211; no changes are required to your jobscripts.<\/p>\n<h3>libnsl.so.1: cannot open shared object file<\/h3>\n<pre>\r\nerror while loading shared libraries: libnsl.so.1: cannot open shared object file: No such file or directory\r\n<\/pre>\n<p>Known applications affected: <strong>StarCCM<\/strong><\/p>\n<p><strong>Solution<\/strong>: The missing dependency has been installed.<\/p>\n<h3>libssl.so.10: cannot open shared object file<\/h3>\n<pre>\r\nflatpak: error while loading shared libraries: libssl.so.10: cannot open shared object file: No such file or directory\r\n<\/pre>\n<p><strong>Solution<\/strong>: Please add the following to your jobscript, after loading your usual modulefiles:<\/p>\n<pre>\r\nmodule load openssl\/1.0.2k\r\n<\/pre>\n<h3>LAMMPS libssl.so.10, libcrypto.so.10, libz.so.1: not found \/ cannot open shared object file<\/h3>\n<p><strong>Solution<\/strong>: Please add the following additional modules to your jobscript, 
after loading your LAMMPS module, <strong>in the given order<\/strong>:<\/p>\n<pre>\r\nmodule load openssl\/1.0.2k\r\nmodule load zlib\/1.2.11-gcccore-9.3.0\r\n<\/pre>\n<p>CSF4 <a href=\"https:\/\/ri.itservices.manchester.ac.uk\/csf4\/software\/applications\/lammps\/\">wiki on LAMMPS<\/a> updated with this info.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>As part of the May 2024 CSF4 upgrade the Operating System was upgraded from CentOS 7.9 to Red Hat Enterprise Linux 9.3. As a result, there&#8217;s a possibility certain pieces of software will require re-installing, or perhaps additional modulefiles will need to be loaded in your jobscript. We list below known problems and work-arounds discovered so far. Your own testing of your jobs will help us. If you discover any such things in your own.. <a href=\"https:\/\/ri.itservices.manchester.ac.uk\/csf4\/may-2024-upgrade\/\">Read more &raquo;<\/a><\/p>\n","protected":false},"author":8,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-1213","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf4\/wp-json\/wp\/v2\/pages\/1213","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf4\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf4\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf4\/wp-json\/wp\/v2\/users\/8"}],"replies":[{"embeddable":true,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf4\/wp-json\/wp\/v2\/comments?post=1213"}],"version-history":[{"count":20,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf4\/wp-json\/wp\/v2\/pages\/1213\/revisions"}],"predecessor-version":[{"id":1288,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf4\/wp-json\/wp\/v2\/pages\/1213\/revisions\/1288"}],
"wp:attachment":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf4\/wp-json\/wp\/v2\/media?parent=1213"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}