{"id":3092,"date":"2016-06-09T13:32:12","date_gmt":"2016-06-09T13:32:12","guid":{"rendered":"http:\/\/ri.itservices.manchester.ac.uk\/csf-apps\/?page_id=3092"},"modified":"2018-10-01T14:12:53","modified_gmt":"2018-10-01T14:12:53","slug":"tensorflow","status":"publish","type":"page","link":"https:\/\/ri.itservices.manchester.ac.uk\/csf-apps\/software\/applications\/tensorflow\/","title":{"rendered":"Tensorflow"},"content":{"rendered":"<h2>Overview<\/h2>\n<p><a href=\"https:\/\/www.tensorflow.org\/\">TensorFlow<\/a> is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them. The flexible architecture allows you to deploy computation to one or more CPUs or GPUs.<\/p>\n<p>Version 1.8.0 (using Python 3.5) has been installed for GPUs on the CSF.<\/p>\n<p>Versions 0.8.0, 0.9.0rc0, 1.0.0 (using Python 2.7), version 0.10.0 (using Python 3.4), version 0.11.0, 0.12.1, 1.0.0, 1.2.1, 1.8.0 (using Python 3.5) all for CPUs, versions 1.10.1 and 1.11.0 (using Python 3.6) for CPUs and GPUs have been installed on the CSF.<\/p>\n<h2>Restrictions on use<\/h2>\n<p>There are no access restrictions on the CSF.<\/p>\n<h2>Set up procedure<\/h2>\n<p>Note that as of Tensorflow 1.10.1 it is not possible to run on a Westmere CPU because the Tensorflow libraries require support for the Intel AVX instruction set. This means you must run on Sandybridge, Ivybridge, Haswell or Broadwell CPUs. 
So if you are submitting a batch CPU job, please add <em>one<\/em> of the following flags to your jobscript:<\/p>\n<pre>\r\n# Use only one of the following flags\r\n-l sandybridge\r\n-l ivybridge\r\n-l haswell\r\n-l broadwell\r\n<\/pre>\n<p>The CSF Nvidia K20 GPU nodes are Sandybridge nodes, so you should NOT add any of these flags if using a GPU node.<\/p>\n<p>It is not currently possible to run interactive CPU jobs using 1.10.1 and later because there are no Sandybridge (or better) interactive nodes on the CSF.<\/p>\n<p>To access the software you must first load <em>one<\/em> of the following modulefiles:<\/p>\n<pre>\r\n# Python 3.6 for GPUs: (uses CUDA 9.0.176, cuDNN 7.3.0, Anaconda3 5.2.0, <em>not<\/em> Westmere CPUs)\r\napps\/gcc\/tensorflow\/1.11.0-py36-<strong>gpu<\/strong>\r\napps\/gcc\/tensorflow\/1.10.1-py36-<strong>gpu<\/strong>\r\n\r\n# Python 3.5 for GPUs: (uses CUDA 9.0.176, cuDNN 7.0.3, Anaconda3 4.2.0)\r\napps\/gcc\/tensorflow\/1.8.0-py35-<strong>gpu<\/strong>\r\n\r\n# Python 3.6 for CPUs: (uses Anaconda3 5.2.0, <em>not<\/em> Westmere CPUs)\r\napps\/gcc\/tensorflow\/1.11.0-py36-cpu\r\napps\/gcc\/tensorflow\/1.10.1-py36-cpu\r\n\r\n# Python 3.5 for CPUs: (uses Anaconda3 4.2.0)\r\napps\/gcc\/tensorflow\/1.8.0-py35-cpu\r\napps\/gcc\/tensorflow\/1.2.1-py35-cpu\r\napps\/gcc\/tensorflow\/1.0.0-py35-cpu\r\napps\/gcc\/tensorflow\/0.12.1-py35-cpu\r\napps\/gcc\/tensorflow\/0.11.0-py35-cpu\r\n\r\n# Python 3.4 for CPUs: (new versions <em>not<\/em> being installed unless requested)\r\napps\/gcc\/tensorflow\/0.10.0-py34-cpu\r\napps\/gcc\/tensorflow\/0.9.0rc0-py34-cpu\r\napps\/gcc\/tensorflow\/0.8.0-py34-cpu\r\n\r\n# Python 2.7 for CPUs:\r\napps\/gcc\/tensorflow\/1.2.1-py27-cpu\r\napps\/gcc\/tensorflow\/1.0.0-py27-cpu\r\napps\/gcc\/tensorflow\/0.9.0rc0-py27-cpu\r\napps\/gcc\/tensorflow\/0.8.0-py27-cpu\r\n<\/pre>\n<p>The above modulefiles will load the following modulefiles automatically:<\/p>\n<ul>\n<li><em>One<\/em> of the following Anaconda Python 
modulefiles:\n<ul>\n<li><code>apps\/binapps\/anaconda\/3\/4.2.0<\/code> (python 3.5.2)<\/li>\n<li><code>apps\/binapps\/anaconda\/3\/2.3.0<\/code> (python 3.4.3)<\/li>\n<li><code>apps\/binapps\/anaconda\/2.5.0<\/code> (python 2.7.11)<\/li>\n<\/ul>\n<\/li>\n<li><code>compilers\/gcc\/4.8.2<\/code> (C++11 compatible compiler)<\/li>\n<\/ul>\n<h2>Running the application<\/h2>\n<p>Please do not run TensorFlow on the login node. Jobs should be run interactively on the backend nodes (via <code>qrsh<\/code>) or submitted to the compute nodes via batch.<\/p>\n<p>The following instructions describe interactive use on a backend node and batch jobs from the login node.<\/p>\n<p>Technical Note (you are not required to do anything &#8211; this is for information only)<\/p>\n<ul>\n<li>We use a modified python executable (a shell script) named <code>python<\/code> to start the usual Anaconda python interpreter. This actually runs the following:\n<pre>LD_PRELOAD=\/usr\/lib64\/librt.so:$TFDIR\/fixes\/stubs\/mylibc.so:$GCCDIR\/lib64\/libstdc++.so.6 python<\/pre>\n<p>The <code>LD_PRELOAD<\/code> is needed to load a few libraries that replace system libraries. The pre-compiled TensorFlow installation supplied by Google requires a newer version of GLIBC than is available on the CSF. We have modified the TensorFlow library <code>_pywrap_tensorflow.so<\/code> to be less strict about the version of GLIBC present, and we then supply some functions, required by TensorFlow, that are missing from our older GLIBC library.<\/p>\n<\/li>\n<\/ul>\n<h3>Interactive use on a Backend GPU Node<\/h3>\n<p>June 2018: Currently only a couple of Nvidia K20 GPUs are available. 
To request access to these nodes please email <a href=\"mailto:its-ri-team@manchester.ac.uk\">its-ri-team@manchester.ac.uk<\/a>.<\/p>\n<p>Once you have been granted access to the Nvidia K20 node, start an interactive session as follows:<\/p>\n<pre>\r\nqrsh -l inter -l nvidia_k20\r\n\r\n# Wait until you are logged in to a backend compute node, then:\r\nmodule load apps\/gcc\/tensorflow\/1.8.0-py35-gpu\r\npython\r\n<\/pre>\n<p>An example TensorFlow GPU session is as follows:<\/p>\n<pre>\r\n# Start python\r\npython\r\n\r\n# Now enter the following python commands:\r\n\r\n# Load the tensorflow library (using a short name for convenience)\r\nimport tensorflow as tf\r\n\r\n  # You should see:\r\n  #   successfully opened CUDA library libcudnn.so locally\r\n  #   (and other GPU details)...\r\n\r\n# Create a graph\r\na = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')\r\nb = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')\r\nc = tf.matmul(a, b)\r\n\r\n# Turn on device placement reporting so we can see where a graph runs\r\nsess = tf.Session(config=tf.ConfigProto(log_device_placement=True))\r\n\r\n  # You should see:\r\n  # Created TensorFlow device (\/job:localhost\/replica:0\/task:0\/device:GPU:0 with 4332 MB memory) -> physical GPU (device: 0, name: Tesla K20m, pci bus id: 0000:03:00.0, compute capability: 3.5)\r\n  # Device mapping:\r\n  # \/job:localhost\/replica:0\/task:0\/device:GPU:0 -> device: 0, name: Tesla K20m, pci bus id: 0000:03:00.0, compute capability: 3.5\r\n  # 2018-06-26 12:11:40.198864: I tensorflow\/core\/common_runtime\/direct_session.cc:284] Device mapping:\r\n  # \/job:localhost\/replica:0\/task:0\/device:GPU:0 -> device: 0, name: Tesla K20m, pci bus id: 
0000:03:00.0, compute capability: 3.5\r\n\r\n# Run the graph. It will report the GPU used to do so.\r\nsess.run(c)\r\n\r\n  # You should see\r\n  # MatMul: (MatMul): \/job:localhost\/replica:0\/task:0\/device:GPU:0\r\n  # 2018-06-26 12:11:44.481336: I tensorflow\/core\/common_runtime\/placer.cc:886] MatMul: (MatMul)\/job:localhost\/replica:0\/task:0\/device:GPU:0\r\n  # b: (Const): \/job:localhost\/replica:0\/task:0\/device:GPU:0\r\n  # 2018-06-26 12:11:44.481358: I tensorflow\/core\/common_runtime\/placer.cc:886] b: (Const)\/job:localhost\/replica:0\/task:0\/device:GPU:0\r\n  # a: (Const): \/job:localhost\/replica:0\/task:0\/device:GPU:0\r\n  # 2018-06-26 12:11:44.481370: I tensorflow\/core\/common_runtime\/placer.cc:886] a: (Const)\/job:localhost\/replica:0\/task:0\/device:GPU:0\r\n  #\r\n  #   array([[ 22.,  28.],\r\n  #          [ 49.,  64.]], dtype=float32)\r\n\r\n# Exit the python shell\r\nCtrl-D<\/pre>\n<h3>Interactive use on a Backend CPU-only Node<\/h3>\n<p>To request an interactive session on a backend compute node run:<\/p>\n<pre>qrsh -l inter -l short\r\n\r\n# Wait until you are logged in to a backend compute node, then:\r\n\r\nmodule load apps\/gcc\/tensorflow\/1.2.1-py35-cpu\r\npython<\/pre>\n<p>An example TensorFlow session is given below.<\/p>\n<p>If there are no free <em>interactive<\/em> resources the <code>qrsh<\/code> command will ask you to try again later. Please do not run TensorFlow on the login node. Any jobs running there will be killed without warning.<\/p>\n<h3>Single CPU Example<\/h3>\n<p>A simple TensorFlow test is as follows:<\/p>\n<pre># Assuming you are at the CSF login node:\r\n\r\n# 1. Log in to a backend node \r\nqrsh -l inter -l short\r\n\r\n# 2. Load the modulefile on the backend node\r\nmodule load apps\/gcc\/tensorflow\/1.2.1-py35-cpu\r\n\r\n# 3. 
Start python then enter the commands\r\npython\r\n# Enter the following program\r\nimport tensorflow as tf\r\n\r\n# Create a graph\r\na = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')\r\nb = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')\r\nc = tf.matmul(a, b)\r\n\r\n# Create the TensorFlow session, turning on device placement reporting and\r\n# restricting threads to the number of cores we can use. An interactive\r\n# 'qrsh' session can only use one core.\r\nsess = tf.Session(config=tf.ConfigProto(log_device_placement=True,\r\n  inter_op_parallelism_threads=1,\r\n  intra_op_parallelism_threads=1))\r\n\r\n# Run the graph. It will report the device used to do so.\r\nsess.run(c)\r\n\r\n# You should see the following reported\r\nMatMul: \/job:localhost\/replica:0\/task:0\/cpu:0\r\nb: \/job:localhost\/replica:0\/task:0\/cpu:0\r\na: \/job:localhost\/replica:0\/task:0\/cpu:0\r\narray([[ 22.,  28.],\r\n       [ 49.,  64.]], dtype=float32)\r\n\r\n# Exit the python shell\r\nCtrl-D<\/pre>\n<h3>Serial batch job submission<\/h3>\n<p>Ensure you have loaded the correct modulefile on the login node. Create a Python script (e.g., <code>my-script.py<\/code>) as follows. 
It will detect how many cores it can use:<\/p>\n<pre>import tensorflow as tf\r\nimport os\r\n\r\n# Get number of cores reserved by the batch system (NSLOTS is automatically set, or use 1 if not)\r\nNUMCORES=int(os.getenv(\"NSLOTS\",1))\r\nprint(\"Using\", NUMCORES, \"core(s)\" )\r\n\r\n# Create TF session using correct number of cores\r\nsess = tf.Session(config=tf.ConfigProto(inter_op_parallelism_threads=NUMCORES,\r\n   intra_op_parallelism_threads=NUMCORES))\r\n\r\n# Now create a TF graph\r\na = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')\r\nb = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')\r\nc = tf.matmul(a, b)\r\n\r\n# Run the graph and print result\r\nprint(sess.run(c))<\/pre>\n<p>Now create a jobscript similar to the following:<\/p>\n<pre>#!\/bin\/bash\r\n#$ -S \/bin\/bash\r\n#$ -cwd                   # Run job from directory where submitted\r\n#$ -V                     # Inherit environment (modulefile) settings\r\n\r\n# $NSLOTS is automatically set to 1. The python script uses this (see above).\r\npython my-script.py<\/pre>\n<p>Submit your jobscript using<\/p>\n<pre>qsub <em>jobscript<\/em><\/pre>\n<p>where <code><em>jobscript<\/em><\/code> is the name of your jobscript.<\/p>\n<h3>Parallel batch job submission<\/h3>\n<p>Ensure you have loaded the correct modulefile and then create a jobscript similar to the following:<\/p>\n<pre>#!\/bin\/bash\r\n#$ -S \/bin\/bash\r\n#$ -cwd                   # Run job from directory where submitted\r\n#$ -V                     # Inherit environment (modulefile) settings\r\n#$ -pe smp.pe 16          # Number of cores on a single compute node. 
Can be 2-24.\r\n\r\n# $NSLOTS is automatically set to the number of cores requested on the pe line\r\n# and can be read by your python code (see example above).\r\npython my-script.py<\/pre>\n<p>The above <code>my-script.py<\/code> example will get the number of cores to use from the <code>$NSLOTS<\/code> environment variable.<\/p>\n<p>Submit your jobscript using<\/p>\n<pre>qsub <em>jobscript<\/em><\/pre>\n<p>where <code><em>jobscript<\/em><\/code> is the name of your jobscript.<\/p>\n<h2>Further info<\/h2>\n<ul>\n<li><a href=\"https:\/\/www.tensorflow.org\/\">TensorFlow website<\/a><\/li>\n<\/ul>\n<h2>Updates<\/h2>\n<p>None.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Overview TensorFlow is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them. The flexible architecture allows you to deploy computation to one or more CPUs or GPUs. Version 1.8.0 (using Python 3.5) has been installed for GPUs on the CSF. Versions 0.8.0, 0.9.0rc0, 1.0.0 (using Python 2.7), version 0.10.0 (using Python 3.4),.. 
<a href=\"https:\/\/ri.itservices.manchester.ac.uk\/csf-apps\/software\/applications\/tensorflow\/\">Read more &raquo;<\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"parent":31,"menu_order":0,"comment_status":"open","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-3092","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf-apps\/wp-json\/wp\/v2\/pages\/3092","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf-apps\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf-apps\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf-apps\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf-apps\/wp-json\/wp\/v2\/comments?post=3092"}],"version-history":[{"count":20,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf-apps\/wp-json\/wp\/v2\/pages\/3092\/revisions"}],"predecessor-version":[{"id":4857,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf-apps\/wp-json\/wp\/v2\/pages\/3092\/revisions\/4857"}],"up":[{"embeddable":true,"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf-apps\/wp-json\/wp\/v2\/pages\/31"}],"wp:attachment":[{"href":"https:\/\/ri.itservices.manchester.ac.uk\/csf-apps\/wp-json\/wp\/v2\/media?parent=3092"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}