

The Computational Shared Facility Update

Introductions

The Computational Shared Facility: 2011 October Update


Simon Hood

simon.hood@manchester.ac.uk



Hardware Update

CSF Update: Hardware



Contributions and Hardware Update

What hardware do we have?

Compute: 1448 cores (excludes RQ2 and GPU hosts)
  • 608 cores at 2 GB/core, with InfiniBand (fast interconnect)
  • 468 + 36 + 12 cores at 4 GB/core (7-day, 2-day, short)
  • 252 cores at 4 GB/core, with InfiniBand (fast interconnect)
  • 96 cores at 8 GB/core
  • 48 cores awaiting installation
Nvidia 2050 GPGPUs
  • 16 in hosts with InfiniBand
  • 3 in hosts without InfiniBand (soon 7)



Contributions

  Contributor                       Amount   Source

  Chris Taylor                        20k    Imag., Gen. and Prot.
  Mike Sutcliffe                      55k    CEAS
  Ian Hillier                         15k    Chemistry
  Richard Bryce                       22k    Pharmacy
  School contribution                125k    MACE
  Paola Carbone + school contrib.     45k    CEAS
  Simon Lovell and Simon Whelan       15k    Bioinf. (FLS)
  Jane Worthington                    15k    Translational Med.
  Ben Rogers                          18k    MACE
  Nick Higham                         54k    Maths
  Stephen Welbourne                   15k    Psychology
  Neil Burton                         15k    Chemistry
  Paul Grassia                        15k    CEAS
  Paul Popelier                        7k    Chemistry
  School/faculty contribution         48k    FLS

  Total:                             484k

Amounts contributed and source, to date.



Known Future Contributions

With lots more coming real soon now!

Another 115k
  • Paul Popelier
  • Chris Taylor
  • RGF
    • includes four more Nvidia 2050s!
  • Andrew Masters
RQ2 Integration
  • In testing
  • 264 cores
Another seed unit from Dell
  • C6145 with AMD Interlagos CPUs (November)



Internal Network Upgrade

That two-day downtime...



Service Improvements




Improved Documentation

Significant work on application documentation; please give us feedback!



(GP)GPU Documentation and Software

Use of Nvidia (GP)GPUs is getting easier on the CSF


Thanks particularly to Mike Croucher, EPS.



RedQueen

Moves afoot to bring RedQueen closer to the CSF



Accounting




Accounting Results One

Usage by group from 2011/02/07 to 2011/05/11

  Who       Shares         CPU Seconds           Fraction of 
                                                 Contrib Used

  ms01 :    46 (0.53) :     36465791 (0.07) :    0.03 
  ih01 :    12 (0.14) :    244226356 (0.49) :    0.77 
  ct01 :    17 (0.20) :      2688065 (0.01) :    0.01 
  rb01 :    12 (0.14) :    219109077 (0.44) :    0.69 
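
The bracketed figures in the Shares and CPU Seconds columns are consistent with each group's fraction of the column total. A minimal sketch that recomputes them from the table above follows; the function name column_fractions is hypothetical. The "Fraction of Contrib Used" column depends on a normalisation that the slides do not state, so it is not reproduced here.

```python
# Figures copied from the table above (2011/02/07 to 2011/05/11):
# group -> (shares, CPU seconds)
usage = {
    "ms01": (46, 36465791),
    "ih01": (12, 244226356),
    "ct01": (17, 2688065),
    "rb01": (12, 219109077),
}


def column_fractions(table):
    """Return {group: (share_fraction, cpu_fraction)}.

    Each fraction is relative to its own column total. This is an
    assumption about how the bracketed figures were produced.
    """
    total_shares = sum(shares for shares, _ in table.values())
    total_cpu = sum(cpu for _, cpu in table.values())
    return {
        group: (shares / total_shares, cpu / total_cpu)
        for group, (shares, cpu) in table.items()
    }


fractions = column_fractions(usage)
for group, (share_frac, cpu_frac) in fractions.items():
    print(f"{group:>6} : {share_frac:.2f} : {cpu_frac:.2f}")
```

Comparing the two fractions for a group gives a quick view of whether it is using more or less CPU time than its share of the contributions.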



Accounting Results Two

Usage by group from 2011/05/11 to 2011/07/09

  Who       Shares         CPU Seconds           Fraction of 
                                                 Contrib Used

  ms01 :     46 (0.23) :    274368334 (0.11) :   0.28 
  ih01 :     12 (0.06) :    267930176 (0.10) :   1.05 
  ct01 :     17 (0.08) :      4679642 (0.00) :   0.01 
  rb01 :     12 (0.06) :    409340623 (0.16) :   1.60 
mace01 :    117 (0.57) :   1599603757 (0.63) :   0.64 



Accounting Results Three

Usage by group from 2011/07/09 to 2011/07/21

  Who       Shares         CPU Seconds           Fraction of 
                                                 Contrib Used

  ms01 :     46 (0.18) :   121059103 (0.14) :    0.61 
  ih01 :     12 (0.05) :    59091621 (0.07) :    1.15 
  ct01 :     17 (0.07) :      151941 (0.00) :    0.00 
  rb01 :     12 (0.05) :   100628586 (0.12) :    1.96 
mace01 :    117 (0.46) :   520917352 (0.60) :    1.04 
  pc01 :     27 (0.11) :    56908426 (0.07) :    0.49 
slsw01 :     12 (0.05) :    15233699 (0.02) :    0.30 



Accounting Results Four

Usage by group from 2011/07/21 to 2011/10/18

  Who       Shares         CPU Seconds           Fraction of 
                                                 Contrib Used

  ms01 :     58 (0.14) :    296263950 (0.05) :   0.18 
  ih01 :     12 (0.03) :    356669215 (0.07) :   1.04 
  ct01 :     17 (0.04) :      6008512 (0.00) :   0.01 
  rb01 :     18 (0.04) :    283804885 (0.05) :   0.55 
mace01 :    117 (0.28) :   3507229753 (0.65) :   1.05 
  pc01 :     39 (0.09) :    708707558 (0.13) :   0.64 
slsw01 :     12 (0.03) :     52273465 (0.01) :   0.15 
  nb01 :     15 (0.04) :     86515225 (0.02) :   0.20 
  jw01 :     12 (0.03) :       264959 (0.00) :   0.00 
  nh01 :     45 (0.11) :     14445581 (0.00) :   0.01 
  sw01 :     13 (0.03) :     50799708 (0.01) :   0.14 
 fls01 :     48 (0.12) :     42501501 (0.01) :   0.03 
  pp01 :      5 (0.01) :        88513 (0.00) :   0.00 



Accounting Plans

Next week...



Policies




Small Contributions

What is the minimum contribution?




RGF Kit Usage

What do we do with the RGF-funded kit?

  1. Allow access to all computational researchers at UoM (No!)
    • Not practical to support this
  2. Current contributors only (No)
  3. System evaluation for potential contributors (Yes)
  4. Pump priming (Yes; is a (very) lightweight case required?)
  5. Soft landing (Yes)
  6. Training (Of course!)



Share Depreciation Rate

CSF Share Should Decrease in Line with Relative Decrease in Contributed Compute Power

  Year                     1     2     3     4     5     6     7

  Linear@20%             1.00  0.80  0.60  0.40  0.20  0.00  0.00
  Moore's law (2-year)   1.00  0.71  0.50  0.35  0.25  0.18  0.13
  Moore's law (3-year)   1.00  0.79  0.63  0.50  0.40  0.31  0.25
  Geometric@20%          1.00  0.80  0.64  0.51  0.41  0.33  0.26
  Bespoke - balance      1.00  0.85  0.65  0.45  0.35  0.20  0.10
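
The formula-based rows of the table can be regenerated with a short sketch. It assumes that "Linear@20%" loses 0.20 of the original share per year (floored at zero), that "Geometric@20%" loses 20% of the remaining share per year, and that the "Moore's law" rows halve the share every two or three years. The function names are hypothetical. The "Bespoke - balance" row looks hand-chosen and is not derived here.

```python
def linear(year, rate=0.20):
    """Share after losing a fixed `rate` of the original per year, floored at 0."""
    return max(0.0, 1.0 - rate * (year - 1))


def geometric(year, rate=0.20):
    """Share after losing `rate` of the remaining share each year."""
    return (1.0 - rate) ** (year - 1)


def moore(year, halving_years):
    """Share that halves every `halving_years` years."""
    return 0.5 ** ((year - 1) / halving_years)


years = range(1, 8)
schedules = {
    "Linear@20%": [linear(y) for y in years],
    "Moore's law (2-year)": [moore(y, 2) for y in years],
    "Moore's law (3-year)": [moore(y, 3) for y in years],
    "Geometric@20%": [geometric(y) for y in years],
}

for name, values in schedules.items():
    print(f"{name:<22}" + "  ".join(f"{v:.2f}" for v in values))
```

Under these assumptions the values agree with the table to within rounding in the last digit.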



Scratch Space

Scratch: not a permanent home for files!

Policy options:

  1. One-month limit...
  2. ...or three months? (Easier to start short and increase...)
  3. Is touching files (changing the time stamp) an abuse, or not?
  4. Automated emails to owners of old files:
    • warning, second warning, deletion
  5. Quotas, based on contribution?
    • "informal"
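
One way the "old files" check behind option 4 might work is sketched below. It is an illustration only, not the CSF's actual tooling: the function name find_stale_files, the 30-day default and the example path are all assumptions. It uses each file's modification time, which is why option 3 matters: touching a file resets the clock.

```python
import os
import time


def find_stale_files(root, max_age_days=30, now=None):
    """Return {owner_uid: [paths]} for files under `root` whose
    modification time is more than `max_age_days` days old."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    stale = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)  # do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if info.st_mtime < cutoff:
                stale.setdefault(info.st_uid, []).append(path)
    return stale


# Example (hypothetical path): group month-old files by owner, ready for
# a warning email, a second warning, and finally deletion.
# stale_by_owner = find_stale_files("/scratch", max_age_days=30)
```

Grouping by owner means each user would receive one message listing all of their old files, rather than one message per file.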



Extra Storage Options?

Is there a requirement for long-term, large-scale storage on the CSF?

Scratch
  • Opportunity to invest in scratch, just like compute?
  • Fast (Lustre)
  • Less resilient, no backup: do not rely on this for long-term storage
  • Comes in big chunks (price not yet known)
Home
  • Buy extra home space: 1.5k per TB for five years (to be confirmed)
  • "Resilient storage" from IT Services' new SAN
  • Not as fast
  • Comes in small chunks (good)

Recommendation:



Finale




Photo 1


C6100s and switches



Photo 4


The beginnings of a dedicated research network?



Photo 2


Nvidia GPU-hosting blades.



Photo 3


The hardest working bit of kit in the room.



Value for Money

Is the UoM CSF good value for money to UoM academics?


UoM CSF vs Commercial HPC Cloud:

CSF
  • Cost to academics at point of use:
    • approx. 0.01 pounds (one penny) per core-hour
Penguin Computing HPC Cloud (POD)
  • 0.20 cents per core-hour (current as of 2011/06/21)

