

The Computational Shared Facility: Status, Strategy and Policy Proposals

 

Introductions

The Computational Shared Facility: Status and Strategy


Simon Hood

simon.hood@manchester.ac.uk

Project Manager & Technical Lead
Infra Ops, IT Services



 

Background and Context



 

Uni RC Strategy: Mi Whitepaper

Funding Model: an IT Services-run Computational Shared Facility
  • One-off capital secured from centre: 90k
    • Cluster infra. (head nodes, storage, network h/w. . .)
  • All compute nodes must be paid for by research groups
    • With "tax" to contribute to future infrastructure
  • No contribution, (almost) no use!
From Many to One
  • Academics encouraged to buy in to central facility
    • strongly discouraged from buying own (small) clusters


Whitepaper published in the latter part of 2009. . .



 

Status

Current Status of the CSF



 

The Present

What do we have?
  • Adoption of the Redqueen model.
  • 90k sounds small. . .
  • . . .but, for the first time(?), both:
    • University political backing from the top
    • and University central IT support (esp. for dedicated network).






 

New System (Hardware)

Apps installed, testing finished, user accounts created:
  • 68TB parallel, high-perf scratch (Lustre)
  • 240 cores at 4GB/core
  • 48 cores at 8GB/core
  • 96 cores awaiting installation
Much more on its way. . .
  • 512 cores on order!
  • 96 cores to be ordered (Monday?)
  • More expected in Spring. . .
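
(For scale: 240 + 48 = 288 cores are in service now; the 96 awaiting installation plus the 512 + 96 on order take that to roughly 1000.)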

In Reynolds House. . .



 

New System (Apps and Users)

Software:
  • Apps installed
    • Gaussian 09, Amber. . .
    • Matlab, R. . .
    • CHARMM, NAMD, Polyrate. . .
Users:
  • Testing by users well underway
  • Awaiting registration system. . .
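
To give a flavour of how these applications would be run (the module name and time limit below are illustrative, not the CSF's actual settings), a batch job is a short script handed to the workload manager, Grid Engine, with qsub:

  #!/bin/bash
  # gaussian.sh -- illustrative serial job script for the Grid Engine scheduler
  #$ -cwd                    # run in the directory the job was submitted from
  #$ -l h_rt=01:00:00        # request one hour of wallclock time

  # Module name is a placeholder; real names depend on how the apps are installed
  module load apps/gaussian/g09

  g09 < input.com > output.log

Submitted with "qsub gaussian.sh"; "qstat" then shows it queued or running.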



 

Pics 1, 2 and 3

[photographs]

 

Who and What

Contributors thus far:

  Chris Taylor                   20k    Imag., Gen. and Prot.
  Mike Sutcliffe                 55k    CEAS
  Ian Hillier                    15k    Chemistry
  Richard Bryce                  15k    Pharmacy
  Peter Stansby/Colin Bailey    125k    MACE/EPS
  School contribution            32k    CEAS

Upcoming (expected):

  15k       Translational Med.
  15k       FLS (Bioinf)
  30k(?)    Mathematics
  6k        MBS
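
(Taken together: 262k contributed so far, with roughly a further 66k expected.)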



 

GPGPUs

Pooled purchasing clearly having the desired effect!

Dell provided a loss-leader blade chassis (M1000e) and two blades (M610x):

. . .a very low price. . .


More?



 

Next Steps



 

Assimilation

Tightly Integrate Clusters on Campus

Link private cluster networks (dedicated 10Gb links):
  • Share filesystems: easy to implement (with the required h/w);
    • much better for users!
  • Ultimately, one "collective" instance of the workload manager, Grid Engine (aka SGE); see the sketch after this list
  • No requirement for "grid" middleware
What? New System, RQ2 and Redqueen
  • Dedicated 10Gb link between Reynolds and Kilburn
  • Total ~2000 cores

. . .details/timescales over. . .
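
As a rough sketch of what this could look like in practice (host names, group names and paths below are placeholders, not the real CSF configuration), the Redqueen nodes would be registered in the shared Grid Engine cell and the Lustre scratch mounted identically on every cluster:

  # Register a Redqueen node as an execution host in the shared Grid Engine cell
  qconf -ae rq-node001

  # Group the Redqueen nodes and let the shared queue schedule onto them
  qconf -ahgrp @redqueen
  qconf -aattr queue hostlist @redqueen all.q

  # /etc/fstab entry mounting the same Lustre scratch on each cluster
  mgs01@tcp0:/scratch   /scratch   lustre   defaults,_netdev   0 0

With the scheduler and filesystems shared in this way, users get a single submission point and a single scratch area, whichever cluster actually runs the job.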



 

RQ2

The first step in assimilation. . .

RQ2:



 

Redqueen

The second step will take longer. . .

Redqueen:

What?
  • ~800 cores; 16 Nvidia 2050s
  • MACE, Economics, SEAES (Atmos), EEE, Chemistry, MBS
  • Different machine room from CSF (Kilburn)
Steps
  • Awaiting dedicated 10Gb link between Reynolds and Kilburn
  • Filesystem upgrade
    • Has ~25TB storage — upgrade some to Lustre
  • Summer?



 

Man1, Man2 and The RGF

Can we use the Revolving Green Fund?

We hope:



 

Phase Two: Cloud

The Cloud. . .

Whitepaper:

"Centralised, shared facility fits well with cloud computing model."




 

Web Portal

Database searches. . .



 

External Access

Basic, off-campus access. . .
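
For illustration only (the host names are placeholders, not real service addresses), basic off-campus access would most likely mean SSH through a gateway and on to a CSF login node:

  # From off campus: hop via an SSH gateway, then on to the CSF login node
  ssh username@<ssh-gateway>
  ssh username@<csf-login-node>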



 

Virtual Desktop Service

Start interactive work at work, finish at home?



 

Condor Integration

Integrate with the other big computation resource on campus. . .
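
As a purely illustrative sketch of what using the campus Condor pool looks like (the executable and file names are made up), a job is described in a small submit file and handed to condor_submit:

  # job.sub -- minimal Condor submit description file (illustrative)
  universe   = vanilla
  executable = my_analysis
  output     = run.out
  error      = run.err
  log        = run.log
  queue

  # Submit it to the pool:
  condor_submit job.sub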



 

Finally



computational-sf@manchester.ac.uk

