DPSF Slide Deck


The Data-Processing Shared Facility

The Newest Component of SCRE@M

Dr Simon Hood


Research Platforms Manager (acting), IT Services,
University of Manchester


Scalable Compute for Research @ Manchester

Tightly-integrated package of platforms/capabilities, including:


What is it?

The Data-Processing Shared Facility

The DPSF is a new computational resource within SCRE@M designed for processing large amounts of data:


Context and History

In the beginning. . .

  • Skunkworks(?)
  • Prototype platform based on pooled funding/shared resources.
  • 90k from Uni; first jobs Dec 2010; 288 cores.
  • Now 3 million, 8000 cores, and still growing fast.
  • Approx. 40 research groups.
CPU-bound work, e.g., simulations/modelling:
  • commonly multi-host MPI-based work;
  • the CSF is ideal for this.

. . .dominated by EPS, but then. . .


Changing Computational Landscape

. . .the nature of computational research changed; FLS/MHS joined the party. . .

FLS/MHS people kept breaking things!

Huge growth in IO-bound and high-memory work:
  • dominated by life and medical sciences (FLS and MHS);
  • huge amounts of RAM and also fast storage required;
  • serial and SMP workloads.
The DPSF addresses these issues:
  • SCRE@M has previously lacked capability in this area.


The DPSF, née Hydra

Take one local cluster and inject one business case. . .

Hydra, an FLS cluster:
  • Hydra created Dec 2013 via 100k from Chris Knight, FLS.
  • Subsequent financial contributions from FLS (faculty funds), Sam Griffiths-Jones (FLS), Bernard Keavney (MHS), Magnus Rattray (FLS), John Keane (Comp. Sci.), Tony Whetton (Stoller Centre, MHS).
2015/6 jump:
  • Business case to The University for 250k (Simon Hood and Kurt Weideling);
  • Magnus Rattray (MRC-funded Single Cell), 250k.


Specifications and Resources

High-memory compute nodes:
  • all 16-core;
  • 30 nodes with 512 GB RAM;
  • 26 nodes with 256 GB RAM.
Two local, fast, high-capacity filesystems:
  • temporary workspace /scratch: 600 TB, three-month file-deletion policy, no quotas;
  • local copies of datasets /data: 250 TB, quotas, no file-deletion policy (files may be kept long-term);
  • NOT resilient, NO backups.
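What the three-month /scratch policy means in practice can be sketched as follows. This is only an illustration of an "older than ~90 days" rule using a temporary directory standing in for /scratch; the 90-day cutoff and the file names are assumptions here, not the actual cleanup mechanism run on the DPSF.

```python
import os
import tempfile
import time

def files_past_policy(root, max_age_days=90):
    """Return paths under root whose modification time is older than
    max_age_days -- a sketch of a 'three-month' deletion policy."""
    cutoff = time.time() - max_age_days * 86400
    old = []
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) < cutoff:
                old.append(path)
    return old

# Demonstration in a temporary directory standing in for /scratch:
with tempfile.TemporaryDirectory() as scratch:
    fresh = os.path.join(scratch, "fresh.dat")
    stale = os.path.join(scratch, "stale.dat")
    for p in (fresh, stale):
        open(p, "w").close()
    # Backdate one file by ~100 days so it falls outside the policy window.
    then = time.time() - 100 * 86400
    os.utime(stale, (then, then))
    assert files_past_policy(scratch) == [stale]
```

Because /scratch has no backups, anything a job still needs should be copied to /data (or off-machine) well before it ages out.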


Isilon RI

Dedicated Resilient Storage Too!

Dedicated Isilon cluster — "Isilon-RI":

Currently available to contributors only.


CIR Ecosystem


DPSF Architecture


Use Cases

Why not simply use my local workstation?


Operating Model

Funding Model and Entitlement to Compute:
  • Some free-at-the-point-of-use (F@TPOU) compute resources are available!
    • Notionally: 5 * 512 GB nodes + 13 * 256 GB nodes.
  • The remaining resource is paid for and shared amongst financial contributors.
Job Prioritisation:
  • Scheduling is "fair share";
  • each contributing research group receives at least as much resource as they paid for. . .
  • . . .integrated over a month (provided sufficient jobs are submitted).
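As a rough sketch of what "integrated over a month" means: a group's entitlement is its paid-for fraction of the machine times the total core-hours available in the month, and the fair-share guarantee is that delivered core-hours meet or exceed that figure. The fractions, core counts, and 720-hour month below are invented for illustration and do not describe the actual scheduler configuration.

```python
def monthly_entitlement(paid_fraction, total_cores, hours_in_month=720):
    """Core-hours a group is entitled to over one month, given the
    fraction of the machine it paid for (simplified fair-share model)."""
    return paid_fraction * total_cores * hours_in_month

def meets_entitlement(delivered_core_hours, paid_fraction, total_cores):
    """True if the scheduler delivered at least the paid-for share."""
    return delivered_core_hours >= monthly_entitlement(paid_fraction, total_cores)

# Illustration: a group funding a quarter of a 56-node, 16-core machine
# (896 cores) is entitled to 0.25 * 896 * 720 = 161,280 core-hours/month.
assert monthly_entitlement(0.25, 896) == 161_280
assert meets_entitlement(200_000, 0.25, 896)      # over-delivered: fine
assert not meets_entitlement(100_000, 0.25, 896)  # under-delivered
```

Note the proviso in the slide: the guarantee only holds if the group actually submits enough jobs to consume its share.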


Future Directions

Hadoop/Spark etc.:

Job Submission Engines:

i.e., easy-to-use interfaces. Currently working on:


Enquiry Routes

How do I. . .?


Get access to The DPSF?
  • For free-at-the-point-of-use access, a light-touch application process will be advertised.
  • Contributors simply ask for accounts to be created.
Make a Contribution to The DPSF or Isilon-RI?
  • Contact Research IT and we will arrange a meeting to discuss your needs in detail and the likely cost.


More Information

For more information about SCRE@M and The DPSF, please visit:

