N8 CIR Bede: GH200 pilot

Peter Heywood, Research Software Engineer

The University of Sheffield

N8 CIR Bede - Tier 2 HPC facility

  • GPU system with non-x86_64 CPUs
    • Higher-bandwidth host-device communication
    • Reduced software compatibility
  • Free at point of use for N8 institutes
    • Any compatible GPU workloads
  • Supports multi-node multi-GPU jobs
  • Funded until at least 2025-03-31

  • POWER9 (ppc64le) partitions:
    • 128 (+8) V100 GPUs
    • 16 T4 GPUs
  • ARM (aarch64) Pilot:
    • 2 (+1) GH200 Superchips

NVIDIA Grace Hopper Superchip

  • GH200 480GB
    • 72-core ARM CPU
    • 480GB LPDDR5X
    • H100 GPU (132 SMs)
    • 96GB HBM3 (4TB/s)
    • 450-1000W (900W in Bede)
    • NVLink-C2C (900 GB/s bidirectional bandwidth)

Host-device interconnect bandwidth
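
Host-to-device copy bandwidth of this kind can be estimated with a simple timed copy. The following is a minimal sketch, assuming PyTorch with CUDA support is installed. The function names `bandwidth_gb_per_s` and `measure_h2d_bandwidth`, the 1 GiB buffer size and the repeat count are illustrative choices, not taken from the measurements behind this slide.

```python
def bandwidth_gb_per_s(num_bytes: int, elapsed_ms: float) -> float:
    """Convert a byte count and an elapsed time in ms to GB/s (1 GB = 1e9 bytes)."""
    return num_bytes / 1e9 / (elapsed_ms / 1e3)


def measure_h2d_bandwidth(num_bytes: int = 1 << 30, repeats: int = 10) -> float:
    """Time pinned host-to-device copies with CUDA events; return the mean GB/s."""
    import torch  # imported here so the helper above works without PyTorch

    if not torch.cuda.is_available():
        raise RuntimeError("a CUDA-capable GPU is required for this measurement")

    # Pinned (page-locked) host memory allows asynchronous DMA transfers.
    host = torch.empty(num_bytes, dtype=torch.uint8, pin_memory=True)
    device = torch.empty(num_bytes, dtype=torch.uint8, device="cuda")

    device.copy_(host, non_blocking=True)  # warm-up copy, not timed
    torch.cuda.synchronize()

    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    for _ in range(repeats):
        device.copy_(host, non_blocking=True)
    end.record()
    torch.cuda.synchronize()  # wait until the end event has completed

    # elapsed_time() reports milliseconds between the two recorded events.
    return bandwidth_gb_per_s(num_bytes * repeats, start.elapsed_time(end))
```

Calling `measure_h2d_bandwidth()` on a GPU node returns the mean copy rate in GB/s. On a GH200 the result is bounded by one direction of NVLink-C2C, which is half of the 900 GB/s bidirectional figure; measured values are usually below the peak.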

PyTorch GPT-2 fine-tuning benchmark

  • Based on previous work by RIT & RSE
  • GPT-2 (124 million parameters)
  • WikiText-2 (raw)
  • Batch size 8
  • FP32 & FP16
  • NGC 24.02 container
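
As a rough guide to what the two precisions in the list above mean for memory, the storage needed for the model's parameter values alone can be worked out directly. This is a back-of-the-envelope sketch: the helper name `parameter_bytes` is hypothetical, and real fine-tuning also needs memory for gradients, optimizer state and activations, which are not counted here.

```python
BYTES_PER_VALUE = {"fp32": 4, "fp16": 2}  # IEEE 754 single and half precision

GPT2_PARAMETERS = 124_000_000  # "124 million parameters", as quoted above


def parameter_bytes(num_parameters: int, precision: str) -> int:
    """Bytes needed to store the parameter values at the given precision."""
    return num_parameters * BYTES_PER_VALUE[precision]


for precision in ("fp32", "fp16"):
    size_mb = parameter_bytes(GPT2_PARAMETERS, precision) / 1e6
    print(f"{precision}: {size_mb:.0f} MB")
# prints:
# fp32: 496 MB
# fp16: 248 MB
```

Either way, the parameter values occupy a small fraction of the 96GB of GPU memory on a GH200.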

GPT-2 WikiText-2 benchmark: FP32 performance

Accessing Bede

Any questions on Bede (or JADE):

tier-2-hpc-support-group@sheffield.ac.uk

Bede documentation includes aarch64