HECToR Phase 1 Hardware Configuration

The HECToR Phase 1 configuration is an integrated system known as "Rainier", which includes a scalar MPP XT4 system, a vector system known as "BlackWidow", and storage systems.

Cray XT4 scalar supercomputer

The XT4 comprises 1416 compute blades, each of which has 4 dual-core processor sockets. This amounts to a total of 11,328 cores, each of which acts as a single CPU. The processor is an AMD 2.8 GHz Opteron. Each dual-core socket shares 6 GB of memory, giving a total of 33.2 TB in all. The theoretical peak performance of the system is 59 Tflops.
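
The core and memory totals above follow directly from the blade counts. The short Python sketch below reproduces them; the variable names are illustrative, and the conversion of 1 TB = 1024 GB is an assumption chosen because it matches the quoted 33.2 TB figure.

```python
# Figures taken from the XT4 description above.
COMPUTE_BLADES = 1416
SOCKETS_PER_BLADE = 4
CORES_PER_SOCKET = 2          # dual-core Opteron
MEMORY_PER_SOCKET_GB = 6

sockets = COMPUTE_BLADES * SOCKETS_PER_BLADE
cores = sockets * CORES_PER_SOCKET
memory_gb = sockets * MEMORY_PER_SOCKET_GB

# Assumption: 1 TB is taken as 1024 GB, which reproduces the quoted total.
memory_tb = memory_gb / 1024

print(f"sockets: {sockets}")
print(f"cores:   {cores}")
print(f"memory:  {memory_gb} GB = {memory_tb:.1f} TB")
```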

There are 24 service blades, each with 2 dual-core processor sockets. They act as login nodes and controllers for I/O and for the network.

Each dual-core socket controls a Cray SeaStar2 router chip. Each router has 6 links, which are used to implement a 3D torus of processors. The point-to-point bandwidth is 2.17 GB/s, and the minimum bisection bandwidth is 4.1 TB/s. The latency between two nodes is around 6 μs.
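
The six links per router correspond to two neighbours along each of the three torus dimensions, with wraparound at the edges. The following Python sketch illustrates that addressing scheme only. The function name and the 4 x 4 x 4 torus shape are hypothetical; the actual torus dimensions of the XT4 are not stated here.

```python
def torus_neighbours(coord, dims):
    """Return the 6 neighbours of `coord` in a 3D torus of shape `dims`.

    Each axis contributes two neighbours (one step back, one step forward),
    and the modulo operation provides the wraparound links.
    """
    neighbours = []
    for axis in range(3):
        for step in (-1, 1):
            n = list(coord)
            n[axis] = (n[axis] + step) % dims[axis]
            neighbours.append(tuple(n))
    return neighbours


# Hypothetical torus shape, chosen only for illustration.
DIMS = (4, 4, 4)
corner = torus_neighbours((0, 0, 0), DIMS)
print(corner)
```

Even a node at the "corner" of the grid has six distinct neighbours, because the wraparound links connect it to the opposite faces.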

The system is held in 60 cabinets.

Cray vector "BlackWidow" system

In August 2008, a Cray vector system known as "BlackWidow" was added to the Rainier system. It includes 28 vector compute nodes; each node has 4 Cray vector processors, making 112 processors in all. Each processor is capable of 25.6 Gflops, giving a peak performance of 2.87 Tflops. Each 4-processor node shares 32 GB of memory.
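
The processor count and peak performance quoted above can be checked with a few lines of Python. This is a minimal sketch using only the figures in the paragraph; the total memory is derived here rather than quoted from the source, and the variable names are illustrative.

```python
# Figures taken from the BlackWidow description above.
VECTOR_NODES = 28
PROCESSORS_PER_NODE = 4
GFLOPS_PER_PROCESSOR = 25.6
MEMORY_PER_NODE_GB = 32

processors = VECTOR_NODES * PROCESSORS_PER_NODE
peak_tflops = processors * GFLOPS_PER_PROCESSOR / 1000  # 1 Tflops = 1000 Gflops

# Derived value, not stated in the source text.
total_memory_gb = VECTOR_NODES * MEMORY_PER_NODE_GB

print(f"processors:   {processors}")
print(f"peak:         {peak_tflops:.2f} Tflops")
print(f"total memory: {total_memory_gb} GB")
```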

The BlackWidow interconnection network has a point-to-point bandwidth of 16 GB/s and a bisection bandwidth of 254 GB/s. The average ping-pong MPI latency is approximately 4.6 μs.

Storage systems

The storage systems are accessible both by the XT4 and the "BlackWidow" systems.

Direct attached storage

576 TB of high-performance RAID disks are controlled by 3 controllers through 12 I/O nodes. The disks are accessible globally from any compute node and use the Lustre distributed parallel file system.

Backup system

The backup system includes 40 TB of disk space, known as NAS (Network Attached Storage) space, which holds the users' home directories as well as other files that users wish to back up. Files are backed up initially to 56 TB of MAID (Massive Array of Idle Disks) disk space, from which they are staged to a tape system as required.

The NAS storage is held on BlueArc Titan 2200 servers. MAID storage is held on a COPAN Systems Revolution 220TX MAID VTL. The tape subsystem is an ADIC Scalar i2000 tape library with 3 LTO tape drives. The backup system is controlled by 3 net nodes through 12 Gigabit Ethernet connections. The system includes 2 redundant Windows servers running Veritas NetBackup software.

Phase 1 upgrade

Later in Phase 1, the storage components will be upgraded as follows:

  • Direct attached storage: from 576 TB to 934 TB
  • NAS storage: from 40 TB to 70 TB
  • MAID storage: from 56 TB to 112 TB
  • Tape drives: from 3 to 6
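
To put the planned upgrade in perspective, the following Python sketch computes the growth factor for each component listed above. The dictionary layout and names are illustrative only; the before and after values are those given in the list.

```python
# (before, after) values from the upgrade list above.
UPGRADES = {
    "Direct attached storage (TB)": (576, 934),
    "NAS storage (TB)": (40, 70),
    "MAID storage (TB)": (56, 112),
    "Tape drives": (3, 6),
}

# Growth factor = capacity after the upgrade / capacity before it.
growth = {name: after / before for name, (before, after) in UPGRADES.items()}

for name, factor in growth.items():
    print(f"{name}: x{factor:.2f}")
```

On these figures, the MAID storage and tape drive count double, while the direct attached storage grows by a little over 60%.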
