Hardware
HECToR includes the first production Cray XT6 24-core system in the world (Phase 2b) and the Phase 2a XT5h system, which comprises a Cray XT4 quad-core system and a Cray X2 vector component.
Phase 2b: Cray XT6 system
The Phase 2b (XT6) system is contained in 20 cabinets and comprises a total of 464 compute blades. Each blade contains four compute nodes, each with two 12-core AMD Opteron 2.1 GHz Magny-Cours processors. This amounts to a total of 44,544 cores. Each 12-core socket is coupled with a Cray SeaStar2 routing and communications chip. The SeaStar2 interconnect will be upgraded to the Cray Gemini interconnect in late 2010. The cores of each 12-core processor share 16 GB of memory, giving a system total of 59.4 TB. The theoretical peak performance of the Phase 2b system is over 360 Tflops.
There are 16 service blades on the XT6, each with two dual-core processor sockets. They act as login nodes and as controllers for I/O and for the network.
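The Phase 2b totals quoted above follow from the per-blade figures. The short Python sketch below reproduces that arithmetic as a sanity check. The value of 4 floating-point operations per core per clock cycle is an assumption, not something stated here; it was chosen because it is consistent with the quoted peak of over 360 Tflops.

```python
# Figures taken from the Phase 2b description.
BLADES = 464
NODES_PER_BLADE = 4
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 12
MEMORY_PER_SOCKET_GB = 16
CLOCK_GHZ = 2.1

# ASSUMPTION (not stated in the text): flops per core per clock cycle.
FLOPS_PER_CYCLE = 4

sockets = BLADES * NODES_PER_BLADE * SOCKETS_PER_NODE
cores = sockets * CORES_PER_SOCKET

# Decimal units: 1 TB = 1000 GB, 1 Tflops = 1000 Gflops.
memory_tb = sockets * MEMORY_PER_SOCKET_GB / 1000
peak_tflops = cores * CLOCK_GHZ * FLOPS_PER_CYCLE / 1000

print(f"sockets: {sockets}")
print(f"cores: {cores}")
print(f"total memory: {memory_tb:.1f} TB")
print(f"theoretical peak: {peak_tflops:.1f} Tflops")
```

Under that assumption the computed peak comes out somewhat above 360 Tflops, which agrees with the "over 360 Tflops" wording.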
Phase 2b storage systems
Direct attached storage
84 TB of high-performance RAID disks are available. The disks are accessible globally from any phase 2b compute node and use the Lustre distributed parallel file system.
Phase 2a: Cray XT5h system
The Phase 2a system consists of an XT4 superscalar component and an X2 vector component.
XT4 component
The Cray XT4 has been reduced in size from 60 cabinets to 33, which brings its capacity down from 5664 quad-core compute nodes to 3072. This amounts to a total of 12,288 cores, each of which acts as a single CPU. The processor is an AMD 2.3 GHz Opteron. The cores of each quad-core socket share 8 GB of memory. The Cray XT4 operating system has been upgraded to CLE 2.2.
There are 24 service blades, each with 2 dual-core processor sockets. They act as login nodes and controllers for I/O and for the network.
Each quad-core socket controls a Cray SeaStar2 router chip. Each router has 6 links, which are used to implement a 3D torus of processors. The point-to-point bandwidth is 2.17 GB/s, and the minimum bisection bandwidth is 4.1 TB/s. The latency between two nodes is around 6 μs.
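The six links per router correspond to the six nearest neighbours of a node in a 3D torus: one in each direction along each of the three axes, with wraparound at the edges. A minimal Python sketch of that addressing scheme follows. The function name and the torus dimensions are hypothetical and chosen only for illustration; the actual dimensions of the HECToR torus are not given here.

```python
def torus_neighbours(coord, dims):
    """Return the six nearest neighbours of `coord` in a 3D torus of size `dims`."""
    neighbours = []
    for axis in range(3):
        for step in (-1, 1):
            shifted = list(coord)
            # The modulo implements the wraparound link that closes each ring.
            shifted[axis] = (shifted[axis] + step) % dims[axis]
            neighbours.append(tuple(shifted))
    return neighbours


# Hypothetical torus dimensions, for illustration only.
DIMS = (4, 4, 4)

for neighbour in torus_neighbours((0, 0, 0), DIMS):
    print(neighbour)
```

A node at the corner of the coordinate grid, such as (0, 0, 0), still has six neighbours because each axis wraps around to its far end.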
X2 vector component
The X2 part of the system includes 28 vector compute nodes; each node has 4 Cray vector processors, making 112 processors in all. Each processor is capable of 25.6 Gflops, giving a peak performance of 2.87 Tflops. Each 4-processor node shares 32 GB of memory.
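The X2 totals above can be checked with the same kind of arithmetic. The sketch below uses only the figures quoted in the text; the aggregate memory is a derived value that the text does not state directly.

```python
# Figures taken from the X2 description.
VECTOR_NODES = 28
PROCESSORS_PER_NODE = 4
GFLOPS_PER_PROCESSOR = 25.6
MEMORY_PER_NODE_GB = 32

processors = VECTOR_NODES * PROCESSORS_PER_NODE
peak_tflops = processors * GFLOPS_PER_PROCESSOR / 1000

# Derived here, not quoted in the text.
total_memory_gb = VECTOR_NODES * MEMORY_PER_NODE_GB

print(f"vector processors: {processors}")
print(f"peak performance: {peak_tflops:.2f} Tflops")
print(f"aggregate memory: {total_memory_gb} GB")
```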
The X2 interconnection network has a point-to-point bandwidth of 16 GB/s and a bisection bandwidth of 254 GB/s. The average ping-pong MPI latency is approximately 4.6 μs.
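To give a feel for what the latency and bandwidth figures mean in practice, the sketch below applies a simple first-order model in which the time to send a message is the latency plus the message size divided by the point-to-point bandwidth. This model is an assumption introduced here for illustration; it ignores contention, protocol overheads and message-size effects, so the results are rough estimates rather than measured values. The function name is hypothetical.

```python
def transfer_time_us(message_bytes, latency_us, bandwidth_gb_per_s):
    """Estimate transfer time in microseconds: latency + size / bandwidth."""
    bytes_per_us = bandwidth_gb_per_s * 1e9 / 1e6  # GB/s -> bytes per microsecond
    return latency_us + message_bytes / bytes_per_us


# Latency and bandwidth figures quoted in the text for each interconnect.
INTERCONNECTS = {
    "XT4 (SeaStar2)": {"latency_us": 6.0, "bandwidth_gb_per_s": 2.17},
    "X2": {"latency_us": 4.6, "bandwidth_gb_per_s": 16.0},
}

for size in (8, 1_000, 1_000_000):
    for name, params in INTERCONNECTS.items():
        estimate = transfer_time_us(size, **params)
        print(f"{size:>9} bytes on {name}: {estimate:.1f} us")
```

In this model small messages are dominated by latency, so the two interconnects look similar, while for large messages the bandwidth difference dominates.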
Phase 2a storage systems
The phase 2a storage systems are accessible both by the XT4 and the X2 systems.
Direct attached storage
508 TB of high-performance RAID disks are controlled by 3 controllers through 12 IO nodes. The disks are accessible globally from any phase 2a compute node and use the Lustre distributed parallel file system.
Archive system
The archive system is based on Symantec's Enterprise NetBackup and currently consists of 1300 tapes of 800 GB each, with a maximum capacity of approximately 1.02 petabytes.
Note: the archive is only accessible from the phase 2a system.
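The archive capacity quoted above depends on how gigabytes are converted to petabytes. The sketch below computes the raw product of tape count and tape size under three conversions. The text does not say which convention was used, so the observation that a mixed binary and decimal conversion gives roughly 1.02 is an inference, not a confirmed explanation.

```python
# Figures taken from the archive description.
TAPES = 1300
TAPE_CAPACITY_GB = 800

total_gb = TAPES * TAPE_CAPACITY_GB

# Three possible conversions from GB to PB (which one the text uses is not stated).
decimal_pb = total_gb / 1000 / 1000   # 1 PB = 1000 * 1000 GB
binary_pb = total_gb / 1024 / 1024    # 1 PB = 1024 * 1024 GB
mixed_pb = total_gb / 1024 / 1000     # one binary step, one decimal step

print(f"raw capacity: {total_gb} GB")
print(f"decimal conversion: {decimal_pb:.2f} PB")
print(f"binary conversion: {binary_pb:.2f} PB")
print(f"mixed conversion: {mixed_pb:.2f} PB")
```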
Backup system
The backup system (which is shared by the phase 2a and phase 2b systems) includes 70 TB of disk space, known as NAS (Network Attached Storage) space, which holds the users' home directory space. Files are backed up initially to a 168 TB MAID (Massive Array of Idle Disks) disk space, from which they are staged to a tape system as required.
The NAS storage is held on BlueArc Titan 2200 servers. MAID storage is held on a COPAN Systems Revolution 220TX MAID VTL. The tape subsystem is a Quantum i2000 tape library with 4 LTO-4 FC tape drives. The backup system is controlled by 3 net nodes through 12 Gigabit Ethernet connections. The system includes 2 redundant Windows servers running Veritas NetBackup software.
Previous hardware
You can also find information on the previous phases of HECToR:
