This page provides a basic introduction to using the HECToR GPGPU testbed system. It covers the hardware, the software installed on the system, compiling code and running jobs.

HECToR GPGPU Testbed

The HECToR GPGPU testbed machine has been provided for researchers to test their scientific codes and problems on a modern GPGPU-accelerated system.

Accessing the GPGPU Machine

Once you have successfully applied for an account on the testbed via SAFE, you can access the frontend node via an SSH connection to:

<username>@gpu.hector.ac.uk

Note: your GPU password is not synchronised with your HECToR password.
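For example, from any machine with an SSH client (the username shown is a placeholder for your own testbed username):

```shell
# Log in to the GPGPU testbed frontend node
# (replace 'username' with your testbed account name)
ssh username@gpu.hector.ac.uk
```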

Hardware Details

Frontend node

The frontend node has a single quad-core Intel Xeon 1.87GHz CPU (E5502) and a single NVidia C2050 GPU card. Please note that the GPU card on the frontend is for debugging only - please do not run large production jobs on the frontend node.

Compute nodes

Currently the testbed machine has four compute nodes connected by a QDR (quad data rate) InfiniBand interconnect. All of the compute nodes have a single quad-core Intel Xeon 2.4GHz CPU (E5620) and 32 GB of main memory. Three of the compute nodes (gpu1, gpu2, gpu3) have 4 NVidia Fermi GPGPU cards installed and the remaining compute node (gpu4) has 2 AMD FirePro GPGPU cards installed. The layout is summarised in the table below.

Compute Node | CPU                         | Main Memory | GPGPU Cards
gpu1         | Quad-core Intel Xeon 2.4GHz | 32GB        | 4x NVidia Fermi C2050 (3GB Memory)
gpu2         | Quad-core Intel Xeon 2.4GHz | 32GB        | 4x NVidia Fermi C2050 (3GB Memory)
gpu3         | Quad-core Intel Xeon 2.4GHz | 32GB        | 1x NVidia Fermi C2050 (3GB Memory) + 3x NVidia Fermi C2070 (6GB Memory)
gpu4         | Quad-core Intel Xeon 2.4GHz | 32GB        | 2x AMD FirePro V7800

System Software and Compilers

The following software is installed on the testbed:

  • PGI 11 Compiler Suite - Includes support for CUDA Fortran and PGI Accelerator directives
  • GNU Compiler Suite
  • MPICH MPI Library (PGI and GNU versions)
  • Infiniband Verbs Library
  • NVidia CUDA Toolkit
  • ATI Stream Library
  • DDT Debugger with GPGPU support
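The installed software is made available through environment modules, as used in the job script examples later on this page. A minimal sketch of checking and loading software (the `cuda` module name is an assumption; confirm the exact names with `module avail`):

```shell
# List the software modules installed on the testbed
module avail

# Load the NVidia CUDA toolkit
# (module name is an assumption - check 'module avail' for the exact name)
module add cuda
```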

Simulation Packages

If you have any requests for software packages to be installed on the system, please contact the HECToR Helpdesk.

The following packages are currently installed on the system:

  • PMEMD (Amber) - Single GPGPU and multiple GPGPU+MPI versions available
  • LAMMPS - Single GPGPU version available
  • NAMD 2.9 - Single GPGPU and multiple GPGPU+MPI versions available
  • GROMACS 4.6.0 - Single GPGPU and multiple GPGPU+MPI versions available
  • VASP 5.2.12 Exact-exchange - GPU+MPI version available

Filesystems and Quotas

Some notes on the filesystem:

  • The GPU testbed has its own disk and is not linked to the HECToR storage.
  • There is a single filesystem, called ghome.
  • Disk capacity is limited: only 6 TB is available.
  • The disk is not backed up, so users should ensure critical data is copied elsewhere.
  • All data will be deleted at the end of your trial project
  • Group quotas apply but user quotas do not
  • Disk usage statistics are not yet available via the SAFE
  • Users can check their group quota on the GPU testbed using quota -g groupID

Running Jobs

The testbed machine uses the Sun Grid Engine (SGE) scheduler.

The basic commands are:

  • qsub - submit a job;
  • qstat - view the job queue;
  • qdel - remove a job from the queue.

More information is available via the man sge_intro command on the testbed machine.

The maximum runtime for a job is 12 hours and each user is restricted to 2 running jobs at any one time.
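A typical submit-and-monitor cycle looks like the following sketch, assuming a submission script called job.sh (the script name and the job ID are illustrative, not real values from the system):

```shell
# Submit a job script to the SGE scheduler;
# qsub prints the job ID assigned to the job
qsub job.sh

# Check the state of your jobs in the queue
qstat

# Remove a job, giving the job ID reported by qsub (e.g. 1234)
qdel 1234
```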

Single GPGPU card job: example

Here is an example job submission script for a 20 minute NAMD job that uses a single CPU core and GPGPU card. If you are not using NAMD then you will need to replace the 'namd2' executable with your own executable.

#!/bin/bash --login
#$ -N example_job
# Set the job time
#$ -l h_rt=0:20:0
# Set the account to charge to (change this to your account)
#$ -A gz01
# Shift to the directory that the job was submitted from
#$ -cwd
# Send environment with script (needed to get code modules)
#$ -V

# Load the NAMD module
module add namd

# Run the job
namd2 +idlepoll apao1.namd

Multiple GPGPU card job: example

Here is an example job submission script for a 20 minute PMEMD (Amber) job that uses four CPU cores and four GPGPU cards. If you are not using Amber then you will need to replace the 'pmemd' executable with your own executable.

#!/bin/bash --login
#$ -N example_job
# Set the job time
#$ -l h_rt=0:20:0
# Select the number of parallel processes to use
#$ -pe mpich 4
# Set the account to charge to (change this to your account)
#$ -A gz01
# Shift to the directory that the job was submitted from
#$ -cwd
# Send environment with script (needed to get code modules)
#$ -V

# Load the Amber module
module add amber

# Run the job
mpiexec -np 4 pmemd ...my input options...

Compiling Code

Compiling GPGPU-enabled code should be relatively straightforward as all the required libraries should already be on your path.

Compiling MPI code

By default, the GNU MPI programming environment is loaded - this means that the mpif90, mpif77, mpicc and mpic++ wrapper commands will use the GNU compilers.

If you wish to use the PGI suite to compile MPI code you should first swap the GNU MPICH module for the PGI version with:

module swap mpich2/2-1.3-gcc mpich2/2-1.3-pgi
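Whichever compiler suite is loaded, compilation goes through the MPI wrapper commands. A brief sketch (the source file names are hypothetical):

```shell
# Compile an MPI C program with the currently loaded compiler suite
mpicc -o hello_mpi hello_mpi.c

# Compile an MPI Fortran program
mpif90 -o hello_mpi_f hello_mpi_f.f90
```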

Support and Maintenance

Support for the GPU testbed is provided on a reasonable-endeavours basis. Please contact the HECToR Helpdesk for assistance.

We will endeavour to schedule maintenance on the GPU testbed to coincide with XE6 maintenance. In order to support GPU training needs, there may be occasions when the batch queues are disabled. We will publish any training dates in advance.

Note: Utilisation and disk usage statistics are not yet available in SAFE.

Useful Links