The HECToR Service is now closed and has been superseded by ARCHER.

A Hands-on Workshop for the Cray XT4

7-10 April 2008
Training Room, EPCC, Edinburgh

course slides

The addition of HECToR to the UK HPC community has presented users with a step-change in the performance available to them. The highly scalable nature of the Cray XT4™ HECToR system enables users to solve problems at scales and resolutions previously unavailable. How best to leverage the HECToR architecture will be the focus of this workshop, organised by the Cray Centre of Excellence for HECToR. Topics will include programming for multicore (including an introduction to quad-core), application performance tuning for Opteron, enhancing scalability, application performance tools, scientific library updates, and new Cray software. This will be a three-day hands-on workshop where users will have a chance to optimize their applications with the assistance of Cray Centre of Excellence staff. There will also be an optional fourth day where users can spend more time optimizing their codes with Cray personnel.

Click here to register

Day One - Monday 7th April 2008


  1. Architecture of the AMD Core (60 Minutes)
    1. Architectural features that the application developer needs to know
      1. Functional Units
      2. Cache Architecture
      3. Memory interface
    2. Issues using dual-core nodes in the Cray XT4 MPP
  2. Compiler considerations when using the Dual Core (45 Minutes)
    1. Memory pre-fetching
    2. How to use shared memory parallelization in the compiler
  3. Using CrayPat to profile applications on the Cray XT4 (45 Minutes)


Assignment - obtain profiles of your application running on the dual-core Cray XT4.

Day Two - Tuesday 8th April 2008


  1. Optimizations for the AMD Dual Core (90 Minutes)
    1. Optimization techniques that the application developer needs to know
      1. Blocking for cache
      2. Using prefetch directives
  2. I/O Optimization on Lustre
  3. Using Apprentice to examine hardware counters for understanding cache utilization and vectorization (60 Minutes)


Assignment - optimize your application for node performance.

Day Three - Wednesday 9th April 2008


  1. Optimizing for the Distributed Shared Memory MPP (120 Minutes)
    1. Optimization techniques that the application developer needs to know
      1. OpenMP
      2. Pthreads
      3. Mixing MPI and OpenMP/Pthreads
  2. Using Apprentice to examine MPI performance (60 Minutes)


Assignment - optimize your application across the MPP.

Day Four (optional) - Thursday 10th April 2008

All-day hands-on workshop with personnel from Cray to assist in optimizing your application.