HECToR Optimisation Guide
Version 2.5 (27 January 2012)
This guide provides more advanced information on using the HECToR system than is available in the HECToR User Guide. It covers the hardware, optimising your code (in serial and in parallel), profiling, and debugging. The guide is updated continually as new content is added.
Description of the guide and useful links.
Detailed description of the HECToR hardware and system software. This includes an in-depth look at the AMD Bulldozer architecture and memory layout.
- 2.1 Processor architecture
- 2.2 Building block architecture
- 2.3 Memory architecture
- 2.4 Interconnect
- 2.5 I/O subsystem architecture
- 2.6 Available file systems
- 2.7 Operating system (CLE)
Details on how to compile codes and use numerical libraries, MPI libraries and other parallel programming options.
- 3.1 Modules environment
- 3.2 Compiler wrapper commands
- 3.3 Available compilers
- 3.4 Available (vendor optimised) numerical libraries
- 3.5 Available MPI implementations
- 3.6 OpenMP
- 3.7 SHMEM
Advanced use of the HECToR batch system.
- 4.1 Batch system commands
- 4.2 Multiple aprun commands in a single job
- 4.3 Job arrays
- 4.4 Interactive jobs
- 4.5 Writing job submission scripts in Perl and Python
How to use the performance analysis tools installed on the system.
- 5.1 Available Performance Analysis Tools
- 5.2 Cray Performance Analysis Tool (CrayPAT)
- 5.3 General hints for interpreting profiling results
Tips on how to optimise the performance of your code in both serial and parallel.
- 6.1 Optimisation summary
- 6.2 Serial (single-core) optimisation
- 6.3 Parallel optimisation
- 6.4 Advanced OpenMP usage
- 6.5 Memory optimisation
- 6.6 I/O optimisation
How to use the debugging tools installed on the system.
- 7.1 Available Debuggers
- 7.2 TotalView
- 7.3 Cray ATP
- 7.4 GDB (GNU Debugger)
- 7.5 DDT Debugger