The HECToR Service is now closed and has been superseded by ARCHER.

FAQ: Optimising Code

The first part of this section deals with the optimisation of code performance on a single node.

The second part deals with the optimisation of the parallel aspects of the code.

Go back to the FAQ index.


Q. What is SSE(2,3,...)?

A. SSE stands for Streaming SIMD (Single Instruction, Multiple Data) Extensions. The trailing digit refers to the revision of the instruction set. With these instructions a processor can carry out packed operations on multiple data items simultaneously. Generating such instructions from ordinary loops is usually referred to as vectorisation.

Some compilers support the automatic generation of SSE instructions. For example, PGI has the switch -fast and the GNU compiler has -ftree-vectorize.
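
As a rough illustration of the kind of loop a vectorising compiler may be able to turn into packed SSE operations, here is a minimal sketch in C. The function name saxpy and its signature are chosen for this example only and do not come from any HECToR documentation. Whether the loop is actually vectorised depends on the compiler, its version and the switches used.

```c
#include <assert.h>
#include <stddef.h>

/* Computes y[i] = a * x[i] + y[i] for i in [0, n).
 * The restrict qualifiers promise the compiler that x and y do not
 * overlap, which removes one common obstacle to vectorisation.
 * Illustrative sketch only; the name and signature are assumptions. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < n; ++i) {
        /* Each iteration is independent of the others, so several
         * elements can be processed by one packed instruction. */
        y[i] = a * x[i] + y[i];
    }
}
```

Loops of this shape (unit stride, no dependence between iterations, no function calls in the body) are generally the easiest for a compiler to vectorise.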

Q. What are the suggested optimisation options for the PGI compiler?

A. The following should be useful:

-fast
chooses generally-optimal flags for a processor that supports SSE2 and SSE3 instructions.
-Mipa=fast
performs interprocedural analysis (at link time).
-Minline
enables function inlining.
-Minfo
outputs information about the optimisations attempted by the compiler.
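
To make the options above more concrete, here is a small sketch in C of code that inlining might help: a short helper function called from inside a hot loop. The function names are hypothetical, and the compile line in the comment assumes the PGI C compiler is invoked as pgcc; the compiler wrapper on a given system may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed compile line, using the options listed above:
 *   pgcc -fast -Mipa=fast -Minline -Minfo -c example.c
 * (example.c is a placeholder file name.) */

/* Small helper: a typical candidate for inlining. */
static double squared_diff(double u, double v)
{
    double d = u - v;
    return d * d;
}

/* Sum of squared differences between two arrays of length n. */
double sum_squared_diff(size_t n, const double *a, const double *b)
{
    assert(a != NULL && b != NULL);
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i) {
        /* The call sits in the hot loop; inlining removes the call
         * overhead and exposes the loop body to further optimisation. */
        sum += squared_diff(a[i], b[i]);
    }
    return sum;
}
```

With -Minfo the compiler reports which optimisations it attempted, which is one way to check whether the helper was inlined.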

Q. What are the suggested optimisation options for the GNU compiler?

A. The following should be useful:

-O3
common optimisations and inlining.
-funroll-loops
unrolls loops whose number of iterations can be determined at compile time or on entry to the loop.
-ftree-vectorize
attempts to vectorise loops so that they use SSE instructions.
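
As an illustration of a loop that unrolling can target, here is a minimal sketch in C with a trip count fixed at compile time. The names NCOEFF and poly_eval are invented for this example, and the compile line in the comment simply combines the options listed above. Note that each iteration depends on the previous one, so this loop is a candidate for unrolling rather than for vectorisation.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed compile line, using the options listed above:
 *   gcc -O3 -funroll-loops -ftree-vectorize -c example.c
 * (example.c is a placeholder file name.) */

#define NCOEFF 8  /* number of coefficients, known at compile time */

/* Evaluates coeff[0] + coeff[1]*x + ... + coeff[NCOEFF-1]*x^(NCOEFF-1)
 * using Horner's rule. */
double poly_eval(const double coeff[NCOEFF], double x)
{
    assert(coeff != NULL);
    double result = 0.0;
    /* Fixed trip count: the compiler can unroll this loop completely. */
    for (int i = NCOEFF - 1; i >= 0; --i) {
        result = result * x + coeff[i];
    }
    return result;
}
```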

Q. How can I tune MPI?

A. See the HECToR Optimisation Guide - Tuning Chapter and the Good Practice Guide for Parallel Optimisation.

Q. How can I tell if I should tune MPI?

A. If your code is not scaling well, or a profile shows the time spent in MPI routines increasing as the number of processors increases, then you may need to tune your use of MPI or implement the communication parts of your algorithm differently.

See the HECToR Optimisation Guide for more information on profiling and tuning your code.
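
The scaling check described above can be put into numbers once you have timings from a few runs. The following is a minimal sketch in C under stated assumptions: the function names are hypothetical, and the timings are values you would read from your own profiles or job logs. It does not use any profiling tool's API.

```c
#include <assert.h>

/* Parallel efficiency of a run on p processes relative to a baseline
 * run on p_base processes:
 *   efficiency = (t_base * p_base) / (t_p * p)
 * A value of 1.0 means perfect scaling from the baseline. */
double parallel_efficiency(double t_base, int p_base, double t_p, int p)
{
    assert(t_base > 0.0 && t_p > 0.0 && p_base > 0 && p > 0);
    return (t_base * (double)p_base) / (t_p * (double)p);
}

/* Fraction of the total run time spent in MPI routines. */
double mpi_fraction(double t_mpi, double t_total)
{
    assert(t_total > 0.0 && t_mpi >= 0.0);
    return t_mpi / t_total;
}
```

For example, a run that takes 100 s on 64 processes and 50 s on 256 processes has an efficiency of 0.5 relative to the 64-process run. If the MPI fraction also grows between the two runs, communication is a likely place to look.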

Q. How can I get more information about the nature of my code's communications?

A. Profile your code using CrayPat and/or Apprentice2. See the HECToR Optimisation Guide and the Good Practice Guides for more details.

Q. What can I do to improve my code's scaling?

A. See the HECToR Optimisation Guide - Tuning Chapter and the Good Practice Guide for Parallel Optimisation.

Q. My code spends most of its time reading/writing files. What can I do to improve performance?

A. Consider using parallel IO. See the Good Practice Guide for IO.
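
One common pattern in parallel IO is for each process to write its own contiguous block of a shared file. The sketch below, in C, shows only the bookkeeping for that pattern: which elements a given rank owns and the byte offset at which its block starts. The type and function names are hypothetical. In an MPI-IO code the offset would typically be passed to a collective write such as MPI_File_write_at_all; that call is not shown here, and the Good Practice Guide may recommend a different approach.

```c
#include <assert.h>
#include <stddef.h>

/* Contiguous block of a global array owned by one rank. */
typedef struct {
    size_t start;  /* index of the first element owned */
    size_t count;  /* number of elements owned */
} block_t;

/* Splits n_global elements over nranks ranks as evenly as possible.
 * The first (n_global % nranks) ranks receive one extra element. */
block_t block_for_rank(size_t n_global, int nranks, int rank)
{
    assert(nranks > 0 && rank >= 0 && rank < nranks);
    size_t base = n_global / (size_t)nranks;
    size_t rem  = n_global % (size_t)nranks;
    size_t r    = (size_t)rank;
    block_t b;
    b.count = base + (r < rem ? 1u : 0u);
    b.start = r * base + (r < rem ? r : rem);
    return b;
}

/* Byte offset in the shared file at which this block begins,
 * assuming the file holds the array with no header. */
size_t byte_offset(block_t b, size_t elem_size)
{
    return b.start * elem_size;
}
```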

