
Compiler Comparison

An additional objective of the project was to evaluate the performance of the compilers available on the Cray XT for CP2K. At the start of the earlier CP2K dCSE project[1], the Pathscale 3.1 compiler was found to give around 5% greater performance than the PGI (8.0.2) or gfortran (4.3.2) compilers. Around two years on from those results, new versions of all three compilers are available, as well as the new Cray Compiler Environment (CCE). The ability of the compilers to handle the mixed-mode OpenMP code was also evaluated.

The H2O-64 benchmark, running on 72 cores (6 nodes) of the Cray XT5 'Rosa', was used for this comparison. For this configuration, less than 30% of the runtime is spent in communication, so the performance of the compiled code is strongly dependent on the compiler's ability to generate a well-optimised binary. The results for the MPI-only code are shown in Table 4. In contrast to the previous results, the gfortran compiler now produces the fastest binary, with a runtime roughly 3% lower than Pathscale and 5% lower than PGI, while the Cray compiler is considerably slower than all three. Further details of each of the compilers are below:


Table 4: Comparison of compilers on Rosa, using bench_64
Compiler           Optimisation flags                                      Time (s)
PGI 10.6.0         -fastsse                                                143.7
Pathscale 3.2.99   -O3 -OPT:Ofast -OPT:early_intrinsics=ON -LNO:simd=2     139.8
gfortran 4.4.4     -O3 -ffast-math -funroll-loops -ftree-vectorize         136.1
crayftn 7.2.4      -O 2 -O ipa1                                            184.7
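
To make the relative differences in Table 4 easier to read, the short Python sketch below expresses each runtime as a percentage overhead relative to the fastest compiler. The timings are copied from Table 4; the function name relative_slowdown and the dictionary layout are illustrative choices only and are not part of CP2K or of the benchmarking setup used in the project.

```python
# Wall-clock times in seconds for the MPI-only code, copied from Table 4.
times = {
    "PGI 10.6.0": 143.7,
    "Pathscale 3.2.99": 139.8,
    "gfortran 4.4.4": 136.1,
    "crayftn 7.2.4": 184.7,
}


def relative_slowdown(times):
    """Return each compiler's extra runtime relative to the fastest, in percent.

    Hypothetical helper for illustration; the fastest compiler maps to 0.0.
    """
    best = min(times.values())
    return {name: 100.0 * (t / best - 1.0) for name, t in times.items()}


if __name__ == "__main__":
    slowdown = relative_slowdown(times)
    # List the compilers from fastest to slowest.
    for name in sorted(slowdown, key=slowdown.get):
        print(f"{name:<18} {times[name]:7.1f} s  +{slowdown[name]:.1f}%")
```

On these figures the Pathscale and PGI binaries take about 2.7% and 5.6% longer than the gfortran one, and the crayftn binary about 35.7% longer. These are single timings per compiler, so small differences should be read with some caution.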


Iain Bethune
2010-09-14