The HECToR Service is now closed and has been superseded by ARCHER.

CP2K

This Distributed Computational Science and Engineering (dCSE) project aimed to improve the performance of the Density Functional Theory code CP2K, in particular by addressing its MPI communications and load balancing.

  • Overall, the dCSE work gave a speedup of 30% on 256 cores for a generally representative benchmark case. An even greater speedup of 300% on 1024 cores can be achieved for larger test cases. This exceeds the aims set out in the original project proposal.

The individual achievements of the project are summarised below:

  • Compiled, tested and evaluated the performance of CP2K with the main compilers on HECToR and demonstrated that Pathscale was the fastest.
  • Optimised the Realspace to Planewave transfer routines. Halo swaps now take only half of their original time, which gives an overall speedup of 12% on 512 cores for the bench_64 test case (64 water molecules in a 0.12nm periodic cubic cell).
  • Improved the performance of the Fast Fourier Transform routines by storing reusable data and eliminating redundant MPI collective operations, giving a further speedup of 12% for the bench_64 test case.
  • Improved load balancing for non-homogeneous systems by rank reordering. Results show a speedup of 25% on 128 cores and 18% on 2048 cores.
  • Detailed investigation of CP2K from a performance standpoint highlighted several areas for further work.
  • These improvements target the performance-critical regions which are central to the code. They are now available to CP2K users worldwide via CVS and are included in the installed versions of CP2K on HECToR and HPCx.
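
The idea behind load balancing by rank reordering can be illustrated with a small sketch. This summary does not describe the scheme actually implemented in CP2K, so the following is only a generic illustration under stated assumptions: each rank carries a fixed amount of work plus one movable work item, and the items are reassigned so that the lightest item goes to the rank with the heaviest fixed work. The names reorder_ranks, max_load, fixed_load and movable_load are hypothetical and do not come from the CP2K source.

```python
# Illustrative sketch only: NOT the CP2K implementation.
# Assumption: every rank has a fixed workload, plus one movable work item
# whose assignment to ranks can be permuted freely.

def reorder_ranks(fixed_load, movable_load):
    """Return perm, where perm[r] is the index of the movable item given to rank r.

    The lightest movable item is paired with the rank that has the heaviest
    fixed load, the second lightest with the second heaviest, and so on.
    """
    n = len(fixed_load)
    if len(movable_load) != n:
        raise ValueError("need exactly one movable item per rank")

    # Ranks ordered from heaviest to lightest fixed load.
    ranks_heaviest_first = sorted(range(n), key=lambda r: fixed_load[r], reverse=True)
    # Movable items ordered from lightest to heaviest.
    items_lightest_first = sorted(range(n), key=lambda i: movable_load[i])

    perm = [0] * n
    for rank, item in zip(ranks_heaviest_first, items_lightest_first):
        perm[rank] = item
    return perm


def max_load(fixed_load, movable_load, perm):
    """Total work on the most heavily loaded rank under the assignment perm."""
    return max(f + movable_load[perm[r]] for r, f in enumerate(fixed_load))


# Made-up loads for a non-homogeneous system on four ranks (arbitrary units).
fixed_load = [8, 1, 1, 2]
movable_load = [6, 1, 2, 3]

identity = list(range(len(fixed_load)))
before = max_load(fixed_load, movable_load, identity)
perm = reorder_ranks(fixed_load, movable_load)
after = max_load(fixed_load, movable_load, perm)
```

With these made-up loads, the most heavily loaded rank drops from 14 units of work under the original ordering to 9 units after reordering. Since every rank has to wait for the slowest one, lowering this maximum is what shortens the run.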

Please see PDF or HTML for a report which summarises this project.