In summary, we have taken a number of CASTEP routines that relied on large numbers of point-to-point MPI communications and modified them to use MPI collectives where appropriate. For some parallel data distribution strategies, messages the size of a band or a block of bands are still sent, but the MPI receive call is now pre-posted where possible to avoid unexpected-buffer errors. A mechanism has been put in place to force pre-posting for large messages if requested. The user can also set, at run time, the size of the block of bands being sent.
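The block-of-bands messaging summarised above can be illustrated with a small planning sketch. It is not CASTEP code and uses no MPI: it only shows how a run-time block size splits a set of bands into messages, and how an optional "force pre-posting for large messages" switch might mark which receives should be posted before the matching send. All names (plan_band_messages, BlockMessage, block_size, force_prepost_large, large_message_bytes) and the byte threshold are hypothetical, chosen for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BlockMessage:
    first_band: int        # 0-based index of the first band in the block
    n_bands: int           # number of bands carried by this message
    n_bytes: int           # message size in bytes
    prepost_receive: bool  # True if the receive is flagged for pre-posting


def plan_band_messages(n_bands: int,
                       bytes_per_band: int,
                       block_size: int,
                       force_prepost_large: bool = False,
                       large_message_bytes: int = 0) -> List[BlockMessage]:
    """Split n_bands into blocks of at most block_size bands.

    Assumption (not from the report): a message counts as "large" when its
    size is at least large_message_bytes, and it is flagged for pre-posting
    only when force_prepost_large is set.
    """
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    if n_bands < 0 or bytes_per_band < 0:
        raise ValueError("n_bands and bytes_per_band must be non-negative")

    plan = []
    for first in range(0, n_bands, block_size):
        # The last block may be shorter than block_size.
        count = min(block_size, n_bands - first)
        size = count * bytes_per_band
        prepost = force_prepost_large and size >= large_message_bytes
        plan.append(BlockMessage(first, count, size, prepost))
    return plan
```

For example, 10 bands of 1000 bytes each with a block size of 4 give three messages of 4, 4 and 2 bands; with forcing enabled and a hypothetical 3500-byte threshold, only the two full blocks are flagged. In a real MPI code the flagged messages would correspond to receives posted with a nonblocking call such as MPI_Irecv before the sender starts.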

The above code modifications make certain CASTEP calculations more feasible on HECToR and other HPC environments. Specifically, checkpoints and restarts are more efficient, and restarts are now possible when a band-parallel data distribution is used. Classes of phonon calculations that can use thousands of cores are now feasible without having to adjust MPI environment variables in ways that take RAM away from the scientific computation. The optimisations also benefit the checkpoint and restart of time-dependent density-functional-theory (TDDFT) calculations (the subject of another CASTEP dCSE [4]), where each TDDFT eigenvector is a wavefunction-sized object.

These changes were initially made available to HECToR users as a beta version, and have subsequently been incorporated into the main CASTEP source code and released worldwide in CASTEP 6.0.