In summary, we have taken a number of CASTEP routines that relied on many
point-to-point MPI communications and modified them to use MPI collectives
where appropriate. For some parallel data distribution strategies, messages the size of
a band or block of bands are still sent, but the MPI receive call is pre-posted where possible
to avoid unexpected buffer errors. A mechanism has been put in place to force pre-posting for
large messages if requested. The user can also set the size of the block of bands being sent at run time.
The above code modifications make certain CASTEP calculations more feasible
on HECToR and other HPC environments. Specifically, checkpoints and restarts
are more efficient, and restarts are now possible when a band-parallel data distribution is used.
Classes of phonon calculations that can use thousands of cores are now feasible without having to tune MPI
environment variables in ways that take RAM away from the scientific computations. The
optimisations are also of benefit for the checkpoint and restart of
time-dependent density-functional-theory (TDDFT) calculations (the subject of another CASTEP dCSE [4]),
where each TDDFT eigenvector is a wavefunction-sized object.
These changes were initially made available to HECToR users as a beta version, and have subsequently been incorporated into the main CASTEP source code and released worldwide in CASTEP 6.0.
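
The effect of pre-posting receives and of sending bands in blocks, as summarised above, can be illustrated with a small toy model. This is only a sketch written in Python for brevity: it is not CASTEP code and does not call MPI. The names ToyReceiver and transfer_bands are hypothetical, and the "unexpected buffer" is modelled, as an assumption, as a simple byte budget that is consumed whenever a message arrives before its matching receive has been posted.

```python
class ToyReceiver:
    """Toy receiver with a bounded buffer for unexpected messages (not real MPI)."""

    def __init__(self, unexpected_capacity):
        self.unexpected_capacity = unexpected_capacity  # bytes, assumed budget
        self.unexpected = {}      # tag -> size of messages that arrived early
        self.posted = set()       # tags of receives posted and still waiting
        self.peak_unexpected = 0  # largest number of bytes held as unexpected
        self.delivered = []       # tags matched to a receive

    def post_receive(self, tag):
        if tag in self.unexpected:
            # The message was already buffered: match it and free the space.
            del self.unexpected[tag]
            self.delivered.append(tag)
        else:
            self.posted.add(tag)

    def message_arrives(self, tag, size):
        if tag in self.posted:
            # Receive was pre-posted: data goes straight to the user buffer.
            self.posted.remove(tag)
            self.delivered.append(tag)
            return
        used = sum(self.unexpected.values()) + size
        if used > self.unexpected_capacity:
            raise MemoryError("unexpected-message buffer exhausted")
        self.unexpected[tag] = size
        self.peak_unexpected = max(self.peak_unexpected, used)


def transfer_bands(n_bands, band_bytes, block_size, capacity, prepost):
    """Send n_bands in blocks of block_size bands; return (blocks delivered, peak bytes buffered)."""
    receiver = ToyReceiver(capacity)
    # Number of bands in each block; the last block may be smaller.
    blocks = [min(block_size, n_bands - start)
              for start in range(0, n_bands, block_size)]
    if prepost:
        for tag, n in enumerate(blocks):
            receiver.post_receive(tag)                   # receive posted first
            receiver.message_arrives(tag, n * band_bytes)
    else:
        for tag, n in enumerate(blocks):
            receiver.message_arrives(tag, n * band_bytes)  # all arrive early
        for tag in range(len(blocks)):
            receiver.post_receive(tag)
    return len(receiver.delivered), receiver.peak_unexpected
```

In this model, transfer_bands(10, 100, 4, 500, prepost=True) delivers all three blocks without using any of the unexpected-message budget, whereas the same transfer with prepost=False raises MemoryError once the second block arrives. A real MPI library behaves in a more complicated way (for example, large messages may use a rendezvous protocol), so the model only illustrates the general idea.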