CASTEP
The aim of this Distributed Computational Science and Engineering (dCSE) project was to improve the scaling of the density functional theory code Castep to more than 1000 cores.
- Parallel efficiency for Castep now stands at around 42% on 1024 cores. This is roughly three and a half times the efficiency of the original Castep 4.2 at the same core count.
- Storage and workload of the dominant parts of a Castep calculation were split using basic band parallelism in addition to the existing parallelisation scheme.
- Matrix inversion and diagonalisation were parallelised; previously they were performed serially.
- The original band optimiser required frequent, expensive orthonormalisation steps, so it was replaced with a Band-Independent Optimiser.
- For comparison, the parallel efficiency of the original Castep 4.2 fell from 86% on 64 cores to around 12% on 1024 cores.
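
The efficiency figures above can be turned into implied speedups with a little arithmetic. The sketch below is illustrative only: it assumes the conventional definition of parallel efficiency as speedup divided by core count, measured against a single-core baseline, which this summary does not state. The function names are hypothetical and are not part of Castep.

```python
# Illustrative arithmetic on the efficiency figures quoted in this summary.
# Assumption (not stated in the summary): efficiency = speedup / cores,
# with speedup measured against a single-core baseline.

def implied_speedup(efficiency: float, cores: int) -> float:
    """Speedup implied by a given parallel efficiency on `cores` cores."""
    return efficiency * cores


def efficiency_gain(new_efficiency: float, old_efficiency: float) -> float:
    """Ratio of two efficiencies measured at the same core count."""
    return new_efficiency / old_efficiency


# (label, efficiency, cores), taken from the bullet points above.
figures = [
    ("Castep 4.2, 64 cores", 0.86, 64),
    ("Castep 4.2, 1024 cores", 0.12, 1024),
    ("Improved Castep, 1024 cores", 0.42, 1024),
]

for label, eff, cores in figures:
    print(f"{label}: implied speedup {implied_speedup(eff, cores):.1f}")

print(f"Efficiency gain at 1024 cores: {efficiency_gain(0.42, 0.12):.2f}x")
```

Under that assumption, the original code gained only a little over a factor of two in speed when going from 64 to 1024 cores, which is the poor scaling this project set out to address.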
Please see PDF or HTML for a report which summarises this project. More detailed versions of the report are also available: PDF, HTML. There is still scope for further optimisation building on this work, which could be implemented in a future dCSE project.