We chose the Pathscale 3.0 binary, compiled with the recip_malloc_inline flags (see section 3.2.3) and linked against Cray's Libsci 10.2.13.2 and FFTW3, as our baseline, since it appeared to offer the best performance with the Castep 4.2 codebase.
[Figure: Execution time]

[Figure: CPU time for Castep on 256 cores]

[Figure: CPU time for Castep on 512 cores]

[Figure: CPU time spent applying the Hamiltonian in Castep]

[Figure: CPU time spent preconditioning the search direction in Castep]