
Parallel benchmark

For a parallel benchmark, we chose a larger molecular system, buckminsterfullerene (C$_{60}$). The calculations were run on HECToR (phase 2a) for 5 iterations of the Davidson solver, with the 16-core run taken as the baseline; see Figure 1. We used the prototype shared-memory extension coded by Chris Armstrong (NAG) under a core CSE call. With 4-way SMP, a parallel efficiency of approximately 80% could be achieved with 256 processing elements. Initial tests on HECToR phase 2b gave comparable calculation times only when using 4 cores per node (one per die), again with 4-way SMP.

Figure 1: Parallel speedup for the C$_{60}$ benchmark on HECToR phase 2a, relative to the 16-core baseline.
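
To make the quoted efficiency concrete, the following is a minimal sketch of how speedup and parallel efficiency relative to a non-unit baseline are conventionally defined. The function names are hypothetical and are not part of the benchmark scripts or of CASTEP. No timings are assumed; the only numbers used are the 16-core baseline, the 256 processing elements and the approximate 80% efficiency stated above.

```python
def speedup(t_base, t_n):
    """Speedup of a run taking t_n seconds relative to a baseline taking t_base."""
    return t_base / t_n


def parallel_efficiency(t_base, t_n, n_base, n):
    """Measured speedup divided by the ideal speedup n / n_base."""
    return speedup(t_base, t_n) * n_base / n


def implied_speedup(efficiency, n_base, n):
    """Invert the efficiency definition: the speedup a given efficiency implies."""
    return efficiency * n / n_base


# Ideal speedup from 16 to 256 processing elements is 256 / 16 = 16.
# An efficiency of roughly 80% therefore implies a speedup of about 12.8
# relative to the 16-core run.
print(implied_speedup(0.80, n_base=16, n=256))
```

Under these definitions, a run on 256 processing elements at about 80% efficiency is roughly 12.8 times faster than the 16-core baseline, against an ideal factor of 16.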

We plan to merge our TDDFT code with the main CASTEP branch in the near future. Minor modifications will be required to support band parallelism. Improvements to the parallel scaling can be expected to be in line with those for the ground-state calculation.



Dominik Jochym 2010-07-20