Test Case 3

The simulated system is PbO with 126 k-points. The original code slows down beyond 144 cores, so the only way to make use of more cores is the k-point parallelized code. The latter scales quite satisfactorily up to 1008 cores; at 1296 cores the run time increases again.
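To make the k-point decomposition concrete, the following minimal sketch computes how many k-groups fit into a given core count and how many of the 126 k-points each group would handle. It assumes k-groups of 144 cores each (the group size quoted for this test case) and that k-points are shared as evenly as possible among the groups; the function name `kpoint_layout` is hypothetical and is not part of VASP.

```python
import math

N_KPOINTS = 126    # k-points in Test Case 3 (from the text)
GROUP_CORES = 144  # cores per k-group (assumed fixed, as quoted for this test case)


def kpoint_layout(total_cores, n_kpoints=N_KPOINTS, group_cores=GROUP_CORES):
    """Return (number of k-groups, max k-points per group).

    Assumes equal-sized k-groups and an as-even-as-possible split of k-points.
    """
    if total_cores % group_cores:
        raise ValueError("total_cores must be a multiple of group_cores")
    kpar = total_cores // group_cores
    return kpar, math.ceil(n_kpoints / kpar)


if __name__ == "__main__":
    for cores in (288, 432, 864, 1008, 1296):
        kpar, per_group = kpoint_layout(cores)
        print(f"{cores:>5} cores: {kpar} k-groups, up to {per_group} k-points each")
```

Under these assumptions every core count used in this test case divides the 126 k-points evenly among the groups.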


Table: Scaling of Test Case 3.
Test Case 3    Cores    Time (secs)    Speedup
VASP 5.2.2       144        120.388      1
KPAR=2           288         69.208      1.740
KPAR=3           432         48.315      2.492
KPAR=6           864         30.234      3.982
KPAR=7          1008         27.402      4.393
KPAR=9          1296         40.443      2.976
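The Speedup column is the 144-core run time divided by the run time of each run. The short sketch below recomputes it from the timings in the table and also derives a parallel efficiency (speedup divided by the increase in core count). The efficiency is not reported in the original table and is added here purely for illustration; the name `scaling` is hypothetical.

```python
# Timings copied from the table above: (label, cores, seconds).
RUNS = [
    ("VASP 5.2.2", 144, 120.388),
    ("KPAR=2", 288, 69.208),
    ("KPAR=3", 432, 48.315),
    ("KPAR=6", 864, 30.234),
    ("KPAR=7", 1008, 27.402),
    ("KPAR=9", 1296, 40.443),
]


def scaling(runs):
    """Return (label, cores, speedup, efficiency) relative to the first run."""
    _, base_cores, base_time = runs[0]
    rows = []
    for label, cores, seconds in runs:
        speedup = base_time / seconds
        # Efficiency: achieved speedup over the factor by which cores grew.
        efficiency = speedup / (cores / base_cores)
        rows.append((label, cores, speedup, efficiency))
    return rows


if __name__ == "__main__":
    for label, cores, speedup, eff in scaling(RUNS):
        print(f"{label:<10} {cores:>5} {speedup:7.3f} {eff:7.1%}")
```

Read this way, the table shows the shortest run time at 1008 cores (KPAR=7), after which adding cores no longer helps.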



Table: Here the optimal NPAR for the original code is 18, 12, 18, 36 for 144, 288, 432, 1008 cores respectively. For the k-point parallelized code the optimal NPAR is 18 for any number of cores, provided each k-group consists of 144 cores.
Test Case 3    Cores    Time (secs)    Speedup
VASP 5.2.2       288        130.688      1
KPAR=2           288         69.208      1.888
VASP 5.2.2       432        122.236      1
KPAR=3           432         48.315      2.530
VASP 5.2.2       864        320.196      1
KPAR=6           864         30.234     10.591
VASP 5.2.2      1008        318.516      1
KPAR=7          1008         27.402     11.624
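In this second table each speedup is measured against the original code at the same core count, not against the 144-core baseline. A minimal sketch of that calculation, using the timings from the table (the name `same_core_speedups` is hypothetical):

```python
# (cores, original-code seconds, k-point-parallel seconds), copied from the table above.
PAIRS = [
    (288, 130.688, 69.208),
    (432, 122.236, 48.315),
    (864, 320.196, 30.234),
    (1008, 318.516, 27.402),
]


def same_core_speedups(pairs):
    """Return {cores: original time / k-point-parallel time}."""
    return {cores: t_orig / t_kpar for cores, t_orig, t_kpar in pairs}


if __name__ == "__main__":
    for cores, speedup in same_core_speedups(PAIRS).items():
        print(f"{cores:>5} cores: {speedup:.3f}x faster than the original code")
```

The ratios above 10 at 864 and 1008 cores come mostly from the original code getting slower at those core counts, as its timings in the table show.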


Figure 3: Speedup for Test Case 3 (where the speedup is taken to be 1 for 144 cores).
\includegraphics[angle=0,width=14cm]{TC_3.eps}

Asimina Maniopoulou 2011-07-09