The performance of the distributed diagonaliser (PZHEEV) was compared to that
of the LAPACK routine ZHEEV for a range of matrix sizes.
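Table 5.1: Hermitian matrix diagonalisation times for the ScaLAPACK subroutine PZHEEV.

Time for various matrix sizes:

| Cores | 1200  | 1600  | 2000   | 2400   | 2800  | 3200 |
|-------|-------|-------|--------|--------|-------|------|
| 1     | 19.5s | 46.5s | 91.6s  | 162.7s |       |      |
| 2     | 28.3s | 65.9s | 134.6s |        |       |      |
| 4     | 15.8s | 38.2s | 54.7s  | 90.1s  |       |      |
| 8     | 7.9s  | 19.0s | 37.6s  | 63.9s  | 81.6s |      |
| 16    | 4.3s  | 10.5s | 20.3s  | 32.5s  | 76.2s |      |
| 32    | 2.7s  | 6.0s  | 11.6s  | 19.2s  | 43.1s |      |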
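
The timings in Table 5.1 can be turned into speedup and parallel efficiency figures. The following is a minimal sketch, not part of the original benchmarking: it uses only the 1200, 1600 and 2000 columns (the ones with an entry for every core count), takes the 1-core row as the baseline, and the function names `speedup` and `efficiency` are illustrative choices.

```python
# Timings in seconds, copied from Table 5.1 (PZHEEV), for the matrix
# sizes that have an entry for every core count.
TIMES = {
    1200: {1: 19.5, 2: 28.3, 4: 15.8, 8: 7.9, 16: 4.3, 32: 2.7},
    1600: {1: 46.5, 2: 65.9, 4: 38.2, 8: 19.0, 16: 10.5, 32: 6.0},
    2000: {1: 91.6, 2: 134.6, 4: 54.7, 8: 37.6, 16: 20.3, 32: 11.6},
}


def speedup(size, cores):
    """1-core time divided by the time on `cores` cores."""
    return TIMES[size][1] / TIMES[size][cores]


def efficiency(size, cores):
    """Speedup per core; 1.0 would be ideal scaling."""
    return speedup(size, cores) / cores


if __name__ == "__main__":
    for size in sorted(TIMES):
        for cores in sorted(TIMES[size]):
            print(f"N={size:5d} cores={cores:3d} "
                  f"speedup={speedup(size, cores):5.2f} "
                  f"efficiency={efficiency(size, cores):5.2f}")
```

With these numbers the 2-core runs come out slower than the 1-core runs (speedup below 1), and the speedup relative to one core stays well below the core count.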


An improved parallel matrix diagonalisation subroutine,
PZHEEVR^{5.1}, was made available
to us by Christof Vömel (Zurich) and Edward Smyth (NAG). This
subroutine consistently outperformed PZHEEV, as can be seen from
figure 5.1.
Figure 5.1: A graph showing the scaling of the parallel matrix
diagonalisers PZHEEV (solid lines with squares) and PZHEEVR (dashed
lines with diamonds) with matrix size, for various numbers of cores
(colour-coded).

The ScaLAPACK subroutines are based on a block-cyclic distribution,
which allows the data to be distributed in a general way rather than
just by row or column. The timings for different data distributions
for the PZHEEVR subroutine are given in table 5.2.
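Table 5.2: PZHEEVR matrix diagonalisation times for a 2200x2200 Hermitian
matrix distributed in various ways over 64 cores of HECToR.

| Cores used for distribution of rows | Cores used for distribution of columns | Time  |
|-------------------------------------|----------------------------------------|-------|
| 1                                   | 64                                     | 6.48s |
| 2                                   | 32                                     | 6.45s |
| 4                                   | 16                                     | 5.80s |
| 8                                   | 8                                      | 5.92s |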
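
To make the block-cyclic distribution concrete, the sketch below maps a global row or column index to the process that owns it and to its local index on that process, then reports how much of a 2200x2200 matrix lands on process (0,0) for each process grid in Table 5.2. It is an illustration only: indices are 0-based, the first block is assumed to sit on process 0, the block size of 32 is a hypothetical value (the block size used for the timings is not stated), and the function names are made up for this example.

```python
def owner_and_local(g, nb, nprocs):
    """Map a 0-based global index g to (owning process, 0-based local index)
    for a 1D block-cyclic distribution with block size nb over nprocs
    processes, assuming the first block lives on process 0."""
    block = g // nb                 # which global block g falls in
    proc = block % nprocs           # blocks are dealt out round-robin
    local = (block // nprocs) * nb + g % nb
    return proc, local


def local_count(n, nb, nprocs, proc):
    """Number of the n global indices that are stored on process proc."""
    return sum(1 for g in range(n)
               if owner_and_local(g, nb, nprocs)[0] == proc)


if __name__ == "__main__":
    N = 2200    # matrix size from Table 5.2
    NB = 32     # hypothetical block size, not given in the text
    # (process rows, process columns) for the 64-core grids in Table 5.2
    for prow, pcol in [(1, 64), (2, 32), (4, 16), (8, 8)]:
        rows = local_count(N, NB, prow, 0)
        cols = local_count(N, NB, pcol, 0)
        print(f"{prow:2d} x {pcol:2d} grid: process (0,0) holds "
              f"a {rows} x {cols} local matrix")
```

A 2D distribution applies the same 1D mapping independently to rows and to columns, which is why a single pair of grid dimensions is enough to describe each layout in the table.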

The computational time for diagonalisation of an $N \times N$ matrix
scales as $O(N^3)$, so we fitted a cubic of the form

$$t(N) = a + bN + cN^2 + dN^3 \qquad (5.1)$$

to these data for the 8-core runs. The results are shown in table
5.3. This cubic fit reinforces the empirical
evidence that the PZHEEVR subroutine has superior performance and
scaling with matrix size, since the cubic coefficient $d$ for PZHEEVR is
roughly 18% smaller than that of the usual PZHEEV subroutine.
Table 5.3: The best-fit cubic polynomials for the PZHEEV and PZHEEVR
matrix diagonalisation times for a range of Hermitian matrix sizes,
distributed over 8 cores of HECToR.

| Coefficient | PZHEEV      | PZHEEVR     |
|-------------|-------------|-------------|
| a           | 1.43547     | 0.492901    |
| b           | 0.00137909  | 0.00107718  |
| c           | 9.0013e-08  | 7.22616e-07 |
| d           | 4.31679e-09 | 3.53573e-09 |
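
The comparison of the cubic coefficients can be checked with a few lines of arithmetic. This sketch reads the `d` entries of Table 5.3 as 4.31679e-09 (PZHEEV) and 3.53573e-09 (PZHEEVR), which is an assumption about the exponent sign, and looks only at the cubic term of the fit; the helper names are illustrative.

```python
# Cubic coefficients d from Table 5.3, read with negative exponents
# (an assumption; the lower-order coefficients are not used here).
D_PZHEEV = 4.31679e-09
D_PZHEEVR = 3.53573e-09


def relative_reduction(old, new):
    """Fraction by which `new` is smaller than `old`."""
    return (old - new) / old


def cubic_term(d, n):
    """Contribution d * n**3 of the cubic term, in seconds, for size n."""
    return d * n ** 3


if __name__ == "__main__":
    print(f"cubic coefficient reduced by "
          f"{100 * relative_reduction(D_PZHEEV, D_PZHEEVR):.1f}%")
    for n in (1200, 2000, 2800):
        print(f"N={n}: cubic term {cubic_term(D_PZHEEV, n):6.2f}s (PZHEEV) "
              f"vs {cubic_term(D_PZHEEVR, n):6.2f}s (PZHEEVR)")
```

Because the cubic term grows fastest, the gap between the two routines is expected to widen in absolute terms as the matrix size increases.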

Sarfraz A Nadeem
2008-09-01