Next: Known issues remaining
Up: wave_write
Previous: wave_write
Contents
The initial implementation of the band-parallelism gave each node a
contiguous block of bands. Whilst this allowed greater optimisation on
each node, the load balancing was problematic. The two greatest
problems for load-balancing were:
- Construction of the charge density
The construction of the
valence charge density can be a time-consuming operation, but only
occupied bands contribute. If bands are distributed in contiguous
blocks, most nodes will contain only occupied bands, but a small
number of nodes will have a much larger proportion of conduction
bands; indeed, some nodes might have no occupied bands.
- EDFT unoccupied bands update
The EDFT algorithm requires the
unoccupied bands to be optimised separately from the occupied
bands. If the bands are assigned to nodes in contiguous blocks, only a
small number of nodes will have all of the conduction bands and the
rest will be idle in this phase.
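The imbalance described in the two points above can be made concrete with a small sketch. This is an illustration only, not code from the project: the function names, the 0-based band indexing, and the figures (16 bands, 4 nodes, 10 occupied bands) are all assumptions chosen for the example, with bands taken to be ordered by increasing energy.

```python
def contiguous_blocks(n_bands, n_nodes):
    """Assign bands to nodes in contiguous blocks (hypothetical helper).

    Returns one list of 0-based band indices per node. Any remainder is
    spread over the first few nodes, one extra band each.
    """
    base, extra = divmod(n_bands, n_nodes)
    blocks = []
    start = 0
    for node in range(n_nodes):
        size = base + (1 if node < extra else 0)
        blocks.append(list(range(start, start + size)))
        start += size
    return blocks


def occupied_per_node(blocks, n_occupied):
    """Count the occupied bands on each node, assuming the lowest
    n_occupied bands (indices 0 .. n_occupied - 1) are the occupied ones."""
    return [sum(1 for band in bands if band < n_occupied) for bands in blocks]


# Illustrative figures only: 16 bands, 4 nodes, 10 occupied bands.
blocks = contiguous_blocks(n_bands=16, n_nodes=4)
occupied = occupied_per_node(blocks, n_occupied=10)
unoccupied = [len(bands) - n_occ for bands, n_occ in zip(blocks, occupied)]

print(occupied)    # [4, 4, 2, 0]
print(unoccupied)  # [0, 0, 2, 4]
```

With these figures the last node contributes nothing to the charge density, while the first two nodes have no unoccupied bands to work on during the EDFT unoccupied bands update.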
There is a further issue with a distribution by contiguous blocks of
bands that is not related to load-balancing: traditional optimisation
methods are not band-local. This means that the nodes cannot optimise
their own local set of bands independently of the other nodes; in
particular, higher bands (i.e. more energetic bands) should not be
optimised until the lower bands have converged.
For all of these reasons, the decision was made to switch the
band-distribution to a round-robin scheme, whereby each of the N nodes
gets every Nth band. This improves the load-balancing greatly, and
also allows the existing optimisation algorithm to be used with few
changes. This distribution is not 'hard-wired', and it is trivial to
change to a different scheme.
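A round-robin assignment of this kind can be sketched in a few lines. The sketch below is not the project's implementation: the function names, the 0-based indexing, and the figures (16 bands, 4 nodes, 10 occupied bands) are assumptions made for illustration, with bands taken to be ordered by increasing energy.

```python
def round_robin(n_bands, n_nodes):
    """Assign bands to nodes round-robin (hypothetical helper).

    Node k holds bands k, k + n_nodes, k + 2 * n_nodes, ... (0-based).
    """
    return [list(range(node, n_bands, n_nodes)) for node in range(n_nodes)]


def band_owner(band, n_nodes):
    """Return the node that holds a given band under the scheme above."""
    return band % n_nodes


# Illustrative figures only: 16 bands, 4 nodes, 10 occupied bands
# (the occupied bands are taken to be indices 0 to 9).
n_occupied = 10
assignment = round_robin(n_bands=16, n_nodes=4)
occupied = [sum(1 for band in bands if band < n_occupied) for bands in assignment]
unoccupied = [len(bands) - n_occ for bands, n_occ in zip(assignment, occupied)]

print(assignment[0])  # [0, 4, 8, 12]
print(occupied)       # [3, 3, 2, 2]
print(unoccupied)     # [1, 1, 2, 2]
```

In this example every node holds both occupied and unoccupied bands, and the counts differ by at most one between nodes. Because the mapping is confined to one small function, replacing it with a different distribution would only require changing that function, which is in the spirit of the scheme not being hard-wired.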
It should be noted that the proposed optimisation scheme in Work
Package 3 (see chapter 6) of this project is
band-local, and the detail of the band-distribution may be revisited
in that stage of the project.
Sarfraz A Nadeem
2008-09-01