

Scaling plot

We also look at the scaling of NEMO from 128 to 1024 processors. Where possible, grids of equal dimension have been used. Where this was not possible, e.g. for 128 and 512 processors, the grid has been chosen to be as close to square as possible and such that $jpni < jpnj$, as this has been shown to yield the best performance.
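The grid selection rule described above can be sketched in a few lines of Python. This is only an illustration of the rule, not part of NEMO: the function name choose_decomposition is hypothetical, and the sketch assumes the only constraints are $jpni \times jpnj$ equal to the processor count and $jpni \le jpnj$.

```python
import math


def choose_decomposition(nproc):
    """Return (jpni, jpnj) with jpni * jpnj == nproc and jpni <= jpnj,
    chosen as close to square as possible.

    Hypothetical helper for illustration; it is not a NEMO routine.
    """
    if nproc < 1:
        raise ValueError("nproc must be a positive integer")
    # Search downwards from floor(sqrt(nproc)) for the largest divisor.
    # That divisor becomes jpni, so jpni <= jpnj always holds.
    for jpni in range(math.isqrt(nproc), 0, -1):
        if nproc % jpni == 0:
            return jpni, nproc // jpni


if __name__ == "__main__":
    for nproc in (128, 256, 512, 1024):
        jpni, jpnj = choose_decomposition(nproc)
        print(f"{nproc} = {jpni} x {jpnj}")
```

For the four processor counts considered here, this rule gives the same grids as those listed in the caption of Figure 5.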

Figure 5 shows the scaling of NEMO for both the PGI and PathScale compilers.

Figure 5: Scaling of NEMO for the PGI and PathScale compilers. The grid dimensions used for each processor count were: 128 = 8 x 16, 256 = 16 x 16, 512 = 16 x 32 and 1024 = 32 x 32.

Figure 5 shows that NEMO continues to scale out to 1024 processors, but the only benefit of using more processors is a reduction in runtime; the larger runs are not efficient in terms of AUs used. 128 or 256 processors appear to give the best compromise between AU use and runtime. As seen previously, the PGI compiler performs slightly better than the PathScale compiler at all processor counts tested.
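The trade-off between runtime and AU use can be made concrete with a small calculation. The sketch below assumes that the AU charge is proportional to the number of processors multiplied by the wall-clock time, which is an assumption and not something stated here. The function name scaling_summary is hypothetical, and the timings in the usage example are made-up placeholders, not values read from Figure 5.

```python
def scaling_summary(timings):
    """Summarise scaling relative to the smallest processor count.

    timings maps processor count -> wall-clock time (any consistent unit).
    Returns a list of (procs, speedup, efficiency, relative_cost) tuples,
    where relative_cost assumes the charge is proportional to procs * time.
    """
    base_p = min(timings)
    base_t = timings[base_p]
    rows = []
    for p in sorted(timings):
        t = timings[p]
        speedup = base_t / t                        # runtime reduction
        efficiency = speedup * base_p / p           # 1.0 means ideal scaling
        relative_cost = (p * t) / (base_p * base_t) # above 1.0 costs more
        rows.append((p, speedup, efficiency, relative_cost))
    return rows


if __name__ == "__main__":
    # Placeholder timings for illustration only; NOT measured values.
    example = {128: 100.0, 256: 60.0, 512: 40.0, 1024: 30.0}
    for p, s, e, c in scaling_summary(example):
        print(f"{p:5d} procs: speedup {s:.2f}, "
              f"efficiency {e:.2f}, relative cost {c:.2f}")
```

When the runtime falls by less than a factor of two for each doubling of the processor count, the efficiency drops below 1 and the relative cost rises above 1, which is the pattern described above.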

