
Benchmarking the parallel versions

Some of the pre-processing was too large for HECToR, even when run in a small parallel queue, so most pre-processing was performed on Ness.

The domain was decomposed by setting the virtual process topology to npx=2, npy=2 and npz=1,2,4,..., using a simple decomposition method. This ensured that the load was balanced in the x-y plane at least.
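The topology described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names and the grid size `nz` are hypothetical and do not come from the benchmarked code, and the even block split shown here is one common "simple" decomposition, not necessarily the one used. It shows that fixing npx=2 and npy=2 while doubling npz gives core counts of 4, 8, 16, and so on.

```python
NPX, NPY = 2, 2  # fixed x-y topology, as stated in the text


def core_count(npz):
    """Total number of processes for the topology (NPX, NPY, npz)."""
    return NPX * NPY * npz


def block_sizes(n_points, n_procs):
    """Split n_points into n_procs contiguous blocks whose sizes differ by at most one.

    Assumption: a plain block decomposition; the source only says "simple".
    """
    base, extra = divmod(n_points, n_procs)
    return [base + 1 if rank < extra else base for rank in range(n_procs)]


if __name__ == "__main__":
    nz = 1000  # hypothetical number of grid points in z, for illustration only
    for npz in [1, 2, 4, 8]:
        print((NPX, NPY, npz), core_count(npz), block_sizes(nz, npz))
```

With npz running from 1 to 512 in powers of two, `core_count` covers the range 4 to 2048.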

The timing results for the three cases are presented graphically in figure 5 and listed in table 5, where we can see that the code scales well for all three cases. The optimum number of cores is 64, 256 and 512 for the coarse, medium and fine cases, respectively.

Figure 5: Time (secs) for the three 3D Jet Break Up cases, where jet_fine, jet_medium and jet_coarse are the times for the fine, medium and coarse cases, and jet_fine_linear, jet_medium_linear and jet_coarse_linear are their respective perfect scaling curves.
[Image: jet.png]
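A perfect scaling curve of the kind named in the caption can be sketched as follows. This is a minimal sketch and rests on an assumption: that each linear curve is anchored at the smallest core count measured for that case, which the text does not state. The coarse-case times are copied from table 5; the function names are hypothetical.

```python
# Measured times (secs) for the coarse case, copied from table 5.
COARSE = {4: 417.0, 8: 211.1, 16: 105.6, 32: 57.7, 64: 27.2, 128: 20.6, 256: 24.5}


def perfect_scaling(times):
    """Ideal time per core count, assuming linear scaling from the smallest run."""
    base_cores = min(times)
    base_time = times[base_cores]
    return {cores: base_time * base_cores / cores for cores in times}


def efficiency(times):
    """Ratio of ideal to measured time (1.0 means perfect scaling)."""
    ideal = perfect_scaling(times)
    return {cores: ideal[cores] / times[cores] for cores in times}


if __name__ == "__main__":
    ideal = perfect_scaling(COARSE)
    eff = efficiency(COARSE)
    for cores in sorted(COARSE):
        print(f"{cores:5d} {COARSE[cores]:8.1f} {ideal[cores]:8.1f} {eff[cores]:5.2f}")
```

The gap between the measured and ideal columns is what the plot shows as the distance between a timing curve and its linear counterpart.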


Table 5: Timing and performance results for 3D Jet Break Up

Number     Coarse Mesh      Regular Mesh      Fine Mesh
of cores   Time (Perf)      Time (Perf)       Time (Perf)
4          417.0 (-)        3630.5 (1.18)     - (-)
8          211.1 (1.98)     1804.6 (2.01)     - (-)
16         105.6 (2.00)     900.9 (2.00)      12931.4 (-)
32         57.7 (1.83)      445.4 (2.02)      6568.7 (1.97)
64         27.2 (2.12)      220.5 (2.02)      3354.1 (1.96)
128        20.6 (1.32)      114.7 (1.92)      1752.0 (1.91)
256        24.5 (0.84)      70.9 (1.62)       947.6 (1.85)
512        - (-)            51.6 (1.37)       546.8 (1.73)
1024       - (-)            - (-)             404.9 (1.35)
2048       - (-)            - (-)             894.9 (0.45)
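
The Perf values in table 5 are not defined in the text, but they appear to be the ratio of the time on half as many cores to the time on the current core count, so that 2.00 corresponds to perfect scaling on doubling. The Python sketch below checks that reading against the coarse-mesh column; the function name is hypothetical and the interpretation is an inference from the numbers, not a statement from the source.

```python
# Coarse-mesh times (secs) copied from table 5, keyed by number of cores.
COARSE = {4: 417.0, 8: 211.1, 16: 105.6, 32: 57.7, 64: 27.2, 128: 20.6, 256: 24.5}


def doubling_speedup(times):
    """Speed-up gained by doubling the core count: T(p/2) / T(p), to 2 decimals.

    Returns None where no run on half the cores is available.
    """
    result = {}
    for cores, time in times.items():
        previous = times.get(cores // 2)
        result[cores] = None if previous is None else round(previous / time, 2)
    return result


if __name__ == "__main__":
    for cores, perf in doubling_speedup(COARSE).items():
        print(cores, perf)
```

Under this reading, a value below 1 (as at 256 cores for the coarse mesh) means the run became slower when the core count was doubled. The 1.18 entry for the regular mesh on 4 cores has no preceding row in the table, so it cannot be reproduced this way.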


Gavin J Pringle
2010-04-16