Currently, the code exhibits poor strong scaling because of its I/O: input is performed serially on the master process, and output, although performed by every process, is serialised in a round-robin pattern. Fig. 1 shows the effect on overall performance compared with the performance of the solver alone.
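A simple timing model illustrates why serialised I/O caps strong scaling. The sketch below assumes that the I/O time stays constant as processes are added, while the solver time divides perfectly among them. The function name and the timing values are hypothetical and are not measurements from the code.

```python
def modelled_speedup(t_io, t_solve, nprocs):
    """Speedup over one process when I/O time is fixed and only the solver scales.

    Assumption: t_io does not shrink with nprocs (serial input on the master,
    round-robin output), and the solver scales ideally as t_solve / nprocs.
    """
    t_one_process = t_io + t_solve
    t_parallel = t_io + t_solve / nprocs
    return t_one_process / t_parallel


if __name__ == "__main__":
    # Illustrative values only: 5 time units of I/O, 95 of solver work.
    t_io, t_solve = 5.0, 95.0
    for nprocs in (1, 8, 64, 512):
        print(nprocs, round(modelled_speedup(t_io, t_solve, nprocs), 2))
    # Under this model the speedup can never exceed (t_io + t_solve) / t_io.
    print("upper bound:", (t_io + t_solve) / t_io)
```

Under these assumptions the speedup approaches, but never reaches, the ratio of total time to I/O time, however many processes are used.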
With respect to the purely serial case (1 MPI process), the maximum speedup is limited to 22. The theoretical scalability limit for this code is the number of z-planes in the problem, as the decomposition is one-dimensional. Throughout this report, we use the larger of the two test cases supplied by Prof. Fagan's group, the High res model, which has 884 z-planes comprising 29 million elements.
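The limit imposed by the one-dimensional decomposition can be sketched as follows. The example assumes a simple block distribution of whole z-planes over the processes. The distribution scheme actually used by the code is not described here, so the function names and the scheme itself are illustrative.

```python
def planes_per_process(n_planes, nprocs):
    """Block-distribute n_planes whole z-planes over nprocs ranks.

    Returns a list with the number of planes owned by each rank. Ranks beyond
    n_planes receive zero planes (assumed scheme, for illustration only).
    """
    base, extra = divmod(n_planes, nprocs)
    return [base + 1 if rank < extra else base for rank in range(nprocs)]


def ideal_speedup(n_planes, nprocs):
    """Best-case solver speedup: the most heavily loaded rank sets the pace."""
    return n_planes / max(planes_per_process(n_planes, nprocs))


if __name__ == "__main__":
    n_planes = 884  # z-planes in the High res model
    for nprocs in (64, 442, 884, 1024):
        counts = planes_per_process(n_planes, nprocs)
        idle = counts.count(0)
        print(nprocs, max(counts), idle, round(ideal_speedup(n_planes, nprocs), 2))
```

Once the process count reaches the number of z-planes, each active process holds a single plane, and any additional processes are left idle, so the ideal speedup stops increasing.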
For backwards compatibility, it is desirable to be able to reuse existing data files; for that reason, all serial I/O routines are preserved. In addition, converter utilities are provided to convert the old ASCII data files to and from the new format.