Summary of work

During this second part of the project, a number of work packages were proposed; they are summarised below, together with an explanation of the new coupler design and how it addresses the deliverables. It should be noted that the new design facilitated many of these tasks.

Work Package 1.1: Load balancing of the coupled application, by appropriate division of compute resources between the various algorithms.
The new coupler design (MPMD) ensures that all the scalability characteristics of the client applications are retained. As discussed earlier, the scalability of $\mathcal{T}rans\mathcal{F}low$ has been demonstrated up to $10^3$ cores on HECToR phase 3 with 90% parallel efficiency. Achieving this scalability required synchronising the communication pattern in $\mathcal{T}rans\mathcal{F}low$. In addition, effort was dedicated to improving the parallel performance of $\mathcal{S}treamMD$, ensuring scalability figures commensurate with established packages such as LAMMPS. The parallelism was also extended to cover the latest features of the code (originally available only in the serial release), such as long-chain molecules.

Work Package 1.2: Implementation of non-blocking MPI communication in the coupler and performance assessment.
As is the case with most MPI-based development, the initial effort used blocking communication to ensure synchronous execution and to ease debugging. In the final release of the coupler, however, all MPI calls are non-blocking. Extensive verification of the coupler module confirmed the accuracy of the non-blocking calls. Further optimisation of this feature depends on the client application: each client must be analysed to identify suitable tasks that can be performed during asynchronous data exchanges.

Work Package 2.1: Implementation of mixed-mode operation (OpenMP + MPI) for massively parallel computations.
The coupler library is an MPI-based utility; mixed-mode operation was therefore intended for the client applications. $\mathcal{T}rans\mathcal{F}low$ has been adapted to fully exploit mixed-mode parallelism, with MPI as the main, or default, mode. Thread-safe, shared-memory parallelism was extensively tested and validated against MPI-only benchmarks. No major change in performance was observed, since our simulations are compute-intensive without significant communication bottlenecks. It is expected, however, that performance benefits would emerge in massively parallel computations on future architectures with large numbers of cores operating in shared-memory mode.

Work Package 2.2: Implementation of "MD farming", where MD instances are generated at different locations in the continuum.
The current design of the coupler was specifically engineered to be a multi-purpose, or general-purpose, coupling utility. This choice is a major benefit to the research and HECToR community, and farming is a straightforward example. Since the coupler operates in Multiple Program Multiple Data (MPMD) mode, any external programs can interface via the coupler; in particular, any number of MD instances can interface with the continuum solver. At this stage, a farming example has not been verified, since priority was given to re-engineering the coupler into a general multi-purpose utility, performing extensive performance and accuracy checks, and implementing error reporting and documentation. In the near term, effort will be dedicated to exploiting the coupler for massively parallel simulations of physical problems of interest, in order to maximise the scientific impact of this software development.
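Because the coupler operates in MPMD mode, farming additional MD instances is, in principle, a matter of the launch command: each colon-separated segment starts a separate program with its own share of the cores. The executable names and core counts below are illustrative only, not taken from the actual build:

```shell
# Hypothetical MPMD launch: one continuum-solver instance alongside
# two independent MD instances, each on its own block of cores.
mpiexec -n 256 ./transflow : -n 512 ./streammd : -n 512 ./streammd
```

Adding further MD instances at other continuum locations would simply append further segments to this line.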