Improving Load Balancing and Parallel Partitioning in Fluidity

The Fluidity application is a non-hydrostatic, finite element/control volume CFD numerical forward model used in a number of scientific areas: geodynamics, ocean modelling, renewable energy, and geophysical fluid dynamics. Its applications cover a range of scales, from laboratory-scale problems through to whole-earth mantle simulations.

One of the unique features of Fluidity is its ability to adapt the computational mesh to the current simulated state: dynamic adaptive remeshing. When running on multiple processors, the adapted mesh must be load balanced, which involves migrating elements from processor to processor. The Zoltan library is a collection of data management services for unstructured, adaptive and dynamic applications, and can be used to perform such dynamic load balancing. This dCSE project describes the steps taken to integrate Zoltan within Fluidity. The addition of Zoltan extends the original functionality of Fluidity, and thereby increases the range of scientific areas to which it can be applied. Although there may be a small performance penalty compared with the original (customised) load balancing library, this will only be noticeable when the number of elements per process is small.
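The rebalancing step can be pictured with a small toy sketch (plain Python; this is not Fluidity or Zoltan code, and all names are illustrative): after an adapt, the element counts per process become uneven, and the load balancer must decide how many elements each overloaded process should migrate to each underloaded one.

```python
def plan_migrations(counts):
    """Given per-process element counts, return a list of
    (src, dst, n_elements) moves that evens out the load."""
    total = sum(counts)
    nprocs = len(counts)
    # Target counts differ by at most one element across processes.
    targets = [total // nprocs + (1 if p < total % nprocs else 0)
               for p in range(nprocs)]
    surplus = [c - t for c, t in zip(counts, targets)]
    senders = [(p, s) for p, s in enumerate(surplus) if s > 0]
    receivers = [(p, -s) for p, s in enumerate(surplus) if s < 0]
    moves, i, j = [], 0, 0
    # Greedily pair each overloaded process with underloaded ones.
    while i < len(senders) and j < len(receivers):
        src, give = senders[i]
        dst, need = receivers[j]
        n = min(give, need)
        moves.append((src, dst, n))
        senders[i] = (src, give - n)
        receivers[j] = (dst, need - n)
        if give - n == 0:
            i += 1
        if need - n == 0:
            j += 1
    return moves

# After an adapt, process 0 holds far more elements than the others.
print(plan_migrations([900, 100, 250, 250]))
# → [(0, 1, 275), (0, 2, 125), (0, 3, 125)]
```

A real partitioner such as Zoltan also accounts for which elements move (to keep partition surfaces, and hence communication, small), not just how many; this sketch only captures the counting side of the problem.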

Furthermore, incorporating Zoltan allows the parallel anisotropic mesh adaptivity and load balancing algorithms of Fluidity to be re-engineered. This improves the scaling behaviour of Fluidity and allows the adaptive remeshing algorithms to be used in parallel on any element pair, rather than being restricted to a single pair.

The achievements of this project are summarised below:

  • Zoltan has been installed as a module on HECToR and is available to all HECToR users. All partitioners (ParMETIS, PT-Scotch and Zoltan graph/hypergraph) are available when using the Zoltan module. A centrally installed copy of Fluidity using Zoltan is also available, and details of how to compile Fluidity on HECToR have been added to the Fluidity webpages.
  • At Fluidity revision 3530, Zoltan was made the default solution for the Fluidity trunk, development branches and releases. All tests run as part of the Fluidity buildbot system now use the Zoltan build and run successfully with Zoltan. The previous load balancing library, Sam, is now a compile-time option which may be enabled by configuring with the flag --with-sam. All options for Zoltan are fully documented and available through the Fluidity options package, diamond, with full details also given in the Fluidity manual.
  • For the backward-facing step HPC benchmark case, the Zoltan implementation showed a drop in performance on HECToR Phase 3 from 16 processors onwards; however, at this number of elements per core, and with this frequency of adapts, it is not surprising that the performance difference between Zoltan and Sam becomes more pronounced. The expected performance increase in both the assembly and the solve was not evident in this test case. Other tests may show better performance, but at this stage it is not clear which category of simulation would benefit, and only a limited number of test cases can be used to compare Zoltan and Sam.
  • By using Zoltan, the maintainability, extensibility and functionality of the repartitioning solution in Fluidity have been extended, allowing parallel periodic problems to be solved. Support for detectors has also been implemented, allowing them to be used within all parallel adaptive simulations. The greatest benefit of the Zoltan implementation is that it is general purpose: it allows any element type to be used in parallel adaptive simulations. This will enable Fluidity to be used for new science not previously possible, e.g. modelling froths and foams.
  • During the course of this project, software or hardware upgrades to HECToR caused Fluidity either to fail to compile or to fail to run. This often caused significant delays, so it was decided to set up a build test. A serial test job was developed which now runs once a day; it attempts to check out, configure and compile Fluidity, with the results transferred to AMCG systems. The data is then automatically processed and used to update the status of the HECToR build on the Fluidity buildbot system. This means that compilation failures of Fluidity on HECToR can be spotted earlier and fixed more quickly. Despite being outside the original plan, this work was essential and very beneficial.
  • Fluidity is an open-source project, so as well as benefiting all Fluidity users, this project will be of wider benefit to all those interested in dynamic load balancing and adaptive mesh methods. The inclusion of Zoltan, whilst not increasing performance as hoped, has vastly improved the maintainability and functionality of Fluidity on HECToR and other HPC architectures, opening up new areas of science for investigation.

The inclusion of Zoltan has also improved software sustainability for Fluidity, which will help prepare the code for petaflop systems such as those proposed by the PRACE project; combined with the new capabilities, this will enable further science to be carried out. Zoltan has also allowed Fluidity to be coupled with other models, such as atmospheric and ice-sheet models, a key part of future research, as these applications require non-standard discretisations. Finally, a recent development in Fluidity was the addition of Lagrangian particles: free-moving particles in the flow, used either as detectors or within agent-based modelling. These now need to be parallelised, which is a non-trivial task, but one that will be helped considerably by the Zoltan library (e.g. Rendezvous).

For more information about this work, please see the PDF or HTML versions of the report which summarises this project.