The HECToR Service is now closed and has been superseded by ARCHER.

Scaling the Nektar++ Spectral/hp element framework to large clusters

Nektar++ is a spectral/hp element framework designed for the numerical solution of partial differential equations using high-order discretisations. Nektar++ supports both continuous and discontinuous finite element methods and multiple element types, and runs on several platforms. The overall aims of Nektar++ are to develop: automatic tuning of optimal operator implementations based upon not only h and p, but also hardware considerations and mesh connectivity; temporal and spatial adaptivity; and high-order meshing techniques.

This project will develop Nektar++ so that a diverse range of large-scale engineering problems becomes more tractable with the application. This will be achieved by implementing more efficient parallel preconditioners that are tailored to high-order spectral/hp element methods, and by developing hybrid (MPI and multi-threaded) parallelisation.

On completion of this project, Nektar++ is capable of solving a broad range of real-world engineering problems in an efficient manner. The outcomes of the project are summarised below:

  • Low energy basis (block) preconditioning (LEBP) was implemented for the conjugate gradient (substructure) solvers.
  • The performance was benchmarked on the cx1 cluster at Imperial College using 72 cores. For the pressure solver, a 15x speedup was observed, compared with the original diagonal preconditioner. Similarly, around a 13.5x speedup was achieved for the Helmholtz solver.
  • In addition to the LEBP p preconditioning method, coarse space preconditioning (h preconditioning) was also implemented, and then combined with the LEBP via an additive Schwarz preconditioner.
  • The combined preconditioned conjugate gradient solver was benchmarked on the cx2 cluster at Imperial College. For a simulation of a rabbit aorta with the Navier-Stokes equations, good scaling was observed for up to 432 cores.
  • Threaded parallelism was implemented in an independent fashion by using an abstract thread manager, which could then be implemented by a concrete class; for example, p-threads were implemented through the Boost library.
  • The hybrid parallel code was tested on cx1 and showed good scaling. However, further investigation is needed to establish whether the threading approach outperforms the pure MPI approach.
  • These developments will be available in a future release of Nektar++.
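
To illustrate how a local preconditioner and a coarse space preconditioner can be combined additively inside a conjugate gradient solver, the following is a minimal Python sketch on a toy problem. It is not the Nektar++ implementation: the inverse diagonal stands in for the low energy basis preconditioner, the restriction matrix R (piecewise-constant aggregates of four unknowns) is an assumed coarse space, and all function names are hypothetical.

```python
import math

def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def solve_dense(A, b):
    """Gaussian elimination with partial pivoting (small coarse systems only)."""
    n = len(b)
    M = [list(row) + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def diagonal_preconditioner(A):
    """Local component: inverse diagonal (an assumed stand-in for LEBP)."""
    d = [A[i][i] for i in range(len(A))]
    return lambda r: [ri / di for ri, di in zip(r, d)]

def coarse_preconditioner(A, R):
    """Coarse component R^T (R A R^T)^{-1} R for a restriction matrix R."""
    A0 = [[dot(Ri, matvec(A, Rj)) for Rj in R] for Ri in R]
    def apply(r):
        y0 = solve_dense(A0, [dot(Ri, r) for Ri in R])
        return [sum(y * Ri[j] for y, Ri in zip(y0, R)) for j in range(len(r))]
    return apply

def additive(*components):
    """Additive combination: sum the action of every component on r."""
    return lambda r: [sum(vals) for vals in zip(*(c(r) for c in components))]

def pcg(A, b, precond, tol=1e-10, max_iter=200):
    """Preconditioned conjugate gradient for a symmetric positive definite A."""
    x = [0.0] * len(b)
    r = list(b)
    z = precond(r)
    p = list(z)
    rz = dot(r, z)
    b_norm = math.sqrt(dot(b, b))
    for k in range(1, max_iter + 1):
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            return x, k
        z = precond(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter

# Toy problem: 1D Laplacian with 16 unknowns, coarse space of 4 aggregates.
n, block = 16, 4
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
R = [[1.0 if j // block == i else 0.0 for j in range(n)]
     for i in range(n // block)]
b = [1.0] * n
M = additive(diagonal_preconditioner(A), coarse_preconditioner(A, R))
x, iterations = pcg(A, b, M)
```

Because the preconditioner is passed in as a function, the local and coarse components can be swapped or tested separately without touching the solver loop.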
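
The abstract thread manager idea can be sketched in the same spirit: numerical code talks only to an abstract interface, and a concrete class decides how the work is actually executed. The sketch below is in Python and uses the standard library thread pool purely as an assumed backend; the class and function names are hypothetical and do not reflect the Nektar++ or Boost interfaces. In Python, threads illustrate the structure rather than any speedup for this kind of arithmetic.

```python
from abc import ABC, abstractmethod
from concurrent.futures import ThreadPoolExecutor

class ThreadManager(ABC):
    """Abstract interface: the numerical code only depends on this class."""

    @abstractmethod
    def run(self, task, work_items):
        """Apply task to every work item and return results in input order."""

class SerialManager(ThreadManager):
    """Concrete manager with no threading, useful as a reference."""

    def run(self, task, work_items):
        return [task(item) for item in work_items]

class PoolThreadManager(ThreadManager):
    """Concrete manager backed by a standard library thread pool."""

    def __init__(self, num_threads):
        self.num_threads = num_threads

    def run(self, task, work_items):
        with ThreadPoolExecutor(max_workers=self.num_threads) as pool:
            return list(pool.map(task, work_items))

def element_energy(coeffs):
    """Hypothetical per-element kernel: sum of squared coefficients."""
    return sum(c * c for c in coeffs)

def total_energy(manager, elements):
    """Solver-side code: unaware of which concrete manager is in use."""
    return sum(manager.run(element_energy, elements))

elements = [[float(i + j) for j in range(4)] for i in range(8)]
serial_total = total_energy(SerialManager(), elements)
threaded_total = total_energy(PoolThreadManager(num_threads=4), elements)
```

Keeping the threading backend behind one interface means a different library could be substituted by adding another concrete class, with no change to the solver-side code.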

Please see the PDF for a report which summarises this project.