The HECToR Service is now closed and has been superseded by ARCHER.

Enhancement of high-order CFD solvers for many-core architecture

This Distributed Computational Science and Engineering (dCSE) project aimed to improve the parallel implementation of two separate structured multi-block codes that are mainly used for aircraft engine design: the Block Overset Fast Flow Solver (BOFFS) and NEAT. Both codes are used to model complex flow scenarios, and each carries less overhead than an unstructured solver for what is already an expensive solution. BOFFS is used on HECToR to study a wide variety of gas turbine engine components, including high pressure turbine blades, low pressure turbines, rim seals and labyrinth seals. NEAT is also used to study a variety of complex flows and involves the use of Large Eddy Simulation (LES), Reynolds-averaged Navier-Stokes (RANS) and Direct Numerical Simulation (DNS).

The overall aims of this project were:

  • Implement a more flexible parallel data decomposition in BOFFS to improve scalability.
  • Implement hybrid parallelism within the most computationally intensive parts of NEAT, i.e. the tri-diagonal matrix algorithm (TDMA), the Gauss-Seidel (GS) scheme and subsidiary routines.
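
To illustrate what a flexible decomposition makes possible, the sketch below maps an arbitrary number of grid blocks onto a smaller number of processes. It is a minimal example under stated assumptions, not the BOFFS implementation: the greedy largest-block-first heuristic, the function names `assign_blocks` and `process_loads`, and the use of the cell count as the cost of a block are all hypothetical choices made for illustration.

```python
import heapq


def assign_blocks(block_sizes, n_procs):
    """Greedily map grid blocks to processes, largest block first.

    block_sizes: cell count of each block (hypothetical cost measure).
    n_procs:     number of processes available.
    Returns a list `owner` where owner[b] is the rank holding block b.
    """
    if n_procs < 1:
        raise ValueError("need at least one process")
    # Min-heap of (current load, rank): the least-loaded rank pops first.
    heap = [(0, rank) for rank in range(n_procs)]
    heapq.heapify(heap)
    owner = [None] * len(block_sizes)
    # Visit blocks from largest to smallest so big blocks are placed early.
    order = sorted(range(len(block_sizes)), key=lambda b: -block_sizes[b])
    for b in order:
        load, rank = heapq.heappop(heap)
        owner[b] = rank
        heapq.heappush(heap, (load + block_sizes[b], rank))
    return owner


def process_loads(block_sizes, owner, n_procs):
    """Total cell count held by each process under the mapping `owner`."""
    totals = [0] * n_procs
    for b, rank in enumerate(owner):
        totals[rank] += block_sizes[b]
    return totals
```

Calling `assign_blocks(sizes, n_procs)` with more blocks than processes places several blocks on the same rank, and `process_loads` then shows how evenly the cells are spread. A production code would also need to weigh the cost of communication between neighbouring blocks, which this sketch ignores.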

The individual achievements of the project are summarised below:

  • A flexible data decomposition was successfully implemented in BOFFS, which now allows an arbitrary number of blocks to be assigned to the same MPI process.
  • The new flexible decomposition in BOFFS was demonstrated for a subsonic jet LES with 50 million cells and 108 grid blocks.
  • Moving from a single block per core to multiple blocks per core reduced the run-time for 10 time iterations from 67.28s to 50.10s, while using nearly half the original number of cores.
  • A new threaded red-black ordering was implemented for the TDMA and GS routines in NEAT, along with improved memory access for the main computational loops (including those in the subsidiary routines).
  • A test case was performed with NEAT for an unsteady flow over a structured grid with 12,642,048 points and 128 blocks (i.e. 98,766 grid points per block); a 24% reduction in run-time was observed for the new code.
  • Reasonable scalability for up to 8 OpenMP threads is now achievable with NEAT in hybrid parallel mode.
  • The total run-time for the new version of NEAT is on average less than half that of the original code.
  • These developments will be used with BOFFS and NEAT for collaborative projects between Cambridge, Warwick and Cranfield Universities.
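
The red-black ordering mentioned above can be sketched on a model problem. The example below is an assumption-laden illustration, not code from NEAT: it applies red-black Gauss-Seidel to a 2D Poisson equation on a uniform grid with a five-point stencil, and the function name `red_black_gauss_seidel` and its arguments are hypothetical. It runs serially; the point is that every update of one colour reads only points of the other colour, so the loop over a single colour has no dependencies between iterations and could be divided among threads (for example with OpenMP in a compiled code).

```python
def red_black_gauss_seidel(u, f, h, sweeps):
    """In-place red-black Gauss-Seidel for -laplace(u) = f on a 2D grid.

    u: list of lists holding the solution; boundary values must be set.
    f: right-hand side, same shape as u.
    h: uniform grid spacing.
    sweeps: number of full (red then black) sweeps to perform.
    """
    n_i, n_j = len(u), len(u[0])
    for _ in range(sweeps):
        for colour in (0, 1):  # 0 = "red" points, 1 = "black" points
            # All points of one colour depend only on the other colour,
            # so this loop nest is safe to split among threads.
            for i in range(1, n_i - 1):
                # First interior j on this row with (i + j) % 2 == colour.
                j_start = 1 + (i + 1 + colour) % 2
                for j in range(j_start, n_j - 1, 2):
                    u[i][j] = 0.25 * (
                        u[i - 1][j] + u[i + 1][j]
                        + u[i][j - 1] + u[i][j + 1]
                        + h * h * f[i][j]
                    )
    return u
```

With `f` set to zero and boundary values taken from a linear function such as x + y, repeated sweeps drive the interior towards that same linear function, which gives a simple correctness check.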

Please see PDF or HTML for a report which summarises this project.