The HECToR Service is now closed and has been superseded by ARCHER.

Further Improving NEMO In Shallow Seas (FINISS)

This Distributed Computational Science and Engineering (dCSE) project developed the NEMO (Nucleus for a European Model of the Ocean) ocean modelling code. NEMO is of great strategic importance for the UK and European oceanographic communities. Although NEMO has been used successfully for a number of years in global and ocean basin applications, its use as a shelf-sea model is less well developed. This was the third dCSE project for developing NEMO on HECToR.

The overall aims of this project were:

  • Reduce the bandwidth requirements of 3-dimensional halo exchanges in NEMO by eliminating field values beneath the seabed from the halo messages.
  • Develop a tool for off-line generation of "grid-partition" maps, and add an option to NEMO to load grid-partition maps at run-time.
  • Determine the deepest level in each processor's sub-domain, and then redevelop the code in NEMO's standard z-last ordering to restrict the outer loops over the vertical dimension so that levels entirely beneath the seabed are not traversed.
  • Improve the load balancing in the z-first ordering, so that loops are performed only for the active levels at each grid location, thereby eliminating the redundant computations on land and beneath the seabed that remain after partitioning into sub-domains.
  • Benchmark using both deep ocean and shallow-sea test cases.
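
The first and third aims can be illustrated with a minimal sketch. It is written in Python for brevity and is not taken from the NEMO source; the names `pack_halo`, `unpack_halo`, `deepest_level` and `wet_levels` are hypothetical. It assumes that each water column has a known, static count of active (wet) levels, so that sender and receiver can both derive the message layout from the bathymetry.

```python
def pack_halo(field, wet_levels):
    """Pack a halo strip, skipping values beneath the seabed.

    field[i][k]   : value in halo column i at level k (k = 0 is the surface)
    wet_levels[i] : number of active levels in column i (0 means land)
    """
    buffer = []
    for column, n_wet in zip(field, wet_levels):
        # Only the first n_wet levels of a column lie above the seabed.
        buffer.extend(column[:n_wet])
    return buffer


def unpack_halo(buffer, wet_levels, n_levels, fill=0.0):
    """Rebuild the full halo strip, padding levels beneath the seabed."""
    field = []
    pos = 0
    for n_wet in wet_levels:
        column = buffer[pos:pos + n_wet] + [fill] * (n_levels - n_wet)
        field.append(column)
        pos += n_wet
    return field


def deepest_level(wet_levels):
    """Number of levels an outer vertical loop must visit in a sub-domain."""
    return max(wet_levels, default=0)


# Three halo columns on a three-level grid: full depth, shallow, and land.
field = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
wet_levels = [3, 1, 0]
message = pack_halo(field, wet_levels)
restored = unpack_halo(message, wet_levels, n_levels=3)
```

In this example the message carries four values rather than nine. An outer loop over levels in the z-last ordering would run to `deepest_level(wet_levels)` rather than to the full number of levels, which only saves work when the deepest column in the sub-domain is shallower than the grid.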

The individual achievements of the project are summarised below:

  • Eliminating field values beneath the seabed from halo messages was successful, but resulted in negligible performance improvement on the AMM12 test case, which covers the seas around the British Isles.
  • Restricting loops to the deepest level in a sub-domain did not eliminate enough layers to yield a significant improvement, due to the vertical co-ordinate scheme used in AMM12.
  • Looping over active levels in each sub-domain in the z-last ordering was implemented in some twenty-five source files that collectively account for more than 90% of the run-time (excluding I/O). This is sufficient to extrapolate the effect of "dry-point" elimination to 100% coverage; the load balance at 100% coverage is estimated to be 90% or better.
  • The z-first ordering was found to give a modest performance improvement on HECToR. This is due to slightly improved cache reuse, and is despite the fact that the important operation of tridiagonal solves in the vertical dimension does not vectorise in the z-first ordering.
  • The potential to reduce the communications cost of 2-dimensional solves was identified, and this is an optimisation that would benefit NEMO in either index-ordering.
  • All code changes have been incorporated back into a development branch of the main NEMO source repository.
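
The load-balance figure quoted above can be made concrete with a short sketch. The project summary does not state how load balance was measured, so the definition used here (mean work divided by maximum work across sub-domains) is an assumption, as are the function names and the toy data. Work is counted as the number of active grid points, which is the quantity that matters once loops skip dry points.

```python
def active_points(wet_levels):
    """Work in one sub-domain when loops visit only active levels."""
    return sum(wet_levels)


def load_balance(work_per_subdomain):
    """Assumed metric: mean work divided by maximum work (1.0 is perfect)."""
    peak = max(work_per_subdomain)
    if peak == 0:
        return 1.0  # no work anywhere, so nothing is out of balance
    mean = sum(work_per_subdomain) / len(work_per_subdomain)
    return mean / peak


# Hypothetical partition: three equal-sized sub-domains of four columns each
# on a three-level grid; each entry is the active-level count of one column.
subdomains = [
    [3, 3, 0, 0],  # half land
    [3, 2, 1, 0],  # shelf sloping up to the coast
    [3, 3, 3, 3],  # open water at full depth
]
work = [active_points(levels) for levels in subdomains]
balance = load_balance(work)
```

Here the sub-domains hold the same number of grid points but unequal numbers of active points, so the balance under this metric is well below 1.0. A partition that equalises active points rather than grid points would score higher.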

Please see PDF or HTML for a report which summarises this project.