The HECToR Service is now closed and has been superseded by ARCHER.

Developing NEMO for Large Multi-core Scalar Systems

This Distributed Computational Science and Engineering (dCSE) project will develop the NEMO (Nucleus for a European Model of the Ocean) ocean modelling code. NEMO is of great strategic importance for the UK and European oceanographic communities. Although NEMO has been used successfully for a number of years in global and ocean basin applications, its use as a shelf-sea model is less well developed.

The shelf-sea modelling partners of the UK marine research programme (Oceans 2025) are transitioning from the Proudman Oceanographic Laboratory Coastal Ocean Modelling System (POLCOMS) to NEMO for modelling shelf seas. This will align the shelf-seas modelling work with open-ocean modelling, co-ordinate effort with the Met Office and address issues of ocean-shelf coupling. However, NEMO was designed for vector architectures rather than distributed architectures such as HECToR. Significant work is therefore required to bring NEMO to bear on Grand Challenge problems in multi-scale oceanography, particularly in the algorithms used for shallow-sea problems, in the performance of the code and in its scalability to many thousands of cores on modern architectures.

The aim of this project is to address the performance and scalability concerns by drawing on techniques previously proven in POLCOMS and applying them to NEMO. The overall work may be summarised as: i) array index re-ordering, in which the layout of all arrays is changed so that the level index comes first, and the associated loop nests are permuted to match. The new ordering is referred to as z-first and the existing ordering as z-last; ii) multi-core aware partitioning and halo exchange optimisation for large, multi-core systems, exemplified by HECToR. This work involves implementing recursive k-section partitioning, which has proved very successful in POLCOMS.
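The difference between the two orderings can be illustrated with a small sketch. It is not taken from the NEMO source: it only assumes Fortran-style column-major storage, where the first index varies fastest in memory, and the function names and grid sizes are hypothetical. It shows that the distance in memory between vertically adjacent points shrinks from ni * nj elements (z-last) to one element (z-first), and that a loop nest permuted to run over k innermost walks a z-first array contiguously.

```python
# Illustrative sketch only; names and sizes are hypothetical, not NEMO's.
# Column-major (Fortran-style) storage: the FIRST index varies fastest.

def offset_z_last(i, j, k, ni, nj, nk):
    """Memory offset of a(i, j, k) in an array of shape (ni, nj, nk)."""
    return i + ni * (j + nj * k)


def offset_z_first(i, j, k, ni, nj, nk):
    """Memory offset of a(k, i, j) in an array of shape (nk, ni, nj)."""
    return k + nk * (i + ni * j)


def vertical_stride(offset, ni, nj, nk):
    """Elements separating level k from level k + 1 at a fixed (i, j)."""
    return offset(0, 0, 1, ni, nj, nk) - offset(0, 0, 0, ni, nj, nk)


def z_first_traversal(ni, nj, nk):
    """Offsets visited by the permuted loop nest: j outer, i middle, k inner."""
    return [offset_z_first(i, j, k, ni, nj, nk)
            for j in range(nj)
            for i in range(ni)
            for k in range(nk)]


ni, nj, nk = 4, 3, 5  # hypothetical sub-domain size
stride_z_last = vertical_stride(offset_z_last, ni, nj, nk)    # ni * nj
stride_z_first = vertical_stride(offset_z_first, ni, nj, nk)  # 1
print(stride_z_last, stride_z_first)
```

A unit stride in the vertical is what makes column-wise operations, such as a tri-diagonal solve over the levels, friendlier to the cache in the z-first layout.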

The individual achievements of the project are summarised below:

  • The work on the array index re-ordering and loop-nest optimisations highlighted a bug in the PGI compiler. This was discovered when updating the 3D arrays near the I/O layer. Furthermore, improved cache utilisation for the tri-diagonal solve in the vertical dimension was implemented. This has enabled an increase of approximately 3% in the single-core performance for the z-last version for the GYRE (no land) deep ocean test case, together with slightly improved scalability (because the more efficient halo exchanges are more cache friendly).
  • A bug was resolved in the NEMO bathymetry-smoothing routine, zgr_sco(). This was discovered when using certain process counts. The bug was logged in the NEMO issues tracker and the fix was reported to the UK Met Office. A workaround was also required for the smoothing algorithm, which sometimes caused wet coastal points to become dry and vice versa.
  • Recursive k-section (multi-core aware) partitioning was implemented for the domain decomposition. In addition to the GYRE (no land) test case, a case involving land was required to test representative performance, and the AMM12 test case was chosen for this. Overall, due to a reduction in the time spent in halo swaps, the new code performs slightly better than the original for up to 128 MPI processes. However, NEMO 3.3.1 includes computation for dry (land) points as well as those in the ocean domain, and this causes a load imbalance. Further performance improvement is expected when the NEMO source can be updated to include loop-level avoidance of calculations on the dry (land) points that remain after partitioning into sub-domains.
  • On phase 3 of HECToR, it became difficult to debug NEMO with TotalView. A bug report was submitted to the TotalView developers about the application crashing when used with NEMO on HECToR.
  • A performance overhead from using REWIND on HECToR with the Lustre file system was identified. The solution now gives a general 10% saving in NEMO production work on HECToR.
  • The developments from this project are in a branch of the NEMO source and will also be used to inform strategic decisions about future NEMO developments.
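
The idea behind recursive k-section partitioning can be sketched as follows. This is a simplified illustration under stated assumptions, not the POLCOMS or NEMO implementation: it assumes the process count is split into its prime factors, that each factor k cuts every current box into k strips across its longer side, and that cuts are placed to balance the number of wet (ocean) points. It makes no attempt to model the multi-core aware mapping of sub-domains to nodes or the halo exchanges, and all names are hypothetical.

```python
# Simplified sketch of recursive k-section partitioning of a land/sea mask.
# mask[j][i] is 1 for a wet (ocean) point and 0 for a dry (land) point.
# A box is (i0, i1, j0, j1) with half-open index ranges.

def prime_factors(n):
    """Prime factors of n, largest first (for example 12 -> [3, 2, 2])."""
    factors, p = [], 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return sorted(factors, reverse=True)


def wet_count(mask, box):
    """Number of wet points inside a box."""
    i0, i1, j0, j1 = box
    return sum(mask[j][i] for j in range(j0, j1) for i in range(i0, i1))


def k_section(mask, box, k):
    """Cut a box into k strips across its longer side, balancing wet points."""
    i0, i1, j0, j1 = box
    along_i = (i1 - i0) >= (j1 - j0)
    lo, hi = (i0, i1) if along_i else (j0, j1)
    if hi - lo < k:
        raise ValueError("box is too narrow to cut into k strips")
    total = wet_count(mask, box)
    cuts, running = [lo], 0
    for c in range(lo, hi):
        line = (c, c + 1, j0, j1) if along_i else (i0, i1, c, c + 1)
        running += wet_count(mask, line)
        if len(cuts) < k:
            # Cut once the running total reaches the next equal share ...
            reached_share = running * k >= total * len(cuts)
            # ... or when too few lines remain to place the other cuts.
            must_cut = hi - 2 - c < k - len(cuts)
            if reached_share or must_cut:
                cuts.append(c + 1)
    cuts.append(hi)
    return [(a, b, j0, j1) if along_i else (i0, i1, a, b)
            for a, b in zip(cuts, cuts[1:])]


def partition(mask, nparts):
    """Apply k-section once per prime factor of nparts."""
    boxes = [(0, len(mask[0]), 0, len(mask))]
    for k in prime_factors(nparts):
        boxes = [sub for box in boxes for sub in k_section(mask, box, k)]
    return boxes


# Hypothetical 12 x 4 domain whose four left-most columns are land.
mask = [[0] * 4 + [1] * 8 for _ in range(4)]
boxes = partition(mask, 4)
wet_per_box = [wet_count(mask, b) for b in boxes]
print(boxes, wet_per_box)
```

In this toy domain a regular 2 x 2 decomposition would leave 4 wet points in each of the two left-hand sub-domains and 12 in each of the right-hand ones, whereas the sketch gives every sub-domain 8 wet points. Dry points that remain inside a sub-domain are still computed on unless the loops skip them, which is the load-imbalance issue noted above.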

Please see PDF or HTML for a report which summarises this project.