PEQUOD (the Parallel Quasi-Geostrophic Model) is a finite difference code for solving the multi-layer quasi-geostrophic equations in a rectangular domain, it is primarily intended for the study of mesoscale eddy processes. PEQUOD is parallelised for distributed memory machines using MPI and is written in modern Fortran 90/95, taking advantage of several more advanced features of the language (e.g. derived data types, pointers, dynamic memory allocation, recursive subroutines and function overloading). It is around 10,000 lines of code. The model employs a structured Cartesian grid which may be de-composed in either one or two dimensions - for each layer. This defines a fully structured partitioning of the problem, with each process being assigned a partition of the global domain.

The CABARET solution of the quasi-geostrophic potential vorticity equation is a purely explicit problem (no matrix inversion operations are required) however, differencing at partition boundaries does require information to be communicated from adjacent processes. This is achieved via the use of halo (or ghost) nodes along with non-blocking MPI_ISEND and MPI_IRECV to communicate data to each neighbouring partition. A cartesian MPI topology is also used to locate neighbouring partitions. As this section of the code drives a purely finite difference, nearest neighbour algorithm, scalability is linear and so will not be discussed.

Although the solution of the quasi-geostrophic potential vorticity equation is a purely explicit problem, the potential vorticity inversion step is not. This is an implicit problem where the Helmholtz problem must be inverted in order to obtain the stream function from the potential vorticity. Furthermore, the elliptic nature of the Helmholtz operator means that this is inherently a non-local procedure (information must, in a direct solver, be communicated).

This step uses a one-dimensional parallel Helmholtz solver, along with a combination of data transposing. The system is repartitioned (via communication) yielding a new partitioning that is contiguous in the meridional direction (first data transpose). This allows a discrete sine transform in the meridional direction to be performed locally on each process. The one-dimensional Helmholtz problems are then inverted in parallel using the Schumann and Strietzel algorithm [9]. The inverse discrete sine transform step is then performed locally. Finally, the system is repartitioned back to the original partitioning (second data transpose). Due to the requirement to communicate globally, derived data types of varying size – at the start of this project, the data transposes were implemented with MPI_ALLTOALLW.

Phil Ridley 2012-10-01