Source code improvements
The modernisation of the legacy Fortran 77 source code to Fortran 95
followed the standard steps described in the literature [4]. The work
in this area included: replacing all common blocks with module
variables, replacing include statements with use statements,
eliminating implicit typing and implicit interfaces, and replacing the
static arrays with allocatable versions.
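The steps above can be illustrated with a minimal sketch. The module, variable and subroutine names below are hypothetical and are not taken from the actual code; the sketch only assumes a common block holding one statically sized array, which becomes a module variable that is allocated at run time.

```fortran
! Hypothetical replacement for a Fortran 77 common block such as
!     common /flow/ u(nx, ny, nz)
! that used to be pulled in through an include file.
module flow_data
  implicit none                       ! no implicit typing
  private
  public :: u, init_flow_data, finalise_flow_data

  integer, parameter :: wp = kind(1.0d0)   ! working precision (assumed double)
  real(wp), allocatable :: u(:,:,:)        ! formerly a static array

contains

  ! Allocate and zero the module data once the grid size is known.
  subroutine init_flow_data(nx, ny, nz)
    integer, intent(in) :: nx, ny, nz
    integer :: ierr
    allocate(u(nx, ny, nz), stat=ierr)
    if (ierr /= 0) stop "init_flow_data: allocation failed"
    u = 0.0_wp
  end subroutine init_flow_data

  ! Release the module data at the end of the run.
  subroutine finalise_flow_data()
    if (allocated(u)) deallocate(u)
  end subroutine finalise_flow_data

end module flow_data
```

Because the subroutines live in a module, every caller that uses the module gets an explicit interface, so the compiler can check the argument lists.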
The Fortran 77 subroutines were grouped according to their
functionality and wrapped in Fortran 95 modules, one module per file.
A brief description of the new modules follows:
- constant_and_kinds.f90: contains basic mathematical and code
constants, data kind definitions for Fortran and MPI,
- global_data.f90: contains global data used across the source
code; the required entities are selected via the construct
use global_data, only : ...,
- commonvdrop.f90: stores data for the initial perturbation,
- basic_functions.f90: contains mathematical functions used to
compute physical properties across the other modules,
- plume.f90: contains the driver subroutine of the simulation,
i.e. it controls the main time loop and associated
procedures,
- pade_data.f90: contains parameters used in the computation of the space
derivatives,
- deriv3d_v601.f90: contains subroutines that compute the
derivatives in the x, y, z directions for the flow variables,
- main.f90: initialises MPI, reads the input file, calls the
initialisation subroutines of the modules that need data
allocation/initialisation, calls the driver subroutine and
performs the finalisation operations,
- rhs3d.f90: contains the subroutines that compute, at each grid
point, the flow variables and their derivatives, together with the
boundary conditions and thermodynamic functions needed for time
integration,
- record3d.f90: contains the subroutines that collect observation
data during computation,
- restart.f90: contains the former common blocks used in restart
subroutines and the interfaces for the implementation of MPI and
Fortran IO restart algorithms,
- debug_tools.f90: contains subroutines that can be used to log
warning or error messages to a single file using the shared file
pointer provided by MPI-IO.
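The kind definitions and the use ..., only construct mentioned in the list above can be sketched as follows. The module and function names are hypothetical stand-ins, not the contents of the actual files, and the MPI kind definitions are left out so that the sketch compiles without an MPI library.

```fortran
! Hypothetical stand-in for a constants-and-kinds module.
module kinds_sketch
  implicit none
  ! Real kind with at least 15 significant digits (assumed working precision).
  integer, parameter :: wp = selected_real_kind(15, 307)
  real(wp), parameter :: pi = 3.141592653589793238_wp
end module kinds_sketch

! A consumer module imports only the entities it needs.
module geometry_sketch
  use kinds_sketch, only : wp, pi
  implicit none
  private
  public :: circle_area

contains

  ! Area of a circle of radius r.
  function circle_area(r) result(a)
    real(wp), intent(in) :: r
    real(wp) :: a
    a = pi * r * r
  end function circle_area

end module geometry_sketch
```

Restricting the import with only keeps the dependencies of each module visible at the top of the file and avoids accidental name clashes with other global data.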
During benchmarking and profiling it was noticed that a couple of
subroutines of the core solver made very poor use of the cache
memory. This problem was corrected by reordering some of the
nested loops and moving IF blocks out of some of the other
loops. As a result the performance of the RHS sector of the code
improved by approximately 20%, see Table 1.
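The two kinds of change can be illustrated with a generic sketch. This is not the actual solver code: the module, the subroutine names and the scaling operation are hypothetical. Fortran stores arrays in column-major order, so the innermost loop should run over the first index, and a test that does not change inside the loop can be made once outside it.

```fortran
module loop_order_sketch
  implicit none
  integer, parameter :: wp = kind(1.0d0)   ! assumed working precision

contains

  ! Before: the innermost loop runs over the last index, so consecutive
  ! iterations are far apart in memory, and the loop-invariant IF is
  ! evaluated at every grid point.
  subroutine scale_slow(a, b, factor, use_factor)
    real(wp), intent(in)  :: a(:,:,:)
    real(wp), intent(out) :: b(:,:,:)
    real(wp), intent(in)  :: factor
    logical,  intent(in)  :: use_factor
    integer :: i, j, k
    do i = 1, size(a, 1)
      do j = 1, size(a, 2)
        do k = 1, size(a, 3)
          if (use_factor) then
            b(i,j,k) = factor * a(i,j,k)
          else
            b(i,j,k) = a(i,j,k)
          end if
        end do
      end do
    end do
  end subroutine scale_slow

  ! After: the innermost loop runs over the first index (unit stride)
  ! and the IF is tested once, outside the loop nest.
  subroutine scale_fast(a, b, factor, use_factor)
    real(wp), intent(in)  :: a(:,:,:)
    real(wp), intent(out) :: b(:,:,:)
    real(wp), intent(in)  :: factor
    logical,  intent(in)  :: use_factor
    integer :: i, j, k
    if (use_factor) then
      do k = 1, size(a, 3)
        do j = 1, size(a, 2)
          do i = 1, size(a, 1)
            b(i,j,k) = factor * a(i,j,k)
          end do
        end do
      end do
    else
      do k = 1, size(a, 3)
        do j = 1, size(a, 2)
          do i = 1, size(a, 1)
            b(i,j,k) = a(i,j,k)
          end do
        end do
      end do
    end if
  end subroutine scale_fast

end module loop_order_sketch
```

Both subroutines produce the same result; only the memory access pattern and the number of branch evaluations differ. How much is gained depends on the array sizes, the compiler and the machine.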
Some other source code upgrades and reorganisations are listed below:
- Quasi-identical versions of some derivative subroutines were reduced to
one version using the optional argument feature available to module
subroutines.
- The directory containing the source code was split into three
directories: src for the source code, utils for
post-processing programs and decomp_2d for the 2DECOMP library
source.
- The Makefile was updated to allow compilation with different compilers,
in optimised or debug mode, and a full list of dependencies was created.
- User documentation for the changes has been provided in a README
file.
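The first item in the list above, merging near-identical subroutines through an optional argument, can be sketched as follows. The example is hypothetical: it uses a simple central-difference derivative rather than the scheme of the actual code, and the names deriv1d and periodic are assumptions made for the illustration.

```fortran
module deriv_sketch
  implicit none
  integer, parameter :: wp = kind(1.0d0)   ! assumed working precision

contains

  ! First derivative of f on a uniform grid of spacing h.
  ! One routine covers two former variants: if the optional argument
  ! periodic is present and true the end points wrap around, otherwise
  ! one-sided differences are used at the end points.
  ! Assumes size(f) >= 2 and size(df) == size(f).
  subroutine deriv1d(f, df, h, periodic)
    real(wp), intent(in)  :: f(:)
    real(wp), intent(out) :: df(:)
    real(wp), intent(in)  :: h
    logical,  intent(in), optional :: periodic
    logical :: wrap
    integer :: i, n

    ! An absent optional argument must not be referenced, so copy it
    ! into a local variable with a default value first.
    wrap = .false.
    if (present(periodic)) wrap = periodic

    n = size(f)
    do i = 2, n - 1
      df(i) = (f(i+1) - f(i-1)) / (2.0_wp * h)
    end do

    if (wrap) then
      df(1) = (f(2) - f(n))   / (2.0_wp * h)
      df(n) = (f(1) - f(n-1)) / (2.0_wp * h)
    else
      df(1) = (f(2) - f(1))   / h
      df(n) = (f(n) - f(n-1)) / h
    end if
  end subroutine deriv1d

end module deriv_sketch
```

Optional arguments require an explicit interface, which module subroutines provide automatically; existing calls such as call deriv1d(f, df, h) keep working, while the periodic variant is selected with call deriv1d(f, df, h, periodic=.true.).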
The new code was compiled with the PGI, GNU and Cray compilers using
both debug and optimised flags. All executables were tested on several
MPI processor topologies and on grids with different aspect ratios.
Lucian Anton 2011-09-13