The HECToR Service is now closed and has been superseded by ARCHER.

CITCOM

This Distributed Computational Science and Engineering (dCSE) project aimed to improve the performance of the geodynamic thermal convection code CITCOM for more efficient use of the code on HECToR. Improved parallel performance and scalability were achieved by implementing multigrid methods to improve the rate of convergence. CITCOM is an MPI-parallel finite element code written in C.

  • As a result of this dCSE work, CITCOM performs over 31% faster with the V-cycle multigrid scheme and over 38% faster with the W-cycle multigrid scheme for the simple 2D test problem, and over 12% faster with the W-cycle multigrid scheme for the complex 3D test problem. Further details on performance and scaling are given below.

The aims of this project are summarised below:

  • Improving the originally implemented multigrid methods.
  • Implementing further multigrid cycles, such as the V-cycle, W-cycle, and F-cycle or full multigrid (FMG) cycle (see the sketch after this list).
  • Implementing local mesh refinement near regions of high viscosity gradient.
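
The cycles named above differ only in how many times the coarse-grid correction is visited at each level. The following is a minimal structural sketch, not CITCOM's actual code: the Level type and the helper functions (smooth, restrict_residual, prolongate_and_correct, coarse_solve) are hypothetical and only declared here, so the sketch shows the recursion pattern rather than a complete solver. With gamma = 1 the recursion traces a V-cycle; with gamma = 2 it traces a W-cycle.

    /* Hypothetical sketch of one recursive multigrid cycle.
     * gamma = 1 -> V-cycle, gamma = 2 -> W-cycle.
     * Level and the helpers below are illustrative, not CITCOM's API. */
    typedef struct Level Level;

    void smooth(Level *lv, int sweeps);                /* relaxation, e.g. Gauss-Seidel sweeps */
    void restrict_residual(Level *fine, Level *coarse);/* transfer residual to the coarser grid */
    void prolongate_and_correct(Level *coarse, Level *fine); /* interpolate correction back up */
    void coarse_solve(Level *lv);                      /* solve on the coarsest grid */

    void mg_cycle(Level *levels, int lvl, int coarsest, int gamma)
    {
        if (lvl == coarsest) {
            coarse_solve(&levels[lvl]);
            return;
        }
        smooth(&levels[lvl], 2);                       /* pre-smoothing */
        restrict_residual(&levels[lvl], &levels[lvl + 1]);
        for (int i = 0; i < gamma; i++)                /* 1 coarse visit = V, 2 visits = W */
            mg_cycle(levels, lvl + 1, coarsest, gamma);
        prolongate_and_correct(&levels[lvl + 1], &levels[lvl]);
        smooth(&levels[lvl], 2);                       /* post-smoothing */
    }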

The individual findings and achievements of the project are summarised below:

  • An analysis of four multigrid schemes, namely the V-cycle, W-cycle, FMG(V) cycle and FMG(W) cycle, was performed in this project.
  • For relatively simple, fast-converging problems the multigrid V-cycle proves to be the fastest scheme.
  • Full multigrid (FMG) schemes perform well, in contrast to the corresponding V-cycle and W-cycle multigrid schemes, for complex, hard-to-solve problems; in these cases the V-cycle and W-cycle schemes might fail to converge (an FMG sketch is given after this list).
  • V-cycle based multigrid schemes generally perform better than the corresponding W-cycle based multigrid schemes.
  • The multigrid V-cycle scheme is the optimal choice for relatively simple, easy-to-solve problems, and the FMG(V) scheme is the optimal choice for relatively complex, hard-to-solve problems.
  • For complex three-dimensional problems under tougher conditions, the FMG(W) scheme is the optimal choice.
  • Scaling of all multigrid schemes is generally excellent, particularly if the sub-problem size per MPI process is not reduced to just two elements in any direction.
  • Using one or two cores per node instead of all four may slightly affect scaling; that is, using all four cores per node gives the best scaling. This is encouraging in the context of the efficient use of multi-core configurations.
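
The FMG(V) and FMG(W) schemes referred to above nest the V- or W-cycle inside a full multigrid sweep that builds its own initial guess. The sketch below builds on the hypothetical mg_cycle above and is again illustrative only; prolongate_solution and the level numbering (level 0 finest) are assumptions, not CITCOM's actual routines. It solves on the coarsest grid first, interpolates that solution to each finer level in turn as the starting guess, and then runs V-cycles (gamma = 1, FMG(V)) or W-cycles (gamma = 2, FMG(W)) on that level.

    /* Hypothetical full multigrid (FMG) driver built on the mg_cycle sketch above. */
    void prolongate_solution(Level *coarse, Level *fine);  /* interpolate the full coarse solution */

    void fmg_solve(Level *levels, int finest, int coarsest,
                   int gamma, int cycles_per_level)
    {
        coarse_solve(&levels[coarsest]);                   /* exact solve on the coarsest grid */
        for (int lvl = coarsest - 1; lvl >= finest; lvl--) {
            prolongate_solution(&levels[lvl + 1], &levels[lvl]);  /* initial guess on finer level */
            for (int c = 0; c < cycles_per_level; c++)     /* gamma = 1: FMG(V), gamma = 2: FMG(W) */
                mg_cycle(levels, lvl, coarsest, gamma);
        }
    }

Starting each level with an interpolated coarse solution, rather than from scratch, is what gives the FMG variants their robustness on the harder problems noted above.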

Please see the PDF or HTML version of the report which summarises this project.