Expensive Field Calculations within the Code

To compute the demagnetisation and exchange energies, which are the most computationally expensive parts of the MicroMag code, the stiffness matrices are determined at the start of the program and assigned to PETSc matrices. The matrix-vector products are shown below.

! now compute Hdemag
! Hd_x(i)=totfx(j)*YAB(ij) and similar for y and z
! the PETSc matrices are called PAB, PAC and PAD
! allocate Hdemag
CALL VecDuplicateVecsF90(totfx,ithree,hdemag,mpierr)
CALL MatMult(PAB,totfx,hdemag(1),mpierr)
CALL MatMult(PAC,totfx,hdemag(2),mpierr)
CALL MatMult(PAD,totfx,hdemag(3),mpierr)
! Compute hexch
! hexch_x = YA*m_x, and similarly for y and z [YA is the PETSc matrix PA]
CALL VecDuplicateVecsF90(totfx,ithree,hexch,mpierr)
CALL MatMult(PA,mxyz(1),hexch(1),mpierr)
CALL MatMult(PA,mxyz(2),hexch(2),mpierr)
CALL MatMult(PA,mxyz(3),hexch(3),mpierr)

The function calcdemagVec computes the sum of two scalar potentials. The first, $\phi_1$, receives contributions to the force vector (${\bf {f}}_{1}$) from the divergence of the magnetisation, $\nabla \cdot {\bf m}$, and from a Neumann boundary condition. A Krylov subspace solver then calculates $\phi_1$ by solving the linear system

\begin{displaymath}
Y_A \phi_1 = {\bf {f}}_{1}
\end{displaymath} (2)

where $Y_A$ is the finite element stiffness matrix determined during the set-up phase. The second potential, $\phi_2$, is created by an asymmetric contribution to the force vector ${\bf {f}}_{2}$, and is determined by a second Krylov subspace solution of:


\begin{displaymath}
Y_{AR} \phi_2 = {\bf {f}}_{2}
\end{displaymath} (3)

The solution for $\phi_2$ depends upon $\phi_1$, and the boundary conditions imposed here approximate an infinite volume; the resulting stiffness matrix $Y_{AR}$ is therefore asymmetric. The total magnetostatic potential is the sum of the two components, $\phi_1+\phi_2$, and the resulting demag field is the gradient of this potential, which is found using the spatial derivatives of the finite element basis functions.
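
A minimal sketch of how calcdemagVec might be built from PETSc's Krylov (KSP) solvers is given below. It is illustrative only: PA and PAR stand for the PETSc copies of $Y_A$ and $Y_{AR}$, while ksp1, ksp2, f1, f2, phi1, phi2, phitot, one and the gradient matrices GX, GY and GZ are placeholder names that do not appear in the MicroMag source. The KSPSetOperators call follows the current PETSc Fortran interface; older releases take an additional MatStructure argument.

! illustrative sketch only: solve Y_A phi_1 = f_1 and Y_AR phi_2 = f_2,
! then form the demag field from the gradient of phi_1 + phi_2
CALL KSPCreate(PETSC_COMM_WORLD,ksp1,mpierr)
CALL KSPSetOperators(ksp1,PA,PA,mpierr)
CALL KSPSetFromOptions(ksp1,mpierr)
CALL KSPSolve(ksp1,f1,phi1,mpierr)   ! phi_1 from the symmetric system (2)
CALL KSPCreate(PETSC_COMM_WORLD,ksp2,mpierr)
CALL KSPSetOperators(ksp2,PAR,PAR,mpierr)
CALL KSPSetFromOptions(ksp2,mpierr)
CALL KSPSolve(ksp2,f2,phi2,mpierr)   ! phi_2 from the asymmetric system (3)
! total potential phi = phi_1 + phi_2 (one is a PetscScalar equal to 1.0)
CALL VecWAXPY(phitot,one,phi1,phi2,mpierr)
! gradient of the potential applied through matrices GX, GY, GZ assembled
! from the derivatives of the basis functions
CALL MatMult(GX,phitot,hdemag(1),mpierr)
CALL MatMult(GY,phitot,hdemag(2),mpierr)
CALL MatMult(GZ,phitot,hdemag(3),mpierr)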

Unfortunately, the dCSE project ran out of time before the calcdemagVec function could be implemented. Whilst this was ultimately disappointing, significant progress had been made in taking a legacy serial code and developing a fully parallel code, which now has the potential for good parallel performance through a modern data-parallel library, PETSc. Several obstacles caused delays which prevented the project from completing on time. The critical issues are listed below:

  1. Starting from scratch with refactored code.
  2. Flawed parallelisation strategy.
  3. Bug in a third-party library.
  4. Short-term project lacked contingency time to compensate for delays.

The first issue meant that extra work had to be done before the planned work could start. This took extra time, but turned out to be a necessary step. The flawed strategy was a failure of the planning stage of the project; however, it was only when attempting the semi-parallel decomposition with the simpler refactored code that the flaw in the strategy became apparent. Working with the refactored code also made it possible to use the PETSc library. The bug in the PETSc-Sundials interface was unforeseen and took a significant amount of the remaining time to discover and report. The original proposal requested sixteen months to parallelise this code. The various delays, bug hunting in libraries and changes to the work plan eventually meant that only two months were available for developing parallel code with PETSc.

At the end of the dCSE project time the code ran in parallel, but lacked some of the necessary scientific functionality. At this point the status of the MicroMag code can be summarised as follows: after the pre-processing phase using METIS, the various stiffness matrices for the particular problem, the boundary conditions and the magnetic field are set up in serial; PETSc parallel data structures are then created and populated from the serial ones; the time-evolution loop is executed and the PETSc native EULER solver is called for each time-step; and the RHS function calculates the energies, but without the contribution to the demagnetisation from the total field.
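
As an illustration of the serial-to-parallel population step, the following sketch scatters a serial Fortran array into a distributed PETSc vector. The names vpar, hserial, idx, nlocal and nglobal are placeholders rather than MicroMag identifiers, and PETSc's zero-based global indexing is assumed; the matrices are populated in the same fashion with MatSetValues and MatAssemblyBegin/End.

! illustrative sketch only: copy a serial field array into a distributed
! PETSc vector (all names here are placeholders)
CALL VecCreateMPI(PETSC_COMM_WORLD,PETSC_DECIDE,nglobal,vpar,mpierr)
! idx holds the zero-based global indices of the nlocal entries set by
! this rank; off-process entries are communicated during assembly
CALL VecSetValues(vpar,nlocal,idx,hserial,INSERT_VALUES,mpierr)
CALL VecAssemblyBegin(vpar,mpierr)
CALL VecAssemblyEnd(vpar,mpierr)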

To complete the project, two tasks would require implementation. First, the PETSc data structure representing the asymmetric stiffness matrix $Y_{AR}$ would need to be populated from the serial MicroMag one, taking account of the row-column transformation. Second, the function calcdemagVec would need to be implemented, with calls to the Krylov subspace solvers in the PETSc library. Once these two tasks had been completed the code would be scientifically verifiable, if not especially efficient. On August 31 2010, the last day of the project, the PETSc developers issued a patch to fix the bug. This came too late to be included in the project, but the code could quite easily be changed to use the Sundials solver instead of the EULER solver. At this stage the code would be both scientifically correct and algorithmically efficient.
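
The first of these tasks could, for instance, follow the pattern sketched below, in which the non-zeros of the serial $Y_{AR}$ are inserted one at a time into a PETSc matrix. The names PAR, nnz, irow, jcol, yar, idxm, idxn, vals and ione are placeholders, and whether the row and column indices must be swapped on insertion depends on MicroMag's storage convention for the asymmetric matrix.

! illustrative sketch only: insert the serial non-zeros of Y_AR into the
! PETSc matrix PAR (all names here are placeholders; ione is a PetscInt
! equal to 1)
DO k = 1, nnz
   idxm(1) = irow(k) - 1   ! PETSc global indices are zero-based
   idxn(1) = jcol(k) - 1   ! swap idxm and idxn if the serial storage is transposed
   vals(1) = yar(k)
   CALL MatSetValues(PAR,ione,idxm,ione,idxn,vals,INSERT_VALUES,mpierr)
END DO
CALL MatAssemblyBegin(PAR,MAT_FINAL_ASSEMBLY,mpierr)
CALL MatAssemblyEnd(PAR,MAT_FINAL_ASSEMBLY,mpierr)

If the time integration is driven through PETSc's TS component, switching from the EULER solver to the Sundials solver should then amount to selecting a different integrator type, for example via the run-time option -ts_type sundials, provided the PETSc installation has been configured with Sundials support.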

Further development could also include setting up the PETSc data structures through the wrapper functions, so that the same code runs in both serial and parallel and the choice becomes a run-time option. Future development work would focus on the set-up phase. Firstly, the PETSc data structures for the stiffness matrices would be populated in parallel from the outset, without determining them in serial beforehand. Secondly, the ParMetis-PETSc interface would allow ParMetis to be called, so that the finite element mesh decomposition could be determined at run-time, eliminating the pre-processing phase.
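
A hedged sketch of how ParMetis might be invoked at run-time through PETSc's MatPartitioning interface is shown below. The adjacency matrix Adj would have to be built from the mesh connectivity beforehand (for example with MatCreateMPIAdj), the names part, Adj and newproc are placeholders rather than MicroMag identifiers, and PETSc must have been configured with ParMetis support.

! illustrative sketch only: call ParMetis through PETSc's MatPartitioning
! object; Adj is a parallel adjacency matrix built from the mesh
! connectivity and is not a MicroMag name
CALL MatPartitioningCreate(PETSC_COMM_WORLD,part,mpierr)
CALL MatPartitioningSetAdjacency(part,Adj,mpierr)
CALL MatPartitioningSetType(part,MATPARTITIONINGPARMETIS,mpierr)
! newproc is an index set giving the new owning process of each vertex
CALL MatPartitioningApply(part,newproc,mpierr)
CALL MatPartitioningDestroy(part,mpierr)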
