### The SLEPc solution stage

With the Hamiltonian stored as a PETSc mat object performing the diagonalization with SLEPc is a simple matter of performing calls to various SLEPc routines:

```!Create eigensolver context
call EPSCreate(scomm,eps,ierr)
!Set operators. In this case, it is a standard eigenvalue problem
call EPSSetOperators(eps,ham,PETSC_NULL_OBJECT,ierr)
call EPSSetProblemType(eps,EPS_HEP,ierr)
call EPSSetTolerances(eps,tol,PETSC_DECIDE,ierr)
call EPSSetDimensions(eps,nstat_32,PETSC_DECIDE,PETSC_DECIDE,ierr)
!Solve the eigensystem
call EPSSolve(eps,ierr)
call EPSGetIterationNumber(eps,its,ierr)

!Get some information from the solver
call EPSGetType(eps,tname,ierr)
call EPSGetDimensions(eps,nev,sncv,smpd,ierr)
call EPSGetTolerances(eps,tol,maxit,ierr)

!Get the number of converged eigenpairs
call EPSGetConverged(eps,nconv,ierr)

do i = 0_shortint,nconv-1_shortint
!Get the converged eigenpairs: i-th eigenvalue is stored in kr
call EPSGetEigenPair(eps,i,kr,ki,Vr,PETSC_NULL_OBJECT,ierr)
call VecScatterCreateToZero(Vr,vscat,vout,ierr)
call VecScatterBegin(vscat,Vr,vout,INSERT_VALUES,SCATTER_FORWARD,ierr)
call VecScatterEnd(vscat,Vr,vout,INSERT_VALUES,SCATTER_FORWARD,ierr)
call VecScatterDestroy(vscat,ierr)
if (srank == master) then
print*, kr + dtnuc(1)
if (i == 0_shortint) allocate(hmt(nocsf*nconv),eig(nocsf),dgem(nocsf))
eig(i+1) = kr
call VecGetArrayF90(vout,xx,ierr)
do j = 1, nkeep
hmt(j+(i*nkeep)) = xx(j)
end do
call VecRestoreArrayF90(vout,xx,ierr)
end if
end do
```

The results are in excellent agreement with serial code - one of the requirements of the code maintainers was that they have one version of the code to look after. This meant that any modifications to the code had to be done in a way that the same code could still be used on a desktop machine with no access to PETSc/SLEPc and that the output was written to disk as fortran binary files since SCATCI is just one step in the suite of UKRMol and the output would need to be used in UKRMol-out. The serial code writes unformatted/binary sequential fortran files whilst PETSc is c based and so uses a different approach to binary which is not directly readable from Fortran. Unfortunately this approach does have a negative impact on the parallel performance of the code but there are plans in place to feed the output of SCATCI directly into PFARM in UKRMol-out which should bypass this issue.

Tests have been carried out for water and Phosphate (the DNA backbone is sugar molecules linked by phosphate groups). Phosphate has a Hamiltonian matrix order = 122102, with non-zero elements. Figure  shows the performance of the diagonalization stage compared to the serial and OpenMP versions of SCATCI.

Figure 8: A comparison of the performance of the diagonalization stage.

Finally to be able to solve the largest target jobs the UKRMol suite is required to us 64 bit integers for array indexing. This is problematic since we only have 32 bit PETSc/SLEPc on HECToR. There is a 64 bit MPI available with PGI but that is then incompatible with PETSC_COMM_WORLD. Fortunately the parallel partitioning of the data meant that it was possible to cast all integers to 32 bit when interacting with PETSc/SLEPc/MPI.

Paul Roberts 2012-06-01