## Implementation of a Divide and Conquer Strategy for the Materials Modelling Code CRYSTAL

The CRYSTAL ab initio materials modelling code is developed by the Computational Science and Engineering Department at STFC and the theoretical chemistry group at Turin University (Italy). It computes the electronic structure of a material and, from this, many material properties. The one-electron wavefunctions in CRYSTAL are stored as linear combinations of localised Gaussian functions, which enables CRYSTAL to evaluate the electronic structure efficiently for aperiodic systems and for systems that are periodic in one, two or three dimensions. A significant advantage of using a local basis set is that it enables relatively inexpensive use of the Hartree-Fock (HF) method and hybrid density functional theory (DFT).

The CRYSTAL code is particularly important for UK materials chemists, and its functionality and performance continue to be improved. As the size and complexity of the systems of interest increase, it is necessary to continue developing CRYSTAL to make best use of high performance computing (HPC) resources, such as HECToR.

CRYSTAL can be run in parallel using either a replicated data strategy, with the program PCRYSTAL, or a distributed data strategy, with the program MPP CRYSTAL. MPP CRYSTAL has been shown to scale up to a few thousand processors for systems with 10,000 basis functions or more. However, the scaling is, unsurprisingly, very system dependent. The time taken by the code is dominated by two sections: the Gaussian integrals and the Hamiltonian matrix diagonalization. The former scales all but perfectly but has marked problems due to replicated memory; the latter scales not nearly so well but is fully distributed in memory. Thus high symmetry systems, which emphasize the diagonalization (which in the distributed memory parallel code does not take advantage of symmetry), scale poorly, while low symmetry ones, where the integrals dominate, scale much better.

The overall aim of this project was to implement a divide and conquer (D and C) strategy for non-metallic systems, using a method similar to those used in classical molecular dynamics codes. The whole system is broken down into subsystems, each comprising a core and an outer region. These subsystems are then solved using existing methodology, except for the addition of an external potential, i.e. an atom based multipole expansion of the rest of the system. A parallel solver for the subsystems is also implemented. For the method as a whole, the subsystems are coupled by means of a global Fermi energy; after the multipole expansions are updated, the process is repeated to convergence.
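
The coupling through a global Fermi energy can be illustrated with a minimal sketch. It is not taken from CRYSTAL: the function names, the closed-shell occupation of two electrons per level, the finite smearing `kT`, and the per-level `core_weight` (the fraction of each subsystem eigenstate attributed to its core region) are all assumptions made for illustration. Given the eigenvalues gathered from every subsystem, a single chemical potential is found by bisection so that the weighted occupations sum to the total electron count.

```python
import math


def fermi_occupation(energy, mu, kT):
    """Fermi-Dirac occupation of one level, arranged to avoid overflow."""
    x = (energy - mu) / kT
    if x > 0.0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))


def electron_count(subsystems, mu, kT):
    """Electrons summed over all subsystems for a trial Fermi energy mu.

    Each subsystem is a list of (eigenvalue, core_weight) pairs; every
    level holds up to two electrons (closed shell, an assumption).
    """
    return sum(
        2.0 * weight * fermi_occupation(energy, mu, kT)
        for levels in subsystems
        for energy, weight in levels
    )


def global_fermi_energy(subsystems, n_electrons, kT=0.01, tol=1e-10):
    """Bisect for the mu at which the weighted count equals n_electrons."""
    energies = [energy for levels in subsystems for energy, _ in levels]
    lo = min(energies) - 50.0 * kT  # essentially no level occupied
    hi = max(energies) + 50.0 * kT  # essentially every level occupied
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if electron_count(subsystems, mid, kT) < n_electrons:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Bisection is used here only because the electron count increases monotonically with the trial Fermi energy; for a gapped system at small `kT` any value inside the gap satisfies the count, so the returned value is one representative of that interval.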

In summary, the overall aims of this project were to:

- Implement an O(N) divide and conquer algorithm for use within the CRYSTAL code.
- Validate MPP CRYSTAL working within a task farming harness.
- Develop an automatic decomposition of a system into subsystems and distribution of work so that subsystems are run independently within task-farmed instances of MPP CRYSTAL.
- Implement the communication of eigenvectors and eigenvalues to determine a global Fermi energy and from this reconstruct a global density matrix.
- Decompose the density matrix as a multipole expansion and embed the subsystems within this multipole expansion in order to include long range electrostatics.
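
To make the last aim concrete, the sketch below shows how a distant group of charges can be replaced by a few multipole moments. It is a generic single-centre expansion of point charges, truncated at the quadrupole, and is not CRYSTAL's atom based formulation; the function names and the unit convention (potential of a charge `q` at distance `r` equal to `q / r`) are assumptions for illustration.

```python
import math


def multipole_moments(charges, centre):
    """Monopole, dipole and traceless quadrupole of point charges.

    charges is a list of (q, (x, y, z)) tuples; moments are taken about centre.
    """
    q_total = 0.0
    dipole = [0.0, 0.0, 0.0]
    quad = [[0.0] * 3 for _ in range(3)]
    for q, pos in charges:
        r = [pos[k] - centre[k] for k in range(3)]
        r2 = sum(c * c for c in r)
        q_total += q
        for i in range(3):
            dipole[i] += q * r[i]
            for j in range(3):
                delta = r2 if i == j else 0.0
                quad[i][j] += q * (3.0 * r[i] * r[j] - delta)
    return q_total, dipole, quad


def multipole_potential(moments, centre, point):
    """Potential at point from the expansion, truncated after the quadrupole."""
    q_total, dipole, quad = moments
    R = [point[k] - centre[k] for k in range(3)]
    dist = math.sqrt(sum(c * c for c in R))
    v = q_total / dist
    v += sum(dipole[i] * R[i] for i in range(3)) / dist**3
    v += 0.5 * sum(
        quad[i][j] * R[i] * R[j] for i in range(3) for j in range(3)
    ) / dist**5
    return v


def exact_potential(charges, point):
    """Direct Coulomb sum, used only to check the truncated expansion."""
    return sum(q / math.dist(pos, point) for q, pos in charges)
```

The expansion is only accurate when the evaluation point lies well outside the charge distribution, which is why such a scheme suits the long range part of the electrostatics while the nearby environment is treated explicitly.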

The outcomes of the project are:

- A formulation for the handling of point multipoles in CRYSTAL was implemented, and the Fermi energy calculation was compared against theoretical results.
- Two methods for partitioning the electronic structure were compared. The first employs a single core atom within each subsystem, as used in SIESTA; the second allows the user to specify the atoms within each partition. The second approach was implemented for CRYSTAL, as it allows the user to define the D and C approach in a way that is best suited to their problem.
- The code was validated with a 10 angstrom cubic box of liquid neon atoms. With the D and C approach, the time taken to calculate the electronic structure was shown to scale nearly linearly.
- Task level parallelism was introduced in MPP CRYSTAL. A system can now be decomposed automatically into subsystems, with the work distributed so that subsystems are run independently within task-farmed instances of MPP CRYSTAL. This is essential to the D and C approach, since calculating the electronic structure of the subsystems sequentially is highly inefficient by comparison.
- Communication of eigenvectors and eigenvalues was implemented, both to determine the global Fermi energy and to reconstruct the global density matrix.
- New code was developed to allow an improved initial guess for the density matrix; this gives faster convergence in the SCF calculation for certain systems.
- Currently, only weakly interacting systems, such as molecular crystals, can be studied using the DCSE-developed code. However, there are several examples of ab initio studies of molecular crystals where this may be a useful alternative to the traditional Hamiltonian matrix diagonalisation approach. In such cases, the calculation of the ground state energy and wavefunctions in CRYSTAL will scale almost linearly with the number of basis functions, for more than a few thousand atoms.
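
The user-specified partitioning described above can be sketched as follows. This is a hypothetical illustration, not CRYSTAL's input format or algorithm: the user supplies the core atoms of each partition, and an outer region is then collected from all other atoms within an assumed cutoff `buffer_radius` of any core atom, using the minimum-image convention for a cubic periodic box such as the neon test case.

```python
import math


def minimum_image_distance(a, b, box):
    """Distance between two points in a cubic periodic box of edge box."""
    d2 = 0.0
    for x, y in zip(a, b):
        d = x - y
        d -= box * round(d / box)  # fold onto the nearest periodic image
        d2 += d * d
    return math.sqrt(d2)


def build_subsystems(positions, cores, buffer_radius, box):
    """Attach an outer (buffer) region to each user-defined core.

    positions: list of (x, y, z) tuples
    cores: list of lists of atom indices, one list per partition
    Returns one dict per subsystem with 'core' and 'buffer' index lists.
    """
    subsystems = []
    for core in cores:
        core_set = set(core)
        buffer = [
            j
            for j in range(len(positions))
            if j not in core_set
            and any(
                minimum_image_distance(positions[i], positions[j], box)
                <= buffer_radius
                for i in core
            )
        ]
        subsystems.append({"core": sorted(core_set), "buffer": buffer})
    return subsystems
```

Each resulting subsystem is an independent unit of work, so a list like this is the natural input to a task farm in which separate groups of processes each solve one subsystem. The double loop over atoms is quadratic in system size and is kept only for clarity; a cell list would be needed to keep the partitioning step itself linear.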

Please see PDF or HTML for a report which summarises this work.