Kerker Preconditioning and Wave-dependent Metric

Pulay mixing[16] has became one of the standard methods implemented in the ab initio Density Functional Theory (DFT)[8,10] codes to achieve self-consistency in ground-state energy calculations[13, pp. 172-174]. The scheme at beginning of a given self-consistent loop $m$, takes a set of histories of input densities $\rho_{\text{in}}^i$ from self-consistency loops $i$ in the range $i =
m - N_{\text{Pulay}}$ to $m-1$ (for a given integer $N_{\text{Pulay}}$ and if $m = 1$ then we use an initial guess) and forms an optimised input density for step $m$

\rho_{\text{in}}^{m} = \sum_{i = m - N_{\text{Pulay}}}^{m -...
+ \lambda R[\rho_{\text{in}}^{i}] \right)
\end{displaymath} (1)

where $\lambda$ is the mixing parameter; the parameters $\alpha_i$ are given as
\alpha_i = \frac{\sum_{j = m - N_{\text{Pulay}}}^{m - 1} A^...
...rho_{\text{in}}^j], \vec{r}) R([\rho_{\text{in}}^i],\vec{r})
\end{displaymath} (2)

and satisfies the constraint $\sum_{i=m-N_{\text{Pulay}}}^{m-1}
\alpha_i = 1$; the residual $R([\rho_{\text{in}}^i],\vec{r})$ is defined as the difference between the output (calculated from minimising the energy functional dependent on the input density) and input density: \begin{equation*}
R([\rho_{\text{in}}^i],\vec{r}) = \rho_{\text{out}}^i([\rho_{\text{in}}^i], \vec{r})
- \rho_{\text{in}}^i(\vec{r})
\end{equation*} Note also that by choosing $N_{\text{Pulay}} = 1$ we get back simple linear mixing scheme.

The phenomenon called ``charge sloshing''[9]--where a small variation in the input density (in the self-consistency loop) causes a large change in output density--is often observed in calculations for metallic systems. This results in very long repeats of the self-consistency loop if the calculation ever converges and it is one of the main obstacles against fast calculations on metallic systems. The solution to this problem exists from noting that charge sloshing is caused by long-range real-space changes (and hence short reciprocal-space changes) in the density dominating the variation of the Hartree potential. Hence by slightly modifying the mixing procedure and by filtering out the long range changes in density (the residuals) one can reduce the effect of charge sloshing and hence achieve much faster self-consistency convergence. This preconditioning method is named after its inventor Kerker[9], and has already been successfully applied to various ab initio electronic structure codes[11,12]. An alternative method is by adding weight to the computation of the residual metric ($A_{ij}$ in equation (2)) so that the short ranged variations in density are given more importance in the mixing procedure, and this method is referred to as Wave-dependent metric approach.

Working under reciprocal space, the Pulay mixing equation (1) becomes \begin{equation*}
\tilde{\rho}_{\text{in}}^{m} = \sum_{i = m - N_{\text{Pulay}}...
...n}}^{i} + \lambda \tilde{R}[\tilde{\rho}_{\text{in}}^{i}]\right)
\end{equation*} where $\tilde{\rho}_{\text{in}}^i(\vec{q})$ are the Fourier transform of input density histories; and $\tilde{R}([\tilde{\rho}_{\text{in}}^i], \vec{q}) =
\tilde{\rho}_{\text{out}}^i(\vec{q}) -
\tilde{\rho}_{\text{in}}^i(\vec{q})$. For Kerker preconditioning, we replace the scalar mixing parameter $\lambda$ with a diagonal matrix in reciprocal space: \begin{equation*}
G = \lambda \frac{q^2}{q^2 + q_0^2}, \qquad q = \norm{\vec{q}}
\end{equation*} where $q_0$ is a user input parameter controlling the meaning of ``long ranged'' variation in density in the preconditioning procedure: for $q \gg q_0$ (short range in real space), $G \approx \lambda$ so we have normal Pulay mixing, but for $q \ll q_0$ (long range in real space), $G \approx 0$ so the portion of $\rho_{\text{in}}^i$ do not take part in the mixing.

While the Kerker preconditioning method effectively adds a $\vec{q}$-dependent weight to the mixing parameter, the wave-dependent metric method adds a $\vec{q}$-dependent weight on how we compute the Pulay paramaters $\alpha_i$. We modify the residual metric $A_{ij}$ (given in equation (2)) to become

\tilde{A}_{ij} = \sum_{\vec{q}} \frac{q^2 + q_1^2}{q^2}
...lde{\rho}_{\text{in}}^j], \vec{q}),
\qquad q = \norm{\vec{q}}
\end{displaymath} (3)

in the reciprocal space. $q_1$ is an user input parameter so that for $q \gg q_1$ (short range in real space) the weight for the metric $\to
1$ and hence its contribution (in terms of its inverse) to the Pulay parameters $\alpha_i$ will be as normal; and for $q \ll q_1$ (long range in real space), the weight $\to \infty$ and hence its contribution to the parameters $\alpha_i$ becomes negligible.

Special treatment is needed for the point $\vec{q} = 0$, since $\omega(\vec{q}) = \frac{q^2 + q_1^2}{q^2}$ is undefined at this point. The easiest approach is to define the weight at $\vec{q} = 0$ as analytic continuation of the factor, and hence we define $\omega(\vec{q}=0) \equiv \max_{\vec{q}} \omega(\vec{q})$.

Lianheng Tong 2011-03-02