Methfessel-Paxton Approximation to Step Function

Apart from the effects of charge-sloshing, another main problem associated with calculations involving metallic systems is that one must integrate a discontinuous function over the Brillouin zone due to the partial filling of the band. For numerical integration methods this requires a very fine reciprocal space mesh ($\vec{k}$-mesh) inside the Brillouin zone. One way to rectify this problem is to approximate the step-function occupation function wit a Fermi-Dirac function of finite temperature $T$. This makes the integrand a differentiable function everywhere in the Brillouin zone and thus improves $\vec{k}$-mesh convergence though one has to live with the fact that the result obtained is actually for a slightly different problem of system under finite temperatures. In fact one can show that by simply approximating the step function with a Fermi-Dirac distribution \begin{equation*}
g(E)f(E - E_F, T) \approx g(E) \theta(E-E_F)
\end{equation*} where \begin{equation*}
f(E - E_F, T) \equiv \frac{1}{e^{(E - E_F)/k_{B}T} + 1}
\end{equation*} we in general cannot get the integral $\int\D{E} g(E)f(E - E_F, T)$ to equal to $\int\D{E} g(E)\theta(E - E_F, T)$ exactly, this can be seen clearly in figure 3

Figure 3: Illustration of the difference between $\int\D{E} g(E)f(E - E_F, T)$ and $\int\D{E} g(E)\theta(E - E_F, T)$. The blue area corresponds to the former and the red area corresponds to the latter. Notice that the patch of blue area above $E_F$ is not the same size as the (uncovered) red area just below $E_F$.

A more sophisticated approximation to the occupation function can be done using Methfessel and Paxton method[14]. This method approximates the step-function by starting with approximating the delta-function using expansion in a set of orthogonal Hermite polynomials. And then the $N$-th order approximation $S_N(x)$ is obtained by integrating the delta function. The method guarantees that \begin{equation*}
\int\D{E} g(E)S_N\left( \frac{E - E_F}{k_B T} \right)
= \int\D{E} g(E)\theta\left( \frac{E - E_F}{k_B T} \right)
\end{equation*} for any $g(E)$ that is an polynomial of $N$-th order or less. The expansion in Hermite polynomials are given as
S_0(x) & = \frac{1}{2} (1 - \mathrm{erf}(x)) \\
S_N(x) & = S_0(x) + \sum_{n=1}^{N} A_nH_{2n-1}(x)e^{-x^2}

A_n = \frac{(-1)^n}{n!4^n\sqrt{\pi}}
\end{displaymath} (4)

and Hermite polynomials can be generated by the recurrence relation
H_0(x) & = 1 \\
H_1(x) & = 2x \\
H_{n+1}(x) & = 2xH_n(x) - 2n H_{n-1}(x)
Note that at 0-th order, the Methfessel-Paxton method corresponds to a simple Fermi-Dirac like smearing.

The ground-state energy when calculated using Methfessel-Paxton smearing is no-longer variational. Instead the ground-state density minimises the free energy[11]

F = E - \sum_{n} k_B T \cdot
\frac{1}{2} A_n H_{2n}\left(...
...E_F}{k_B T}\right) e^{-\left(\frac{E_n - E_F}{k_B T}\right)^2}
\end{displaymath} (5)

Difference between $F$ and $E$ are small even for large $k_B T$. This makes Methfessel-Paxton method desirable for calculations with metallic systems, one will be able to use a much corser reciprocal grid with a relatively high smearing factor $k_B T$ without introducing too much difference between $F$ and $E$. If $F - E$ is large then it means the calculated ground-state energy deviates from the true zero-temperature energy--which we are trying to approximate--greatly and therefore the result is unreliable.

Figure 4: Left: Methfessel-Paxton approximation to delta function. Right: Methfessel approximation to step function.

However Methfessel-Paxton method also introduces a numerical artefact that Fermi-energy may no-longer be unique. To see this we notice that by expanding the delta function in terms Hermite polynomials, we introduce a number of roots to the function, and this translates into regions of negative occupancies in $S_N$. This is illustrated in figure 4. As a result, the electron number as a function of Fermi energy:

N_e(E_F) = 2* \sum_{n} \int\D^3\vec{k} w(\vec{k}) f\left(\frac{E_n(\vec{k})-E_F}{k_B T}\right)
\end{displaymath} (6)

(where $w(\vec{k})$ is weight, and $f(x)$ is the occupation function, which is given as $S_N(x)$ for order $N$ Methfessel-Paxton approximation) is no-longer monotonic. This behaviour can be illustrated by a simple toy model where we imaging have a system with the density of states represented as two gaussian function separated by some energy $\Delta E$, then $N_e(E_f)$ can be calculated easily and are plotted in figure 5. As we can see, there is a possibility that for a given number of electrons there are more than one Fermi energies that would give the correct answer.

Figure 5: Left: Density of State of a toy model system. Right: The corresponding number of electrons with respect to Fermi Energy

While given a lower bound and upper bound in function $N_e(E_F)$ the bisection method always returns a solution $E_F$ that gives the desired electron numbers, it is important that we are consistent on which $E_F$--if there are more than one--we actually find. Note that the properties of non-unique Fermi energy is not physical, but is an artefact of Methfessel-Paxton approximation. Never the less, to be consistent in our calculations we define the Fermi energy to be always the lowest energy that give the correct electron number.

To make sure we always find the lowest energy solution, we need to find the lower and upper energy bounds very carefully. To get a lower bound, we could always start from the lowest energy. However to be more efficient, this energy is probably too low, and we define a parameter $N_0$ such that we fill up the lower bands (and assuming occupation of 1.0) until we have $N_e - N_0$ electrons. And we use the highest filled energy as the lower bound. To find the upper bound, we increase the lower band by energy steps of $\delta \epsilon$ until $N_e(E_F)$ is greater than the desired number of electrons. To find $\delta \epsilon$, we have to be careful the that it is not too big and miss one possible solution and at the same time not too small so that calculation is still efficient. We note that the width of the Methfessel-Paxton approximation to delta function is controlled by the gaussian weight $e^{-x^2}$, where $x = \frac{E_n-E_F}{k_B T}$. Hence we may define the width $W$ of the Methfessel-Paxton approximation to delta function as

W = 2\sqrt{-\ln(g)}k_B T
\end{displaymath} (7)

where $g>0$ is a user definable parameter. Then we prove that--see theorem A.1--for a Methfessel-Paxton approximation of order $N$, there are exactly $2N$ roots for $D_N(x)$--Methfessel-Paxton approximation to $\delta(x)$. This implies we can choose the energy step to be
\begin{empheq}[box=\fbox ]{equation}
\delta \epsilon = \frac{W}{\eta2 N} = \frac{\sqrt{-\ln(g)}k_B T}{\eta N}
where $\eta$ is a user definable parameter controlling finess of the step. If $\eta \ge 1.0$, $\eta$ should be small enough to only go over one stationary point in $S_N$ at a time--remember $D_N(x)$ is the derivative of $S_N(x)$. And hence if starting from an absolute lower bound, the next upper bound we will find for $N_e(E_F)$ will guaranteed to bracket only the lowest Fermi Energy. We can then use the bisection method to find $E_F$.

Lianheng Tong 2011-03-02