Linear-scaling electronic structure theory: Electronic temperature in   the Kernel Polynomial Method

Eunan J. McEniry; Ralf Drautz

arXiv:1701.01568·cond-mat.mtrl-sci·January 9, 2017

Linear-scaling electronic structure theory: Electronic temperature in the Kernel Polynomial Method

Eunan J. McEniry, Ralf Drautz

PDF

Open Access

TL;DR

This paper explores the Kernel Polynomial Method for electronic structure calculations, establishing its connection to electron temperature and comparing it with the Fermi operator expansion, enhancing large-scale atomistic simulation efficiency.

Contribution

It formally connects the Kernel Polynomial Method with electron temperature and compares it to the Fermi operator expansion, providing insights for large-scale simulations.

Findings

01

Kernel polynomial convolution acts as an effective electron temperature.

02

Formal connection established between KPM and Fermi operator expansion.

03

Application demonstrated on a tight-binding model.

Abstract

Linear-scaling electronic structure methods based on the calculation of moments of the underlying electronic Hamiltonian offer a computationally efficient and numerically robust scheme to drive large-scale atomistic simulations, in which the quantum-mechanical nature of the electrons is explicitly taken into account. We compare the kernel polynomial method to the Fermi operator expansion method and establish a formal connection between the two approaches. We show that the convolution of the kernel polynomial method may be understood as an effective electron temperature. The results of a number of possible kernels are formally examined, and then applied to a representative tight-binding model.

Equations56

N

N

U_{ban d}

ρ_{ij} = \int f (ε, μ) n_{ij} (ε) d ε,

ρ_{ij} = \int f (ε, μ) n_{ij} (ε) d ε,

N_{i} = \int f (ε, μ) n_{i} (ε) d ε .

N_{i} = \int f (ε, μ) n_{i} (ε) d ε .

ξ_{ij}^{(n)}

ξ_{ij}^{(n)}

μ_{i}^{(n)}

f (ε, μ) = k \sum c_{k} ε^{k},

f (ε, μ) = k \sum c_{k} ε^{k},

ρ_{ij} = k \sum c_{k} ξ_{ij}^{(k)} .

ρ_{ij} = k \sum c_{k} ξ_{ij}^{(k)} .

U_{ban d} = ij \sum ρ_{ij} H_{j i},

U_{ban d} = ij \sum ρ_{ij} H_{j i},

U_{ban d} = i \sum k \sum c_{k} μ_{i}^{(k + 1)},

U_{ban d} = i \sum k \sum c_{k} μ_{i}^{(k + 1)},

\tilde{n}_{i} (ε) = \int K (ε, ε^{'}) n_{i} (ε^{'}) d ε,

\tilde{n}_{i} (ε) = \int K (ε, ε^{'}) n_{i} (ε^{'}) d ε,

N_{i}

N_{i}

U_{ban d, i}

m_{i}^{F O E} (ε^{'}) = \int f (ε, μ) δ (ε^{'}, ε) n_{i} (ε) d ε,

m_{i}^{F O E} (ε^{'}) = \int f (ε, μ) δ (ε^{'}, ε) n_{i} (ε) d ε,

m_{i}^{K P M} (ε^{'}) = \int Θ (ε^{'}, ε_{F}) K (ε^{'}, ε) n_{i} (ε) d ε .

m_{i}^{K P M} (ε^{'}) = \int Θ (ε^{'}, ε_{F}) K (ε^{'}, ε) n_{i} (ε) d ε .

U_{ban d, i}^{F O E}

U_{ban d, i}^{F O E}

U_{ban d, i}^{K P M}

\frac{\partial f ( ε , μ )}{\partial μ} = K (μ, ε),

\frac{\partial f ( ε , μ )}{\partial μ} = K (μ, ε),

f = \frac{1}{2} [1 - erf (- (ε - μ) / γ)],

f = \frac{1}{2} [1 - erf (- (ε - μ) / γ)],

\frac{\partial f}{\partial μ} = \frac{1}{π γ ^{2}} exp (- (\frac{ε - μ}{γ})^{2}),

\frac{\partial f}{\partial μ} = \frac{1}{π γ ^{2}} exp (- (\frac{ε - μ}{γ})^{2}),

f (x) \approx \frac{1}{π 1 - x ^{2}} (μ_{0}^{T} + 2 n = 1 \sum N_{max} μ_{n}^{T} T_{n} (x)) .

f (x) \approx \frac{1}{π 1 - x ^{2}} (μ_{0}^{T} + 2 n = 1 \sum N_{max} μ_{n}^{T} T_{n} (x)) .

μ_{n}^{T} = \int_{- 1}^{1} f (x) T_{n} (x) d x

μ_{n}^{T} = \int_{- 1}^{1} f (x) T_{n} (x) d x

f (x) \approx \frac{1}{π 1 - x ^{2}} (μ_{0}^{T} + 2 n = 1 \sum N_{max} g_{n} μ_{n}^{T} T_{n} (x)) .

f (x) \approx \frac{1}{π 1 - x ^{2}} (μ_{0}^{T} + 2 n = 1 \sum N_{max} g_{n} μ_{n}^{T} T_{n} (x)) .

μ_{Gauss}^{(n)} = \frac{1}{π} γ^{n} Γ (\frac{1 + n}{2})

μ_{Gauss}^{(n)} = \frac{1}{π} γ^{n} Γ (\frac{1 + n}{2})

T_{n} (x) = 2 x T_{n - 1} (x) - T_{n - 2} (x)

T_{n} (x) = 2 x T_{n - 1} (x) - T_{n - 2} (x)

τ_{Gauss}^{(0)}

τ_{Gauss}^{(0)}

τ_{Gauss}^{(2)}

τ_{Gauss}^{(4)}

τ_{Gauss}^{(6)}

τ_{Gauss}^{(8)}

τ_{Dirac}^{(n)} = (- 1)^{n /2}

τ_{Dirac}^{(n)} = (- 1)^{n /2}

g_{n}^{(Gauss)} \approx 1 - γ^{2} (\frac{n}{2})^{2} \approx exp (- (\frac{nγ}{2})^{2})

g_{n}^{(Gauss)} \approx 1 - γ^{2} (\frac{n}{2})^{2} \approx exp (- (\frac{nγ}{2})^{2})

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsAdvanced Chemical Physics Studies · Quantum and electron transport phenomena · Spectroscopy and Quantum Chemical Studies

Full text

Linear-scaling electronic structure theory: Electronic temperature in the Kernel Polynomial Method

Eunan J. McEniry1, Ralf Drautz2

1 Department of Computational Materials Design, Max-Planck-Institut für Eisenforschung GmbH, Düsseldorf, Germany

2 Interdiscplinary Centre for Advanced Materials Simulation, Ruhr-Universität Bochum, Germany

[email protected]

Abstract

Linear-scaling electronic structure methods based on the calculation of moments of the underlying electronic Hamiltonian offer a computationally efficient and numerically robust scheme to drive large-scale atomistic simulations, in which the quantum-mechanical nature of the electrons is explicitly taken into account. We compare the kernel polynomial method to the Fermi operator expansion method and establish a formal connection between the two approaches. We show that the convolution of the kernel polynomial method may be understood as an effective electron temperature. The results of a number of possible kernels are formally examined, and then applied to a representative tight-binding model.

1 Linear-scaling electronic structure approaches via the calculation of moments

The Fermi Operator Expansion (FOE) and the Kernel Polynomial Method (KPM) achieve linear scaling electronic structure computations by an iterative evaluation of a local Hamiltonian. The two methods appear to follow different strategies for the evaluation of forces and energies. The FOE expands the occupation number of the eigenstates represented by a temperature dependent smearing function. If the smearing function, which corresponds to the Heaviside step function at zero temperature, is represented by a smooth expansion, then the resulting density matrix is short ranged and thus local. The smoothing of the step function is expected to be efficient in particular for systems with a band gap at the Fermi level, thus the FOE is often portrayed as being applicable in particular for insulators and semi conductors. In the Kernel Polynomial Method (KPM) the density of states is convoluted with a kernel that leads to a smoothed representation of the density of states. The kernel corresponds to a broadened representation of a Dirac delta function, and a broader kernel implies a more local evaluation of the energy. The KPM obtains energies and forces from the density of states and in this way avoids the explicit computation of the density matrix. Therefore it is often seen as a method that is suitable for simulations in metals with their long-ranged density matrix. The KPM may be related to recursion based methods that also focus on the density of states.[1, 2]

We will first compare and contrast the FOE and the KPM and then show how the two methods are closely related. From this straightforward analysis the smoothing of the density of states in the KPM may be understood as a temperature broadening, which will then lead us to suggest a new kernel for the KPM. The new kernel performs practically as well as the standard kernel.

In ab-initio or tight-binding electronic structure calculations, the electronic temperature $T$ is typically introduced by applying a smearing function at the Fermi level. The band energy and electron count are then obtained as

[TABLE]

where $\mu$ is the electron chemical potential, $f(\varepsilon,\mu)$ the temperature dependent smearing function and $n(\varepsilon)$ the density of states. At $T=0$ K the smearing is zero and $f(\varepsilon,\mu)$ corresponds to the Heaviside step function $\Theta(\varepsilon,\varepsilon_{F})$ which is one below the Fermi energy and zero above. In the Fermi operator expansion (FOE) method[3, 4] the density matrix is locally approximated by writing it as

[TABLE]

where $n_{ij}(\varepsilon)$ is the spectrally resolved density matrix. The diagonal elements $n_{i}(\varepsilon)=n_{ii}(\varepsilon)$ of the spectrally resolved density matrix correspond to the local density of states associated with orbital $|i\rangle$ , such that the number of electrons in this orbital is obtained as,

[TABLE]

The moments of the spectrally resolved density matrix and the local density of states may be related to expectation values that can be computed from Hamiltonian matrix elements alone,

[TABLE]

Therefore, by writing a polynomial expansion of the smearing function

[TABLE]

an expansion of the density matrix may be obtained,

[TABLE]

This is the basis for the Fermi operator expansion, where in practice Chebyshev polynomials are used instead of a direct polynomial expansion[3]. The band energy is then obtained in intersite representation,

[TABLE]

which for $T=0$ K is equivalent to the onsite representation Eq. (2). Clearly the band energy according to Eq.(2) may also be directly expanded using Eq.(7),

[TABLE]

which follows directly from Eq.(6).

In the kernel polynomial method [5, 6, 1] one takes a different approach to the calculation of the number of electrons and the band energy. In a first step an approximate density of states $\tilde{n}_{i}(\varepsilon)$ is obtained from

[TABLE]

where $n_{i}(\varepsilon)$ is the local density of states and $K(\varepsilon,\varepsilon^{\prime})$ the kernel. The kernel is expanded in Chebyshev polynomials, making use of the moments theorem Eqs.(5,6). A strictly positive representation of $K(\varepsilon,\varepsilon^{\prime})$ guarantees that the resulting density of states $\tilde{n}_{i}(\varepsilon)$ is also strictly positive. The number of electrons and the band energy is then obtained from

[TABLE]

Here we adopt a slightly different approach to relating the FOE and KPM formally by introducing

[TABLE]

and

[TABLE]

The energy and number of electrons may be obtained from

[TABLE]

From Eq.(14) and (15) it is evident that the Fermi Operator Expansion and the kernel Polynomial Method converge to the same limit when $f(\varepsilon,\mu)$ becomes a step function at $T=0$ K and the kernel $K(\varepsilon^{\prime},\varepsilon)$ approaches the delta function as the number of moments in the expansion is increased.

Clearly at finite temperature or a finite number of moments FOE and KPM will be different; here we compare the two approaches to lowest order. We assume that we may identify $\mu=\varepsilon_{F}$ in expression (14) and (15), where $\varepsilon_{F}$ is the Fermi level in the KPM. Then, by taking the derivative w.r.t. $\mu$ and integrating over $\varepsilon^{\prime}$ , the following identity has to hold

[TABLE]

if no further assumptions regarding the local density of states $n_{i}(\varepsilon)$ are made. As expected, we may understand the kernel as the width of the smearing function that is indicated by a non-zero first derivative.

A systematic study of different smearing functions by Liang et al.[7] has indicated that the most rapid convergence for a FOE is found by using a complementary error function as the electronic distribution function, i.e.,

[TABLE]

where the width $\gamma$ acts as an effective “temperature”. In this case,

[TABLE]

so that the underlying kernel is expected to be shaped like a Gaussian. The width of the Jackson Kernel, for example, is given by $\gamma=\sqrt{2}\pi/N_{\rm max}$ [1] and thus $N_{\rm max}$ in KPM may be used to set the electron temperature in a simulation.

2 Kernels and damping factors

In the following section we illustrate more closely the connection between the KPM and FOE, by comparing numerically the results obtained via the two approaches. We consider first an arbitrary function $f(x)$ (which could be the local density of states), for which we have evaluated the first $N_{\rm max}$ Chebyshev moments. A pure Chebyshev expansion of this function would be given by

[TABLE]

Here $\{T_{n}(x)\}$ are the Chebyshev polynomials of the first kind, and the Chebyshev moments

[TABLE]

At points where $f(x)$ has discontinuities, for example, in the region of a band gap, the expansion of Eq. (21) exhibits Gibbs oscillations, which may lead to regions where the density of states is negative. The use of a kernel in Eq. (11) acts to “damp” higher-moment oscillations. This leads to the introduction of damping coefficients $\{g_{n}\}$ , so that Eq. (21) becomes

[TABLE]

We now determine an approximate expression for the $\{g_{n}\}$ for the case of the Gaussian kernel Eq. (20). In what follows, we assume that $\gamma\ll 1$ , and $\mu\to 0$ , as this will allow us to simplify a number of expressions. The real moments $\mu_{\rm Gauss}^{(n)}$ of this distribution are given by

[TABLE]

for $n$ even. By assuming small $\gamma$ , we can write the Chebyshev moments (of the first kind) $\tau_{n}$ , by making use of the recurrence relations of the Chebyshev polynomials:

[TABLE]

Thus, the first few (even) Chebyshev moments are given by:

[TABLE]

If $\gamma\to 0$ , $\delta_{\rm gauss}$ becomes simply the Dirac delta function, $\delta_{\rm Dirac}$ , whose (even) Chebyshev moments are simply

[TABLE]

For each $n$ , the damping factor $g_{n}^{(\rm Gauss)}$ will be simply the ratio $\tau_{\rm Gauss}^{(n)}/\tau_{\rm Dirac}^{(n)}$ . For small $n$ , and since $\gamma\ll 1$ , we can omit (approximately) terms of $\mathcal{O}(\gamma^{4})$ and above, hence the damping factors become

[TABLE]

These damping coefficients have been derived for a number of existing kernels[6], and are illustrated in Fig. 1. Of particular interest is the Jackson kernel, which guarantees a strictly positive density of states, while simultaneously minimising the broadening of the spectral function. In Fig. 1, and throughout the rest of the paper, we choose the parameter $\gamma=\sqrt{2}\pi/N_{\rm max}$ . This choice is motivated by Weiße et al. [1], who have shown that the reproduction of the delta function via the Jackson kernel is a good approximation to a Gaussian of this width. It is noteworthy that, of the kernels considered here, the damping coefficients for the Gaussian kernel are the only ones which do not go to zero as $n\to N_{\rm max}$ .

To demonstrate the effects of the smearing caused by the damping coefficients, the various kernels are applied to the problem of reproducing the delta function; the results are illustrated in Fig. 2 for $N_{\rm max}=50$ . The Dirichlet kernel, which corresponds to the undamped expansion Eq. (21), demonstrates large Gibbs oscillations. The Jackson kernel, although significantly broadening the peak, has no negative values for the reproduced functions. The Lanczos $\sigma^{3}$ kernel gives similar results to the Jackson, without guaranteeing a strictly positive function. The Gaussian kernel, with $\gamma=\sqrt{2}\pi/N_{\rm max}$ is extremely similar to the Jackson kernel as anticipated.

3 Application to carbon

Having established the connection between the KPM and FOE approaches, we move on to examine the performance of the two approaches when applied to real electronic structure models. As a test case, we have taken the orthogonal tight-binding (TB) model for carbon from Xu[10], and examined the convergence of the band energy as a function of moments, for the various kernels.

The tight-binding electronic structures for carbon in the diamond and graphite structures are illustrated in Fig 3. The diamond structure exhibits a band-gap of $\sim 5$ eV, while the graphite structure has a semi-metallic DOS, with a narrow anti-resonant feature at the Fermi level. Due to the narrowness of these features in relation to the bandwidth of the system, we can anticipate that a relatively large number of moments are needed to accurately reproduce the total energy of the system.

In Figs. 4 and 5, the densities of states of the diamond and graphite structures are shown for the various kernel approximations for $N_{\rm max}=20$ and $N_{\rm max}=40$ respectively. In Figure 4, the undamped Dirichlet kernel appears to show fast convergence, in particular in the reproduction of the band-gap of diamond. However, large Gibbs oscillations mean that the density of states for diamond has negative regions within the gap, which in turn results in the electron count being a multi-valued function of the energy and ambiguity in the determination of the valence band maximum. The Gaussian and Jackson kernels produce rather similar results, with differences only apparent toward the edge of the bandwidth. The Lanczos $\sigma$ kernel produces reasonable convergence, without producing negative DOS in the band-gap. In the case of graphene, it is clear that $N_{\rm max}=20$ is insufficient to fully reproduce the complexity of the TB DOS, with only rather general features of the electronic structures being recovered. In Figure 5, the increased number of moments provides much more realistic reproduction of the TB DOS. In the case of diamond, all kernels produce a reasonable description of the electronic structure; however, the undamped case exhibits significant Gibbs oscillations in the gap region. In the case of graphene, the broadening introduced by the Gaussian and Jackson kernels removes much of the local structure around the Fermi level, while the Lanczos $\sigma$ kernel reproduces much of the local structure with the only negative regions of the DOS appearing at the band edges.

In Figure 6, the error in the band energy (compared to exact diagonalisation) is shown as a function of the number of moments, for both diamond and graphene. The Dirichlet kernel, although closest at low moments, oscillates significantly around the true energy in both cases. The Gaussian, Lanczos $\sigma^{3}$ and Jackson kernels all converge monotonically from above to the exact value. As expected, the convergence for graphite is slower with $N_{\rm max}$ , with $N_{\rm max}\approx 100$ required to have energy convergence $\sim 50$ meV/atom for the Gaussian, Lanczos $\sigma^{3}$ and Jackson kernels.

The non-variational behaviour of the Dirichlet and Lanczos $\sigma$ approaches becomes more clear when full energy-volume curves are evaluated. In Figure 7, the energies of diamond and graphene as a function of lattice constant are shown for the various approaches (with $N_{\rm max}=20$ ) and compared with well-converged TB results. In the case of diamond, the Dirichlet and Lanczos $\sigma$ kernels undershoot the TB energies, with a reasonable prediction of the lattice constant and bulk modulus. The much smoother Gaussian and Jackson kernels give significantly higher energies, ostensibly due to an elevated effective electronic temperature. In the case of graphene, both the Dirichlet and Lanczos $\sigma$ kernels have energies above the TB result, thus falsely stabilising the diamond structure over the graphene. Even though the energy error for the Jackson kernel is of the order of $2$ eV, the energetic ordering between the two crystal structures is in good agreement with the TB result. In Fig 8, the results for $N_{\rm max}=40$ are shown. In the case of diamond, all kernels give a reasonable description of the lattice constant and bulk modulus, with the exception of the Dirichlet kernel, which arises from the multivaluedness of the electron number due to the negative DOS within the band gap. For graphene, the slower convergence in energy with $N_{\rm max}$ is apparent even at $N_{\rm max}=40$ , with all damped kernels stabilising the diamond structure over graphene. In order to reliably determine energy differences, it is essential that the key features of the density of states are reproduced; the smearing out of the anti-resonance feature at the Fermi level in graphene has a significant effect on the total energy, even if the lattice constant is in fair agreement with the TB reference.

The results here show the ability of the KPM to reproduce, in a smooth and rigorous manner, the total energy of these systems, as well as the challenges of such an approach to reproduce narrow spectral features which drive small energy differences between structures. In a realistic calculation, the number of moments required to reliably describe the allotropes of carbon would be prohibitively expensive to evaluate exactly, thereby necessitating a means of estimating higher Chebyshev moments from lower ones. One recent approach[2] is to estimate these from the Krylov subspace generated by Lanczos recursion approaches to the same problem.

4 Conclusion

A connection between the KPM and the FOE approach to linear-scaling electronic structure calculations has been made. We have shown that the width of the Kernel in the KPM may be related directly to the electronic temperature. The use of a complementary error function as an electronic smearing function in the FOE is consistent with assuming a Gaussian Kernel within the KPM. Moreover, while in the FOE, the expansion coefficients $\{c_{k}\}$ of Eq. (7) must be re-evaluated for each change in chemical potential, this is not necessary within the KPM. Therefore, while the direct FOE and Gaussian kernel KPM method give virtually identical answers, the KPM approach is computationally more efficient.

References

[1]

Weiße A., Wellein G., Alvermann A., and H. Fehske.

The kernel polynomial method.

Rev. Mod. Phys., 78:275, 2006.

[2]

B. Seiser, D. G Pettifor, and R. Drautz.

Analytic bond-order potential expansion of recursion-based methods.

Phys. Rev. B, 87:094105, 2013.

[3]

S. Goedecker and M. Teter.

Tight-binding electronic-structure calculations and tight-binding molecular dynamics with localized orbitals.

Phys. Rev. B, 51:9455, 1995.

[4]

S. Goedecker.

$\mathcal{O}(N)$ methods for electronic structure calculations.

Rev. Mod. Phys., 71:1085, 1999.

[5]

R. N. Silver and H. Röder.

Densities of states of mega-dimensional hamiltonian matrices.

Int. J. Mod. Phys. C, 5:735, 1994.

[6]

R. N. Silver, H. Röder, A. F. Voter, and J. D Kress.

Kernel polynomial approximations for densities of states and spectral functions.

J. Comp. Phys., 124:115, 1996.

[7]

WanZhen Liang, Chandra Saravanan, Yihan Shao, Roi Baer, Alexis T. Bell, and Martin Head-Gordon.

Improved fermi operator expansion methods for fast electronic structure calculations.

The Journal of Chemical Physics, 119(8):4117, 2003.

[8]

D. Jackson.

PhD thesis, Georg-August-Universität Göttingen, 1911.

[9]

C. Lanczos.

Discourse on Fourier Series.

Hafner, New York, 1966.

[10]

C H Xu, C Z Wang, C T Chan, and K M Ho.

A transferable tight-binding potential for carbon.

J. Phys.: Condens. Matter, 4:6047, 1992.

Bibliography10

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1[1] Weiße A., Wellein G., Alvermann A., and H. Fehske. The kernel polynomial method. Rev. Mod. Phys. , 78:275, 2006.
2[2] B. Seiser, D. G Pettifor, and R. Drautz. Analytic bond-order potential expansion of recursion-based methods. Phys. Rev. B , 87:094105, 2013.
3[3] S. Goedecker and M. Teter. Tight-binding electronic-structure calculations and tight-binding molecular dynamics with localized orbitals. Phys. Rev. B , 51:9455, 1995.
4[4] S. Goedecker. 𝒪 ( N ) 𝒪 𝑁 \mathcal{O}(N) methods for electronic structure calculations. Rev. Mod. Phys. , 71:1085, 1999.
5[5] R. N. Silver and H. Röder. Densities of states of mega-dimensional hamiltonian matrices. Int. J. Mod. Phys. C , 5:735, 1994.
6[6] R. N. Silver, H. Röder, A. F. Voter, and J. D Kress. Kernel polynomial approximations for densities of states and spectral functions. J. Comp. Phys. , 124:115, 1996.
7[7] Wan Zhen Liang, Chandra Saravanan, Yihan Shao, Roi Baer, Alexis T. Bell, and Martin Head-Gordon. Improved fermi operator expansion methods for fast electronic structure calculations. The Journal of Chemical Physics , 119(8):4117, 2003.
8[8] D. Jackson. Ph D thesis, Georg-August-Universität Göttingen, 1911.