Ab initio theory of polarons: formalism and applications

Weng Hong Sio; Carla Verdi; Samuel Ponce; Feliciano Giustino

arXiv:1906.08408·cond-mat.mtrl-sci·June 21, 2019

Ab initio theory of polarons: formalism and applications

Weng Hong Sio, Carla Verdi, Samuel Ponce, Feliciano Giustino

PDF

TL;DR

This paper presents a first-principles theoretical framework for studying polarons in semiconductors and insulators, accurately capturing their formation, excitation energies, and wavefunctions without supercell calculations.

Contribution

It introduces a novel ab initio variational method based on density-functional theory to analyze polarons, including both small and large types, using only unit cell calculations.

Findings

01

Successfully applied to LiF and Li2O2 to demonstrate effectiveness.

02

Captures both Frohlich and non-Frohlich electron-phonon couplings.

03

Provides spectral decompositions for analyzing electron-phonon interactions.

Abstract

We develop a theoretical and computational framework to study polarons in semiconductors and insulators from first principles. Our approach provides the formation energy, excitation energy, and wavefunction of both electron and hole polarons, and takes into account the coupling of the electron or hole to all phonons. An important feature of the present method is that it does not require supercell calculations, and relies exclusively on electron band structures, phonon dispersions, and electron-phonon matrix elements obtained from calculations in the crystal unit cell. Starting from the Kohn-Sham (KS) equations of density-functional theory, we formulate the polaron problem as a variational minimization, and we obtain a nonlinear eigenvalue problem in the basis of KS states and phonon eigenmodes. In our formalism the electronic component of the polaron is expressed as a coherent…

Equations147

E_{LP} = \frac{ℏ ^{2}}{2 m ^{*}} \int d r ∣\nabla ψ ∣^{2} + \frac{1}{2} \int d r E \cdot D,

E_{LP} = \frac{ℏ ^{2}}{2 m ^{*}} \int d r ∣\nabla ψ ∣^{2} + \frac{1}{2} \int d r E \cdot D,

D = \frac{e}{4 π} \nabla \int d r^{'} \frac{∣ ψ ( r ^{'} ) ∣ ^{2}}{∣ r - r ^{'} ∣} .

D = \frac{e}{4 π} \nabla \int d r^{'} \frac{∣ ψ ( r ^{'} ) ∣ ^{2}}{∣ r - r ^{'} ∣} .

\frac{1}{2} \int d r E \cdot D = \frac{1}{2} \frac{e ^{2}}{4 π ϵ _{0}} \frac{1}{ϵ ^{0}} \int d r d r^{'} \frac{∣ ψ ( r ) ∣ ^{2} ∣ ψ ( r ^{'} ) ∣ ^{2}}{∣ r - r ^{'} ∣} .

\frac{1}{2} \int d r E \cdot D = \frac{1}{2} \frac{e ^{2}}{4 π ϵ _{0}} \frac{1}{ϵ ^{0}} \int d r d r^{'} \frac{∣ ψ ( r ) ∣ ^{2} ∣ ψ ( r ^{'} ) ∣ ^{2}}{∣ r - r ^{'} ∣} .

\frac{1}{2} \int d r E \cdot D = \frac{1}{2} \frac{e ^{2}}{4 π ϵ _{0}} (\frac{1}{ϵ ^{0}} - \frac{1}{ϵ ^{\infty}}) \int d r d r^{'} \frac{∣ ψ ( r ) ∣ ^{2} ∣ ψ ( r ^{'} ) ∣ ^{2}}{∣ r - r ^{'} ∣} .

\frac{1}{2} \int d r E \cdot D = \frac{1}{2} \frac{e ^{2}}{4 π ϵ _{0}} (\frac{1}{ϵ ^{0}} - \frac{1}{ϵ ^{\infty}}) \int d r d r^{'} \frac{∣ ψ ( r ) ∣ ^{2} ∣ ψ ( r ^{'} ) ∣ ^{2}}{∣ r - r ^{'} ∣} .

E_{LP} [ψ]

E_{LP} [ψ]

E_{LP}^{'} [ψ, ε]

E_{LP}^{'} [ψ, ε]

- \frac{ℏ ^{2}}{2 m ^{*}} \nabla^{2} ψ (r) - \frac{e ^{2}}{4 π ϵ _{0}} \frac{1}{κ} \int d r^{'} \frac{∣ ψ ( r ^{'} ) ∣ ^{2}}{∣ r - r ^{'} ∣} ψ (r) = ε ψ (r),

- \frac{ℏ ^{2}}{2 m ^{*}} \nabla^{2} ψ (r) - \frac{e ^{2}}{4 π ϵ _{0}} \frac{1}{κ} \int d r^{'} \frac{∣ ψ ( r ^{'} ) ∣ ^{2}}{∣ r - r ^{'} ∣} ψ (r) = ε ψ (r),

\int d r ∣ ψ (r) ∣^{2} = 1.

E_{LP} = ε + \frac{1}{2} \frac{e ^{2}}{4 π ϵ _{0}} \frac{1}{κ} \int d r d r^{'} \frac{∣ ψ ( r ) ∣ ^{2} ∣ ψ ( r ^{'} ) ∣ ^{2}}{∣ r - r ^{'} ∣} .

E_{LP} = ε + \frac{1}{2} \frac{e ^{2}}{4 π ϵ _{0}} \frac{1}{κ} \int d r d r^{'} \frac{∣ ψ ( r ) ∣ ^{2} ∣ ψ ( r ^{'} ) ∣ ^{2}}{∣ r - r ^{'} ∣} .

E_{LP} (r_{p}) = \frac{ℏ ^{2}}{2 m ^{*} r _{p}^{2}} - \frac{5}{16} \frac{1}{κ} \frac{e ^{2}}{4 π ϵ _{0} r _{p}} .

E_{LP} (r_{p}) = \frac{ℏ ^{2}}{2 m ^{*} r _{p}^{2}} - \frac{5}{16} \frac{1}{κ} \frac{e ^{2}}{4 π ϵ _{0} r _{p}} .

\frac{r _{p, min}}{a _{0}} = \frac{16}{5} \frac{κ}{m ^{*} / m _{e}},

\frac{r _{p, min}}{a _{0}} = \frac{16}{5} \frac{κ}{m ^{*} / m _{e}},

\frac{E _{LP, min}}{E _{Ha}} = - \frac{25}{512} \frac{m ^{*} / m _{e}}{κ ^{2}},

\frac{E _{LP, min}}{E _{Ha}} = - \frac{25}{512} \frac{m ^{*} / m _{e}}{κ ^{2}},

\frac{ε}{E _{Ha}} = - \frac{75}{512} \frac{m ^{*} / m _{e}}{κ ^{2}} .

\frac{ε}{E _{Ha}} = - \frac{75}{512} \frac{m ^{*} / m _{e}}{κ ^{2}} .

α = \frac{e ^{2}}{4 π ϵ _{0} ℏ} \frac{m ^{*}}{2ℏ ω _{LO}} \frac{1}{κ},

α = \frac{e ^{2}}{4 π ϵ _{0} ℏ} \frac{m ^{*}}{2ℏ ω _{LO}} \frac{1}{κ},

E_{LP, min} = - \frac{50}{512} α^{2} ℏ ω_{LO},

E_{LP, min} = - \frac{50}{512} α^{2} ℏ ω_{LO},

\frac{E _{LP}}{α ^{2} ℏ ω} = \int d r ∣\nabla ϕ (r) ∣^{2} - \int \frac{∣ ϕ ( r ) ∣ ^{2} ∣ ϕ ( r ^{'} ) ∣ ^{2}}{∣ r - r ^{'} ∣} d r d r^{'},

\frac{E _{LP}}{α ^{2} ℏ ω} = \int d r ∣\nabla ϕ (r) ∣^{2} - \int \frac{∣ ϕ ( r ) ∣ ^{2} ∣ ϕ ( r ^{'} ) ∣ ^{2}}{∣ r - r ^{'} ∣} d r d r^{'},

\frac{E [{ ψ _{v k} } , { τ _{κ α} }]}{E _{Ha}} = - 2 v k \sum \int d r ψ_{v k}^{*} \frac{a _{0}^{2} \nabla ^{2}}{2} ψ_{v k}

\frac{E [{ ψ _{v k} } , { τ _{κ α} }]}{E _{Ha}} = - 2 v k \sum \int d r ψ_{v k}^{*} \frac{a _{0}^{2} \nabla ^{2}}{2} ψ_{v k}

+ \frac{1}{2} T \sum \int d r d r^{'} \frac{a _{0} n ( r ) n ( r ^{'} )}{∣ r - r ^{'} - T ∣} + \frac{E _{x c} [ n ^{↑} , n ^{↓} ]}{E _{Ha}}

- κ p T \sum \int d r \frac{a _{0} Z _{κ} n ( r )}{∣ r - τ _{κ p} - T ∣} + \frac{1}{2} κ p T κ^{'} p^{'} \sum \frac{a _{0} Z _{κ} Z _{κ^{'}}}{∣ τ _{κ p} - τ _{κ^{'} p^{'}} - T ∣},

E [{ψ_{v k}}, {τ_{κ p}}] = E [{ψ_{v k}^{0}}, {τ_{κ p}^{0}}]

E [{ψ_{v k}}, {τ_{κ p}}] = E [{ψ_{v k}^{0}}, {τ_{κ p}^{0}}]

+ \frac{1}{2} κ α p κ^{'} α^{'} p^{'} \sum C_{κ α p, κ^{'} α^{'} p^{'}}^{0} Δ τ_{κ α p} Δ τ_{κ^{'} α^{'} p^{'}} + O (Δ τ^{3}),

\frac{E [ ψ , { ψ _{v k} } , { τ _{κ p} }]}{E _{Ha}} = - 2 v k \sum \int d r ψ_{v k}^{*} \frac{a _{0}^{2} \nabla ^{2}}{2} ψ_{v k}

\frac{E [ ψ , { ψ _{v k} } , { τ _{κ p} }]}{E _{Ha}} = - 2 v k \sum \int d r ψ_{v k}^{*} \frac{a _{0}^{2} \nabla ^{2}}{2} ψ_{v k}

- \int d r ψ^{*} \frac{a _{0}^{2} \nabla ^{2}}{2} ψ + \frac{E _{x c} [ n ^{↑} + Δ n , n ^{↓} ]}{E _{Ha}}

+ \frac{1}{2} T \sum \int d r d r^{'} \frac{a _{0}}{∣ r - r ^{'} - T ∣}

\times [n (r) + Δ n (r) - 1/ N_{p} Ω] [n (r^{'}) + Δ n (r^{'}) - 1/ N_{p} Ω]

- κ p T \sum \int d r \frac{a _{0} Z _{κ} [ n ( r ) + Δ n ( r ) - 1/ N _{p} Ω ]}{∣ r - τ _{κ p} - T ∣}

+ \frac{1}{2} κ p, κ^{'} p^{'} T \sum \frac{a _{0} Z _{κ} Z _{κ^{'}}}{∣ τ _{κ p} - τ _{κ^{'} p^{'}} - T ∣} .

E_{x c} [n^{↑} + Δ n, n^{↓}] = E_{x c} [n^{↑}, n^{↓}] + \int d r \frac{δ E _{x c}}{δ n ^{↑}} Δ n (r)

E_{x c} [n^{↑} + Δ n, n^{↓}] = E_{x c} [n^{↑}, n^{↓}] + \int d r \frac{δ E _{x c}}{δ n ^{↑}} Δ n (r)

+ \int d r d r^{'} \frac{1}{2} \frac{δ ^{2} E _{x c}}{δ n ^{↑} δ n ^{↑}} Δ n (r) Δ n (r^{'}) + O (Δ n^{3}) .

E [ψ, {ψ_{v k}}, {τ_{κ p}}] = E [{ψ_{v k}^{0}}, {τ_{κ p}^{0}}] +

E [ψ, {ψ_{v k}}, {τ_{κ p}}] = E [{ψ_{v k}^{0}}, {τ_{κ p}^{0}}] +

+ \frac{1}{2} κ α p κ^{'} α^{'} p^{'} \sum C_{κ α p, κ^{'} α^{'} p^{'}}^{0} Δ τ_{κ α p} Δ τ_{κ^{'} α^{'} p^{'}}

\displaystyle\qquad+E_{\rm Ha}\int\!d{\bf r}\,\psi^{*}({\bf r})\!\Bigg{[}-\frac{a_{0}^{2}}{2}\nabla^{2}+\!\sum_{\bf T}\!\int\!d{\bf r}^{\prime}\frac{a_{0}n({\bf r}^{\prime})}{|{\bf r}-{\bf r}^{\prime}-{\bf T}|}

- κ p T \sum \frac{Z _{κ} a _{0}}{∣ r - τ _{κ p} - T ∣} + \frac{1}{E _{Ha}} \frac{δ E _{x c}}{δ n ^{↑}} ψ (r)

\displaystyle\qquad+\frac{1}{2}E_{\rm Ha}\Bigg{[}\int d{\bf r}d{\bf r}^{\prime}\frac{1}{E_{\rm Ha}}\frac{\delta^{2}E_{xc}}{\delta n^{\uparrow}\delta n^{\uparrow}}\Delta n({\bf r})\Delta n({\bf r}^{\prime})

+ T \sum \int d r d r^{'} \frac{[ Δ n ( r ) - 1/ N _{p} Ω ] [ Δ n ( r ^{'} ) - 1/ N _{p} Ω ]}{∣ r - r ^{'} - T ∣/ a _{0}}]

+ E_{B} + O (Δ τ^{3}) + O (Δ n^{3}),

\hat{H}_{KS} [n (r), {τ_{κ p}}] = \hat{H}_{KS} [n^{0} (r), {τ_{κ p}^{0}}]

\hat{H}_{KS} [n (r), {τ_{κ p}}] = \hat{H}_{KS} [n^{0} (r), {τ_{κ p}^{0}}]

E_{p} [ψ, {Δ τ_{κ α p}}] = E [{ψ_{v k}^{0}}, {τ_{κ p}^{0}}]

E_{p} [ψ, {Δ τ_{κ α p}}] = E [{ψ_{v k}^{0}}, {τ_{κ p}^{0}}]

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Full text

Present address: University of Vienna, Faculty of Physics and Center for Computational Materials Sciences, Sensengasse 8/12, 1090 Vienna, Austria]

Ab initio theory of polarons: formalism and applications

Weng Hong Sio

Department of Chemistry, Physical and Theoretical Chemistry, University of Oxford, South Parks Road, Oxford, OX1 3QZ, UK

Department of Materials, University of Oxford, Parks Road, Oxford, OX1 3PH, UK

Carla Verdi

[

Samuel Poncé

Feliciano Giustino

[email protected]

Department of Materials, University of Oxford, Parks Road, Oxford, OX1 3PH, UK

Abstract

We develop a theoretical and computational framework to study polarons in semiconductors and insulators from first principles. Our approach provides the formation energy, excitation energy, and wavefunction of both electron and hole polarons, and takes into account the coupling of the electron or hole to all phonons. An important feature of the present method is that it does not require supercell calculations, and relies exclusively on electron band structures, phonon dispersions, and electron-phonon matrix elements obtained from calculations in the crystal unit cell. Starting from the Kohn-Sham (KS) equations of density-functional theory, we formulate the polaron problem as a variational minimization, and we obtain a nonlinear eigenvalue problem in the basis of KS states and phonon eigenmodes. In our formalism the electronic component of the polaron is expressed as a coherent superposition of KS states, in close analogy with the solution of the Bethe-Salpeter equation for the calculation of excitons. We demonstrate the power of the methodology by studying polarons in LiF and Li2O2. We show that our method describes both small and large polarons, and seamlessly captures Fröhlich-type polar electron-phonon coupling and non-Fröhlich coupling to acoustic and optical phonons. To analyze in quantitative terms the electron-phonon coupling mechanisms leading to the formation of polarons, we introduce spectral decompositions similar to the Eliashberg spectral function. We validate our theory using both analytical results and direct calculations on large supercells. This study constitutes a first step toward complete ab initio many-body calculations of polarons in real materials.

I Introduction

The polaron is a quasiparticle that can be found in many crystalline solids such as semiconductors,Lindemann et al. (1983) insulators,Popp and Murray (1972) and molecular crystals.Chaikin et al. (1972) A polaron is formed when an electron or a hole couples to the ions in a crystal in such a way as to generate a lattice distortion; the distortion in turn produces an electric field that acts on the electron or hole. This feedback mechanism alters the energetics and dynamics of the charge carrier and may induce self-trapping. With the improvement in the energy and momentum resolution of angle-resolved photoelectron spectroscopy (ARPES), it has become possible to probe these quasiparticles in many systems of interest, from transition metal oxides,(Moser et al., 2013; Cancellieri et al., 2016; Riley et al., 2018; Wang et al., 2016) to two-dimensional materials.(Chen et al., 2015, 2018; Kang et al., 2018) These experiments and related theoretical investigations contributed to reinvigorating the interest in polaron physics.(Verdi et al., 2017; Riley et al., 2018; Nery et al., 2018; Antonius et al., 2015)

The notion of polaron was introduced in a classic short paper by Landau,Landau (1933) and quantitative studies started with the work of Pekar,Pekar (1946) who considered a single electron interacting with a dielectric continuum. This interaction was shown to induce a localization of the wavefunction, and an enhancement of the effective mass of the electron.Landau and Pekar (1948) Shortly afterwards, Fröhlich, Pelzer, and Zienau formulated a quantum-mechanical theory of the polaron, where the interaction with the polarizable continuum was replaced by electron-phonon interactions (EPIs) between the excess electron and the longitudinal optical phonons of the lattice.Fröhlich et al. (1950) Subsequent work by Lee, Low, and Pines,Lee et al. (1953) Fröhlich,Fröhlich (1954) Feynman,Feynman (1955) and others,Gross (1955); Marshall and Mills (1970); Luttinger and Lu (1980); Kholodenko and Freed (1983); Bogolubov (2014) focused on determining accurate solutions of the Fröhlich polaron Hamiltonian for various strengths of the EPI. More recent work includes accurate numerical investigations of the Fröhlich Hamiltonian using the diagrammatic Monte Carlo method,Prokof’ev and Svistunov (1998) path-integral Monte Carlo,Titantah et al. (2001) and the renormalization group approach.Grusdt (2016) For comprehensive and up-to-date reviews of this vast research area we refer the reader to Refs. Alexandrov and Devreese, 2010; Grusdt and Demler, 2016; Devreese, .

Despite the successes of these model solutions and the growing interest in applying these techniques to novel areas such as ultracold atoms,Tempere et al. (2009); Rath and Schmidt (2013); Grusdt (2016) the Fröhlich Hamiltonian describes a highly-idealized model system, and does not contain enough information to begin a quantitative and predictive study of polarons in real solids. In fact, this model considers the coupling of an electron to a dispersionless longitudinal optical phonon, but in most materials of practical interest the EPI is far more complex. For example halide perovskites such as CH3NH3PbI3 exhibit multi-phonon Fröhlich coupling,Sendner et al. (2016); Schlipf et al. (2018); Poncé et al. (2019) and transition metal oxides such as TiO2 exhibit anisotropic effective masses.Verdi et al. (2017) Furthermore in many situations the EPI involves both long-range and short-range effects, which are not well captured by the two limiting scenarios investigated by FröhlichFröhlich et al. (1950) and Holstein Holstein (1959). In order to mitigate these drawbacks, considerable effort is being devoted to expanding the scope of model Hamiltonians to additional EPI mechanisms.Vlietinck et al. (2015) In our view what is still missing in this area is a unified approach to the polaron problem, where the EPI mechanisms and parameters are obtained from first principles, without making a priori assumptions.

An obvious candidate for beginning to develop an ab initio theory of polarons is density-functional theory (DFT). However, DFT studies of polarons also carry some limitations. Since the calculations are performed by adding or removing an electron in a supercell, the computational cost restricts the systems that can be investigated to small- and intermediate-size polarons (i.e. supercells containing up to a few thousand atoms).Spreafico and VandeVondele (2014) This limitation makes it difficult to investigate systems with interesting long-range Fröhlich EPIs.Verdi and Giustino (2015) On top of these computational challenges, standard DFT calculations suffer from the self-interaction error,Perdew and Zunger (1981) and this can be critical in the study of polarons. Several promising attempts at circumventing this problem have been made, ranging from using Hubbard-corrected DFTSetvin et al. (2014); Himmetoglu et al. (2014), to hybrid functionals,Setvin et al. (2014); Himmetoglu et al. (2014); Kokott et al. (2018) and specialized self-interaction correction (SIC) schemes.Sadigh et al. (2015) Even though it is reasonable to expect that these technical challenges will be overcome in the future, DFT calculations based on supercell calculations offer limited physical insight into the EPI mechanisms that drive polaron formation. As a result, it is difficult to establish a link between such calculations and more advanced many-body solvers for model Hamiltonians.

The goal of the present study is to make ab initio DFT calculations of polarons more accessible and more systematic, and to lay the groundwork for linking these calculations with advanced polaron solvers based on model Hamiltonians. To this aim we reformulate the calculation of polaron energies and wavefunctions using DFT and supercells into a nonlinear eigenvalue problem. The ingredients of this nonlinear problem are DFT quantities that are obtained exclusively from calculations in the crystal unit cell, namely electron bands, phonon dispersions, and electron-phonon matrix elements; the method does not require explicit supercell calculations. Our present approach is similar in spirit to the study of excitons via the Bethe-Salpeter equations:(Rohlfing and Louie, 2000; Bokdam et al., 2016) as in the exciton problem, we write the polaron wavefunction as a superposition of Kohn-Sham (KS) states, and we seek to determine the expansion coefficients in this basis. This is achieved by performing a variational minimization, and the resulting ‘polaron equations’ are found to be closely related to the Landau-Pekar theory. The key approximations involved in our approach are those of harmonic phonons and linear electron-phonon coupling, as in the original Fröhlich model and in the vast majority of modern many-body investigations of polarons. We illustrate the capability of this new theoretical and computational framework by discussing applications to the large electron polaron in LiF, the small hole polaron in the same compound, and the small electron polaron in Li2O2. For these test cases we report polaron formation energies and excitation energies, wavefunctions, and atomic displacement profiles, and we analyze the underlying EPI mechanisms in each case. We also discuss a self-interaction scheme that eliminates the need for Hubbard corrections or hybrid functionals. A preliminary account of this work was given in Ref. Sio et al., 2019.

The manuscript is organized as follows. In Sec. II we review the classic Landau-Pekar model.Landau (1933); Pekar (1946); Alexandrov and Devreese (2010) In Sec. III we develop our formalism. We start from the derivation of the polaron equations in Sec. III.1, we discuss the formation energy and the excitation energy in Sec. III.2, and we recast the problem in the basis of KS states and vibrational eigenmodes in Sec. III.3. In Sec. III.4 we obtain the atomic displacement patterns associated with the polaron, and in Sec. III.5 we provide useful expressions for the polaron energy. Section III.6 describes how to visualize the polaron wavefunctions, and Sec. III.7 established the formal link between the present approach and the Landau-Pekar theory described in Sec. II. In Sec. IV we discuss the SIC employed in this work, and how it relates to the polaron equations derived in Sec. III.1. The technical details of our implementation and the computational setup for the calculations are described in Sec. V. In particular we give details of all DFT calculations (Sec. V.1), of the nonlinear eigenvalue solver (Sec. V.2), and basic information on each of the compounds considered (Sec. V.3). In Sec. VI we illustrate our results. First we validate our SIC against previous work using $\alpha$ -quartz as a test case (Sec. VI.1). Then we discuss the dependence of the polaron energies on supercell size and compare with previous work and SIC calculations in Sec. VI.2. We show polaron wavefunctions and lattice distortions in Sec. VI.3, and we compare our results with explicit supercell calculations. The spectral decomposition of the polaron into KS states and normal modes is presented in Sec. VI.4. In Sec. VII we discuss possible future work to link the present formalism with advanced many-body approaches for model Hamiltonians, and in Sec. VIII we draw our conclusions.

II The Landau-Pekar model

In this section we summarize the original derivation of the Landau-Pekar (LP) model,Landau (1933); Pekar (1946) since this model provides a very useful starting point to understand our ab initio approach described in Sec. III.

The LP model is a simple yet powerful framework for studying a single electron added to a polar insulator. The key assumption of this model is that the electron wavefunction extends over spatial dimensions spanning many crystal unit cells. As a consequence, the atomistic details of the crystal are neglected; the interaction of the added electron with the valence manifold is described via the effective-mass approximation and thus only enters the kinetic energy; the interaction of this electron with the ionic lattice is described via a continuum electrostatic model. The total energy of the LP model is written as:

[TABLE]

where $\psi({\bf r})$ is the wavefunction of the added electron, ${\bf E}({\bf r})$ is the self-consistent electric field and ${\bf D}({\bf r})$ is the electric displacement field. The first term on the r.h.s. of Eq. (1) represents the band energy of the extra electron, and includes electron-electron interactions via the conduction band effective mass $m^{*}$ . The second term represents the total electrostatic energy of the dielectric.Jackson (1998)

The displacement field ${\bf D}$ is related to the density of free carriers, and therefore to the wavefunction of the excess electron, by the relation $\nabla\cdot{\bf D}=-e|\psi({\bf r})|^{2}$ ( $e$ is the electron charge), or equivalently:

[TABLE]

The displacement field is also related to the self-consistent electric field via ${\bf D}=\epsilon_{0}\epsilon^{0}{\bf E}$ , where $\epsilon_{0}$ is the vacuum permittivity and $\epsilon^{0}$ is the static dielectric constant. By replacing Eq. (2) into (1) we obtain the total electrostatic energy:

[TABLE]

In this expression the electric field ${\bf E}$ includes contributions from both the electronic screening and the lattice screening. Since the electronic screening energy is already accounted for in the band structure term in Eq. (1), we need to subtract this contribution from Eq. (3). The electronic-only contribution is simply obtained by evaluating Eq. (3) with the ionic screening turned off, i.e. by using the high-frequency (electronic) permittivity $\epsilon^{\infty}$ instead of the static (electronic and ionic) permittivity $\epsilon^{0}$ . After removing this contribution the electrostatic energy reads:

[TABLE]

By defining $1/\kappa=1/\epsilon^{\infty}-1/\epsilon^{0}$ ,Alexandrov and Devreese (2010) we can rewrite Eq. (1) as a functional of the polaron wavefunction:

[TABLE]

The ground-state energy of the LP polaron is found by minimizing this functional with respect to $\psi$ , subject to the constraint provided by the normalization condition $\int\!d{\bf r}\,|\psi({\bf r})|^{2}=1$ . This problem can be solved by transforming it into an unconstrained minimization with the normalization incorporated via the Lagrange multiplier $\varepsilon$ :

[TABLE]

By setting to zero the two functional derivatives $\delta E_{\rm LP}^{\prime}/\delta\psi^{*}$ and $\delta E_{\rm LP}^{\prime}/\delta\varepsilon$ one obtains a Schrödinger-type eigenvalue problem for $\psi$ :

[TABLE]

Here the eigenvalue $\varepsilon$ carries the meaning of an energy, but it is not the total energy of the polaron. This is seen by projecting Eq. (7) onto $\psi^{*}$ and comparing with Eq. (5):

[TABLE]

Equation (7) provides an intuitive understanding of the nature of polaron self-trapping in the LP model. Let us consider for example a normalized trial wavefunction $\psi({\bf r})=(\pi r_{\rm p}^{3})^{-1/2}\exp(-|{\bf r}|/r_{\rm p})$ .Alexandrov and Devreese (2010) Using this trial function, it is evident that that minimization of the kinetic energy term in Eq. (7) favors delocalization (larger $r_{\rm p}$ ), while the minimization of the Coulomb term favors localization (smaller $r_{\rm p}$ ). The polaron size $r_{\rm p}$ results from a trade-off between these competing effects. By replacing the above exponential ansatz in Eq. (5), one obtains a simple estimate for the energy as a function of the polaron size $r_{\rm p}$ :

[TABLE]

The minimum of this function is given by:

[TABLE]

where $a_{0}$ denotes the Bohr radius and $m_{e}$ is the free electron mass. By replacing Eq. (11) inside Eq. (10) we find the ground-state energy:

[TABLE]

with $E_{\rm Ha}$ being the Hartree energy. Furthermore, by using Eq. (9) we obtain the polaron eigenvalue:Alexandrov and Devreese (2010)

[TABLE]

The energy given by Eq. (12) can also be expressed by using the standard polaron coupling constant $\alpha$ , which is defined as:

[TABLE]

where $\omega_{\text{LO}}$ is the characteristic frequency of longitudinal optical phonons. By combining Eqs. (12) and (14) one obtains the standard result:

[TABLE]

which is very close to the original variational solution by Pekar.Pekar (1946) Much work has been done to improve on the simple exponential ansatz employed in this brief overview of the LP model. However, apart from obtaining a more accurate prefactor in front of the term ${\alpha}^{2}\,\hbar\omega_{\text{LO}}$ in Eq. (15), these improvements do not change the qualitative features of the solution. This is a consequence of the fact that Eq. (5) can be written in a scale-invariant form by defining $\psi({\bf r})=a^{-3/2}\phi({\bf r}/a)$ with $a=\kappa\,a_{0}/(m^{*}/m_{e})$ , so that:

[TABLE]

subject to the normalization condition $\int\!d{\bf r}\,|\phi({\bf r})|^{2}=1$ . The direct numerical solutionMiyake (1975) of Eq. (16) yields a wavefunction which is very close to the original variational result found by Pekar using a modified exponential.Pekar (1946); Alexandrov and Devreese (2010); Devreese and Alexandrov (2009); Miyake (1975)

From Eq. (12) we see that the formation of a localized polaron is only possible when $\epsilon^{0}>\epsilon^{\infty}$ , that is in polar crystals. This leaves out those polarons that can form in non-polar semiconductors. In addition, since the electron-phonon coupling mechanism is related to the ionic dielectric response, i.e. to the long-range Fröhlich potential generated by lattice distortions, the LP model also leaves out acoustic and piezo-acoustic polarons. Further limitations of the model are that it assumes an isotropic dielectric, and does not take into account the atomistic nature of the crystal lattice. In Ref. Devreese and Alexandrov, 2009 it was pointed out that the LP model is essentially never valid, because it relies on the assumption of large polarons in order to use continuum electrostatics, but its results tend to be accurate in the regime of strong coupling, that is for small polarons, in contrast with the starting hypothesis. The LP model is said to describe strong-coupling polarons because the energy given by Eq. (15) is almost the same as that obtained in the strong-coupling limit of the Feynman theory, $-\alpha^{2}\,\hbar\omega/3\pi$ .Feynman (1955)

In the following section we show how the essential physics of the LP model can be retained by moving to an ab initio formalism based on density-functional theory, and that most of the intrinsic limitations of the model can be overcome in this new framework.

III Polarons in density-functional theory

III.1 Derivation of the polaron equations

In order to develop an ab initio theory of polarons, we take the view that standard density-functional theory (DFT) implementations contain most of the essential physics, and can serve as a useful starting point. The modification to remove the self-interaction error in standard DFT will be discussed in Sec. IV.

DFT already incorporates the physics of the LP model: if we add an electron in an otherwise empty conduction band of a semiconductor or insulator, the ions experience an additional force that causes them to screen the extra charge. This notion is well established, and has been exploited in several investigations of small polarons, i.e. polarons with a spatial extension corresponding to one or few atomic orbitals.d’Avezac et al. (2005); Feng et al. (2013); Kokott et al. (2018)

The main limitation of such direct calculations is that only small polarons can be investigated, because intermediate-size and large polarons would require prohibitively time-consuming calculations with supercells containing many thousands of atoms. Another limitation is that with direct calculations it is not possible to analyze the individual contributions to the polaron formation, for example which phonons are responsible for the self-trapping, and which electrons participate in the polaron wavefunction. Lastly, direct calculations are very sensitive to the choice of the DFT exchange and correlation functional, mostly due to the self interaction error, making it very challenging to obtain reliable polaron formation energies.

To overcome these limitations it is desirable to formulate an ab initio theory of polarons which does not require large supercell calculations, and where the individual contributions to the polaron energy and wavefunctions are easily recognizable. In the following we propose a new framework to address these challenges.

We start by writing the DFT total energy of a semiconducting or insulating crystal, with the valence bands fully occupied and the conduction bands empty. We consider a Born-von Karman supercell of the crystal, containing $N_{p}$ unit cells of volume $\Omega$ . We follow the notation of Ref. Giustino, 2017 and use ${\bm{\tau}}_{\kappa p}$ and $\tau_{\kappa p{\alpha}}$ to indicate the position and Cartesian coordinates of the atom $\kappa$ in the unit cell $p$ along the Cartesian direction ${\alpha}$ , respectively. The atom $\kappa$ has a charge $e\,Z_{\kappa}$ ; we shall write the equations with an all-electron implementation in mind; the transposition to a pseudopotential formalism is obvious. The KS states have wavefunctions $\psi_{n{\bf k}}({\bf r})$ with energies $\varepsilon_{n{\bf k}}$ , where $n$ is the band index and ${\bf k}$ the wavevector. The wavefunctions are normalized in the supercell, and we have $N_{p}$ ${\bf k}$ -points on a uniform grid. With this notation the electron density reads $n=n^{\uparrow}+n^{\downarrow}$ , with $n^{\uparrow}({\bf r})=n^{\downarrow}({\bf r})=\sum_{v{\bf k}}|\psi_{v{\bf k}}({\bf r})|^{2}$ and the subscript $v$ running over all occupied states. The system is assumed to be spin-degenerate in the ground state. The DFT total energy of the entire supercell reads:

[TABLE]

where ${\bf T}$ is a vector of the supercell lattice, and all integrals are evaluated over the supercell. In the last term the contribution from $\kappa p=\kappa^{\prime}p^{\prime}$ is omitted when ${\bf T}=0$ . We now call ${\bm{\tau}}^{0}_{\kappa p}$ the atomic positions at equilibrium in the ground state, so that a general ionic coordinate reads ${\bm{\tau}}_{\kappa p}={\bm{\tau}}^{0}_{\kappa p}+\Delta{\bm{\tau}}_{\kappa p}$ . Similarly, we call $\psi_{v{\bf k}}^{0}$ the wavefunctions obtained with the atoms in the equilibrium positions, and $n^{0}$ the corresponding density. To second order in the displacements $\Delta{\bm{\tau}}_{\kappa p}$ , the total energy in Eq. (III.1) can be written as:

[TABLE]

where $C^{0}_{\kappa{\alpha}p,\kappa^{\prime}{\alpha}^{\prime}p^{\prime}}$ is the usual matrix of interatomic force constants,Baroni et al. (2001); Gonze and Lee (1997); Giustino (2017) evaluated for the ground state. Upon adding an extra electron to the ground state, we fill one conduction state and the system becomes spin polarized. Before proceeding we emphasize that the same reasoning can be made for the case of a hole at the top of the valence bands; the formalism is entirely symmetric in this respect. Let us call the wavefunction of the excess electron $\psi$ , and its associated density $\Delta n=|\psi|^{2}$ . For definiteness we say that this extra electron carries a spin up. We also add a compensating jellium background, $-1/N_{p}\Omega$ , to avoid the Coulomb divergence. The total energy from Eq. (III.1) is modified as follows:

[TABLE]

In order to proceed we make the following key observation: the addition of a single electron to a system of many electrons will modify the electron density only slightly. Indeed, in the limit of very large polaron the extra electron density at any point will be of the order of $(N_{p}\Omega)^{-1}\ll n$ ; in the limit of very small polaron the density will be of the order of $\Omega^{-1}$ in one unit cell, and negligible in the others. Following this argument, in the following we make the approximations that, upon adding one electron, $\Delta n\ll n$ almost everywhere, and as a result the valence wavefunctions $\psi_{v{\bf k}}$ remain unaltered. The latter approximation allows us to expand the exchange and correlation energy as follows:

[TABLE]

By combining Eqs. (III.1)-(III.1) and rearranging we find:

[TABLE]

where $E_{\rm B}$ is a constant term arising from the jellium background. Inside the square brackets in the third and fourth lines of this equation we recognize the KS Hamiltonian ${\hat{H}}_{\rm KS}[n({\bf r}),\{{\bm{\tau}}_{\kappa p}\}]$ associated with the occupied manifold in absence of the excess electron. In analogy with Eq. (III.1), we can rewrite this term by performing a Taylor expansion around the equilibrium atomic coordinates:

[TABLE]

where we use $V_{\rm KS}^{0}$ to indicate the KS self-consistent potential at equilibrium, in the absence of the excess electron. To keep the formalism as simple as possible, we truncate the expansion to first order in $\Delta\tau_{\kappa{\alpha}p}$ . This is the lowest order that admits non-trivial solutions, that is self-trapped polarons.

The fifth and sixth lines of Eq. (III.1) contain the Hartree, exchange, and correlation self-interaction of the excess electron. These are spurious contributions which artificially increase the energy needed to form a polaron, and which tend to delocalize the polaron wavefunctions. For the time being we neglect these terms. In Sec. IV we show that the correct procedure to deal with these terms is to modify the exchange-correlation functional $E_{xc}$ by including suitable SICs. The resulting formalism is robust and mathematically elegant (validation tests are presented in Sec. VI.1).

Now we can combine Eqs. (III.1) and (22) to obtain our final expression for the DFT functional of a polaron. At this point we omit the fifth, sixth, and seventh lines of Eq. (III.1), and we use the short-hand notation ${\hat{H}}_{\rm KS}^{0}$ for ${\hat{H}}_{\rm KS}[n^{0}({\bf r}),\{{\bm{\tau}}^{0}_{\kappa p}\}]$ :

[TABLE]

The functional $E_{\rm p}[\psi,\{\Delta\tau_{\kappa{\alpha}p}\}]$ defined by this equation constitutes the DFT counterpart of the Laundau-Pekar functional in Eq. (9). Also in this case we can take into account the normalization constraint on the wavefunction by introducing the Lagrange multiplier $\varepsilon$ . By setting to zero the derivatives with respect to $\psi^{*}$ and $\Delta\tau_{\kappa{\alpha}p}$ , we find the coupled system of equations:

[TABLE]

This coupled system of equations defines a self-consistent problem in $\psi$ and $\Delta\tau_{\kappa{\alpha}p}$ , whose solution yields the polaron wavefunction and the associated pattern of atomic displacements. In order to emphasize the analogy with the Landau-Pekar polaron discussed in Sec. II, it is convenient to replace Eq. (III.1) inside (III.1). The result is:

[TABLE]

having defined the ‘polaron kernel’ $K^{0}({\bf r},{\bf r}^{\prime})$ as:

[TABLE]

In this form the similarity with Eq. (7) is evident: the KS Hamiltonian in Eq. (26) is the counterpart of the kinetic energy with the band effective mass in the LP model, while the kernel is the counterpart of the self-trapping potential. We will elaborate on this analogy in Sec. III.7.

III.2 The formation energy of a polaron and the meaning of the polaron eigenvalue

As in the case of the LP model, the eigenvalue $\varepsilon$ appearing in Eq. (26) does not correspond to the energy of the polaron. To see this it is convenient to define the polaron formation energy $\Delta E_{f}$ as the energy required to trap a conduction band state into a localized polaron:

[TABLE]

Here $E_{\rm p}$ is the functional defined by Eq. (III.1). This definition yields the energy gained by the system when a delocalized conduction electron becomes self-trapped, and allows us to separate the energetics of the polaron formation from that of the electron addition into the conduction band of the insulator/semiconductor with the ions in the equilibrium positions. By using Eqs. (III.1), (III.1) and (27) in this expression we find:

[TABLE]

where $\varepsilon_{\rm CBM}$ is the KS eigenvalue of the conduction band bottom. Similarly, we can obtain an expression for the Lagrange multiplier $\varepsilon$ in Eq. (26) by projecting onto $\psi^{*}$ :

[TABLE]

By subtracting the last two equations we obtain a simple relation between the formation energy $\Delta E_{f}$ and the eigenvalue $\varepsilon$ :

[TABLE]

This result shows that the Lagrange multiplier contains a double counting of the Coulomb energy, which has to be removed in order to obtain the formation energy. This is analogous to the relation between the DFT total energy and the sum of the band eigenvalues.Giustino (2014)

By using Eqs. (III.1) and (27) we can rewrite Eq. (31) as follows:

[TABLE]

This expression for the formation energy can be interpreted in the context of Franck-Condon principle: the difference $\varepsilon_{\rm CBM}-\varepsilon$ can be thought of as the energy required for an ultrafast excitation to promote the electron from the polaron state to a band state at the bottom of the conduction manifold, while the ions are still in the distorted polaron state; the sum on the r.h.s. then corresponds to energy released by the distorted lattice upon relaxation. The same interpretation is often discussed in relation to the LP model.(Devreese and Alexandrov, 2009)

III.3 Polaron equations in the basis of Kohn-Sham states and phonon modes

For practical ab initio calculations it is convenient to recast the equations derived in Sec. III.2 in a reciprocal space formulation. Since the KS states in the ground state form a complete basis, we can expand the polaron wavefunction as:

[TABLE]

where the summation is restricted to the unoccupied (conduction) states since we are assuming that the valence band manifold remains unchanged. From the normalization of the KS states $\psi_{n{\bf k}}$ and the polaron wavefunction $\psi$ it follows:

[TABLE]

Now we replace Eq. (33) inside Eq. (26) and project both sides on a KS state. To carry out the algebra it is useful to keep in mind the standard relations between the electron-phonon matrix elements, the interatomic force constants, and the vibrational eigenmodes:Giustino (2017)

[TABLE]

Here $e_{\kappa{\alpha},\nu}({\bf q})$ denotes orthonormal vibrational modes for the wavevector ${\bf q}$ and branch $\nu$ , with frequency $\omega_{{\bf q}\nu}$ . $M_{\kappa}$ is the mass of the $\kappa$ -atom, ${\bf R}_{p}$ is a vector of the direct lattice of the crystal unit cell. The integral is over the supercell, and $g_{mn\nu}({\bf k},{\bf q})$ is the matrix element for the scattering of an electron $\psi_{n{\bf k}}$ into $\psi_{m{\bf k}+{\bf q}}$ via the phonon ${\bf q}\nu$ ; it has dimensions of an energy. By combining Eqs. (26)-(27) and (33)-(36) we arrive at the self-consistent eigenvalue problem:

[TABLE]

The operator on the left-hand side of Eq. (37) is Hermitian. This can be verified after noting that from Eq. (38) we have $B^{*}_{{\bf q}\nu}=B_{-{\bf q}+{\bf G},\nu}$ , where ${\bf G}$ is a reciprocal lattice vector that folds $-{\bf q}$ back into the first Brillouin zone (possibly ${\bf G}=0$ ). The periodicity of $B_{{\bf q}\nu}$ is inherited from the choice of a periodic gauge for both the KS states and phonon modes. Furthermore, by taking the complex conjugate of Eq. (37) and using $g^{*}_{mn\nu}({\bf k},{\bf q})=g_{mn\nu}(-{\bf k}+{\bf G},-{\bf q})$ from time-reversal symmetry, it can be seen that if $A_{n{\bf k}}$ is a solution vector, then also $A^{*}_{n,-{\bf k}+{\bf G}}$ is a solution for the same eigenvalue. This implies that, apart from a non-essential phase, $A_{n,-{\bf k}+{\bf G}}=A^{*}_{n{\bf k}}$ . By using this property in the expansion Eq. (33) we see that the polaron wavefunction $\psi$ has to be real-valued.

Equations (37) and (38) constitute the central result of this manuscript. They allow us to calculate the polaron wavefunction without resorting to supercell calculations, but only starting from standard ingredients of DFT calculations in the unit cell, such as KS states, phonons, and electron-phonon matrix elements.Baroni et al. (2001); Giustino (2017)

III.4 Lattice distortion in the polaronic ground state

The polaron eigenvector $A_{n{\bf k}}$ obtained from the solution of Eqs. (37), (38) can be used to find the atomic displacements in the polaron ground state. To this aim we replace Eqs. (33)-(36) and (38) inside Eq. (III.1). After some manipulations we obtain:

[TABLE]

Here we can see that the quantity $B_{{\bf q}\nu}$ has the physical meaning of the amplitude of the phonon mode ${\bf q}\nu$ which contributes to the atomic displacement $\Delta\tau_{\kappa{\alpha}p}$ . As in the case of the electron wavefunction in the previous section, it is easy to verify that the atomic displacements $\Delta\tau_{\kappa{\alpha}p}$ are real-valued as a result of time-reversal symmetry, $B^{*}_{{\bf q}\nu}=B_{-{\bf q}+{\bf G},\nu}$ . By inverting Eq. (39) we also find that $B_{{\bf q}\nu}$ fulfils the sum rule:

[TABLE]

where the r.h.s. can be interpreted as a measure of the lattice distortion.

III.5 Formation energy in the basis of Kohn-Sham states and phonon modes

In analogy with Eq. (39) we can derive the formation energy in terms of the eigenvector $A_{n{\bf k}}$ . To this aim we combine Eq. (29) with Eqs. (33)-(36) and (38). The result is:

[TABLE]

or equivalently, using Eq. (31):

[TABLE]

The formation energy in Eq. (41) is composed of one term associated with the electron part of the polaron, described by $A_{n{\bf k}}$ , and one term associated with the phonon part, described by $B_{{\bf q}\nu}$ . By comparing Eqs. (42) and (32) we see that $|B_{{\bf q}\nu}|^{2}\hbar\omega_{{\bf q}\nu}$ represents the contribution of every vibrational mode to the elastic energy of the polaron. Therefore it is natural to interpret $|B_{{\bf q}\nu}|^{2}$ as the number of phonons in each mode participating to the polaron. This heuristic interpretation can be placed on more rigorous ground by moving from a classical to a quantum-mechanical description of the ionic coordinates, and by performing a Bogoliubov transformation.Bogolubov (2014) For now we limit ourselves to emphasize that in DFT calculations the nuclei are described in the adiabatic and classical approximation, therefore we do not strictly have phonon quanta in our formalism. By introducing the spectral functions:

[TABLE]

Eq. (41) is recast as:

[TABLE]

From these relations we see that the spectral functions $A^{2}(E)$ and $B^{2}(E)$ play a similar role in the polaron problem as the Eliashberg function in the theory of superconductors.(Allen and Mitrović, 1982) In Sec. VI.4 we will show that these functions can be used to identify the EPI mechanisms leading to the formation of polarons.

III.6 Visualization of the polaron wavefunction

In order to visualize the polaron wavefunction $\psi$ in Eq. (33), it is convenient to resort to a Wannier function representation. Using the standard notation introduced in Ref. Marzari et al., 2012, each KS state can be expanded in a basis of maximally-localized Wannier functions as follows:

[TABLE]

where ${\rm w}_{m}({\bf r})$ is a Wannier function in the unit cell at the origin of the reference frame, normalized in the supercell, and $U^{\dagger}_{mn{\bf k}}$ is the unitary matrix that generates the smooth Bloch gauge. By combining Eq. (33) and Eq. (46) we obtain:

[TABLE]

having defined:

[TABLE]

Equation (47) naturally defines $A_{m}({\bf R}_{p})$ as the envelope function of the polaron, starting from an ab initio perspective. It is interesting to observe that Eq. (48) for the electron part of the polaron is entirely analogous to Eq. (39) for the phonon part. Equation (48) is also useful for practical calculations, especially in combination with Wannier-Fourier interpolation of the electron-phonon matrix elements, as we will show in Sec. VI.3.

It should be noted that the use of Eq. (47) requires some care: the KS wavefunctions employed to determine $A_{n{\bf k}}$ from Eqs. (37) and (38) must be the same as those employed to construct maximally-localized Wannier functions, i.e. the matrix $U^{\dagger}_{mn{\bf k}}$ required in Eq. (48). Failure to do so would result in the introduction of spurious phases and the calculation of an incorrect envelope function.

If the Wannier functions are real, and the wavefunctions $\psi_{n{\bf k}}$ fulfil time-reversal symmetry ( $\psi_{n,-{\bf k}}=\psi^{*}_{n{\bf k}}$ , this is not automatically guaranteed in ab initio calculations), then it follows that $U_{mn,-{\bf k}}=U_{mn{\bf k}}^{*}$ . Combined with Eq. (48), these properties imply that also the envelope functions $A_{m}({\bf R}_{p})$ will be real-valued.

By combining Eqs. (48) and (34) we obtain the normalization condition on the envelope function:

[TABLE]

where we used the property that $U_{mn{\bf k}}$ is a unitary matrix.

III.7 Link with the Landau-Pekar model

We now show that, under suitable approximations, the ab initio polaron equations Eqs. (37)-(38) reduce precisely to the LP model discussed in Sec. II.

To this aim we consider a model system with only one conduction band with effective mass $m^{*}$ , one dispersionless phonon mode with frequency $\omega_{\text{LO}}$ , and electron-phonon coupling given by the Fröhlich interaction. The electron-phonon matrix element $g(q)$ is given by:(Verdi and Giustino, 2015; Giustino, 2017, 2019)

[TABLE]

This expression is valid for an isotropic crystal with a single infrared-active phonon. By replacing Eqs. (48) and (50) inside Eqs. (37)-(38), after some algebra we obtain:

[TABLE]

where we omitted the subscript $p$ from ${\bf R}_{p}$ for notational simplicity, and the gradient is with respect to ${\bf R}$ .

In the limit of dense Brillouin-zone sampling, i.e. supercell of infinite size, we can replace the summation over ${\bf q}$ by an integral using $N_{p}^{-1}\sum_{\bf q}=\Omega_{\rm BZ}^{-1}\int_{\rm BZ}d{\bf q}$ . Using this replacement and carrying out the integral, Eq. (III.7) becomes:

[TABLE]

We can now transform the summation over the lattice vectors into an integral, by regarding ${\bf R}$ as a continuous variable and using the substitution $\Omega\sum_{\bf R}=\int d{\bf R}$ :

[TABLE]

By comparing this result with Eq. (7), we see that the envelope function $\Omega^{-1/2}A({\bf r})$ coincides with the solution $\psi({\bf r})$ of the LP model. Therefore, in the case of single-band and single-phonon isotropic systems with Fröhlich electron-phonon coupling, there exists a direct and unambiguous link between the LP model and first-principles calculations of polarons.

IV Quadratic self-interaction correction for polarons

As anticipated in Sec. III, the fifth and sixth lines of Eq. (III.1) contain Hartree and exchange-correlation self-interaction energy of the polaron wavefunction. These terms are a DFT artefact and in a more accurate many-body picture the excess electron should not interact with itself. The practical consequence of having these terms is that they prevent electron self-trapping. For example it is immediate to see that the Hartree term always decreases the formation energy of the polaron. As we show in Sec. VI.1, we confirmed by direct calculations that we are unable to obtain stable self-trapped polarons in the presence of these spurious self-interactions. This behavior is also well documented in the literature.(d’Avezac et al., 2005; Sadigh et al., 2015)

In order to remove the polaron self-interaction terms in Eq. (III.1), we introduce a modified DFT functional with SIC as follows:

[TABLE]

where $E[n_{\uparrow}+\Delta n,n_{\downarrow}]$ is a standard DFT functional, as in Eq. (III.1). The term $E_{\rm H}$ in this equation indicates the Hartree energy functional, and $\Delta n_{\rm B}=(N_{p}\Omega)^{-1}$ is the compensating jellium background. The form of the functional $E^{\rm SIC}$ is chosen in such a way as to cancel exactly the Hartree self-interaction of the polaron, and to cancel the exchange-correlation self-interaction up to third order in the polaron density $\Delta n=|\psi|^{2}$ . In fact, upon functional differentiation of the exchange-correlation terms in Eq. (IV) we find:

[TABLE]

which corresponds precisely to the functional in Eq. (III.1) with the fifth and sixth lines removed. The present analysis demonstrates that, not only our starting functional [as defined by the first four lines of Eq. (III.1)] is physically motivated, but also it can be derived from a simple self-interaction-free DFT functional, as given by Eq. (IV). This is particularly useful for benchmarking our formalism against direct calculations in large supercells.

In order to generate KS equations starting from Eq. (IV), we evaluate the functional derivatives with respect to $\psi_{v{\bf k}{\uparrow}}$ , $\psi_{v{\bf k}{\downarrow}}$ , and $\psi$ . As a reminder we have $n^{\uparrow}=\sum_{v{\bf k}}|\psi_{v{\bf k}{\uparrow}}|^{2}$ , $n^{\downarrow}=\sum_{v{\bf k}}|\psi_{v{\bf k}{\downarrow}}|^{2}$ , and $\Delta n=|\psi|^{2}$ . The total density is $n=n^{\uparrow}+\Delta n+n^{\downarrow}$ , the spin-up density is $n^{\uparrow}+\Delta n$ , and the spin-down density is $n^{\downarrow}$ . We find the following modified KS Hamiltonians for spin-up valence electrons ( ${\hat{H}}^{\rm SIC}_{v{\uparrow}}$ ), spin-down valence electrons ( ${\hat{H}}^{\rm SIC}_{v{\downarrow}}$ ), and the polaron wavefunction ( ${\hat{H}}^{\rm SIC}_{\rm pol}$ ):

[TABLE]

where the Hartree potential $V_{\rm H}$ and the exchange-correlation potentials $V_{xc}^{{\uparrow},{\downarrow}}$ are defined in the usual way. In order to avoid false minima which are typically encountered in self-interaction corrected DFT,Goedecker and Umrigar (1997); d’Avezac et al. (2005) we follow the method of Ref. d’Avezac et al., 2005 and choose to perform a constrained total energy minimization with the constraint $\psi_{v{\bf k}{\uparrow}}=\psi_{v{\bf k}{\downarrow}}$ . The added advantage of this choice is that it is fully consistent with the assumptions that we used in Sec. III.1 to derive the polaron equations.

Our functional $E^{\rm SIC}$ in Eq. (IV) is similar, albeit not identical, to the SIC proposed in Ref. d’Avezac et al., 2005. In that work the authors studied the self-trapping of holes in ${\alpha}$ -quartz, by using a damped Car-Parrinello minimization of the total energy. Using the present notation, their functional reads:

[TABLE]

By comparing this expression with Eq. (IV), we see that the Hartree self-interaction is removed in a similar way in both approaches. The difference lies in the exchange-correlation self-interaction: by expanding $E_{xc}[n_{\uparrow}+\Delta n,n_{\downarrow}]$ in Eq. (IV) using the functional derivative, we see that the polaron does not experience the exchange-correlation interaction with the valence electrons. This can lead to artificially large band gaps. By applying the SIC to quadratic order in $\Delta n$ both for the Hartree and for the exchange-correlation contributions via Eq. (IV), the polaron experiences the usual exchange-correlation interaction with the valence electrons, and band gaps remain unaffected.

The correction provided by Eq. (IV) is easy to implement in DFT schemes which perform a direct minimization of the energy functional, and requires minimal changes to existing codes.Giannozzi et al. (2017)

As we show in Sec.VI.2, the functional $E^{\rm SIC}$ defined by Eq. (IV) overcomes the delocalization problem of DFT, and correctly yields localized polaron wavefunctions in polar materials. Importantly, this method does not require the tuning of Hubbard corrections in DFT+ $U$ or the mixing parameter $\alpha$ in hybrid functional calculations, since the self-interaction error is removed from the outset without introducing additional parameters.

V Implementation and computational setup

V.1 Density-functional theory calculations

In order to demonstrate the theory developed in Sec. III we perform DFT calculations using planewaves and pseudopotentials, as implemented in the Quantum ESPRESSO materials simulation suite,Giannozzi et al. (2009) together with the wannier90Mostofi et al. (2014) and EPWPoncé et al. (2016) codes. The polaron equations described in Sec. III.3 are implemented in a modified version of the EPW code, and the visualization of the polaron wavefunctions as described in Sec. III.6 is performed using a modified version of the wannier90 code and VESTA for visualization.Momma and Izumi (2011) We use the generalized gradient approximation to DFT of Perdew, Burke, and Ernzerhof (PBE),Perdew et al. (1996) and optimized norm-conserving Vanderbilt (ONCV) pseudopotentials,Hamann (2013) with planewaves kinetic energy cutoffs of 150 Ry, 105 Ry, and 70 Ry for LiF, Li2O2, and $\alpha$ -SiO2, respectively. In the ground-state calculations we sample the Brillouin zone with $\Gamma$ -centered uniform meshes of size $12\times 12\times 12$ and $8\times 8\times 8$ for LiF and Li2O2 respectively, while $\alpha$ -SiO2 is sampled at $\Gamma$ . Lattice vectors and internal coordinates are optimized using this setup before proceeding to calculate polarons. Equations (37) and (38) require the evaluation of KS energies, phonon energies, and electron-phonon matrix elements on dense uniform grids. To this aim we employ Wannier-Fourier interpolation,Giustino (2017); Marzari et al. (2012); Yates et al. (2007) as implemented in wannier90 and EPW. In order to validate our approach against explicit supercell calculations, we consider two systems, Al-doped $\alpha$ -SiO2 and Li2O2, and we perform self-interaction corrected Car-Parrinello calculations using the CP code(Laasonen et al., 1993) of Quantum ESPRESSO. The SIC scheme implemented in CP was developed in Ref. d’Avezac et al., 2005, and corresponds to the functional in Eq. (IV). To implement the functional in Eq. (IV) we made minor modifications to the existing code.

V.2 Solution of the polaron equations

In order to solve Eqs. (37)-(38) we rewrite Eq. (37) more conveniently as follows:

[TABLE]

with

[TABLE]

In this form it is clear that the solution of Eq. (60) can be obtained using standard numerical eigensolvers. In order to start the procedure we initialize the vector of coefficients $A_{n{\bf k}}$ using a Gaussian lineshape centered at the band minimum. From this starting guess we proceed to construct the vector of coefficients $B_{{\bf q}\nu}$ using Eq. (38). At this point we can set up the Hamiltonian matrix of Eq. (61) and proceed to the solution of the eigenvalue problem in Eq. (60). The lowest-energy eigenvector $A_{n{\bf k}}$ is used again in Eq. (38) and the whole procedure is repeated until convergence in the polaron formation energy as given by Eq. (42). In all calculations we employ an energy convergence threshold of 0.1 meV.

The ${\bf k}$ -point grid employed in Eq. (60) defines the equivalent Born-von Kárman (BvK) supercell hosting the polaron. For example a ${\bf k}$ -point grid $10\times 10\times 10$ corresponds to calculating the polaron wavefunction, the corresponding atomic displacements, and the energetics in an equivalent $10\times 10\times 10$ supercell. Since we need information on both $A_{n{\bf k}}$ and $B_{{\bf q}\nu}$ , we use the same uniform and $\Gamma$ -centered grid for ${\bf k}$ -points and ${\bf q}$ -points. When ${\bf k}+{\bf q}$ falls outside of the initial grid, we use the periodic gauge and set $A_{n{\bf k}+{\bf q}}=A_{n{\bf k}+{\bf q}+{\bf G}}$ , with ${\bf G}$ a reciprocal lattice vector that folds ${\bf k}+{\bf q}$ inside the original grid. This procedure is necessary to guarantee that the solution vector $A_{n{\bf k}}$ fulfils time-reversal symmetry, see discussion after Eq. (38).

In the case of large polarons dominated by the Fröhlich coupling, the electron-phonon matrix elements exhibit a singularity at ${\bf q}=0$ .Verdi et al. (2017) As a result the solution vectors $A_{n{\bf k}}$ tend to have significant weight only in the vicinity of the band extrema. This is the case of the electron polaron in LiF, for example, as discussed in Sec. VI.3. In these situations one needs relatively fine ${\bf k}$ - and ${\bf q}$ -point meshes, but most grid points do not contribute to the calculations; to reduce computational cost we use fine grids but we restrict the Hamiltonian $H_{n{\bf k},n^{\prime}{\bf k}^{\prime}}$ to an inner grid of ${\bf k},\,{\bf k}^{\prime}$ -points near the band edges. We then increase the size of the inner grid to check for convergence.

Since in the present formalism we study a localized charge distribution in a supercell, the solutions of the eigenvalue problem in Eq. (60) contain a spurious interaction energy between the polaron and its periodic images. The same situation is also found in the study of charged defects in periodic supercells. In order to eliminate this spurious energy we employ the standard Makov-Payne correction.Makov and Payne (1995) To this aim we perform calculations for increasing size of the equivalent BvK supercell, and then extrapolate the formation energy and the polaron eigenvalue using the asymptotic trend $L^{-1}$ , where $L$ is the linear size of the equivalent supercell. For example, in the case of the large electron polaron in LiF, we use ${\bf k}$ -point grids up to $33\times 33\times 33$ . In order to cope with such large grids we use a distributed-memory eigensolver from the ScaLAPACK library.(Choi et al., 1996)

One last aspect that requires some care is the gauge arbitrariness of the electron-phonon matrix elements $g_{mn\nu}({\bf k},{\bf q})$ that one obtains from Wannier-Fourier interpolation. The arbitrariness relates to the facts that (i) the unitary rotation $U^{\dagger}_{mn{\bf k}}$ used in Eq. (46) to go from the smooth Bloch basis to the basis of KS states is determined from a separate diagonalization at each ${\bf k}$ -point; (ii) the analogous rotation required for the atomic displacements in Eq. (35), that is the matrix of vibrational eigenvectors $e_{\kappa{\alpha},\nu}({\bf q})$ , is also obtained by a separate diagonalization at each ${\bf q}$ . These diagonalizations have two drawbacks: 1) they do not satisfy the time-reversal symmetry requirements; 2) they may lead to different results on different architectures, and even on the same architecture but in different runs. This issue is particularly delicate because, in order to save memory, we recompute the matrix elements $g_{mn\nu}({\bf k},{\bf q})$ at each self-consistent iteration. Our benchmarks indicate that this issue can lead to (relatively small) numerical noise in the calculated formation energies, that shows up as small oscillations in plots of $\Delta E_{f}$ vs. $L$ . In order to eliminate these fluctuations we enforce a predetermined choice for the gauge of eigenmodes and wavefunctions, in the same spirit as in Sec. V C of Ref. Giustino et al., 2007. First we rotate $U^{\dagger}_{mn{\bf k}}$ and $e_{\kappa{\alpha},\nu}({\bf q})$ so that the first nonzero component is real and positive. Then we check for degeneracies in the electron or phonon energies, and we break these degeneracies using a fictitious perturbation. To this aim we set up a Hermitian perturbation $P_{mn{\bf k}}$ that spans the Bloch subspace. We fill this matrix by using a sequence of small prime numbers as matrix elements. Then, we diagonalize $P^{\prime}_{ij{\bf k}}=\sum_{mn}U_{im{\bf k}}P_{mn{\bf k}}U^{\dagger}_{nj{\bf k}}$ , where the indices $i$ , $j$ are restricted to the degenerate subspaces. By denoting with $V_{ij{\bf k}}$ the unitary matrix that diagonalizes $P^{\prime}_{ij{\bf k}}$ , we construct $U^{\prime}_{jn{\bf k}}=\sum_{p}V^{\dagger}_{jp{\bf k}}U_{pn{\bf k}}$ . Finally we obtain interpolated KS states and energies from $U^{\prime}_{mn{\bf k}}$ instead of $U_{mn{\bf k}}$ . If the energies are all non-degenerate, then we are done. If there are still degeneracies we repeat the operation by filling the perturbation matrix using the next prime numbers in the sequence. We note that in the subsequent polaron calculation the KS energies remain unaffected by this fictitious perturbation, as from this procedure we only retain the unitary rotation $V_{mn{\bf k}}$ ; formally this is equivalent to taking the limit of a vanishingly small perturbation. We operate similarly for the vibrational eigenmodes. This procedure guarantees that all KS states and phonon eigenmodes carry a unique gauge across successive iterations in the same calculation, or across different machines. We note that the present procedure is simpler and more efficient than the one used in Ref. Giustino et al., 2007, since here we only perform operations on very small matrices and we do not calculate explicitly the matrix elements of the fictitious perturbation using planewaves, unlike in Ref. Giustino et al., 2007. Finally we enforce time-reversal symmetry by making sure that only half of the ${\bf k}$ -points are effectively employed in Eq. (60), using a simple mapping.

V.3 Test systems

V.3.1 Lithium fluoride

The first test system that we consider is a prototypical ionic insulator, lithium fluoride. LiF crystallizes in a simple rock-salt structure and is known to be a wide-gap insulator. As the other members of the alkali halides family, LiF hosts color centers with interesting optoelectronic properties.Baldacchini and Montereali (2001) In particular, the VK center is a self-trapped hole polaron which has been studied in a number of investigations.Karsai et al. (2014); Pederson and Klein (1988); Mallia et al. (2001); Williams and Song (1990); Shluger and Stoneham (1993); Gavartin et al. (2003); Schirmer (2006); Ramo et al. (2007); Sadigh et al. (2015); Kokott et al. (2018) On the other hand, the electron polaron is expected to be a large polaron and has been investigated only by means of model Hamiltonians.(Inoue et al., 1970) Here we perform calculations for both the small hole polaron and the large electron polaron of LiF, and we show that our formalism correctly describes both limits on the same footing.

Fig. 1(a) shows a supercell of LiF (the unit cell consists of only two atoms). Our optimized lattice parameter is $a=4.058$ Å, in agreement with the experimental value $a=4.02$ Å.(Dressler et al., 1987) Our calculated KS band gap is $E_{\rm g}=8.9$ eV, and underestimates the experimental optical gap of 14.2 eV as expected.Piacentini et al. (1976) We find isotropic electron and hole effective masses of $0.88\,m_{e}$ and $3.73\,m_{e}$ , respectively. The electron mass is in good agreement with the reported values 0.78-1.2 $m_{e}$ ,(Page and Hygh, 1970; Iadonisi, 1984) but we could not find previous values for the hole mass. The calculated relative dielectric constants are $\epsilon^{0}=10.62$ and $\epsilon^{\infty}=2.04$ , to be compared with the measured values $\epsilon^{0}=9.04$ and $\epsilon^{\infty}=1.92$ .(Andeen et al., 1970; Levin and Offenbacher, 1960) The highest computed phonon energy is $\hbar\omega_{\rm max}=77.0$ meV, close to the experimental value $\hbar\omega_{\rm max}=80$ meV,(Dolling et al., 1968) and the Fröhlich coupling constant for the electrons is $\alpha=4.92$ .

V.3.2 Lithium peroxide

The second test system that we consider is lithium peroxide, Li2O2. This compound crystallizes in a layered hexagonal structure, with space group P63/mmc. The structure can be though of as consisting of LiO2 layers intercalated by Li planes as seen on Fig. 1(b). Li2O2 forms in battery cathodes during the operation of lithium-air batteries, and can degrade the battery performance through its low electrical conductivity.Ong et al. (2012); Feng et al. (2013); Viswanathan et al. (2011) It has been proposed that the low conductivity of this compound originates from a strong electron-phonon coupling, and several studies reported the calculation of small electron polarons using a supercell approach.Kang et al. (2012); Garcia-Lastra et al. (2013); Feng et al. (2013) In Ref. Feng et al., 2013 it was shown that a small electron polaron can form in a $3\times 3\times 2$ supercell, without the use of Hubbard corrections or hybrid functionals. This finding suggests that Li2O2 supports strongly bound small polarons. Furthermore Li2O2 is highly anisotropic. These properties make lithium peroxide an ideal candidate for testing the limits of our approach.

Figure 1(b) illustrates a supercell of this compound: in each unit cell we have four Li and four O atoms, and the optimized lattice parameters are $a=3.153$ Å and $c/a=2.433$ . Using these parameters, we calculate a band gap of $E_{\rm g}=2.05$ eV, electron effective masses in- and out-of-plane of $2.19\,m_{e}$ and $0.42\,m_{e}$ , respectively, and in-/out-of-plane relative dielectric constants $\epsilon^{\infty}=2.73/3.94$ and $\epsilon^{0}=8.36/14.20$ . The highest phonon energy that we calculate is $\hbar\omega_{\rm max}=98.2$ meV, and the in-/out-of-plane Fröhlich coupling constantsVerdi et al. (2017) are $\alpha=4.74/1.54$ . Our calculations are in good agreement with previous ones yielding $a=3.17$ Å,(Chan et al., 2011) $c/a=2.43$ ,(Chan et al., 2011) $E_{\rm g}=$ 3.6-4.8 eV,(Lastra et al., 2011) and $\hbar\omega_{\rm max}=99.3$ meV.(Lau et al., 2015)

V.3.3 $\alpha$ -Quartz

Since in Sec. IV we introduced a modified version of the SIC for polarons of Ref. d’Avezac et al., 2005, it is important to check that our functional yields results in line with previous work.Lægsgaard and Stokbro (2001); d’Avezac et al. (2005); et al (2015) To this aim we repeat previous calculations on Al-doped $\alpha$ -quartz, and we compare the localization of the trapped hole with the existing results.Lægsgaard and Stokbro (2001); d’Avezac et al. (2005); et al (2015)

Figure 1(c) illustrates the optimized structure of the primitive unit cell of $\alpha$ -SiO2, in the absence of the Al defect. Our optimized lattice parameters are $a=4.913$ Å and $c/a=1.100$ , in agreement with the experimental values $a=4.904$ Å and $c/a=1.100$ .(Smith and Alexander, 1963) We model the defect-induced localized hole using a supercell with 72 atoms, with one Si atom replaced by Al. The lattice parameters of the supercell are not re-optimized after this substitution. The defective structure is shown in Fig. 1(d).

VI Results

VI.1 Validation of the SIC functional

In order to validate the SIC functional proposed in Eq. (IV), we consider an Al defect in $\alpha$ -quartz, following previous work.Lægsgaard and Stokbro (2001); d’Avezac et al. (2005); et al (2015) A calculation without SIC yields a delocalized electronic state and no lattice distortion, as shown in Fig. 1(c). However, when we include the SIC of Eq. (IV), we obtain a localized solution, as seen in Fig. 1(d). This result is in agreement with previous work based on the unrestricted Hartree-Fock methodLægsgaard and Stokbro (2001) and other SIC schemes.d’Avezac et al. (2005); et al (2015)

To be more quantitative we also calculate the bond lengths around the defect site. Using the labeling convention set out in Fig. 1(d), our SIC functional yields the bond lengths 1.946/1.696/1.708/1.699 Å for the bonds Al-O(1) to Al-O(4), respectively. These values compare well with previous findings, with r.m.s. deviations of only 0.006 Å.d’Avezac et al. (2005); Lægsgaard and Stokbro (2001) We can conclude that our modified SIC functional yields the same geometry as in previous work. We also confirmed that the isosurface of the hole density [Fig. 1(d)] looks similar to what previously reported.Lægsgaard and Stokbro (2001); d’Avezac et al. (2005); et al (2015)

To avoid possible ambiguity, we emphasize that the localized hole in Al-doped $\alpha$ -SiO2 does not constitute a polaron strictly speaking. In fact the localization and self-trapping are driven by the crystal potential of Al, and do not reflect a spontaneous breaking of translational symmetry as in the cases of Li2O2 and LiF discussed below. Accordingly, in this case we do not compare with our linear-response polaron formalism, which addresses spontaneous symmetry breaking in perfect crystals.

As a second test we check the geometry of the small electron polaron in Li2O2. In this case previous work finds an electron localized around two nearest-neighbor O atoms in the LiO2 plane, see for example Fig. 5(a). The O-O distance in the pristine lattice is 1.54 Å (1.51 Å in Ref. Kang et al., 2012). Using hybrid functional calculations, Ref. Kang et al., 2012 reported that this distance increases to 2.20 Å upon adding one excess electron in a supercell with 192 atoms. Our calculations using the SIC functional of Eq. (IV) also yield an electron localized around the same pair of oxygen atoms, as shown in Fig. 5(e). The resulting O-O distance is 2.25 Å, only 2 % larger than in Ref. Kang et al., 2012.

VI.2 Polaron energy vs. supercell size and

Mott transition

In Fig. 2(a),(b) we compare the formation energy and polaron eigenvalue obtained via our Eqs. (37)-(38) (brown symbols) with the results of the continuous LP model described in Sec. II (orange lines). We focus on the large electron polaron in LiF for definiteness, and for calculations using the LP model we take $\kappa=2.53$ and $m^{*}/m_{e}=0.88$ from Sec. V.3.1. Fig. 2(a) shows that the polaron formation energy scales with the supercell size as $L^{-1}$ , as expected. In the LP model, the formation energy extrapolated at infinite supercell size is $\Delta E_{f}=-210$ meV. In contrast, when we solve the ab initio polaron equations, we find the extrapolated energy $\Delta E_{f}=-231$ meV. The difference between the LP model and our method relates to the fact that in our ab initio calculations the bands, phonons, and electron-phonon matrix elements are not as simple as in the LP model. To demonstrate this point, we show in the same figure a calculation carried out using our method, but after replacing the band structure by a parabolic band with the same effective mass as in the LP mode, the phonon dispersion relations by a single, non-dispersive longitudinal-optical (LO) mode, and retaining only the long-range component of the electron-phonon matrix element. This “trimmed” version of the calculation reproduces the LP model exactly, as shown by the blue symbols in Fig. 2(a). A comparison of the ab initio electron-phonon matrix element for this mode and the long-range Fröhlich component used in the LP model is shown in Fig. 2(c); here we see that the LP model overestimates the strength of the coupling at short range. Besides validating our method, the present comparison highlights the fact that even in a compound as simple as LiF the electron-phonon coupling is more complex than a simple Fröhlich interaction, and that details of band structures, phonon dispersions, and matrix elements are to be taken into account for predictive calculations.

In Fig. 2(b) we report the polaron eigenvalue $\varepsilon$ measured from the conduction-band bottom as a function of supercell size $L$ . In this case the Makov-Payne extrapolation to infinite supercell size yields $\varepsilon=-800$ meV with our method (brown symbols), and $\varepsilon=-609$ meV with the LP model (orange line). As for the formation energy, also in the case of the polaron eigenvalue we fully recover the LP result when we consider a parabolic band and a non-dispersive LO phonon [blue symbols in Fig. 2(b)]. It is interesting to note that in LiF the ratio between the polaron eigenvalue and its formation energy is 3.46; this ratio is close to the prediction of the LP model in Sec. II, which yields $\varepsilon/\Delta E_{f}=3$ using the exponential ansatz in Eqs. (12) and (13); note that in the LP model $\Delta E_{f}$ coincides with the energy $E_{\rm LP}$ .

In Fig. 2(a) we also see that when the LiF supercell is smaller than $12\times 12\times 12$ unit cells there is no localized polaron solution, i.e. $\Delta E_{f}=0$ . The existence of a critical supercell size for polaron formation can be explained in terms of the Mott transition: when the periodic replicas of the polaron are too close, they form an extended wavefunction, and the corresponding lattice deformation is too shallow to trap an electron. In this case the excess electron becomes fully delocalized. Therefore a localized polaron can only form when the overlap between nearest-neighbor replicas is negligible. This is the same criterion used by Mott to identify the metal-insulator transition.(Mott, 1968) Using the Mott criterion in the standard form $r_{\rm p}\,n_{\rm c}^{1/3}=0.26$ ,(Mott, 1968) with $n_{\rm c}$ being the critical density and $r_{\rm p}$ from Eq. (11), we can estimate a critical density:

[TABLE]

We note that this is only a crude estimate since it is based on a simplified solution to the Pekar polaron problem. Using $\kappa=2.53$ and $m^{*}/m_{e}=0.88$ from Sec. V.3.1 inside Eq. (62), we obtain $n_{\rm c}=15\cdot 10^{19}$ cm*-3*. This estimate is of the same order of magnitude as our calculation in Fig. 2(a), which places the transition between supercells of size $11^{3}$ and $12^{3}$ , that is $n_{\rm c}=4\cdot 10^{19}$ cm*-3*.

In Figs. 2(d),(e) we show our calculated formation energy and eigenvalue for the hole polaron in LiF, respectively. In this case we obtain self-trapped polarons already for supercells as small as $2\times 2\times 2$ unit cells. This result is consistent with Eq. (62) and the heavy effective mass of the valence bands. In fact if we use $m^{*}/m_{e}=3.73$ from Sec. V.3.1 we obtain $n_{\rm c}=1.15\cdot 10^{22}$ cm*-3*, which corresponds approximately to one electron in a $2\times 2\times 2$ supercell. The Makov-Payne extrapolation yields $\Delta E_{f}=-1.98$ eV and $\varepsilon=+4.76$ eV (measured from the valence-band top), therefore we are in the presence of a strongly bound polaron. We note that the polaron eigenvalue is positive because the localized hole state lies above the valence band, but the energy is still within the KS gap of this system ( $E_{\rm g}=8.9$ eV from Sec. V.3.1). For comparison with explicit DFT calculations, in Fig. 2(d) we also report the formation energies calculated in Ref. Sadigh et al., 2015 using SIC or hybrid functionals (filled squares). These calculations correspond to $5\times 5\times 5$ supercells and are in very good agreement with our results.

In Figs. 2(f),(g) we show the eigenvalues and formation energies of the electron polaron in Li2O2, respectively, as a function of supercell size. Using $\kappa=4.05$ and $m^{*}/m_{e}=2.19$ from Sec. V.3.2 inside Eq. (62), we obtain the estimate $n_{\rm c}=6\cdot 10^{20}$ cm*-3*; therefore we expect to see localized solutions already for supercells as small as $5\times 5\times 1$ unit cells. Our calculations indeed find polarons already at $2\times 2\times 1$ , see Fig. 2(f). In this case the formation energy and eigenvalue extrapolated at infinite supercell size (brown symbols) are $\Delta E_{f}=-4.87$ eV and $\varepsilon=-10.98$ eV, respectively. The polaron eigenvalue falls within a band gap in the valence manifold. In this figure we also compare to direct calculations using the SIC functional of Sec. IV. The formation energy in our explicit DFT SIC calculation (cyan symbols) is close to the results of our linear response polaron equations (green symbols), and exhibits the same trend as a function of supercell size; the DFT SIC calculation for the largest supercell considered here ( $7\times 7\times 1$ ) yields $-4.13$ eV, to be compared to our linear-response result $-4.70$ eV. The deviation of $\sim$ 14% can be attributed to the fact that our formalism neglects the response of the valence electrons to the localized lattice distortion caused by this strongly bound polaron, or to the fact that the approximation of linear electron-phonon coupling becomes inaccurate for such large atomic displacements. In the same figure we also show that a DFT calculation without SIC fails to predict the correct formation energy, and tends to delocalize the polaron when increasing the supercell size (orange symbols). We note that a study of the scaling of the polaron energy vs. supercell size has not yet been reported in the literature.

VI.3 Polaron wavefunctions

Figure 3 shows the electron polaron in LiF, obtained by solving Eqs. (37) and (38). The electron wavefunction is computed using Eq. (46), and the atomic displacements are obtained via Eq. (39). Figures 3(a),(b) show the electron wavefunction as an isosurface and as a contour plot in a plane cutting through the center, respectively. When we compare with the delocalized electronic state shown in Fig. 1(a) we see that now we are in the presence of a localized, but large, polaron. To quantify the spatial extension of the polaron we plot the electron density along a line going through the polaron center, see Fig. 3(c). The envelope of the resulting function resembles a Gaussian; if we define the polaron size as the full width at half maximum we obtain $2r_{\rm p}=9.0$ Å. Therefore this polaron extends approximately over two unit cells of the LiF lattice. In Figs. 3(d),(e) we report the atomic displacements associated with this polaron, using a vectorial representation and a one-dimensional cut, respectively. We note that the displacements of the F anions are consistently larger than those of the Li cations. This may appear counterintuitive because the F anions are heavier, but it is consistent with the fact that the electron charge is mostly concentrated around the Li cations due to the character of the conduction band bottom, therefore the F atoms experience the strongest electrostatic force. The largest atomic displacement is 0.02 Å, and this value is only 1% of the Li-F bond length. Therefore we are well within the remit of the harmonic approximation.

Figure 4 shows the hole polaron in LiF, namely (a) the wavefunction isosurface, (b) the same function as a contour plot, (c) a line-cut of the wavefunction, (d) the atomic displacements as arrows, and (e) the size of the displacements along a line passing through the center. Here we are in the presence of a small hole polaron, which is expected given the much heavier masses of the holes as compared to electrons in this system and the much narrowed valence band width [Fig. 6(a)]. As it will be discussed in Sec. VI.4 the hole polaron in LiF is much closer to a Holstein polaron than a Fröhlich polaron. In this case the wavefunction extends over approximately two atomic orbitals, and from Fig. 4(c) we obtain the full width at half maximum $2r_{\rm p}=0.97$ Å. Accordingly only a few atoms undergo significant displacements, as shown in Fig. 4(e). The largest displacements are found for Li cations, in line with the fact that the wavefunctions at the top of the valence band are localized around the anions, which therefore experience a weaker force. We obtain a maximum displacement of 0.44 Å, which is approximately 20% of the bond length of Li-F (2.03 Å). It is remarkable that our formalism is able to capture this limit of very small polaron, even when the atomic displacements are definitely beyond the harmonic regime. We believe that the reason why the formalism works is this extreme case is that the distortion caused by the small polaron affects only a small portion of the crystal; therefore the use of bands, phonons, and electron-phonon matrix elements calculated for the undistorted unit cell does not lead to significant inaccuracies.

Figure 5 shows the electron polaron in Li2O2. In this case we compare three calculations: in (a)-(d) we show the small electron polaron obtained with our formalism; in (e)-(h) we show an explicit supercell calculation using the SIC functional of Sec. IV; in (i)-(l) we show the results of a standard DFT calculation without SIC. In each column we report, from top to bottom: the electron wavefunction, its one-dimensional cut across the polaron center, the atomic displacements as arrows, and the one-dimensional cut of these displacements. The first observation to be made is that standard DFT yields a two-dimensional electronic state that is localized along the $c$ axis [Fig. 5(i)] but delocalized in the $ab$ plane. The SIC leads to electron localization also in the plane, and this is observed both in the explicit supercell calculation [Fig. 5(e)] and using our method [Fig. 5(a)]. The explicit supercell calculation yields a slightly asymmetric wavefunction, while our method gives a perfectly symmetric polaron. This is an artifact of the constraint $\psi_{v{\bf k}{\uparrow}}=\psi_{v{\bf k}{\downarrow}}$ used in the SIC calculation, in fact previous work using hybrid functionals and supercells also found a symmetric polaron,Kang et al. (2012) as in our method. In this case the polaron is also very small, and extends over two adjacent O- $p$ orbitals. From Fig. 5(b) we determine $2r_{\rm p}=0.63$ Å, and from Fig. 5(d) we find the largest displacement to be 0.38 Å. Also in this case the atomic displacements are large ( $\sim$ 25% of the O-O distance, 1.51 Å), but our method correctly predicts the distorted structure as the explicit DFT SIC calculation. This success is remarkable if we consider that our theory is based on small displacements and linear electron-phonon interactions.

VI.4 Spectral decomposition of the polaron

In Figs. 6-8 we present the spectral decomposition of the polaron wavefunctions and atomic displacements in terms of the underling band states. Figure 6(a) shows the electronic weights $|A_{n{\bf k}}|^{2}$ plotted on top of the band structure for the case of the large electron polaron in LiF. The corresponding electronic density of states and spectral function $A^{2}(E)$ are shown in Fig. 6(b). These plots are meant to mimic similar representations of the excitons calculated via the Bethe-Salpeter method.Rohlfing and Louie (2000); Bokdam et al. (2016); Onida et al. (1995) The large electron polaron is dominated by states at the bottom of the conduction band; as expected the localization in reciprocal space mirrors the delocalization in real space. The corresponding atomic displacements are resolved using the weights $|B_{{\bf q}\nu}|^{2}$ in Fig. 6(c), and the density of vibrational states and phonon spectral function $B^{2}(E)$ are given in Fig. 6(d). We see that the polaron is dominated by the LO mode at 76 meV, as expected from earlier work on Fröhlich polarons in halide salts,(Inoue et al., 1970) but we also have smaller contributions coming from the acoustic branches. By integrating $B^{2}(E)$ in Fig. 6(d) we can quantify the roles of these phonons: we find that the LO mode accounts for 62% of the polaronic distortion, while the the transverse acoustic (TA) mode is responsible for the remaining 38%. To the best of our knowledge this is the first study where the importance of TA phonons in the polaron physics of LiF has been identified. This is precisely the kind of new insight that our method can offer.

Figure 7 shows the spectral decomposition of the small hole polaron in LiF. Here the main observation is that the entire highest valence band contributes to the polaronic wavefunction, with smaller contributions from lower-lying bands. This behavior suggests that the small hole polaron of LiF is closer to the Holstein limit(Holstein, 1959) than the Fröhlich limit.(Fröhlich et al., 1950) We emphasize that, at variance with model Hamiltonians, our approach is parameter-free, therefore it captures seamlessly both limits. Also in this case the LO phonon branch dominates the coupling, however now it is the entire branch that contributes, as shown in Fig. 7(d). This observation is in line with the fact that the small polaron requires short-range electron-phonon coupling, therefore the range of important phonon wavevectors must extend away from the zone center. By integrating the spectral function $B^{2}(E)$ we find that the LO branch contributes 78% of the coupling in this case.

Finally Fig. 8 shows the spectral decomposition for the small electron polaron in Li2O2. In this case the two lowest conduction bands contribute equally to the polaron wavefunctions. This is a case where one-band model Hamiltonians such as the models of Fröhlich and Feynman would not be sufficient to capture the essential features of the problem. We also point out that the higher-lying conduction bands do not contribute appreciably to the polaron wavefunction, as it can be seen from the spectral density $A^{2}(E)$ in Fig. 8(b). The largest contribution to the lattice distortion comes from TO modes around 96 meV, which account for 64% of the coupling.

VII Future developments

Having established the potential of our new methodology in Sec. VI, it is worth looking ahead to anticipate possible future developments.

One immediate development would be to explore excited polaron states beyond the ground state. This will require us to solve Eq. (37) for higher-lying electronic eigenstates instead of retaining only the ground state. The study of electronic excitations at fixed lattice distortion could be useful to understand the response of polarons to ultrafast optical excitations for example.

Another important development would be to go beyond the adiabatic and classical approximations. Indeed, the main limitation of the present approach is that the starting point of the formalism is the DFT energy functional in Eq. (III.1). In this functional the electronic structure is described as a parametric function of classical ionic coordinates, therefore both DFT calculations of polarons and our formalism are both similar in spirit to the Landau-Pekar polaron model.

Ideally we would want to study this problem using a fully-fledged field-theoretic formulation, as provided for example by the self-consistent Hedin-Baym equations for the coupled electron-phonon system.Giustino (2017) While it may be possible to proceed along this direction, we speculate that it may be easier to start from the present formulation, and upgrade the theory by reinstating from the outset non-adiabatic effects and quantum nuclear fluctuations. For example, we could restart from Eq. (III.1), introduce the quantum kinetic energy of the nuclei, and write the problem in terms of the correlated electron-ion wavefunction $\Psi({\bf r},\{\Delta\tau_{\kappa{\alpha}p}\})$ :

[TABLE]

The advantage of this formulation is that one could focus on a single electron interacting with a phonon bath, because the electron-electron interaction is already captured by the DFT KS Hamiltonian.

Equation (VII) can be reformulated in terms of phonon ladder operators and electron-phonon matrix elements, following steps similar to Sec. III.4. Using the same notation as in Ref. Giustino, 2017, the Hamiltonian inside the square brackets becomes (apart from a constant provided by the zero-point energy):

[TABLE]

where the summations over bands are restricted to conduction or to valence states for electron or hole polarons, respectively. In this equation we do not employ the usual electron field operator because we have only one electron, therefore second quantization only applies to phonons.

Equation (VII) can be considered as the ab initio counterpart of the Fröhlich electron-phonon Hamiltonian.(Fröhlich, 1954) In fact the standard Fröhlich Hamiltonian is recovered by retaining only one parabolic band and considering only one LO phonon branch. This equation suggests a possible route to link the present approach with many-body calculations of model polaron Hamiltonians: (i) for a given system we could identify the most important electronic bands, phonons, and electron-phonon couplings using our spectral decomposition into $A_{n{\bf k}}$ and $B_{{\bf q}\nu}$ ; (ii) we could then simplify Eq. (VII) to retain only the most important contributions; (iii) at this point we could employ advanced many-body techniques for polaron Hamiltonians, such as for example diagrammatic Monte Carlo (DMC) approaches.(Prokof’ev and Svistunov, 1998) In this way one could envision complete first-principles calculations of polarons, where the atomistic details and predictive power of DFT approaches are combined with the wealth of many-body physics of DMC or other field-theoretic techniques.

VIII Summary

In this work we developed a first-principles methodology that enables calculations of polaron energies and wavefunctions without using supercells. Our method employs electronic band structures, phonon dispersion relations, and electron-phonon matrix elements calculated in the crystal unit cell using density-functional theory and density-functional perturbation theory. In our theory we formulate the polaron problem as a variational minimization of a DFT functional including a self-interaction correction for the polaron wavefunction. This strategy leads to a non-linear system of two coupled equations for the electron or hole wavefunction and the associated atomic displacements. We showed that this approach has a mathematical structure similar to the classic Landau-Pekar polaron problem, but in our case the coupling to all phonons, both acoustic and optical, and both short- and long-range, is taken into account.

We applied this method to three test cases, namely the large electron polaron in a halide salt, LiF, the small hole polaron in the same material, and the small electron polaron in a layered metal oxide, Li2O2. In the case of the large polaron we validated our calculation using the continuous Landau-Pekar model; in the case of the small polaron we compared our results with explicit supercell calculations. We observed that our technique describes correctly and accurately both large and small polarons, therefore this method carries general validity across the length scales.

We introduced a spectral analysis of the polaron wavefunction and atomic displacements in order to quantify which electron bands, phonon modes, and electron-phonon couplings play the most important role in the formation of the polaron. This analysis allowed us to identify Fröhlich-type electron polarons in LiF, and Holstein-type polarons in LiF (holes) and Li2O2 (electrons). We anticipated that this type of analysis will be useful to devise model polaron Hamiltonian starting from realistic materials parameters computed from first principles.

We hope that the present work will serve as the basis for future ab initio calculations of polarons in real materials, and it will help combining together the strengths of DFT-type calculations with field-theoretic polaron techniques developed for model Hamiltonians.

Acknowledgements.

This work was supported by a postgraduate scholarship from the Macau SAR Government (W.H.S.), by the Leverhulme Trust (Grant RL-2012-001), the UK Engineering and Physical Sciences Research Council (grants No. EP/L015722/1 and EP/M020517/1), the Graphene Flagship (Horizon 2020 Grant No. 785219 - GrapheneCore2), the University of Oxford Advanced Research Computing (ARC) facility (http://dx.doi.org/810.5281/zenodo.22558), the PRACE-17 resources MareNostrum at BSC-CNS.

Bibliography102

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1Lindemann et al. (1983) G. Lindemann, R. Lassnig, W. Seidenbusch, and E. Gornik, Phys. Rev. B 28 , 4693 (1983) . · doi ↗
2Popp and Murray (1972) R. Popp and R. Murray, J. Phys. Chem. Solids 33 , 601 (1972) . · doi ↗
3Chaikin et al. (1972) P. M. Chaikin, A. F. Garito, and A. J. Heeger, Phys. Rev. B 5 , 4966 (1972) . · doi ↗
4Moser et al. (2013) S. Moser et al. , Phys. Rev. Lett. 110 , 196403 (2013) . · doi ↗
5Cancellieri et al. (2016) C. Cancellieri, A. S. Mishchenko, U. Aschauer, A. Filippetti, C. Faber, O. Barišić, V. Rogalev, T. Schmitt, N. Nagaosa, and V. N. Strocov, Nat. Commun. 7 , 10386 (2016).
6Riley et al. (2018) J. M. Riley, F. Caruso, C. Verdi, L. Duffy, M. D. Watson, L. Bawden, K. Volckaert, G. van der Laan, T. Hesjedal, M. Hoesch, et al. , Nat. Commun. 9 , 2305 (2018).
7Wang et al. (2016) Z. Wang, S. M. Walker, A. Tamai, Y. Wang, Z. Ristic, F. Y. Bruno, A. De La Torre, S. Riccò, N. Plumb, M. Shi, et al. , Nat. Mater. 15 , 835 (2016).
8Chen et al. (2015) C. Chen, J. Avila, E. Frantzeskakis, A. Levy, and M. C. Asensio, Nat. Commun. 6 , 8585 (2015).

TL;DR

Contribution

Findings

Abstract

Peer Reviews

Videos

Ab initio theory of polarons: formalism and applications

Abstract

I Introduction

II The Landau-Pekar model

III Polarons in density-functional theory

III.1 Derivation of the polaron equations

III.2 The formation energy of a polaron and the meaning of the polaron eigenvalue

III.3 Polaron equations in the basis of Kohn-Sham states and phonon modes

III.4 Lattice distortion in the polaronic ground state

III.5 Formation energy in the basis of Kohn-Sham states and phonon modes

III.6 Visualization of the polaron wavefunction

III.7 Link with the Landau-Pekar model

IV Quadratic self-interaction correction for polarons

V Implementation and computational setup

V.1 Density-functional theory calculations

V.2 Solution of the polaron equations

V.3 Test systems

V.3.1 Lithium fluoride

V.3.2 Lithium peroxide

V.3.3 α\alphaα-Quartz

VI Results

VI.1 Validation of the SIC functional

VI.2 Polaron energy vs. supercell size and

VI.3 Polaron wavefunctions

VI.4 Spectral decomposition of the polaron

VII Future developments

VIII Summary

Acknowledgements.

V.3.3 $\alpha$ -Quartz