Field Theory and The Sum-of-Squares for Quantum Systems

M. B. Hastings

arXiv:2302.14006·quant-ph·February 28, 2023

TL;DR

This paper explores the sum-of-squares hierarchy for quantum spin and fermion systems using quantum field theory techniques, and examines the challenges in approximating the SYK model's ground state energy with various methods.

Contribution

It introduces new insights into the sum-of-squares hierarchy for quantum systems and analyzes limitations of classical approximation methods for the SYK model.

Findings

01. Limitations on the Lanczos method starting from Gaussian states

02. Constraints on approximating SYK ground state energy with Gaussian wavefunctions

03. Application of quantum field theory ideas to sum-of-squares hierarchy

Abstract

This is a collection of various results and notes, addressing the sum-of-squares hierarchy for spin and fermion systems using some ideas from quantum field theory, including higher order perturbation theory, critical phenomena, nonlocal coupling in time, and auxiliary field Monte Carlo. This paper should be seen as a sequel to Refs. 1,2. Additionally in this paper, we consider the difficulty of approximating the ground state energy of the Sachdev-Ye-Kitaev (SYK) model using other methods. We provide limitations on the power of the Lanczos method, starting with a Gaussian wavefunction, and on the power of a sum of Gaussian wavefunctions (in this case under an assumption).

Equations (144)

$$M_{ab}=\mathbb{E}[O_{a}^{\dagger}O_{b}],$$

$$H=\sum_{\alpha}\lambda_{\alpha}O_{\alpha}^{\dagger}O_{\alpha}+\lambda,$$

$$O_{\alpha}^{\dagger}O_{\alpha}=A_{\alpha}^{\dagger}A_{\alpha}+B_{\alpha}^{\dagger}B_{\alpha}+i[A_{\alpha},B_{\alpha}].$$

$$H=Z_{1}+Z_{2}+gX_{1}X_{2}.$$

$$H=\frac{1}{2}\Bigl(aX_{1}+ia^{-1}Y_{1}+bX_{2}\Bigr)\Bigl(aX_{1}-ia^{-1}Y_{1}+bX_{2}\Bigr)+1\leftrightarrow 2+\lambda,$$

$$2ab=g,$$

$$\lambda=-2\sqrt{1+g^{2}/4},$$

$$\exp(-\beta\lambda)\Bigl(\exp(-\tau Q)\prod_{a}\exp(-\tau Q_{a}^{2})\Bigr)^{\beta/\tau},$$

$$\exp(-\tau Q_{a}^{2})=(4\pi\tau)^{-1/2}\int\exp(i\phi Q_{a})\exp\Bigl(-\frac{\phi^{2}}{4\tau}\Bigr){\rm d}\phi.$$

$$H=H_{0}+\epsilon\sum_{p=0}^{4}H_{p,4-p},$$

$$H_{0}=\sum_{j}E_{j}\psi_{j}^{\dagger}\psi_{j},$$

$$O=\Psi^{\dagger}_{\vec{u}}\Psi_{\vec{v}},$$

$$\sum_{s=1}^{r}\sum_{t=0}^{r-s}{n\choose s}{n\choose t}.$$

$$H=\sum_{i=1}^{4}\psi_{i}^{\dagger}\psi_{i}+\epsilon(\psi_{1}^{\dagger}\psi_{2}^{\dagger}\psi_{3}^{\dagger}\psi_{4}^{\dagger}+{\rm h.c.}),$$

$$\begin{pmatrix}0&\epsilon\\ \epsilon&4\end{pmatrix},$$

$$u=\frac{\sqrt{4+\epsilon^{2}}-2}{\epsilon}.$$

$$\lambda_{\alpha}=\lambda=\frac{\epsilon}{6u},$$

$$H=\sum_{\alpha}O_{\alpha}^{\dagger}O_{\alpha}+E_{0}(\epsilon)+c(\epsilon)\sum_{i<j}n_{i}(1-n_{j}),$$

$$M=M_{0}+\Delta,$$

$$\Pi_{0}\Bigl(\Delta+\Delta M_{0}(E-1)^{-1}\Delta+\Delta M_{0}(E-1)^{-1}\Delta M_{0}(E-1)^{-1}\Delta+\ldots\Bigr)\Pi_{0}.$$

$$\Pi_{0}\Bigl(\Delta-\Delta M_{0}\Delta+\Delta M_{0}\Delta M_{0}\Delta+\ldots\Bigr)\Pi_{0}$$

$$\Delta_{g}+\Pi_{0}\Delta_{e}\sum_{j=1}^{\infty}(-1)^{j}(M_{0}\Delta_{e})^{j}\Pi_{0}\geq 0.$$

$$\tilde{\psi}_{i}(\epsilon)\equiv U(\epsilon)\psi_{i}U(\epsilon)^{\dagger},$$

$$\tilde{\psi}_{i}(\epsilon)\Psi_{0}(\epsilon)=0.$$

$$\sum_{j}E_{j}\tilde{\psi}_{j}(\epsilon)^{\dagger}\psi_{j}(\epsilon).$$

$$H=J\sum_{<i,j>}\vec{q}_{i}\cdot\vec{q}_{j}+\frac{1}{2}\sum_{i}(\vec{p}_{i})^{2}+\frac{V}{2}\sum_{i}\Bigl(\frac{(\vec{q}_{i})^{2}}{N}-1\Bigr)^{2},$$

$$\frac{\kappa}{2}=V\Bigl(\mathbb{E}\Bigl[\frac{1}{N}(\vec{q}_{j})^{2}\Bigr]-1\Bigr).$$

$$\frac{V}{2}(q_{j}^{2}-1)^{2}\underset{\rm SoS}{\geq}\frac{V}{2}(1-s^{2})+Vq_{j}^{2}(s-1).$$

$$H\underset{\rm SoS}{\geq}H_{\rm Gaussian}+\sum_{j}\frac{V}{2}(1-s^{2}),$$

$$\frac{\kappa}{2}=V(s-1).$$

Taxonomy

Topics: Quantum Chromodynamics and Particle Interactions · Physics of Superconductivity and Magnetism · Theoretical and Computational Physics

Full text

Field Theory and The Sum-of-Squares for Quantum Systems

Matthew B. Hastings

Abstract

This is a collection of various results and notes, addressing the sum-of-squares hierarchy for spin and fermion systems using some ideas from quantum field theory, including higher order perturbation theory, critical phenomena, nonlocal coupling in time, and auxiliary field Monte Carlo. This paper should be seen as a sequel to Ref. Hastings and O’Donnell (2022) and Ref. Hastings (2022). Additionally in this paper, we consider the difficulty of approximating the ground state energy of the Sachdev-Ye-Kitaev (SYK) model using other methods. We provide limitations on the power of the Lanczos method, starting with a Gaussian wavefunction, and on the power of a sum of Gaussian wavefunctions (in this case under an assumption).

I Introduction and Background

The difficulty of simulating quantum systems on a classical computer has been recognized almost since the dawn of digital computers. The world’s first commercially available digital computer was the Ferranti Mark I. Its successor was the Ferranti “Mercury”. In 1964, Bonner and Fisher (1964) used a Ferranti Mercury to simulate a one-dimensional quantum spin chain with up to $11$ sites. In this influential paper, they explicitly described the exponentially growing size of the Hilbert space (even using translation symmetry to reduce the size), and lamented that their “relatively slow” computer restricted the available system sizes. Of course, the importance of this exponential growth must have been understood well before that time, especially to anyone who attempted a hand calculation, but it is remarkable that some of the earliest computers were used to simulate quantum many-body problems.

In this regard, many of the simulation methods used in practice are attempts to find some polynomial time algorithm which gives reasonable answers at least in some regime. For example, perturbation methods work at weak coupling, with a polynomial overhead at any given order of perturbation theory; DMRG and matrix product states work well in one-dimensional systems with low entanglement Schollwöck (2011); and so on.

One particularly intriguing approach is based on the sum-of-squares hierarchy. This method can give rigorous lower bounds on the ground state energy of a quantum system. Any given level of the hierarchy takes a polynomial time, with the order of the polynomial increasing at higher levels of the hierarchy. (Footnote 1: This statement skips over some details. At any given order of the hierarchy, one has a semidefinite program of polynomial size, and commonly it is stated that such programs can be solved in polynomial time up to small additive error. There can be some tricky hidden parameters in these claims (see https://www.cs.cmu.edu/~odonnell/papers/sos-automatizability.pdf, Ref. O’Donnell (2017)), but it seems that those issues cannot arise for the present problem.) See Refs. Goemans and Williamson (1995); Charikar and Wirth (2004) for algorithms for classical systems, while for quantum systems with fermions the method is known as the reduced density matrix method (RDM) Coleman (1963); Erdahl (1978); Percus (1978); Mazziotti and Erdahl (2001); Nakata et al. (2001); Mazziotti (2012); Klyachko (2006). There is also a sum-of-squares hierarchy for qudit systems Helton and McCullough (2004); Navascués et al. (2008); Doherty et al. (2008); Pironio et al. (2010).

In this paper we give several results about the sum-of-squares. These results are largely unrelated to each other, but there is a general theme that they have some relation to field theory methods. In Section II, we discuss auxiliary field quantum Monte Carlo, showing a relation between optimal decompositions of the interaction (from the point of view of the sign problem) and a certain restricted version of the sum-of-squares method. In Section III, we discuss the ability of sum-of-squares to reproduce perturbation theory for fermionic systems; we extend results of Hastings (2022), which showed that second order perturbation theory could be reproduced by a fragment of degree-$6$ sum-of-squares but that it could not be reproduced by degree-$4$ sum-of-squares. We show that for any given order of perturbation theory, there is some order of the sum-of-squares which reproduces it. We do not, however, find the minimal order of sum-of-squares which can reproduce a given order of perturbation theory, and we leave this as an open question. In Section IV, we discuss critical phenomena in the sum-of-squares framework, showing that in several cases the leading order sum-of-squares gives critical exponents which coincide with the large-$N$ $O(N)$ vector model. In Section V, we consider methods related to the sum-of-squares applied to systems which have a nonlocal interaction in time. Finally, in Section VI, we consider the ability of various classical variational methods to approximate the ground state energy of the SYK model Sachdev and Ye (1993); Kitaev (2015). While this section does not involve the sum-of-squares hierarchy directly, there is some relation because in Ref. Hastings and O’Donnell (2022) it was shown that sum-of-squares methods could certify one-sided bounds on the ground state energy within constant factors, with high probability.

I.1 Background

A review of the hierarchy is in Ref. Hastings and O’Donnell (2022). Some of the results extend results in Ref. Hastings (2022). Since this paper is to some extent a sequel to those two papers, we will not give detailed definitions if they are explained there.

To very briefly sketch the idea of the hierarchy, one first chooses some set of operators, $\{O_{a}\}$ . In the so-called “degree $2r$ sum-of-squares”, the set of operators $\{O_{a}\}$ will be the set of monomials of degree at most $r$ in some operators, such as creation and annihilation operators for fermionic systems, or Pauli spin operators for some system of qubits. Given this set of operators, one introduces a matrix, $M$ , with matrix elements given by

$$M_{ab}=\mathbb{E}[O_{a}^{\dagger}O_{b}],$$

where $\mathbb{E}[O^{\dagger}_{a}O_{b}]$ is a pseudo-expectation value. What does “pseudo-expectation value” mean? This means that we impose three conditions on $M_{ab}$ . First, we impose some linear relations, determined by the algebra of operators. That is, if for some $\lambda_{ab}$ we have $\sum_{ab}\lambda_{ba}O^{\dagger}_{a}O_{b}=c$ , for some scalar $c$ , then we impose ${\rm Tr}(\lambda M)=c$ . Examples of these relations include things like $Z^{2}=1$ if $Z$ is a Pauli spin operator, or $\{\gamma_{a},\gamma_{b}\}=2\delta_{a,b}$ if $\gamma_{a},\gamma_{b}$ are Majorana operators. The second condition on $M$ is that $M$ is Hermitian. The third condition is that $M$ is positive semi-definite.

Remark: because of these linear constraints, for degree-$2r$ sum-of-squares we can take the set $\{O_{a}\}$ to be a set which spans the same vector space of operators as the set of all monomials of degree at most $r$ . For example, with Majorana operators, we do not need both $\gamma_{a}\gamma_{b}$ and $\gamma_{b}\gamma_{a}$ , but only need one of them.

Note that given any quantum density matrix $\rho$ , the expectation value $\mathbb{E}[O^{\dagger}_{a}O_{b}]={\rm Tr}(\rho O_{a}^{\dagger}O_{b})$ is such a pseudo-expectation. Such an $M$ is necessarily positive semi-definite because for any operator $O$ , we have ${\rm Tr}(\rho O^{\dagger}O)\geq 0$ . Conversely, if the set of operators $\{O_{a}\}$ is complete in that every operator on the given Hilbert space is a linear combination of operators in the set, then every pseudo-expectation defines some quantum state $\rho$ such that $\mathbb{E}[O^{\dagger}_{a}O_{b}]={\rm Tr}(\rho O_{a}^{\dagger}O_{b})$ , but if the set of operators $\{O_{a}\}$ includes, for example, only operators up to some given degree, then there may be pseudo-expectation values that do not correspond to any quantum density matrix $\rho$ .
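As an illustration of the statement that a genuine density matrix always yields a valid pseudo-expectation, the following minimal sketch builds the matrix $M_{ab}={\rm Tr}(\rho O_{a}^{\dagger}O_{b})$ for two qubits with the degree-$1$ operator set $\{1,X_{1},Y_{1},Z_{1},X_{2},Y_{2},Z_{2}\}$ and checks Hermiticity, positivity, and one family of linear constraints (the diagonal entries equal $1$ because every operator in the set squares to the identity). The helper names (`two_qubit_ops`, `random_density_matrix`, `moment_matrix`) and the random test state are assumptions of this sketch, not part of the paper.

```python
import numpy as np

# Single-qubit Pauli matrices.
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def two_qubit_ops():
    """Operator set {1, X1, Y1, Z1, X2, Y2, Z2} as 4x4 matrices."""
    ops = [np.kron(I2, I2)]
    ops += [np.kron(P, I2) for P in (X, Y, Z)]
    ops += [np.kron(I2, P) for P in (X, Y, Z)]
    return ops


def random_density_matrix(dim, rng):
    """Random positive matrix with unit trace (hypothetical demo helper)."""
    A = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = A @ A.conj().T
    return rho / np.trace(rho).real


def moment_matrix(rho, ops):
    """M_ab = Tr(rho O_a^dagger O_b)."""
    return np.array([[np.trace(rho @ a.conj().T @ b) for b in ops] for a in ops])


rng = np.random.default_rng(0)
ops = two_qubit_ops()
M = moment_matrix(random_density_matrix(4, rng), ops)
min_eig = np.linalg.eigvalsh(M).min()
print("Hermitian:", np.allclose(M, M.conj().T))
print("smallest eigenvalue:", min_eig)
```

The converse direction discussed in the text (a pseudo-expectation that matches no density matrix) needs an incomplete operator set and is not shown here.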

The sum-of-squares hierarchy defines a semi-definite program, called the “primal” problem, by minimizing the pseudo-expectation value of the Hamiltonian subject to these constraints.

It is standard that, given a primal semi-definite program which involves minimizing some quantity subject to constraints, there is a dual problem which involves maximizing some quantity subject to constraints, and the minimum of the primal is greater than or equal to the maximum of the dual. In this particular case, the so-called “duality gap” vanishes, and the minimum of the primal equals the maximum of the dual. The dual problem has a particularly simple explanation in this case. It is equivalent to: given a Hamiltonian $H$ , find a decomposition

$$H=\sum_{\alpha}\lambda_{\alpha}O_{\alpha}^{\dagger}O_{\alpha}+\lambda,$$

where $\lambda$ and $\lambda_{\alpha}$ are non-negative scalars and where the $O_{\alpha}$ are linear combinations of operators in the given set $\{O_{a}\}$ (e.g., polynomials of degree at most $r$ ). For the optimal decomposition, $\lambda$ is equal to the ground state energy, and any such decomposition proves that the ground state energy is $\geq\lambda$ .
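To spell out why any such decomposition certifies a lower bound, here is the one-line argument, written for an arbitrary normalized state $|\psi\rangle$ (the symbol $|\psi\rangle$ is introduced only for this illustration). Taking the expectation value of the decomposition term by term gives

$$\langle\psi|H|\psi\rangle=\lambda+\sum_{\alpha}\lambda_{\alpha}\,\langle\psi|O_{\alpha}^{\dagger}O_{\alpha}|\psi\rangle=\lambda+\sum_{\alpha}\lambda_{\alpha}\,\bigl\|O_{\alpha}|\psi\rangle\bigr\|^{2}\geq\lambda,$$

since each $\lambda_{\alpha}\geq 0$ and each squared norm is non-negative. Equality requires $O_{\alpha}|\psi\rangle=0$ for every $\alpha$ with $\lambda_{\alpha}>0$ , which is the condition used later in the text to guess suitable $O_{\alpha}$ from a known ground state.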

Note that, for example, even if $H$ is a sum of terms which are degree at most $4$ in some variables, one may use $O_{a}$ which are polynomials of higher degree in such a decomposition, so that using the (anti-)commutation relations of the algebra one can show that the result is equal to $H$ . Of course, we may absorb $\lambda_{\alpha}$ into the definition of $O_{\alpha}$ , rescaling $O_{\alpha}\rightarrow\sqrt{\lambda_{\alpha}}O_{\alpha}$ and writing $H=\sum_{\alpha}O_{\alpha}^{\dagger}O_{\alpha}+\lambda$ , but sometimes it is more convenient to write it this way using $\lambda_{\alpha}$ .

I.2 Notation, and Conventions

A remark on norms: given an un-normalized state (i.e., a positive semi-definite matrix), the norm we use, unless we say otherwise, is the $\ell_{1}$ norm. So, given any state $\rho$ , we say “the projection of $\rho$ onto some subspace $S$ is $\leq\ldots$ ” to mean that the trace of the projection of $\rho$ onto $S$ is $\leq\ldots$ .

We use the notation $A\underset{\rm SoS}{\geq}B$ to indicate that $A-B$ is a sum-of-squares, i.e., it is a sum of operators $O_{a}^{\dagger}O_{a}$ . If we are discussing a particular order of sum-of-squares, when we use $\underset{\rm SoS}{\geq}$ we implicitly mean that the sum-of-squares is at most of that given order.

We use computer-science big- $O$ notation $o(n),{\cal O}(n),\ldots$ throughout. We use $n$ to indicate the number of fermionic modes or qubits, following Hastings and O’Donnell (2022). We use $O(N)$ later to denote a particular orthogonal group.

We use $X_{i},Y_{i},Z_{i}$ to denote Pauli operators on a given qubit $i$ in a qubit system. We use $\gamma_{a}$ to denote Majorana operators, $a\in\{1,\ldots,2n\}$ . We also use creation and annihilation operators $\psi^{\dagger}_{a},\psi_{b}$ which obey canonical anti-commutation relations, with $a,b\in\{1,\ldots,n\}$ . Even for fermionic systems which might not obey particle-number conservation, we will see that there are some uses for creation and annihilation operators.

II Auxiliary Field Quantum Monte Carlo and the Importance of Commutators

The sum-of-squares method above considers decomposition of a Hamiltonian $H$ as $H=\sum_{\alpha}O_{\alpha}^{\dagger}O_{\alpha}+\lambda$ for some scalar $\lambda$ . Here, the operator $O_{\alpha}$ may not be normal, meaning that the commutator $[O_{\alpha}^{\dagger},O_{\alpha}]$ might not vanish. Suppose indeed $O_{\alpha}=A_{\alpha}+iB_{\alpha}$ for some Hermitian operators $A_{\alpha},B_{\alpha}$ , with $[A_{\alpha},B_{\alpha}]\neq 0$ . Then

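$$O_{\alpha}^{\dagger}O_{\alpha}=A_{\alpha}^{\dagger}A_{\alpha}+B_{\alpha}^{\dagger}B_{\alpha}+i[A_{\alpha},B_{\alpha}].$$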

The first two terms on the right-hand side are squares of Hermitian operators, but the commutator is not.
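For completeness, the expansion behind this statement can be written out as follows; it uses only $O_{\alpha}=A_{\alpha}+iB_{\alpha}$ with Hermitian $A_{\alpha},B_{\alpha}$ , and the remark about the trace assumes a finite-dimensional Hilbert space.

$$O_{\alpha}^{\dagger}O_{\alpha}=(A_{\alpha}-iB_{\alpha})(A_{\alpha}+iB_{\alpha})=A_{\alpha}^{2}+B_{\alpha}^{2}+i(A_{\alpha}B_{\alpha}-B_{\alpha}A_{\alpha})=A_{\alpha}^{2}+B_{\alpha}^{2}+i[A_{\alpha},B_{\alpha}],$$

$$O_{\alpha}O_{\alpha}^{\dagger}=A_{\alpha}^{2}+B_{\alpha}^{2}-i[A_{\alpha},B_{\alpha}],\qquad[O_{\alpha}^{\dagger},O_{\alpha}]=2i[A_{\alpha},B_{\alpha}].$$

The operator $i[A_{\alpha},B_{\alpha}]$ is Hermitian but traceless, so in finite dimension it cannot be a nonzero positive semi-definite operator, and in particular it is not itself a sum of squares of Hermitian operators.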

Indeed, if we restrict to sum-of-squares of Hermitian operators, so that we decompose $H=\sum_{\alpha}O_{\alpha}^{2}+\lambda$ for Hermitian $O_{\alpha}$ , then the method becomes weaker than if we allow non-Hermitian $O_{\alpha}$ . For example, consider a system with two qubits with Hamiltonian

$$H=Z_{1}+Z_{2}+gX_{1}X_{2}.$$

First consider a decomposition with non-Hermitian $O_{\alpha}$ . Let us consider the following, where $a,b,\lambda$ are some real scalars that we adjust (here we use the standard convention for Pauli matrices that $XY=iZ$ ):

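$$H=\frac{1}{2}\Bigl(aX_{1}+ia^{-1}Y_{1}+bX_{2}\Bigr)\Bigl(aX_{1}-ia^{-1}Y_{1}+bX_{2}\Bigr)+1\leftrightarrow 2+\lambda,$$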

where $1\leftrightarrow 2$ means the same term as the first except with qubits $1$ and $2$ interchanged. Then, we need to take

$$2ab=g,$$

and we have $-\lambda=a^{2}+(1+g^{2}/4)a^{-2}.$ Optimizing over $a$ , we may obtain

$$\lambda=-2\sqrt{1+g^{2}/4},$$

in agreement with the exact ground state.
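The decomposition above can be checked numerically. The sketch below builds $O=aX_{1}-ia^{-1}Y_{1}+bX_{2}$ and its $1\leftrightarrow 2$ partner as $4\times 4$ matrices, fixes $b$ from $2ab=g$ , chooses $a$ to minimize $a^{2}+(1+g^{2}/4)a^{-2}$ (that is, $a^{4}=1+g^{2}/4$ ), and compares the resulting $\lambda$ with the lowest eigenvalue of $H$ . The function names and the test value $g=0.7$ are assumptions made for this illustration.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def op(P, site):
    """Embed the single-qubit operator P on qubit 1 or qubit 2."""
    return np.kron(P, I2) if site == 1 else np.kron(I2, P)


def sos_certificate(g):
    """Return (H, sum of squares, lambda) for H = Z1 + Z2 + g X1 X2."""
    c = 1.0 + g**2 / 4.0
    a = c**0.25            # minimizes a^2 + c / a^2
    b = g / (2.0 * a)      # enforces 2ab = g
    lam = -(a**2 + c / a**2)
    H = op(Z, 1) + op(Z, 2) + g * op(X, 1) @ op(X, 2)
    squares = np.zeros((4, 4), dtype=complex)
    for s, t in ((1, 2), (2, 1)):  # the term and its 1 <-> 2 partner
        O = a * op(X, s) - (1j / a) * op(Y, s) + b * op(X, t)
        squares += 0.5 * O.conj().T @ O
    return H, squares, lam


g = 0.7
H, squares, lam = sos_certificate(g)
ground_energy = np.linalg.eigvalsh(H).min()
print("H == squares + lambda:", np.allclose(H, squares + lam * np.eye(4)))
print("lambda:", lam, " exact ground energy:", ground_energy)
```

Other values of $a$ still give a valid, but weaker, lower bound, since the identity $H=\frac{1}{2}\sum O^{\dagger}O+\lambda$ holds for any $a\neq 0$ once $b$ and $\lambda$ are set as above.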

Remark: this decomposition is almost the same as that used in Hastings (2022) to show that we may reproduce low order perturbation theory with the sum-of-squares, except that here we have allowed a more general decomposition with $a\neq 1$ . If we restrict to $a=1$ , we reproduce low order perturbation theory.

Now suppose instead that we consider only a decomposition of $H$ as a sum of squares of Hermitian operators. Again we consider the degree- $2$ sum-of-squares. The dual formulation is in terms of a pseudo-expectation, $\mathbb{E}[\cdot]$ , as before. However, now rather than requiring that the matrix of pseudo-expectation values $\mathbb{E}[O_{a}^{\dagger}O_{b}]$ be Hermitian and positive semi-definite, we will take all operators $O_{a}$ defining that matrix to be Hermitian, and we will only require that the matrix be Hermitian and that the symmetric part of the matrix be positive semi-definite. Taking the operators $O_{a}$ to be drawn from the set $\{1,X_{1},Y_{1},Z_{1},X_{2},Y_{2},Z_{2}\}$ , we consider the following pseudo-expectation. The diagonal elements of the matrix of pseudo-expectation values are of course all equal to $+1$ . Let $\mathbb{E}[Z_{1}]=\mathbb{E}[Z_{2}]=-1$ ; this of course implies that $\mathbb{E}[1Z_{1}]=\mathbb{E}[Z_{1}1]=\mathbb{E}[1Z_{2}]=\mathbb{E}[Z_{2}1]=-1$ . Assume $g>0$ and let $\mathbb{E}[X_{1}X_{2}]=\mathbb{E}[X_{2}X_{1}]=-1.$ Finally, let all other matrix elements have vanishing real part; i.e., their contribution to the symmetric part of the matrix vanishes. Note that the various linear constraints imposed on pseudo-expectation values by the Pauli commutation relations imply that $\mathbb{E}[X_{1}Y_{1}]=-\mathbb{E}[Y_{1}X_{1}]=i\mathbb{E}[Z_{1}]=-i$ , i.e., the real part of that matrix element vanishes as required. Similarly, $\mathbb{E}[X_{2}Y_{2}]=-\mathbb{E}[Y_{2}X_{2}]=-i$ .

One may verify that this defines a matrix whose symmetric part is positive semi-definite but now we are only able to show that $H\underset{\rm SoS}{\geq}-2-g$ at this order of sum-of-squares.
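A small numerical check of this claim is sketched below. It writes down the real symmetric part of the $7\times 7$ matrix in the operator order $1,X_{1},Y_{1},Z_{1},X_{2},Y_{2},Z_{2}$ . One entry is an assumption of this sketch: the text does not give a value for $\mathbb{E}[Z_{1}Z_{2}]$ , and the sketch sets its real part to $+1$ , because with a vanishing real part the check below finds a negative eigenvalue. The function name `symmetric_part` and the value $g=0.7$ are likewise illustrative.

```python
import numpy as np

# Operator order used for rows and columns.
LABELS = ["1", "X1", "Y1", "Z1", "X2", "Y2", "Z2"]
IDX = {name: k for k, name in enumerate(LABELS)}


def symmetric_part(zz):
    """Real symmetric part of the pseudo-expectation matrix.

    zz is the value assigned to Re E[Z1 Z2]; it is not stated in the text
    and is kept as a parameter (an assumption of this sketch).
    """
    S = np.eye(len(LABELS))  # diagonal entries are all +1

    def put(a, b, value):
        S[IDX[a], IDX[b]] = S[IDX[b], IDX[a]] = value

    put("1", "Z1", -1.0)   # E[Z1] = -1
    put("1", "Z2", -1.0)   # E[Z2] = -1
    put("X1", "X2", -1.0)  # E[X1 X2] = -1
    put("Z1", "Z2", zz)
    return S


g = 0.7
S = symmetric_part(zz=1.0)
min_eig = np.linalg.eigvalsh(S).min()
# Pseudo-expectation of H = Z1 + Z2 + g X1 X2 read off from the matrix.
bound = S[IDX["1"], IDX["Z1"]] + S[IDX["1"], IDX["Z2"]] + g * S[IDX["X1"], IDX["X2"]]
exact = -np.sqrt(4.0 + g**2)
min_eig_without = np.linalg.eigvalsh(symmetric_part(zz=0.0)).min()
print("min eigenvalue (zz=+1):", min_eig)
print("min eigenvalue (zz=0): ", min_eig_without)
print("Hermitian-only bound:", bound, " exact:", exact)
```

For $g>0$ the value $-2-g$ lies strictly below the exact ground state energy $-\sqrt{4+g^{2}}$ , which is the loss of accuracy discussed in the text.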

Given that this formulation of the sum-of-squares using only Hermitian terms in the squares is weaker than the more general formulation, the reader may wonder why it is worth considering. One answer is that it is worth discussing simply to emphasize why we want to use non-Hermitian operators in the squares. However, there is also an interesting relation to auxiliary-field quantum Monte Carlo (AFQMC); see Blankenbecler et al. (1981); Sugiyama and Koonin (1986) for original AFQMC papers, and see later work (too much to summarize here) for various methods of improving the sign problem. Suppose we have a Hamiltonian $H$ for a fermion system which is a sum of quadratic and quartic terms. Then, suppose we find a decomposition $H=Q+\sum_{a}Q_{a}^{2}+\lambda$ where the operators $Q,Q_{a}$ are Hermitian and are quadratic in the fermion fields; note that $Q$ is a quadratic term in the Hamiltonian while $Q_{a}^{2}$ includes quadratic and quartic terms. We choose $Q$ to be a sum-of-squares of linears in the fermion operators so $Q$ is positive semi-definite. Then, we can implement an auxiliary-field Monte Carlo in imaginary time. To do this, one may use Trotter-Suzuki to approximate the imaginary time evolution $\exp(-\beta H)$ by a product

$$\exp(-\beta\lambda)\Bigl(\exp(-\tau Q)\prod_{a}\exp(-\tau Q_{a}^{2})\Bigr)^{\beta/\tau},$$

where $\tau$ is a small timestep in imaginary time, and where the product $\prod_{a}\exp(-\tau Q_{a}^{2})$ is taken in some arbitrary order. Then, use a Hubbard-Stratonovich decoupling

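$$\exp(-\tau Q_{a}^{2})=(4\pi\tau)^{-1/2}\int\exp(i\phi Q_{a})\exp\Bigl(-\frac{\phi^{2}}{4\tau}\Bigr){\rm d}\phi.$$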

This turns the evolution in imaginary time to the evolution under a quadratic Hamiltonian coupled to a fluctuating field. In this case, there is a sign problem, but the magnitude of the sign problem depends on the difference between $-\lambda$ and the exact ground state energy. That is, if the ground state energy is $E_{0}$ , then at large inverse temperature $\beta$ , we have ${\rm tr}(\exp(-\beta\sum_{a}Q_{a}^{2}))\rightarrow\exp(-\beta(E_{0}-\lambda))$ . The operator $\exp(i\phi Q_{a})$ is unitary for any $\phi$ , and so the decay $\exp(-\beta(E_{0}-\lambda))$ is due to a combination of the fluctuating sign due to averaging over different fluctuating auxiliary fields as well as any additional decay due to $\exp(-\tau Q)$ . To say it differently, the weight of any configuration is non-increasing as imaginary time increases (it may decrease due to the term $\exp(-\tau Q)$ but cannot increase); so, any decay in the average sign must lead to a decay in the total weight. Thus, an optimal solution to the semi-definite program may give an optimal decomposition for AFQMC as we can lower bound the decay in the sign problem at long time by a constant times $\exp(-\beta(E_{0}-\lambda))$ .
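The Hubbard-Stratonovich decoupling used above is a Gaussian integral identity that holds eigenvalue by eigenvalue for a Hermitian $Q_{a}$ . The following sketch checks it numerically for a few scalar values $q$ standing in for eigenvalues of $Q_{a}$ ; the function name `hs_integral`, the integration window, and the grid size are choices made for this illustration only.

```python
import numpy as np


def hs_integral(q, tau, half_width=20.0, num=20001):
    """Right-hand side of the decoupling for a scalar eigenvalue q.

    Evaluates (4 pi tau)^(-1/2) * integral of exp(i phi q) exp(-phi^2 / (4 tau))
    over phi with a plain Riemann sum on [-half_width*sqrt(tau), +half_width*sqrt(tau)].
    """
    edge = half_width * np.sqrt(tau)
    phi = np.linspace(-edge, edge, num)
    dphi = phi[1] - phi[0]
    integrand = np.exp(1j * phi * q) * np.exp(-phi**2 / (4.0 * tau))
    return (4.0 * np.pi * tau) ** -0.5 * np.sum(integrand) * dphi


tau = 0.1
errors = [abs(hs_integral(q, tau) - np.exp(-tau * q**2)) for q in (0.0, 0.5, -1.3)]
print("max deviation from exp(-tau q^2):", max(errors))
```

The imaginary part of the integral cancels by symmetry in $\phi$ , so the result is real up to rounding; the oscillating factor $\exp(i\phi q)$ is what produces the fluctuating sign discussed above.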

Of course, in many cases solutions using a real coupling to the auxiliary field, rather than imaginary, may lead to a better sign problem.

Some numerical experiments on quartic Hamiltonians show that there is a large loss in accuracy of the semi-definite program when using only Hermitian terms in the sum-of-squares (for example, on some small molecules of 5-10 orbitals, the error changes from $<10^{-4}$ Hartree using non-Hermitian terms in the sum-of-squares to $\approx 0.2$ Hartree using only Hermitian terms). However, even so, the resulting error in the ground state energy ( $0.2$ Hartree in this case) suggests that it might lead to a manageable sign problem. When doing these calculations, we allowed the operators $Q_{\alpha}$ to be spin [math] or spin $1$ , and also allowed them to be particle-number-nonconserving; indeed, allowing that full generality was needed to obtain the optimal solution of the semidefinite program.

Remark: this idea has some similarity to that of Levy and Clark (2021), in that one finds some optimal way of writing a Hamiltonian to minimize a sign problem by a variational method. Here the variational method is solving a semi-definite program; there the variational method involved maximizing an energy of a quantum Monte Carlo simulation.

For the rest of this paper, we consider the general case where the operators $O_{\alpha}$ may be non-Hermitian.

III Perturbation Theory for Fermionic Systems

Here we discuss the relationship between the sum-of-squares and perturbation theory. Following Hastings (2022), we consider a Hamiltonian

$$H=H_{0}+\epsilon\sum_{p=0}^{4}H_{p,4-p},$$

where

$$H_{0}=\sum_{j}E_{j}\psi_{j}^{\dagger}\psi_{j},$$

where all $E_{j}$ are positive scalars, where $\epsilon$ is a small parameter controlling the perturbation theory, and where each term $H_{p,4-p}$ is a sum of products of $p$ creation operators and $4-p$ annihilation operators, with the term normal ordered so that the annihilation operators are to the right of the creation operators. Thus, all terms $H_{p,4-p}$ , except for $H_{4,0}$ , annihilate the unperturbed ground state (i.e., the state in which all number operators $n_{i}$ are equal to $0$ ).

For small $\epsilon$ , there is a well-studied theory of perturbatively solving this Hamiltonian for the ground state energy as a function of $\epsilon$ , which we denote $E_{0}(\epsilon)$ . Indeed, there are several such perturbation methods, such as Rayleigh-Schrodinger, Brillouin-Wigner, and Green’s function (diagrammatic) methods. These methods all yield the same power series in the end, but may organize the computation differently.

We address two questions. First, we consider a perturbative solution of the semi-definite program at a given order of the sum-of-squares. Second, a related question, we consider whether the sum-of-squares at a given order can reproduce a given order of perturbation theory; here, we say that it reproduces a given order $k$ of perturbation theory if it proves a lower bound on the ground state energy which is at least $E_{0}(\epsilon)+o(\epsilon^{k})$ . Indeed, in all such cases where it does this we will find that it reproduces it up to error ${\cal O}(\epsilon^{k+1})$ .

Remark: of course, for a physical system, the quantities $E_{j}$ in Eq. 1 may have either sign. If they are negative, then the ground state of $H_{0}$ has some filled states. However, by applying a particle-hole conjugation we can bring it into the form above with $E_{j}>0$ . This particle-hole conjugation also means that terms $H_{p,4-p}$ may arise with $p\neq 2$ even if the original Hamiltonian conserves number.

III.1 General Formalism for Perturbative Solution of Semidefinite Program and the Rank of the Reduced Density Matrix

We choose a basis for operators $O_{a}$ which are polynomials of degree at most $r$ in the creation and annihilation operators. A suitable basis is to use normal ordered monomials, i.e., the annihilation operators are to the right of the creation operators. We write such an operator as

$$O=\Psi^{\dagger}_{\vec{u}}\Psi_{\vec{v}},$$

where $\vec{u},\vec{v}$ are bit strings of length $n$ , where $n$ is the number of fermionic degrees of freedom. Each operator $\Psi_{\vec{v}}$ is defined to be the product of $\psi_{a}$ for $a$ such that the $a$ -th bit of $\vec{v}$ is nonzero, with the product taken in the order of increasing $a$ .

When $\epsilon=0$ , the exact ground state of $H$ is of course easy to find and one may calculate the expectation values of products of operators $O_{a}^{\dagger}O_{b}$ in this ground state. Further, it is easy to show that any degree $2r$ sum-of-squares, for $r\geq 1$ , reproduces all these expectations of monomials, for $O_{a},O_{b}$ monomials of degree $d\leq r$ .

In particular, the result for $\mathbb{E}[O_{a}^{\dagger}O_{b}]$ is as follows. Let $O_{a}=\Psi^{\dagger}_{\vec{u}_{a}}\Psi_{\vec{v}_{a}}$ and let $O_{b}=\Psi^{\dagger}_{\vec{u}_{b}}\Psi_{\vec{v}_{b}}$ . Then $\mathbb{E}[O_{a}^{\dagger}O_{b}]=0$ if $\vec{v}_{b}\neq 0$ or $\vec{v}_{a}\neq 0$ . If $\vec{v}_{b}=\vec{v}_{a}=0$ , then $\mathbb{E}[O_{a}^{\dagger}O_{b}]=\delta_{\vec{u}_{a},\vec{u}_{b}}$ , where the $\delta$ -function is a Kronecker delta-function.

With this choice of basis, and this solution of the semidefinite program for $\epsilon=0$ , we may begin perturbation theory. First, however, we remark on an interesting property regarding the rank of the matrix $M$ of pseudoexpectation values for a primal solution to the semidefinite program.

III.1.1 Rank of $M$

Note that at $\epsilon=0$ the matrix $M$ is diagonal, so the number of zero eigenvalues is simply equal to the number of zero entries on the diagonal. The zero diagonal entries correspond to the case where $O_{a}=O_{b}=O$ with $O=\Psi^{\dagger}_{\vec{u}}\Psi_{\vec{v}}$ and with $\vec{v}$ nonzero. At $\epsilon=0$ , the number of zero eigenvalues, keeping $O_{a},O_{b}$ which are monomials of degree at most $r$ , is equal to the number of choices of $\vec{u},\vec{v}$ such that the total Hamming weight $|\vec{u}|+|\vec{v}|$ is $\leq r$ and such that $\vec{v}\neq 0$ . Thus, the number of zero eigenvalues equals

$$\sum_{s=1}^{r}\sum_{t=0}^{r-s}{n\choose s}{n\choose t}.$$

We emphasize that the matrix of pseudoexpectation values $M$ obtained by solving the sum-of-squares hierarchy at any given order (of at least $2$ ) in the case $\epsilon=0$ reproduces the exact expectation values in the ground state, and so in particular they have the same number of zero eigenvalues.
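The counting of zero eigenvalues at $\epsilon=0$ is easy to tabulate. In the sketch below, $s=|\vec{v}|\geq 1$ counts annihilation operators and $t=|\vec{u}|$ counts creation operators, with $s+t\leq r$ ; the function name `zero_eigenvalue_count` is an assumption of this illustration.

```python
from math import comb


def zero_eigenvalue_count(n, r):
    """Number of normal-ordered monomials of degree <= r with at least one annihilator.

    n is the number of fermionic modes; s annihilators and t creators are
    chosen independently, subject to 1 <= s and s + t <= r.
    """
    return sum(
        comb(n, s) * comb(n, t)
        for s in range(1, r + 1)
        for t in range(0, r - s + 1)
    )


count = zero_eigenvalue_count(4, 2)
print(count)  # 4 + 16 + 6 = 26
```

For $n=4$ and $r=2$ the three contributions are $(s,t)=(1,0)$ , $(1,1)$ , and $(2,0)$ .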

Now consider the following toy problem. Take $n=4$ and let

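$$H=\sum_{i=1}^{4}\psi_{i}^{\dagger}\psi_{i}+\epsilon(\psi_{1}^{\dagger}\psi_{2}^{\dagger}\psi_{3}^{\dagger}\psi_{4}^{\dagger}+{\rm h.c.}),$$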

where $+{\rm h.c.}$ means to add the Hermitian conjugate.

The exact ground state wavefunction can be written as a sum $\Psi_{0}(\epsilon)=a|0\rangle+b|4\rangle$ where $|0\rangle$ is the empty state (i.e., the state annihilated by all $\psi_{i}$ ) and $|4\rangle$ is the state with four particles (i.e., the state annihilated by all $\psi^{\dagger}_{i}$ ). Indeed, the ground state energy $E_{0}$ is the lowest eigenvalue of the two-by-two matrix

$$\begin{pmatrix}0&\epsilon\\ \epsilon&4\end{pmatrix},$$

and hence $E_{0}=2-\sqrt{4+\epsilon^{2}}.$
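As a cross-check of the toy problem, the sketch below builds $H=\sum_{i=1}^{4}\psi_{i}^{\dagger}\psi_{i}+\epsilon(\psi_{1}^{\dagger}\psi_{2}^{\dagger}\psi_{3}^{\dagger}\psi_{4}^{\dagger}+{\rm h.c.})$ on the full $16$ -dimensional Fock space and compares its lowest eigenvalue with the lowest eigenvalue of the two-by-two matrix with rows $(0,\epsilon)$ and $(\epsilon,4)$ , which evaluates to $2-\sqrt{4+\epsilon^{2}}$ . The Jordan-Wigner representation, the function names, and the value $\epsilon=0.3$ are assumptions of this illustration rather than anything prescribed by the paper.

```python
import numpy as np


def annihilation_operators(n):
    """Jordan-Wigner matrices for psi_1..psi_n on the 2^n-dimensional Fock space."""
    I2 = np.eye(2)
    Zs = np.diag([1.0, -1.0])                   # sign string: -1 on occupied modes
    lower = np.array([[0.0, 1.0], [0.0, 0.0]])  # maps occupied (index 1) to empty (index 0)
    ops = []
    for j in range(n):
        M = np.array([[1.0]])
        for f in [Zs] * j + [lower] + [I2] * (n - j - 1):
            M = np.kron(M, f)
        ops.append(M)
    return ops


def toy_hamiltonian(eps):
    psi = annihilation_operators(4)
    dag = [p.T for p in psi]  # matrices are real, so the transpose is the adjoint
    H = sum(d @ p for d, p in zip(dag, psi))
    quartic = dag[0] @ dag[1] @ dag[2] @ dag[3]
    return H + eps * (quartic + quartic.T)


eps = 0.3
E_full = np.linalg.eigvalsh(toy_hamiltonian(eps)).min()
E_two_by_two = np.linalg.eigvalsh(np.array([[0.0, eps], [eps, 4.0]])).min()
print(E_full, E_two_by_two, 2.0 - np.sqrt(4.0 + eps**2))
```

The sectors with one, two, or three particles are untouched by the quartic term and have energies $1$ , $2$ , and $3$ , so the ground state lies in the span of $|0\rangle$ and $|4\rangle$ for any nonzero $\epsilon$ .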

We now show that degree-$4$ sum-of-squares reproduces this, by writing $H=E_{0}+\sum_{\alpha}\lambda_{\alpha}O_{\alpha}^{\dagger}O_{\alpha},$ for some $O_{\alpha}$ . We can guess an appropriate choice of $O_{\alpha}$ by looking at the exact solution for the ground state wavefunction: we must have $O_{\alpha}\Psi_{0}(\epsilon)=0$ . One choice is to pick a quadruple $i,j,k,l$ all distinct, with $i,j,k,l\in\{1,2,3,4\}$ , and let $O_{\alpha}=u\psi^{\dagger}_{i}\psi^{\dagger}_{j}+\psi_{k}\psi_{l},$ where $u$ is a scalar. Without loss of generality, let us pick $i,j,k,l$ so that they give an even permutation of the sequence $1,2,3,4$ (if they are an odd permutation, then the sign of $u$ below is changed). By inspecting the ground state wavefunction, we see that we need

$$u=\frac{\sqrt{4+\epsilon^{2}}-2}{\epsilon}.$$

Then, $\lambda_{\alpha}O_{\alpha}^{\dagger}O_{\alpha}=\lambda_{\alpha}u(\psi^{\dagger}_{1}\psi^{\dagger}_{2}\psi^{\dagger}_{3}\psi^{\dagger}_{4}+{\rm h.c.})+\lambda n_{i}n_{j}+\lambda u^{2}(1-n_{k})(1-n_{l}),$ where $n_{i}=\psi^{\dagger}_{i}\psi_{i}$ . Summing over all choices of $i<j$ , if we pick

$$\lambda_{\alpha}=\lambda=\frac{\epsilon}{6u},$$

then this gives the desired term $\epsilon(\psi^{\dagger}_{1}\psi^{\dagger}_{2}\psi^{\dagger}_{3}\psi^{\dagger}_{4}+{\rm h.c.})$ .

We do not yet have the correct term $\sum_{i}n_{i}$ in the Hamiltonian. However, after some algebra, we find that

$$H=\sum_{\alpha}O_{\alpha}^{\dagger}O_{\alpha}+E_{0}(\epsilon)+c(\epsilon)\sum_{i<j}n_{i}(1-n_{j}),$$

for some non-negative scalar $c(\epsilon)$ . Further $n_{i}(1-n_{j})$ is a sum of squares, as $n_{i}(1-n_{j})=O^{\dagger}O$ for $O=\psi_{i}\psi^{\dagger}_{j}$ . So, in this way we find the desired decomposition of $H$ as a sum of squares, giving the exact ground state energy at this order.

Now consider the rank of the matrix of pseudoexpectation values, $M$ , for a solution of the semidefinite program, restricting to operators $O_{a}$ of degree at most $2$ in the fermionic operators. Since degree- $4$ sum-of-squares reproduces the exact solution, we may consider the rank of the matrix of expectation values in the true ground state. One finds that the ground state is annihilated by any operator of the form $\psi^{\dagger}_{i}\psi_{j}$ for $i\neq j$ , and there are $12$ such operators. It also is annihilated by the operators $u\psi^{\dagger}_{i}\psi^{\dagger}_{j}+\psi_{k}\psi_{l}$ and there are $6$ such operators. Indeed, the number of zero eigenvalues is equal to $18$ . This compares with the case $\epsilon=0$ where we found that the number of zero eigenvalues equals $\sum_{s=1}^{2}\sum_{t=0}^{2-s}{4\choose s}{4\choose t}=4+16+6=26.$

Thus, the rank of the matrix of pseudoexpectation values increases when $\epsilon$ becomes nonzero. Indeed, even if we consider the rank restricted to the submatrix where the operator $O_{a}$ has even fermion parity, the number of zero eigenvalues at $\epsilon>0$ is equal to $18$ while the number at $\epsilon=0$ is equal to $22$ , so even in that submatrix the rank has increased.

III.1.2 General Formalism

Now we develop a general formalism for a perturbative solution of the primal semidefinite program for a Hamiltonian of the form Eq. 1 at a fixed order of the sum-of-squares.

The presentation here is not rigorous. For example, we will ignore any questions of convergence of the series.

Assume we have a perturbative expansion

$M=M_{0}+\Delta,$

where $M_{0}$ is the solution at $\epsilon=0$ and $\Delta$ is given by a series as $\Delta=\epsilon M_{1}+\epsilon^{2}M_{2}+\ldots.$

One simplification is that $M_{0}$ , using the basis of operators $O_{a}$ above, is a projector. Let $\Pi_{0}=1-M_{0}$ . We may perturbatively impose the requirement $M\geq 0$ , where the inequality is interpreted as meaning that $M$ is positive semi-definite.

To impose this requirement perturbatively, we need to ensure that the lowest eigenvalue of $M$ is $\geq 0$ . To do this, it is convenient to use Brillouin-Wigner perturbation theory. This perturbation theory is simpler than Rayleigh-Schrödinger perturbation theory in the case of a degenerate ground state. One complication that occurs in Brillouin-Wigner perturbation theory is that it involves denominators $1/(E-E_{i})$ , where $E$ is the lowest eigenvalue of the perturbed system and $E_{i}$ is some nonzero eigenvalue of the unperturbed system; however, these denominators simplify greatly in our case, since all these $E_{i}$ are equal to $1$ . The result of Brillouin-Wigner perturbation theory is that there is some eigenvalue $E$ close to zero if

[TABLE]

has an eigenvalue equal to $E$ .

However, note that (assuming $\Delta$ is ${\cal O}(\epsilon)$ ), reducing $E$ must increase the eigenvalues of this matrix since it increases the second order term $\Delta M_{0}(E-1)^{-1}\Delta$ and all higher terms $T$ in the series obey $T\leq{\cal O}(\epsilon)\Delta M_{0}(E-1)^{-1}\Delta$ . So, if there is a negative eigenvalue of this matrix for some $E<0$ then there is also a negative eigenvalue for $E=0$ .

The result is that, perturbatively, positivity of $M$ is equivalent to the requirement that

[TABLE]

If we write $\Delta=\Delta_{g}+\Delta_{e}$ where $\Delta_{g}=\Pi_{0}\Delta\Pi_{0}$ , then this is equivalent to

[TABLE]

One obvious way to satisfy this is to have $\Delta_{g}=-\Pi_{0}\Delta_{e}\sum_{j=1}^{\infty}(-1)^{j}(M_{0}\Delta_{e})^{j}\Pi_{0}$ . However, such a choice of $\Delta_{g}$ might not obey the linear relations imposed on $M$ by the canonical anticommutation relations. Indeed, this is precisely why we considered the rank of $M$ in Section III.1.1: if we had $\Delta_{g}=-\Pi_{0}\Delta_{e}\sum_{j=1}^{\infty}(-1)^{j}(M_{0}\Delta_{e})^{j}\Pi_{0}$ then the rank of $M$ would not change, but the rank does change in some cases.

It may be interesting to continue to develop this theory, to understand the perturbative solution of the sum-of-squares. However, in the next section we turn to an alternative approach to show that the sum-of-squares can reproduce perturbation theory.

III.2 Perturbation Theory and the Sum-of-Squares

We now consider the question of reproducing a given order of perturbation theory using an appropriate order of the sum-of-squares. For sufficiently small $\epsilon$ , there is a power series in $\epsilon$ which defines a unitary $U(\epsilon)$ such that $\Psi_{0}(\epsilon)=U(\epsilon)\Psi_{0}(0),$ where $\Psi_{0}(\epsilon)$ is the ground state at given $\epsilon$ and $\Psi_{0}(0)$ is the unperturbed ground state. We may prove this, for example, using exact [Osborne (2007); Bravyi and Hastings (2011)] quasi-adiabatic continuation [Hastings (2004)] to construct a unitary describing the adiabatic evolution of the ground state (for small enough $\epsilon$ , the gap between ground and first excited state remains open, as needed for this method). Alternatively, one may use higher-order Schrieffer-Wolff methods [Bravyi et al. (2011)].

Let

$\tilde{\psi}_{i}(\epsilon)=U(\epsilon)\psi_{i}U(\epsilon)^{\dagger}$

for each $i$ . Since $\psi_{i}\Psi_{0}(0)=0$ , we have

$\tilde{\psi}_{i}(\epsilon)\Psi_{0}(\epsilon)=0.$

The power series in $\epsilon$ for $U(\epsilon)$ defines a power series in $\epsilon$ for $\tilde{\psi}_{i}(\epsilon)$ . The term of order $\epsilon^{k}$ in this power series is a polynomial in the creation and annihilation operators $\psi^{\dagger},\psi$ of degree at most $2k+1$ . For brevity, let us simply say the term “is a polynomial in $\psi,\psi^{\dagger}$ ”. The term of order $\epsilon^{0}$ is equal to $\psi_{i}$ .

Similarly, we can define a power series in $\epsilon$ for $\psi_{i}$ where the term of order $\epsilon^{k}$ in this power series is a polynomial in operators $\tilde{\psi}^{\dagger}(\epsilon),\tilde{\psi}(\epsilon)$ of degree at most $2k+1$ . Again for brevity, let us simply say the term “is a polynomial in $\tilde{\psi},\tilde{\psi}^{\dagger}$ ”.

Using this power series for $\psi$ in terms of $\tilde{\psi}$ , we can write the Hamiltonian $H$ as a power series in $\epsilon$ where each term is a polynomial in $\tilde{\psi},\tilde{\psi}^{\dagger}$ . The term of order $\epsilon^{0}$ is equal to

[TABLE]

That is, it is the same as $H_{0}$ except with $\psi,\psi^{\dagger}$ replaced with $\tilde{\psi},\tilde{\psi}^{\dagger}$ .

Let $\tilde{n}_{i}(\epsilon)=\tilde{\psi}_{i}(\epsilon)^{\dagger}\tilde{\psi}_{i}(\epsilon).$

Note that we have canonical anti-commutation relations also for $\tilde{\psi},\tilde{\psi}^{\dagger}$ , i.e., $\{\tilde{\psi}_{j}(\epsilon),\tilde{\psi}_{k}(\epsilon)\}=\{\tilde{\psi}_{j}(\epsilon)^{\dagger},\tilde{\psi}_{k}(\epsilon)^{\dagger}\}=0$ and $\{\tilde{\psi}_{j}(\epsilon)^{\dagger},\tilde{\psi}_{k}(\epsilon)\}=\delta_{j,k}.$ So, we may take the representation of $H$ in terms of $\tilde{\psi},\tilde{\psi}^{\dagger}$ and normal order the terms, using these anti-commutation relations. This then expresses $H=\sum_{j}E_{j}\tilde{n}_{j}(\epsilon)+V+\lambda,$ where $V$ is ${\cal O}(\epsilon)$ and is a sum of normal ordered terms and $\lambda$ is a scalar.

Indeed, since the terms $V$ are normal ordered, then for $\epsilon$ small enough that $\Psi_{0}(\epsilon)=U(\epsilon)\Psi_{0}(0)$ , we have $\lambda=E_{0}(\epsilon)$ , i.e., it is the ground state energy at the given $\epsilon$ .

Moreover, by lemma 1 of Ref. Hastings (2022), each normal ordered term in $V$ of degree $d$ is equal to some linear combination of $\tilde{n}_{j}$ plus a sum-of-squares of terms, each term being a polynomial in $\tilde{\psi},\tilde{\psi}^{\dagger}$ of degree at most $d/2$ . The sum, over all terms in $V$ , of all these linear combinations of $\tilde{n}_{j}$ can be written as some $\sum_{j}\delta_{j}(\epsilon)\tilde{n}_{j}$ . So, we express $H\underset{\rm SoS}{\geq}\sum_{j}(E_{j}+\delta_{j}(\epsilon))\tilde{n}_{j}(\epsilon)+E_{0}(\epsilon).$ Since $\delta_{j}(\epsilon)$ is ${\cal O}(\epsilon)$ , for small enough $\epsilon$ we have $E_{j}+\delta_{j}(\epsilon)>0$ for all $j$ so $\sum_{j}(E_{j}+\delta_{j}(\epsilon))\tilde{n}_{j}(\epsilon)$ is a sum of squares.

Now let us show that for any given order of perturbation theory, some finite order of the sum-of-squares can reproduce the results at that order.

We have, as explained above, a sum-of-squares proof $H\underset{\rm SoS}{\geq}\sum_{j}(E_{j}+\delta_{j}(\epsilon))\tilde{n}_{j}(\epsilon)+E_{0}(\epsilon).$ Indeed, this means that $H=\sum_{\alpha}O_{\alpha}^{\dagger}O_{\alpha}+E_{0}(\epsilon),$ where each $O_{\alpha}$ is a polynomial in operators $\tilde{\psi},\tilde{\psi}^{\dagger}$ and where we have fixed $\lambda_{\alpha}=1$ and so omitted $\lambda_{\alpha}$ .

Further, we claim that each $O_{\alpha}$ has either even or odd fermion parity, i.e., it is a polynomial with only terms of even degree in $\tilde{\psi},\tilde{\psi}^{\dagger}$ or with only odd degree in $\tilde{\psi},\tilde{\psi}^{\dagger}$ , as the Hamiltonian only has terms of even degree (this is a special case of a more general result on symmetries of a Hamiltonian discussed in Section IV).

We claim the coefficient of a term of degree- $q$ in $\tilde{\psi},\tilde{\psi}^{\dagger}$ in $O_{\alpha}$ is ${\cal O}(\epsilon^{(q-1)/2})$ . Thus, a cubic term is of order $\epsilon$ , while a quadratic term is of order $\sqrt{\epsilon}$ . Intuitively, this makes sense: the square of the quadratic term is a quartic term, and the quartic term in the Hamiltonian is of order $\epsilon$ . However, we may prove that this must be the case in general as follows: the term of degree- $2$ in $H$ is ${\cal O}(1)$ and the term of degree- $4$ in $H$ is ${\cal O}(\epsilon)$ , and so using the series for $\psi,\psi^{\dagger}$ in terms of $\tilde{\psi},\tilde{\psi}^{\dagger}$ expresses the Hamiltonian $H$ in terms of $\tilde{\psi},\tilde{\psi}^{\dagger}$ where each term of degree $d$ is of order ${\cal O}(\epsilon^{(d-2)/2})$ . Normal ordering does not change this: we get a normal ordered Hamiltonian in terms of $\tilde{\psi},\tilde{\psi}^{\dagger}$ , where again each term of degree $d$ is of order ${\cal O}(\epsilon^{(d-2)/2})$ , as normal ordering can only reduce the degree. Finally, when using lemma 1 of Hastings (2022) to show that each normal ordered term in $V$ of degree $d$ is equal to some linear combination of $\tilde{n}_{j}$ plus a sum-of-squares of terms, each term having degree at most $d/2$ , the resulting sum-of-squares has the desired property that the coefficient of a term of degree- $q$ is ${\cal O}(\epsilon^{(q-1)/2})$ .

Next, one may re-express each $O_{\alpha}$ as a series in $\psi,\psi^{\dagger}$ , using the series for $\tilde{\psi},\tilde{\psi}^{\dagger}$ in terms of $\psi,\psi^{\dagger}$ . Then, again we have a similar result: the coefficient of a term of degree $r$ in $\psi,\psi^{\dagger}$ in $O_{\alpha}$ is ${\cal O}(\epsilon^{(r-1)/2})$ . Suppose we truncate the series for each $O_{\alpha}$ in $\psi,\psi^{\dagger}$ at degree $r$ for some $r$ . For $r=2k+1$ , this truncation of $O_{\alpha}$ is correct up to error ${\cal O}(\epsilon^{k+1})$ . To clarify what we mean by “correct up to error $\ldots$ ”, since the quantity we are talking about is an operator rather than a number, we mean simply that the error is a polynomial whose coefficients are of the given order.

For example, for $r=3$ , we can express the leading term $\tilde{\psi}^{\dagger}_{i}\tilde{\psi}_{i}$ up to error ${\cal O}(\epsilon^{2})$ , because we include the terms of degree $3$ but not the terms of degree $5$ .

Call this truncation $O_{\alpha}^{\rm trunc}$ . Then, $H=\sum_{\alpha}\lambda_{\alpha}(O_{\alpha}^{\rm trunc})^{\dagger}O_{\alpha}^{\rm trunc}+E_{0}(\epsilon)+\delta,$ where $\delta$ is the “error” in making this truncation. The term $\delta$ is, by construction, ${\cal O}(\epsilon^{k+1})$ . Further, $\delta$ is of degree at most $2r$ by construction, assuming $2r\geq 4$ , as then every term in $H$ and in $\sum_{\alpha}\lambda_{\alpha}(O_{\alpha}^{\rm trunc})^{\dagger}O_{\alpha}^{\rm trunc}$ is at most degree $2r$ . So, we may show that $\delta\underset{\rm SoS}{\geq}-{\cal O}(\epsilon^{k+1})$ by a degree- $2r$ sum-of-squares proof. (Footnote 2: This is a trivial proof. There is a sum-of-squares proof that any monomial in $\psi,\psi^{\dagger}$ , with coefficient equal to $1$ , is $\geq-1$ and $\leq+1$ .)

So, this gives a degree- $2(2k+1)$ sum-of-squares proof that $H\underset{\rm SoS}{\geq}E_{0}(\epsilon)-{\cal O}(\epsilon^{k+1})$ , i.e., a degree- $4k+2$ proof. So, as claimed, for any desired order of perturbation theory, there is some order of the sum-of-squares that reproduces it, i.e., $k$ -th order perturbation theory is reproduced by degree $4k+2$ sum-of-squares. However, we can see that this result is not optimal. In Hastings (2022), it was shown that degree- $6$ sum-of-squares reproduces second order perturbation theory.

We leave it as an open question to determine what the minimal order of sum-of-squares is to reproduce a given order of perturbation theory. In Hastings (2022) it was proven that second order perturbation theory is reproduced by degree- $6$ sum-of-squares and not by degree- $4$ .

IV Critical Phenomena

In this section, we apply leading order sum-of-squares to various models with a quantum critical point. Interestingly, we see exponents that coincide with the large- $N$ vector model (explained below) in a variety of cases. We see this both when the model is a vector model, and when it is a transverse field Ising model. The next-to-leading-order sum-of-squares treatment of these critical phenomena may be very complicated.

IV.1 Large $N$ Vector Model—Relation To Sum-of-Squares

The $O(N)$ vector model is a model studied in quantum field theory (see chapter 8 of Polyakov (1987) or section 4 of Zinn-Justin (1998)). (The notation $N$ in the vector model should not be confused with our use of $n$ elsewhere for the number of degrees of freedom.) We can give a Hamiltonian formulation of this model on a lattice as follows. Consider a lattice of sites labelled by integers $i,j,\ldots$ ; for example, consider a cubic lattice in $d$ spatial dimensions. On each site $j$ , there are $N$ continuous degrees of freedom, for some integer $N\geq 1$ . To describe these degrees of freedom, we introduce operators $q_{j}^{\mu}$ and $p_{j}^{\mu}$ , where $\mu\in\{1,\ldots,N\}$ indexes the different degrees of freedom on the given site $j$ . These operators obey the canonical commutation relations $[q_{j}^{\mu},p_{j}^{\nu}]=i\delta_{\mu,\nu}$ , where $\delta_{\mu,\nu}$ is the Kronecker $\delta$ -function (fixing $\hbar=1$ ). We use $\vec{q}_{j}$ to denote a vector with components $q_{j}^{\mu}$ and similarly $\vec{p}_{j}$ to denote a vector with components $p_{j}^{\mu}$ .

The Hamiltonian is then

[TABLE]

where the notation $\sum_{<i,j>}$ denotes the sum over nearest neighbors $i$ and $j$ and where $(\vec{p})^{2}\equiv\vec{p}\cdot\vec{p}$ . Taking $V$ large forces $(\vec{q}_{i})^{2}$ to be close to $1$ , so that the vector $\vec{q}_{i}$ is constrained to a given length.

At large $N$ , it is possible to solve this model using a saddle point method; one decouples the quartic interaction in $H$ using a Hubbard-Stratonovich transformation, introducing an auxiliary field (which becomes our quantity $\kappa$ below), and then takes a saddle point in the integral over the auxiliary field (the saddle point is accurate for large $N$ , and it gives the self-consistent equation below). This solution does not require $V$ to be large. The resulting solution gives an interesting solvable model that displays a phase transition with non-mean-field behavior in three dimensions. At small $J$ , the different sites are approximately decoupled, showing paramagnetic behavior, but for $J$ larger than some critical $J_{c}$ , long-range ferromagnetic order sets in.

The solution of this large $N$ model can be summarized as follows. For simplicity, we consider the model on a cubic lattice, which simplifies the construction due to translational invariance. Then, there is some scalar $\kappa$ . (Footnote 3: Without translation invariance, it becomes necessary to introduce a different mass for each site.) We then consider the Hamiltonian $H_{\rm Gaussian}\equiv\sum_{<i,j>}\vec{q}_{i}\cdot\vec{q}_{j}+\frac{\kappa}{2}\sum_{i}(\vec{q}_{i})^{2}+\frac{1}{2}\sum_{i}(\vec{p}_{i})^{2}.$ This Hamiltonian describes coupled harmonic oscillators and can be readily solved using creation and annihilation operators. To do this, one applies a discrete Fourier transform to go to normal modes.

The ground state of the Hamiltonian has some given expectation value $\mathbb{E}[\frac{1}{N}(\vec{q}_{j})^{2}]$ . Then, $\kappa$ is chosen to satisfy a self-consistent equation

[TABLE]

As one approaches $J_{c}$ , the quantity $\kappa$ displays interesting critical behavior. As $J$ approaches $J_{c}$ from below, in the infinite lattice size limit, $\kappa$ tends to some quantity such that, at that quantity, the energy of the lowest normal mode of $H_{\rm Gaussian}$ is equal to zero; this lowest normal mode is the one at wavevector equal to [math]. However, for any finite lattice size, $\kappa$ never reaches that quantity. For $J>J_{c}$ , at finite lattice size, a non-negligible contribution to $\mathbb{E}[\frac{1}{N}(\vec{q}_{j})^{2}]$ comes from the lowest normal mode.

Following this very brief review, it is interesting to see that the sum-of-squares method reproduces the same self-consistent solution for $\kappa$ even at $N=1$ . For the case $N=1$ , we will drop the vector notation, and simply use operators $q_{j}$ and $p_{j}$ .

We use the following fact: for any real scalar $s$ , the function $(x-1)^{2}$ obeys $(x-1)^{2}\underset{\rm SoS}{\geq}(s-1)^{2}+2(x-s)(s-1)$ ; note that the right-hand side consists of the zeroth and first order terms of a Taylor expansion of $(x-1)^{2}$ around $x=s$ . We have $(s-1)^{2}+2(x-s)(s-1)=1-s^{2}+2x(s-1)$ .
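The inequality holds because the difference of the two sides is a perfect square. A short worked check, using only the symbols $x$ and $s$ from the text:

```latex
\begin{align*}
(x-1)^{2} &= \bigl((x-s)+(s-1)\bigr)^{2} \\
          &= (s-1)^{2}+2(x-s)(s-1)+(x-s)^{2},
\end{align*}
so that
\[
(x-1)^{2}-\bigl[(s-1)^{2}+2(x-s)(s-1)\bigr]=(x-s)^{2}\geq 0 .
\]
```

Presumably this is applied with $x$ replaced by $q_{j}^{2}$ , in which case the remainder $(q_{j}^{2}-s)^{2}$ is the square of a polynomial of degree $2$ in $q_{j}$ , i.e., a degree- $4$ sum-of-squares term.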

So, for any $s$ ,

[TABLE]

So, for any $s$ , we have

[TABLE]

where we take

[TABLE]

The ground state energy of $H_{\rm Gaussian}$ can be calculated exactly using degree- $2$ sum-of-squares; indeed, this is precisely the usual calculation using creation and annihilation operators. We now vary over $s$ to get the tightest lower bound on the ground state energy. The derivative of the ground state energy of $H_{\rm Gaussian}$ with respect to $s$ is

[TABLE]

Hence, the condition for a maximum is $V\mathbb{E}[q_{j}^{2}]=Vs$ , so

[TABLE]

Using the given $s$ and $\kappa$ , we see that this is the same self-consistent equation as Eq. 5 for $N=1$ .

The reader might note that we are working in a slightly non-standard form of the sum-of-squares. Eq. 6 is an inequality in the degree- $4$ sum-of-squares, but otherwise we use only the degree- $2$ sum-of-squares to solve $H_{\rm Gaussian}$ . That is, we use part of the degree- $4$ sum-of-squares (indeed, we must, since the Hamiltonian is degree- $4$ ), but otherwise we use only the degree- $2$ sum-of-squares. We claim, but leave to the reader to show, that we have found the optimal dual solution if we consider degree- $2$ sum-of-squares as well as sums of, for each $j$ , squares of polynomials of at most degree $2$ in $q_{j}$ . So, it is an interesting question how using the full power of degree- $4$ sum-of-squares would change the solution.

To emphasize why it is slightly surprising that sum-of-squares gives the same solution at $N=1$ , consider an alternative approximate method of solving the case $N=1$ . This alternative method gives a variational upper bound on the ground state energy. We use the ground state of Hamiltonian $H_{\rm Gaussian}$ as a variational state, and compute the expectation value of Hamiltonian $H$ in this state, minimizing over $\kappa$ . If one works out the details, one will find a different self-consistent equation for $\kappa$ . Briefly stated, the reason is that sum-of-squares effectively uses the inequality $\mathbb{E}[(q_{j}^{2})^{2}]\geq\mathbb{E}[q_{j}^{2}]^{2}$ , while in the Gaussian state we have $\mathbb{E}[(q_{j}^{2})^{2}]=3\mathbb{E}[q_{j}^{2}]^{2}$ .
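The factor of $3$ is the standard fourth moment of a zero-mean Gaussian, which applies because the position distribution in the ground state of a quadratic Hamiltonian is Gaussian. A minimal numerical sketch, assuming a one-dimensional Gaussian density with an arbitrary illustrative width `sigma` and a plain trapezoid rule (the function name and parameters are hypothetical):

```python
import math


def gaussian_moment(power, sigma, half_width=12.0, steps=20001):
    """E[q**power] for a zero-mean Gaussian with standard deviation sigma (trapezoid rule)."""
    start = -half_width * sigma
    dq = 2 * half_width * sigma / (steps - 1)
    norm = sigma * math.sqrt(2 * math.pi)
    total = 0.0
    for k in range(steps):
        q = start + k * dq
        weight = 0.5 if k in (0, steps - 1) else 1.0  # trapezoid end-point weights
        total += weight * q**power * math.exp(-q * q / (2 * sigma**2)) / norm * dq
    return total


sigma = 0.7  # illustrative width; the ratio below does not depend on it
second = gaussian_moment(2, sigma)
fourth = gaussian_moment(4, sigma)
ratio = fourth / second**2
print(second, fourth, ratio)
```

The ratio comes out at $3$ to numerical precision for any `sigma`, whereas the inequality used by the sum-of-squares bound only requires a ratio of at least $1$ .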

IV.2 Transverse Field Ising Model at Leading Order

We now turn to the transverse field Ising model.

IV.2.1 Mean Field Theory

We begin with a treatment of a Hamiltonian appropriate for mean-field theory. Consider

[TABLE]

where there are $n$ qubits, with corresponding Pauli operators $X_{i}$ and $Z_{i}$ . At small $h$ , we expect a ferromagnetic phase in the limit of large $n$ , while at large $h$ we expect a paramagnet. The factor of $1/2$ is to avoid double counting.

We can solve this Hamiltonian in a mean-field approximation. Take a product state for the spins, where each spin has $\langle X_{i}\rangle=\cos(\theta)$ and $\langle Z_{i}\rangle=\sin(\theta)$ for some angle $\theta$ . Then the expectation value of the energy is (up to ${\cal O}(1)$ corrections for the case $i=j$ in the sum)

[TABLE]

For $h\geq 1$ , the minimum is at $\theta=0$ , with energy $-nh$ . For $h\leq 1$ , the minimum is at $\cos(\theta)=h$ , with energy

[TABLE]
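The two regimes can be checked numerically. The sketch below assumes the energy per site is $e(\theta)=-\frac{1}{2}\sin^{2}\theta-h\cos\theta$ ; this form is inferred from $\langle X_{i}\rangle=\cos(\theta)$ , $\langle Z_{i}\rangle=\sin(\theta)$ and the stated minima, not copied from the displayed equation, and it drops the ${\cal O}(1)$ corrections. Function names and the grid size are illustrative.

```python
import math


def energy_per_site(theta, h):
    """Assumed mean-field energy per site: -(1/2) sin^2(theta) - h cos(theta)."""
    return -0.5 * math.sin(theta) ** 2 - h * math.cos(theta)


def minimum_on_grid(h, steps=20001):
    """Brute-force minimum over theta in [0, pi]."""
    return min(energy_per_site(math.pi * k / (steps - 1), h) for k in range(steps))


def closed_form(h):
    """-h for h >= 1 (theta = 0); -(1 + h^2)/2 for h <= 1 (cos(theta) = h)."""
    return -h if h >= 1 else -0.5 * (1 + h * h)


for h in (0.3, 1.0, 1.7):
    print(h, minimum_on_grid(h), closed_form(h))
```

Under this assumed form the grid minimum agrees with $-h$ for $h\geq 1$ and with $-\frac{1}{2}(1+h^{2})$ for $h\leq 1$ , per site.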

Now we consider a sum-of-squares treatment of the problem. We pause first for a useful result. Suppose we have some symmetry group of a Hamiltonian $H$ , i.e., we have a group homomorphism $\pi$ from some group $G$ to the group of unitaries, such that the unitaries in the image commute with $H$ . Further, suppose that, acting by conjugation, these unitaries do not increase the degree of a monomial, i.e., given an operator $O$ of given degree in some operators (e.g., the Pauli operators) and given a $g\in G$ , the operator $\pi(g)O\pi(g)^{\dagger}$ has the same degree as $O$ . Then we claim that given a sum-of-squares representation as $H=\sum_{\alpha}O_{\alpha}^{\dagger}O_{\alpha}+\lambda$ , we can find a sum-of-squares representation $H=\sum_{\alpha}Q_{\alpha}^{\dagger}Q_{\alpha}+\lambda$ of the same degree, where each $Q$ has the property that it maps under conjugation by $\pi(g)$ according to some irreducible representation of $G$ . To see this, suppose $O_{\alpha}$ is a sum of operators which transform by conjugation under inequivalent irreducible representations, i.e., $O_{\alpha}=\sum_{r}O_{\alpha,r}$ where $r$ labels irreducible representations and $O_{\alpha,r}$ transforms by conjugation according to representation $r$ . Then,

[TABLE]

By assumption that $g$ is a symmetry of $H$ , for any $g\in G$ we may make the replacement $O_{\alpha}\rightarrow\pi(g)O_{\alpha}\pi(g)^{\dagger}$ for every $\alpha$ , and this gives another sum-of-squares representation of $H$ . By summing over this replacement for various choices of $g$ , we may remove any “cross-terms” in Eq. 8, i.e., remove those terms with $r\neq r^{\prime}$ .

In this case, the symmetry group that we will use is the symmetry under spin flip, meaning $\prod_{i}X_{i}$ , as well as a symmetry under cyclic permutation of the spins $1\rightarrow 2\rightarrow 3\ldots$ Indeed, we have a full spin permutation symmetry but we will not need that.

We will use degree-2 sum-of-squares, so we search for a representation $H=\sum_{\alpha}O_{\alpha}^{\dagger}O_{\alpha}$ where each $O_{\alpha}$ is degree $1$ . By the above general result, we may assume each $O_{\alpha}$ is either even or odd under spin flip (i.e., stays the same or changes sign under spin flip).

We will make a few choices, where we say it “suffices” to consider only certain things; one may verify that these indeed give the optimal decomposition. Further, since as we will see the result agrees with mean-field up to ${\cal O}(1)$ corrections in the energy, it implies that these choices do give the correct energy up to ${\cal O}(1)$ corrections. First, it suffices to consider only odd terms. So, $\alpha$ labels different irreps under cyclic permutation. Also, it suffices to take only one term for each irrep.

So, we represent

[TABLE]

where $p$ ranges over $0,1,\ldots,n-1$ and

[TABLE]

where it suffices to take $a(p),b(p)$ as real scalars. Here $p$ plays the role of a “momentum”, i.e., labeling different Fourier modes.

Due to the symmetry under arbitrary permutation of sites, we may assume that all $a(p)$ are the same for $p\neq 0$ and similarly for $b(p)$ . So, we set $a(p)=a$ for $p=0$ and $a(p)=a^{\prime}$ for $p\neq 0$ and $b(p)=b$ for $p=0$ and $b(p)=b^{\prime}$ for $p\neq 0$ . Since the Hamiltonian has no $Y_{i}Y_{j}$ terms in it, indeed we must have $b=b^{\prime}$ . Then, to obtain the correct $Z_{i}Z_{j}$ term in $H$ we need

[TABLE]

To obtain the correct $X_{i}$ term in $H$ we need

[TABLE]

From this we get

[TABLE]

and we wish to minimize $-\lambda$ (i.e., maximize $\lambda$ ). Let us solve this for large $n$ . Then we approximate $a^{\prime}b=h/2$ and we wish to minimize $n(b^{2}+a^{\prime 2})$ subject to $a^{\prime 2}\geq 1/2$ since $a^{2}\geq 0$ . For $h\geq 1$ , the minimum is at $b=\sqrt{h/2}$ with $\lambda=-hn+o(n)$ ; recall that $o(n)$ denotes a term asymptotically smaller than $n$ .

For $h\leq 1$ , we have $a^{\prime}=1/\sqrt{2}$ and so $b=h/\sqrt{2}$ and so $\lambda=-\frac{n}{2}(1+h^{2})+o(n)$ . So, the sum-of-squares result matches the variational result, up to corrections which are subleading in $n$ .
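The large- $n$ optimization can also be checked numerically. The sketch below takes the constraint $a^{\prime}b=h/2$ and the bound $a^{\prime 2}\geq 1/2$ from the text, eliminates $b$ , and scans $a^{\prime}$ on a grid; the grid range, step count and function names are illustrative assumptions.

```python
import math


def sos_lambda_per_site(h, steps=20001, a_max=3.0):
    """Return lambda/n = -min(a'^2 + b^2) with b = h/(2a') and a'^2 >= 1/2."""
    a_min = 1 / math.sqrt(2)  # boundary of the constraint a'^2 >= 1/2
    best = math.inf
    for k in range(steps):
        a_prime = a_min + (a_max - a_min) * k / (steps - 1)
        b = h / (2 * a_prime)  # enforce a' b = h/2
        best = min(best, a_prime**2 + b**2)
    return -best


def variational_energy_per_site(h):
    """Mean-field result quoted in the text, per site."""
    return -h if h >= 1 else -0.5 * (1 + h * h)


for h in (0.3, 1.0, 1.7):
    print(h, sos_lambda_per_site(h), variational_energy_per_site(h))
```

For $h\leq 1$ the minimum sits on the boundary $a^{\prime}=1/\sqrt{2}$ , while for $h\geq 1$ it is the interior point $a^{\prime}=b=\sqrt{h/2}$ ; in both cases the scan reproduces the variational value at leading order in $n$ .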

IV.2.2 Three Dimensions

We now turn to the three-dimensional Ising model on a cubic lattice with $n$ sites. We label sites by triples of integers, using a vector notation such as $\vec{j}$ to label a site. We let

[TABLE]

where the sum $\sum_{<\vec{j},\vec{k}>}$ is over nearest neighbor $\vec{j},\vec{k}$ .

The Hamiltonian has a spin flip symmetry as before. It also has a symmetry under translation by $1$ site in any of three orthogonal directions. We will label Fourier modes by vectors $\vec{p}$ , e.g., given an $L$ -by- $L$ -by- $L$ cube with $L^{3}=n$ , we have $\vec{p}=(p_{x},p_{y},p_{z})$ where $p_{x},p_{y},p_{z}$ are integer multiples of $2\pi/L$ .

As in the previous subsection, we consider a decomposition of $H$ as a sum-of-squares as

[TABLE]

where

[TABLE]

where $a(\vec{p}),b(\vec{p})$ are real scalars and where $Z(\vec{p})=n^{-1/2}\sum_{\vec{j}}\exp(i\vec{p}\cdot\vec{j})Z_{\vec{j}}$ and $X(\vec{p})=n^{-1/2}\sum_{\vec{j}}\exp(i\vec{p}\cdot\vec{j})X_{\vec{j}}$ . The reader may verify that this is an optimal decomposition at this order; we omit the proof.

As in the mean-field case, since the Hamiltonian has no $YY$ terms, $b(\vec{p})$ must be independent of $\vec{p}$ . So, we write $b(\vec{p})=b$ for some $b$ .

Given the $ZZ$ terms in $H$ , we must have

[TABLE]

for some scalar $c$ . To get the correct $X$ term in $H$ , we need

[TABLE]

We have $\lambda=-n(b^{2}+c)$ , so we wish to minimize $b^{2}+c$ .

We will be concerned only with the critical behavior, so we will make various approximations. In an integral approximation to Eq. 12, we need

[TABLE]

We will work in an approximation: we consider only $\vec{p}$ close to $0$ , and we expand

[TABLE]

Let us write $m^{2}=2(c-3)$ , or, equivalently, $c=m^{2}/2+3$ , so $c-3+\frac{1}{2}\vec{p}^{2}=(1/2)(m^{2}+\vec{p}^{2})$ .

We introduce a “cutoff” $\Lambda$ and consider only $|p|\leq\Lambda$ . So, we approximate Eq. 13 by

[TABLE]

where $F(m^{2})$ is defined to be the given integral.

The integral $F(m^{2})$ has some dependence on cutoff $\Lambda$ (it is “ultraviolet divergent”), so it is convenient to remove this dependence by differentiating twice with respect to $m^{2}$ . We get

[TABLE]

This integral now diverges at small $|\vec{p}|$ for $m=0$ . This divergence is cut off for nonzero $m$ and the integral is then (for $m^{2}\ll\Lambda^{2}$ ) approximately equal to $-C_{1}/m=-C_{1}/(m^{2})^{1/2}$ for some $C_{1}>0$ . Hence, the integral in Eq. 14 behaves for small $m$ like $C+C^{\prime}m^{2}-C^{\prime\prime}m^{3}+\ldots$ for some constants $C,C^{\prime},C^{\prime\prime}>0$ .

So, keeping only the leading terms for small $m$ , we have that $C+C^{\prime}m^{2}-C^{\prime\prime}m^{3}=h/b$ and we wish to minimize $b^{2}+c$ , with $c=3+m^{2}/2$ . So,

[TABLE]

This gives some function of $m$ , which can be expanded for small $m$ as $c_{0}+c_{2}m^{2}+c_{3}m^{3}+c_{4}m^{4}+\ldots$ , for some constants $c_{0},c_{2},c_{3},c_{4}$ . The quantity $c_{2}$ depends on $h$ . Indeed, there is some critical value of $h$ , $h_{\rm cr}$ , above which the system is in the paramagnetic phase. This $h_{\rm cr}$ is the value of $h$ at which $c_{2}$ vanishes. For $h>h_{\rm cr}$ , the quantity $c_{2}$ is negative. Since $c_{2}$ depends linearly on $h-h_{\rm cr}$ for small $h-h_{\rm cr}$ , we are minimizing a function of $m$ of the form $c_{0}+d(h_{\rm cr}-h)m^{2}+c_{3}m^{3}+\ldots$ , for some constant $d$ .

Hence, for small $h-h_{\rm cr}$ , the minimum is at $m\sim h-h_{\rm cr}$ , i.e., $m^{2}\sim(h-h_{\rm cr})^{2}$ . This in fact is the scaling dependence of mass on distance from critical point which occurs for the $O(N)$ vector model in the limit of large $N$ in $2+1$ dimensions.
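The location of the minimum follows from a short calculation. As a sketch, truncate the expansion after the cubic term and assume $d>0$ and $c_{3}>0$ (the text fixes the sign of $c_{2}$ for $h>h_{\rm cr}$ but does not state the sign of $c_{3}$ , so that part is an assumption here):

```latex
\begin{align*}
f(m) &= c_{0}+d\,(h_{\rm cr}-h)\,m^{2}+c_{3}\,m^{3}, \\
f'(m) &= 2d\,(h_{\rm cr}-h)\,m+3c_{3}\,m^{2}=0
\quad\Longrightarrow\quad
m=0 \ \text{ or } \ m=\frac{2d\,(h-h_{\rm cr})}{3c_{3}}, \\
f''\!\left(\frac{2d\,(h-h_{\rm cr})}{3c_{3}}\right) &= 2d\,(h-h_{\rm cr}).
\end{align*}
```

For $h>h_{\rm cr}$ the nonzero root is positive and has $f^{\prime\prime}>0$ , so it is the local minimum, and it is linear in $h-h_{\rm cr}$ as stated.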

We leave it to the reader to verify that the singular behavior of the ground state energy matches that from the large- $N$ vector model.

V Beyond the Hamiltonian Formulation

The sum-of-squares method is tied to the Hamiltonian formulation of quantum mechanics. One might wonder: suppose one has a relativistically invariant field theory; is there some covariant version of the sum-of-squares method? One could of course build transfer matrices perpendicular to any spacelike surface and apply the sum-of-squares method to that transfer matrix.

However, one might wonder, is there some variant of the sum-of-squares method which is more closely related to the path integral, or action, formulation of quantum mechanics? Since the action formalism is often applied to systems where there is some nonlocal action in time after integrating out degrees of freedom, it is interesting to consider the possibility of applying a sum-of-squares to some system that is nonlocal in time. That is what we consider in this section.

To describe such a system, we define a function $Z(\beta,g)$ analogous to the partition function in the following way

[TABLE]

This expression will require some explanation. Here $\beta,g$ are non-negative real scalars. We assume that some finite dimensional Hilbert space is given, that $H$ is some Hermitian operator on this Hilbert space, and that each $\Delta_{a}$ is some (possibly non-Hermitian) operator, where $a$ is some discrete index ranging over some given finite set. The functions $F_{a}(\cdot)$ are some non-negative functions which are periodic with period $\beta$ , so that $F_{a}(x+\beta)=F_{a}(x)$ . The notation ${\cal T}{\rm tr}(\cdot)$ means a “time-ordered trace”. Let us define this formally for those unfamiliar with this notation: first, one formally expands the exponential in a power series. Then, any given term in the power series is some time-ordered trace of integrals over some “time parameters” $\tau_{1},\tau_{2},\ldots$ . We bring these integrals outside the “time-ordered trace” and also bring the functions $F_{a}$ outside the trace, i.e., for any function $f(\cdot)$ of some number of time parameters, and any operators $O_{1},O_{2},\ldots$ we define

[TABLE]

To define the time-ordered trace, we define

[TABLE]

where $\pi$ is a permutation such that

[TABLE]

If some time parameters in the integral coincide (e.g. $\tau_{1}=\tau_{2}$ ), we leave the time-ordered trace ill-defined, but this does not contribute to the integral.

Thus, what we would hope is that if $g>0$ then $Z(\beta,g)\leq Z(\beta,0)$ . Indeed, suppose we take a limit in which for each $a$ , the function $F_{a}(\tau-\tau^{\prime})=\lim_{\epsilon\rightarrow 0^{+}}\delta(\tau-\tau^{\prime}-\epsilon)$ . Here we are being slightly loose about the use of Dirac $\delta$ -functions but the reader can easily replace them with some sufficiently sharply peaked functions if desired. Then, this is the same as considering the partition function of Hamiltonian $H+g\sum_{a}\Delta_{a}^{\dagger}\Delta_{a}$ at inverse temperature $\beta$ , i.e., in this case $Z(\beta,g)={\rm tr}(\exp(-\beta(H+g\sum_{a}\Delta_{a}^{\dagger}\Delta_{a})))$ . This may be seen to be a non-increasing function of $g$ as in this case $\partial_{g}\ln Z(\beta,g)=-\beta\sum_{a}\langle\Delta_{a}^{\dagger}\Delta_{a}\rangle\leq 0$ , where $\langle\ldots\rangle$ denotes the thermal expectation value.
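The monotonicity in this local-in-time limit is easy to see in a small example. The sketch below assumes a single qubit with $H=VZ$ and one hypothetical, non-Hermitian operator $\Delta$ with matrix rows $(1,1)$ and $(0,0)$ , chosen only so that $\Delta^{\dagger}\Delta$ does not commute with $H$ ; the parameter values are likewise illustrative.

```python
import math


def partition_function(beta, g, V):
    """tr exp(-beta (H + g D^dag D)) for H = V Z and D = [[1, 1], [0, 0]] (illustrative)."""
    # D^dag D = [[1, 1], [1, 1]], so H + g D^dag D = [[V + g, g], [g, -V + g]],
    # whose eigenvalues are g +/- sqrt(V^2 + g^2).
    root = math.sqrt(V * V + g * g)
    return math.exp(-beta * (g + root)) + math.exp(-beta * (g - root))


beta, V = 2.0, 1.0
gs = [0.1 * k for k in range(21)]  # g from 0 to 2
values = [partition_function(beta, g, V) for g in gs]
is_non_increasing = all(later <= earlier for earlier, later in zip(values, values[1:]))
print(is_non_increasing)
```

In this example $\ln Z=-\beta g+\ln(2\cosh(\beta\sqrt{V^{2}+g^{2}}))$ , whose derivative with respect to $g$ is negative, so the sampled values decrease from $Z(\beta,0)=2\cosh(\beta V)$ .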

However, what we will argue is that if we consider more general $F(\cdot)$ , then we may have $Z(\beta,g)>Z(\beta,0)$ for some small nonzero $g>0$ . Indeed, we will do this in the case that $\Delta_{a}=\Delta^{\dagger}_{a}$ and that the functions $F_{a}$ are even. We give this and some other examples in Section V.1; these examples are probably well-known but I do not know a reference. Then in Section V.2, we give some sufficient conditions on the functions $F$ to have $Z(\beta,g)\leq Z(\beta,0)$ .

V.1 Some Counter-Examples

The first example has a two-dimensional Hilbert space, corresponding to a single qubit. We let $H=VZ$ where $V$ is a scalar. The index $a$ takes only a single possible value, so we omit the subscripts on $\Delta$ and $F$ . We let $\Delta=\Delta^{\dagger}=X$ . Finally, we choose

[TABLE]

for some $\tau_{0}>0$ , where the sum is over integer $n$ to make the function periodic in $\beta$ as required.

Then, let us take some fixed $V>0$, take $0<g\ll 1$, and take $V^{-1}\ll\tau_{0}\ll\beta$. To analyze this, note that if $g=0$, then $Z(\beta,0)$ simply describes the Hamiltonian $VZ$, whose ground state is the spin down state. Suppose we expand $Z(\beta,g)$ in powers of $g$. The first order term in $g$ is proportional to $\exp(-V\tau_{0})$, as each operator $\Delta$ flips the spin from the ground state to the excited state, and the spin does not flip back until a time $\tau_{0}$ later. However, at second order, we can obtain contributions which are not exponentially suppressed in $V\tau_{0}$, but only suppressed by a power of $V$. Indeed, consider a term

[TABLE]

Then, it may be that $V|\tau_{1}^{\prime}-\tau_{2}^{\prime}|\ll 1$ and $V|\tau_{1}-\tau_{2}|\ll 1$. For example, we might have $\tau_{1}=0,\tau_{2}=\epsilon,\tau^{\prime}_{1}=\tau_{0},\tau^{\prime}_{2}=\tau_{0}+\epsilon$, for some small $\epsilon$. In this case, the suppression is only exponentially small in $V\epsilon$, and integrating over $\epsilon$ simply gives a factor $V^{-1}$, so the overall contribution is of order $\beta g^{2}/V$.

We may then choose the parameters $g,V,\tau_{0}$ so that this particular second order contribution is exponentially larger than the first order contribution, and yet still have $g^{2}/V\ll 1$ . In this case, we expect that the partition function is growing exponentially in $\beta g^{2}/V$ , i.e., we may have $Z(\beta,g)>Z(\beta,0)$ .

We omit any formal proof that the partition function is growing exponentially in $\beta g^{2}/V$ (though it is probably not difficult), but instead give the standard argument. First, of course, there is another second order contribution to $Z(\beta,g)$ which is proportional to $\beta^{2}$. Indeed, this contribution is $1/2$ times the square of the first order contribution. However, the standard way to deal with this is to perform a series expansion for $\ln(Z(\beta,g))$, and then the only second order contribution is the positive contribution proportional to $\beta g^{2}/V$, and we may choose parameters so that the sum of the first two contributions is positive.

In this example, the second order contribution is positive and larger in magnitude than the first order contribution. It is easy to see that the sign of the first order contribution is always negative. It is interesting to note though that we may also have a negative sign for the second order contribution to $\ln(Z)$. To do this, we take a four-dimensional Hilbert space, corresponding to two qubits. We let $H=VZ_{1}$ where $V$ is a scalar. We let $\Delta_{1}=X_{1}X_{2}$, let $\Delta_{2}=X_{1}Y_{2}$, and let $\Delta_{3}=X_{1}Z_{2}$. Finally, we choose $F_{a}(x)=F(x)$, where $F(x)$ is as before. Then, the first order contribution is exponentially suppressed as before, but one may verify that the sign of the second order contribution is negative.

V.2 Sufficient Conditions

Having seen that it is not enough to have $F_{a}(\tau-\tau^{\prime})$ be a non-negative function to have $Z(\beta,g)\leq Z(\beta,0)$ , we now give some sufficient conditions. Rather than just stating the conditions and then proving that they are sufficient, we instead derive them in some sense.

Consider the following alternative definition of a function $Z(\beta,g)$ . The Hilbert space is a tensor product of a finite dimensional Hilbert space on which $H,\Delta_{a}$ act and also some additional harmonic oscillators, one such oscillator for each choice of the index $a$ . We let

[TABLE]

where the trace is over both finite dimensional and harmonic oscillator Hilbert space, where

[TABLE]

where $b_{a},b_{a}^{\dagger}$ are annihilation and creation operators on the given harmonic oscillator and $\epsilon_{a}>0$ are real scalars, and where

[TABLE]

where $\omega_{a}$ are some real numbers which are integer multiples of $2\pi/\beta$, and $\Delta_{a}$ are some operators.

Note that if all $\omega_{a}$ are equal to $0$, then we do not need to use time-ordered traces. In this case we would simply have $Z(\beta,g)={\rm tr}(\exp[-\beta(H+\sum_{a}\epsilon_{a}b_{a}^{\dagger}b_{a}+i\sqrt{g}\sum_{a}(\Delta_{a}b_{a}^{\dagger}+\Delta_{a}^{\dagger}b_{a}))])$.

For $g\geq 0$, the term proportional to $i\sqrt{g}$ is anti-Hermitian, regardless of $\omega_{a}$. We claim then that $Z(\beta,g)\leq Z(\beta,0)$. Indeed, this follows because we can (to any desired accuracy) approximate $Z(\beta,g)$ using a Trotter-Suzuki decomposition by

[TABLE]

where each unitary matrix $U_{j}=\exp[i\sqrt{g}(\beta/n)\sum_{a}(\exp(-i\omega_{a}\tau)\Delta_{a}b^{\dagger}_{a}+\exp(+i\omega_{a}\tau)\Delta_{a}^{\dagger}b_{a})]$, and where the error in the Trotter-Suzuki approximation tends to zero as the integer $n$ tends to infinity. Then, by a generalization of von Neumann’s trace inequality due to Fan (1951), this is bounded by $Z(\beta,0)$; see also theorem 20B.2 of Marshall et al. (1979) for this generalization. The generalization is that given any trace ${\rm tr}(A_{1}U_{1}A_{2}U_{2}\ldots A_{m}U_{m})$, where the $U_{j}$ are unitary matrices and the $A_{j}$ have singular values $\sigma_{1}(A_{j})\geq\sigma_{2}(A_{j})\geq\ldots\geq 0$, the trace is bounded in absolute value by $\sum_{i}\sigma_{i}(A_{1})\sigma_{i}(A_{2})\ldots\sigma_{i}(A_{m})$.
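The trace inequality quoted above can be spot-checked on random instances. The following is a sketch only, assuming NumPy; the dimension `d`, the number of factors `m`, and the random ensembles are arbitrary choices made for illustration, and a numerical check of this kind is of course not a proof.

```python
import numpy as np

rng = np.random.default_rng(0)


def random_unitary(d):
    """Unitary matrix from the QR decomposition of a complex Gaussian matrix."""
    gauss = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    q, _ = np.linalg.qr(gauss)
    return q


def trace_and_bound(d, m):
    """Return |tr(A_1 U_1 ... A_m U_m)| and sum_i prod_j sigma_i(A_j)."""
    prod = np.eye(d, dtype=complex)
    sigma_prod = np.ones(d)
    for _ in range(m):
        A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
        U = random_unitary(d)
        prod = prod @ A @ U
        # np.linalg.svd returns singular values in descending order, so the
        # i-th entries of successive factors are multiplied together.
        sigma_prod *= np.linalg.svd(A, compute_uv=False)
    return abs(np.trace(prod)), float(sigma_prod.sum())


results = [trace_and_bound(d=4, m=3) for _ in range(200)]
```

Each pair in `results` holds the absolute value of the trace and the singular-value bound for one random instance.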

However, we may then, using standard field theory techniques, integrate out the harmonic oscillators, leaving an expression that involves only the finite dimensional Hilbert space. The result is

[TABLE]

where the notation ${\cal T}{\rm tr}_{\rm qudit}(\cdot)$ indicates a time-ordered trace just over the finite dimensional Hilbert space, where $Z_{\rm harmonic}(\beta)={\rm tr}_{\rm harmonic}(\exp(-\beta\sum_{a}\epsilon_{a}b^{\dagger}_{a}b_{a}))$ is the partition function of the harmonic oscillators (here the notation indicates that the trace is just over the harmonic oscillator Hilbert space), and where

[TABLE]

where the notation $\langle\ldots\rangle_{\beta}$ denotes a thermal expectation value at inverse temperature $\beta$ as defined by the above equation. We use the symbol $G_{a}$ to denote this function because it is a Green’s function of the harmonic oscillator.

For finite $\beta$, the Green’s function is a periodic function of $\tau-\tau^{\prime}$ with period $\beta$. However, it is interesting to consider the limit of large $\beta$. In this case, $G_{a}$ converges to $\theta(\tau-\tau^{\prime})\exp(-\epsilon_{a}(\tau-\tau^{\prime}))$, where $\theta(\cdot)$ is a step function. That is, when $|\tau-\tau^{\prime}|$ is small compared to $\beta$, the Green’s function is well approximated by a step function times a decaying exponential. We remind the reader that we are being slightly careless about $\delta$-functions and step functions, but it is not difficult to fill in the details.
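The large-$\beta$ limit can be illustrated numerically. The sketch below assumes the standard periodic form $G(\tau)=\exp(-\epsilon\,(\tau\bmod\beta))/(1-\exp(-\beta\epsilon))$ for a single oscillator with Hamiltonian $\epsilon b^{\dagger}b$; this closed form, and the function names, are assumptions made for illustration, since the defining equation is not reproduced here. The closed form is compared against a truncated Fock-space sum for $\langle b(\tau)b^{\dagger}(0)\rangle_{\beta}$.

```python
import numpy as np


def green_closed_form(tau, beta, eps):
    """Assumed periodic oscillator Green's function with period beta."""
    t = np.mod(tau, beta)                      # fold tau into [0, beta)
    return np.exp(-eps * t) / (1.0 - np.exp(-beta * eps))


def green_fock_sum(tau, beta, eps, n_max=400):
    """<b(tau) b^dagger(0)> for 0 <= tau < beta from a truncated Fock-space trace."""
    n = np.arange(n_max)
    boltzmann = np.exp(-beta * eps * n)
    return np.sum((n + 1) * boltzmann) * np.exp(-eps * tau) / np.sum(boltzmann)


eps, tau = 1.0, 0.3
betas = [2.0, 5.0, 20.0, 80.0]
forward = [green_closed_form(tau, b, eps) for b in betas]     # tau > 0
backward = [green_closed_form(-tau, b, eps) for b in betas]   # tau < 0
```

As `beta` grows, `forward` approaches $\exp(-\epsilon\tau)$ while `backward` tends to zero, which is the step-function behavior described above.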

Thus, identifying $F_{a}(\tau-\tau^{\prime})=\exp(i\omega_{a}(\tau-\tau^{\prime}))\exp(-\epsilon_{a}(\tau-\tau^{\prime}))\theta(\tau-\tau^{\prime})$ gives us a choice of functions $F_{a}$ for which $Z(\beta,g)$ as in Eq. 15 has the property $Z(\beta,g)\leq Z(\beta,0)$ for $g>0$ .

Of course, we can also use this trick of integrating out harmonic oscillators to obtain functions $F_{a}(\tau-\tau^{\prime})$ which are equal to $\exp(-\epsilon_{a}(\tau-\tau^{\prime}))\theta(\tau-\tau^{\prime})\sum_{b}W_{b}\exp(i\omega_{a,b}(\tau-\tau^{\prime}))$ where the sum is over some discrete index $b$ , where $W_{b}>0$ , and where $\omega_{a,b}$ is some function of $a$ and $b$ , by indexing the harmonic oscillators with a pair of indices $a,b$ .

By doing this and taking a large number of terms in the sum while taking $\epsilon_{a}\rightarrow 0^{+}$ , we expect that we will have the property $Z(\beta,g)\leq Z(\beta,0)$ for $g>0$ in the limit of large $\beta$ whenever $F_{a}(\tau-\tau^{\prime})=\theta(\tau-\tau^{\prime})f_{a}(\tau-\tau^{\prime})$ for any choices of functions $f_{a}$ which have Fourier transforms which are non-negative and sufficiently well-behaved (we leave details of what this would mean to the reader!).

VI On Classical Methods for SYK Ground States

In this section, we consider two classical variational methods for approximating the ground state energy of the SYK model. One method is the Lanczos algorithm, starting with a Gaussian wavefunction (we explain Gaussian states and wavefunctions in more detail below), and the other method is a sum of Gaussian wavefunctions. We prove limitations on the power of these methods.

VI.1 Background

The SYK model (Sachdev and Ye (1993); Kitaev (2015)) is a model of fermions with randomly chosen interactions. See Rosenhaus (2019); García-García and Verbaarschot (2016); García-García et al. (2018); Feng et al. (2019, 2018, 2020).

In this paper, we largely consider the degree- $4$ SYK model. In this case, the Hamiltonian is

[TABLE]

where $\gamma_{i}$ are Majorana operators obeying the anti-commutation relations $\{\gamma_{i},\gamma_{j}\}=2\delta_{i,j}$ , with $i\in\{1,2,\ldots,n\}$ , with $n$ even, and where the entries of the tensor $J$ are independent Gaussians, up to the requirement that $J$ be totally anti-symmetric in its indices.
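For small $n$ the model can be built explicitly as a matrix. The following is a minimal sketch, assuming NumPy and a Jordan-Wigner representation of the Majorana operators. The specific form used here, a sum over $i<j<k<l$ of $J_{ijkl}\gamma_{i}\gamma_{j}\gamma_{k}\gamma_{l}$ with independent Gaussian coefficients whose expected sum of squares is $1$, is an assumption made for illustration; the prefactor and index conventions may differ from those of the Hamiltonian above by constants.

```python
import itertools
from functools import reduce

import numpy as np


def majoranas(n):
    """Jordan-Wigner matrices for n Majorana operators on n/2 qubits (n even)."""
    X = np.array([[0, 1], [1, 0]], dtype=complex)
    Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
    Z = np.array([[1, 0], [0, -1]], dtype=complex)
    I2 = np.eye(2, dtype=complex)
    gammas = []
    for q in range(n // 2):
        for P in (X, Y):
            factors = [Z] * q + [P] + [I2] * (n // 2 - q - 1)
            gammas.append(reduce(np.kron, factors))
    return gammas


def syk4(n, rng):
    """Sample a degree-4 SYK matrix; the normalization is an assumption."""
    g = majoranas(n)
    quads = list(itertools.combinations(range(n), 4))
    # One independent Gaussian per i<j<k<l, with E[sum of J^2] = 1.
    J = rng.normal(scale=1.0 / np.sqrt(len(quads)), size=len(quads))
    dim = 2 ** (n // 2)
    H = np.zeros((dim, dim), dtype=complex)
    for coeff, (i, j, k, l) in zip(J, quads):
        H += coeff * g[i] @ g[j] @ g[k] @ g[l]
    return H, g


n = 8
H, g = syk4(n, np.random.default_rng(1))
largest_eigenvalue = np.linalg.eigvalsh(H)[-1]
```

A product of four distinct Majorana operators is Hermitian, so `H` is Hermitian for real coefficients. Only very small `n` is practical this way, since the matrix dimension is $2^{n/2}$.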

More generally, one can consider a degree-$q$ SYK model for even $q>4$. In this case, we mean a sum of degree-$q$ monomials in Majorana variables, with Gaussian random coefficients, with variance chosen so that the expected sum-of-squares of coefficients is equal to $1$. We discuss this briefly later, but if not otherwise specified, we mean the degree-$4$ model.

We choose the variance of the Gaussians so that the $\ell_{2}$ norm of $J$ (i.e., the square-root of the sum-of-squares of its entries) is a constant, independent of $n$. This is a different normalization than that considered in physics, where instead for the degree-$4$ model the $\ell_{2}$ norm is of order $\sqrt{n}$, but it is convenient for us and was used in Ref. Hastings and O’Donnell (2022).

Also, we will (following Hastings and O’Donnell (2022)) consider approximating the state with most positive eigenvalue (i.e., the highest excited state), rather than the state with most negative eigenvalue (i.e., the ground state). The distribution of $J$ is invariant under change of sign, so this has no effect, but it avoids some signs later.

Mathematical physics results predict that with this normalization, the largest eigenvalue is proportional to $\sqrt{n}$ with high probability, and even predict the leading coefficient, though this is not proven. In Feng et al. (2019), it is proven that with high probability the largest eigenvalue is ${\cal O}(\sqrt{n})$. In Hastings and O’Donnell (2022), it is also proven that with high probability the largest eigenvalue is $\Omega(\sqrt{n})$, thus proving that the eigenvalue is $\Theta(\sqrt{n})$ with high probability.

The eigenstate with largest eigenvalue of the SYK model is predicted to be highly entangled (Liu et al. (2018)). In this regard, it is interesting to see to what extent one can find variational states which still have a large expectation value for the SYK Hamiltonian. In Hastings and O’Donnell (2022), it was shown that, with high probability, one can efficiently construct on a quantum computer a quantum variational state whose energy is $\Theta(\sqrt{n})$, where by “energy” of a state, we simply mean the expectation value of the SYK Hamiltonian.

However, suppose we restrict to variational states whose energy can be efficiently evaluated on a classical computer. In this case, Haldar et al. (2021) proved an important negative result: with high probability, for any Gaussian state, the expectation value of the SYK Hamiltonian is ${\cal O}(1)$ , which has a different scaling with $n$ than the largest eigenvalue.

In this section, we prove further results. Our main result is to bound the expectation value of the energy of a sum of Gaussian wavefunctions, i.e., some sum $\sum_{i}a_{i}\psi_{i}$, where the $\psi_{i}$ are Gaussian wavefunctions, under an assumption on the norm explained later. Note that a sum of polynomially many Gaussian wavefunctions is an important class of states for which one can efficiently evaluate the energy on a classical computer (Bravyi and Gosset (2017); Boutin and Bauer (2021)). Importantly, these Gaussian wavefunctions do not need to be orthogonal to each other. If they are orthogonal, our norm assumption is fulfilled, so long as the total number of Gaussians is sufficiently small; the number we allow is exponentially large in a power of $n$. Even if the wavefunctions are not orthogonal, the norm assumption may be fulfilled. As a corollary, we prove a limitation on the power of Lanczos methods starting with a Gaussian wavefunction. The Lanczos method is a variational method within the subspace, called a “Krylov space”, which is the span of $\psi,H\psi,H^{2}\psi,\ldots,H^{k}\psi$, for some finite $k$. When we say we “start” with a Gaussian wavefunction, we mean that $\psi$ is a Gaussian wavefunction (explained next).
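The variational character of the Krylov space can be illustrated with a short sketch, assuming NumPy. For simplicity it orthonormalizes the Krylov vectors by a QR decomposition rather than by the three-term Lanczos recurrence, and it uses a generic random symmetric matrix and random start vector as stand-ins; neither the SYK Hamiltonian nor a Gaussian wavefunction is used here.

```python
import numpy as np


def krylov_max_ritz(H, psi, k):
    """Largest eigenvalue of H restricted to span{psi, H psi, ..., H^k psi}."""
    vecs = [psi / np.linalg.norm(psi)]
    for _ in range(k):
        w = H @ vecs[-1]
        vecs.append(w / np.linalg.norm(w))     # rescaling leaves the span unchanged
    # Orthonormal basis of the Krylov space (QR instead of the Lanczos recurrence).
    Q, _ = np.linalg.qr(np.column_stack(vecs))
    H_proj = Q.conj().T @ H @ Q
    H_proj = (H_proj + H_proj.conj().T) / 2    # remove rounding asymmetry
    return np.linalg.eigvalsh(H_proj)[-1]


rng = np.random.default_rng(5)
dim = 64
A = rng.normal(size=(dim, dim))
H = (A + A.T) / 2                              # generic stand-in Hamiltonian
psi = rng.normal(size=dim)                     # generic stand-in start vector

ritz = [krylov_max_ritz(H, psi, k) for k in range(7)]
lam_max = np.linalg.eigvalsh(H)[-1]
```

Because the Krylov spaces are nested, the entries of `ritz` are non-decreasing in `k` and never exceed `lam_max`.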

Before doing this, let us define what we mean by Gaussian states and give some mathematical background on Wick’s theorem. Here a “state” refers to a density matrix. Gaussian states are those states in which expectation values of any product of Majorana operators are determined by Wick’s theorem (below). The pure Gaussian states are precisely the states which are ground states of Hamiltonians which have unique ground states and which are quadratic in Majorana operators. See Ref. Bravyi (2004) for more details.

We use the term “wavefunction” to mean a vector $\psi$ in the Hilbert space describing the given quantum system, so that for a normalized wavefunction $|\psi\rangle$ , the projector $|\psi\rangle\langle\psi|$ is a state. We say that $\psi$ is a Gaussian wavefunction if the corresponding projector is a Gaussian state.

Let us briefly review Wick’s theorem. Let $M$ be a matrix with matrix elements $M_{lm}=\langle\gamma_{l}\gamma_{m}\rangle$ , where $\langle\ldots\rangle$ denotes the expectation value in a given Gaussian state. In general we have $M=I+iB$ , where $I$ is the identity matrix and $B$ is a real anti-symmetric matrix, with eigenvalues of $iB$ bounded by $1$ in absolute value. If the Gaussian state is pure, then $B^{2}=-I$ , and the state is a ground state of a quadratic Majorana Hamiltonian.
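The stated structure of $M$ can be verified numerically for a small pure Gaussian state. The sketch below assumes NumPy and a Jordan-Wigner representation; the quadratic Hamiltonian $\frac{i}{4}\sum_{l,m}A_{lm}\gamma_{l}\gamma_{m}$ with a random real anti-symmetric `A` is a hypothetical example chosen for illustration.

```python
from functools import reduce

import numpy as np


def majoranas(n):
    """Jordan-Wigner matrices for n Majorana operators on n/2 qubits (n even)."""
    X = np.array([[0, 1], [1, 0]], dtype=complex)
    Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
    Z = np.array([[1, 0], [0, -1]], dtype=complex)
    I2 = np.eye(2, dtype=complex)
    gammas = []
    for q in range(n // 2):
        for P in (X, Y):
            factors = [Z] * q + [P] + [I2] * (n // 2 - q - 1)
            gammas.append(reduce(np.kron, factors))
    return gammas


n = 6
rng = np.random.default_rng(2)
g = majoranas(n)

# Random real anti-symmetric coupling matrix (hypothetical example).
A = rng.normal(size=(n, n))
A = A - A.T
H_quad = sum(0.25j * A[l, m] * g[l] @ g[m] for l in range(n) for m in range(n))

evals, vecs = np.linalg.eigh(H_quad)
psi = vecs[:, 0]                               # ground state, a pure Gaussian state

# Two-point function M_lm = <gamma_l gamma_m>.
M = np.array([[psi.conj() @ g[l] @ g[m] @ psi for m in range(n)] for l in range(n)])
B = M.imag                                     # M = I + iB
```

For a generic `A` the ground state is unique, and one finds that the real part of `M` is the identity, that `B` is anti-symmetric, and that `B @ B` equals minus the identity.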

Any higher order expectation value $\langle\gamma_{i_{1}}\gamma_{i_{2}}\ldots\gamma_{i_{2m}}\rangle$ in a Gaussian state can be computed as follows. Consider all possible ways of pairing the $2m$ different Majorana operators with each other. There are $(2m-1)!!\equiv(2m-1)\cdot(2m-3)\cdot\ldots\cdot 1$ such pairings (for $H^{k}$ in the degree-$4$ model, $2m=4k$, giving $(4k-1)!!$ pairings). We will regard a pairing of these Majorana operators as a pairing of the integers $1,2,\ldots,2m$. Each pairing defines some sequence of pairs $(a_{1},b_{1}),(a_{2},b_{2}),\ldots,(a_{m},b_{m})$ with $a_{1}<a_{2}<\ldots<a_{m}$ and $a_{j}<b_{j}$ for each $j$. For each pairing, consider the product

$$\prod_{j=1}^{m}M_{i_{a_{j}},i_{b_{j}}}.$$

Then, sum this product over all pairings, with a sign equal to the sign of the permutation from the sequence $a_{1},b_{1},a_{2},b_{2},\ldots,a_{m},b_{m}$ to the sequence $1,2,3,\ldots,2m$ .

Remark: often when Wick’s theorem is given for Majorana fermions, it is assumed that all $i_{1},\ldots,i_{2m}$ are distinct from each other. Of course, we can always reduce to this case by using the Majorana anti-commutation relations. In this case, the sum over pairings has a nice representation as a Pfaffian.
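The sum over pairings can be written out directly. The following sketch uses plain Python with zero-based indices; the function names are hypothetical. As a test case it uses the two-point function of the all-zeros computational basis state in a Jordan-Wigner representation with $\gamma_{2q+1}=Z\cdots Z\,X$ and $\gamma_{2q+2}=Z\cdots Z\,Y$ on qubit $q$, for which $\langle\gamma_{2q+1}\gamma_{2q+2}\rangle=i$ because $XY=iZ$; this representation is an assumption made for the example.

```python
def pairings(items):
    """Yield all perfect matchings of a list as lists of pairs (a, b) with a < b."""
    if not items:
        yield []
        return
    a, rest = items[0], items[1:]
    for pos, b in enumerate(rest):
        for tail in pairings(rest[:pos] + rest[pos + 1:]):
            yield [(a, b)] + tail


def perm_sign(seq):
    """Sign of the permutation that sorts seq, from the number of inversions."""
    inversions = sum(
        1 for i in range(len(seq)) for j in range(i + 1, len(seq)) if seq[i] > seq[j]
    )
    return -1 if inversions % 2 else 1


def wick(M, idx):
    """<gamma_{idx[0]} ... gamma_{idx[-1]}> as a signed sum over pairings."""
    total = 0.0
    for pairing in pairings(list(range(len(idx)))):
        flat = [x for pair in pairing for x in pair]   # a_1, b_1, a_2, b_2, ...
        term = perm_sign(flat)
        for a, b in pairing:
            term = term * M[idx[a]][idx[b]]
        total += term
    return total


# Two-point function M = I + iB of the all-zeros state on two qubits.
M = [
    [1, 1j, 0, 0],
    [-1j, 1, 0, 0],
    [0, 0, 1, 1j],
    [0, 0, -1j, 1],
]

four_point = wick(M, [0, 1, 2, 3])     # <gamma_1 gamma_2 gamma_3 gamma_4>
repeated = wick(M, [0, 1, 1, 0])       # <gamma_1 gamma_2 gamma_2 gamma_1>
num_pairings_of_six = len(list(pairings(list(range(6)))))
```

In this state $\gamma_{1}\gamma_{2}\gamma_{3}\gamma_{4}=(iZ_{1})(iZ_{2})$ has expectation value $-1$, and $\gamma_{1}\gamma_{2}\gamma_{2}\gamma_{1}$ is the identity, so the repeated-index case also comes out correctly without first reducing to distinct indices.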

VI.2 Main Results

We now prove the main results.

The results are largely corollaries of the following:

Theorem 1.

With high probability, for $H$ drawn from the SYK distribution, the expectation value of $H^{k}$ , for integer $k\geq 0$ , in any Gaussian state is bounded by

$$\left|{\rm tr}(\rho H^{k})\right|\leq({\cal O}(k))^{2k}.$$

Proof.

We compute the expectation value of $H^{k}$ by Wick’s theorem. We bound the product $\prod_{j}M_{i_{a_{j}},i_{b_{j}}}$ in each pairing, and then sum over pairings using a triangle inequality.

To bound a pairing, as in Hastings and O’Donnell (2022), note that a pairing can be regarded as a tensor network. There are degree $4$ vertices corresponding to the four-index tensor $J$ . There are $k$ such vertices. We join these tensors in the way corresponding to the given pairing, and then in each edge we insert a degree- $2$ vertex, corresponding to the matrix $M$ . We will write this tensor network as a particular product of vectors and matrices.

Let us define $J_{2,2}^{mat}$ to be an $n^{2}$ -by- $n^{2}$ matrix, with rows (and columns) indexed by pairs $(i,j)$ with $i,j\in\{1,\ldots,n\}$ , with matrix element $(J_{2,2}^{mat})_{(i,j),(k,l)}=J_{i,j,k,l}$ . The matrix $J_{2,2}^{mat}$ was called simply $J^{mat}$ in Hastings and O’Donnell (2022). We use this additional notation because we also introduce matrices $J_{p,4-p}^{mat}$ which are defined to be $n^{p}$ -by- $n^{4-p}$ matrices, where the rows are indexed by $p$ integers from $\{1,\ldots,n\}$ and the columns are indexed by $4-p$ such integers, with matrix elements defined in the obvious way:

$$(J_{p,4-p}^{mat})_{(i_{1},\ldots,i_{p}),(i_{p+1},\ldots,i_{4})}=J_{i_{1},i_{2},i_{3},i_{4}}.$$

Then we assign each of the degree-$2$ vertices of the network a label, either “left” or “right”. Then, the value of the tensor network can be written as an expectation value as follows. Let $M_{vec}$ be a vector in ${\mathbb{C}}^{n^{2}}$ given by regarding $M$ as a vector. Then, the contraction of the tensor network equals

[TABLE]

Let us explain the meaning of this. We let $N_{L}$ be the number of left vertices and $N_{R}$ be the number of right vertices, with $N_{L}+N_{R}=2k$ . Here we are using the bra-ket notation simply as a way of writing a vector-matrix-vector product, with the bra vector in ${\mathbb{C}}^{n^{2N_{L}}}$ and the ket vector in ${\mathbb{C}}^{n^{2N_{R}}}$ . Here we label basis vectors of ${\mathbb{C}}^{n^{2N_{L}}}$ by $2N_{L}$ indices, each in $\{1,\ldots,n\}$ , and $\pi_{L}$ denotes an operator which applies some permutation of these indices, i.e., it maps a given basis vector to some other basis vector by permuting the indices in some given way. We define $\pi_{R}$ similarly to permute indices. The number $k_{p,4-p}$ is equal to the number of degree $4$ vertices in the tensor network for which $p$ edges connect to left vertices and $4-p$ connect to right vertices.

The advantage of this representation is that we can apply norm bounds. Let $v_{L}=\pi_{L}(M_{vec})^{\otimes N_{L}}$ and let $v_{R}=\pi_{R}(M_{vec})^{\otimes N_{R}}$ . We have $|v_{L}|\leq{\cal O}(n^{N_{L}/2})$ and $|v_{R}|\leq{\cal O}(n^{N_{R}/2})$ , where we use the $\ell_{2}$ norm.

Standard random matrix theory bounds show that, with high probability, $\|J_{2,2}^{mat}\|\leq{\cal O}(1/n)$ , and $\|J_{1,3}^{mat}\|=\|J_{3,1}^{mat}\|\leq{\cal O}(1/\sqrt{n})$ and $\|J_{0,4}^{mat}\|=\|J_{4,0}^{mat}\|\leq{\cal O}(1)$ , where $\|\ldots\|$ denotes the operator norm. In the rest of the proof, we will assume that these bounds on operator norms hold.
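The different flattenings of $J$ and their operator norms can be computed directly for small $n$. The following sketch assumes NumPy; the function names are hypothetical, and rescaling the sampled tensor to exactly unit $\ell_{2}$ norm is a simplifying assumption (the text fixes the variance, not the norm of each sample). A single small $n$ cannot exhibit the scaling with $n$; the sketch only shows how the matrices $J_{p,4-p}^{mat}$ are formed.

```python
import itertools

import numpy as np


def perm_sign(p):
    """Sign of a permutation given as a tuple, from the number of inversions."""
    inversions = sum(
        1 for a in range(len(p)) for b in range(a + 1, len(p)) if p[a] > p[b]
    )
    return -1.0 if inversions % 2 else 1.0


def random_antisymmetric_J(n, rng):
    """Totally anti-symmetric 4-index Gaussian tensor, rescaled to unit l2 norm."""
    J = np.zeros((n, n, n, n))
    for quad in itertools.combinations(range(n), 4):
        x = rng.normal()
        for perm in itertools.permutations(range(4)):
            idx = tuple(quad[p] for p in perm)
            J[idx] = perm_sign(perm) * x
    return J / np.linalg.norm(J)               # norm of the flattened tensor


def flattening_norms(J):
    """Operator norm of the n^p-by-n^(4-p) flattening, for p = 0, ..., 4."""
    n = J.shape[0]
    return {p: np.linalg.norm(J.reshape(n ** p, n ** (4 - p)), 2) for p in range(5)}


J = random_antisymmetric_J(8, np.random.default_rng(3))
norms = flattening_norms(J)
```

The flattenings with `p = 0` and `p = 4` are vectors, so their operator norm equals the $\ell_{2}$ norm of $J$, while the other flattenings have operator norm at most that value.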

So, the value of the tensor network is bounded in absolute value by

[TABLE]

This holds for any labelling. We now choose a labelling. We claim that it is possible to label the degree-$2$ vertices so that each degree-$4$ vertex has two left neighbors and two right neighbors, i.e., so that $k_{2,2}=k$. Then, the value of the tensor network is bounded in absolute value by

[TABLE]

Summing over all $(4k-1)!!\leq(4k)^{2k}$ pairings, the theorem follows.

To prove the claim on the labelling of vertices, it is convenient to consider the multigraph obtained by connecting the degree-$4$ vertices according to the given pairing, without inserting the degree-$2$ vertices in each edge. In this case, the question is whether we can color the edges of a regular (i.e., all vertices have the same degree) degree-$4$ multigraph with two colors so that each vertex has two edges of each color attached to it. This is possible as follows (I thank R. O’Donnell for this proof of the existence of such a coloring): since the multigraph has even degree, each connected component is Eulerian. Take an Eulerian cycle in each connected component and alternately color the edges. Since the degree of each vertex is zero mod $4$, the total number of edges is even in each connected component, and so the Eulerian cycle has even length, making this alternation of coloring possible. ∎
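The coloring argument is constructive and can be sketched in a few lines of plain Python. The function names and the example multigraph below are hypothetical; the Eulerian cycle is found with Hierholzer's algorithm, which is one standard choice and is not specified in the text. A self-loop is counted as two edge ends at its vertex.

```python
from collections import defaultdict


def two_color_edges(edges):
    """Color edges 0/1 by alternating along an Eulerian cycle of each component."""
    adj = defaultdict(list)
    for eid, (u, v) in enumerate(edges):
        adj[u].append((eid, v))
        adj[v].append((eid, u))
    used = [False] * len(edges)
    color = [None] * len(edges)
    for start in list(adj):
        # Hierholzer's algorithm, recording edges in the order of the cycle.
        stack, cycle = [(start, None)], []
        while stack:
            v, e_in = stack[-1]
            while adj[v] and used[adj[v][-1][0]]:
                adj[v].pop()                   # discard edges already traversed
            if adj[v]:
                eid, w = adj[v].pop()
                used[eid] = True
                stack.append((w, eid))
            else:
                stack.pop()
                if e_in is not None:
                    cycle.append(e_in)
        for pos, eid in enumerate(cycle):
            color[eid] = pos % 2               # alternate colors along the cycle
    return color


def color_counts(edges, color):
    """For each vertex, the number of edge ends of color 0 and of color 1."""
    counts = defaultdict(lambda: [0, 0])
    for (u, v), c in zip(edges, color):
        counts[u][c] += 1
        counts[v][c] += 1
    return dict(counts)


# Hypothetical 4-regular multigraph: a triangle with doubled edges, plus a
# separate vertex carrying two self-loops.
edges = [(0, 1), (0, 1), (1, 2), (1, 2), (0, 2), (0, 2), (3, 3), (3, 3)]
color = two_color_edges(edges)
counts = color_counts(edges, color)
```

For this example every vertex ends up with two edge ends of each color, as the argument requires.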

Remark: one may also consider the case of the degree- $q$ SYK model for even $q>4$ . There are some modifications needed to the above proof. We may define an analogous $J^{mat}_{p,q-p}$ . Then, we have, with high probability that $\|J^{mat}_{q/2,q/2}\|\leq{\cal O}(n^{-q/4})$ . Also, if $q=0\mod 4$ , then the analogous coloring argument works, and it is possible to color edges of a regular degree- $q$ multigraph so that each vertex has $q/2$ edges of each color attached to it. However, if $q=2\mod 4$ , then the coloring argument need not work since the length of the Eulerian cycle is odd if there are an odd number of vertices in a given connected component. In that case, if the number of vertices in a component is odd, it is possible to color so that all but one vertex in each connected component has $q/2$ edges of each color attached to it, and the remaining vertex may be colored so that it has $q/2+1$ edges of one color and $q/2-1$ edges of the other color. Indeed, this may be the best possible; for example, consider the case $q=6$ and consider a multigraph with three vertices of degree $6$ , with $3$ edges connecting each pair of vertices.

Hence,

Corollary 1.

Let $H$ be drawn from the SYK distribution. Let $c$ be any positive constant. Then, with high probability, for any Gaussian state $\rho$ the projection of $\rho$ onto the eigenspace of $H$ with eigenvalue $\geq c\sqrt{n}$ is exponentially small in $n^{1/4}$ , i.e., the projection is bounded by $(c^{\prime})^{n^{1/4}}$ for some $c^{\prime}<1$ .

Proof.

Let $\Pi$ project onto the eigenspace of $H$ with eigenvalue $\geq c\sqrt{n}$. Pick $k$ even. Then ${\rm tr}(\rho H^{k})\geq{\rm tr}(\rho\Pi)c^{k}n^{k/2}$. By Theorem 1, with high probability we have ${\rm tr}(\rho H^{k})\leq({\cal O}(k))^{2k}$. So ${\rm tr}(\rho\Pi)\leq({\cal O}(k))^{2k}c^{-k}n^{-k/2}=({\cal O}(k)n^{-1/4}c^{-1/2})^{2k}$. We pick $k$ to be the largest even integer less than $c^{\prime\prime}n^{1/4}$ for some $c^{\prime\prime}>0$. Picking $c^{\prime\prime}$ small enough, the result follows. ∎
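The choice of $k$ in this proof can be made concrete. The sketch below uses plain Python and writes the bound of Theorem 1 as $(Ck)^{2k}$ for an explicit constant `C`; the value of `C`, the default choice of $c^{\prime\prime}$, and the function name are hypothetical and serve only to illustrate how the exponent scales with $n^{1/4}$.

```python
import math


def log_projection_bound(n, c, C=1.0, c2=None):
    """Return (k, ln of (C k n^{-1/4} c^{-1/2})^{2k}) for the k chosen in the proof.

    k is the largest even integer less than c2 * n^{1/4}.  C stands in for the
    constant hidden in the O(k) of Theorem 1 and is a hypothetical value.
    """
    if c2 is None:
        c2 = math.sqrt(c) / (math.e * C)       # makes the base smaller than 1/e
    limit = c2 * n ** 0.25
    k = 2 * math.floor(limit / 2)
    if k >= limit:                             # enforce strictly less than limit
        k -= 2
    if k <= 0:
        return 0, 0.0                          # only the trivial bound tr(rho Pi) <= 1
    base = C * k / (n ** 0.25 * math.sqrt(c))
    return k, 2 * k * math.log(base)


k_example, log_bound_example = log_projection_bound(10 ** 8, 1.0)
```

With the default `c2` the base is below $1/e$, so the logarithm of the bound is at most $-2k$, which is linear in $n^{1/4}$.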

So,

Corollary 2.

Let $H$ be drawn from the SYK distribution. Given any sum of polynomially many Gaussian wavefunctions, of the form $\Psi\equiv\sum_{i}a_{i}\psi_{i}$ , with $\psi_{i}$ being Gaussian wavefunctions, then, with high probability, if $\log(\sum_{i}|a_{i}|/|\Psi|)$ is $o(n^{1/4})$ , then $\langle\Psi|H|\Psi\rangle/|\Psi|^{2}=o(\sqrt{n})$ . Remark: colloquially one may say that the condition is that $\sum_{i}|a_{i}|/|\Psi|$ is not exponentially large in $n^{1/4}$ .

Further, if the Gaussians are orthogonal to each other, and if the number of Gaussian wavefunctions in the sum is some $k$ with $\log(k)=o(n^{1/4})$ , then $\langle\Psi|H|\Psi\rangle/|\Psi|^{2}=o(\sqrt{n})$ .

Proof.

For any $c>0$ , the norm of the projection of $\psi_{i}$ onto the eigenspace with eigenvalue $\geq c\sqrt{n}$ is exponentially small in $n^{1/4}$ . Hence, by a triangle inequality, the norm of the projection of $\Psi$ onto the given eigenspace is bounded by $\sum_{i}|a_{i}|$ times something exponentially small in $n^{1/4}$ .

To show the second claim, if the Gaussian wavefunctions are orthogonal, and there are $k$ wavefunctions in the sum, then $|\Psi|/\sum_{i}|a_{i}|\geq 1/\sqrt{k}.$ ∎
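The inequality used for the orthogonal case follows from the Cauchy-Schwarz inequality, since $|\Psi|^{2}=\sum_{i}|a_{i}|^{2}$ for orthonormal $\psi_{i}$. A minimal numerical illustration, assuming NumPy and using random orthonormal vectors as stand-ins for orthogonal Gaussian wavefunctions (the dimensions are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
dim, k = 32, 10

# Columns of Q are orthonormal stand-ins for the wavefunctions psi_i.
Q, _ = np.linalg.qr(rng.normal(size=(dim, k)) + 1j * rng.normal(size=(dim, k)))
a = rng.normal(size=k) + 1j * rng.normal(size=k)

Psi = Q @ a                                    # Psi = sum_i a_i psi_i
ratio = np.linalg.norm(Psi) / np.sum(np.abs(a))
lower_bound = 1.0 / np.sqrt(k)
```

Here `ratio` is $|\Psi|/\sum_{i}|a_{i}|$ and is never smaller than `lower_bound`.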

The above corollary needs this assumption on the norm $|\Psi|/\sum_{i}|a_{i}|$ or on orthogonality of the wavefunctions. We conjecture that this assumption is not necessary.

Conjecture 1.

Let $H$ be drawn from the SYK distribution. Given any sum of polynomially many Gaussian wavefunctions, the expectation value of $H$ in the resulting state is $o(\sqrt{n})$.

Returning to what we can prove, Corollary 2 has the following immediate corollary:

Corollary 3.

With high probability, using $o(n^{1/4})/\log(n)$ steps of the Lanczos algorithm or the power method, starting from a Gaussian wavefunction, produces a state whose energy for the SYK Hamiltonian is $o(\sqrt{n})$.

Proof.

A Gaussian wavefunction is the ground state of a quadratic Majorana Hamiltonian which naturally defines an orthonormal basis of states. Given such a quadratic Hamiltonian $H_{\rm quad}$ , one can pick a new basis of Majorana operators that we write as ${\tilde{\gamma}}_{1},\ldots,{\tilde{\gamma}}_{n}$ such that in this basis

[TABLE]

These operators also obey the canonical anti-commutation relations:

$$\{{\tilde{\gamma}}_{i},{\tilde{\gamma}}_{j}\}=2\delta_{i,j}.$$

Then, this defines a natural orthonormal basis of states, where each such state is an eigenstate of all the operators $i{\tilde{\gamma}}_{2j-1}{\tilde{\gamma}}_{2j}$ and hence is a Gaussian wavefunction. Each such operator has eigenvalues $\pm 1$ and there are $2^{n/2}$ such states.

We say a state has $m$ excitations if there are $m$ such operators with eigenvalue $+1$ and the others all have eigenvalue $-1$ . Starting with a Gaussian wavefunction, and applying $k$ steps of the Lanczos algorithm, one can describe the resulting state as a sum of states with up to $4k$ “excitations”. For $k=o(n^{1/4})/\log(n)$ , there are $2^{o(n^{1/4})}$ such states, and so the result follows from Corollary 2. ∎
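The counting step can be made explicit. The sketch below uses plain Python with a hypothetical function name; it counts basis states with at most $4k$ excitations among the $n/2$ modes and compares the result with the elementary bound $(n/2+1)^{4k}$, which gives $2^{O(k\log n)}$ states.

```python
import math


def num_states_up_to(n, k):
    """Number of basis states with at most 4k excitations among n/2 modes."""
    modes = n // 2
    return sum(math.comb(modes, m) for m in range(min(4 * k, modes) + 1))


n, k = 1000, 2
count = num_states_up_to(n, k)
log2_count = math.log2(count)
log2_bound = 4 * k * math.log2(n // 2 + 1)     # from sum_{m<=j} C(N, m) <= (N+1)^j
```

For $k=o(n^{1/4})/\log(n)$ the exponent $4k\log_{2}(n/2+1)$ is $o(n^{1/4})$, which is the count used in the proof.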

Note, for only logarithmically many steps of the Lanczos algorithm, the natural basis of states used in the above proof means that one can efficiently classically apply the Lanczos algorithm. However, we do not know any way to efficiently compute a super-logarithmic number of Lanczos steps.

Bibliography

1. Hastings and O’Donnell (2022): M. B. Hastings and R. O’Donnell, in Proceedings of the 54th Annual ACM SIGACT Symposium on Theory of Computing (2022), pp. 776–789.
2. Hastings (2022): M. B. Hastings, arXiv preprint arXiv:2205.12325 (2022).
3. Bonner and Fisher (1964): J. C. Bonner and M. E. Fisher, Physical Review 135, A640 (1964).
4. Schollwöck (2011): U. Schollwöck, Annals of Physics 326, 96 (2011).
5. O’Donnell (2017): R. O’Donnell, in Proceedings of the 8th annual Innovations in Theoretical Computer Science Conference (ITCS) (2017).
6. Goemans and Williamson (1995): M. Goemans and D. Williamson, Journal of the ACM 42, 1115 (1995).
7. Charikar and Wirth (2004): M. Charikar and A. Wirth, in Proceedings of the 45th annual Symposium on Foundations of Computer Science (FOCS) (IEEE, 2004), pp. 54–60.
8. Coleman (1963): A. J. Coleman, Reviews of Modern Physics 35, 668 (1963).
