Enhancing the efficiency of quantum annealing via reinforcement: A path-integral Monte Carlo simulation of the quantum reinforcement algorithm

A. Ramezanpour

arXiv:1812.02569·cond-mat.dis-nn·December 26, 2018


TL;DR

This paper demonstrates through path-integral Monte Carlo simulations that quantum reinforcement can significantly improve the success probability of quantum annealing in solving hard combinatorial problems.

Contribution

It introduces a local quantum reinforcement algorithm and shows its potential to enhance quantum annealing efficiency for constraint satisfaction problems.

Findings

1. Quantum reinforcement increases the success probability of quantum annealing.
2. The local reinforcement algorithm performs well on XORSAT problems.
3. Results suggest potential for larger problem sizes and classical optimization applications.


Equations

$$E(\boldsymbol{\sigma})=\sum_{a=1}^{M}\Big(1-J_{a}\prod_{i\in\partial a}\sigma_{i}\Big).$$

$$H=\sum_{\boldsymbol{\sigma}}E(\boldsymbol{\sigma})|\boldsymbol{\sigma}\rangle\langle\boldsymbol{\sigma}|-\sum_{\boldsymbol{\sigma}}\sum_{i=1}^{N}\Gamma\Big(|\boldsymbol{\sigma}^{-i}\rangle\langle\boldsymbol{\sigma}|+|\boldsymbol{\sigma}\rangle\langle\boldsymbol{\sigma}^{-i}|\Big).$$

$$H=\sum_{a}\Big(1-J_{a}\prod_{i\in\partial a}\sigma_{i}^{z}\Big)-\sum_{i}\Gamma\sigma_{i}^{x}.$$

$$H_{r}(t)\equiv-r(t)\sum_{\boldsymbol{\sigma}}|\psi(\boldsymbol{\sigma};t)|^{2}|\boldsymbol{\sigma}\rangle\langle\boldsymbol{\sigma}|.$$

$$H_{r}^{local}(t)\equiv-r(t)\sum_{\boldsymbol{\sigma}}\sum_{i}K_{i}^{R}\sigma_{i}|\boldsymbol{\sigma}\rangle\langle\boldsymbol{\sigma}|.$$

$$H_{r}^{local}(t)\equiv-r(t)\sum_{\boldsymbol{\sigma}}\Big(\sum_{i}K_{i}^{R}\sigma_{i}+\sum_{a}K_{a}^{R}\prod_{i\in\partial a}\sigma_{i}\Big)|\boldsymbol{\sigma}\rangle\langle\boldsymbol{\sigma}|.$$

$$Z_{QR}=\sum_{\vec{\sigma}_{1},\dots,\vec{\sigma}_{N}}\exp\Big(-\frac{\beta}{N_{s}}\sum_{\alpha=1}^{N_{s}}\Big[E(\boldsymbol{\sigma}(\alpha))-r(t)\sum_{i=1}^{N}K_{i}^{R}\sigma_{i}(\alpha)\Big]\Big)\times\prod_{\alpha=1}^{N_{s}}\langle\boldsymbol{\sigma}(\alpha)|e^{\frac{\beta}{N_{s}}\Gamma\sum_{i=1}^{N}\sigma_{i}^{x}(\alpha)}|\boldsymbol{\sigma}(\alpha+1)\rangle .$$

$$Z_{QR}=\sum_{\vec{\sigma}_{1},\dots,\vec{\sigma}_{N}}\exp\Big(\tau\sum_{\alpha=1}^{N_{s}}\Big[\sum_{a}J_{a}\prod_{i\in\partial a}\sigma_{i}(\alpha)+r(t)\sum_{i}K_{i}^{R}\sigma_{i}(\alpha)\Big]\Big)\times\prod_{i}\prod_{\alpha}\Big(\cosh(\tau\Gamma)\delta_{\sigma_{i}(\alpha+1),\sigma_{i}(\alpha)}+\sinh(\tau\Gamma)\delta_{\sigma_{i}(\alpha+1),-\sigma_{i}(\alpha)}\Big),$$

$$P_{QR}(\vec{\sigma}_{i})\propto\exp\Big(\tau\sum_{\alpha=1}^{N_{s}}\Big[\sum_{a\in\partial i}J_{a}\prod_{j\in\partial a}\sigma_{j}(\alpha)+r(t)K_{i}^{R}\sigma_{i}(\alpha)\Big]\Big)\times\prod_{\alpha}\Big(\cosh(\tau\Gamma)\delta_{\sigma_{i}(\alpha+1),\sigma_{i}(\alpha)}+\sinh(\tau\Gamma)\delta_{\sigma_{i}(\alpha+1),-\sigma_{i}(\alpha)}\Big).$$

$$E(\boldsymbol{\sigma})=-\sum_{i=1}^{N}h_{i}\sigma_{i}-\sum_{a=1}^{M}h_{a}\prod_{i\in\partial a}\sigma_{i}.$$

$$Z=\sum_{\boldsymbol{\sigma}}e^{-E(\boldsymbol{\sigma})},$$

$m_{i}$

$m_{a}$

$\mu_{i\to a}(\sigma_{i})$

$\mu_{a\to i}(\sigma_{i})$

$\mu_{i\to a}^{t+1}(\sigma_{i})$

$\mu_{a\to i}^{t+1}(\sigma_{i})$

$$\mu_{i}^{t+1}(\sigma_{i})=\frac{1}{z_{i}}e^{h_{i}\sigma_{i}+r(t)\mu_{i}^{t}(\sigma_{i})}\prod_{a\in\partial i}\mu_{a\to i}^{t}(\sigma_{i}).$$


Full text

Enhancing the efficiency of quantum annealing via reinforcement: A path-integral Monte Carlo simulation of the quantum reinforcement algorithm

A. Ramezanpour


Physics Department, College of Sciences, Shiraz University, Shiraz 71454, Iran

Leiden Academic Centre for Drug Research, Faculty of Mathematics and Natural Sciences, Leiden University, Leiden, The Netherlands

Abstract

The standard quantum annealing algorithm tries to approach the ground state of a classical system by slowly decreasing the hopping rates of a quantum random walk in the configuration space of the problem, where the on-site energies are provided by the classical energy function. In a quantum reinforcement algorithm, the annealing works instead by gradually increasing the strength of the on-site energies according to the probability of finding the walker on each site of the configuration space. Here, by using path-integral Monte Carlo simulations of the quantum algorithms, we show that annealing via reinforcement can significantly enhance the success probability of the quantum walker. More precisely, we implement a local version of the quantum reinforcement algorithm, where the system wave function is replaced by an approximate wave function using the local expectation values of the system. We use this algorithm to find solutions to a prototypical constraint satisfaction problem (XORSAT) close to the satisfiability to unsatisfiability phase transition. The study is limited to small problem sizes (a few hundred variables); nevertheless, the numerical results suggest that quantum reinforcement may provide a useful strategy to deal with other computationally hard problems and larger problem sizes, even as a classical optimization algorithm.

I Introduction

Finding a solution to a computationally hard constraint satisfaction problem becomes more difficult for a typical instance of the problem as one approaches the phase transition from a satisfiable (SAT) to unsatisfiable (UNSAT) phase sat-nature-1999 ; sat-science-2002 ; in the SAT phase, with high probability there is a solution to the problem satisfying all the constraints, whereas in the UNSAT phase there is no solution to the problem with high probability. A reinforcement algorithm tries to find a solution to the problem by utilizing the information that is obtained from the system at each step of the algorithm. This provides a class of powerful classical reinforcement algorithms to deal with such problems SB-book-1998 ; BZ-prl-2006 . In this paper, we show that adding reinforcement to the standard quantum annealing algorithm is helpful in the study of a prototypical constraint satisfaction problem. More precisely, we observe that a path-integral Monte Carlo simulation of the quantum reinforcement algorithm gives much higher success probabilities than the simulated quantum annealing algorithm.

The presence of strong and long-range correlations between the problem variables, due to a spin glass or a freezing phase transition gibbs-pnas-2007 ; semerjian-jstat-2008 ; gibbs-pre-2008 ; clustering-prl-2005 ; clustering-jstat-2008 , is responsible for the computational complexity of a constraint satisfaction problem close to the SAT-UNSAT transition. It means that to obtain efficient approximation algorithms, we should be able to extract efficiently the global information that is relevant to the problem, from the system of interacting variables. For instance, the Gaussian elimination algorithm provides an efficient way of solving a set of linear equations over binary variables, which is known as the XOR-satisfiability (XORSAT) problem xor-prl-2001 ; xor-pre-2001 . See also Ref. qxor-prl-2009 for a quantum algorithm for the XORSAT problem. Nevertheless, it is very difficult to write the Gaussian elimination algorithm in a form that is amenable to local message-passing algorithms MM-book-2009 . Another example in this direction is provided by Ref. entropy-jstat-2016 , where the entropy, or number of solutions in a region around a point in the configuration space, is estimated at each step to guide the search algorithm. Here, the entropy is playing the role of the global information that is used by the algorithm. The main problem is that obtaining good estimations of the relevant global quantities and writing this computation in a locally manageable way is usually difficult. There are, however, special examples, where the global constraints can be treated exactly and efficiently via message passing along a spanning tree of the interaction graph globalgame-2011 ; sign-prb-2012 .

In a previous study QR-pra-2017 , we introduced a quantum reinforcement algorithm, which uses the global information contained in the wave function of the system in a quantum annealing algorithm. More precisely, we considered a continuous-time quantum random walk in the configuration space of the classical optimization problem ALZ-pra-1993 ; K-cp-2003 ; A-jqi-2003 . At the beginning of the algorithm, the on-site energies at each point of the configuration space are given by the energy function of the classical problem. These on-site energies are gradually modified according to the wave function of the evolving quantum system to localize preferentially the wave function on a solution to the classical problem. Using exact numerical simulations of small systems, we showed that such quantum feedback increases the minimal energy gap of the quantum system in a quantum annealing algorithm, and therefore could be useful in the study of hard optimization problems F-sci-2001 ; NC-book-2002 .

Notice that the quantum reinforcement algorithm results in a nonlinear Schrödinger equation, and it is known that one can efficiently solve a computationally hard problem with nonlinear quantum mechanics lloyd-prl-1998 . In addition, we know that the standard quantum annealing algorithm is frustrated by the exponentially small energy gaps of the system in the annealing process AHJ-pnas-2010 ; qxor-prl-2010 ; qxor-pre-2011 ; qxor-pra-2012 . There are remedies to this problem that work by adding auxiliary interactions to the Hamiltonian to suppress the spoiling quantum transitions in the annealing process B-jpa-2009 ; C-prl-2013 . These auxiliary interactions are highly nonlocal, but good approximation algorithms can still be obtained by replacing the nonlocal Hamiltonians with effective local Hamiltonians localCA-pra-2014 ; localCA-pnas-2017 .

In this paper, we show that the local versions of the quantum reinforcement algorithm also work for larger problem sizes. To this end, we resort to quantum Monte Carlo simulations of the algorithm, using the path-integral representation of the quantum system at equilibrium for sufficiently low temperatures tosatti-prb-2002 ; pathMC-prb-2008 ; QC-prep-2013 . We apply the algorithm to the XORSAT problem close to the SAT-UNSAT phase transition, where the problem is expected to be hard for a local algorithm. We compare the performance of the quantum reinforcement algorithm with that of the standard quantum annealing algorithm for problems with a few hundred variables. We observe considerable improvements in the success probability when reinforcement is added to the quantum annealing algorithm. Note that our previous study QR-pra-2017 was limited to small problem sizes and exact numerical simulations of a fully connected spin-glass model. Moreover, in that study we could not observe the superior performance of the quantum reinforcement algorithm in larger systems, compared to the standard quantum annealing.

The paper is organized as follows. In Sec. II we define the problem in more detail. Then we briefly review the quantum reinforcement algorithm and its local approximations in Sec. III. The path-integral Monte Carlo simulation of the algorithms is described in Sec. IV. Section V is devoted to the presentation of the numerical results, and finally Sec. VI gives the conclusions.

II Problem statement and definitions

We consider the classical optimization problem of minimizing an energy function $E(\boldsymbol{\sigma})$ of $N$ binary spins $\sigma_{i}=\pm 1$ . As the benchmark, we take the random regular XORSAT problem MM-book-2009 , with

$$E(\boldsymbol{\sigma})=\sum_{a=1}^{M}\Big(1-J_{a}\prod_{i\in\partial a}\sigma_{i}\Big).$$

Here, $M$ is the number of $K$ -spin interactions and $J_{a}=\pm 1$ with equal probability. The subset of spins involved in interaction $a$ are denoted by $\partial a$ . The $M$ interactions are selected randomly and uniformly from the set of all possible $K$ -spin interactions. The interaction graph is regular in the sense that each interaction term involves exactly $K$ spins, and each spin is associated with exactly $L$ interactions.
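As an illustration of the definitions above, the following sketch builds a random $(K,L)$-regular instance and evaluates $E(\boldsymbol{\sigma})$. The stub-pairing construction with rejection, the function names, and the `seed` and `max_tries` parameters are assumptions made for this example; the text only states that the interactions are drawn randomly and uniformly.

```python
import numpy as np

def random_regular_xorsat(n_spins, k=4, l=3, seed=0, max_tries=1000):
    """Return (clauses, couplings) for a random (K, L)-regular instance.

    clauses: int array of shape (M, K) listing the spins in each interaction.
    couplings: array of M values J_a = +1 or -1, drawn with equal probability.
    Assumption: instances are drawn by shuffling spin "stubs" and rejecting
    pairings in which a spin appears twice in the same interaction.
    """
    if (n_spins * l) % k != 0:
        raise ValueError("N * L must be divisible by K")
    rng = np.random.default_rng(seed)
    n_clauses = n_spins * l // k
    stubs = np.repeat(np.arange(n_spins), l)  # each spin appears L times
    for _ in range(max_tries):
        clauses = rng.permutation(stubs).reshape(n_clauses, k)
        if all(len(set(row)) == k for row in clauses):
            couplings = rng.choice([-1, 1], size=n_clauses)
            return clauses, couplings
    raise RuntimeError("no valid pairing found; increase max_tries")

def energy(sigma, clauses, couplings):
    """E(sigma) = sum_a (1 - J_a * prod_{i in a} sigma_i)."""
    return int(np.sum(1 - couplings * np.prod(sigma[clauses], axis=1)))
```

Each violated interaction contributes 2 to the energy, so a solution is exactly a configuration with `energy(...) == 0`.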

A solution to this problem is a spin configuration with energy zero, where $J_{a}\prod_{i\in\partial a}\sigma_{i}=1$ for all the $a$ . The problem is called satisfiable if there is at least one solution to the problem. It is well known that the problem is satisfiable (SAT) with high probability for $L<K$ , and unsatisfiable (UNSAT) for $L>K$ MM-book-2009 . Moreover, the problem is computationally easy and belongs to the complexity class $P$ ; this means that we can decide if the problem is SAT or UNSAT in a computation time that grows polynomially with the size of the problem ( $N$ ). In addition, as long as the problem is satisfiable, a solution can easily be obtained by the Gaussian elimination algorithm. To be specific, we consider random regular XORSAT problems with parameters $(K=4,L=3)$ . We know that for these values of $K$ and $L$ the solution space is clustered and it is computationally difficult to find a solution by a local algorithm such as the Markov Chain Monte Carlo xor-prl-2001 ; xor-pre-2001 ; MM-book-2009 . It is also known that we need an exponentially large computation time to find the ground state of the XORSAT problem by the standard quantum annealing algorithm qxor-prl-2010 ; qxor-pre-2011 ; qxor-pra-2012 .
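The Gaussian elimination mentioned above can be sketched as follows. With $\sigma_{i}=(-1)^{x_{i}}$ and $J_{a}=(-1)^{b_{a}}$, each constraint becomes the linear equation $\sum_{i\in\partial a}x_{i}=b_{a}$ modulo 2. The function name and the choice of setting all free variables to zero are assumptions of this example.

```python
import numpy as np

def solve_xorsat(clauses, couplings, n_spins):
    """Return a zero-energy spin configuration, or None if unsatisfiable."""
    n_clauses = len(clauses)
    # Augmented matrix [A | b] over GF(2); the last column holds b_a.
    aug = np.zeros((n_clauses, n_spins + 1), dtype=np.uint8)
    for a, members in enumerate(clauses):
        for i in members:
            aug[a, i] ^= 1
        aug[a, -1] = 1 if couplings[a] == -1 else 0
    pivots = []
    row = 0
    for col in range(n_spins):
        if row == n_clauses:
            break
        hits = np.nonzero(aug[row:, col])[0]
        if hits.size == 0:
            continue  # no pivot in this column: x_col is a free variable
        swap = row + int(hits[0])
        aug[[row, swap]] = aug[[swap, row]]
        # Clear this column in every other row (reduced row echelon form).
        others = np.nonzero(aug[:, col])[0]
        others = others[others != row]
        aug[others] ^= aug[row]
        pivots.append(col)
        row += 1
    # Remaining rows read 0 = b; any b = 1 means the system is inconsistent.
    if np.any(aug[row:, -1]):
        return None
    x = np.zeros(n_spins, dtype=int)  # free variables are set to 0
    for r, col in enumerate(pivots):
        x[col] = aug[r, -1]
    return 1 - 2 * x
```

The cost is polynomial in $N$, in contrast to the local search and annealing algorithms discussed in the text, which only use local information.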

We shall use a continuous-time quantum random walk to explore the space of spin configurations $\boldsymbol{\sigma}=\{\sigma_{1},\dots,\sigma_{N}\}$ . The space is a hypercube of $2^{N}$ sites corresponding to the total number of spin configurations. The Hamiltonian for a particle walking in the energy landscape of the classical optimization problem is given by

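$$H=\sum_{\boldsymbol{\sigma}}E(\boldsymbol{\sigma})|\boldsymbol{\sigma}\rangle\langle\boldsymbol{\sigma}|-\sum_{\boldsymbol{\sigma}}\sum_{i=1}^{N}\Gamma\Big(|\boldsymbol{\sigma}^{-i}\rangle\langle\boldsymbol{\sigma}|+|\boldsymbol{\sigma}\rangle\langle\boldsymbol{\sigma}^{-i}|\Big).$$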

The parameter $\Gamma$ determines the strength of tunneling from $|\boldsymbol{\sigma}\rangle$ to a neighboring state $|\boldsymbol{\sigma}^{-i}\rangle$ . Here, $|\boldsymbol{\sigma}^{-i}\rangle$ denotes the spin state which is different from $|\boldsymbol{\sigma}\rangle$ only at site $i$ . In terms of the quantum spin variables (Pauli matrices), the above Hamiltonian reads as follows,

$$H=\sum_{a}\Big(1-J_{a}\prod_{i\in\partial a}\sigma_{i}^{z}\Big)-\sum_{i}\Gamma\sigma_{i}^{x}.$$

The basis states $|\boldsymbol{\sigma}\rangle$ are the $N$ -spin states with definite $\sigma_{i}^{z}$ values, that is, $\sigma_{i}^{z}|\boldsymbol{\sigma}\rangle=\sigma_{i}|\boldsymbol{\sigma}\rangle$ .
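For very small $N$, the Pauli-matrix form of the Hamiltonian can be built explicitly as a $2^{N}\times 2^{N}$ matrix. The sketch below does this with Kronecker products; the helper names are hypothetical and the dense representation is only practical for a handful of spins.

```python
from functools import reduce

import numpy as np

PAULI_X = np.array([[0.0, 1.0], [1.0, 0.0]])
PAULI_Z = np.array([[1.0, 0.0], [0.0, -1.0]])
IDENTITY = np.eye(2)

def site_operator(op, site, n_spins):
    """Tensor product acting with `op` on `site` and identity elsewhere."""
    factors = [op if j == site else IDENTITY for j in range(n_spins)]
    return reduce(np.kron, factors)

def xorsat_hamiltonian(clauses, couplings, n_spins, gamma):
    """Dense matrix of sum_a (1 - J_a prod_{i in a} sz_i) - gamma sum_i sx_i."""
    dim = 2 ** n_spins
    ham = np.zeros((dim, dim))
    for members, j_a in zip(clauses, couplings):
        z_ops = [site_operator(PAULI_Z, i, n_spins) for i in members]
        ham += np.eye(dim) - j_a * reduce(np.matmul, z_ops)
    for i in range(n_spins):
        ham -= gamma * site_operator(PAULI_X, i, n_spins)
    return ham
```

With `gamma = 0` the matrix is diagonal and its entries are the classical energies $E(\boldsymbol{\sigma})$; with no interactions the ground-state energy is $-N\Gamma$.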

Starting from an initial state $|\psi(0)\rangle$ , the time evolution of the isolated system is governed by the Schrödinger equation $\hat{i}\frac{d}{dt}|\psi(t)\rangle=H|\psi(t)\rangle$ with $\hbar=1$ . In the following, we shall assume that the system is always in thermal equilibrium with a thermal bath at a sufficiently small temperature. At equilibrium, the physical properties of the system are obtained from the quantum partition function $Z=\mathrm{Tr}e^{-\beta H}$ , for a large inverse temperature $\beta$ .

III Quantum Reinforcement Algorithm

In this section we briefly review the quantum reinforcement algorithm introduced in Ref. QR-pra-2017 . The goal is to find a solution to the classical optimization problem by following the time evolution of the quantum system. A quantum annealing (QA) algorithm F-sci-2001 starts from the ground state of $H_{x}\equiv-\sum_{i}\Gamma\sigma_{i}^{x}$ and changes slowly the Hamiltonian to $H_{c}\equiv\sum_{a}(1-J_{a}\prod_{i\in\partial a}\sigma_{i}^{z})$ . The adiabatic theorem then ensures that in the absence of level crossing, the system follows the instantaneous ground state of the time dependent Hamiltonian $H_{QA}(t)=s(t)H_{c}+[1-s(t)]H_{x}$ . The annealing parameter $s(t)$ changes slowly from zero at $t=0$ to one at $t=t_{max}$ . In the following, we shall assume that $s(t)=t/t_{max}$ .

In a quantum reinforcement (QR) algorithm, we add a reinforcement term to the Hamiltonian which favors the spin states of higher probability QR-pra-2017 . More precisely, the Hamiltonian is $H_{QR}(t)=H_{c}+H_{x}+H_{r}(t)$ , where the reinforcement term reads as follows,

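$$H_{r}(t)\equiv-r(t)\sum_{\boldsymbol{\sigma}}|\psi(\boldsymbol{\sigma};t)|^{2}|\boldsymbol{\sigma}\rangle\langle\boldsymbol{\sigma}|.$$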

Here, $\psi(\boldsymbol{\sigma};t)$ refers to the wave function of the quantum system. The reinforcement parameter $r(t)$ is zero at the beginning and is expected to grow slowly with time.

III.1 Local approximations of the algorithm

To obtain a local version of the QR algorithm, we first replace the $|\psi(\boldsymbol{\sigma};t)|^{2}$ with $\log|\psi(\boldsymbol{\sigma};t)|^{2}$ , which is an increasing function of the probability distribution. On the other hand, we can always write $\psi(\boldsymbol{\sigma};t)=\exp(\sum_{i}K_{i}\sigma_{i}/2+\sum_{i<j}K_{ij}\sigma_{i}\sigma_{j}/2+\cdots)/\sqrt{Z}$ , taking into account all the possible multispin interactions with complex couplings $K_{i}=K_{i}^{R}+\hat{i}K_{i}^{I},K_{ij}=K_{ij}^{R}+\hat{i}K_{ij}^{I},\dots$ . Consequently, $|\psi(\boldsymbol{\sigma};t)|^{2}=\exp(\sum_{i}K_{i}^{R}\sigma_{i}+\sum_{i<j}K_{ij}^{R}\sigma_{i}\sigma_{j}+\cdots)/Z$ and $Z$ is the normalization constant. The coupling parameters $K_{i}^{R},K_{ij}^{R},\dots$ can in principle be determined from the expectation values $\langle\sigma_{i}^{z}\rangle,\langle\sigma_{i}^{z}\sigma_{j}^{z}\rangle,\dots$ inverse-advanc-2017 . A one-local quantum reinforcement ( $1$ -lQR) algorithm then is obtained by approximating the wave function with a product state,

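$$H_{r}^{local}(t)\equiv-r(t)\sum_{\boldsymbol{\sigma}}\sum_{i}K_{i}^{R}\sigma_{i}|\boldsymbol{\sigma}\rangle\langle\boldsymbol{\sigma}|.$$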

The reinforcement fields $K_{i}^{R}$ depend on the average spin values $m_{i}^{z}=\sum_{\boldsymbol{\sigma}}\sigma_{i}|\psi(\boldsymbol{\sigma};t)|^{2}$ through $K_{i}^{R}=\frac{1}{2}\log((1+m_{i}^{z})/(1-m_{i}^{z}))$ . More accurate approximations of the wave function and the quantum reinforcement algorithm can be obtained by considering the two-spin interactions in the expansion. This gives a two-local quantum reinforcement ( $2$ -lQR) algorithm. Similarly, one obtains the higher-order approximations. The interaction pattern of the random regular XORSAT problem, however, suggests a $K$ -local reinforced Hamiltonian, where

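$$H_{r}^{local}(t)\equiv-r(t)\sum_{\boldsymbol{\sigma}}\Big(\sum_{i}K_{i}^{R}\sigma_{i}+\sum_{a}K_{a}^{R}\prod_{i\in\partial a}\sigma_{i}\Big)|\boldsymbol{\sigma}\rangle\langle\boldsymbol{\sigma}|.$$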

In the following, we shall focus mainly on the $1$ -lQR algorithm.
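For the $1$ -lQR case, the map from magnetizations to reinforcement fields is $K_{i}^{R}=\frac{1}{2}\log((1+m_{i}^{z})/(1-m_{i}^{z}))=\mathrm{arctanh}(m_{i}^{z})$ , so that a product state with fields $K_{i}^{R}$ reproduces the given averages through $m_{i}^{z}=\tanh K_{i}^{R}$ . A minimal sketch follows; the clipping parameter `eps`, which keeps the fields finite when a spin is fully polarized, is an assumption of this example and is not specified in the text.

```python
import numpy as np

def reinforcement_fields(m, eps=1e-6):
    """K_i = 0.5 * log((1 + m_i) / (1 - m_i)), with m clipped away from +/-1."""
    m = np.clip(np.asarray(m, dtype=float), -1.0 + eps, 1.0 - eps)
    return np.arctanh(m)  # identical to 0.5 * np.log((1 + m) / (1 - m))
```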

IV The simulated quantum reinforcement algorithm

Let us consider the one-local QR Hamiltonian $H_{QR}(t)=H_{c}+H_{x}+H_{r}^{local}(t)$ with $H_{r}^{local}(t)=-r(t)\sum_{i}K_{i}^{R}\sigma_{i}^{z}$ . In the following, we ignore the constant term in the energy function of the classical problem. Using the Suzuki-Trotter decomposition for the partition function $Z_{QR}=\mathrm{Tr}\exp(-\beta H_{QR})$ , we get

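$$Z_{QR}=\sum_{\vec{\sigma}_{1},\dots,\vec{\sigma}_{N}}\exp\Big(-\frac{\beta}{N_{s}}\sum_{\alpha=1}^{N_{s}}\Big[E(\boldsymbol{\sigma}(\alpha))-r(t)\sum_{i=1}^{N}K_{i}^{R}\sigma_{i}(\alpha)\Big]\Big)\times\prod_{\alpha=1}^{N_{s}}\langle\boldsymbol{\sigma}(\alpha)|e^{\frac{\beta}{N_{s}}\Gamma\sum_{i=1}^{N}\sigma_{i}^{x}(\alpha)}|\boldsymbol{\sigma}(\alpha+1)\rangle .$$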

Here, $\alpha=1,\dots,N_{s}$ labels the imaginary times, and $N_{s}$ is the number of imaginary-time slices. Note that we are using the periodic boundary condition, i.e., $\boldsymbol{\sigma}(N_{s}+1)=\boldsymbol{\sigma}(1)$ . The bold symbols $\boldsymbol{\sigma}(\alpha)$ denote the spin values $\sigma_{i}(\alpha)$ for a given imaginary time $\alpha$ . On the other hand, the vector $\vec{\sigma}_{i}$ collects the spin values at site $i$ for different imaginary times.
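The transverse-field factor between neighboring time slices factorizes over sites, and each single-site matrix element has a closed form: with $\tau=\beta/N_{s}$ , $\langle\sigma|e^{\tau\Gamma\sigma^{x}}|\sigma^{\prime}\rangle$ equals $\cosh(\tau\Gamma)$ for $\sigma=\sigma^{\prime}$ and $\sinh(\tau\Gamma)$ otherwise. The short check below compares this closed form with a numerical matrix exponential; the function names are hypothetical.

```python
import numpy as np

def transfer_matrix(tau, gamma):
    """<s| exp(tau * gamma * sigma_x) |s'> in the sigma_z basis."""
    c, s = np.cosh(tau * gamma), np.sinh(tau * gamma)
    return np.array([[c, s], [s, c]])

def expm_symmetric(matrix):
    """Matrix exponential of a real symmetric matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(matrix)
    return (vecs * np.exp(vals)) @ vecs.T
```

Both entries are positive for $\Gamma>0$ , which is what allows the replicated system to be sampled as a classical probability distribution.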

Specifically, the partition function for our problem can be written as

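$$Z_{QR}=\sum_{\vec{\sigma}_{1},\dots,\vec{\sigma}_{N}}\exp\Big(\tau\sum_{\alpha=1}^{N_{s}}\Big[\sum_{a}J_{a}\prod_{i\in\partial a}\sigma_{i}(\alpha)+r(t)\sum_{i}K_{i}^{R}\sigma_{i}(\alpha)\Big]\Big)\times\prod_{i}\prod_{\alpha}\Big(\cosh(\tau\Gamma)\delta_{\sigma_{i}(\alpha+1),\sigma_{i}(\alpha)}+\sinh(\tau\Gamma)\delta_{\sigma_{i}(\alpha+1),-\sigma_{i}(\alpha)}\Big),$$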

where we defined $\tau\equiv\beta/N_{s}$ . This defines a positive probability measure for the spin configuration (for positive $\Gamma$ ) which can be used in a standard Monte Carlo (MC) simulation. In each step of the Monte Carlo, we replace the imaginary-time spin values $\vec{\sigma}_{i}$ with $\vec{\sigma}_{i}^{\prime}$ , which is sampled from the following probability distribution,

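$$P_{QR}(\vec{\sigma}_{i})\propto\exp\Big(\tau\sum_{\alpha=1}^{N_{s}}\Big[\sum_{a\in\partial i}J_{a}\prod_{j\in\partial a}\sigma_{j}(\alpha)+r(t)K_{i}^{R}\sigma_{i}(\alpha)\Big]\Big)\times\prod_{\alpha}\Big(\cosh(\tau\Gamma)\delta_{\sigma_{i}(\alpha+1),\sigma_{i}(\alpha)}+\sinh(\tau\Gamma)\delta_{\sigma_{i}(\alpha+1),-\sigma_{i}(\alpha)}\Big).$$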

This is a one-dimensional problem and the new configuration can easily be obtained by the transfer-matrix method QC-prep-2013 . Here, we use the belief propagation (BP) algorithm for this task. The BP algorithm is explained in more detail in the Appendix. More precisely, the new spin values $\vec{\sigma}_{i}^{\prime}$ are obtained one by one with a decimation algorithm; at each step the value $\sigma_{i}^{\prime}(\alpha)$ is sampled from the marginal probability distribution $\mu_{\alpha}(\sigma)$ , which is computed by the BP algorithm conditioned on the values of the previously decimated spins. In each Monte Carlo sweep, the $N$ spin vectors $\vec{\sigma}_{i}$ are chosen in a random sequential way and are updated according to the above procedure.
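The single-site update can be sketched with the transfer-matrix route mentioned above, rather than the BP decimation used in the paper. The sketch draws one exact sample of $\vec{\sigma}_{i}$ on the imaginary-time ring, given the local fields `fields[a]` $=\tau[\sum_{a\in\partial i}J_{a}\prod_{j\in\partial a\setminus i}\sigma_{j}(\alpha)+r(t)K_{i}^{R}]$ produced by the other spins. The function name, the argument layout, and the rescaling of the partial products are assumptions of this example.

```python
import numpy as np

SPINS = np.array([1, -1])  # index 0 -> spin +1, index 1 -> spin -1

def sample_worldline(fields, tau_gamma, rng):
    """Sample s(1..Ns) with weight prod_a exp(fields[a] * s[a]) * T(s[a], s[a+1])
    on a ring, where T is cosh(tau_gamma) for equal spins and sinh otherwise."""
    n_slices = len(fields)
    c, s = np.cosh(tau_gamma), np.sinh(tau_gamma)
    hop = np.array([[c, s], [s, c]])
    # step[a][s, s'] = exp(fields[a] * s) * T(s, s')
    step = [np.exp(fields[a] * SPINS)[:, None] * hop for a in range(n_slices)]
    # suffix[a] is proportional to step[a] @ step[a+1] @ ... @ step[Ns-1];
    # the rescaling only avoids overflow and does not change probabilities.
    suffix = [None] * (n_slices + 1)
    suffix[n_slices] = np.eye(2)
    for a in range(n_slices - 1, -1, -1):
        prod = step[a] @ suffix[a + 1]
        suffix[a] = prod / prod.sum()
    # Closing the ring: P(s_1) is proportional to the diagonal of the full product.
    idx = np.empty(n_slices, dtype=int)
    idx[0] = rng.choice(2, p=np.diag(suffix[0]) / np.trace(suffix[0]))
    # Sample the remaining slices one by one, conditioned on the past and on s_1.
    for a in range(n_slices - 1):
        weights = step[a][idx[a], :] * suffix[a + 1][:, idx[0]]
        idx[a + 1] = rng.choice(2, p=weights / weights.sum())
    return SPINS[idx]
```

A call such as `sample_worldline(fields, tau * gamma, np.random.default_rng(0))` returns an array of $N_{s}$ values $\pm 1$ ; one Monte Carlo sweep would apply it once to every site in random order, recomputing `fields` from the current neighbors each time.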

Having a quantum Monte Carlo simulation, the simulated QR algorithm starts with a random spin configuration $\{\vec{\sigma}_{1},\dots,\vec{\sigma}_{N}\}$ , where $\sigma_{i}(\alpha)=\pm 1$ with equal probability. We set the reinforcement parameter $r(t)=0$ and couplings $K_{i}^{R}(t)=0$ , at time step $t=0$ . Then, for each time step $t=1,\dots,t_{max}$ we do the following:

1. Perform $t_{eq}$ Monte Carlo sweeps for equilibration.
2. Use the last $t_{av}$ sweeps to estimate the averages $m_{i}=\sum_{\alpha}\sigma_{i}(\alpha)/N_{s}$ .
3. Update the reinforcement couplings $K_{i}^{R}(t)=\frac{1}{2}\log((1+m_{i})/(1-m_{i}))$ .
4. Increase the reinforcement parameter $r(t)=r(t-1)+\delta r$ .
5. Compute $E(\boldsymbol{\sigma}(\alpha))$ for $\alpha=1,\dots,N_{s}$ .
6. Report the minimum energy $E_{min}(t)=\min_{\alpha}E(\boldsymbol{\sigma}(\alpha))$ and stop if $E_{min}(t)=0$ .
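The six steps above can be sketched as a driver loop. The Monte Carlo sweep and the energy function are passed in as callables, so `sweep(spins, r, k_fields)` and `energy(config)` are hypothetical interfaces chosen for this example, as is the clipping parameter `eps`; the sketch also assumes $t_{av}\leq t_{eq}$ .

```python
import numpy as np

def simulated_qr(spins, sweep, energy, t_max, t_eq, t_av, delta_r, eps=1e-6):
    """Run the 1-local QR schedule on `spins`, an (N, Ns) array of +/-1.

    sweep(spins, r, k_fields) must return the spins after one MC sweep.
    energy(config) must return the classical energy of a length-N config.
    Returns (t, solution) on success, or (None, None) after t_max steps.
    """
    n_spins, n_slices = spins.shape
    r, k_fields = 0.0, np.zeros(n_spins)
    for t in range(1, t_max + 1):
        m_sum = np.zeros(n_spins)
        for s in range(t_eq):                          # step 1: equilibrate
            spins = sweep(spins, r, k_fields)
            if s >= t_eq - t_av:                       # step 2: last t_av sweeps
                m_sum += spins.mean(axis=1)
        m = np.clip(m_sum / t_av, -1.0 + eps, 1.0 - eps)
        k_fields = np.arctanh(m)                       # step 3: K_i = atanh(m_i)
        r += delta_r                                   # step 4
        energies = [energy(spins[:, a]) for a in range(n_slices)]  # step 5
        best = int(np.argmin(energies))                # step 6
        if energies[best] == 0:
            return t, spins[:, best]
    return None, None
```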

The partition function for the $K$ -local QR Hamiltonian is obtained simply by adding the extra reinforcement term, i.e., $-r(t)\sum_{a}K_{a}^{R}\prod_{i\in\partial a}\sigma_{i}(\alpha)$ , to the energy function of the replicated system $E(\boldsymbol{\sigma}(\alpha))$ . The simulation of the $K$ -local QR algorithm is similar to the $1$ -local QR algorithm except in steps $2$ and $3$ . Here, in addition to the $m_{i}$ in step $2$ , we need also to compute the average values $m_{a}=\sum_{\alpha}\prod_{i\in\partial a}\sigma_{i}(\alpha)/N_{s}$ , and in step $3$ , we have to solve the inverse problem of computing the $K_{i}^{R}$ and $K_{a}^{R}$ from the expectation values $m_{i}$ and $m_{a}$ . In the Appendix, we describe an approximate algorithm to deal with this inverse problem inverse-advanc-2017 . The idea is to start from $K_{i}^{R}(old)=K_{a}^{R}(old)=0$ and change slightly the parameters depending on the difference in the associated expectation values, i.e. $K_{i,a}^{R}(new)=K_{i,a}^{R}(old)+\eta(m_{i,a}-m_{i,a}(old))$ , for a positive and small $\eta$ . We compute the expectation values $m_{i,a}(old)$ by the BP algorithm with the parameters $K_{i,a}^{R}(old)$ . After each step the old parameters are replaced with the new ones, and the process is repeated for $t_{inv}$ steps.

For comparison, we also simulate the standard quantum annealing algorithm with Hamiltonian $H_{QA}(t)=s(t)H_{c}+[1-s(t)]H_{x}$ . Here, we do not have the reinforcement terms in the energy function of the replicated system. Instead the energy function and $\Gamma$ are replaced with $s(t)E(\boldsymbol{\sigma}(\alpha))$ and $[1-s(t)]\Gamma$ , respectively. The algorithm is similar to but simpler than the $1$ -local QR algorithm, in that steps $2-4$ are replaced with one step which updates $s(t)=t/t_{max}$ . As before, we start from a random spin configuration. Note that at the beginning of the algorithm ( $t=0$ ) we have a system of independent spins $\vec{\sigma}_{i}$ , and in each MC sweep, we replace all spins $\vec{\sigma}_{i}$ with new ones from the equilibrium probability distribution. Therefore, the first MC sweeps are enough to equilibrate the system at the beginning of the algorithm, even for a sufficiently large inverse temperature $\beta$ .

V Numerical Results and Discussion

In this section we compare the performances of the algorithms introduced in the previous section. As the benchmark, we take the problem of minimizing the energy function of the random regular XORSAT problem with parameters $(K=4,L=3)$ . Let us start from comparing the success probability of the $1$ -local QR algorithm with that of the standard QA algorithm.

Figure 1 shows the success probability of the two algorithms for different relevant parameter values in the algorithms. The success probability $P_{success}$ here refers to the fraction of times that an algorithm provides a zero-energy spin configuration satisfying all the constraints. Each time we take an independently generated random instance of the problem, which is identified with the random structure of the interaction graph and the random values of the couplings $J_{a}$ . We run the algorithms for a sufficiently large number of problem instances $N_{samples}$ to obtain a reasonable stationary value for the success probability. The number of samples ranges from a few hundred to at most ten thousand, depending on the problem size. As expected, we observe that $P_{success}$ decreases exponentially with the problem size $N$ . The QR algorithm, however, performs much better than the QA algorithm for different parameter values. We recall that by adding the reinforcement to the Hamiltonian we are in fact increasing the minimal energy gap of the system in the annealing process QR-pra-2017 ; that is because the reinforced Hamiltonian is assigning lower energies to the more probable states.

Figure 2 displays more results from the $1$ -local QR algorithm to see how the algorithm parameters affect the success probability. Note that for $N_{s}=20$ the best performances are observed for $\beta=30$ (i.e., $\tau=1.5$ ). Moreover, the behavior of the algorithm is not very sensitive to the values of $\Gamma=1,2,3$ and $\delta r=0.001,0.002,0.005$ . In Fig. 3, we compare the efficiencies of the $1$ -local and $K$ -local QR algorithms for a larger number of imaginary-time slices and longer annealing times. We observe a small improvement in the success probability and computation time of the local QR algorithm by considering the $K$ -local interactions in the wave function. Here, the quality of the approximate inverse algorithm in the $K$ -local algorithm is crucial. The difference in the performances of the two local algorithms is expected to be more pronounced if we employ more accurate inverse algorithms. Finally, for comparison, in Fig. 4 we also report the success probability of a powerful classical optimization algorithm (reinforced BP), which is described in the Appendix. This shows that by adding a local reinforcement to the quantum annealing algorithm, one can achieve performances that are better than or comparable to those of the classical algorithm.

VI Conclusion

We employed the path-integral quantum Monte Carlo to simulate the behavior of the quantum reinforcement algorithms in optimization of a hard constraint satisfaction problem. We observed that local quantum reinforcements can significantly improve the success probability of the standard quantum annealing algorithm. The performance of the simulated quantum reinforcement algorithm can systematically be improved by considering more accurate representations of the system wave function (e.g., tensor networks tn-siam-2008 ; tn-anp-2014 ; tn-arxiv-2018 ) in the annealing process, and by utilizing more efficient approximations for estimating the wave-function parameters from the measurements.

In this paper, we assumed the quantum system is close to thermal equilibrium as the Hamiltonian changes with time. This means that in practice the equilibration time should be smaller than the time scale of changing the Hamiltonian. Moreover, we did not consider the effect of measurements, which are needed for implementing the reinforcement, on the quantum state of the system. In this sense, the simulated quantum reinforcement algorithm which was presented in this paper is closer to a classical optimization algorithm. A more realistic simulation of the quantum annealing process with reinforcement should consider an open quantum system which also interacts with a classical (or even quantum) controller. The controller adjusts the reinforcement Hamiltonian, which depends on the outcomes of the necessary (weak) measurements, e.g., measurements of the local magnetizations DK-pra-1999 ; Qestimation-prl-2006 ; Qcontrol-book . This is the subject of our future study.

There are quantum annealers that provide hardware support for solving an optimization spin-glass problem qa-nature-2011 ; qa-nc-2013 . An experimental implementation of the quantum annealing algorithm on such devices first needs a mapping of the optimization problem to the Ising model with two-spin interactions lucas-fn-2014 . In addition, one also needs to embed the interactions of the Ising Hamiltonian onto the interaction graph of the specified device. Each of the above steps requires a polynomial number of auxiliary spins to be added to the system, and thus increases the size of the necessary device choi-qi-2008 ; choi-qi-2011 ; sg-prx-2015 . The quantum reinforcement algorithm could increase this complexity by adding other ancillary spins to the system for an indirect or weak measurement of the local magnetizations and correlations.

Appendix A Bethe approximation and Belief Propagation (BP) equations

Consider an interacting system of $N$ binary variables $\sigma_{i}\in\{-1,+1\}$ with the following energy function

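$$E(\boldsymbol{\sigma})=-\sum_{i=1}^{N}h_{i}\sigma_{i}-\sum_{a=1}^{M}h_{a}\prod_{i\in\partial a}\sigma_{i}.$$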

The interaction pattern of the variables is identified with the neighborhood subsets $\partial a$ and $\partial i$ . Here, $\partial a$ gives the set of variables in constraint $a$ , and $\partial i$ is the set of constraints involving variable $i$ . The partition function for this problem reads as follows,

$$Z=\sum_{\boldsymbol{\sigma}}e^{-E(\boldsymbol{\sigma})},$$

where the inverse temperature parameter is absorbed in the couplings $h_{i,a}$ .

Assuming that the interaction graph is locally treelike, the local averages $m_{i}=\langle\sigma_{i}\rangle$ and $m_{a}=\langle\prod_{i\in\partial a}\sigma_{i}\rangle$ can be written in terms of the cavity probabilities MM-book-2009 ,

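$$m_{i}=\sum_{\sigma_{i}}\sigma_{i}\,\frac{1}{z_{i}}e^{h_{i}\sigma_{i}}\prod_{a\in\partial i}\mu_{a\to i}(\sigma_{i}),$$

$$m_{a}=\sum_{\sigma_{\partial a}}\Big(\prod_{i\in\partial a}\sigma_{i}\Big)\frac{1}{z_{a}}e^{h_{a}\prod_{i\in\partial a}\sigma_{i}}\prod_{i\in\partial a}\mu_{i\to a}(\sigma_{i}).$$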

Here $\mu_{i\to a}(\sigma_{i})$ is the probability of state $\sigma_{i}$ for variable $i$ in the absence of interaction $a$ , and $\mu_{a\to i}(\sigma_{i})$ is the message that variable $i$ receives from interaction factor $a$ to satisfy the interaction. The $z_{i}$ and $z_{a}$ are normalization constants. The cavity messages are governed by the Bethe equations,

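$$\mu_{i\to a}(\sigma_{i})=\frac{1}{z_{i\to a}}e^{h_{i}\sigma_{i}}\prod_{b\in\partial i\setminus a}\mu_{b\to i}(\sigma_{i}),$$

$$\mu_{a\to i}(\sigma_{i})=\frac{1}{z_{a\to i}}\sum_{\sigma_{\partial a\setminus i}}e^{h_{a}\prod_{j\in\partial a}\sigma_{j}}\prod_{j\in\partial a\setminus i}\mu_{j\to a}(\sigma_{j}),$$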

with the normalization constants $z_{i\to a}$ and $z_{a\to i}$ . The cavity equations are solved by iteration starting from random initial values for the cavity messages. Then, the messages are used to find the local estimation values from the above equations.
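For $\pm 1$ spins with product interactions, the cavity messages can be parametrized by a single number each, which gives a compact implementation of the BP iteration. In the sketch below, each factor sends a field $u_{a\to i}=\mathrm{arctanh}[\tanh(h_{a})\prod_{j\in\partial a\setminus i}m_{j\to a}]$ and each variable sends a cavity magnetization $m_{i\to a}=\tanh(h_{i}+\sum_{b\in\partial i\setminus a}u_{b\to i})$ . The function name, the parallel update order, and the zero initial messages (chosen for reproducibility, whereas the text starts from random values) are assumptions of this example.

```python
import numpy as np

def belief_propagation(clauses, h_var, h_fac, n_iter=200, tol=1e-12):
    """Bethe estimates (m_i, m_a) for
    E = -sum_i h_var[i] * s_i - sum_a h_fac[a] * prod_{i in a} s_i."""
    h_var = np.asarray(h_var, dtype=float)
    t_fac = np.tanh(np.asarray(h_fac, dtype=float))
    # u[a][k]: field sent by factor a to its k-th member variable
    u = [np.zeros(len(members)) for members in clauses]

    def cavity_magnetizations():
        total = h_var.copy()
        for members, u_a in zip(clauses, u):
            np.add.at(total, list(members), u_a)
        cav = [np.tanh(total[list(members)] - u_a)
               for members, u_a in zip(clauses, u)]
        return total, cav

    for _ in range(n_iter):
        _, cav = cavity_magnetizations()
        change = 0.0
        for a, cav_a in enumerate(cav):
            for k in range(len(cav_a)):
                # product over all members of a except the receiver
                new = np.arctanh(t_fac[a] * np.prod(np.delete(cav_a, k)))
                change = max(change, abs(new - u[a][k]))
                u[a][k] = new
        if change < tol:
            break
    total, cav = cavity_magnetizations()
    m_var = np.tanh(total)
    prods = np.array([np.prod(c) for c in cav])
    m_fac = (t_fac + prods) / (1.0 + t_fac * prods)
    return m_var, m_fac
```

On a tree-like interaction graph the returned averages are exact; on graphs with loops they are the Bethe approximation described in the text.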

A.1 Solving the inverse problem within the Bethe approximation

The BP algorithm provides an efficient way of estimating the expectation values, given the energy function. This approximation is also useful for solving the inverse problem of constructing the energy function, here the parameters $h_{i}$ and $h_{a}$, that best describes the given expectation values $m_{i}$ and $m_{a}$. A simple strategy, assuming that there is no error in the $m_{i,a}$, is to find the set of parameters $\mathbf{h}=\{h_{i,a}\}$ that minimizes the differences between the resulting $m_{i,a}[\mathbf{h}]$ and the given values $m_{i,a}$. The following algorithm tries to solve this problem by iteration:

• Start at time step zero $t=0$ with initial parameters $h_{i}(t)=h_{a}(t)=0$.

• For $t=0,\dots,t_{inv}$ do:

1. compute the expectation values $m_{i,a}[\mathbf{h}(t)]$;

2. compute the deviations $\Delta_{i,a}=|m_{i,a}-m_{i,a}[\mathbf{h}(t)]|$;

3. stop if the maximum deviation is smaller than $\epsilon$;

4. change the parameters $h_{i,a}(t+1)=h_{i,a}(t)+\eta(m_{i,a}-m_{i,a}[\mathbf{h}(t)])$.

Here, we use the BP algorithm to estimate the average values $m_{i,a}[\mathbf{h}(t)]$ . The parameter $\eta$ is a sufficiently small and positive number.
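
The update loop above can be sketched on a toy instance. Note that this illustration swaps the BP estimate of $m_{i,a}[\mathbf{h}(t)]$ for exact enumeration, so that the block stays short and self-contained; it also assumes the Boltzmann weight $\exp(\sum_{i}h_{i}\sigma_{i}+\sum_{a}h_{a}\prod_{i\in\partial a}\sigma_{i})$. The function names and the default values of $\eta$, $\epsilon$ and $t_{inv}$ are hypothetical choices.

```python
import itertools
import math


def averages(h_var, h_con, constraints):
    """Exact <s_i> and <prod_{i in a} s_i> by enumeration (stand-in for BP)."""
    n = len(h_var)
    z = 0.0
    m_var = [0.0] * n
    m_con = [0.0] * len(constraints)
    for sigma in itertools.product((-1, 1), repeat=n):
        prods = [math.prod(sigma[i] for i in mem) for mem in constraints]
        # Assumed weight: exp(sum_i h_i s_i + sum_a h_a prod_{i in a} s_i).
        w = math.exp(sum(h * s for h, s in zip(h_var, sigma))
                     + sum(h * p for h, p in zip(h_con, prods)))
        z += w
        for i in range(n):
            m_var[i] += w * sigma[i]
        for a, p in enumerate(prods):
            m_con[a] += w * p
    return [m / z for m in m_var], [m / z for m in m_con]


def fit_parameters(target_var, target_con, constraints,
                   eta=0.2, eps=1e-9, t_inv=5000):
    """Iterate h(t+1) = h(t) + eta * (m - m[h(t)]) starting from h = 0."""
    h_var = [0.0] * len(target_var)
    h_con = [0.0] * len(target_con)
    max_dev = float("inf")
    for _ in range(t_inv + 1):
        m_var, m_con = averages(h_var, h_con, constraints)      # step 1
        d_var = [x - y for x, y in zip(target_var, m_var)]
        d_con = [x - y for x, y in zip(target_con, m_con)]
        max_dev = max(abs(d) for d in d_var + d_con)             # step 2
        if max_dev < eps:                                        # step 3
            break
        h_var = [h + eta * d for h, d in zip(h_var, d_var)]      # step 4
        h_con = [h + eta * d for h, d in zip(h_con, d_con)]
    return h_var, h_con, max_dev


# Generate targets from known parameters, then try to recover them.
constraints = [(0, 1), (1, 2)]
target_var, target_con = averages([0.3, -0.2, 0.1], [0.5, -0.4], constraints)
h_var, h_con, max_dev = fit_parameters(target_var, target_con, constraints)
```

With exact averages the update is a gradient ascent on a concave log-likelihood, so a small positive $\eta$ converges on this toy instance; with BP estimates the same loop is only approximate.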

A.2 The reinforced BP algorithm

The Bethe approximation also provides an approximate algorithm to find a solution to the XORSAT problem. A solution is a spin configuration which satisfies all the XORSAT constraints, i.e. $\prod_{i\in\partial a}\sigma_{i}=J_{a}$ for all $a$, with $J_{a}=\pm 1$. To this end, one adds the reinforcement term $E_{r}(\boldsymbol{\sigma})=-r\sum_{i}\mu_{i}(\sigma_{i})$ to the energy function, where $\mu_{i}(\sigma_{i})$ is the local marginal probability of variable $i$. The reinforcement parameter $r$ is assumed to increase slowly with the number of algorithm iterations. More precisely, the reinforced BP (rBP) equations for the cavity messages at iteration $t$ are

[TABLE]

The small external fields $h_{i}$, with a magnitude much less than one, serve to break the high symmetry of the problem. The indicator function $\mathbb{I}_{a}(\sigma_{\partial a})$ is one if constraint $a$ is satisfied and zero otherwise. Moreover, the local marginal probabilities are given by

[TABLE]

The equations are solved by iteration starting from random initial messages and updating them according to the above equations for at most $t_{max}$ iterations. At each iteration, we update all the cavity messages and local marginals. We also introduce damping to the iterative process, i.e., at each step the messages are updated as $\mu^{t+1}=\lambda\mu^{t}+(1-\lambda)\mu^{t+1}$ with a damping parameter $0<\lambda<1$, where $\mu^{t+1}$ on the right-hand side denotes the value freshly computed from the rBP equations. We set $r(0)=0$ and increase the reinforcement parameter linearly with the number of iterations as $r(t+1)=r(t)+\delta r$. After each iteration, a candidate solution is obtained from the local marginal probabilities as $\sigma_{i}^{*}=\arg\max_{\sigma_{i}}\mu_{i}(\sigma_{i})$. The algorithm stops when the candidate configuration is a solution to the problem.
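
The bookkeeping around the message updates (damping, the linear schedule for $r$, and the stopping test) can be sketched independently of the rBP equations themselves. The helper names below are hypothetical, marginals are assumed to be stored as pairs $[\mu_{i}(-1),\mu_{i}(+1)]$, and ties in the arg max are broken in favour of $+1$ as an arbitrary choice.

```python
import math


def damp(old, new, lam):
    """Damped update lam * old + (1 - lam) * new, applied elementwise."""
    return [lam * o + (1.0 - lam) * n for o, n in zip(old, new)]


def reinforcement_schedule(delta_r, steps):
    """Linear schedule r(0) = 0, r(t+1) = r(t) + delta_r."""
    return [t * delta_r for t in range(steps + 1)]


def candidate(marginals):
    """sigma_i* = argmax mu_i(sigma_i); ties go to +1 (assumption)."""
    return [1 if p[1] >= p[0] else -1 for p in marginals]


def is_solution(sigma, constraints, couplings):
    """True if prod_{i in a} sigma_i equals J_a for every constraint a."""
    return all(math.prod(sigma[i] for i in mem) == j
               for mem, j in zip(constraints, couplings))


# Toy usage: three spins, one XORSAT constraint with J = -1.
marginals = [[0.2, 0.8], [0.6, 0.4], [0.1, 0.9]]
sigma_star = candidate(marginals)
solved = is_solution(sigma_star, [(0, 1, 2)], [-1])
```

In a full rBP loop, `damp` would be applied to every cavity message after it is recomputed, and `is_solution` would be evaluated on `candidate(marginals)` after each iteration to decide whether to stop.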

Bibliography

(1) Monasson, Remi, et al. "Determining computational complexity from characteristic ‘phase transitions’." Nature 400.6740 (1999): 133.
(2) Mezard, Marc, Giorgio Parisi, and Riccardo Zecchina. "Analytic and algorithmic solution of random satisfiability problems." Science 297.5582 (2002): 812-815.
(3) Sutton, Richard S., and Andrew G. Barto. Introduction to reinforcement learning. Vol. 135. Cambridge: MIT Press, 1998.
(4) Braunstein, Alfredo, and Riccardo Zecchina. "Learning by message passing in networks of discrete synapses." Physical Review Letters 96.3 (2006): 030201.
(5) Krzakala, Florent, et al. "Gibbs states and the set of solutions of random constraint satisfaction problems." Proceedings of the National Academy of Sciences 104.25 (2007): 10318-10323.
(6) Semerjian, Guilhem. "On the freezing of variables in random constraint satisfaction problems." Journal of Statistical Physics 130.2 (2008): 251-293.
(7) Dall’Asta, Luca, Abolfazl Ramezanpour, and Riccardo Zecchina. "Entropy landscape and non-Gibbs solutions in constraint satisfaction problems." Physical Review E 77.3 (2008): 031118.
(8) Mezard, Marc, Thierry Mora, and Riccardo Zecchina. "Clustering of solutions in the random satisfiability problem." Physical Review Letters 94.19 (2005): 197205.