Analog Errors in Quantum Annealing: Doom and Hope

Adam Pearson; Anurag Mishra; Itay Hen; Daniel Lidar

arXiv:1907.12678·quant-ph·December 30, 2021

TL;DR

This paper investigates how analog control errors, causing $J$-chaos, severely impair quantum annealing performance, but demonstrates that quantum annealing correction can mitigate these effects and restore potential quantum speedups.

Contribution

It provides empirical evidence of $J$-chaos causing catastrophic failure in quantum annealing and shows that quantum annealing correction can effectively mitigate this issue.

Findings

01

$J$-chaos makes the time-to-solution scaling of uncorrected quantum annealing worse than that of a deterministic (exhaustive) classical solver.

02

Quantum annealing correction improves the time-to-solution scaling to below that of the exhaustive classical solver.

03

Empirical validation on D-Wave quantum annealers confirms the mitigation strategy's effectiveness.

Abstract

Quantum annealing has the potential to provide a speedup over classical algorithms in solving optimization problems. Just as for any other quantum device, suppressing Hamiltonian control errors will be necessary before quantum annealers can achieve speedups. Such analog control errors are known to lead to $J$ -chaos, wherein the probability of obtaining the optimal solution, encoded as the ground state of the intended Hamiltonian, varies widely depending on the control error. Here, we show that $J$ -chaos causes a catastrophic failure of quantum annealing, in that the scaling of the time-to-solution metric becomes worse than that of a deterministic (exhaustive) classical solver. We demonstrate this empirically using random Ising spin glass problems run on the two latest generations of the D-Wave quantum annealers. We then proceed to show that this doomsday scenario can be mitigated using…

Tables

Table 1: Fit parameters for Eq. (6) after data collapse of the TTS scaling data shown in Fig. 6. “Upper” and “lower” refer to the 95% C.I. values of the parameters, calculated as explained in detail in Section B.3. Of particular note is the $d$ parameter, which determines the asymptotic scaling. For QAC $d<2$ while for C $d>2$, with $d=2$ being the scaling of an exhaustive classical solver.

Scheme	$a$	$b$	$c$	$d$
C	8.01	0.134	1.61	2.12
C lower	5.71	0.109	1.58	2.10
C upper	10.3	0.159	1.64	2.15
QAC	0.392	0.069	0.486	1.73
QAC lower	0.384	0.057	0.483	1.70
QAC upper	0.399	0.081	0.490	1.75

Equations

$H(s) = A(s) H_{X} + B(s) \tilde{H}_{\mathrm{Ising}}$

$H_{\mathrm{Ising}}$

$\delta H_{\mathrm{Ising}}$

$\delta h_{i}, \delta J_{ij} \sim \mathcal{N}(0, \eta^{2})$

$\mathrm{TTS} = t_{\mathrm{f}} \left\lceil \frac{\ln(1 - 0.99)}{\ln(1 - P_{g})} \right\rceil$

$\mathrm{TTS}_{\mathrm{DP}} / P_{g} = O(L^{2} 2^{4L} e^{c_{\mathrm{DP}} L^{2}}), \quad c_{\mathrm{DP}} = 8 \eta^{\alpha}$

$f(L, \eta) = 10^{a (\eta^{2} + b^{2})^{c} L^{d}}$

$\mu = \vec{D} \cdot \vec{B} = \sum_{i=1}^{g} D_{i} B_{i}$

$\sigma^{2} = \sum_{i=1}^{g} D_{i} (B_{i} - \mu)^{2}$

$\gamma_{\mathrm{opt}} = \arg\max_{\gamma} P_{g}(i, L, \eta, \gamma)$

$\overline{\sigma_{i}^{z}} = \sum_{l=1}^{n} \sigma_{i_{l}}^{z}, \quad \overline{\sigma_{i}^{z} \sigma_{j}^{z}} = \sum_{l=1}^{n} \sigma_{i_{l}}^{z} \sigma_{j_{l}}^{z}$

$H_{P} = - \sum_{i=1}^{N} \sigma_{i_{p}}^{z} \overline{\sigma_{i}^{z}}$

$\overline{H}(s) = A(s) H_{X} + B(s) (\alpha \overline{H}_{\mathrm{Ising}} + \gamma H_{P})$

$g_{1}$

$g_{3}$

Full text

Adam Pearson (1,2), Anurag Mishra (1,2,6), Itay Hen (1,2,3), Daniel A. Lidar (1,2,4,5)

(1) Department of Physics and Astronomy, (2) Center for Quantum Information Science & Technology, (3) Information Sciences Institute, (4) Department of Electrical Engineering, (5) Department of Chemistry, University of Southern California, Los Angeles, CA 90089

(6) Current address: Qulab Inc., 1642 Westwood Blvd, Los Angeles, CA 90024, USA

Abstract

Quantum annealing has the potential to provide a speedup over classical algorithms in solving optimization problems. Just as for any other quantum device, suppressing Hamiltonian control errors will be necessary before quantum annealers can achieve speedups. Such analog control errors are known to lead to $J$-chaos, wherein the probability of obtaining the optimal solution, encoded as the ground state of the intended Hamiltonian, varies widely depending on the control error. Here, we show that $J$-chaos causes a catastrophic failure of quantum annealing, in that the scaling of the time-to-solution metric becomes worse than that of a deterministic (exhaustive) classical solver. We demonstrate this empirically using random Ising spin glass problems run on the two latest generations of the D-Wave quantum annealers. We then proceed to show that this doomsday scenario can be mitigated using a simple error suppression and correction scheme known as quantum annealing correction (QAC). By using QAC, the time-to-solution scaling of the same D-Wave devices is improved to below that of the classical upper bound, thus restoring hope in the speedup prospects of quantum annealing.

I Introduction

The demonstration of scaling speedups Shor (1997); Bravyi et al. (2018) using quantum hardware is the holy grail of quantum computing, and massive efforts are underway worldwide in their pursuit. A daunting obstacle is the fact that all physical implementations of quantum computers suffer from analog control errors, in which the coefficients of the Hamiltonian implemented differ from those intended (e.g., Barends et al. (2014)), a fact that threatens to spoil the results of computations due to the accumulation of small errors. This problem was recognized early on in the gate model of quantum computing Landauer (1995), and soon after theoretically dealt with by error discretization via quantum error correcting codes Shor (1996). Moreover, the accuracy threshold theorem guarantees that if the physical gates used to implement encoded, error-corrected quantum circuits have sufficiently high fidelity, then any real, noisy quantum computation can be made arbitrarily close to the intended, noiseless computation with modest overhead Aliferis et al. (2006); Chao and Reichardt (2018); Lidar and Brun (2013). In contrast, the ultimate impact of analog control errors in Hamiltonian quantum computing, in particular in adiabatic quantum optimization Farhi et al. (2000) and quantum annealing (QA) Kadowaki and Nishimori (1998), is not as clear. While unlike the gate model adiabatic quantum evolution is inherently robust to path variations due to unitary control errors Childs et al. (2001), there is as of yet no equivalent mechanism of error discretization or an analogous accuracy threshold theorem in this paradigm. Yet, at the same time quantum annealing offers a currently unparalleled opportunity to explore NISQ-era Preskill (2018) quantum optimization with thousands of qubits Albash and Lidar (2018); Mandrà and Katzgraber (2018). It is thus of great importance to assess the role of analog control errors in QA, and to find ways to mitigate them. 
Here we do so in the context of spin glass problems, which are known to exhibit a type of control-error induced bond or disorder chaos already in a purely classical setting, causing chaotic changes in the ground or equilibrium state Bray and Moore (1987); Katzgraber and Krzakala (2007).

The extent to which analog control errors present a challenge in using any physical realization of quantum annealing, such as the D-Wave processors Bunyk et al. (2014); Harris et al. (2010a, b), for optimization, cannot be overstated. Indeed, an earlier study of such errors in these processors found evidence of sub-classical performance and referred to the effect as $J$-chaos Martin-Mayor and Hen (2015), a terminology we adopt here. More recently it was shown that analog control noise causes the probability that the implemented Hamiltonian shares a ground state with the intended Hamiltonian to decrease exponentially in the size of the problem and the magnitude of the noise Albash et al. (2019). This means that even if the annealer solves the implemented problem correctly, it has an exponentially shrinking probability of finding the intended ground state. In other words, subject to $J$-chaos an otherwise perfectly functioning quantum annealer will typically find the correct answer to the wrong problem.

To mitigate this “wrong Hamiltonian” problem and restore the prospects for a speedup in the use of quantum annealing for optimization, it is necessary to introduce techniques for error suppression and correction. This observation is not new Young et al. (2013); Bian et al. (2014); Zhu et al. (2016), and repetition coding along with the use of energy penalties has been shown to significantly enhance the performance of quantum annealers Pudenz et al. (2014, 2015); Mishra et al. (2015); Vinci et al. (2015, 2016); Vinci and Lidar (2018). In contrast to previous work, here, for the first time, we directly address the impact of $J$ -chaos on algorithmic scaling of optimization in experimental quantum annealing, while accessing a computational scale that is still far out of reach of current gate-model quantum computing devices. We employ a quantum annealing correction method to mitigate the problem, and demonstrate that while the scaling of uncorrected quantum annealing in solving random Ising spin glass problems is catastrophically affected by $J$ -chaos — in that it is worse than even that of a deterministic (brute force) classical solver — hope for a quantum speedup Rønnow et al. (2014) is restored with error suppression and correction. This reassuring conclusion is reached here using the simplest possible error suppression and correction scheme Pudenz et al. (2014), so that much room for improvement remains for more advanced methods. We expect our results to apply broadly, certainly beyond the D-Wave devices to other quantum Weber et al. (2017); Novikov et al. (2018); Goto (2019) and semiclassical annealing implementations Inagaki et al. (2016); Goto et al. (2019), and to other forms of analog quantum computing Das and Chakrabarti (2008).

II Results

Inspired by classical simulated annealing, in which thermal fluctuations are used to hop over barriers, quantum annealing uses quantum fluctuations to tunnel through barriers Kadowaki and Nishimori (1998); Ray et al. (1989); Brooke et al. (1999); Santoro et al. (2002); Boixo et al. (2016); Muthukrishnan et al. (2016); Denchev et al. (2016). The D-Wave processors are physical implementations of such devices Johnson et al. (2011). In the standard forward annealing protocol they apply a time-dependent transverse field $H_{X}=\sum_{i}\sigma^{x}_{i}$ ( $\sigma^{a}_{i}$ denotes the Pauli matrix of type $a\in\{x,y,z\}$ applied to qubit $i\in\{1,\dots,N\}$ ) and Ising Hamiltonian as follows,

$$H(s)=A(s)H_{X}+B(s)\tilde{H}_{\mathrm{Ising}}, \tag{1}$$

where $\tilde{H}_{\mathrm{Ising}}=H_{\mathrm{Ising}}+\delta H_{\mathrm{Ising}}$ and $H_{\mathrm{Ising}}$ are, respectively, the implemented (perturbed) and intended (unperturbed) “problem” Hamiltonians, while $\delta H_{\mathrm{Ising}}$ is an error term (the perturbation). Thus $\tilde{H}_{\mathrm{Ising}}$ is the (wrong) Hamiltonian including analog control errors while $H_{\mathrm{Ising}}$ is the Hamiltonian whose ground state we wish to find as the solution to the optimization problem specified by a set of local fields $\{h_{i}\}$ and couplers $\{J_{ij}\}$ :

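$$H_{\mathrm{Ising}}=\sum_{i\in\mathcal{V}}h_{i}\sigma^{z}_{i}+\sum_{(i,j)\in\mathcal{E}}J_{ij}\sigma^{z}_{i}\sigma^{z}_{j}, \tag{2}$$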

where $\mathcal{V}$ and $\mathcal{E}$ are the vertex and edge sets of the graph $\mathcal{G}$ , and $N=|\mathcal{G}|$ . We assume that the noise is Gaussian with zero mean and standard deviation $\eta$ :

$$\delta h_{i},\ \delta J_{ij}\sim\mathcal{N}(0,\eta^{2}). \tag{3}$$

We note that analog errors resulting in the replacement of $H_{X}$ by $\sum_{i}(1+\epsilon_{i})\sigma^{x}_{i}$ with random $\epsilon_{i}$ are expected as well, but we do not consider such transverse field errors here. The normalized time $s=t/t_{\mathrm{f}}$, with $t_{\mathrm{f}}$ denoting the final time, increases from $0$ to $1$, with $A(0)\gg B(0)$ and $B(1)\gg A(1)$, and $A(s)$ [$B(s)$] decreases (increases) monotonically. As such, the transverse field initially drives strong quantum fluctuations that eventually give way to the implemented Ising Hamiltonian. In the absence of $\delta H_{\mathrm{Ising}}$ and an environment the adiabatic theorem guarantees that the ground state of $H_{\mathrm{Ising}}$ will be found if $t_{\mathrm{f}}$ is large compared to the inverse of the minimum gap of $H(s)$ and the maximum time-derivative(s) of $H(s)$ Jansen et al. (2007); Lidar et al. (2009). In the presence of an environment the adiabatic theorem instead guarantees evolution towards the steady state of the corresponding Liouvillian Avron et al. (2012); Venuti et al. (2016), which becomes the ground state only for sufficiently low temperature (compared to the gap of the Liouvillian). When $\delta H_{\mathrm{Ising}}\neq 0$, the probability that the computation ends in the ground state of $H_{\mathrm{Ising}}$ decreases exponentially in both $N$ and some power of $\eta$ Albash et al. (2019). The reason is that as more noise is added, there is an increasing probability that the spectrum changes such that the ground state is swapped with an excited state. Thus, increasing noise and problem size leads to a rapidly growing probability of failure to find the correct ground state. In experimental quantum annealing both the environment and control errors inevitably play a role.
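The ground-state swap described above can be illustrated on a toy instance small enough for exhaustive enumeration. The following is a minimal sketch, not the procedure used in this work: the helper names (`random_instance`, `ground_states`, `shares_ground_state`) are hypothetical, and the complete graph on a few spins is an assumption made for brevity (the experiments use Chimera graphs). It perturbs every coupler by $\mathcal{N}(0,\eta^{2})$ and checks whether the perturbed and intended problems still share a ground state.

```python
import itertools
import numpy as np

# Coupler values used in the text: +/- {1/6, 1/3, 1/2}, with all local fields zero.
VALUES = np.array([-1/2, -1/3, -1/6, 1/6, 1/3, 1/2])

def random_instance(n, rng):
    """Toy instance (assumption): random couplers on a complete graph of n spins."""
    return np.triu(rng.choice(VALUES, size=(n, n)), k=1)

def ground_states(J, tol=1e-12):
    """Enumerate all 2^n spin configurations and return the set of energy minimizers."""
    n = J.shape[0]
    best, minimizers = np.inf, set()
    for bits in itertools.product((-1, 1), repeat=n):
        s = np.array(bits)
        e = float(s @ J @ s)  # sum over i<j of J_ij s_i s_j (J is upper triangular)
        if e < best - tol:
            best, minimizers = e, {bits}
        elif abs(e - best) <= tol:
            minimizers.add(bits)
    return minimizers

def shares_ground_state(J, eta, rng):
    """Add N(0, eta^2) noise to every coupler and test for a common ground state."""
    noise = rng.normal(0.0, eta, size=J.shape) * (J != 0)
    return bool(ground_states(J) & ground_states(J + noise))

rng = np.random.default_rng(0)
J = random_instance(8, rng)
for eta in (0.0, 0.05, 0.15):
    frac = np.mean([shares_ground_state(J, eta, rng) for _ in range(20)])
    print(f"eta={eta:.2f}: fraction of noise draws sharing a ground state = {frac:.2f}")
```

At $\eta=0$ the fraction is 1 by construction; how quickly it drops with $\eta$ and $N$ depends on the instance, so the printed values should be read as illustrative only.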

II.1 Effect of control noise

In order to systematically test the effect of analog control errors we studied the performance of two D-Wave devices on random Ising instances of varying size $N$, to which we added artificially generated Gaussian control noise of standard deviation $\eta$. This noise was added to the intrinsic analog device noise $\eta_{\mathrm{int}}$ (the D-Wave documentation refers to this as Integrated Control Errors (ICE); according to https://docs.dwavesys.com/docs/latest/c_qpu_1.html, $\eta_{\mathrm{int}}\lesssim 0.015$ for the DW2000Q device), so that the total control noise had variance $\eta^{2}_{\mathrm{int}}+\eta^{2}$. Adding noise in this manner allowed us to test its effect on algorithmic performance, and the efficacy of the quantum annealing correction (QAC) strategy described below.

Throughout this work we used Ising instances with local fields $h_{i}=0$ and couplers $J_{ij}$ selected uniformly at random from the set $\pm\{1/6,1/3,1/2\}$ , and chose $\eta\in\{0,0.03,0.05,0.07,0.10,0.15\}$ . Such instances have been studied before Boixo et al. (2014); Rønnow et al. (2014); Pudenz et al. (2015); Katzgraber et al. (2014); King and McGeoch (2014), but only with $\eta=0$ , i.e., never subject to the systematic addition of control noise. We define “success” in all cases as finding a ground state of the unperturbed Hamiltonian $H_{\mathrm{Ising}}$ . See Methods Section .1 for a complete description of the instances and how we verified ground states.

The number of qubits $N$ is proportional to $L^{2}$ , the number of Chimera graph unit cells of the D-Wave devices we used (see Methods Section .2). Figure 1 displays a series of correlation plots between different levels of added noise, for different problem sizes parametrized by $L$ . At every size, addition of noise results in a lower success probability for all instances. Increased size also results in lower success probability, as expected. Thus, Fig. 1 gives a visual confirmation of the detrimental effect of control noise; we quantify this systematically below.

Next we test whether control noise results in $J$ -chaos. The latter exhibits itself as large variations in the success probability of the programmed Hamiltonian across different runs. We quantify this in terms of the J-chaoticity measure $\sigma/\mu$ , where $\sigma$ is the standard deviation of the success probability across repeated runs of a given instance, and $\mu$ is the corresponding mean (see Methods Section .3 for more details). In Fig. 2, we plot the correlation of the J-chaoticity measure with increasing noise. For most instances this quantity becomes larger with increasing noise and size, which indicates that they are becoming more chaotic. Success probability is also strongly (negatively) correlated with increasing J-chaoticity, as shown in Fig. 3. This establishes that control-noise induced $J$ -chaos is responsible for a strong decline in performance. Before we quantify this decline in terms of the time-to-solution metric, we first address how to mitigate this problem using error suppression and correction.

II.2 Quantum Annealing Correction

The error suppression and correction scheme used in this work is the ${[3,1,3]}_{1}$ QAC code introduced in Pudenz et al. (2014) and further studied experimentally in Pudenz et al. (2015); Vinci et al. (2015); Mishra et al. (2015). We refer the reader to these references for details, and to Methods Section .4 for a brief summary. The ${[3,1,3]}_{1}$ code is a three-qubit repetition code that corrects bit-flip errors, with the subscript denoting one extra penalty qubit. The penalty term energetically suppresses all errors that do not commute with $\sigma^{z}$ during the anneal.

Since the ${[3,1,3]}_{1}$ code graph is a minor of the Chimera graph (see Methods Fig. 9), we can also implement these instances without QAC. But, to ensure a fair comparison we need to equalize the resources used with QAC and without it. The ${[3,1,3]}_{1}$ code consumes four qubits to encode one logical qubit. Thus, we can use the same amount of resources as the encoded logical problem by running four unencoded copies in parallel, which is called the classical repetition strategy (C). To be clear, the difference between QAC and the C strategy is fourfold: QAC uses logical qubits, logical operators, and an energy penalty term, while C uses physical qubits, physical operators, and no penalty. The decoding strategy for QAC is a majority vote over the three data qubits of each logical qubit, while for C it is best-of-four-copies of the logical problem solved by QAC. A more powerful, nested QAC strategy is known Vinci et al. (2016), but it requires more physical qubits per logical qubit, and hence is less suitable for a scaling analysis of the type we perform here.
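The two decoding rules just described can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and array layouts are hypothetical, spins are represented as $\pm 1$ integers, and `energy` stands for any callable that evaluates the intended Ising energy of a configuration.

```python
import numpy as np

def decode_qac(data_spins):
    """Majority vote over the three data qubits of each logical qubit.

    data_spins: integer array of shape (n_logical, 3) with entries in {-1, +1}.
    The sum of three +/-1 values is odd, so the sign is never zero.
    """
    return np.sign(np.asarray(data_spins).sum(axis=1)).astype(int)

def decode_c(copies, energy):
    """Best-of-four rule for the C strategy: keep the lowest-energy copy.

    copies: sequence of spin configurations read out from the parallel copies.
    energy: callable returning the intended Ising energy of one configuration.
    """
    energies = [energy(c) for c in copies]
    return copies[int(np.argmin(energies))]

# Usage on a two-spin ferromagnetic toy problem (hypothetical example).
logical = decode_qac([[1, 1, -1], [-1, -1, 1]])
best = decode_c([np.array([1, -1]), np.array([1, 1])], lambda s: -s[0] * s[1])
print(logical, best)
```

Majority voting can correct a single bit flip within a logical qubit, whereas the best-of-four rule only selects among complete readouts and cannot combine information across copies.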

We now discuss the results after the application of QAC and compare them to the C strategy. Fig. 4 illustrates that for relatively large problem sizes and strong added control noise, such as at $L=14$ and $\eta=0.07$ , QAC is able to find nearly all ground states while the C strategy only finds the ground state of a small fraction of instances.

More systematically, we show in Fig. 5 the fraction of instances where using QAC improved success probability when compared to the C strategy. If more than half of the instances exhibit better performance for the QAC strategy, applying it is useful for median instances. Evidently, QAC becomes a better strategy for large size and large noise, i.e., as finding the ground state becomes harder. Similarly, we compare the J-chaoticity measure $\sigma/\mu$ for QAC and C, as seen in Fig. 5. Just as in Fig. 5, we see greater advantage using QAC over C with increasing size and noise. These observations are consistent with earlier results Pudenz et al. (2014, 2015); Vinci et al. (2015); Mishra et al. (2015) where the ${[3,1,3]}_{1}$ code performs better than the classical repetition strategy at large problem sizes. Here, we have shown that this is also true in the high control noise and $J$ -chaos regime.

To understand the source of the improvement in the success probability, consider that application of repetition codes can decrease the effective noise on the encoded problem Young et al. (2013). In particular, the encoded operators of an $n$ -qubit repetition code have an effective energy scale that scales extensively relative to the unencoded problem (see Methods Section .4), while the random control noise adds up incoherently and hence its energy contribution only scales up by a factor of $\sqrt{n}$ . Thus, the encoding reduces the effective noise of the problem by the factor $1/\sqrt{n}$ . This enhances the success probability of the encoded problem over the simple C strategy, essentially by leveraging just classical properties of the repetition code. However, there is a quantum mechanism at work as well: a mean-field analysis reveals that the penalty term reduces the tunneling barrier width and height in the QAC case Matsuura et al. (2016, 2017). Indeed, we shall next see that our empirical results are inconsistent with the constant success probability enhancement that would be expected from a purely classical reduction of the effective noise by the factor $1/\sqrt{3}$ (we use an $n=3$ repetition code).
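The classical $1/\sqrt{n}$ argument can be checked numerically. The sketch below assumes, for illustration only, that an encoded coupler is the sum of $n$ physical couplers, each equal to $J$ plus independent $\mathcal{N}(0,\eta^{2})$ noise; the function name and default parameter values are hypothetical.

```python
import numpy as np

def relative_noise(n, J=0.5, eta=0.1, samples=200_000, seed=0):
    """Ratio of noise to signal for an encoded coupler built from n noisy copies.

    The signal adds coherently (n * J), while independent Gaussian noise adds
    in quadrature (sqrt(n) * eta), so the ratio should scale as 1 / sqrt(n).
    """
    rng = np.random.default_rng(seed)
    encoded = (J + rng.normal(0.0, eta, size=(samples, n))).sum(axis=1)
    return encoded.std() / abs(encoded.mean())

r1 = relative_noise(1)
r3 = relative_noise(3)
print(r1 / r3)  # close to sqrt(3) ≈ 1.73, up to sampling error
```

This captures only the classical part of the improvement; the penalty-induced change in the tunneling barrier discussed above is not modeled here.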

II.3 Scaling of the time-to-solution

We now discuss the impact of the analog noise on the computational effort required to find a solution of the problem. This can be quantified by the time-to-solution (TTS) metric Rønnow et al. (2014), which is the total annealing time (the anneal time per run multiplied by the number of runs) required to obtain the correct ground state at least once with 99% probability:

$$\mathrm{TTS}=t_{\mathrm{f}}\left\lceil\frac{\ln(1-0.99)}{\ln(1-P_{g})}\right\rceil, \tag{4}$$

where $t_{\mathrm{f}}$ is the total anneal time per run, and $P_{g}$ is the probability of finding the ground state in a given run (see Methods Section .3 for details on how $P_{g}$ was computed). The TTS metric is often used for benchmarking quantum annealing against classical algorithms (e.g., Hen et al. (2015); Mandrà et al. (2016); King et al. (2017); Hamerly et al. (2019); Jünger et al. (2019)). The TTS metric gives accurate scaling with problem size only when $t_{\mathrm{f}}$ is optimized to minimize the TTS for each size Rønnow et al. (2014); Albash and Lidar (2018). In our experiments, we used a fixed $t_{\mathrm{f}}=5\,\mu\mathrm{s}$, and hence these results only place a lower bound on the true scaling Hen et al. (2015), but this is sufficient for our purposes. Since our anneal time was fixed, we actually report the number of runs $R=\mathrm{TTS}/t_{\mathrm{f}}$, which we still refer to as TTS. Note that $R\geq 1$ as one needs to run the annealer at least once to find the correct ground state.
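As a worked illustration of the TTS formula and of the run count $R$, the following sketch evaluates both for a given $P_{g}$. The function names are hypothetical; the 99% target and the $5\,\mu\mathrm{s}$ anneal time are taken from the text.

```python
import math

def runs_to_solution(p_g, target=0.99):
    """Number of runs R needed to see the ground state at least once
    with probability `target`, given per-run success probability p_g."""
    if not 0.0 < p_g <= 1.0:
        raise ValueError("p_g must lie in (0, 1]")
    if p_g == 1.0:
        return 1  # log(1 - p_g) would diverge; one run always suffices
    runs = math.ceil(math.log(1.0 - target) / math.log(1.0 - p_g))
    return max(1, runs)  # at least one anneal is always required

def time_to_solution(p_g, t_f=5.0, target=0.99):
    """TTS in the same units as t_f (microseconds for the default value)."""
    return t_f * runs_to_solution(p_g, target)

print(runs_to_solution(0.5))   # 7 runs
print(runs_to_solution(0.01))  # 459 runs
print(time_to_solution(0.5))   # 35.0 microseconds
```

The `max(1, ...)` guard reflects the remark that $R\geq 1$: for $P_{g}$ above the target the ceiling alone already yields one run.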

In Fig. 6, we show the TTS required to find the ground state for the median instances at each size, for both C and QAC, sorted by different levels of added noise $\eta$ . As expected for spin glasses, the TTS scales at least exponentially in $L$ for both cases, with the scaling becoming worse for larger $\eta$ . However, we note that QAC exhibits both milder scaling and lower absolute effort. The same conclusion holds when we sort instances by hardness (see Section B.2). To more directly see the advantage of QAC over C, we plot the speedup ratio Hen et al. (2015) in Fig. 7. This plot clearly shows the scaling advantage of QAC over C for sufficiently large size $L$ and added noise $\eta$ : for all added noise levels $\eta$ , the slope of the speedup ratio becomes positive beyond an initial transient at small sizes $L$ , and this happens sooner the larger $\eta$ is.

II.4 Data collapse and scaling: doom vs hope

We have seen that QAC outperforms the C strategy. But what is the worst-case classical cost of solving the same Ising problem instances? For a generalized Chimera graph of $L\times L$ unit cells of complete bipartite graphs $K_{r,r}$, the tree-width is $w=rL+1$; for the D-Wave devices used here $r=4$. Dynamic programming takes time $O(L^{2}2^{w})$ to find the ground state of any Ising problem defined on such a graph Selby (2014); Jünger et al. (2019). Here $2^{w}$ is the dimension of the exhaustive search space for each of the $L^{2}$ tree nodes of width $w$. Thus in the present case any problem can be solved exactly, deterministically, in time $\mathrm{TTS}_{\mathrm{DP}}=O(L^{2}2^{4L})$. However, adding analog errors exponentially suppresses the probability of success. Specifically, if the errors are drawn from a Gaussian distribution with standard deviation $\eta$ on an instance with $N$ spins, then $P_{g}=O(e^{-\eta^{\alpha}N})$, where $\alpha\leq 1$ depends on the problem class Albash et al. (2019). Thus, to find the ground state of the intended Hamiltonian $H_{\mathrm{Ising}}$, running dynamic programming on the Ising instances with noise added is expected to take a time scaling as $\mathrm{TTS}_{\mathrm{DP}}\times\left\lceil\frac{\ln(1-0.99)}{\ln(1-P_{g})}\right\rceil$, which reduces, in the limit $P_{g}\ll 1$, to:

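$$\mathrm{TTS}_{\mathrm{DP}}/P_{g}=O\left(L^{2}2^{4L}e^{c_{\mathrm{DP}}L^{2}}\right),\qquad c_{\mathrm{DP}}=8\eta^{\alpha}, \tag{5}$$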

since the dynamic programming algorithm is only presented the intended Ising instance once every $1/P_{g}$ times on average. Thus the worst case classical cost is asymptotically $O(e^{8\eta^{\alpha}L^{2}})$ . Note that this scaling is determined not by the intrinsic performance of the DP algorithm, but by the probability $P_{g}$ that it is presented with the intended Hamiltonian, which is algorithm-independent (and is, of course, not a problem an algorithm running on classical digital computers would need to suffer from). A random guess would find the intended ground state with the same asymptotic scaling: $\mathrm{TTS}_{\mathrm{rand}}/P_{g}=2^{N}e^{8\eta^{\alpha}L^{2}}=O(e^{c_{\mathrm{rand}}L^{2}})$ , but with a larger exponent: $c_{\mathrm{rand}}=c_{\mathrm{DP}}+8\ln 2$ .
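The intermediate steps behind this limit can be spelled out as follows. This is a sketch of the reasoning rather than a derivation taken from the original: it assumes $N=8L^{2}$, which follows from $L^{2}$ unit cells of $K_{4,4}$ with eight qubits each, and it drops constant prefactors such as $\ln 100$.

```latex
\begin{align}
\ln(1-P_{g}) &= -P_{g} + O(P_{g}^{2})
  && \text{(Taylor expansion for } P_{g}\ll 1\text{)} \\
\left\lceil \frac{\ln(1-0.99)}{\ln(1-P_{g})} \right\rceil
  &\approx \frac{\ln 100}{P_{g}}
  && \text{(a constant times } 1/P_{g}\text{)} \\
\frac{1}{P_{g}} &\sim e^{\eta^{\alpha}N} = e^{8\eta^{\alpha}L^{2}}
  && \text{(using } N = 8L^{2}\text{)} \\
\frac{\mathrm{TTS}_{\mathrm{DP}}}{P_{g}}
  &= O\!\left(L^{2}\,2^{4L}\,e^{8\eta^{\alpha}L^{2}}\right)
   = e^{\,8\eta^{\alpha}L^{2} + 4L\ln 2 + 2\ln L + O(1)}
\end{align}
```

In the last line the term quadratic in $L$ dominates the exponent for large $L$, which is why the asymptotic cost is governed by $c_{\mathrm{DP}}=8\eta^{\alpha}$ and not by the dynamic programming cost itself.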

To compare the D-Wave device’s TTS scaling to this form, we attempted a data collapse of the results shown in Fig. 6, in order to include the $\eta$ dependence in the scaling function. To achieve the data collapse we ran a comprehensive search for functions $f(L,\eta)$ that would collapse both the C and QAC data using as few fitting parameters as possible (see Methods Section .5 for details of the procedure). A natural choice for such a function is a generalization of Eq. (5) with up to five free parameters, of the form $\mathrm{TTS}=aL^{2}10^{bL+c(\eta^{2}+d^{2})^{e}L^{2}}$ . However, we found it to perform poorly (see Section B.3). Instead, we found that the four-parameter form

$$f(L,\eta)=10^{a(\eta^{2}+b^{2})^{c}L^{d}}, \tag{6}$$

where the crucial difference is the replacement of $L^{2}$ by $L^{d}$ in the exponential, works very well for both the C and QAC data (using three or fewer parameters gives poor agreement). The data collapse and fit results are shown in Fig. 8, and the fit parameters along with their $95\%$ C.I. are given in Table 1. The relatively tight error bounds are evidence of the quality of the data collapse.
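To make the role of the exponent $d$ concrete, the fitted form can be evaluated with the central values from Table 1. The sketch below assumes that $f(L,\eta)$ is read as the fitted TTS in units of runs; the dictionary and function names are hypothetical, and values outside the measured range of $L$ and $\eta$ are extrapolations of the fit, not data.

```python
# Central fit values from Table 1 (confidence intervals omitted).
FIT = {
    "C":   dict(a=8.01,  b=0.134, c=1.61,  d=2.12),
    "QAC": dict(a=0.392, b=0.069, c=0.486, d=1.73),
}

def log10_tts(L, eta, a, b, c, d):
    """log10 of the four-parameter fit: a * (eta^2 + b^2)^c * L^d."""
    return a * (eta**2 + b**2) ** c * L**d

for scheme, params in FIT.items():
    at_16 = log10_tts(16, 0.10, **params)
    growth = log10_tts(32, 0.10, **params) / at_16  # equals 2**d for this form
    print(f"{scheme}: log10 TTS at L=16 is {at_16:.2f}; doubling L multiplies it by {growth:.2f}")
```

Because the exponent of the fit is a pure power law in $L$, doubling $L$ multiplies $\log_{10}\mathrm{TTS}$ by $2^{d}$; the comparison with the exhaustive classical solver amounts to asking whether this factor is above or below $2^{2}=4$.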

Surprisingly, we find that $d>2$ for the C strategy, with high statistical confidence. This means that without error suppression, and even after taking the best of four copies of the problem, the performance of the quantum annealer is worse than that of a deterministic worst-case classical algorithm, for which $d=2$. Hence the “doom” advertised in the title of this work.

Fortunately, not all is lost: this disturbing finding is mitigated by QAC. As seen in Fig. 8 and Table 1, for QAC we obtain $d<2$ , again with high statistical confidence. This result restores the hope that a quantum annealer can eventually become competitive with classical optimization algorithms, but only after the incorporation of an error suppression and correction strategy such as QAC.

III Discussion

It should be remarked that our results on optimization have no direct bearing on other tasks quantum annealers are potentially capable of speeding up, such as approximate optimization King et al. (2015); Vinci and Lidar (2016) and sampling Adachi and Henderson (2015); Amin et al. (2018); Mott et al. (2017); Li et al. (2018); Perdomo-Ortiz et al. (2018). Nor do our results address quantum annealing slowdowns due to small gaps van Dam et al. (2001); Reichardt (2004); Jörg et al. (2010); Laumann et al. (2012), which may be addressed via other methods, such as non-stoquastic Hamiltonians Nishimori and Takada (2017); Albash (2019), reverse annealing Perdomo-Ortiz et al. (2011); Chancellor (2017); Ohkuwa et al. (2018), or inhomogeneous transverse field driving Susa et al. (2018a, b). However, none of these methods is immune to the effects of $J$ -chaos.

To conclude, we have shown that QAC can reduce the detrimental effects of $J$-chaos on the performance of quantum annealers. In the regime we tested, QAC becomes more effective the higher the noise is and the larger the problem size is. QAC performs distinctly better than the uncorrected C strategy in both scaling and absolute effort, even after equalizing resources in terms of total qubit count. Moreover, QAC undoes a catastrophic loss to an exhaustive classical algorithm by improving the scaling of the annealer’s TTS to below the classical upper bound. Thus, we have demonstrated that QAC is not only an effective tool that can be used to improve current quantum annealing hardware, but that error suppression and correction are essential to ensure competitive performance against classical alternatives. Further improvements using more powerful error suppression and correction strategies than the simple one we explored here are certainly expected, and undoubtedly necessary, as ultimately only a fully fault-tolerant approach is expected to be effective in the asymptotic limit of large problem sizes.

Methods

.1 Random Ising Instances

Without noise: The instances used were generated randomly on the ${[3,1,3]}_{1}$ code graph produced by the $L\times L$ Chimera graph for $L\in\{2,\dots,12\}$ on the D-Wave 2X and $L\in\{13,\dots,16\}$ on the D-Wave 2000Q. There were $100$ instances at each graph size such that the local fields $h_{i}$ were $0$ and the couplings $J_{ij}$ were drawn uniformly at random from the set $\pm\frac{1}{6}\times\{1,2,3\}$. We found the ground state energy of these logical instances via the Hamze-Freitas-Selby (HFS) algorithm Selby (2014); Hamze and de Freitas (2004) and parallel tempering with iso-energetic cluster (Houdayer) moves (PTICM) Houdayer (2001); Zhu et al. (2015). By using both, we consistently found ground state energies lower than or just as low as those found by the D-Wave devices. In a few instances the latter found lower energies than HFS (one instance at $L=15$ and five at $L=16$), and these were confirmed as correct using PTICM. As such, we are confident that the ground state energies found were in fact correct.

With noise: We generated random numbers $\delta J_{ij}\sim\mathcal{N}(0,\eta^{2})$. If a modified coupler value $\tilde{J}_{ij}=J_{ij}+\delta J_{ij}$ fell outside the experimentally allowed range $[-1,1]$, we truncated it to $\pm 1$. Since the largest coupling in our set was $0.5$ and the largest noise had a standard deviation of $\eta=0.15$, these truncated values were only used a handful of times in our entire data collection.
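The instance generation and noise injection described in this section can be sketched as follows. This is a minimal illustration, not the code used for the experiments: the function names are hypothetical, and the couplers are held in a flat array (one entry per edge) without reference to the actual code graph.

```python
import numpy as np

# Coupler set +/- (1/6) x {1, 2, 3}; all local fields are zero.
COUPLER_VALUES = np.array([-3, -2, -1, 1, 2, 3]) / 6.0

def draw_couplers(num_edges, rng):
    """Draw one coupler per edge uniformly at random from the allowed set."""
    return rng.choice(COUPLER_VALUES, size=num_edges)

def add_control_noise(J, eta, rng, j_range=(-1.0, 1.0)):
    """Add N(0, eta^2) noise to each coupler and truncate to the allowed range."""
    noisy = J + rng.normal(0.0, eta, size=J.shape)
    return np.clip(noisy, *j_range)

rng = np.random.default_rng(0)
J = draw_couplers(1000, rng)
J_noisy = add_control_noise(J, eta=0.15, rng=rng)
print("couplers truncated:", int(np.sum(np.abs(J_noisy) == 1.0)))
```

With $|J_{ij}|\leq 0.5$ and $\eta=0.15$, truncation requires a noise draw of more than three standard deviations in the unfavorable direction, consistent with the remark that truncated values were rare.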

.2 The D-Wave devices used

In this study, we used the D-Wave 2X (DW2X) annealer installed at the USC Information Sciences Institute and the D-Wave 2000Q (DW2000Q) annealer installed at the NASA Quantum Artificial Intelligence Laboratory (QuAIL). The qubits of the annealer occupy the vertices of the Chimera graph of size $12\times 12$ for the DW2X and $16\times 16$ for the DW2000Q (see Fig. 10 in Appendix A). The DW2X has $1098$ functional qubits, leading to a ${[3,1,3]}_{1}$ code graph with $236$ functional logical qubits and the DW2000Q has $2031$ functional qubits, leading to a ${[3,1,3]}_{1}$ code graph with $504$ functional logical qubits, as shown in Fig. 9.

The annealing time $t_{f}$ can be chosen in the range $[5,2000]\,\mu\mathrm{s}$ on the DW2X and $[1,2000]\,\mu\mathrm{s}$ on the DW2000Q. We used $t_{f}=5\,\mu\mathrm{s}$, since this is the shortest annealing time available on both devices.

There were some differences between the structure and performance of these two devices. A discussion of these differences can be found in Appendix C.

.3 Data analysis

We used a Bayesian bootstrap Rubin (1981) over the underlying data (collected as described in Methods Section .1) to compute the mean $\mu$ and the standard deviation $\sigma$ of the success probabilities and their associated error bars. The ground state probability $P_{g}$ used in Eq. 4 is the same as $\mu$ .

Consider $g$ gauges, where for each gauge $i$ we find $s_{i}$ successful readouts out of a total of $M$ readouts. We used the Beta distribution $\beta(s_{i},M-s_{i})$ as our posterior probability distribution of success, i.e., it is our best guess of the distribution of the success probability, given the observation of $s_{i}$ successful hits in $M$ attempts. To draw one sample of our bootstrap distribution, we did the following:

1. First, sample from each of the $\beta$-distributions. Let $B_{i}$ be a sample from the distribution $\beta(s_{i},M-s_{i})$ and let $\vec{B}=\{B_{1},B_{2},\ldots,B_{g}\}$.

2. Then, sample a point from the $g$-dimensional uniform Dirichlet distribution. This is a $g$-dimensional vector $\vec{D}$.

3. The estimate of the success probability for this bootstrap sample is given by

$$\mu=\sum_{i=1}^{g}D_{i}B_{i}, \tag{7}$$

and the estimate of the standard deviation of this sample is the square root of the weighted variance of $\vec{B}$ where the weights are given by $\vec{D}$,

$$\sigma=\sqrt{\sum_{i=1}^{g}D_{i}\left(B_{i}-\mu\right)^{2}}. \tag{8}$$

In Eqs. 7 and 8, we have used the fact that the samples of the uniform Dirichlet distribution sum to $1$: $\sum_{i}D_{i}=1$.

4. From these two quantities, we computed $\sigma/\mu$. Other quantities of interest can be similarly derived from a combination of $\vec{B}$ and $\vec{D}$.

We repeated these steps a large number of times to obtain a bootstrap distribution over our quantity of interest (in this case, $\mu$ and $\sigma/\mu$ ). Our best estimate of the quantity and its associated error bars are given by the mean and the spread of the bootstrap distribution respectively.
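The bootstrap procedure described above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions: the function name `bayesian_bootstrap`, the number of bootstrap samples, the example success counts, and the use of the standard deviation as the "spread" are all choices made here, not taken from the paper. The Beta distribution requires strictly positive parameters, so the sketch rejects gauges with $s_{i}=0$ or $s_{i}=M$; how such cases were treated is not specified in the text.

```python
import numpy as np

def bayesian_bootstrap(successes, M, n_boot=2000, rng=None):
    """Return bootstrap samples of mu and sigma/mu over g gauges."""
    rng = np.random.default_rng() if rng is None else rng
    s = np.asarray(successes, dtype=float)
    if np.any(s <= 0) or np.any(s >= M):
        raise ValueError("Beta(s, M - s) needs 0 < s < M for every gauge")
    g = s.size
    # Step 1: one Beta(s_i, M - s_i) draw per gauge and bootstrap sample.
    B = rng.beta(s, M - s, size=(n_boot, g))
    # Step 2: uniform Dirichlet weights, each row sums to 1.
    D = rng.dirichlet(np.ones(g), size=n_boot)
    # Step 3: weighted mean and weighted standard deviation.
    mu = np.sum(D * B, axis=1)
    sigma = np.sqrt(np.sum(D * (B - mu[:, None]) ** 2, axis=1))
    # Step 4: the derived quantity sigma / mu.
    return mu, sigma / mu

rng = np.random.default_rng(1)               # arbitrary seed (assumption)
counts = [120, 95, 150, 80, 110]             # made-up successes for 5 gauges
mu_samples, ratio_samples = bayesian_bootstrap(counts, M=10_000, rng=rng)
mu_hat, mu_err = mu_samples.mean(), mu_samples.std()
```

Here `mu_hat` plays the role of the best estimate and `mu_err` the role of its error bar; a percentile interval of `mu_samples` would be an equally reasonable measure of spread.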

Data for C was collected as follows. For each instance and each noise value, we ran the annealer with $5$ random gauges Boixo et al. (2013); Job and Lidar (2018) with $10,000$ readouts each. If $p$ is the success probability of one unencoded copy and each copy is statistically independent, then the C strategy will have at least one successful copy with probability $1-(1-p)^{4}$. Here, we only collected data for a single copy, and then used this combinatorial formula to estimate the success probability of the classical repetition case, as in earlier work Pudenz et al. (2015). In fact, due to crosstalk this provides an upper bound on the actual performance of the C strategy (see Section C.2), so that our results favor QAC over C even more than our plots indicate.
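As a small worked example of the combinatorial formula, the sketch below converts a single-copy success probability into the idealized success probability of four independent copies. The function name `repetition_success` and the input value are illustrative assumptions.

```python
def repetition_success(p, copies=4):
    """Probability that at least one of `copies` independent copies succeeds."""
    return 1.0 - (1.0 - p) ** copies

# Illustrative single-copy success probability (not measured data).
p_single = 0.5
p_classical = repetition_success(p_single)  # 1 - 0.5**4 = 0.9375
```

Because the formula assumes statistically independent copies, it ignores crosstalk between copies run side by side on the same chip.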

Data for QAC (see Section .4) was collected as follows. Every data collection run for problem instance $i$ of size $L$ can be labeled by two additional parameters: the strength of the artificially injected noise $\eta$ and the strength of the penalty value $\gamma$. The penalty strength was chosen from the set $\gamma\in\{0.1,0.2,\ldots,0.5\}$. For each data collection run $(i,L,\eta,\gamma)$, we ran the annealer with $5$ random gauges with $10,000$ readouts each, for a total of $50,000$ readouts. From this data we estimated the mean success probability over the gauges, $P_{g}(i,L,\eta,\gamma)$. The optimal penalty value

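$$\gamma_{\mathrm{opt}}(i,L,\eta)=\underset{\gamma}{\operatorname{argmax}}\;P_{g}(i,L,\eta,\gamma) \tag{9}$$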

maximizes the success probability within the chosen range of penalty values. The results shown in Section II were picked to be at this optimal value for each instance. Histograms of the optimal strengths for each problem size are shown in Appendix D.
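Selecting the optimal penalty amounts to a maximum over the five tested values of $\gamma$. The sketch below shows this for one run $(i,L,\eta)$; the function name `optimal_penalty` and the success probabilities are made up for illustration.

```python
import numpy as np

GAMMAS = np.array([0.1, 0.2, 0.3, 0.4, 0.5])  # penalty grid from the text

def optimal_penalty(p_g_by_gamma, gammas=GAMMAS):
    """Return (gamma, P_g) at the penalty value that maximizes P_g."""
    p = np.asarray(p_g_by_gamma, dtype=float)
    if p.shape != gammas.shape:
        raise ValueError("need one success probability per penalty value")
    k = int(np.argmax(p))  # first maximum if there are ties
    return float(gammas[k]), float(p[k])

# Hypothetical mean success probabilities for one (i, L, eta) run.
p_g = [0.01, 0.03, 0.05, 0.04, 0.02]
gamma_opt, p_opt = optimal_penalty(p_g)
```

Since the maximum is taken over a finite grid, the result is optimal only within the chosen range; an instance whose best value sits at $\gamma=0.5$ might benefit from an even larger penalty.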

.4 Quantum annealing correction

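In QAC we encode each logical Pauli-Z operator as a sum of $n$ such physical operators, i.e.,

$$\overline{\sigma_{i}^{z}}=\sum_{l=1}^{n}\sigma_{i_{l}}^{z},\qquad\overline{\sigma_{i}^{z}\sigma_{j}^{z}}=\sum_{l=1}^{n}\sigma_{i_{l}}^{z}\sigma_{j_{l}}^{z}. \tag{10}$$

Furthermore, we add a term to ferromagnetically couple the physical copies through an auxiliary qubit, i.e.,

$$H_{P}=-\sum_{i=1}^{N}\left(\sigma_{i_{1}}^{z}+\cdots+\sigma_{i_{n}}^{z}\right)\sigma_{i_{p}}^{z}, \tag{11}$$

where $N$ is the number of logical variables in the original optimization problem. We refer to the physical copies, $\sigma_{i_{l}}^{z}$, as data qubits and the auxiliary qubit, $\sigma_{i_{p}}^{z}$, as a penalty qubit. Thus, we arrive at the following encoding of our logical problem:

$$\overline{H}_{\mathrm{Ising},P}=\alpha\overline{H}_{\mathrm{Ising}}+\gamma H_{P}, \tag{12}$$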

where $\overline{H}_{\mathrm{Ising}}$ is the encoded version of ${H}_{\mathrm{Ising}}$ in Eq. (2a), i.e., ${\sigma_{i}^{z}}\mapsto\overline{\sigma_{i}^{z}}$ and ${\sigma_{i}^{z}\sigma_{j}^{z}}\mapsto\overline{\sigma_{i}^{z}\sigma_{j}^{z}}$ , and $\alpha$ is an overall energy scale for the problem Hamiltonian (not used in this work, but complementary to adding control noise Vinci et al. (2016)). When we add control noise to the QAC Hamiltonian, we replace $h_{i}$ by $\tilde{h}_{i}=h_{i}+\delta h_{i}$ and $J_{ij}$ by $\tilde{J}_{ij}=J_{ij}+\delta J_{ij}$ , with the noise satisfying Eq. (3).
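To make the encoding concrete, the following sketch builds the physical fields and couplers of an encoded problem from a logical one. It is a hedged illustration: the function name `encode_qac`, the dictionary representation, and the qubit labels `(i, l)` and `(i, "p")` are hypothetical. It also assumes the sign convention $H=\sum_{i}h_{i}\sigma_{i}^{z}+\sum_{i<j}J_{ij}\sigma_{i}^{z}\sigma_{j}^{z}$, so that a ferromagnetic penalty coupling enters with coefficient $-\gamma$.

```python
def encode_qac(h, J, gamma, n=3, alpha=1.0):
    """Encode a logical Ising problem with n data copies and one penalty qubit.

    h: dict {i: h_i}, with one entry per logical qubit (zero fields included).
    J: dict {(i, j): J_ij} over logical couplers.
    Returns (h_phys, J_phys) keyed by physical labels (i, l) and (i, "p").
    """
    h_phys, J_phys = {}, {}
    for i, h_i in h.items():
        for l in range(n):
            # Each data copy carries the (rescaled) logical field.
            h_phys[(i, l)] = alpha * h_i
            # Ferromagnetic penalty coupling between data copy and penalty qubit.
            J_phys[((i, l), (i, "p"))] = -gamma
    for (i, j), J_ij in J.items():
        for l in range(n):
            # Copy l of qubit i couples only to copy l of qubit j.
            J_phys[((i, l), (j, l))] = alpha * J_ij
    return h_phys, J_phys

# Two logical qubits with zero fields and one coupler (illustrative values).
h_phys, J_phys = encode_qac({0: 0.0, 1: 0.0}, {(0, 1): 0.5}, gamma=0.2)
```

For this two-qubit example the result has six data-qubit fields, six penalty couplers of strength $-0.2$, and three problem couplers of strength $0.5$.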

The current generations of D-Wave devices allow a direct implementation of this code in the Ising Hamiltonian for $n=3$ , as shown in Fig. 9, but are unable to encode the driver Hamiltonian $H_{X}$ , as this requires many-body terms of the form $(\sigma^{x})^{\otimes n}$ . Thus, increasing the penalty strength, $\gamma$ , begins to diminish the effect of the quantum fluctuations that drive quantum annealing. On the other hand, larger $\gamma$ values are more able to suppress bit flip errors. Thus, there exists an optimal value of $\gamma$ which depends on the spectrum of the problem instance Pudenz et al. (2014, 2015); Vinci et al. (2015); Mishra et al. (2015). This optimization is further discussed in Methods Section .1.

After annealing, we obtain a state vector where each data qubit is measured in the computational basis. From this, we can obtain a state vector of logical qubits via a variety of decoding strategies Vinci et al. (2015). In this work, we exclusively used the strategy in which each logical qubit is decoded by a majority vote of its constituent data qubits.
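Majority-vote decoding can be sketched as follows, assuming the readout is arranged as an array with one row per logical qubit and one column per data qubit, with spin values $\pm 1$. The function name `majority_vote` and the array layout are assumptions; penalty qubits are simply left out of the array.

```python
import numpy as np

def majority_vote(data_spins):
    """Decode an (N, n) array of +-1 data-qubit readouts into N logical spins."""
    spins = np.asarray(data_spins)
    if spins.ndim != 2 or spins.shape[1] % 2 == 0:
        raise ValueError("expected shape (N, n) with odd n so votes cannot tie")
    # The sign of the row sum is the majority value for odd n.
    return np.where(spins.sum(axis=1) > 0, 1, -1)

# Three logical qubits with n = 3 data qubits each; one bit flip in rows 0 and 1.
readout = [[1, 1, -1],
           [-1, -1, 1],
           [1, 1, 1]]
logical = majority_vote(readout)  # array([ 1, -1,  1])
```

With $n=3$ a single bit flip per logical qubit is corrected, while two or more flips within the same logical qubit produce a logical error.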

.5 Data collapse

Here we explain our procedure for identifying the optimal fit and data collapse function, and for extracting confidence intervals (C.I.’s) and error bars. We considered trial TTS functions of the form $f(L,\eta)=10^{g_{i}(L,\eta)}$ , with:

[TABLE]

For $g_{3}$ we focused on the three cases $\{d_{1}=1/2,d_{2}=2\}$ , $\{d_{1}=d,d_{2}=2\}$ , $\{d_{1}=1/2,d_{2}=d\}$ . Thus our trial functions had either four ( $\{a,b,c,d\}$ ) or five ( $\{a,b,c,d,e\}$ ) free fitting parameters. For each trial function we computed non-linear least-square fits to the median TTS data for C on $L\in\{2,\dots,12\}$ , and QAC on $L\in\{2,\dots,16\}$ . The fitting parameters were initially allowed to take any values. However, we only accepted fits with $a\geq 0$ in $g_{3}$ , since $a<0$ (scaling that decreases with $L$ ) would have to reflect overfitting. Thus, we also computed fits where we squared all the fitting parameters [i.e., replaced $a$ by $a^{2}$ in Eq. 13, etc.] in order to enforce positivity. Furthermore, we tested if the discrepancy between the ideal and actual number of Chimera graph couplers made a difference by fitting with an effective $L$ ; see Appendix C for details. Thus, for each trial function there were four different methods: unconstrained/squared fitting parameters with $L$ /effective $L$ . Lastly, all fits were attempted with each of the optimization methods possible in Mathematica: SimulatedAnnealing, RandomSearch, NelderMead, and DifferentialEvolution.

Across the different methods and optimization algorithms used, $g_{1}$ was consistently the best of the $4$ -parameter fits and was always very close to $g_{2}$ , which is its $5$ -parameter generalization. The $g_{3}$ functions always resulted either in $a<0$ or otherwise a very poor fit. Parameter squaring also improved the fit quality, and of the optimization methods only NelderMead tended to give inferior results.

After determining the three parameters $\{a,b,c\}$ for the median TTS data for $g_{1}$ , we found least-squares fits to the upper and lower bounds determined by the $95\%$ C.I.’s for the median TTS, by using the same set $\{a,b,c\}$ and letting only $d$ be a free parameter. In this manner we found $d_{-}$ and $d_{+}$ , the exponents that provide respective lower and upper bounds on $d$ for the median TTS data. In turn, $d_{-}$ and $d_{+}$ have associated $95\%$ C.I.’s, denoted $\Delta d_{-}$ and $\Delta d_{+}$ . The reported range of $d$ in Table 1 is then $[d_{-}-\Delta d_{-},d_{+}+\Delta d_{+}]$ . The resulting fits for each $\eta$ are shown in Section B.3.

Data availability

All raw data is available upon reasonable request from the authors. A Mathematica notebook containing the TTS results, analysis scripts, error analysis, and our detailed fitting and data collapse results is available Pearson and Lidar (2019).

Appendix A Chimera graphs of the D-Wave devices used in this work

The Chimera graphs of the DW2X and DW2000Q devices we used are shown in Fig. 10. In both graphs, green (red) circles denote operational (inactive) physical qubits. The lines denote the possible coupling between the operational physical qubits. Minor embedding of the ${[3,1,3]}_{1}$ code leads to the logical graphs shown in Fig. 9.

Appendix B Additional results

B.1 Fraction of failures of both QAC and C

Success probability drops as more noise is added and problem size grows. Fig. 11 shows the fraction of instances where neither QAC nor C found the ground state. This figure complements Fig. 5, which includes all other instances, and shows that QAC improves upon C for sufficiently large values of $L$ and $\eta$ .

B.2 TTS Scaling with Size and Hardness

Figure 12 shows the same data as in Fig. 6 when we combine all the noise realizations at each size $L$ and plot different percentiles of hardness. QAC improves the scaling at all percentiles, from the easiest instances at the $10^{\mathrm{th}}$ percentile to the hardest instances at the $90^{\mathrm{th}}$ percentile.

B.3 Data collapse of TTS with error bars

Figure 13 shows the results of the fits computed for the data collapse procedure described in Methods Section .5, including the error bars.

Appendix C Difference Between the DW2X and DW2000Q

C.1 Coupler counts

Since the DW2X and DW2000Q are different generations of the D-Wave devices, differing in both structure (see Fig. 10) and noise characteristics, performance differences are to be expected. In particular, the DW2X used has $1098$ functional qubits out of $1152$ , and the DW2000Q used has $2031$ functional qubits out of $2048$ . This leads to differences in the logical problems embeddable on each device, as we now explain in detail.

The couplings in the logical graphs of the ${[3,1,3]}_{1}$ code differ between the two devices, as seen in Fig. 9. Without any holes (missing physical qubits), in the logical graph each unit cell would be reduced to two logical qubits. These logical qubits would have one coupling between them, contributing a total of $L^{2}$ couplings to the logical problem. Furthermore, each unit cell would have one coupling to the unit cell below it, contributing another $L^{2}$ , except that the last row of unit cells has no unit cells below it to connect to, so we have over-counted by $L$ couplings. The same analysis applies to the couplings to the right of each unit cell, contributing another $L^{2}-L$ couplings. Thus, the total number of couplings in the ideal graph is $L^{2}+2(L^{2}-L)=L(3L-2)$ . However, each hole in the physical graph contributes to the holes in the logical graph, removing some number of active couplers.
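The counting argument above can be checked numerically. The sketch below tallies the three contributions separately and compares the total with the closed form $L(3L-2)$; the function name `ideal_logical_couplers` is illustrative.

```python
def ideal_logical_couplers(L):
    """Number of couplers in the hole-free logical graph of an L x L Chimera graph."""
    intra_cell = L * L      # one coupler inside each unit cell
    vertical = L * L - L    # to the cell below; the last row has none
    horizontal = L * L - L  # to the cell on the right; the last column has none
    return intra_cell + vertical + horizontal

# Largest sizes used on each device.
dw2x_ideal = ideal_logical_couplers(12)     # 408
dw2000q_ideal = ideal_logical_couplers(16)  # 736
```

These are upper limits: every inactive physical qubit removes some of these couplers from the logical graph that is actually embedded.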

The difference between the ideal and actual number of couplers is shown in Fig. 14(a). As can be seen, there is a sudden jump from the DW2X to the DW2000Q in terms of number of couplers. Since the problem instances used in this work involve adding noise to each coupler present, this implies that at equal $(L,\eta)$ the problems solved on the DW2000Q are expected to be somewhat harder than on the DW2X.

C.2 The no added noise case

The results discussed in the main text excluded the $\eta=0$ results for $L\geq 13$ . We now explain why, and provide an analysis focused on the performance in the case of no added noise.

As discussed in the main text in Section II.1, there is an intrinsic level of control noise. When $\eta=0$, only this noise plays a role. We show the TTS for both C and QAC in the $\eta=0$ case in Fig. 14(b). As can be seen, there is a sudden change in C’s performance when we switch from the DW2X to the DW2000Q at $L=13$. This is consistent with there being less noise on the later generation machine, the DW2000Q. However, the latter also has a larger fraction of active qubits, which, as discussed in Section C.1, yields a higher count of couplers involved in the problem instances for $L\geq 13$ [Fig. 14(a)]. Since the physical implementation of QAC uses four times more couplers than the C strategy, QAC should be much more affected by noise due to this jump in coupler count. Thus, for C the lower intrinsic noise dominates, while the smooth behavior seen for QAC is likely due to a cancellation of the lower intrinsic noise with the higher noise due to the higher coupler count. This argument explains why C exhibits a discontinuity in its TTS between $L=12$ and $L=13$, and why QAC appears to transition smoothly from the DW2X to the DW2000Q.

Furthermore, this difference in physical implementation also implies that QAC will be more sensitive to coupler crosstalk effects than C. Indeed, the harmful effect of crosstalk can be seen in Fig. 15, in which a theoretical $3$-copies repetition code outperforms the physical implementation for every instance. Thus, the jump in coupler count from the DW2X to the DW2000Q would introduce significant crosstalk that does not harm this implementation of C. The benefits of a less noisy machine are thus countered by the crosstalk associated with the sharp increase in physical coupler count for QAC, while the idealized form of C simply benefits from less noise.

Appendix D Optimal penalty strength

Fig. 16 shows histograms of the optimal penalty strength defined in Eq. 9. As can be seen, as we increase the amount of noise added, the optimal penalty strength increases. This ought to be expected, since a noisier problem will require more error suppression. The DW2000Q histograms (i.e., $L\geq 13$ ) tend slightly more towards the larger penalty strengths once we have added enough noise. This is consistent with the discussion in Appendix C about why the instances on the DW2000Q have more couplings for a given grid size.

Bibliography

1. Shor (1997): P. Shor, SIAM Journal on Computing 26, 1484 (1997).
2. Bravyi et al. (2018): S. Bravyi, D. Gosset, and R. König, Science 362, 308 (2018).
3. Barends et al. (2014): R. Barends, J. Kelly, A. Megrant, A. Veitia, D. Sank, E. Jeffrey, T. C. White, J. Mutus, A. G. Fowler, B. Campbell, Y. Chen, Z. Chen, B. Chiaro, A. Dunsworth, C. Neill, P. O’Malley, P. Roushan, A. Vainsencher, J. Wenner, A. N. Korotkov, A. N. Cleland, and J. M. Martinis, Nature 508, 500 (2014).
4. Landauer (1995): R. Landauer, Proc. R. Soc. London Ser. A 353, 367 (1995).
5. Shor (1996): P. W. Shor, Proceedings of 37th Conference on Foundations of Computer Science, 56 (1996).
6. Aliferis et al. (2006): P. Aliferis, D. Gottesman, and J. Preskill, Quantum Inf. Comput. 6, 97 (2006).
7. Chao and Reichardt (2018): R. Chao and B. W. Reichardt, npj Quantum Information 4, 42 (2018).
8. Lidar and Brun (2013): D. Lidar and T. Brun, eds., Quantum Error Correction (Cambridge University Press, Cambridge, UK, 2013).