A Quantum Solution for Efficient Use of Symmetries in the Simulation of Many-Body Systems

Albert T. Schmitz; Sonika Johri

arXiv:1902.08625·quant-ph·April 17, 2019

TL;DR

This paper introduces a quantum algorithm leveraging Grover's search to efficiently identify symmetry-adapted basis states in many-body Hamiltonian simulations, offering exponential memory savings and quadratic speedup over classical methods.

Contribution

The paper presents a novel quantum approach using Grover's minimization to find symmetry group representatives, reducing memory and computational complexity in many-body system simulations.

Findings

1. Quantum method achieves exponential memory reduction.

2. Quadratic speedup over classical algorithms.

3. Error mitigation scheme improves robustness without extra qubits.

Abstract

A many-body Hamiltonian can be block-diagonalized by expressing it in terms of symmetry-adapted basis states. Finding the group orbit representatives of these basis states and their corresponding symmetries is currently a memory/computational bottleneck on classical computers during exact diagonalization. We apply Grover's search in the form of a minimization procedure to solve this problem. Our quantum solution provides an exponential reduction in memory, and a quadratic speedup in time over classical methods. We discuss explicitly the full circuit implementation of Grover minimization as applied to this problem, finding that the oracle only scales as polylog in the size of the group, which acts as the search space. Further, we design an error mitigation scheme that, with no additional qubits, reduces the impact of bit-flip errors on the computation, with the magnitude of mitigation…

Tables

Table 1: List of the different methods for solving the group representative problem. We generally expect that $\log|G|\ll\log|V|\leq|G|\ll|V|$. $C(G)$ is the classical cost to calculate the action of $G$ on an arbitrary $v$, while $\mathcal{C}(G)$ is the quantum cost to calculate the action of $G$ on an arbitrary $v$.

Method	Classical memory	Quantum memory	Time
Look-up	$\mathcal{O}(|V|)$	0	$\mathcal{O}(\log|V|)$
On-the-fly	$\mathcal{O}(1)$	0	$\mathcal{O}(|G|\,C(G))$
Divide & conquer	$\mathcal{O}(\sqrt{|V|})$	0	$\mathcal{O}(|G|\,C(G)\log|V|)$ (smaller constant coefficient than on-the-fly)
Gmin	$\mathcal{O}(1)$	$\mathcal{O}(2\log|V|+\log|G|)$	$\mathcal{O}(\sqrt{|G|}\,(\mathcal{C}(\hat{G})+\text{polylog}(|V|)))$

Table 2: Truth table used to form PhComp

$a_{i}$	$b_{i}$	$\text{(continue)}_{i}$	$\text{(apply phase)}_{i}$
0	0	1	0
0	1	0	1
1	0	0	0
1	1	1	0

Equations

$$[H,g]=0.$$

$$\ket{v_{\alpha}}\propto\sum_{g\in G}\chi_{\alpha}^{*}(g)\,g\ket{v},$$

$$\bra{\tilde{v}_{\alpha}}H\ket{\tilde{u}_{\alpha}}=\frac{1}{|G|}\sum_{g_{1},g_{2}\in G}\chi_{\alpha}(g_{1})\chi_{\alpha}^{*}(g_{2})\bra{\tilde{v}}g_{1}^{-1}Hg_{2}\ket{\tilde{u}}=\dots$$

$$\text{Oracle}\ket{x}=\begin{cases}-\ket{x} & \text{if } x \text{ is marked,}\\ \ket{x} & \text{otherwise.}\end{cases}$$

$$\hat{G}\ket{x}\ket{v}=\ket{x}\ket{g(x)v}.$$

$$\text{PhComp}\ket{a}\ket{b}=\begin{cases}-\ket{a}\ket{b} & \text{if } a<b,\\ \ket{a}\ket{b} & \text{otherwise.}\end{cases}$$

$$U_{s}=I-2\ket{s}\bra{s}=V\left(I-2\ket{0}\bra{0}\right)V^{\dagger},$$

$$\hat{g}\ket{v}=\ket{gv}.$$

$$\mathcal{C}(\hat{G})\sim\mathcal{C}(\hat{g})\sum_{k=1}^{\log|G|}2^{k}\sim\mathcal{C}(\hat{g})|G|,$$

$$\hat{G}_{\text{add}}^{N}\ket{x}\ket{y}=\ket{x}\ket{x+y},$$

$$P_{\text{success}}\sim 1-\exp\left(-\frac{T^{2}}{a^{2}N}\right),$$

$$\alpha\sim a\sqrt{-\ln\epsilon}.$$

$$\frac{1}{a_{\text{eff}}\sqrt{N}}=\text{avg}\left(\text{diff}_{T}\left(\sqrt{-\ln(1-P_{\text{success}})}\right)\right)$$

$$\delta a_{\text{eff}}=\sqrt{\frac{N}{M}}\,\sigma_{\text{eff}}\,a_{\text{eff}}^{2},$$

$$\langle t\rangle\sim\frac{\delta}{4}\,\mathcal{C}(\text{Grov})\sqrt{N},$$

$$P\sim\frac{\delta^{2}}{1+\delta^{2}}\left(\frac{1-\sigma^{p}}{2}+\sigma^{p}\sin^{2}((2p+1)\theta)\right),$$

$$U_{\text{error}}=\exp\left(iv_{x}X+iv_{y}Y+iv_{z}Z\right),$$

$$\frac{1}{T_{\phi}}=\frac{1}{T_{2}}-\frac{1}{2T_{1}}.$$

$$\sum_{k=1}^{N-1}\frac{1}{k+1}\left(\frac{9}{4}\frac{N}{\sqrt{k(N-k)}}\right)=\frac{9N}{4}\left(\frac{1}{2\sqrt{N-1}}+\sum_{k=2}^{N-1}\frac{1}{k+1}\frac{1}{\sqrt{(N-k)k}}\right).$$

$$\sum_{k=2}^{N-1}\frac{1}{k+1}\frac{1}{\sqrt{(N-k)k}}<\sum_{k=2}^{N-1}\frac{1}{k}\frac{1}{\sqrt{(N-k)k}}<\int_{1}^{N-1}\frac{dk}{k}\frac{1}{\sqrt{(N-k)k}}=\left[-\frac{2}{N}\sqrt{\frac{N-k}{k}}\right]_{1}^{N-1}=\frac{2}{N}\sqrt{N-1}\left(1-\frac{1}{N-1}\right)$$

$$\frac{9N}{8\sqrt{N-1}}+\frac{9}{2}\sqrt{N-1}\sim\frac{45}{8}\sqrt{N}.$$

$$\tilde{G}(\rho)=\sigma G(\rho)+(1-\sigma)E(\rho),$$

$$\tilde{G}_{\text{AEM}}(\rho)=\tilde{G}P_{C}(\rho)+P_{E}(\rho).$$

$$P_{C}E(\rho)=\dots$$

$$P_{E}E(\rho)=\dots$$

$$\rho_{\text{init}}=\ket{s}\bra{s}\otimes\ket{v}\bra{v}\otimes\ket{v_{\text{best}}}\bra{v_{\text{best}}},$$

Full text

A Quantum Solution for Efficient Use of Symmetries in the Simulation of Many-Body Systems

Albert T. Schmitz

Department of Physics and Center for Theory of Quantum Matter, University of Colorado, Boulder, Colorado 80309, USA

Intel Labs, Intel Corporation, Hillsboro, Oregon 97124, USA

Sonika Johri

Intel Labs, Intel Corporation, Hillsboro, Oregon 97124, USA

Abstract

A many-body Hamiltonian can be block-diagonalized by expressing it in terms of symmetry-adapted basis states. Finding the group orbit representatives of these basis states and their corresponding symmetries is currently a memory/computational bottleneck on classical computers during exact diagonalization. We apply Grover’s search in the form of a minimization procedure to solve this problem. Our quantum solution provides an exponential reduction in memory, and a quadratic speedup in time over classical methods. We discuss explicitly the full circuit implementation of Grover minimization as applied to this problem, finding that the oracle only scales as polylog in the size of the group, which acts as the search space. Further, we design an error mitigation scheme that, with no additional qubits, reduces the impact of bit-flip errors on the computation, with the magnitude of mitigation directly correlated with the error rate, improving the utility of the algorithm in the Noisy Intermediate Scale Quantum era.

I Introduction

As several quantum computing platforms become available for general use, finding practical applications for quantum computers is a key driver for the development and adoption of quantum computing technology. Additionally, since the field is expected to remain in the Noisy Intermediate Scale Quantum (NISQ) era [Preskill2018] for the next few decades, designing error mitigation strategies for these algorithms is essential. In this paper, we identify a new application for quantum computers and show how the algorithm should be implemented in the NISQ era.

Much of the excitement around quantum computing started with the introduction of two algorithms: Shor’s factorization algorithm [Shor1997] and Grover’s search algorithm [Grover1996]. Though the former represents the paradigmatic example of quantum speed-up, the latter has been criticized as often only nominally showing speed-up. The criticism stems from the fact that although the oracle-query scaling is polynomially reduced, any quantum oracle which contains all the information of the database must scale with the size of the database [Mateus2005]. This suggests we must look to problems where the oracle in Grover’s search can be applied efficiently, treating it as a means to invert a Boolean function.

Dürr and Høyer [Durr1996] suggested a use for Grover’s algorithm as a method to find the minimal element of a database. The general idea is to hold the best-known minimum value and search for a member less than that. If a better value is found, the best-known value is updated and the process is repeated for a set number of oracle calls. Even assuming the oracle can be efficiently implemented, such a process might not be ideal in all cases, as it still scales exponentially compared to approximation schemes such as adiabatic evolution and related minimization processes such as the quantum approximate optimization algorithm (QAOA) [Farhi2014]. However, as the names suggest, these are only approximate methods. Furthermore, adiabatic evolution is sensitive to phase transitions due to a closing gap, and QAOA may require significant classical computational overhead. These limitations ultimately stem from the fact that such methods are sensitive to not just order, but also ‘distance.’ Grover minimization (Gmin), on the other hand, depends only on the order. It treats the minimum the same whether it is separated from the next largest value by 1 or 100. This suggests that in special cases where an exact minimum is required or where we wish to ignore distance (or there is no notion of distance), Gmin is a good alternative.

We present one such problem which occurs in the simulation of strongly-correlated materials or quantum chemistry problems where one might perform an exact diagonalization calculation. A many-body Hamiltonian often contains several symmetries which might represent spin symmetries, translation symmetry or various other discrete point-symmetries such as an $n$-fold rotation or reflection. Collectively, these symmetries can be formalized as a discrete group. One can leverage these symmetries by using group representation theory to block-diagonalize the full Hamiltonian [Tinkham2003] in a symmetry-adapted basis, making the remaining diagonalization computationally cheaper.

However, to calculate the block-diagonal matrix elements, each of the original basis states must be associated with an orbit representative. For convenience, all basis states are labeled with a single integer value, and the orbit representative is defined as the element with the smallest integer label. One must also know the group operator connecting a basis state to its representative [Wietek2018]. For large systems, the Hamiltonian cannot be stored explicitly but is calculated on-the-fly during diagonalization, which means the matrix elements need to be computed over and over during the computation. For this, one has to either store the representative corresponding to each element in the original basis explicitly, which becomes costly in terms of memory, or calculate them on-the-fly. Thus, finding the orbit representative has become a serious bottleneck for using symmetries in exact diagonalization problems. For large spin systems, special-purpose hardware such as FPGAs has been considered to ease this bottleneck [FPGA]. In some cases where distributed memory systems are used for the diagonalization calculation, the overhead of the symmetry-adapted basis is so large that the authors abandon the symmetry-based approach altogether [Lauchli2011]. A technique for addressing this bottleneck for spin systems with translational symmetry is proposed in Refs. Weibe2013; Wietek2018 using a divide & conquer method based upon sub-lattice coding. This splits the costs between memory and computational time, but only reduces the time by a constant factor, and the memory by a polynomial amount.

In this paper, we consider the use of Gmin for this problem, which results in a quadratic speed-up over the classical algorithms, and requires virtually no classical memory and relatively little quantum memory. We improve upon the textbook version of Gmin to optimize the number of oracle calls and reduce the number of qubits required to implement the oracle. Furthermore, we show that for many reasonable problem instances, the oracle cost is polylog in the size of the group and the dimension of the Hamiltonian’s Hilbert space, assuming the group action generators can be efficiently simulated on a quantum computer, making this a practical use for Grover’s algorithm. We consider the full circuit implementation for a benchmark case as well as the effects of error on the performance of the algorithm. Our error-mitigation scheme, based on real-time post-selection on measurement results between coherent steps of the algorithm, represents a near-term use for pre-fault-tolerant quantum computing. Furthermore, using Gmin as a sub-routine in classical exact diagonalization is an example of the power of interfacing quantum and classical machines for hybrid algorithms. Alternatively, we envision that this algorithm could also be used as a sub-routine which generates the matrix entries of a larger quantum algorithm using symmetry-adapted basis states to simulate a strongly-correlated quantum system.

The remainder of the paper is structured as follows: In Section II, we introduce the problem of finding the orbit representative, give an overview of the existing classical solutions, and then describe in detail our quantum algorithm, including the full circuit description and an analysis of the running time of the algorithm. Section III shows results from the simulation on the Intel Quantum Simulator [qHIP]. Section IV discusses our error mitigation strategies in the presence of noise and their numerical simulation. We conclude in Section V.

II Overview of the Problem and the Quantum Solution

We first briefly review symmetry-adapted basis states and how their matrix elements are calculated, following Refs. Tinkham2003; Wietek2018. For a given many-body problem instance, let $H$ be the Hamiltonian. We then characterize its symmetries by operators $g\in G$, such that

$$[H,g]=0. \tag{1}$$

From the group and its associated representation theory, we define the symmetry-adapted basis states as

$$\ket{v_{\alpha}}\propto\sum_{g\in G}\chi_{\alpha}^{*}(g)\,g\ket{v}, \tag{2}$$

where $\alpha$ indexes some one-dimensional representation of $G$ , $\chi_{\alpha}(g)$ is the character of the $\alpha^{th}$ representation evaluated at $g$ and $\ket{v}$ are the original “position” basis states, such that the action of $g$ on the basis states is $g\ket{v}=\ket{gv}$ . One can see that two symmetry-adapted basis states $\ket{v_{\alpha}},\ket{u_{\alpha}}$ are equal (once normalized) so long as $\ket{u}\in\text{orbit}(\ket{v})$ , where $\text{orbit}(\ket{v})$ is the set of all basis elements connected to $\ket{v}$ by a group element. Therefore each block of the Hamiltonian in this basis is characterized by just the representation index, with the states in each block represented by unique orbits, so we can choose a single representative $\ket{\tilde{v}}$ for each orbit. For simplicity, let’s assume the group action is free, which is to say $gv=v$ if and only if $g$ is the identity element. Then all states have the same normalization constant up to phase, $\mathcal{N}$ . Since $\sum_{g\in G}\chi_{\alpha}(g)\chi_{\alpha}^{*}(g)=|G|$ , we find that $\mathcal{N}=\frac{1}{\sqrt{|G|}}$ . We can now calculate the matrix elements of $H$ for a given block via

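$$\bra{\tilde{v}_{\alpha}}H\ket{\tilde{u}_{\alpha}}=\frac{1}{|G|}\sum_{g_{1},g_{2}\in G}\chi_{\alpha}(g_{1})\chi_{\alpha}^{*}(g_{2})\bra{\tilde{v}}g_{1}^{-1}Hg_{2}\ket{\tilde{u}}=\dots \tag{3}$$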

where we have used the fact that all members of $G$ commute with the Hamiltonian and that $\chi_{\alpha}(g)$ is a one-dimensional representation of the group, and we define $g_{v}$ such that $g_{v}v=\tilde{v}$. As we can see, one needs $g_{v}$ to calculate the appropriate character. In practice, one calculates the action of $H$ on the representative state $\ket{\tilde{u}}$, then sorts all coefficients of the resulting vector according to the orbits to form the appropriately weighted sum for each orbit. If the group action is not free, then one also has to calculate and store the normalization factors which also enter the sum.
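The intermediate steps of this matrix-element calculation can be sketched as follows. This is a reconstruction under the assumptions stated in the text (free group action, unitary $g$, one-dimensional representation so that $|\chi_{\alpha}(g)|=1$), not a quotation of the original derivation; the substitution $g=g_{1}^{-1}g_{2}$ is our own bookkeeping choice.

```latex
\begin{align}
\bra{\tilde{v}_{\alpha}} H \ket{\tilde{u}_{\alpha}}
 &= \frac{1}{|G|}\sum_{g_{1},g_{2}\in G}
    \chi_{\alpha}(g_{1})\chi_{\alpha}^{*}(g_{2})
    \bra{\tilde{v}} g_{1}^{-1} H g_{2} \ket{\tilde{u}} \\
 &= \frac{1}{|G|}\sum_{g_{1},g\in G}
    \chi_{\alpha}(g_{1})\chi_{\alpha}^{*}(g_{1}g)
    \bra{\tilde{v}} H g \ket{\tilde{u}}
    && \text{using } [H,g_{1}]=0,\ g=g_{1}^{-1}g_{2} \\
 &= \sum_{g\in G}\chi_{\alpha}^{*}(g)
    \bra{\tilde{v}} H g \ket{\tilde{u}}
    && \text{using } \chi_{\alpha}(g_{1}g)=\chi_{\alpha}(g_{1})\chi_{\alpha}(g),\ |\chi_{\alpha}(g_{1})|=1 \\
 &= \sum_{v\in\text{orbit}(\tilde{v})}\chi_{\alpha}^{*}(g_{v})
    \bra{v} H \ket{\tilde{u}}
    && \text{using } \bra{\tilde{v}}g=\bra{g^{-1}\tilde{v}},\ g_{v}v=\tilde{v}
\end{align}
```

In the last line, freeness of the action gives a one-to-one correspondence between group elements $g$ and orbit members $v=g^{-1}\tilde{v}$, which is why the connecting element $g_{v}$ is needed for every basis state that $H\ket{\tilde{u}}$ touches.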

In the rest of this section, we describe the problem of finding the group orbit representative with some comments on the classical methods which are used to solve it. We then propose a quantum method based on Gmin which exponentially reduces the memory cost while yielding a quadratic reduction in computational time.

II.1 Orbit Representative Problem Statement

With the above motivation, we formally state the orbit representative problem.

Problem statement: suppose we have some finite group $G$ with a group action $G\times V\to V$ such that $(g,v)\mapsto gv$. We shall refer to $V$ as the position set and its members as positions, though they may not correspond to physical position, but rather index some basis set for a Hamiltonian’s Hilbert space. Furthermore, we have some function $\text{int}(v)$ which totally orders the set $V$. We assume int maps to the integer value used to label $v$ (footnote 1: what follows could be mapped to more exotic orderings, including partial orders, if the phase comparator discussed below can be generalized to the given ordering efficiently). Define the orbit of a position $\text{orbit}(v)=\{gv:\text{for all }g\in G\}$, which is represented by $\tilde{v}\in\text{orbit}(v)$ such that for all $u\in\text{orbit}(v)$, $\text{int}(\tilde{v})\leq\text{int}(u)$, i.e. it is the smallest element.

Given a member $v\in V$, find the orbit representative $\tilde{v}$ as well as the group element which gives that representative, i.e. find $g_{v}$ such that $g_{v}v=\tilde{v}$.

Note that based on the application of this problem from the last section, a near-minimum value for the orbit representative is not sufficient; we need the exact minimum. In the general case, one expects that $\log|G|\ll\log|V|\leq|G|\ll|V|$ . Table 1 gives a list of the solutions to this problem including Gmin and compares the costs. We denote the classical time complexity cost of computing the group action on an arbitrary member of $V$ by $C(G)$ and in general, the quantum time-complexity cost of implementing an operator $A$ on a quantum computer as $\mathcal{C}(A)$ .

II.2 Classical Solutions

There are three classical means of addressing this problem:

1. Look-up: Store orbit representatives corresponding to every element in $V$ and connecting group elements in a look-up table. This can then be efficiently searched when needed, but it requires $\mathcal{O}(|V|)$ amount of memory.

2. On-the-fly: When needed, calculate the full orbit to find the smallest element and the connecting group element. This is efficient in terms of memory, but the computation scales as $\mathcal{O}(|G|)$.

3. Divide & conquer: There exist sub-lattice coding methods [Wietek2018], which allow one to split the costs between memory and computation (see Table 1 for these costs).
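The first two classical strategies can be illustrated with a minimal Python sketch. The group used here (cyclic rotations of an `n_bits`-bit label, as for translations of a small spin chain) and all function names are assumptions chosen for illustration; they are not taken from the paper. The sketch also assumes that group index 0 is the identity.

```python
from typing import Callable, List, Tuple


def rotate_left(v: int, shift: int, n_bits: int) -> int:
    """Hypothetical group action: cyclically rotate an n_bits-bit label."""
    shift %= n_bits
    mask = (1 << n_bits) - 1
    return ((v << shift) | (v >> (n_bits - shift))) & mask


def representative_on_the_fly(
    v: int, group_size: int, action: Callable[[int, int], int]
) -> Tuple[int, int]:
    """Scan the whole orbit: O(1) memory, O(|G|) group actions.

    Returns (representative, index x of the group element with g(x) v = rep).
    """
    best_v, best_x = action(0, v), 0
    for x in range(1, group_size):
        gv = action(x, v)
        if gv < best_v:
            best_v, best_x = gv, x
    return best_v, best_x


def build_lookup(
    num_positions: int, group_size: int, action: Callable[[int, int], int]
) -> List[Tuple[int, int]]:
    """Precompute every answer: O(|V|) memory, constant-time queries."""
    return [
        representative_on_the_fly(v, group_size, action)
        for v in range(num_positions)
    ]


n_bits = 4
action = lambda x, v: rotate_left(v, x, n_bits)
print(representative_on_the_fly(0b0110, n_bits, action))
table = build_lookup(1 << n_bits, n_bits, action)
print(table[0b1001])
```

The look-up table trades the per-query orbit scan for storage that grows with $|V|$, which is the memory cost that becomes prohibitive for large systems.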

While the divide & conquer method represents a significant reduction in the resources needed, this bottleneck can still be prohibitively expensive. To the best of our knowledge, no one has considered using quantum methods to solve this problem; we discuss such a method in the next section.

II.3 Overview of the Grover Minimization Algorithm

In this section we look to use the Gmin algorithm to solve the problem. We first review the algorithm as given in Ref. Durr1996 and then adapt it for this problem which includes modifications to optimize the memory and time costs.

Gmin utilizes the function $f_{v}:G\to V$ such that $g\mapsto f_{v}(g)=\text{int}(gv)$ acting on an unsorted database of $|G|$ items; $g$ acts as an index and we want to find the index which points to the smallest value in $f_{v}$. To encode the group, we also introduce an index on the group elements $g:\mathbb{N}_{<|G|}\to G$ such that $x\mapsto g(x)$ (footnote 2: for notational convenience, we identify $f_{v}$ with $f_{v}\circ g$ and mean the latter throughout the remainder of the paper). Then the number of bits (qubits) needed to index all members of the group is $m=\mathcal{O}(\log|G|)$. The original algorithm proceeds as follows:

Let $\alpha$ be some real, positive number which we refer to as the oracle budget parameter, from which we define $\alpha\sqrt{|G|}$ as the oracle budget. Using two quantum registers each of size $m$ (referred to as the group registers), choose an index $0\leq y\leq|G|-1$ randomly, and repeat the following, using no more than $\alpha\sqrt{|G|}$ Grover steps:

1. Initialize the two registers in the state $\left(\frac{1}{\sqrt{|G|}}\sum_{x}\ket{x}\right)\ket{y}$,

2. Mark all $x$ in the first register such that $f_{v}(x)<f_{v}(y)$,

3. Apply a “Grover search with an unknown number of marked elements” (Gsun) [Boyer1998] to the first register, and

4. Measure the first register with outcome $y^{\prime}$; if $f_{v}(y^{\prime})<f_{v}(y)$, $y\leftarrow y^{\prime}$.

It is argued in the reference that for $\alpha=\frac{45}{2}$ , the second register holds the minimum value with a probability of at least $50\%$ . Below, we discuss how to relate the success rate and $\alpha$ using numerical methods. Appendix A gives a modified analytic derivation such that one finds a better value of $\alpha=\frac{45}{8}$ to achieve a success rate of at least $50\%$ .

To make this algorithm more explicit, we must address how to implement the second and third steps, which is equivalent to a method for implementing Gsun and its oracle. In general for Grover search, if the number of marked elements is known, one can apply the exact number of Grover steps to reach one of the marked states with high probability. However, this probability is not monotonic with the number of oracle calls. One can “overshoot” the target state and reduce the probability of reaching the answer with additional oracle calls. Thus, not knowing the number of marked elements could be problematic if we don’t include some additional procedures. We refer those unfamiliar with Grover’s search algorithm to Refs. Grover1996; Nielsenbook for details. Ref. Boyer1998 provides a solution given by Gsun. Gsun iterates the search and randomly draws the number of Grover steps from a running interval. Those authors prove that the probability of selecting a marked element is asymptotically bounded below by $\frac{1}{4}$, thus ensuring we can find a marked element with probability greater than 50% after a number of oracle calls that still scales as $\sqrt{|G|}$.
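The overshoot effect can be made concrete with the standard Grover success probability $\sin^{2}((2k+1)\theta)$, where $\sin\theta=\sqrt{M/N}$ for $M$ marked elements out of $N$ and $k$ Grover steps. The sketch below uses illustrative values ($N=1024$, $M=1$) that are our own choice, not values from the paper.

```python
import math


def grover_success_probability(num_marked: int, size: int, steps: int) -> float:
    """Probability of measuring a marked element after `steps` Grover steps."""
    theta = math.asin(math.sqrt(num_marked / size))
    return math.sin((2 * steps + 1) * theta) ** 2


size, num_marked = 1024, 1
probabilities = [
    grover_success_probability(num_marked, size, k) for k in range(51)
]
# Step count with the highest success probability in the scanned range.
best_steps = max(range(51), key=lambda k: probabilities[k])

print("best number of steps:", best_steps)
print("P at best:", probabilities[best_steps])
print("P at twice the best:", probabilities[2 * best_steps])
```

Running twice as many steps as the optimum brings the success probability back down close to zero, which is why Gsun randomizes the step count when the number of marked elements is unknown.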

To mark elements as in step two, we must define the oracle. According to Refs. Boyer1998 ; Grover1996 , marking an element means the oracle produces the action on any computational basis state $\ket{x}$ ,

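$$\text{Oracle}\ket{x}=\begin{cases}-\ket{x} & \text{if } x \text{ is marked,}\\ \ket{x} & \text{otherwise.}\end{cases} \tag{4}$$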

Note the second step requires we calculate $f_{v}(x)$ and $f_{v}(y)$ which implies we also require quantum registers to hold these values. There may exist multiple methods for implementing such an oracle, but the simplest and perhaps cheapest method for our problem is to further hold the value $v$ in a quantum register of size $n=\mathcal{O}(\log|V|)$ which we refer to as the first position register. Furthermore, we replace the second group register with a second position register of size $n$ . So our method is not to store the best-known value for the group index ( $y$ in the above algorithm) as was done in previous implementations of Grover minimization, but rather store $\tilde{v}_{\text{best}}=f_{v}(y)$ in a quantum register. $y$ can then be stored classically and updated when $\tilde{v}_{\text{best}}$ is updated. This innovation reduces the number of gates and qubits required for the oracle. The oracle is then implemented as follows: We first implement the group action operator $\hat{G}$ on the group register and the first position register which has been initialized with $v$ such that

$$\hat{G}\ket{x}\ket{v}=\ket{x}\ket{g(x)v}. \tag{5}$$

We then apply a quantum circuit that in general acts on two quantum registers of equal size such that it applies a negative sign to the state if the computational basis state of the first register is less than that of the second. We refer to this circuit as phase comparator (PhComp) which has the behavior

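$$\text{PhComp}\ket{a}\ket{b}=\begin{cases}-\ket{a}\ket{b} & \text{if } a<b,\\ \ket{a}\ket{b} & \text{otherwise.}\end{cases} \tag{6}$$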

So after applying the group action operator, we apply PhComp to the two position registers, and then uncompute the group action operator. This completes the oracle as shown in Fig. 1. To complete one Grover step (Grov), we then apply the usual reflection operator defined as

$$U_{s}=I-2\ket{s}\bra{s}=V\left(I-2\ket{0}\bra{0}\right)V^{\dagger}, \tag{7}$$

where $\ket{s}=\frac{1}{\sqrt{|G|}}\sum_{x}\ket{x}$ and $V$ is any unitary such that $V\ket{0}=\ket{s}$ . For completeness, the circuit for Grov is shown in Fig. 2.
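As a rough illustration of one Grover step, the following sketch applies the oracle sign flip and the reflection $U_{s}=I-2\ket{s}\bra{s}$ to a list of real amplitudes over the group register. It is a classical state-vector toy, not the circuit of Figs. 1 and 2; the values in `f_values` (standing in for $\text{int}(g(x)v)$) and `v_best` are made up for the example.

```python
import math
from typing import Callable, List


def grover_step(
    amplitudes: List[float], is_marked: Callable[[int], bool]
) -> List[float]:
    """Apply the oracle, then U_s = I - 2|s><s|, to real amplitudes."""
    # Oracle: flip the sign of every marked basis state.
    flipped = [-a if is_marked(x) else a for x, a in enumerate(amplitudes)]
    # (I - 2|s><s|) a has components a_x - 2 * mean(a), because
    # <s|a> = sum(a) / sqrt(N) and each component of |s> is 1 / sqrt(N).
    mean = sum(flipped) / len(flipped)
    return [a - 2.0 * mean for a in flipped]


f_values = [5, 3, 7, 6, 2, 9, 8, 4]  # hypothetical int(g(x) v) for x = 0..7
v_best = 4                           # hypothetical current best value
is_marked = lambda x: f_values[x] < v_best

size = len(f_values)
state = [1.0 / math.sqrt(size)] * size  # uniform superposition |s>
state = grover_step(state, is_marked)
p_marked = sum(a * a for x, a in enumerate(state) if is_marked(x))
print("probability of a marked outcome after one step:", p_marked)
```

With two of eight elements marked, $\theta=\pi/6$ and a single step rotates the state entirely into the marked subspace. The overall sign of the state differs from the more common $2\ket{s}\bra{s}-I$ convention by a global phase, which does not affect measurement probabilities.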

If we unpack Gsun and integrate this into our modified version of Gmin, the pseudocode flow of the algorithm is shown in Algorithm 1.

Note that we have chosen the initializing best guess $v_{\text{best}}=v$ , as we assume $v$ is effectively random. Also, we count the check step in line 12 as an effective oracle call so that the classical and quantum solutions can be more accurately compared. $\gamma\in\left(1,\frac{4}{3}\right)$ and $\beta\in\left[0,1\right]$ are additional parameters which we use to minimize $\alpha$ . $\gamma$ is discussed in Ref. Boyer1998 and controls the rate of the exponential “ramp-up” for the parameter $t$ which in turn determines the ceiling of the random sampling for number of oracle calls used in the Grover search step of the algorithm. In principle, a large $\gamma$ reduces the time to reach $t\sim\sqrt{|G|}$ which is optimal if $v_{\text{best}}$ is near the minimum (the number of marked elements is small; the search takes longer). However if $v_{\text{best}}$ is far from the minimum, $\gamma$ being too large and $t\sim\sqrt{|G|}$ increases the chances that we apply too many oracle calls and dramatically overshoot a state of high overlap with a marked element. Thus, we need to balance the rate at which $t$ increases by optimizing $\gamma$ . $\beta$ is a parameter which we introduce here. As the algorithm was originally written, after a better value of $v_{\text{best}}$ is found in line 13 of Algorithm 1, Gsun effectively ends and on the next cycle is re-called. Gsun then assumes it knows nothing about how close we are to the minimum by resetting the value of $t$ back to $1$ (as would be the case for $\beta=0$ ). However, we do know something, namely that we are closer to the minimum than the iteration before (the number of marked elements has decreased). Thus we don’t need the ramp-up time for $t$ which is only included to address when we are far from the minimum. By including the $\beta$ parameter, we are looking to exploit this limited knowledge about the number of marked elements. We discuss the exact values chosen for these parameters in Sec. II.5.
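The roles of $\alpha$, $\gamma$ and $\beta$ can be explored with a small classical simulation of the measurement statistics. This is only a sketch under stated assumptions, not a transcription of Algorithm 1: the Grover search is replaced by its exact success probability $\sin^{2}((2k+1)\theta)$, the step count is drawn uniformly from $\{0,\dots,\lceil t\rceil-1\}$, and, in particular, the update $t\leftarrow\max(1,\beta t)$ after an improvement is our assumed form of the $\beta$ rule (it reduces to the reset $t\leftarrow 1$ for $\beta=0$). The function name and test data are hypothetical.

```python
import math
import random
from typing import Callable, Optional, Tuple


def simulate_gmin(
    f: Callable[[int], int],
    size: int,
    start: int,
    alpha: float,
    gamma: float = 1.2,
    beta: float = 0.5,
    rng: Optional[random.Random] = None,
) -> Tuple[int, int]:
    """Simulate Grover minimization; return (best index, oracle calls used)."""
    rng = rng or random.Random()
    budget = alpha * math.sqrt(size)
    best, t, calls = start, 1.0, 0
    while calls < budget:
        marked = [x for x in range(size) if f(x) < f(best)]
        unmarked = [x for x in range(size) if f(x) >= f(best)]
        # Draw the number of Grover steps uniformly from {0, ..., ceil(t)-1}.
        steps = rng.randrange(math.ceil(t))
        theta = math.asin(math.sqrt(len(marked) / size))
        p_marked = math.sin((2 * steps + 1) * theta) ** 2
        # Simulated measurement of the group register.
        if rng.random() < p_marked:
            candidate = rng.choice(marked)
        else:
            candidate = rng.choice(unmarked)
        calls += steps + 1  # Grover steps plus the check, counted as one call
        if f(candidate) < f(best):
            best = candidate
            t = max(1.0, beta * t)  # ASSUMED form of the beta update
        else:
            t = min(gamma * t, math.sqrt(size))  # exponential ramp-up, capped
    return best, calls


values = [37, 12, 55, 8, 91, 23, 64, 3, 70, 41, 19, 86, 29, 50, 77, 15]
best, calls = simulate_gmin(
    values.__getitem__, len(values), start=0, alpha=45 / 8,
    rng=random.Random(0),
)
print(best, values[best], calls)
```

Repeating the simulation over many seeds and counting how often `values[best]` equals the true minimum gives an empirical success rate for a given $\alpha$, which is the kind of numerical relation between budget and success probability the text refers to.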

II.4 Circuit Implementation of Grov and its Cost

We now discuss a full circuit implementation of all subroutines of Grov. As $\hat{G}$ is specified by the problem instance, we only give an explicit implementation for the group $G_{\text{add}}^{N}$ which represents addition modulo $N=2^{n}$ or translation on a cycle of $N$ positions. Otherwise, we discuss a general strategy for more complicated realistic groups.

The simplest part of Grov to implement is the standard $U_{s}$ operator as defined in Eq. (7). As discussed in Ref. Boyer1998, if $|G|=2^{n}$ for some $n$, then $V=H^{\otimes n}$ is given by the Hadamard gate acting on every qubit of the group register. The remaining reflection is implemented by a controlled $\pi$ phase gate on a computational $\ket{0}$ input, i.e. apply NOT to all qubits and then apply the multi-controlled $Z$ gate. Finally, we uncompute everything but the multi-controlled $Z$ gate. An example of this circuit is shown in Fig. 3. If $|G|$ is not a power of $2$, we only have to modify the change of basis given by $V$ to some other change of basis operator such as the quantum Fourier transform (QFT). The cost in the former case is $\mathcal{C}(U_{s})\sim\mathcal{O}(\log|G|)$, while the latter case scales as $\mathcal{C}(U_{s})\sim\mathcal{O}(\log^{2}|G|)$ if we use the QFT.

We next consider an implementation of PhComp as defined in Eq. (6) by considering a bitwise comparison of the input registers. We start with the most significant bit of the binary expansion of a computational input value and proceed to the least significant. At the $i^{th}$ bit, we need to calculate two binary values, the first representing whether or not we should apply the $\pi$ phase at the current bit, and the second representing whether or not we should continue to compare on the remaining lesser bits. That is, if the two bits differ, the value containing $1$ is greater, so we need to prevent any additional phases from being applied on lesser bits. A truth table for this calculation is given in Table 2 for input bits $a_{i}$ and $b_{i}$. From this, we find that $\text{(apply phase)}_{i}=\overline{a}_{i}b_{i}$, conditioned on the truth of (AND-ed with) all greater $\text{(continue)}_{j}=\overline{a}_{j}\oplus b_{j}$ bits for $j>i$. So our method of implementing PhComp is to NOT all qubits of the first register $a$ and then compare from the most to least significant qubit. At the $i^{th}$ qubit, we calculate $\text{(continue)}_{i}$ on $b_{i}$ using CNOT, but not before calculating $\text{(apply phase)}_{i}$ in the phase with a multi-control $Z$ gate between $\overline{a}_{i}$, $b_{i}$ and all the $b_{j}$ for $j>i$ (which now contain the (continue) bits). Finally, we uncompute the CNOT and NOT gates. An example circuit is shown in Fig. 4. Assuming the cost of a multi-control $Z$ gate scales linearly with the number of controls, the cost of PhComp is $\mathcal{C}(\text{PhComp})\sim\mathcal{O}(\log^{2}|V|)$. However, if we have additional ancilla qubits available, we can use these to reduce the cost to $\mathcal{C}(\text{PhComp})\sim\mathcal{O}(\log|V|)$. See Appendix B for details.
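The bitwise logic of the comparator can be checked classically. The sketch below evaluates the (apply phase) and (continue) bits of Table 2 from the most to the least significant bit and returns the resulting sign; it models only the truth-table logic on computational basis inputs, not the reversible circuit of Fig. 4, and the function name is our own.

```python
def phase_comparator_sign(a: int, b: int, n_bits: int) -> int:
    """Return -1 if a < b and +1 otherwise, using the Table 2 bit logic."""
    keep_comparing = 1  # AND of all (continue) bits seen so far
    for i in reversed(range(n_bits)):  # most significant bit first
        a_i = (a >> i) & 1
        b_i = (b >> i) & 1
        # (apply phase)_i = (NOT a_i) AND b_i, gated by the earlier continues.
        apply_phase = (1 - a_i) & b_i & keep_comparing
        if apply_phase:
            return -1
        # (continue)_i = NOT (a_i XOR b_i): keep going only while bits agree.
        keep_comparing &= 1 - (a_i ^ b_i)
    return +1


print(phase_comparator_sign(0b0101, 0b0110, 4))
print(phase_comparator_sign(0b0110, 0b0101, 4))
print(phase_comparator_sign(0b0110, 0b0110, 4))
```

At most one bit position can trigger the phase, because the first position where the inputs differ sets the running (continue) product to zero for all lesser bits.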

The form of the group action operator is entirely dependent on the group. We take the simplest case first which is an abelian group with a single cycle, whereby $g(x)=g^{x}$ for the group generator $g$ . We assume we can form a circuit for the operator $\hat{g}$ acting on a position register which achieves

$$\hat{g}\ket{v}=\ket{gv}. \tag{8}$$

We then control $\hat{g}^{2^{i}}$ on the $i^{th}$ qubit of the group register as shown in Fig. 5. This method can then be generalized to multi-cycle abelian groups by subdividing the group register so there is one subregister for each cycle and generating a circuit similar to Fig. 5 for each cycle. If the group is non-abelian, one has to consider a strategy for indexing powers of the generators and their order. For example, suppose the group is generated by two non-commuting operators $g_{1}$ and $g_{2}$. Each generator forms its own abelian subgroup, so we can use the same strategy for them separately and with their own sub-group register. Furthermore, the order for applying these operators can be controlled by a single qubit, $\ket{\text{order}}$, using the circuit in Fig. 6. If $\ket{\text{order}}=\ket{0}$, then the group operator applied is $g_{1}^{x_{1}}g_{2}^{x_{2}}$, and if $\ket{\text{order}}=\ket{1}$, then the group operator applied is $g_{2}^{x_{2}}g_{1}^{x_{1}}$. We note this may not be the most efficient method in terms of qubit use for the group register qubits. For example, if $x_{2}=0$, then the state of $\ket{\text{order}}$ doesn’t matter and so there are redundant index states in the group register. This is also the case if there are redundancies in the order of non-zero powers of the generators, i.e. $g_{1}^{x_{1}}g_{2}^{x_{2}}=g_{2}^{x^{\prime}_{2}}g_{1}^{x^{\prime}_{1}}$, for some values of the indices. The most efficient method depends on the group, but we include this example to demonstrate that, in principle, one can handle non-abelian groups using roughly the same strategy as was used for abelian groups.
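For a single-cycle abelian group, the controlled-power construction amounts to a binary decomposition of the exponent. A minimal classical sketch, with the addition group as the example and with hypothetical function names:

```python
from typing import Callable


def apply_group_action(
    x: int, v: int, m_bits: int, generator_power: Callable[[int, int], int]
) -> int:
    """Compute g^x v by applying g^(2^i) for every set bit i of the index x.

    `generator_power(i, v)` is assumed to return g^(2^i) acting on v.
    """
    for i in range(m_bits):
        if (x >> i) & 1:  # bit i of the group register acts as the control
            v = generator_power(i, v)
    return v


N = 16  # positions on the cycle; addition modulo N is the example group
add_power = lambda i, y: (y + (1 << i)) % N  # g_add^(2^i): add 2^i mod N

print(apply_group_action(5, 13, 4, add_power))
```

Only $m=\log_{2}|G|$ controlled operations are needed, provided each power $\hat{g}^{2^{i}}$ is available as a single efficient operation rather than as $2^{i}$ repetitions of $\hat{g}$.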

The scaling of $\hat{G}$ is highly dependent on the group being used, but it should be clear that in many reasonable cases, the scaling should be $\mathcal{C}(\hat{G})\sim\log(|G|)\mathcal{C}(\hat{g})$ where we assume the generators can be implemented at cost $\mathcal{C}(\hat{g})\sim\mathcal{C}(\hat{g}^{n})\sim\text{polylog}(|V|)$ for any power $n$ . That is, implementing the power of a generator must not scale with that power. To demonstrate the importance of this, consider the single-cycle abelian case. If we implement $\hat{g}^{2}$ with two copies of $\hat{g}$ and so on for the other powers, then

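$$\mathcal{C}(\hat{G})\sim\sum_{i=0}^{\log|G|-1}2^{i}\,\mathcal{C}(\hat{g})=(|G|-1)\,\mathcal{C}(\hat{g})\sim|G|\,\mathcal{C}(\hat{g}).$$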

Clearly, this is not efficient, and our oracle scales with the size of the search space. However, if the implementation of the powers of $\hat{g}$ can be simplified so as to scale on the order of $\hat{g}$ or less, then we achieve our desired scaling $\mathcal{C}(\hat{G})\sim\mathcal{C}(\hat{g})\log|G|$ . It is reasonable to believe this is possible in the general case. Suppose we take for granted that the complexity of a quantum circuit corresponding to a periodic operator scales with the size of its period. Then $\hat{g}^{2}$ has half the period of $\hat{g}$ , $\hat{g}^{4}$ has half the period of $\hat{g}^{2}$ , and so on. So one would expect that $\hat{g}$ is actually the most expensive power to implement.

To make this discussion more concrete, consider the example of the group representing addition mod $N=2^{n}$ for some $n$ which we denote $G_{\text{add}}^{N}$ , i.e.

[TABLE]

where mod $N$ is implicit. Implementing this operator using the methods discussed here, one can use $\hat{g}_{\text{add}}$ consisting of a sequence of multi-control NOT gates as shown in Fig. 7 (where we recall that $\hat{g}_{\text{add}}\ket{y}=\ket{y+1}$ ). It is clear that $\hat{g}_{\text{add}}^{2}$ is given by removing all gate action and control lines on the least significant bit, and so on for the other powers. For such a simple case, it’s easy to see this simplification, but for a compiler which only moves commutative gates and considers local pattern matching, this dramatic reduction might go unexploited, so a manually-optimized implementation might be preferred. (Footnote 3: We are aware of better in-place adders, namely those which calculate the addition in the phase or via some ripple-carry scheme, Ref. Cuccaro2004 . We stick to this less efficient implementation of the adder as it imitates our more generic construction of $\hat{G}$ .)
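The simplification of the higher powers can be made explicit with a classical emulation of the incrementer. The following sketch is our reading of the cascade described above, not a transcription of Fig. 7: each target bit is flipped when all lower bits in use are $1$, and the power $\hat{g}_{\text{add}}^{2^{j}}$ is obtained by ignoring the $j$ least significant bits entirely. The function name and the little-endian bit list are assumptions.

```python
def increment_power(bits, j=0):
    """Apply g_add^(2^j) to a little-endian bit list, i.e. add 2^j mod 2^n.

    Emulates a cascade of multi-control NOT gates restricted to bits j..n-1.
    Bits below j are neither targets nor controls.
    Returns the new bit list and the number of gates used.
    """
    bits = list(bits)
    n = len(bits)
    gates = 0
    for target in range(n - 1, j - 1, -1):  # most significant target first
        if all(bits[j:target]):             # controls: bits j..target-1 are all 1
            bits[target] ^= 1
        gates += 1
    return bits, gates


def to_little_endian(value, n):
    return [(value >> k) & 1 for k in range(n)]


def from_little_endian(bits):
    return sum(bit << k for k, bit in enumerate(bits))
```

In this emulation the gate count is $n-j$, so each higher power uses fewer gates than the one before it, in line with the expectation that $\hat{g}$ is the most expensive power.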

It is worth considering the specific case of $G$ for spin Hamiltonians as this is the most natural use for a quantum solution to this problem. The natural mapping of the problem would assign one qubit to each spin in the physical system, and most geometric symmetry generators such as those for translation or rotation (as opposed to spin symmetries), can be simulated by a $\hat{g}_{\text{spin}}$ consisting of swap gates. For example, translation on a spin chain would use a $\hat{g}_{\text{spin}}$ consisting of a cascade of nearest-neighbor swaps. Note that here and in general, $\mathcal{C}(\hat{g}_{\text{spin}})\sim\mathcal{O}(|G_{\text{spin}}|)=\mathcal{O}(\log|V|)$ , but this is because the group is already exponentially small in the size of $V$ . So in general we expect $\mathcal{C}(\hat{G}_{\text{spin}})\sim|G_{\text{spin}}|\log|G_{\text{spin}}|$ .
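To illustrate the swap-based generator, the sketch below performs a one-site translation of a spin chain with a cascade of nearest-neighbor swaps and counts them. It is a classical stand-in for $\hat{g}_{\text{spin}}$ acting on a computational basis state; the function name is hypothetical.

```python
def translate_by_swaps(spins):
    """One-site cyclic translation built from a cascade of nearest-neighbor swaps."""
    spins = list(spins)
    swaps = 0
    for k in range(len(spins) - 1):
        spins[k], spins[k + 1] = spins[k + 1], spins[k]  # swap sites k and k+1
        swaps += 1
    return spins, swaps
```

A chain of $n$ spins needs $n-1$ swaps per translation, and applying the translation $n$ times returns the original configuration, so the swap count grows linearly with the order of the translation group.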

In terms of comparing costs with the classical on-the-fly method, we expect $C(G)$ to be of the order of $\mathcal{C}(\hat{G})$ in which case the quantum solution outperforms the classical one. The classical divide & conquer method has a smaller constant coefficient, so the quantum solution outperforms it only for relatively larger group sizes, but always uses exponentially less memory.

II.5 Oracle Budget and Probability of Success

To complete the algorithm, we need to determine the constants $\alpha,\beta$ and $\gamma$ . As we have the exact solution for the probability of success for a single Grover search Grover1996 ; Boyer1998 (lines 9-11 of Algorithm 1), we are able to simulate the classical parts of the algorithm using $G_{\text{add}}^{N}$ as the group in order to determine the behavior of these parameters. Note that, without error, the oracle query complexity is unaffected by the details of the group (aside from its size), so the following results should be general. As we know the solution to the orbit representative problem for this trivial example, we can run the simulation until the correct answer is obtained. This allows us to empirically determine the probability of success as a function of the total number of calls. For a window of probabilities $P_{\text{success}}\in[0.2,0.995]$ (we clearly don’t care about probabilities less than $20\%$ and, beyond $99.5\%$ , the rate of increase of the probability is hard to discriminate within our simulation), we find that the asymptotic form of the probability for large $N$ is given by

[TABLE]

where $T$ is the number of oracle calls and $a$ is the rate parameter which is a function of only $\beta,\gamma$ and is empirically determined. By linearizing Eq. (11) with $\frac{1}{a}$ as the slope, we can calculate the rate parameter, as is done in Fig. 8, as well as demonstrate that this is the correct asymptotic form. One can see that the $R^{2}$ -value of the linear regression asymptotically approaches $1$ and the rate parameter approaches a constant for fixed $\beta,\gamma$ . This allows us to determine the oracle budget parameter $\alpha$ . For a given application, if we allow for a probable error in the solution of $\epsilon>0$ , then $\alpha$ is given by

[TABLE]

So we want to determine the values of $\beta$ and $\gamma$ such that we minimize $a$ . Figure 9 shows a survey of $a$ as a function of $\beta$ and $\gamma$ . From this, we have chosen $\gamma=1.15$ and $\beta=0.95$ as the near optimal values. This value of $\gamma$ is near previously discussed values, where Ref. Boyer1998 suggests $\frac{6}{5}$ . However, $\beta$ being near one suggests we gain a good deal of information knowing that the number of marked items has decreased from one call of Gsun to another. For comparison, if we use $\epsilon=0.5$ and $a\approx 2\text{-}4$ , the resulting oracle budget parameter is $\alpha\approx 1.6\text{-}3.3$ which is a considerable reduction compared to $\alpha\approx 5.6$ for the analytic value found in Appendix A. For applications which require a high probability of success i.e. $\epsilon=0.01$ , we obtain $\alpha\sim 4.3\text{-}8.6$ .
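The kind of classical simulation described in this section can be sketched as follows. This is a simplified stand-in and not Algorithm 1: it uses the standard exact success probability $\sin^{2}((2p+1)\theta)$ with $\sin^{2}\theta=m/N$ for a single Grover search, grows the iteration limit by $\gamma$ after each failed search, and resets it after each improvement. The role of $\beta$ is not modeled, and counting $p+1$ calls per search (the Grover steps plus one check) is an assumption.

```python
import math
import random


def simulate_grover_min(N, gamma=1.15, rng=None):
    """Count oracle calls until the minimum of N distinct values is found.

    Items are identified with their ranks 0..N-1, so the number of marked
    items (values below the current best) equals the current rank.
    """
    rng = rng or random.Random()
    best = rng.randrange(N)   # rank of the initial guess
    calls = 0
    lam = 1.0                 # upper limit on the number of Grover iterations
    while best > 0:
        p = rng.randrange(math.ceil(lam))         # iterations in this coherent search
        theta = math.asin(math.sqrt(best / N))    # sin^2(theta) = m / N
        p_success = math.sin((2 * p + 1) * theta) ** 2
        calls += p + 1                            # assumption: p steps plus one check
        if rng.random() < p_success:
            best = rng.randrange(best)            # uniform over the marked items
            lam = 1.0                             # assumption: reset after an improvement
        else:
            lam = min(gamma * lam, math.sqrt(N))
    return calls
```

Averaging `simulate_grover_min(N)` over many trials for several values of $N$ gives an empirical distribution of the number of calls, from which a success probability as a function of the budget can be estimated.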

III Full Simulation for a Perfect Quantum Machine

To check the behavior of Algorithm 1, we implement a full quantum simulation using the Intel Quantum Simulator (Intel-QS, Ref. qHIP) and $G_{\text{add}}^{2^{n}}$ as our group for $n=4\text{-}8$ . This requires $12\text{-}24$ qubits using no additional ancilla to reduce the depth of the quantum circuits. Although $G_{\text{add}}^{2^{n}}$ is not a useful problem instance, it does maximize the group size relative to the number of positions, i.e. $\log|G|=\log|V|$ and so this represents the most efficient benchmark using the fewest qubits. Furthermore, just as with the purely classical simulation from the last section, knowing the correct answer allows us to avoid choosing an oracle budget, and instead run the algorithm until the solution is found to determine the probability of success as a function of total number of calls (still, we do choose a hard stop of $\alpha=\frac{45}{2}$ which, as we established, is on the high side for a reasonable oracle budget). As our simulation is exact, i.e. we are treating the quantum machine as perfect, the details of the group do not affect the results. All quantum subroutines are implemented according to the discussion in Section II.4, where multi-controlled gates have been broken down to one- and two-qubit gates using methods from Ref. Barenco1995 . This was done to better simulate the algorithm acting on real hardware once noise is added in Section IV.3.

Figure 10 shows the probability of success as a function of oracle calls, where the inset shows an effective rate parameter. We note that the probability in Eq. (11) is asymptotically correct in the limit of large $N$ and as such, the curves for these smaller group sizes do not fit this form well. Instead we define the effective rate parameter, $a_{\text{eff}}$ via

[TABLE]

where we treat $P_{\text{success}}$ as a function of oracle calls, $T$ , and $\text{diff}_{T}$ is the difference between two successive values of $T$ . Despite the poor fit, $a_{\text{eff}}$ is still indicative of the trends. We then determine error bars for $a_{\text{eff}}$ via

[TABLE]

where $\sigma_{\text{eff}}$ is the standard deviation of the expression which is averaged in Eq. (13) and $M$ is the number of trials. From Fig. 10, we find the behavior as expected from the classical simulations. We notice however that the effective rate parameter is higher for the full simulation as compared to the classical simulation. Although not optimal, the effective rate parameter still suffices. For example, if we desired a $99\%$ chance of success and we chose the rate parameter to be $a=4$ ( $\alpha\approx 5.7$ ), then the oracle budgets would be $23,32,45,64$ and $91$ , respectively, for the group sizes shown in Fig. 10. From the figure, we see that we would achieve nearly or better than our target $99\%$ chance of success.

IV Error Mitigation Strategies and their Simulation

One of the benefits of Gmin is that the best-known value for the minimum is always monotonically decreasing with the number of oracle calls. Unlike Grover search, this is true even for a faulty implementation on an imperfect quantum machine. Furthermore, vagaries of faulty implementation are partially compensated for by the classical random sampling of the number of oracle calls for any single coherent Grover search. Put a different way, though we allot a set oracle budget, not all these calls are implemented in a single coherent step. This suggests Grover minimization is a reasonable use for near-term, noisy hardware. Still, noise has its costs. In this section, we describe some strategies for mitigating the cost of errors. We then simulate some of these methods to determine their effectiveness.

IV.1 Strategies

We start by describing two error mitigation strategies. As mentioned, the approach to a solution is monotonic regardless of the error rates. Thus the most obvious method is to simply increase the oracle budget, leaving all else the same, a method we refer to as static error mitigation (SEM). The obvious downside to this method is that the increase in the oracle budget would reasonably need to scale with the size of the system (assuming roughly independent error rates for each qubit), in which case we may lose our quantum advantage. This is supported by analytic results on Grover search with a faulty oracle in Refs. Shenvi2003 ; Regev2012 , where for certain toy error models, the polynomial quantum speed-up is either partially or entirely lost.

The other strategy takes advantage of the additional qubits which do not hold the search space. The two position registers are included only as a means of marking elements of the search space and implementing the oracle. As such, they should hold the same computational basis value at the beginning and end of a single call to Grov. This allows us to measure these registers without disturbing the coherence of the group register which is responsible for the quantum speed-up. Moreover, any terms in the full state of the system (as expanded in the computational basis) which hold values in the position registers which differ from $v$ and $v_{\text{best}}$ are in error and measuring the correct values projects the system back to an un-errored, or at least less-errored state. Thus we suggest the following: at the end of any call to Grov, measure the two position registers. If their measured values differ from that of the classically stored values $v$ and $v_{\text{best}}$ , we abort the remaining Grover steps on line 10 of Algorithm 1 and go back to step 8, for which the errored oracle calls do not count against our oracle budget. It is important to note that we do not randomly sample $p$ again as this would introduce a bias toward smaller values of $p$ as they are less likely to experience an error. We refer to this strategy as active error mitigation (AEM). This is because the total number of oracle calls, both errored and un-errored, is not fixed, but depends on the rate of error.

The downside of this method is that all the oracle calls up to the point an error is found still cost time which is now wasted due to the error state. To mitigate this waste, before restarting the Grover search, we measure the group register and continue to check to see if a better value is found. To do so is practically free (up to one additional effective oracle call to perform the check) and it can only increase our chances of finding the minimum, even if by a minuscule amount. Moreover, simulations demonstrate the increase is significant. We refer to this as a measure-and-check strategy.

Altogether, the AEM version of the Gmin algorithm is presented in Algorithm 2. Note we have added a hard stop for total number of oracle calls as characterized by $\ell$ to avoid infinite run-time. Ideally AEM “protects” the probability of success for a fixed oracle budget and a large range of error rates. That is, $P_{\text{success}}$ as a function of un-errored oracle calls (i.e. as a function of the $c_{1}$ count in Algorithm 2) takes the form of Eq. (11) with a rate parameter which is only weakly dependent on the error rates. Again, the downside is the non-deterministic run-time which can bloat if the error rate is too high.

IV.2 Performance of AEM Gmin

In Appendix C, we analyze the performance of AEM Gmin using a simple error model. Let the average qubit lifetime $\braket{t}$ (say the average between $T_{1}$ and $T_{2}$ as described below) scale as

[TABLE]

for some $\delta>0$ . Then we find that optimally (such that $e=1$ ; see Appendix C for details) the probability of success for $p$ AEM Grov calls, including measure-and-check when an error is found, is asymptotically

[TABLE]

where $\sigma\sim\exp\left(-\frac{\mathcal{C}(\text{Grov})}{\braket{t}}\right)=\exp\left(-\frac{4}{\sqrt{N}\delta}\right)$ is the probability of having no error in a single call to Grov. Recall that the probability of success in the absence of noise ( $\delta\to\infty$ ) is $\sin^{2}\left((2p+1)\theta\right)$ . Without measure-and-check after the error, this probability is degraded to $\sigma^{p}\sin^{2}\left((2p+1)\theta\right)$ , in which case the probability of success is exponentially sensitive to the value of $\delta$ . With measure-and-check, the probability is only polynomially sensitive to $\delta$ . To demonstrate this, consider the case when we are searching for a single element, $p\sim\sqrt{N}$ and so $\sin^{2}((2p+1)\theta)\sim 1$ . If $\delta=4$ , then the AEM probability of success with measure-and-check goes as $\frac{16}{17}\left(\frac{1-\exp(-1)}{2}+\exp(-1)\right)\approx 64\%$ , which is reasonably better than the no-measure-and-check probability of $\exp(-1)\approx 37\%$ . However, if $\delta=1$ , then the AEM probability is $\frac{1}{2}\left(\frac{1-\exp(-4)}{2}+\exp(-4)\right)\approx 25\%$ as compared to $\exp(-4)\approx 2\%$ . If we go even further and take $\delta=\frac{1}{2}$ , the AEM probability of success goes as $\frac{1}{10}=10\%$ whereas without measure-and-check, it is negligible. So even though we need the coherence time to scale as $\sim\sqrt{N}$ , AEM Gmin is far more forgiving for a smaller value of the coefficient $\delta$ . This is further demonstrated numerically in Section IV.5.
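The quoted percentages can be reproduced with a few lines of arithmetic. In the sketch below, the prefactor $\delta^{2}/(1+\delta^{2})$ is an assumption read off from the quoted fractions $\frac{16}{17}$ (for $\delta=4$) and $\frac{1}{2}$ (for $\delta=1$); the remaining factor is taken directly from the expressions in the text with $\sigma^{p}=\exp(-4/\delta)$ for $p\sim\sqrt{N}$. The function name is hypothetical.

```python
import math


def aem_success(delta, with_measure_and_check=True):
    """Single-marked-element estimate with p ~ sqrt(N), so sigma^p = exp(-4/delta)."""
    no_error = math.exp(-4.0 / delta)          # probability that all p Grov calls are error free
    if not with_measure_and_check:
        return no_error
    prefactor = delta ** 2 / (1.0 + delta ** 2)  # assumption: inferred from 16/17 and 1/2
    return prefactor * ((1.0 - no_error) / 2.0 + no_error)


for delta in (4.0, 1.0, 0.5):
    print(delta, round(aem_success(delta), 3), round(aem_success(delta, False), 3))
```

Under these assumptions the three cases evaluate to roughly $0.64$, $0.25$ and $0.10$ with measure-and-check, against roughly $0.37$, $0.02$ and $0.0003$ without it, matching the values quoted above.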

The analysis given in the appendix is a general result for Grover search with the same measure-and-check AEM strategy. For AEM Gmin, the fact that the probability of success of a single search is necessarily degraded by noise means we still need to increase the oracle budget in order that the target overall probability of success is maintained. This is done automatically by not counting errored oracle counts.

IV.3 Simulation of Error Mitigation Strategies

To simulate noisy hardware, we used the error model included in the Intel-QS package which is based upon the Pauli-twirling approximation error model Geller2013 . In this model, before a gate is applied, a random single qubit rotation is applied to each qubit acted on by that gate. The error unitary is given by

[TABLE]

where $X,Y,Z$ are the single-qubit Pauli operators. $v_{x},v_{y}$ and $v_{z}$ are parameters chosen at random from a Gaussian distribution whose variance grows with the time from the last gate action in units of the hardware dependent parameters $T_{1},T_{\phi}$ and $T_{2}$ respectively. As $X,Y$ and $Z$ are dependent on one another, the parameters are related by

[TABLE]

Because $T_{1}$ is associated with the $X$ Pauli operator which flips the computational state, we can think of $T_{1}$ as the “bit-flip” error rate. Likewise, $T_{2}$ is associated with the $Z$ Pauli operator which applies a $\pi$ phase, so we can think of this as the “phase-flip” error rate. To accurately accommodate this non-deterministic, measurement-based algorithm, some modifications had to be made to the Intel-QS. See Appendix D for details.

Simulations for both SEM and AEM are shown in Fig. 11 for $\log|G|=4,5$ where we have fixed either $T_{1}$ or $T_{2}$ to be a large, effectively infinite constant and varied the other. This allows us to determine the effect of each kind of error. $T_{1}$ and $T_{2}$ are measured in units of the single-qubit gate time (SQGT); see Appendix D for details.

In terms of bit-flip error, we can see that AEM does protect the rate parameter over the values of $T_{1}$ shown in Fig. 11a and 11b as evidenced by the flatness of the curves for AEM, $T_{2}=\infty$ . However, AEM only partially protects the rate parameter against phase-flip error. This should not be surprising as phase error would persist even after the projection due to measurement at the end of a call to Grov. That is, phase error tends to accumulate in the superposition of the group register and is not corrected by the AEM strategy. Still, looking at the SEM results, we see that the algorithm is altogether less susceptible to phase-flip error.

The protection of the rate parameter by AEM is important as it means our choice of the oracle budget parameter is less dependent on knowing the rate of error. However, the rate parameter is no longer directly proportional to the run-time of the algorithm as errored calls to Grov are not counted against the oracle budget. Thus we have to evaluate whether the total run-time is better or worse under AEM, which not only includes the errored calls, but also includes the time to perform the measurements. Figs. 11c and 11d plot the average run-time as a function of either $T_{1}$ or $T_{2}$ for a fixed, large value of the other parameter. By average run-time, we mean the average over all trials of the total run-time (to find the correct answer) of the quantum computation cycles of the algorithm, including all measurements and gates, in units of the SQGT. This does not include time to perform the classical computation cycles of the algorithm. (The variability in the average run-time is maximal as the time to reach the minimum can be zero if $v$ happens to be the minimum. For this reason, we give no error bars on the average run-time.) From this figure, we see that AEM does not bloat the run-time for bit-flip error and as desired, significantly decreases the run-time for small $T_{1}$ times. It also only adds a modest, roughly constant increase for phase-flip error. Note that for higher $T_{1}$ and $T_{2}$ there is a cross-over where SEM has a smaller average run-time. This is due to the additional time needed to perform the measurements, which is only a constant time increase for each call to Grov.

From this analysis, we see that AEM is always preferred over SEM as it both protects the rate parameter and decreases the run-time except for when coherence times are sufficiently high, in which case its cost is only a constant for each call to Grov.

IV.4 Reducing Phase-flip Error

AEM is effective against bit-flip error, but less so for phase-flip error. Even though the algorithm is less susceptible to this kind of error, it is worth considering a method for reducing phase-flip error. This can be achieved using simple fault-tolerant methods. As we are only looking to correct one channel of error, we can use simple, essentially classical fault-tolerant error-correcting codes such as a repetition code Terhal2015 . It should be sufficient to use an error-correcting code on the group register only to reduce the qubit overhead. With enough physical qubits to form robust logical qubits, we could achieve an effective $T_{2}\sim\infty$ in which case AEM should fully protect the rate parameter.

IV.5 Simulation for Realistic Hardware

AEM Gmin requires interaction between quantum and classical instructions, but unlike similar hybrid computations such as decoding an error-correcting code or variational eigensolver (VQE), the classical computation cycles are simple and should not take a significant amount of time between coherent quantum steps. Thus AEM Gmin could stand as a good test of real-time hybrid quantum-classical computation. For this reason, we simulate AEM Gmin with realistic $T_{1},T_{2}$ times using the addition group of sizes $n=\log|G|=4,5$ and $6$ . To increase the chances of a successful run, we use the maximum number of ancilla qubits to reduce the depth of the circuit. So the total number of qubits used is $3n+(n-2)=4n-2$ , or $14,18$ and $22$ , respectively, for our cases. Methods for using the ancilla to reduce the depth are given in Appendix B. We used $T_{1}=T_{2}=700$ SQGTs which are extracted from Ref. O'Brien2017 for superconducting qubits.

Fig. 12 plots the rate parameter and average run-time for AEM Gmin as well as SEM Gmin and no noise Gmin which are included for comparison. For these realistic hardware parameters, we see that the rate parameter is well-protected by AEM, and the increase in run-time over no-noise conditions is still within reason, whereas the time for SEM is beyond a reasonable run-time. When observing the simulation in real-time, we recognize for $n=6$ the probability of failure for a single oracle call is high, implying that a test of any larger groups would require an increase in the $T_{1}$ and $T_{2}$ times as argued in Section IV.2.

V Conclusions

In this work, we have identified a new application for the Grover minimization algorithm, and provided a full quantum solution for the problem. Since Grover’s search often comes with the caveat of not having an efficiently implementable oracle, our work is notable for finding a practical use for Grover’s algorithm as the oracle is expected to scale poly-logarithmically with the size of the group. We have discussed both the structure of the algorithm and refinements to the original version, as well as a full gate decomposition for the simplest group given by modular addition. We discussed how we can leverage the intermediate measurement steps to mitigate the effects of error, increasing the likelihood of the algorithm being useful in the NISQ era.

In addition to being a sub-routine in classical exact diagonalization, our algorithm could also be called by a larger quantum algorithm which is performing a simulation of a many-body quantum system using symmetry-adapted basis states.

The algorithm discussed is far more general than what has been presented here. We achieve a reasonably sized oracle by leveraging the structure of the group, whereas the unstructured nature of the search is encapsulated in the arbitrary labeling of positions/basis states. Similarly, we can envisage using Gmin to find/prepare the ground state of some Hamiltonian. In such a case, one leverages the structure of Hamiltonian dynamics by replacing the group action operator with phase estimation. We hope to explore this more in future work.

The error mitigation scheme we have designed is also likely to be generally applicable to oracles using ancilla qubits, and thus could be used in a much wider context to improve the accuracy of quantum oracles.

VI Acknowledgements

The authors would like to thank Jim Held, Justin Hogaboam, Anne Matsuura, and Xiang Zou for useful discussion. ATS would also like to thank Rahul M. Nandkishore.

Appendix A Deriving a Tighter Lower Bound for the Oracle Budget

In this appendix, we derive a tighter lower bound on the oracle budget. We follow the exact method used in Ref. Durr1996 but use a tighter bound from Ref. Boyer1998 for the average number of oracle calls for Gsun to find the solution to a search among $k$ marked elements. In particular, we use the exact expression for number of calls to reach the critical stage of the algorithm, with which we achieve a bound for Gsun of $\frac{9}{4}\frac{N}{\sqrt{k(N-k)}}$ (whereas Ref. Durr1996 used $\frac{9}{2}\sqrt{\frac{N}{k}}$ ). Taking Lemma 1 from Ref. Durr1996 for granted, we follow the procedure for Lemma 2 using this tighter bound to find that the average number of oracle calls to reach the minimum is bounded above by

[TABLE]

We can approximate the sum using an integral as an upper bound,

[TABLE]

Ignoring the $\mathcal{O}\left(N^{-\frac{3}{2}}\right)$ term we find our upper bound is

[TABLE]

Appendix B Reducing the Cost of PhComp and $\hat{G}_{\text{add}}^{N}$

To reduce the cost of PhComp, we avoid re-calculating the AND of (continue) bits i.e. remove the multi-control Z gates. This is done by storing the AND between two (continue) bits in an ancilla initialized in the zero computational state. We then pass this down the circuit as shown in Fig. 13. The most significant and least significant bits do not benefit from having an ancilla, so we can use any number of ancilla up to $\log|V|-2$ . For the maximum number, the cost of PhComp goes as $\mathcal{C}(\text{PhComp})\sim\mathcal{O}(\log|V|)$ . A similar method can be used to reduce the cost of $\hat{G}_{\text{add}}^{N}$ as shown in Fig. 14. With the maximum number of ancilla, which is again $\log|V|-2$ , this reduces the cost of $\hat{G}_{\text{add}}^{N}$ to $\mathcal{C}(\hat{G}_{\text{add}}^{N})\sim\mathcal{O}(\log|G|)$ . The resulting adder is on par with the ripple-carry adder from Ref. Cuccaro2004 , but uses far more qubits. We include our version here to demonstrate that ancilla can be useful for reducing the group action operator. Furthermore, the ancilla can be shared between PhComp and the group action operator and measured along with the position registers in the AEM scheme. This was done for the data in Fig. 12.
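The effect of carrying the AND of the (continue) bits in an ancilla can be illustrated classically. In the sketch below, a single running flag plays the role of the ancilla chain, so each bit position needs only a fixed amount of work, while a crude count of the qubits touched by the multi-control Z gates in the ancilla-free version grows quadratically. Both the cost model and the function names are assumptions for illustration, and the comparator is assumed to apply the phase when $a<b$.

```python
def phcomp_with_carry(a, b, n):
    """Comparator in which one running flag carries the AND of all (continue) bits."""
    carry = 1   # stands in for the ancilla chain; 1 means "still comparing"
    sign = 1
    ops = 0
    for k in range(n - 1, -1, -1):       # most significant bit first
        a_k = (a >> k) & 1
        b_k = (b >> k) & 1
        if carry and (1 - a_k) and b_k:  # fixed-size controlled Z
            sign = -sign
        carry &= 1 - (a_k ^ b_k)         # AND in (continue)_k = NOT(a_k XOR b_k)
        ops += 2                         # one phase gate plus one carry update per bit
    return sign, ops


def multi_control_cost(n):
    """Crude cost without ancilla: the i-th Z gate touches i + 2 qubits."""
    return sum(i + 2 for i in range(n))
```

For $n=4$ this gives `ops = 8` with the running flag against `multi_control_cost(4) = 14`, and the gap widens as $n(n+3)/2$ versus $2n$ for larger registers.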

Appendix C Derivation of AEM Performance

In this appendix, we derive an estimate of the performance for AEM Gmin using a simple error model. Importantly, the analysis includes the measure-and-check strategy.

Let $\tilde{\mathcal{G}}$ be the noisy Grover call quantum channel. We make the assumption that we can decompose $\tilde{\mathcal{G}}$ as

[TABLE]

for some $\sigma\in[0,1]$ , where $\mathcal{G}$ is the noise-less Grover call quantum channel and $\mathcal{E}$ is some error channel. In this version of AEM, we conditionally call $\tilde{\mathcal{G}}$ based upon the outcome of measuring the correct values in the position registers after the previous call to $\tilde{\mathcal{G}}$ . Let $\mathcal{P}_{C}(\rho)=P_{v,v_{\text{best}}}\rho P_{v,v_{\text{best}}}$ be the channel which projects onto the correct computational basis states in the position registers and $\mathcal{P}_{E}(\rho)=\sum_{(u_{1},u_{2})\neq(v,v_{\text{best}})}P_{u_{1},u_{2}}\rho P_{u_{1},u_{2}}$ be the projection channel onto all incorrect basis states. We then model an AEM call to noisy Grover as

[TABLE]

We simplify the error channel by considering a model such that

[TABLE]

where $e\in[0,1]$ and $\mathcal{F}$ is some quantum channel which only acts non-trivially on the position registers. For concreteness, we can think of $\mathcal{F}$ as some channel that applies an arbitrary string of Pauli $X$ operators with some probability, but the exact form does not matter for our purposes. $\rho_{\text{mix}}$ is the mixed state for the entire system. We interpret this error model as saying an error with the correct values in the position registers is effectively a maximally mixed state, and an error with the incorrect values in the position registers is such that it only affects those registers with some probability $e$ and is otherwise maximally mixed. That we take $e$ to be some value other than $0$ is informed by the fact that measure-and-check is numerically shown to significantly increase the probability of success.

Now suppose we apply $p$ AEM noisy Grover calls to the initial state,

[TABLE]

followed by a final measurement of the position registers so that our final state is

[TABLE]

where we use the fact that $\mathcal{P}_{C}(\rho_{\text{init}})=\rho_{\text{init}}$ , $\mathcal{P}_{E}(\rho_{\text{init}})=0$ and $\mathcal{P}_{E}\mathcal{P}_{C}=0$ . Once we substitute our error model into the above expression, we have several terms which are proportional to $\rho_{\text{mix}}$ , noting that $\mathcal{G}(\rho_{\text{mix}})=\mathcal{F}(\rho_{\text{mix}})=\rho_{\text{mix}}$ . These terms are sub-leading as their contribution to the final probability of success goes as $\frac{1}{N}$ , so we collect all such terms in the set $\mathcal{O}(\rho_{\text{mix}})$ . We then expand the errored terms in Eq. (26),

[TABLE]

where we are using $\mathcal{P}_{E}\mathcal{G}(\mathcal{P}_{C}\mathcal{G})^{n}(\rho_{\text{init}})=0$ as $\mathcal{G}$ acts as the identity on the position registers. Now suppose $\mathcal{P}_{\text{sol}}(\rho)=P_{\text{sol}}\rho P_{\text{sol}}$ is the projection channel for the solution space of the search. We then use the known exact solution for noise-less Grover search,

[TABLE]

where $\theta$ is defined by $\sin^{2}\theta=\frac{m}{N}$ for $m$ marked elements. Note we can apply $\mathcal{F}$ in the second equality as it only acts on the position registers and not the group register, i.e. the search space. Ignoring the $\mathcal{O}(\rho_{\text{mix}})$ terms, we can bound our success probability as

[TABLE]

The first term represents the probability of success when no error in the AEM scheme is detected and the other terms represent the probability of success when we measure the group register after an error is found at the $n^{th}$ Grover step. Using geometric series identities, we can perform the sum to find that

[TABLE]

To simplify this expression, consider the case when $m=1$ and $N\gg 1$ . In the denominator for both terms, we have the expression $\frac{4\sigma\sin^{2}\left(2\theta\right)}{(1-\sigma)^{2}}$ , where care has to be taken as we have competing limits as $N\to\infty$ , when assuming

[TABLE]

for the coherence time $\braket{t}$ , which we also assume is a monotonically increasing function of $N$ . Thus to lowest order in $\frac{1}{N}$ and using $\sin^{2}(2\theta)=4\sin^{2}(\theta)\cos^{2}(\theta)=\frac{4}{N}+\mathcal{O}(\frac{1}{N^{2}})$ , we find that

[TABLE]

Looking back at Eq. (C), $\delta=\mathcal{O}(1)$ for AEM with measure-and-check to significantly increase the probability of success. So in terms of $\delta$ , the coherence time goes as

[TABLE]

To give a final expression for the probability of success, we make a few approximations. First we use $\sin^{2}\left((2p+1)\theta\right)=\sin^{2}\left((2p-1)\theta\right)+\mathcal{O}\left(\frac{1}{N}\right)$ and likewise, we ignore the term in Eq. (C) which goes as $\sim\sin^{2}\theta=\frac{1}{N}$ . Our probability of success is then asymptotically

[TABLE]

where $\sigma\sim\exp\left(-\frac{4}{\delta\sqrt{N}}\right)$ . When $e=1$ , i.e. the most optimistic case, this reduces to Eq. (IV.2).

Appendix D Details of the Noisy Simulation

In this appendix, we discuss some of the details of the noisy simulation. Relative gate times are extracted from Ref. O'Brien2017 which uses data for superconducting qubits. All single qubit gate times (SQGT) are assumed to be equal and all other simulation times are measured in units of this time. All two-qubit gates are assumed to take twice the SQGT and all gates are decomposed into one- and two-qubit gates. The Intel-QS does not have a feature to simulate measurements, so our source code has been altered to include measurement simulation capabilities. A Mersenne Twister random number generator is added specifically to simulate the probabilistic nature of quantum measurement. Furthermore, a measurement time of 10 SQGTs is added to simulate the accumulation of error that would occur in a real system while a measurement is being performed. We do not consider the possibility of error in the measured value as compared to the resulting quantum state, though this is an important source of error to consider in a real system. Finally, the method by which the Intel-QS accounts for the time between gate action has been altered to include parallelization. A sequence of gates with disjoint support on the qubits is assumed to be applied in parallel, in which case time is only incremented by the largest gate time in that sequence. No error is accumulated during classical computation cycles, though this is an important source of error to consider for real systems. All these considerations are used to calculate the total run-time for a single trial of Gmin.
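The timing rules described here can be sketched as a simple greedy scheduler. This is an illustration of the stated rules (single-qubit gates take 1 SQGT, two-qubit gates 2 SQGTs, measurements 10 SQGTs, and consecutive operations on disjoint qubits run in parallel), not the modified Intel-QS code; the data layout and names are hypothetical.

```python
SQGT = 1
GATE_TIME = {1: 1 * SQGT, 2: 2 * SQGT}  # gate duration by number of qubits acted on
MEASUREMENT_TIME = 10 * SQGT


def total_run_time(operations):
    """Total time for a list of (kind, qubits) with kind 'gate' or 'measure'."""
    total = 0
    layer_qubits = set()   # qubits used by the current parallel layer
    layer_time = 0         # longest operation in the current layer
    for kind, qubits in operations:
        duration = MEASUREMENT_TIME if kind == "measure" else GATE_TIME[len(qubits)]
        if layer_qubits & set(qubits):
            # Overlapping support: close the current layer and start a new one.
            total += layer_time
            layer_qubits, layer_time = set(), 0
        layer_qubits |= set(qubits)
        layer_time = max(layer_time, duration)
    return total + layer_time
```

For instance, a single-qubit gate on qubit 0 and a two-qubit gate on qubits 1 and 2 share a layer of duration 2, whereas a following gate on qubits 0 and 1 starts a new layer.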

We also note that we do not currently use a compiler to reduce the number of gates. Therefore, the error rates for all simulations are higher than they would be if we used such optimizing software.

Bibliography

[1] John Preskill. Quantum Computing in the NISQ era and beyond. Quantum, 2:79, August 2018.
[2] P. Shor. Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer. SIAM Journal on Computing, 26(5):1484–1509, 1997.
[3] Lov K. Grover. A fast quantum mechanical algorithm for database search. In Proceedings of the Twenty-eighth Annual ACM Symposium on Theory of Computing, STOC '96, pages 212–219, New York, NY, USA, 1996. ACM.
[4] P. Mateus and Y. Omar. Quantum pattern matching. arXiv:quant-ph/0508237, 2005.
[5] Christoph Dürr and Peter Høyer. A quantum algorithm for finding the minimum. CoRR, quant-ph/9607014, 1996.
[6] Edward Farhi, Jeffrey Goldstone, and Sam Gutmann. A quantum approximate optimization algorithm, 2014.
[7] M. Tinkham. Group Theory and Quantum Mechanics. Dover Books on Chemistry and Earth Sciences. Dover Publications, 2003.
[8] Alexander Wietek and Andreas M. Läuchli. Sublattice coding algorithm and distributed memory parallelization for large-scale exact diagonalizations of quantum many-body systems. Phys. Rev. E, 98:033309, Sep 2018.

Appendix B Reducing the Cost of PhComp and $\hat{G}_{\text{add}}^{N}$