Quantum-enhanced learning of rotations about an unknown direction

Yin Mo; Giulio Chiribella

arXiv:1906.01300·quant-ph·September 28, 2021

TL;DR

This paper designs machines that learn to rotate a qubit about an initially unknown direction encoded in a spin-j particle, and shows that a small quantum memory outperforms every classical memory for each finite j.

Contribution

It shows that quantum memory of logarithmic size outperforms any classical memory in learning rotations about an unknown direction.

Findings

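01 A quantum memory of O(log j) qubits outperforms every classical memory, however large.

02 The quantum advantage is present for every finite j and persists as long as the memory is accessed no more than O(j) times.

03 The optimal performance of classical memories provides a benchmark for experimental demonstrations of quantum-enhanced learning.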

Abstract

We design machines that learn how to rotate a quantum bit about an initially unknown direction, encoded in the state of a spin-j particle. We show that a machine equipped with a quantum memory of O(log j) qubits can outperform all machines with purely classical memory, even if the size of their memory is arbitrarily large. The advantage is present for every finite j and persists as long as the quantum memory is accessed for no more than O(j) times. We establish these results by deriving the ultimate performance achievable with purely classical memories, thus providing a benchmark that can be used to experimentally demonstrate the implementation of quantum-enhanced learning.

Equations

$$\rho_{\theta,\mathbf{n}}=\mathcal{C}_{\theta}\big(|j,j\rangle_{\mathbf{n}}\langle j,j|_{\mathbf{n}}\otimes|\psi\rangle\langle\psi|\big)\,,$$

$$F_{1}(j,\theta):=\int{\rm d}\mathbf{n}\,\int{\rm d}\psi\,\langle\psi|V_{\theta,\mathbf{n}}^{\dagger}\,\Big[\mathcal{C}_{\theta}\big(|j,j\rangle_{\mathbf{n}}\langle j,j|_{\mathbf{n}}\otimes|\psi\rangle\langle\psi|\big)\Big]\,V_{\theta,\mathbf{n}}|\psi\rangle\,,$$

$$|\phi_{x}\rangle:=(U_{x}\otimes I_{\rm A})\,|\phi\rangle\,,$$

$$V_{\theta,g}:=U_{g}V_{\theta}U_{g}^{\dagger}\,,$$

$$|\phi_{\theta,g}\rangle:=\big(U_{g}^{(j)}\otimes I_{\rm A}\big)\,|\phi_{\theta}\rangle\,,$$

$$F_{2}(j,\theta)=\int{\rm d}g\,\int{\rm d}\psi\,\langle\psi|V_{\theta,g}^{\dagger}\,\mathcal{C}_{\theta}\big(|\phi_{\theta,g}\rangle\langle\phi_{\theta,g}|\otimes|\psi\rangle\langle\psi|\big)\,V_{\theta,g}|\psi\rangle\,,$$

$$\begin{aligned}F_{2}(j,\theta)&=\int{\rm d}g\,\int{\rm d}\psi\,\langle\psi|V_{\theta,\mathbf{n}(g)}^{\dagger}\,\mathcal{C}_{\theta}\Big(|j,j\rangle_{\mathbf{n}(g)}\langle j,j|_{\mathbf{n}(g)}\otimes|\psi\rangle\langle\psi|\Big)\,V_{\theta,\mathbf{n}(g)}|\psi\rangle\\&=\int{\rm d}\mathbf{n}\,\int{\rm d}\psi\,\langle\psi|V_{\theta,\mathbf{n}}^{\dagger}\,\mathcal{C}_{\theta}\Big(|j,j\rangle_{\mathbf{n}}\langle j,j|_{\mathbf{n}}\otimes|\psi\rangle\langle\psi|\Big)\,V_{\theta,\mathbf{n}}|\psi\rangle\\&=F_{1}(j,\theta)\,.\end{aligned}$$

$$\mathcal{C}_{\theta}\circ\big(\mathcal{U}_{g}^{(j)}\otimes\mathcal{U}_{g}\big)=\mathcal{U}_{g}\circ\mathcal{C}_{\theta}\qquad\forall g\in\mathsf{SO}(3)\,,$$

$$F_{2}(j,\theta)=\int{\rm d}h\,\int{\rm d}k\,\int{\rm d}\psi\,\langle\psi|V_{\theta,k}^{\dagger}\,\mathcal{C}_{\theta}\big(\phi_{\theta,kh^{-1}}\otimes\psi\big)\,V_{\theta,k}|\psi\rangle\,,$$

$$\langle\phi_{\theta}\rangle=\int{\rm d}h\,\phi_{\theta,h^{-1}}$$

$$F_{2}(j,\theta)$$

$$\langle\phi_{\theta}\rangle=\sum_{m=-j}^{+j}p_{m}^{(\theta)}\,|j,m\rangle\langle j,m|\otimes|\alpha_{m}^{(\theta)}\rangle\langle\alpha_{m}^{(\theta)}|\,,$$

$$\begin{aligned}F_{2}(j,\theta)&=\int{\rm d}g\,\int{\rm d}\psi^{\prime}\,\langle\psi^{\prime}|V_{\theta}^{\dagger}U_{g}^{\dagger}\,\mathcal{C}_{\theta}\big(\mathcal{U}_{g}^{(j)}(\phi_{\theta})\otimes\mathcal{U}_{g}(\psi^{\prime})\big)\,U_{g}V_{\theta}|\psi^{\prime}\rangle\\&=\int{\rm d}\psi^{\prime}\,\langle\psi^{\prime}|\,V_{\theta}^{\dagger}\,\mathcal{C}^{\prime}_{\theta}\big(\phi_{\theta}\otimes\psi^{\prime}\big)\,V_{\theta}|\psi^{\prime}\rangle\,,\end{aligned}$$

$$F_{2}(j,\theta)=\int{\rm d}\psi\,\langle\psi|V_{\theta}^{\dagger}\,\mathcal{C}_{\theta}\big(\phi_{\theta}\otimes\psi\big)\,V_{\theta}|\psi\rangle\,.$$

$$F_{2}(j,\theta)=\frac{1}{3}+\frac{2}{3}\,F_{2}^{(e)}(j,\theta)\,,$$

$$F_{2}^{(e)}(j,\theta):=\langle\Phi^{+}|(V_{\theta}\otimes I_{\rm R})^{\dagger}\,\Big[(\mathcal{C}_{\theta}\otimes\mathcal{I}_{\rm R})\big(|j,m_{\theta}\rangle\langle j,m_{\theta}|\otimes\Phi^{+}\big)\Big]\,(V_{\theta}\otimes I_{\rm R})|\Phi^{+}\rangle\,,$$

$$F_{2}^{(e)}(j,\theta)$$

$$|\Phi_{\theta}\rangle=(V_{\theta}\otimes I_{\rm R})\,|\Phi^{+}\rangle\,,$$

$$C_{\theta}=2(2j+1)\,\big(\mathcal{I}_{{\rm R}_{j}}\otimes\mathcal{C}_{\theta}\otimes\mathcal{I}_{\rm R}\big)\big(\Phi_{j}^{+}\otimes\Phi^{+}\big)\,,$$

$$C_{\theta}^{*}:=\big(e^{-i\pi J_{y}}\otimes I\otimes\sigma_{y}\big)\,C_{\theta}\,\big(e^{i\pi J_{y}}\otimes I\otimes\sigma_{y}\big)\,.$$

$$\big[C_{\theta}^{*},\,U_{g}^{(j)}\otimes U_{g}\otimes U_{g}\big]=0\,,\qquad\forall g\in\mathsf{SO}(3)\,.$$

$$\mathbb{C}^{2j+1}\otimes\mathbb{C}^{2}\otimes\mathbb{C}^{2}=\mathbb{C}^{2j-1}\oplus\mathbb{C}^{2j+3}\oplus\big(\mathbb{C}^{2j+1}\otimes\mathbb{C}^{2}\big)\,.$$

$$C_{\theta}^{*}=\alpha\,P_{j+1}\oplus\beta\,P_{j-1}\oplus(P_{j}\otimes M)\,,$$

$${\rm Tr}_{\rm out}[C_{\theta}^{*}]=\alpha\,\frac{2j+3}{2j+2}\,P_{j+\frac{1}{2}}+\beta\,\frac{2j-1}{2j}\,P_{j-\frac{1}{2}}+\langle+|M|+\rangle\,\frac{2j+1}{2j+2}\,P_{j+\frac{1}{2}}+\langle-|M|-\rangle\,\frac{2j+1}{2j}\,P_{j-\frac{1}{2}}\,,$$

$$\begin{cases}\dfrac{2j+3}{2j+2}\,\alpha+\dfrac{2j+1}{2j+2}\,\langle+|M|+\rangle=1\\[2ex]\dfrac{2j-1}{2j}\,\beta+\dfrac{2j+1}{2j}\,\langle-|M|-\rangle=1\,.\end{cases}$$

$$F_{2}^{(e)}(j,\theta)$$

$$|j,-m_{\theta}\rangle\otimes|\Phi_{\theta}^{*}\rangle=a\,|j+1,-m_{\theta}\rangle+b\,|j-1,-m_{\theta}\rangle+c_{+}\,|j,-m_{\theta}\rangle\otimes|+\rangle+c_{-}\,|j,-m_{\theta}\rangle\otimes|-\rangle\,,$$
Full text

Yin Mo

Department of Computer Science, The University of Hong Kong, Pokfulam Road, Hong Kong

Giulio Chiribella

Department of Computer Science, The University of Hong Kong, Pokfulam Road, Hong Kong

Department of Computer Science, University of Oxford, Wolfson Building, Parks Road, Oxford, UK

I Introduction

Quantum machine learning Biamonte et al. (2017); Dunjko and Briegel (2018) explores the interface between machine learning and quantum information science. On the one hand, quantum algorithms have been shown to offer speedups to a variety of classical machine learning tasks Aïmeur et al. (2006); Harrow et al. (2009); Rebentrost et al. (2014); Rønnow et al. (2014); Wiebe et al. (2014); Dunjko et al. (2016); Amin et al. (2018). On the other hand, ideas from machine learning stimulated the formulation of new quantum tasks, such as quantum state classification Sasaki et al. (2001); Sasaki and Carlini (2002); Guţă and Kotłowski (2010); Sentís et al. (2012), quantum learning of gates Bisio et al. (2010); Marvian and Lloyd (2016); Sedlák et al. (2019) and measurements Bisio et al. (2011); Sentís et al. (2012).

An important component of any learning machine is its internal memory, wherein information gathered from the environment is stored. A machine equipped with purely classical memory can only gather information through measurements, and can only perform conditional operations controlled by classical data. In contrast, a machine equipped with a quantum memory can gather information by interacting coherently with its environment, and can perform operations that are controlled by quantum data. A fundamental question is whether the additional freedom offered by the quantum memory can enhance the learning performance.

A task where quantum memories are known to enhance the performance is quantum cloning Scarani et al. (2005), which can be rephrased as the task of learning how to prepare copies of an unknown quantum state by gathering sample copies of it. In this task, a machine with a quantum memory can achieve strictly higher accuracy than all machines with a purely classical memory Bužek and Hillery (1996); Gisin and Massar (1997); Werner (1998).

A strikingly different situation occurs in the task of learning how to perform unitary gates. In this case, quantum memories can enhance the performance of probabilistic learning machines Nielsen and Chuang (1997); Vidal et al. (2002); Hillery et al. (2002); Vidal et al. (2002); Brazier et al. (2005); Ishizaka and Hiroshima (2008); Bartlett et al. (2009); Sedlák et al. (2019), but the enhancements observed so far disappear if the machines are required to approximate the desired gate with unit probability Bisio et al. (2010). The reason for such behaviour is that the learning machines considered so far were designed to perform groups of unitary gates, such as the group $\mathsf{SO}(3)$ of all qubit rotations, or the group $\mathsf{U}(1)$ of qubit rotations about a fixed axis. In these highly symmetric scenarios, a general theorem by Bisio *et al.* (2010) implies that every quantum machine operating with unit probability can be replaced by a machine that achieves the same learning accuracy with a purely classical memory. Given the generality of this result, one may be tempted to conjecture that quantum memories are of no use for deterministically learning unitary gates. Such a conjecture would be consistent with Nielsen and Chuang’s no-programming theorem Nielsen and Chuang (1997), which implies that whenever a set of unitary gates can be perfectly encoded into a set of quantum states, the states in the set must be orthogonal, and therefore storable in a purely classical memory.

In contrast with the above observations, here we show that quantum memories can generally enhance the performance of deterministic machines attempting to learn unitary gates. To make this point, we provide a concrete example where the optimal deterministic learning strategies, with and without quantum memory, can be determined explicitly. Specifically, we consider machines that learn how to rotate a quantum particle by a desired angle $\theta$ about an initially unknown direction $\mathbf{n}=(n_{x},n_{y},n_{z})$ , imprinted in the state of a spin- $j$ particle. We consider two different ways of imprinting the direction in a spin- $j$ particle, corresponding to the following scenarios:

• *Scenario 1: spin relaxation.* A static magnetic field $\mathbf{B}=(B_{x},B_{y},B_{z})$, pointing in an unknown direction $\mathbf{n}=\mathbf{B}/\|\mathbf{B}\|$, acts in a certain region of space. A spin-$j$ probe enters the region and undergoes a thermalisation process with respect to the magnetic dipole Hamiltonian $H=-\mu\,\mathbf{B}\cdot\mathbf{J}$, where $\mathbf{J}=(J_{x},J_{y},J_{z})$ are the spin operators, $\mathbf{B}\cdot\mathbf{J}:=B_{x}J_{x}+B_{y}J_{y}+B_{z}J_{z}$, and $\mu>0$ is a suitable constant. For simplicity, we assume that the temperature is low enough that the thermal state is approximately the ground state of the Hamiltonian, namely the eigenstate of the operator $\mathbf{n}\cdot\mathbf{J}:=n_{x}J_{x}+n_{y}J_{y}+n_{z}J_{z}$ with maximum eigenvalue $j$, hereafter denoted as $|j,j\rangle_{\mathbf{n}}$. An extension to thermal states at finite temperature will be discussed in Section VI.

• *Scenario 2: action of an unknown unitary gate.* A black box implements an unknown rotation $g\in\mathsf{SO}(3)$, which transforms the $z$-axis into the direction $\mathbf{n}$. A machine prepares a spin-$j$ probe in an initial state $|\phi_{\theta}\rangle$ (possibly depending on the desired rotation angle), and sends the probe as input to the black box. After the action of the black box, the state of the probe is $|\phi_{\theta,g}\rangle=U_{g}^{(j)}|\phi_{\theta}\rangle$, where $U_{g}^{(j)}$ is the unitary matrix representing the action of the rotation $g$.

Scenario 2 is also relevant to the study of quantum reference frames Bartlett et al. (2007). Our learning task can be translated into a distributed quantum protocol involving two distant parties, Alice and Bob, who do not share a reference frame for spatial directions. The goal of the protocol is to allow Bob to rotate a target particle by a desired angle $\theta$ about the direction of Alice’s $z$ -axis, encoded in the state of a spin- $j$ particle prepared by Alice and sent to Bob as a token of her reference frame. In this setting, the unknown rotation $g$ describes the mismatch between Alice’s and Bob’s Cartesian axes, and the optimal learning strategy provides the optimal protocol for encoding the direction of Alice’s $z$ -axis in a spin- $j$ particle and for rotating Bob’s target particle accordingly.

A key difference between Scenarios 1 and 2 is that the initial state of the probe is irrelevant in Scenario 1 (every initial state is reset to the state $|j,j\rangle_{\mathbf{n}}$), while it can be optimised in Scenario 2. More generally, the optimal probe state in Scenario 2 could be an entangled state involving, in addition to the spin-$j$ particle, an auxiliary system stored in the internal memory of the machine. Nevertheless, we will show that such an auxiliary system does not increase the accuracy in the execution of the desired rotation, and therefore it can be omitted without loss of generality.

In this paper we establish the optimal learning strategies for both Scenarios 1 and 2, focussing on the case where the target particle is a qubit. A summary of the key results is as follows. For $j>1$, we find that the optimal strategies for Scenarios 1 and 2 coincide. In both cases, the optimal learning strategy consists in

1. preparing the probe in the initial state $|\phi_{\theta}\rangle=|j,j\rangle$, the eigenstate of $J_{z}$ with maximum eigenvalue $j$,

2. imprinting the direction in the probe, and storing the resulting state in a quantum memory of $\lceil\log(2j+1)\rceil$ qubits,

3. retrieving the probe’s state from the memory and letting it interact with the target through the isotropic Heisenberg interaction $H\propto\sigma_{x}J_{x}+\sigma_{y}J_{y}+\sigma_{z}J_{z}$, where $(\sigma_{i})_{i=x,y,z}$ are the Pauli matrices for the target qubit.

Notably, the structure of the optimal learning machine is independent of the desired rotation angle $\theta$ : a single probe state and a single interaction Hamiltonian work optimally for all possible angles. The rotation angle only affects the interaction time between the probe and the target.

For every $j>1$ , we prove that the optimal machine with quantum memory outperforms every machine with purely classical memory. We determine the optimal fidelity over all machines with purely classical memory, providing a benchmark that can be used to demonstrate the advantage of quantum memories in realistic experiments. For example, we show that the optimal classical strategy for $j=3/2$ and $\theta=\pi$ has fidelity $64\%$ , while the optimal quantum strategy has fidelity of $71\%$ . As a consequence, every experimental fidelity above $64\%$ guarantees the demonstration of quantum-enhanced learning. In general, we show that a non-zero quantum advantage is present for every rotation angle $\theta\not=0$ and for every $j>1$ . We also prove that the advantage persists even if the memory is accessed multiple times, as long as the number of accesses to the memory is $O(j)$ . In Scenario 1, we find that the quantum advantage persists at non-zero temperature $T$ , as long as the magnetic energy $\mu\|\mathbf{B}\|$ is large compared to the thermal fluctuation $k_{\rm B}T$ , $k_{\rm B}$ being the Boltzmann constant.
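The quantum strategy described above can be checked numerically. The sketch below is an illustration under stated assumptions, not the derivation used in the paper: it fixes the direction to $\mathbf{n}=\mathbf{e}_{z}$ (assuming, by rotational covariance of the Heisenberg interaction, that the fidelity does not depend on $\mathbf{n}$), takes $H=\sum_{i}J_{i}\otimes\sigma_{i}$ with unit coupling, discards the probe after the interaction, and replaces the Haar average over $|\psi\rangle$ by an average over the six Pauli eigenstates. The names `spin_operators` and `heisenberg_fidelity` are hypothetical helpers introduced here.

```python
import numpy as np

PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# Six Pauli eigenstates. They form a state 2-design, so averaging over them
# reproduces the Haar average of quantities quadratic in |psi><psi|.
S = 1 / np.sqrt(2)
SIX_STATES = [np.array(v, dtype=complex) for v in
              ([1, 0], [0, 1], [S, S], [S, -S], [S, 1j * S], [S, -1j * S])]


def spin_operators(spin):
    """Return (Jx, Jy, Jz) in the basis |j,j>, |j,j-1>, ..., |j,-j>."""
    m = spin - np.arange(int(round(2 * spin)) + 1)
    # <j,m+1| J_+ |j,m> = sqrt(j(j+1) - m(m+1)) sits on the superdiagonal.
    j_plus = np.diag(np.sqrt(spin * (spin + 1) - m[1:] * (m[1:] + 1)), k=1).astype(complex)
    j_minus = j_plus.conj().T
    return (j_plus + j_minus) / 2, (j_plus - j_minus) / (2 * 1j), np.diag(m).astype(complex)


def heisenberg_fidelity(spin, theta, t):
    """Average fidelity with V_theta about z when the probe |j,j> and the target
    evolve under H = sum_i J_i (x) sigma_i for a time t and the probe is discarded."""
    dim = int(round(2 * spin)) + 1
    h = sum(np.kron(jk, sk) for jk, sk in zip(spin_operators(spin), PAULI))
    energies, vecs = np.linalg.eigh(h)
    u = (vecs * np.exp(-1j * energies * t)) @ vecs.conj().T
    probe = np.zeros(dim, dtype=complex)
    probe[0] = 1.0                                      # |j,j>, aligned with the z-axis
    v_target = np.diag([np.exp(-1j * theta / 2), np.exp(1j * theta / 2)])
    total = 0.0
    for psi in SIX_STATES:
        out = (u @ np.kron(probe, psi)).reshape(dim, 2)  # rows: probe, columns: target
        rho = out.T @ out.conj()                         # partial trace over the probe
        want = v_target @ psi
        total += np.vdot(want, rho @ want).real
    return total / 6


# Scan the interaction time for j = 3/2 and theta = pi.
times = np.linspace(0.0, 2 * np.pi, 2001)
best = max(heisenberg_fidelity(1.5, np.pi, t) for t in times)
print(f"best fidelity for j=3/2, theta=pi: {best:.4f}")
```

Under these assumptions the scan peaks at $17/24\approx 0.708$ (at $t=\pi/4$ in these units), in line with the $71\%$ quoted above; at $t=0$ the same function returns the do-nothing value $1/3$.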

For $j=1$, we find a striking difference between Scenarios 1 and 2. In Scenario 1, the quantum memory offers an advantage for all possible rotation angles. In Scenario 2, the advantage disappears when the rotation angle approaches $\pi$. In that regime, the optimal strategy consists in

1. preparing the probe in the initial state $|\phi_{\theta}\rangle=|1,0\rangle$, the eigenstate of $J_{z}$ with eigenvalue $m=0$,

2. sending the probe to the unknown gate $U_{g}^{(j)}$, and measuring the resulting state $U_{g}^{(j)}|1,0\rangle$ on the basis $\{|1,0\rangle_{i}\}_{i\in\{x,y,z\}}$, where $|1,0\rangle_{i}$ is the eigenstate of $J_{i}$ with eigenvalue $m=0$,

3. conditionally on outcome $i$, performing a spin flip around the $i$-axis.
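The three steps above can be simulated directly for $\theta=\pi$. The following sketch is a Monte Carlo illustration under stated assumptions, not the paper's calculation: it takes the rotated probe to be the $m=0$ eigenstate of $\mathbf{n}\cdot\mathbf{J}$, models the spin flip about axis $i$ as the gate $\sigma_{i}$ (global phases dropped), and scores each outcome with the standard average fidelity between two qubit unitaries $A$ and $B$, namely $(2+|{\rm Tr}(A^{\dagger}B)|^{2})/6$. The helper names are hypothetical.

```python
import numpy as np

PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

R2 = 1 / np.sqrt(2)
# Spin-1 operators in the basis |1,1>, |1,0>, |1,-1>.
J1 = [
    R2 * np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex),
    R2 * np.array([[0, -1j, 0], [1j, 0, -1j], [0, 1j, 0]], dtype=complex),
    np.diag([1, 0, -1]).astype(complex),
]


def zero_state(direction):
    """Eigenvector of direction.J with eigenvalue 0 (spin 1)."""
    op = sum(c * jk for c, jk in zip(direction, J1))
    _, vecs = np.linalg.eigh(op)      # eigenvalues come out sorted: -1, 0, +1
    return vecs[:, 1]


# Measurement basis {|1,0>_x, |1,0>_y, |1,0>_z}.
BASIS = [zero_state(e) for e in np.eye(3)]


def measure_and_flip_fidelity(n):
    """Average fidelity of the measure-and-flip strategy for a given axis n, theta = pi."""
    probe = zero_state(n)                               # state after the unknown rotation
    target = sum(c * s for c, s in zip(n, PAULI))       # n.sigma, i.e. V_{pi,n} up to a phase
    fidelity = 0.0
    for basis_state, sigma in zip(BASIS, PAULI):
        prob = abs(np.vdot(basis_state, probe)) ** 2    # probability of outcome i
        overlap = abs(np.trace(sigma.conj().T @ target)) ** 2
        fidelity += prob * (2 + overlap) / 6            # gate fidelity of sigma_i vs n.sigma
    return fidelity


rng = np.random.default_rng(0)
directions = rng.normal(size=(5000, 3))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)   # uniform on the sphere
estimate = float(np.mean([measure_and_flip_fidelity(n) for n in directions]))
print(f"estimated average fidelity: {estimate:.3f}")
```

In this model the outcome probabilities are $n_{i}^{2}$, so the average works out to $(2+4\sum_{i}\langle n_{i}^{4}\rangle)/6=11/15\approx 0.73$. This is a back-of-the-envelope value for the sketch, not a figure quoted in the text.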

For $j=1/2$ , the optimal strategies for Scenarios 1 and 2 coincide, and the availability of a quantum memory offers advantages for all rotation angles except $\theta=0$ and $\theta=\pi$ .

The paper is structured as follows. In Section II we introduce the problem of learning a rotation about an unknown direction, considering two alternative ways of imprinting the direction into the state of a spin- $j$ probe. We derive the optimal quantum strategy in Section III, and the corresponding quantum benchmark in Section IV. In Section V, we show that the advantage persists even if the memory state is accessed multiple times, and in Section VI, we show that the optimal learning strategy for Scenario 1 is robust to thermal noise. In Section VII, we extend our results from qubits to systems of arbitrary dimensions. The conclusions are drawn in Section VIII.

II Learning how to rotate about an unknown axis

In this section we introduce the task of learning how to rotate a quantum particle about an initially unknown axis. We consider two scenarios, in which the unknown axis is imprinted in a quantum probe via two physically different processes: (1) spin relaxation, and (2) action of an unknown rotation gate. We formalise the optimisation problems corresponding to these scenarios and establish a relation between the corresponding solutions.

II.1 Scenario 1: learning from a relaxation process

Suppose that a static magnetic field $\mathbf{B}=(B_{x},B_{y},B_{z})$ is turned on for a limited amount of time in a bounded region of space. While the field is turned on, a spin- $j$ particle is placed in the region and undergoes a relaxation process, whereby its spin becomes aligned with the field’s direction. After the alignment has taken place, the state of the particle is stored in the internal memory of a quantum machine, which will later use it to rotate a target particle by a desired angle $\theta$ about the direction $\mathbf{n}=\mathbf{B}/\|\mathbf{B}\|$ .

We denote the spin-$j$ particle as ${\rm P}_{j}$, and let $J_{x},J_{y}$ and $J_{z}$ be its spin operators, satisfying the commutation relations $[J_{x},J_{y}]=iJ_{z}$, $[J_{y},J_{z}]=iJ_{x}$, and $[J_{z},J_{x}]=iJ_{y}$. Throughout the paper, we use the standard notation $|j,m\rangle$ (respectively, $|j,m\rangle_{\mathbf{n}}$) for the eigenstate of the operator $J_{z}$ (respectively, $\mathbf{n}\cdot\mathbf{J}$) with eigenvalue $m$.

The alignment of the magnet with the external magnetic field can be described by a thermalisation process, whereby the initial state of the magnet converges to the thermal state of the magnetic Hamiltonian $H=-\mu\,\mathbf{B}\cdot\mathbf{J}=-\mu(B_{x}J_{x}+B_{y}J_{y}+B_{z}J_{z})$ , where $\mu>0$ is a suitable constant. For simplicity, we will assume that the temperature of the bath is low enough that the thermal state is approximately the ground state of $H$ . Explicitly, the ground state is the spin coherent state $|j,j\rangle_{\mathbf{n}}$ .
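As a small numerical illustration of this statement, the sketch below builds the spin-$j$ matrices, diagonalises $H=-\mu\,\mathbf{B}\cdot\mathbf{J}$ for an arbitrarily chosen field, and checks that the ground state is the eigenstate of $\mathbf{n}\cdot\mathbf{J}$ with eigenvalue $j$. The field values, $\mu=1$, and the helper names `spin_operators` and `ground_state` are assumptions made for the example.

```python
import numpy as np


def spin_operators(spin):
    """Return (Jx, Jy, Jz) in the basis |j,j>, |j,j-1>, ..., |j,-j>."""
    m = spin - np.arange(int(round(2 * spin)) + 1)
    # <j,m+1| J_+ |j,m> = sqrt(j(j+1) - m(m+1)) sits on the superdiagonal.
    j_plus = np.diag(np.sqrt(spin * (spin + 1) - m[1:] * (m[1:] + 1)), k=1).astype(complex)
    j_minus = j_plus.conj().T
    return (j_plus + j_minus) / 2, (j_plus - j_minus) / (2 * 1j), np.diag(m).astype(complex)


def ground_state(spin, b_field, mu=1.0):
    """Lowest-energy eigenvector of H = -mu * B.J."""
    jx, jy, jz = spin_operators(spin)
    h = -mu * (b_field[0] * jx + b_field[1] * jy + b_field[2] * jz)
    _, vecs = np.linalg.eigh(h)       # eigenvalues sorted in ascending order
    return vecs[:, 0]


spin = 1.5
b_field = np.array([0.3, -1.2, 0.5])                  # arbitrary example field
n = b_field / np.linalg.norm(b_field)

jx, jy, jz = spin_operators(spin)
n_dot_j = n[0] * jx + n[1] * jy + n[2] * jz
g = ground_state(spin, b_field)

# Expectation value of n.J in the ground state: equals j for the coherent state |j,j>_n.
alignment = np.vdot(g, n_dot_j @ g).real
# Residual of the eigenvalue equation (n.J)|g> = j|g>.
residual = np.linalg.norm(n_dot_j @ g - spin * g)
print(alignment, residual)
```

The same `spin_operators` helper can be reused to verify the commutation relations stated above, for example `jx @ jy - jy @ jx` against `1j * jz`.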

Overall, the alignment process can be modelled as a quantum channel (completely positive trace-preserving map) $\mathcal{T}_{\mathbf{n}}$ that resets every state of the probe to the state $|j,j\rangle_{\mathbf{n}}$ . In Section VI we will extend our discussion to the finite-temperature scenario, where the channel $\mathcal{T}_{\mathbf{n}}$ resets the probe state to the thermal state of the magnetic dipole Hamiltonian.

The goal of the quantum machine is to rotate a target particle $\rm S$ by a given angle $\theta$ about the direction $\mathbf{n}$ . We will mostly focus on the case where the target is a spin- $1/2$ particle, regarded as a qubit. In this case, we denote the target rotation by $V_{\theta,\bf n}:=\cos\frac{\theta}{2}\,I-i\sin\frac{\theta}{2}\,\mathbf{n}\cdot\boldsymbol{\sigma}$ , where $\boldsymbol{\sigma}=(\sigma_{x},\sigma_{y},\sigma_{z})$ are the three Pauli matrices, and $\mathbf{n}\cdot\boldsymbol{\sigma}:=n_{x}\sigma_{x}+n_{y}\sigma_{y}+n_{z}\sigma_{z}$ .

To learn how to implement the target rotation, the machine will transfer information from the magnet to its internal memory $\rm M$ . Mathematically, this operation is described by a quantum channel (completely positive trace-preserving map) $\mathcal{E}_{\theta}$ transforming states of ${\rm P}_{j}$ into states of $\rm M$ . To be completely general, we allow the channel to depend on the desired angle $\theta$ . If the memory is classical, the channel $\mathcal{E}_{\theta}$ represents a measurement on the magnet, followed by the storage of the outcome in the memory. If the memory is quantum, the channel $\mathcal{E}_{\theta}$ can be any process transforming states of the magnet into states of the memory.

When asked to perform the target rotation, the machine will retrieve information from its internal memory, and will use such information to control the evolution of the target system, hereafter denoted by $\rm S$ . If the memory is classical, the control amounts to a conditional operation on the target depending on the classical data stored in the memory. If the memory is quantum, the control can be any general interaction between the memory and the target system. In both cases, the control operation can be described by a quantum channel $\mathcal{R}_{\theta}$ transforming joint states of the composite system $\rm M\otimes\rm S$ into states of $\rm S$ .

Overall, the structure of the learning process is depicted in Figure 1.

If the initial state of the target is $|\psi\rangle$ , then the final state is

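$$\rho_{\theta,\mathbf{n}}=\mathcal{C}_{\theta}\big(|j,j\rangle_{\mathbf{n}}\langle j,j|_{\mathbf{n}}\otimes|\psi\rangle\langle\psi|\big)\,,\tag{1}$$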

where $\mathcal{C}_{\theta}:=\mathcal{R}_{\theta}\circ(\mathcal{E}_{\theta}\otimes\mathcal{I}_{\rm S})$ is the effective quantum channel transforming joint states of the probe and the target into states of the target alone.

To evaluate the accuracy of the learning process, we compare the output state $\rho_{\theta,\mathbf{n}}$ with the desired output $V_{\theta,\bf n}|\psi\rangle\langle\psi|V_{\theta,\bf n}^{\dagger}$ . As a figure of merit, we use the average input-output fidelity Gilchrist et al. (2005)

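$$F_{1}(j,\theta):=\int{\rm d}\mathbf{n}\,\int{\rm d}\psi\,\langle\psi|V_{\theta,\mathbf{n}}^{\dagger}\,\Big[\mathcal{C}_{\theta}\big(|j,j\rangle_{\mathbf{n}}\langle j,j|_{\mathbf{n}}\otimes|\psi\rangle\langle\psi|\big)\Big]\,V_{\theta,\mathbf{n}}|\psi\rangle\,,\tag{2}$$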

where ${\rm d}\bf n$ is the rotationally-invariant probability distribution on the unit sphere, $|\psi\rangle$ is the initial state of the target qubit, and ${\rm d}\psi$ is the unitarily invariant probability distribution on the pure states. The associated optimisation problem is:

Problem 1

Find the quantum channel $\mathcal{C}_{\theta}$ that maximises the fidelity $F_{1}(j,\theta)$ in Equation (2).
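To get a feel for the figure of merit, it helps to evaluate it for channels that ignore the probe altogether. The sketch below is an illustration, not part of the paper's analysis: it computes the average over $|\psi\rangle$ for a fixed direction $\mathbf{n}$, using the six Pauli eigenstates in place of the Haar measure (an exact substitution for quantities quadratic in $|\psi\rangle\langle\psi|$). The two example channels and all function names are assumptions introduced here.

```python
import numpy as np

PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

S = 1 / np.sqrt(2)
SIX_STATES = [np.array(v, dtype=complex) for v in
              ([1, 0], [0, 1], [S, S], [S, -S], [S, 1j * S], [S, -1j * S])]


def rotation(theta, n):
    """V_{theta,n} = cos(theta/2) I - i sin(theta/2) n.sigma."""
    n_sigma = sum(c * s for c, s in zip(n, PAULI))
    return np.cos(theta / 2) * np.eye(2) - 1j * np.sin(theta / 2) * n_sigma


def average_fidelity(theta, n, channel):
    """Average of <psi| V^dag channel(|psi><psi|) V |psi> over the six-state design."""
    v = rotation(theta, n)
    total = 0.0
    for psi in SIX_STATES:
        rho_out = channel(np.outer(psi, psi.conj()))
        want = v @ psi
        total += np.vdot(want, rho_out @ want).real
    return total / 6


def do_nothing(rho):
    """Leave the target untouched."""
    return rho


def depolarise(rho):
    """Replace the target by the maximally mixed state."""
    return np.eye(2, dtype=complex) / 2


theta = 2.0
n = np.array([1.0, 2.0, -2.0]) / 3.0          # an arbitrary unit vector
f_nothing = average_fidelity(theta, n, do_nothing)
f_mixed = average_fidelity(theta, n, depolarise)
print(f_nothing, (2 + np.cos(theta)) / 3, f_mixed)
```

For the do-nothing channel the result is $(2+\cos\theta)/3$ regardless of $\mathbf{n}$, which drops to $1/3$ at $\theta=\pi$, while the maximally mixed output always gives $1/2$. Any useful learning strategy has to beat both of these baselines.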

The optimisation can be performed with different constraints on the channel $\mathcal{C}_{\theta}$ , corresponding to different assumptions on the machine’s internal memory. In this paper, we will consider two cases:

1. The machine is equipped with a quantum memory of $\lceil\log(2j+1)\rceil$ qubits. In this case, the channel $\mathcal{C}_{\theta}$ is an arbitrary completely positive trace-preserving map.

2. The machine is equipped with a classical memory of arbitrary size. In this case, the channel $\mathcal{C}_{\theta}$ must be decomposable into a measurement on the probe followed by a conditional operation on the target.

We will carry out both optimisations and compare the maximum fidelity achievable with a quantum memory with the maximum fidelity achievable with classical memories of arbitrary size.

II.2 Scenario 2: learning from a rotation gate

Consider the following general problem. A quantum machine has access to one use of a black box implementing some unknown unitary gate $U_{x}$ , randomly drawn from some set $(U_{x})_{x\in\mathsf{X}}$ . By interacting with the black box, the machine has to learn how to perform another unitary gate $V_{x}$ , acting on a target system $\rm S$ . Typically, the gate learning problems considered so far correspond to the case $V_{x}=U_{x}$ (the machine attempts to emulate the gate $U_{x}$ Bisio et al. (2010); Marvian and Lloyd (2016); Sedlák et al. (2019)), or to the case $V_{x}=U_{x}^{\dagger}$ (the machine attempts to invert the gate $U_{x}$ Bartlett et al. (2009); Bisio et al. (2010)). In general, the relation between $U_{x}$ and $V_{x}$ can be arbitrary.

To learn the target gate, the machine sends a probe $\rm P$ to the black box. In general, the probe can be entangled with an auxiliary system $\rm A$ , stored in the machine’s internal memory. If the initial state of the composite system $\rm{P}\otimes{\rm A}$ is $|\phi\rangle$ , then the state after the action of the black box is

$$|\phi_{x}\rangle:=(U_{x}\otimes I_{\rm A})\,|\phi\rangle\,,\tag{3}$$

where $I_{\rm A}$ denotes the identity operator on the auxiliary system.

After the black box has acted, the probe returns to the machine, which transfers information from the state $|\phi_{x}\rangle$ to its internal memory $\rm M$ . The transfer of information is described by a quantum channel $\mathcal{E}$ with input system $\rm P\otimes\rm A$ and output system $\rm M$ . Overall, the imprinting of the parameter $x$ in the machine’s memory is called the training phase. Accordingly, we call $U_{x}$ the training gate.

After the training phase has been concluded, the machine will be asked to perform the gate $V_{x}$ on the target system. We call this phase the execution phase. The machine will access its internal memory and use it to control the evolution of the target system. The control mechanism is described by a quantum channel $\mathcal{R}$ with input system $\rm M\otimes\rm S$ and output system $\rm S$ .

We call the above scenario the $U_{x}$ -to- $V_{x}$ learning problem. Its overall structure is summarised in Figure 2. The temporal separation between the training phase and the execution phase makes the $U_{x}$ -to- $V_{x}$ learning problem distinct from the problem of simulating the gate $V_{x}$ using the gate $U_{x}$ as a resource Chiribella et al. (2008); Bisio et al. (2014); Chiribella et al. (2015); Yang et al. (2017); Miyazaki et al. (2017); Quintino et al. (2018). In that problem, the gate $V_{x}$ is simulated by an arbitrary circuit using the gate $U_{x}$ , not necessarily a circuit of the form depicted in Figure 2.

A general result by Bisio et al. concerns the case where the set $\mathsf{X}$ is a group and the mappings $x\mapsto U_{x}$ and $x\mapsto V_{x}$ (or $x\mapsto V_{x}^{\dagger}$) are two unitary representations of the group $\mathsf{X}$. In this scenario, the authors showed that the optimal learning performance can be achieved with a purely classical memory Bisio et al. (2010). In this paper, we present an instance of the $U_{x}$-to-$V_{x}$ learning problem that evades the no-go theorem of Bisio et al. In our scenario, $x$ is a rotation $g\in\mathsf{SO}(3)$, the probe is a spin-$j$ particle ${\rm P}_{j}$, the training gate is the unitary gate $U_{g}^{(j)}$ that implements the rotation $g$ on the probe, the target is a qubit, and the target gate is the rotation $V_{\theta,g}$ defined by

$$V_{\theta,g}:=U_{g}V_{\theta}U_{g}^{\dagger}\,,\tag{4}$$

where $\theta\in[0,2\pi)$ is a fixed, but otherwise arbitrary angle, $U_{g}$ is the 2-by-2 unitary matrix representing the rotation $g$, and $V_{\theta}=\cos\frac{\theta}{2}\,I-i\sin\frac{\theta}{2}\,\sigma_{z}$ is the 2-by-2 matrix representing a rotation by $\theta$ about the $z$-axis. Since the rotation angle is fixed, the target operations do not form a group, and therefore our learning problem falls outside the hypotheses of the no-go theorem of Bisio et al.
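Two facts in this paragraph are easy to confirm numerically: conjugating $V_{\theta}$ by $U_{g}$ yields the rotation by $\theta$ about the rotated $z$-axis, and the product of two such target gates is in general not a rotation by $\theta$. The sketch below is a minimal check, with the axis-angle parametrisation of $U_{g}$ and the helper names `su2` and `rotated_z_axis` chosen here for illustration.

```python
import numpy as np

PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]


def su2(axis, angle):
    """2-by-2 unitary cos(angle/2) I - i sin(angle/2) axis.sigma for a unit axis."""
    a_sigma = sum(c * s for c, s in zip(axis, PAULI))
    return np.cos(angle / 2) * np.eye(2) - 1j * np.sin(angle / 2) * a_sigma


def rotated_z_axis(u):
    """Direction n defined by u sigma_z u^dag = n.sigma, with n_i = Tr(sigma_i u sigma_z u^dag)/2."""
    rotated = u @ PAULI[2] @ u.conj().T
    return np.array([np.trace(s @ rotated).real / 2 for s in PAULI])


theta = np.pi / 2
axis = np.array([2.0, -1.0, 2.0]) / 3.0        # arbitrary unit axis parametrising g
u_g = su2(axis, 1.1)                           # arbitrary rotation angle for g

v_theta = su2(np.array([0.0, 0.0, 1.0]), theta)
v_theta_g = u_g @ v_theta @ u_g.conj().T       # Equation (4)
n = rotated_z_axis(u_g)

# V_{theta,g} coincides with the rotation by theta about n = g e_z.
matches = np.allclose(v_theta_g, su2(n, theta))

# Product of two target gates about different axes: its trace differs from
# 2 cos(theta/2), so it is not a rotation by theta about any axis.
product = su2(np.array([0.0, 0.0, 1.0]), theta) @ su2(np.array([1.0, 0.0, 0.0]), theta)
trace_gap = abs(abs(np.trace(product)) - 2 * np.cos(theta / 2))
print(matches, trace_gap)
```

For $\theta=\pi/2$ the product of the $z$ and $x$ rotations has trace $1$, whereas every rotation by $\theta$ has trace $2\cos(\theta/2)=\sqrt{2}$, which makes the failure of closure explicit.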

The $U_{g}$-to-$V_{\theta,g}$ learning problem is also relevant to the study of quantum reference frames Bartlett et al. (2007). Suppose that two distant parties, Alice and Bob, do not share a reference frame for directions. This means that Bob’s Cartesian axes $\mathbf{n}_{x}^{(B)}$, $\mathbf{n}_{y}^{(B)}$, and $\mathbf{n}_{z}^{(B)}$ are related to Alice’s Cartesian axes $\mathbf{n}_{x}^{(A)}$, $\mathbf{n}_{y}^{(A)}$, and $\mathbf{n}_{z}^{(A)}$ by an unknown element of the rotation group $\mathsf{SO}(3)$, namely $\mathbf{n}_{i}^{(B)}=g\mathbf{n}_{i}^{(A)}$, for all $i\in\{x,y,z\}$. Now, imagine that Bob wants to rotate a qubit by an angle $\theta$ about the direction of Alice’s $z$-axis. To assist Bob in this task, Alice will send him a quantum system carrying information about her reference frame. If the transmitted system is a spin-$j$ particle, prepared by Alice in the state $|\phi_{\theta}\rangle$, then Bob will receive the particle in the state $U_{g}^{(j)}|\phi_{\theta}\rangle$, owing to the mismatch of their reference frames. Using the state $U_{g}^{(j)}|\phi_{\theta}\rangle$ as a resource, Bob can attempt to execute the desired rotation, corresponding to the unitary gate $V_{\theta,g}=U_{g}V_{\theta}U_{g}^{\dagger}$. More generally, Alice could send Bob a spin-$j$ particle together with an auxiliary particle $\rm A$ whose state space is invariant under rotations. In this case, Bob will receive the state $\left(U_{g}^{(j)}\otimes I_{\rm A}\right)|\phi_{\theta}\rangle$, where $|\phi_{\theta}\rangle$ is the initial state of the spin-$j$ and the auxiliary particle. In this setting, the search for the optimal communication protocol between Alice and Bob is equivalent to the search for the optimal learning strategy for the $U_{g}$-to-$V_{\theta,g}$ learning problem.

A diagrammatic representation of the $U_{g}$ -to- $V_{\theta,g}$ learning problem is provided in Figure 3. The spin- $j$ probe and the auxiliary system $\rm A$ start off in the state $|\phi_{\theta}\rangle$ . Then, the probe is sent through the gate $U_{g}^{(j)}$ . After the action of the gate $U_{g}^{(j)}$ , the probe and system $\rm A$ will be in the state

[TABLE]

where $I_{\rm A}$ is the identity on the auxiliary system’s Hilbert space. Then, the state $|\phi_{\theta,g}\rangle$ is encoded in the machine’s memory via a channel $\mathcal{E}_{\theta}$ . In the execution phase, the machine will perform a quantum channel $\mathcal{R}_{\theta}$ , transforming the input state of the memory and the target into the output state of the target.

The average fidelity for the $U_{g}$ -to- $V_{\theta,g}$ learning task is

[TABLE]

where ${\rm d}g$ is the normalized Haar measure over the rotation group, and $\mathcal{C}_{\theta}:=\mathcal{R}_{\theta}\circ(\mathcal{E}_{\theta}\otimes\mathcal{I}_{\rm S})$ is the effective channel transforming states of the composite system ${\rm P}_{j}\otimes\rm A\otimes\rm S$ into states of $\rm S$ .

This leads to the following optimisation problem:

Problem 2

Find the auxiliary system $\rm A$ , the input state $|\phi_{\theta}\rangle$ , and the channel $\mathcal{C}_{\theta}$ that maximise the fidelity $F_{2}(j,\theta)$ in Equation (6).

Problem 2 reduces to Problem 1 of the previous subsection if system $\rm A$ is trivial, and if the initial state $|\phi_{\theta}\rangle$ is the spin coherent state $|j,j\rangle$ , independently of $\theta$ . In this case, the state (5) fed into the quantum machine is the spin coherent state $U_{g}|j,j\rangle=|j,j\rangle_{\mathbf{n}(g)}$ , where $\mathbf{n}(g)$ is the rotated $z$ -axis $\mathbf{n}(g):=g\mathbf{e}_{z}$ , $\mathbf{e}_{z}=(0,0,1)$ . Since the rotation $g$ is chosen at random according to the Haar measure, the direction $\mathbf{n}(g)$ is distributed uniformly over the unit sphere (see e.g. Section 4.1 of Holevo’s textbook Holevo (2011)). Hence, one has

[TABLE]

Hence, the fidelities $F_{1}(j,\theta)$ and $F_{2}(j,\theta)$ coincide when the input state $|\phi_{\theta}\rangle$ is the spin coherent state $|j,j\rangle$ . Under this condition, both fidelities $F_{1}(j,\theta)$ and $F_{2}(j,\theta)$ are maximised by the same quantum channel $\mathcal{C}_{\theta}$ .
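The identification of $U_{g}|j,j\rangle$ with a spin coherent state pointing along $\mathbf{n}(g)$ can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes the Euler-angle parametrisation $U=e^{-i\alpha J_{z}}e^{-i\beta J_{y}}$ (a convention chosen here for illustration), and the helper names `spin_ops`, `expm_herm`, and `coherent_residual` are hypothetical.

```python
import numpy as np

def spin_ops(j):
    """Spin-j matrices (Jx, Jy, Jz) in the basis |j,j>, |j,j-1>, ..., |j,-j>."""
    dim = int(round(2 * j + 1))
    m = j - np.arange(dim)                      # magnetic quantum numbers
    jp = np.zeros((dim, dim))                   # raising operator J+
    for k in range(1, dim):
        jp[k - 1, k] = np.sqrt(j * (j + 1) - m[k] * (m[k] + 1))
    jx = ((jp + jp.T) / 2).astype(complex)
    jy = (jp - jp.T) / (2 * 1j)
    jz = np.diag(m).astype(complex)
    return jx, jy, jz

def expm_herm(h, angle):
    """Return exp(-i * angle * h) for a Hermitian matrix h."""
    w, v = np.linalg.eigh(h)
    return (v * np.exp(-1j * angle * w)) @ v.conj().T

def coherent_residual(j, alpha, beta):
    """Norm of (n.J - j)|psi> for |psi> = exp(-i alpha Jz) exp(-i beta Jy)|j,j>."""
    jx, jy, jz = spin_ops(j)
    top = np.zeros(jz.shape[0], dtype=complex)
    top[0] = 1.0                                # the state |j,j>
    psi = expm_herm(jz, alpha) @ expm_herm(jy, beta) @ top
    # Direction obtained by rotating e_z (assumed Euler convention)
    n = np.array([np.sin(beta) * np.cos(alpha),
                  np.sin(beta) * np.sin(alpha),
                  np.cos(beta)])
    n_dot_j = n[0] * jx + n[1] * jy + n[2] * jz
    return np.linalg.norm(n_dot_j @ psi - j * psi)
```

A residual close to zero indicates that the rotated state is the eigenstate of $\mathbf{n}\cdot\mathbf{J}$ with maximal eigenvalue $j$, which is the defining property of the spin coherent state $|j,j\rangle_{\mathbf{n}}$.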

III Optimal quantum strategies

Here we determine the optimal quantum strategies for learning rotations around an unknown direction. We solve Problems 1 and 2 defined in the previous section for all values of the spin $j$ and for all values of the rotation angle $\theta$ . For $j>1$ , we show that the optimal state for Problem 2 is the spin coherent state $|j,j\rangle$ , and therefore the optimal fidelity coincides with the optimal fidelity for Problem 1. In both problems, the best approximation of the target rotation is realised by setting up an isotropic Heisenberg interaction between the target and the probe. For $j=1/2$ and $j=1$ , we find some curious features of the optimal strategies. Notably, the optimal solution of Problem 2 deviates from the optimal solution of Problem 1 for $j=1$ when the rotation angle approaches $\pi$ .

III.1 Structure of the optimal solution of Problem 2

Here we focus on Problem 2 and determine the structure of its optimal solution. The main result is the following theorem:

Theorem 1

The optimal strategy for learning the target gate $V_{\theta,g}=U_{g}V_{\theta}U_{g}^{\dagger}$ from the training gate $U_{g}^{(j)}$ has the following features:

1. no auxiliary system is needed

2. the optimal input state is an eigenstate of $J_{z}$

3. the optimal quantum channel is rotationally covariant, namely

[TABLE]

where $\mathcal{U}_{g}$ and $\mathcal{U}_{g}^{(j)}$ are the quantum channels induced by the unitary gates $U_{g}$ and $U_{g}^{(j)}$ , respectively.

The theorem follows from two lemmas:

Lemma 1

No auxiliary system is needed in the optimal strategy for learning the gate $V_{\theta,g}$ from the gate $U_{g}^{(j)}$ . The optimal input is an eigenstate of the $z$ -component of the angular momentum.

**Proof.** Note that the target gate $V_{\theta,g}$ satisfies the relation $V_{\theta,g}=V_{\theta,gh}$ for every rotation $h$ around the $z$ axis. Then, the fidelity (6) can be rewritten as

[TABLE]

where we used the shorthand notation $\chi:=|\chi\rangle\langle\chi|$ , and we derived the second equality from the invariance of the Haar measure with the change of variables $k=gh$ . Defining the average state

[TABLE]

and its rotated version $\langle\phi_{\theta}\rangle_{k}=\big{(}U_{k}^{(j)}\otimes I_{\rm A}\big{)}\,\langle\phi_{\theta}\rangle\,\big{(}U_{k}^{(j)}\otimes I_{\rm A}\big{)}^{\dagger}$ , the fidelity can be expressed as

[TABLE]

Since $\langle\phi_{\theta}\rangle$ is the average of $\phi_{\theta}$ over all rotations about the $z$ axis, it can be expressed as

[TABLE]

where $\{p^{(\theta)}_{m}\}_{m=-j}^{j}$ is a probability distribution, and each $|\alpha^{(\theta)}_{m}\rangle$ is a pure state of the auxiliary system. Since the fidelity is linear in the input state, its maximum is attained by one of the terms in the mixture, such as $|j,m\rangle\langle j,m|\otimes|\alpha^{(\theta)}_{m}\rangle\langle\alpha^{(\theta)}_{m}|$ . Moreover, the state of the auxiliary system can be absorbed in the definition of the channel $\mathcal{C}_{\theta}$ . This concludes the proof that the optimal input state can be chosen to be $|j,m\rangle$ without loss of generality and that no auxiliary system is needed. $\blacksquare$

Consistently with the above result, we will omit the auxiliary system $\rm A$ from now on.

Lemma 2

The optimal channel $\mathcal{C}_{\theta}$ for learning the gate $V_{\theta,g}$ from the gate $U_{g}^{(j)}$ can be chosen to be covariant without loss of generality.

**Proof.** The optimality of covariant channels follows from the following chain of equalities:

[TABLE]

having defined $|\psi^{\prime}\rangle:=U_{g}^{\dagger}|\psi\rangle$ in the second equality, and $\mathcal{C}_{\theta}^{\prime}:=\int{\rm d}g\,\mathcal{U}_{g}^{\dagger}\mathcal{C}_{\theta}\big{(}\mathcal{U}_{g}^{(j)}\otimes\mathcal{U}_{g}\big{)}$ in the third equality. Since $\mathcal{C}_{\theta}^{\prime}$ is covariant, the above equality shows that every channel can be replaced by a covariant channel with exactly the same fidelity. $\blacksquare$

Covariant channels have the same performance for all possible training gates. Hence, for a covariant channel $\mathcal{C}_{\theta}$ the fidelity can be rewritten as

[TABLE]

III.2 Choi operator formulation

Theorem 1 guarantees that the optimal input state for learning the gate $V_{\theta,g}$ from the gate $U_{g}^{(j)}$ is an eigenstate of $J_{z}$ . Let us denote it generically as $|j,m_{\theta}\rangle$ , for some $m_{\theta}$ between $-j$ and $+j$ , possibly depending on the rotation angle $\theta$ . In the following we will search for the optimal value $m_{\theta}$ and for the optimal covariant channel $\mathcal{C}_{\theta}$ .

First of all, we rewrite the average fidelity as

[TABLE]

where $F_{2}^{\rm(e)}$ is the entanglement fidelity Horodecki et al. (1999), defined as

[TABLE]

$|\Phi^{+}\rangle=(|0\rangle\otimes|0\rangle+|1\rangle\otimes|1\rangle)/\sqrt{2}$ being the canonical maximally entangled state and $\rm R$ denoting a reference qubit, entangled with the target qubit. In turn, the entanglement fidelity can be expressed as

[TABLE]

where $|\Phi_{\theta}\rangle$ is the rotated maximally entangled state

[TABLE]

and $C_{\theta}$ is the Choi operator Choi (1975)

[TABLE]

${\rm R}_{j}$ being a reference system of dimension $2j+1$ , $\mathcal{I}_{{\rm R}_{j}}$ ( $\mathcal{I}_{\rm R}$ ) being the identity map on the reference system ${\rm R}_{j}$ ( ${\rm R}$ ), and $|\Phi^{+}_{j}\rangle=\sum_{m}\,|j,m\rangle\otimes|j,m\rangle/\sqrt{2j+1}$ being the canonical maximally entangled state in dimension $2j+1$ .
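As a sanity check on the passage from average fidelity to entanglement fidelity, the sketch below verifies numerically the standard qubit relation $F=(2F^{\rm(e)}+1)/3$ from Horodecki et al. (1999). That Equation (15) takes exactly this form is an assumption made here, since the equation is not reproduced above. The function names are hypothetical, and the average over pure states is evaluated on the six Pauli eigenstates, which form a 2-design and therefore reproduce the Haar average exactly.

```python
import numpy as np

def random_kraus(rng, n_kraus=3):
    """Random qubit channel: slice a random isometry into Kraus operators."""
    g = rng.normal(size=(2 * n_kraus, 2)) + 1j * rng.normal(size=(2 * n_kraus, 2))
    q, _ = np.linalg.qr(g)          # reduced QR: q has orthonormal columns
    return [q[2 * k:2 * k + 2, :] for k in range(n_kraus)]

def entanglement_fidelity(kraus, target):
    """F_e = sum_k |Tr(target^dagger K_k)|^2 / d^2 with d = 2."""
    return sum(abs(np.trace(target.conj().T @ k)) ** 2 for k in kraus) / 4

def average_fidelity(kraus, target):
    """Average of <psi|target^dagger C(psi) target|psi> over a qubit 2-design."""
    s = 1 / np.sqrt(2)
    states = [[1, 0], [0, 1], [s, s], [s, -s], [s, 1j * s], [s, -1j * s]]
    total = 0.0
    for psi in states:
        psi = np.array(psi, dtype=complex)
        ideal = target @ psi
        total += sum(abs(ideal.conj() @ (k @ psi)) ** 2 for k in kraus)
    return total / 6

def z_rotation(theta):
    """Qubit rotation about z, using the convention exp(-i theta sigma_z / 2)."""
    return np.diag([np.exp(-1j * theta / 2), np.exp(1j * theta / 2)])
```

For any channel and any target unitary, `average_fidelity` should agree with `(2 * entanglement_fidelity + 1) / 3` up to rounding error, which is why the optimisation can be carried out on the entanglement fidelity alone.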

The problem is to maximise the fidelity (17) over all Choi operators of covariant channels. The set of the possible Choi operators is characterised by the following three conditions:

1. Covariance Chiribella et al. (2009): $[C_{\theta},\overline{U}_{g}^{(j)}\otimes U_{g}\otimes\overline{U}_{g}]=0$ for all rotations $g\in\mathsf{SO}(3)$ (here $\overline{U}^{(j)}_{g}$ and $\overline{U}_{g}$ denote the entry-wise complex conjugates of the matrices $U_{g}^{(j)}$ and $U_{g}$ , respectively).

2. Positivity: $C_{\theta}$ is positive semidefinite, denoted as $C_{\theta}\geq 0$

3. Trace preservation: $\operatorname{Tr}_{\rm out}[C_{\theta}]=I_{\rm in}$ , where $\operatorname{Tr}_{\rm out}$ denotes the trace over the output, and $I_{\rm in}$ denotes the identity over the input.

We now put the above conditions in a form that is convenient for optimization.

Covariance. The covariance condition can be further simplified using the fact that complex conjugate representations of the rotation group are unitarily equivalent. Defining the operator

[TABLE]

the covariance condition becomes

[TABLE]

At this point, the total Hilbert space can be decomposed into orthogonal subspaces, corresponding to different values of the total angular momentum. Specifically, the angular momentum takes values $j-1,j,$ and $j+1$ , and the total Hilbert space is decomposed as

[TABLE]

Relative to this decomposition, using Schur’s lemmas and the covariance condition (21), the operator $C_{\theta}^{*}$ can be written as:

[TABLE]

where $P_{l}$ is the projection on the factor with total angular momentum $l$ , $\alpha$ and $\beta$ are complex coefficients, and $M$ is a complex 2-by-2 matrix.
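The multiplicity structure behind this block form (total angular momenta $j-1$ and $j+1$ appearing once, $j$ appearing twice, hence the 2-by-2 matrix $M$) can be checked by diagonalising the total $\mathbf{J}^{2}$ on a spin- $j$ particle and two spin- $1/2$ particles. This is an illustrative sketch with hypothetical helper names. It uses the ordinary (non-conjugated) representation, which is unitarily equivalent to the conjugate one used in the covariance condition.

```python
import numpy as np

def spin_ops(j):
    """Spin-j matrices (Jx, Jy, Jz) in the basis |j,j>, ..., |j,-j>."""
    dim = int(round(2 * j + 1))
    m = j - np.arange(dim)
    jp = np.zeros((dim, dim))
    for k in range(1, dim):
        jp[k - 1, k] = np.sqrt(j * (j + 1) - m[k] * (m[k] + 1))
    return ((jp + jp.T) / 2).astype(complex), (jp - jp.T) / (2 * 1j), np.diag(m).astype(complex)

def total_spin_multiplicities(j):
    """Map each total angular momentum l to the dimension of its eigenspace."""
    big, half = spin_ops(j), spin_ops(0.5)
    id_j, id_2 = np.eye(big[0].shape[0]), np.eye(2)
    j_squared = 0
    for a, s in zip(big, half):
        total = (np.kron(np.kron(a, id_2), id_2)
                 + np.kron(np.kron(id_j, s), id_2)
                 + np.kron(np.kron(id_j, id_2), s))
        j_squared = j_squared + total @ total
    lam = np.linalg.eigvalsh(j_squared)            # eigenvalues l(l+1)
    two_l = np.round(-1 + np.sqrt(np.abs(1 + 4 * lam))).astype(int)
    values, counts = np.unique(two_l, return_counts=True)
    return {int(v) / 2: int(c) for v, c in zip(values, counts)}
```

For example, `total_spin_multiplicities(2)` groups the $5\times 4=20$ dimensions into $l=1$ (dimension 3), $l=2$ (dimension $2\times 5=10$, i.e. two copies), and $l=3$ (dimension 7).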

Positivity. The positivity of the operator $C_{\theta}$ is equivalent to the positivity of the coefficients $\alpha,\beta$ and of the matrix $M$ .

Trace preservation. The condition of trace preservation can be conveniently expressed in terms of the real coefficients $\alpha,\beta$ and of the complex matrix $M$ . Indeed, tracing over the output, we obtain

[TABLE]

for a suitable choice of basis $\{|+\rangle,|-\rangle\}$ . Using Eq. (24), the trace preservation condition $\operatorname{Tr}_{\rm out}[C_{\theta}]=I_{\rm in}$ becomes

[TABLE]

Figure of merit. In terms of the operator $C_{\theta}^{*}$ , the entanglement fidelity can be expressed as

[TABLE]

The expression can be further simplified by decomposing the state $|j,-m_{\theta}\rangle\otimes|\Phi^{*}_{\theta}\rangle$ on the subspaces of Eq. (22). After a bit of labor with the Clebsch-Gordan coefficients, we find the decomposition

[TABLE]

with

[TABLE]

Using the above decomposition, the entanglement fidelity can be expressed as

[TABLE]

to be maximized over all positive coefficients $\alpha$ and $\beta$ , and over all non-negative matrices $M$ satisfying the constraint (25).

Lemma 3

The matrix $M$ can be chosen to be rank-one without loss of generality, namely $M=|v\rangle\langle v|$ for some suitable vector $|v\rangle=v_{+}|+\rangle+v_{-}|-\rangle\in\mathbb{C}^{2}$ .

**Proof.** The entanglement fidelity depends on the matrix $M$ through the matrix element $\langle c|M|c\rangle$ . Now, one has the chain of inequalities

[TABLE]

the second inequality following from the fact that $M$ is positive.

The first inequality holds with the equality sign when the phase of the complex number $\langle+|M|-\rangle$ is equal to the phase of the complex number $\overline{c}_{+}c_{-}$ . The second inequality holds with the equality sign if $M$ is rank-one. In particular, the upper bound is attained by the rank-one matrix $M^{\prime}=|v\rangle\langle v|$ with $v_{+}=\sqrt{\langle+|M|+\rangle}$ and $v_{-}=\sqrt{\langle-|M|-\rangle}\overline{c}_{+}c_{-}/|c_{+}c_{-}|$ .

Since the normalization constraint (25) involves only the diagonal matrix elements of $M$ , the matrix $M$ can be replaced by the matrix $M^{\prime}$ without loss of generality. $\blacksquare$
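The replacement $M\to M^{\prime}$ used in the proof can be illustrated numerically. The sketch below is an illustration under stated assumptions, with hypothetical function names: it builds $M^{\prime}=|v\rangle\langle v|$ from the components $v_{\pm}$ given above, for a positive semidefinite matrix $M$ and a vector $|c\rangle$ with $c_{+}c_{-}\neq 0$, and compares the two matrix elements.

```python
import numpy as np

def rank_one_replacement(m, c):
    """Rank-one matrix M' = |v><v| with the same diagonal as M (requires c[0]*c[1] != 0)."""
    v_plus = np.sqrt(m[0, 0].real)
    phase = np.conj(c[0]) * c[1] / abs(c[0] * c[1])
    v_minus = np.sqrt(m[1, 1].real) * phase
    v = np.array([v_plus, v_minus])
    return np.outer(v, v.conj())

def overlap(m, c):
    """Matrix element <c|M|c>."""
    return (c.conj() @ m @ c).real

def upper_bound(m, c):
    """Right-hand side of the chain of inequalities in the proof."""
    return (abs(c[0]) * np.sqrt(m[0, 0].real) + abs(c[1]) * np.sqrt(m[1, 1].real)) ** 2
```

For a randomly drawn positive matrix $M$, `overlap(rank_one_replacement(m, c), c)` should coincide with `upper_bound(m, c)` and be no smaller than `overlap(m, c)`, while the diagonal entries, which are the only ones entering the constraint (25), are unchanged.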

The proof of the above lemma shows that the optimal entanglement fidelity has the form

[TABLE]

with $|v_{\pm}|=\sqrt{\langle\pm|M|\pm\rangle}$ . The maximum of the fidelity (31) under the constraints (25) can be determined with the method of Lagrange multipliers. In the following we present the result of the maximization, leaving the details to Appendix A.

III.3 Optimal quantum strategy for $j>1$

For $j>1$ , it turns out that Problems 1 and 2 have the same optimal solution:

Theorem 2

*When $j>1$ , the optimal probe state for learning the gate $V_{\theta,g}=U_{g}V_{\theta}U_{g}^{\dagger}$ from the gate $U_{g}^{(j)}$ is $|j,j\rangle$ for every value of $\theta$ . For both Problems 1 and 2, the optimal average fidelity over all pure input states is*

[TABLE]

and has the asymptotic expression

[TABLE]

The optimality of the probe state $|j,j\rangle$ is in agreement with a result by Holevo on the optimal estimation of directions, cf. Section 4.10 of Holevo (2011). In other words, the optimal probe state for learning how to rotate about an unknown direction coincides with the optimal probe state for producing a classical estimate of such direction, as long as $j$ is larger than 1. It is worth stressing, however, that the optimal quantum strategy for rotating about an unknown direction is not based on estimation: in Section IV we will show that no estimation-based strategy can achieve the optimal quantum fidelity (32).

The exact values of the average fidelity are plotted in Figure 4 for various values of $j$ from $j=2$ to $j=100$ . Note that the fidelity decreases monotonically with the rotation angle $\theta$ . Intuitively, rotating by smaller angles is easier, because the uncertainty about the rotation axis has less influence on the performance. The easiest rotation is the identity $(\theta=0)$ , which is independent of the rotation axis and therefore can be implemented without error. The hardest rotation is the spin flip, corresponding to $\theta=\pi$ . In this case, the average fidelity has the simple form

[TABLE]

Note that, since the optimal probe state is $|j,j\rangle$ , the optimal channel $\mathcal{C}_{\theta}$ for Problem 1 coincides with the optimal channel $\mathcal{C}_{\theta}$ for Problem 2. In Appendix B, we show that an optimal channel $\mathcal{C}_{\theta}$ can be attained by setting up an isotropic Heisenberg interaction between the memory spin and the target spin. Explicitly, we show that the maximum fidelity (32) is attained by the channel

[TABLE]

where $\operatorname{Tr}_{{\rm P}_{j}}$ denotes the partial trace over the probe, and $U_{\theta}$ is the unitary operator

[TABLE]

in which $\boldsymbol{\sigma}=(\sigma_{x},\sigma_{y},\sigma_{z})$ is the vector of the three Pauli matrices, $\mathbf{J}\cdot\boldsymbol{\sigma}=\sum_{i=x,y,z}J_{i}\otimes\sigma_{i}$ is the Heisenberg coupling, and $f(\theta)$ is the function

[TABLE]

where $s(\theta)=0$ for $\theta\in[0,\pi]$ , and $s(\theta)=\pi$ for $\theta\in(\pi,2\pi)$ . Note that $f(\theta)$ is approximately equal to $\theta$ in the large $j$ limit.

Physically, the unitary evolution (36) can be realized by setting up an isotropic Heisenberg interaction, described by the Hamiltonian $H=\alpha\,\mathbf{J}\cdot\boldsymbol{\sigma}$ , for some suitable coupling constant $\alpha$ , and by letting the two spins evolve for time

[TABLE]

depending on the angle $\theta$ of the target rotation. Remarkably, the same probe states and the same interaction can be used to control the full time evolution of the target system: one has only to adjust the interaction time [determined by the angle $f(\theta)$ ] based on the evolution time in the target dynamics [determined by the angle $\theta$ ]. For example, we can set $\theta=\omega t$ and simulate the precession of a spin- $1/2$ particle around the direction indicated by the memory state.
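To make the Heisenberg strategy concrete, the following sketch simulates a channel of the form $\operatorname{Tr}_{{\rm P}_{j}}\big[U(|j,j\rangle\langle j,j|\otimes\rho)U^{\dagger}\big]$ with $U=e^{-i\varphi\,\mathbf{J}\cdot\boldsymbol{\sigma}}$ and scans the interaction phase $\varphi$ numerically. It is a hedged illustration rather than a reproduction of Equations (35)-(38): the free parameter `phi`, the grid search, the convention $V_{\theta}=e^{-i\theta\sigma_{z}/2}$ , and the conversion $F=(2F^{\rm(e)}+1)/3$ are assumptions made here, and all function names are hypothetical.

```python
import numpy as np

def spin_ops(j):
    """Spin-j matrices (Jx, Jy, Jz) in the basis |j,j>, ..., |j,-j>."""
    dim = int(round(2 * j + 1))
    m = j - np.arange(dim)
    jp = np.zeros((dim, dim))
    for k in range(1, dim):
        jp[k - 1, k] = np.sqrt(j * (j + 1) - m[k] * (m[k] + 1))
    return ((jp + jp.T) / 2).astype(complex), (jp - jp.T) / (2 * 1j), np.diag(m).astype(complex)

def heisenberg_coupling(j):
    """J.sigma on probe (x) qubit; its eigenvalues are j and -(j+1)."""
    paulis = [2 * s for s in spin_ops(0.5)]
    return sum(np.kron(a, s) for a, s in zip(spin_ops(j), paulis))

def average_fidelity(j, theta, phi):
    """Average fidelity with V_theta after evolving |j,j> (x) qubit with exp(-i phi J.sigma)."""
    h = heisenberg_coupling(j)
    w, v = np.linalg.eigh(h)
    u = (v * np.exp(-1j * phi * w)) @ v.conj().T
    dim = h.shape[0] // 2
    u4 = u.reshape(dim, 2, dim, 2)               # indices: probe out, qubit out, probe in, qubit in
    target = np.diag([np.exp(-1j * theta / 2), np.exp(1j * theta / 2)])
    # Kraus operators K_m = <j,m| U |j,j>; index 0 labels the probe state |j,j>
    f_e = sum(abs(np.trace(target.conj().T @ u4[m, :, 0, :])) ** 2 for m in range(dim)) / 4
    return (2 * f_e + 1) / 3

def best_fidelity(j, theta, grid=401):
    """Grid search over one period of the interaction phase."""
    phis = np.linspace(0.0, 2 * np.pi / (2 * j + 1), grid)
    return max(average_fidelity(j, theta, p) for p in phis)
```

Because $\mathbf{J}\cdot\boldsymbol{\sigma}$ has only two distinct eigenvalues, the evolution depends on $\varphi$ only through the relative phase $\varphi(2j+1)$ , so a single period suffices for the scan. This mirrors the statement that only the interaction time needs to be adjusted as $\theta$ varies.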

An important feature of the optimal strategy is that the optimal probe state is independent of the rotation angle $\theta$ . Since the operation of storing the state $U_{g}|j,j\rangle$ in the quantum memory is also independent of $\theta$ , it follows that all the operations in the training phase can be accomplished without knowing the rotation angle. This offers the possibility to decide the value of $\theta$ at later times. In fact, the machine can optimally approximate the full continuous-time dynamics of the target particle, because the optimal operations for different $\theta$ correspond to unitary evolutions with the same Hamiltonian, just with different evolution times.

The optimality of the Heisenberg interaction is not limited to the average fidelity. In terms of scaling with $j$ , the unitary gate (36) is optimal also for the worst-case fidelity, defined as

[TABLE]

where $F(j,\theta,g,\psi)$ is the fidelity for the simulation of $V_{\theta,g}$ on the specific input state $|\psi\rangle$ . Indeed, in Appendix C, we show that the worst-case fidelity of the unitary gate (36) is

[TABLE]

Hence, the worst-case infidelity $1-F_{\rm w,He}(j,\theta)$ has the scaling $1/j$ . This is the best scaling one can hope for, because the average infidelity cannot vanish faster than $1/j$ [as shown by Eq. (33)], and the average infidelity is a lower bound to the worst-case infidelity.

The optimality of the Heisenberg interaction answers in the affirmative a question raised by Marvian and Mann Marvian and Mann (2008), who assumed the Heisenberg interaction and showed that it can be used to approximate arbitrary rotations in the limit of large $j$ . In the conclusion of their work, Marvian and Mann asked whether the Heisenberg interaction achieves the best scaling of the error with the spin size. Our results show that the Heisenberg interaction maximizes the average fidelity and has the optimal error scaling $O(1/j)$ in the worst-case scenario.

III.4 Optimal quantum strategy for $j=1/2$

For $j=1/2$ , the optimal probe state for Problem 2 is still the coherent state $|j,j\rangle$ for every rotation angle $\theta$ , and the optimal solutions of Problems 1 and 2 still coincide.

Curiously, the optimal learning strategy exhibits a transition when the rotation angle approaches $\pi$ . For $|\theta-\pi|>\delta_{1/2}=\arccos[(4+\sqrt{7})/9]$ , the optimal fidelity is still given by Equation (32), and the optimal channel $\mathcal{C}_{\theta}$ is still given by Equation (35).

For $|\theta-\pi|\leq\delta_{1/2}$ , instead, the optimal fidelity becomes

[TABLE]

and is achieved by the following strategy:

1. Perform a joint measurement on the memory and the target. The measurement has two outcomes and is described by the quantum operations $\mathcal{M}_{\rm yes}(\cdot)=M_{\rm yes}\cdot M_{\rm yes}^{\dagger}$ and $\mathcal{M}_{\rm no}(\cdot)=M_{\rm no}\cdot M_{\rm no}^{\dagger}$ , with

[TABLE]

$P_{l}$ being the projector on the subspace with total angular momentum $l$ , with $l\in\{0,1\}$ .

2. If the measurement yields outcome “ $\rm yes$ ”, then apply the unitary gate (36), corresponding to the Heisenberg interaction, and discard the memory. If the measurement yields outcome “ $\rm no$ ”, then perform the optimal 2-to-1 universal NOT channel Bužek et al. (1999), namely the channel $\mathcal{C}_{\rm UNOT}$ defined by

[TABLE]

The probability of the outcome “ $\rm no$ ”, corresponding to the universal NOT, depends on the parameter $\alpha$ in Eq. (78). At the critical distance $|\theta-\pi|=\arccos[(4+\sqrt{7})/9]$ , one has $\alpha=0$ , and the optimal strategy is realized through the Heisenberg interaction. As the rotation angle gets closer to $\pi$ , the coefficient $\alpha$ increases, reaching its maximum value $\alpha=2/3$ for $\theta=\pi$ . At this point, the weight of the universal NOT is maximum. Notably, the value $\alpha=1$ is never reached, meaning that the optimal joint measurement on the input qubits is never projective.

III.5 Optimal quantum strategies for $j=1$

The $j=1$ case is the only case where Problems 1 and 2 yield different solutions. The difference appears when the rotation angle is within a critical distance $\delta_{1}=0.23\pi$ from $\pi$ .

For $|\pi-\theta|>\delta_{1}$ , the optimal probe state for Problem 2 is $|1,1\rangle$ , and therefore the optimal solutions for Problems 1 and 2 still coincide. The optimal average fidelity is still given by Equation (32) and the optimal channel $\mathcal{C}_{\theta}$ is still given by Equation (35).

For $|\theta-\pi|\leq\delta_{1}$ , the optimal average fidelity for Problem 1 is

[TABLE]

corresponding to Equation (32) with $j=1$ . The optimal channel $\mathcal{C}_{\theta}$ is still given by Equation (35).

Instead, the optimal fidelity for Problem 2 is

[TABLE]

and is attained with the probe state $|1,0\rangle$ , the $p$ -orbital aligned in the direction of the $z$ -axis. In Subsection IV.5, we will show that the optimal quantum fidelity (45) is achievable with a purely classical memory. Specifically, we will see that the optimal strategy is to perform a projective measurement on the probe, with the three measurement outcomes corresponding to the three Cartesian axes. The measurement outcome is then stored into a classical memory of 2 bits. In the execution phase, the machine rotates the target qubit by an angle $\pi$ about the axis corresponding to the measurement outcome.

III.6 Optimal fidelities for $j=1/2$ and $j=1$

The dependence of the fidelity on the rotation angle is plotted in Figure 5 for $j=1$ and $j=1/2$ . The value of the optimal quantum fidelity is contrasted with the maximum fidelity achievable with a purely classical memory, which will be derived in Section IV.

IV The quantum benchmark

In this section we derive the maximum fidelity achievable by learning machines with a purely classical memory of arbitrarily large size. Such fidelity provides a benchmark that can be used to certify the experimental demonstration of quantum-enhanced learning. We consider the two learning tasks corresponding to Problem 1 (learning from a spin coherent state) and Problem 2 (learning from a rotation gate). The quantum benchmarks for these two problems coincide for all values of $j$ except $j=1$ . For $j=1$ , the two benchmarks become different when the desired rotation angle approaches $\pi$ .

IV.1 Measure-and-operate (MO) channels

Here we consider learning strategies where the memory $\rm M$ in Figures 1 and 3 is purely classical. In this case, the transfer of information from the probe to the memory is described by a quantum-to-classical channel $\mathcal{E}_{\theta}$ , of the form

[TABLE]

where $\{|y\rangle\}_{y\in\mathsf{Y}}$ is a set of orthogonal states of the memory, and $(P_{\theta,y})_{y\in\mathsf{Y}}$ is a Positive Operator-Valued Measure (POVM), describing a quantum measurement on system ${\rm P}_{j}$ in the case of Figure 1, or a quantum measurement on system ${\rm P_{j}}\otimes\rm A$ in the case of Figure 3.

The execution phase consists in reading out the index $y$ from the classical memory and performing a conditional operation $\mathcal{O}_{\theta,y}$ on the system. Hence, the channel $\mathcal{R}_{\theta}$ has the form

[TABLE]

The operations performed by machines with purely classical memory will be called measure-and-operate (MO) strategies. Combined together, the “measure” channel $\mathcal{E}_{\theta}$ and the “operate” channel $\mathcal{R}_{\theta}$ give a single quantum channel $\mathcal{C}_{\theta,\rm MO}$ , of the form

[TABLE]

where $\operatorname{Tr}_{\overline{\rm S}}$ denotes the partial trace over all systems except system $\rm S$ .

In the following, we will solve the optimisations in Problems 1 and 2 under the constraint that the channel $\mathcal{C}_{\theta}$ is of the $\rm MO$ form (48). By definition, the optimal MO fidelities are no larger than the optimal quantum fidelities derived in the previous Section.

IV.2 Structure of the optimal MO strategy for Problem 2

The structure of the optimal MO strategy for Problem 2 is summarized by the following Theorem, proven in Appendix D.

Theorem 3

The optimal MO strategy for learning the gate $V_{\theta,g}=U_{g}V_{\theta}U_{g}^{\dagger}$ from the gate $U_{g}^{(j)}$ has the following features:

1. no auxiliary system is needed

2. the optimal probe state is an eigenstate of $J_{z}$ , denoted as $|j,m_{\theta}\rangle$

3. the outcome of the optimal POVM is an element of the rotation group $\mathsf{SO}(3)$ , denoted as $\hat{g}$

4. the optimal POVM $(P_{\theta,g})_{g\in\mathsf{SO}(3)}$ is rotationally covariant Holevo (2011), and has the form

[TABLE]

where $|\xi_{\theta}\rangle$ is a unit vector

5. the optimal conditional operation has the form $\mathcal{O}_{\theta,{\hat{g}}}=\mathcal{U}_{\hat{g}}^{(j)}\circ\mathcal{O}_{\theta}\circ\mathcal{U}_{\hat{g}}^{(j){\dagger}}$ , where $\mathcal{O}_{\theta}$ is a fixed channel acting on the target qubit.

In the following we will maximise the gate fidelity over all MO strategies with the features described by Theorem 3. For convenience we will express the gate fidelity in terms of the entanglement fidelity [cf. Equation (15)].

IV.3 Choi operator formulation

For an optimal strategy as in Theorem 3, the entanglement fidelity takes the form

[TABLE]

where $O_{\theta,{\hat{g}}}$ is the Choi operator of the channel $\mathcal{O}_{\theta,{\hat{g}}}$ , $O_{\theta}$ is the Choi operator of the channel $\mathcal{O}_{\theta}$ , and $|\Phi^{+}_{\theta,g}\rangle:=(V_{\theta,g}\otimes I_{\rm R})\,|\Phi^{+}\rangle$ .

Our goal is to maximise the entanglement fidelity (IV.3) over all values of $m_{\theta}$ , over all unit vectors $|\xi_{\theta}\rangle$ , and over all Choi operators $O_{\theta}$ . To this purpose, the key observation is that the Choi operator $O_{\theta}$ can be chosen to be real in a suitable basis. Specifically, we have the following

Proposition 1

The Choi operator $O_{\theta}$ maximizing the fidelity (IV.3) can be chosen to be real in the Bell basis

[TABLE]

**Proof.** Every unitary $V_{\theta,g}=U_{g}V_{\theta}U_{g}^{\dagger}$ is a real linear combination of the matrices $I,i\sigma_{x},i\sigma_{y},$ and $i\sigma_{z}$ . Hence, every vector $|\Phi^{+}_{\theta,g}\rangle=(V_{\theta,g}\otimes I)|\Phi^{+}\rangle$ is a real linear combination of the vectors $|\Phi^{+}\rangle$ , $i|\Psi^{+}\rangle=(i\sigma_{x}\otimes I)|\Phi^{+}\rangle$ , $|\Psi^{-}\rangle=(i\sigma_{y}\otimes I)|\Phi^{+}\rangle$ , and $i|\Phi^{-}\rangle=(i\sigma_{z}\otimes I)|\Phi^{+}\rangle$ . Since the fidelity depends on the Choi operator $O_{\theta}$ only through the matrix elements $\langle\Phi^{+}_{\theta,g}|O_{\theta}|\Phi^{+}_{\theta,g}\rangle$ , the optimal Choi operator can be chosen to be real in the same basis as the vectors $|\Phi^{+}_{\theta,g}\rangle$ . $\blacksquare$

Thanks to Proposition 1, the maximization of the fidelity can be restricted to the set of Choi operators that are real in the Bell basis. This set of Choi operators can be equivalently characterized as the set of Choi operators of unital channels, i.e. quantum channels mapping the identity operator to itself. Indeed, we have the following

Proposition 2

A qubit channel is unital if and only if its Choi operator is real in the Bell basis

[TABLE]

**Proof.** If a qubit channel is unital, then it is a convex combination of unitary channels Landau and Streater (1993). For every unitary channel, the corresponding Choi operator is real in the Bell basis. Indeed, every unitary channel has a Kraus decomposition with a single unitary operator of the form $U=\cos\frac{\tau}{2}\,I-i\sin\frac{\tau}{2}\,\mathbf{n}\cdot\boldsymbol{\sigma}$ , with $\tau\in[0,2\pi)$ and $\mathbf{n}\in\mathbb{R}^{3}$ a unit vector. Hence, the Choi operator $2\,(U\otimes I)|\Phi^{+}\rangle\langle\Phi^{+}|(U\otimes I)^{\dagger}$ is real in the Bell basis. Since the set of real Choi operators is convex, every unital channel is contained in it.

Conversely, suppose that a channel $\mathcal{C}$ has a Choi operator $C$ that is real in the Bell basis, i.e. $C=\sum_{k,l}\,C_{kl}\,|\Phi_{k}\rangle\langle\Phi_{l}|$ , for some real symmetric matrix $(C_{kl})$ . Then, one has

[TABLE]

the last equality following from the relation $2=\operatorname{Tr}[I]=\operatorname{Tr}[\mathcal{C}(I)]=\operatorname{Tr}[C]=\sum_{i=0}^{3}C_{ii}$ . Hence, the channel $\mathcal{C}$ is unital. $\blacksquare$
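The first half of the proof can be checked numerically: the Choi operator of a unitary qubit channel, and of any mixture of such channels, has real entries in the Bell basis. In the sketch below the basis is taken to be $\{|\Phi^{+}\rangle,\,i|\Psi^{+}\rangle,\,|\Psi^{-}\rangle,\,i|\Phi^{-}\rangle\}$ , as inferred from the proof of Proposition 1. The exact phase convention of the basis displayed in the Proposition is an assumption, and the function names are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

PHI_PLUS = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
# Columns: |Phi+>, (i sx (x) I)|Phi+>, (i sy (x) I)|Phi+>, (i sz (x) I)|Phi+>
BELL = np.column_stack([PHI_PLUS] + [np.kron(1j * s, I2) @ PHI_PLUS for s in (SX, SY, SZ)])

def su2(tau, n):
    """U = cos(tau/2) I - i sin(tau/2) n.sigma for a (normalised) axis n."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    return np.cos(tau / 2) * I2 - 1j * np.sin(tau / 2) * (n[0] * SX + n[1] * SY + n[2] * SZ)

def choi_in_bell_basis(u):
    """Choi operator 2 (U (x) I)|Phi+><Phi+|(U (x) I)^dagger, expressed in the Bell basis."""
    vec = np.kron(u, I2) @ PHI_PLUS
    choi = 2 * np.outer(vec, vec.conj())
    return BELL.conj().T @ choi @ BELL
```

A convex mixture such as `0.3 * choi_in_bell_basis(u1) + 0.7 * choi_in_bell_basis(u2)` inherits the vanishing imaginary part, in line with the convexity argument.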

Since the fidelity is a linear function, its maximization can be restricted to the extreme points of the set of unital channels. For qubits, such extreme points are unitary channels Landau and Streater (1993). Hence, we obtained the following

Theorem 4

The quantum channel $\mathcal{O}_{\theta}$ maximizing the fidelity (IV.3) can be chosen to be unitary without loss of generality.

Thanks to Theorem 4, the optimal entanglement fidelity (IV.3) can be expressed as

[TABLE]

where $W_{\theta}$ is a suitable unitary and $|\Phi^{+}_{W_{\theta}}\rangle:=(W_{\theta}\otimes I_{\rm R})\,|\Phi^{+}\rangle$ .

The optimization can be further simplified using the following observation:

Proposition 3

The unitary gate $W_{\theta}$ maximizing the fidelity (IV.3) can be chosen without loss of generality to be a rotation about the $z$ axis.

**Proof.** Every unitary $W_{\theta}$ can be written as $W_{\theta}=U_{h}V_{\theta^{{}^{\prime}}}U_{h}^{\dagger}$ , where $V_{\theta^{{}^{\prime}}}$ is a rotation about the $z$ axis by an angle $\theta^{{}^{\prime}}$ , and $h$ is the rotation that transforms the $z$ axis into the rotation axis of $W_{\theta}$ . Hence, the corresponding state can be written as $|\Phi^{+}_{W_{\theta}}\rangle=(U_{h}\otimes\overline{U}_{h})\,|\Phi^{+}_{V_{\theta^{{}^{\prime}}}}\rangle$ .

Using this fact, the optimal MO fidelity can be rewritten as

[TABLE]

(here $U^{T}_{h}$ denotes the transpose of the matrix $U_{h}$ .) The last equation shows that the maximisation of the fidelity can be reduced to rotations about the $z$ axis. $\blacksquare$

At this point, it remains to maximise the fidelity (IV.3) over $m_{\theta}$ , $\xi_{\theta}$ , and $V_{\theta^{{}^{\prime}}}$ . The result of the optimization is summarised in the following, while the details are provided in Appendix E.

IV.4 Optimal MO strategy for $j\not=1$

For $j\neq 1$ , it turns out that the quantum benchmarks for Problems 1 and 2 coincide.

Theorem 5

*For $j\not=1$ , the optimal probe state for learning the gate $V_{\theta,g}=U_{g}V_{\theta}U_{g}^{\dagger}$ from the gate $U_{g}^{(j)}$ by MO operations is $|j,j\rangle$ for every value of $\theta$ . For both Problems 1 and 2, the optimal MO fidelity is*

[TABLE]

and has the asymptotic expression

[TABLE]

The optimal MO strategy consists in

1. measuring the probe with the POVM $P_{\hat{g}}=(2j+1)~{}U_{\hat{g}}^{(j)}|j,j\rangle\langle j,j|U_{\hat{g}}^{(j){\dagger}}$ , and

2. rotating the target qubit about the rotated $z$ -axis $\hat{g}\,\mathbf{e}_{z}$ by the angle

[TABLE]

where $s(\theta)=0$ for $\theta\in[0,\pi]$ , and $s(\theta)=\pi$ for $\theta\in(\pi,2\pi)$ .

Note that the probe state and the measurement are both independent of the rotation angle $\theta$ . This means that the machine can be trained optimally even before the value of the rotation angle has been decided. The operations in the training phase coincide with the optimal estimation strategy for directions, derived in the classic work by Holevo Holevo (2011).

The optimal MO strategy can be implemented by a learning machine with a purely classical memory. The size of the classical memory can be chosen without loss of generality to be $\lceil 2\log(2j+1)\rceil$ bits. This is because the fidelity is a linear function of the POVM, and therefore its maximum is attained by an extreme point of the convex set of all POVMs with outcomes in $\mathsf{SO}(3)$ . The extreme points of such set consist of POVMs that assign non-zero probability to at most $(2j+1)^{2}$ rotations Chiribella et al. (2007). Hence, the optimal POVM in Theorem 5 can be replaced by another, equally optimal POVM with at most $(2j+1)^{2}$ outcomes, which can be stored into a classical memory of $\lceil 2\log(2j+1)\rceil$ bits.
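The memory sizes involved can be tabulated with a few lines of code. This is a small illustrative sketch: it assumes that the logarithm in $\lceil 2\log(2j+1)\rceil$ is taken in base 2, and it compares the result with the number of qubits sufficient to hold an arbitrary state of a spin- $j$ system, $\lceil\log_{2}(2j+1)\rceil$ . Both function names are hypothetical.

```python
import math

def classical_memory_bits(j):
    """Bits sufficient to label at most (2j+1)^2 POVM outcomes (base-2 log assumed)."""
    return math.ceil(2 * math.log2(2 * j + 1))

def quantum_memory_qubits(j):
    """Qubits sufficient to store a state of a (2j+1)-dimensional system."""
    return math.ceil(math.log2(2 * j + 1))
```

For instance, $j=1/2$ gives 2 classical bits and 1 qubit, while $j=7/2$ gives 6 classical bits and 3 qubits. Both sizes grow logarithmically in $j$ .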

A plot of the MO fidelity and of the optimal quantum fidelity is provided in Figure 6. Note that the error (one minus fidelity) goes to zero in both cases, but the rate for quantum strategies is twice as fast, as one can see by comparing Equations (33) and (57).

IV.5 Optimal MO strategies for $j=1$

The $j=1$ case exhibits an anomalous behaviour when the rotation angle approaches $\pi$ . For $|\pi-\theta|>0.303\pi$ , Problems 1 and 2 have the same optimal MO fidelity, and the same optimal MO strategy, described in Theorem 5. For $|\pi-\theta|\leqslant 0.303\pi$ , the optimal MO fidelities become different. For $|\theta-\pi|\leq\delta_{1}$ , the optimal average fidelity for Problem 1 is

[TABLE]

with

[TABLE]

corresponding to Equation (56) with $j=1$ . The MO strategy is still the one described in Theorem 5.

For Problem 2, the optimal probe state transitions from $|1,1\rangle$ to $|1,0\rangle$ , and the optimal fidelity becomes

[TABLE]

The optimal MO strategy consists of

1. Measuring the memory with the POVM operators $P_{\hat{g}}=(2j+1)\,U_{\hat{g}}^{(j)}|1,0\rangle\langle 1,0|U_{\hat{g}}^{(j){\dagger}}$ .

2. Rotating the target qubit about the axis $\mathbf{n}=\hat{g}\,\mathbf{e}_{z}$ by an angle $\pi$ , independently of $\theta$ .

Physically, the optimal POVM can be interpreted as a randomisation of the projective measurement that projects the spin- $1$ particle along the three Cartesian axes $x,y$ , and $z$ Chiribella et al. (2007). This projective measurement corresponds to the orthonormal basis $\{|x\rangle,|y\rangle,|z\rangle\}$ for $\mathbb{C}^{3}$ defined by $|z\rangle:=|1,0\rangle$ , $|x\rangle:=(|1,1\rangle+|1,-1\rangle)/\sqrt{2}$ , and $|y\rangle:=(|1,1\rangle-|1,-1\rangle)/\sqrt{2}$ . In the language of atomic physics, $|x\rangle$ , $|y\rangle$ , and $|z\rangle$ are the $p$ -orbitals aligned in the directions $x,y$ , and $z$ , respectively. Since the fidelity is a linear function of the POVM, the optimal POVM $P_{\hat{g}}=(2j+1)\,U_{\hat{g}}^{(j)}|1,0\rangle\langle 1,0|U_{\hat{g}}^{(j){\dagger}}$ can be replaced by an equally optimal POVM based on the projective measurement of $\{|x\rangle,|y\rangle,|z\rangle\}$ , followed by a rotation by $\pi$ about the Cartesian axis identified by the measurement outcome. In this discretised version of the MO strategy, the learning machine only needs a classical memory of $2$ bits.
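The discretised measurement can be checked numerically. The sketch below, written in Python with NumPy, builds the three vectors $|x\rangle$, $|y\rangle$, $|z\rangle$ exactly as defined in the text and verifies that they form an orthonormal basis of $\mathbb{C}^{3}$ whose projectors sum to the identity, so that they define a valid three-outcome projective measurement. The basis ordering $|1,1\rangle, |1,0\rangle, |1,-1\rangle$ and the variable names are choices of this sketch.

```python
import numpy as np

# Basis ordering for the spin-1 particle (a convention of this sketch):
# index 0 -> |1,1>, index 1 -> |1,0>, index 2 -> |1,-1>
ket_p1 = np.array([1, 0, 0], dtype=complex)
ket_0 = np.array([0, 1, 0], dtype=complex)
ket_m1 = np.array([0, 0, 1], dtype=complex)

# The three vectors as defined in the text
ket_x = (ket_p1 + ket_m1) / np.sqrt(2)
ket_y = (ket_p1 - ket_m1) / np.sqrt(2)
ket_z = ket_0

basis = np.stack([ket_x, ket_y, ket_z])      # rows are the basis vectors
gram = basis.conj() @ basis.T                # Gram matrix of overlaps <a|b>
projector_sum = sum(np.outer(v, v.conj()) for v in basis)

# Number of bits needed to store one of the three outcomes
outcome_bits = int(np.ceil(np.log2(len(basis))))

if __name__ == "__main__":
    print("orthonormal:", np.allclose(gram, np.eye(3)))
    print("complete:   ", np.allclose(projector_sum, np.eye(3)))
    print("bits needed:", outcome_bits)
```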

V Persistence of the quantum advantage

We have seen that a machine equipped with a quantum memory can outperform every classical machine at the task of learning rotations about an unknown axis. Still, our analysis was restricted to the scenario where the quantum process accesses its memory only once, with the goal of reproducing a single use of the target gate. In the following we will study how the performance depends on the number of required executions of the target gate.

Let us focus on the regular case $j>1$ , where the optimal strategies for Problems 1 and 2 coincide, and the channel is realised by setting up a Heisenberg interaction between the memory and the target qubit. An important question is how many times the memory can be accessed before the accuracy drops below a certain threshold. In the context of quantum reference frames, the maximum number of accesses such that the fidelity is above threshold was called the longevity in Ref. Bartlett et al. (2006). Another important question is how many times the memory can be accessed before the quantum advantage is lost. The maximum number of accesses for which the fidelity is above the quantum benchmark (56) will be called persistence of the quantum advantage in the following.

Suppose that the joint evolution of memory and target is described by the same unitary gate at every step. Assuming the gate to be of the form of Eq. (36) for some fixed function $f(\theta)$ , we obtain the closed-form expression

[TABLE]

quantifying the average fidelity at the leading order in $j$ (see Appendix F for the derivation). From this expression one can see that the longevity grows as $j^{2}$ . However, the persistence of the quantum advantage is much shorter: comparing the fidelity (62) with the MO fidelity (57), we find that the quantum advantage disappears when the number of repetitions is larger than

[TABLE]

One could also consider more elaborate strategies where the interaction time between memory and target is optimised at every step. However, we find that these strategies increase neither the longevity nor the persistence of the quantum advantage in the large $j$ limit.

VI Robustness to thermal noise

In Problem 1, we made the simplifying assumption that the unknown direction $\mathbf{n}$ is imprinted into the pure spin-coherent state $|j,j\rangle_{\mathbf{n}}$ , regarded as the low-temperature approximation of the thermal state of the magnetic dipole Hamiltonian. An interesting question is how this approximation affects our discussion of the quantum advantage. In the following we will address this question in the large $j$ limit, showing that quantum memories are useful whenever the magnetic energy is sufficiently large compared to the thermal fluctuations.

The thermal states of the Hamiltonian $H=-\mu\,\mathbf{B}\cdot\mathbf{J}$ can be written as

[TABLE]

where $T$ is the temperature and $k_{\rm B}$ is the Boltzmann constant. The spin coherent state $|j,j\rangle_{\mathbf{n}}$ is retrieved in the low temperature ( $\gamma\to\infty$ ) limit, as one has $\lim_{\gamma\to\infty}\rho_{\gamma,\mathbf{n}}=|j,j\rangle_{\mathbf{n}}\langle j,j|_{\mathbf{n}}$ .

Now, suppose that the learning strategy designed for the spin coherent state $|j,j\rangle_{\mathbf{n}}$ is adopted for the mixed state $\rho_{\gamma,\mathbf{n}}$ . In Appendix G, we show that the average fidelity has the asymptotic expression

[TABLE]

The above fidelity can be compared with the benchmark in Equation (57), which quantifies the maximum fidelity achievable with classical memories. Note that Equation (57) provides the benchmark for both Problems 1 and 2, meaning that the benchmark applies to every pure probe state of the form $|\psi_{\theta,g}\rangle=U_{g}^{(j)}|\psi\rangle$ , and by convexity, to every mixed probe state of the form $\rho_{g}=U_{g}^{(j)}\rho U_{g}^{(j){\dagger}}$ . In particular, it applies to the thermal states $\rho_{\gamma,\mathbf{n}}$ , as the average fidelity over all directions $\mathbf{n}$ is equal to the average fidelity over all rotations $g$ . Comparing the fidelity (65) with the benchmark in Equation (57), we obtain that the quantum strategy outperforms all classical strategies whenever $\tanh\gamma$ is larger than $1/2$ , corresponding to the condition $\gamma>\frac{1}{2}\ln 3\approx 0.55$ . Hence, the quantum advantage persists whenever the magnetic energy $\mu|\mathbf{B}|$ is larger than $1.1$ times the thermal energy $k_{\rm B}T$ .
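The threshold quoted above follows from inverting $\tanh\gamma=1/2$. The following Python sketch reproduces the two numbers in the text. The conversion to an energy ratio assumes the relation $\gamma=\mu|\mathbf{B}|/(2k_{\rm B}T)$, which is inferred here from the quoted factor $1.1\approx 2\times 0.55$ and is not taken from an equation visible in this section. The function names are hypothetical.

```python
import math


def gamma_threshold() -> float:
    """Smallest gamma with tanh(gamma) = 1/2, i.e. artanh(1/2) = (1/2) ln 3."""
    return math.atanh(0.5)


def has_quantum_advantage(gamma: float) -> bool:
    """Large-j criterion stated in the text: tanh(gamma) > 1/2."""
    return math.tanh(gamma) > 0.5


def energy_ratio_threshold() -> float:
    """Threshold for mu|B| / (k_B T).

    ASSUMPTION of this sketch: gamma = mu|B| / (2 k_B T).
    """
    return 2.0 * gamma_threshold()


if __name__ == "__main__":
    print(f"gamma threshold   = {gamma_threshold():.4f}")
    print(f"(1/2) ln 3        = {0.5 * math.log(3):.4f}")
    print(f"mu|B| / (k_B T)   > {energy_ratio_threshold():.2f}")
```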

Note that the quantum benchmark in Equation (57) is the optimal fidelity achievable with arbitrary probe states. If one further enforces the condition that the probe state be thermal, then the value of the benchmark would be even lower, thereby extending the set of temperatures for which the quantum memory offers an advantage.

Note also that the above discussion applies to a variant of Problem 2 where the probe is subject to thermal noise before the action of the training gate $U_{g}^{(j)}$ , resulting in a mixed input state $\rho_{\gamma}:=\rho_{\gamma,\mathbf{e}_{z}}$ . Also in this setting, the quantum memory offers a provable advantage when the parameter $\gamma$ is larger than $\frac{1}{2}\ln 3$ .

VII Learning higher dimensional gates

Our result establishes the existence of a quantum advantage for learning single-qubit rotations about an unknown axis. This finding is conceptually important, because the advantage for single qubits implies an advantage of coherent learning for quantum systems of arbitrary dimension. Indeed, one can immediately prove the advantage by using the qubit benchmark for gates that act nontrivially only in a fixed two-dimensional subspace.

Our results also give a heuristic for the problem of learning rotation gates on higher dimensional spins. The idea is to encode the rotation axis in a spin coherent state and to let the memory and target spin interact as a closed system. Explicitly, we make two spin systems undergo the Heisenberg interaction $U_{\theta}^{(k)}=\exp\left[-i\theta\,\,2\mathbf{J}\cdot\mathbf{K}/(2j+1)\right]$ , where $\mathbf{K}=(K_{x},K_{y},K_{z})$ are the spin operators of the target spin. Using the unitary gate $U_{\theta}^{(k)}$ , in Appendix H we obtain the average fidelity

[TABLE]

in the large $j$ limit. Remarkably, the error grows quadratically—rather than linearly—with the size of the target spin: in order to ensure high fidelity, the size of the memory must be large compared to the square of the size of the target system. The same conclusion holds for the worst-case fidelity, which has the asymptotic expression

[TABLE]

with $c(k)=0$ for even $k$ and $c(k)=1/4$ for odd $k$ .

The quantum strategy exhibits an advantage over the MO strategy consisting in measuring the direction $\mathbf{n}$ from the spin coherent state pointing in direction $\mathbf{n}$ and performing a rotation based on the outcome. Again, we find that the error of the quantum strategy vanishes in the macroscopic limit of large memory systems, at a rate twice as fast as the error of the classical strategy (see Appendix H for more details). It is an open question whether the above quantum and MO strategies are optimal for arbitrary $k>1/2$ .
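The Heisenberg-interaction strategy described in this section can be explored numerically for small spins. The Python sketch below, using NumPy, builds the gate $U_{\theta}^{(k)}=\exp[-i\theta\,2\mathbf{J}\cdot\mathbf{K}/(2j+1)]$ exactly as written in the text, prepares the memory in $|j,j\rangle$ along the $z$ axis, and computes the entanglement fidelity with the target rotation $\exp(-i\theta K_{z})$. Several ingredients are assumptions of the sketch rather than statements from the paper: the function names, the restriction to the direction $\mathbf{e}_{z}$ (justified by the rotational invariance of $\mathbf{J}\cdot\mathbf{K}$), and the standard conversion $F=(dF^{(\rm e)}+1)/(d+1)$ from entanglement fidelity to average fidelity for a $d$-dimensional target. For $k=1/2$ the main text uses a function $f(\theta)$ from Eq. (37), which is not reproduced here; the sketch uses $\theta$ directly.

```python
import numpy as np


def spin_ops(s):
    """Spin-s operators (Sx, Sy, Sz) in the basis |s,s>, |s,s-1>, ..., |s,-s>."""
    dim = int(round(2 * s + 1))
    m = s - np.arange(dim)
    # Raising operator: <s,m+1| S+ |s,m> = sqrt(s(s+1) - m(m+1))
    s_plus = np.diag(np.sqrt(s * (s + 1) - m[1:] * (m[1:] + 1)), k=1).astype(complex)
    s_x = (s_plus + s_plus.conj().T) / 2
    s_y = (s_plus - s_plus.conj().T) / 2j
    s_z = np.diag(m).astype(complex)
    return s_x, s_y, s_z


def heisenberg_gate(spin_j, spin_k, theta):
    """exp[-i theta 2 J.K / (2j+1)] on memory (x) target, via eigendecomposition."""
    coupling = sum(np.kron(a, b) for a, b in zip(spin_ops(spin_j), spin_ops(spin_k)))
    h = 2 * coupling / (2 * spin_j + 1)
    evals, evecs = np.linalg.eigh(h)
    return (evecs * np.exp(-1j * theta * evals)) @ evecs.conj().T


def entanglement_fidelity(spin_j, spin_k, theta):
    """Entanglement fidelity between the induced channel and exp(-i theta K_z)."""
    d_mem = int(round(2 * spin_j + 1))
    d_tgt = int(round(2 * spin_k + 1))
    u = heisenberg_gate(spin_j, spin_k, theta).reshape(d_mem, d_tgt, d_mem, d_tgt)
    # Kraus operators A_m = <j,m| U |j,j>; the memory starts in |j,j> (index 0)
    kraus = u[:, :, 0, :]
    m_k = spin_k - np.arange(d_tgt)
    v_diag = np.exp(-1j * theta * m_k)           # diagonal of the target rotation
    overlaps = np.einsum("a,maa->m", v_diag.conj(), kraus)   # Tr(V^dagger A_m)
    return float(np.sum(np.abs(overlaps) ** 2)) / d_tgt ** 2


def average_fidelity(spin_j, spin_k, theta):
    """Average input-state fidelity from the entanglement fidelity."""
    d = int(round(2 * spin_k + 1))
    return (d * entanglement_fidelity(spin_j, spin_k, theta) + 1) / (d + 1)


if __name__ == "__main__":
    for spin_j in (2, 5, 10, 20):
        print(spin_j, average_fidelity(spin_j, 1, 1.0))
```

Running the loop for increasing memory spin at fixed target spin gives a quick way to observe how the fidelity approaches 1 as the memory grows.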

VIII Conclusions

We determined the ultimate accuracy for the task of learning a rotation of a desired angle $\theta$ about an unknown axis, imprinted in the state of a spin- $j$ particle. In this task, we found that quantum memories enhance the learning performance for every $j>1$ and for every rotation angle $\theta\not=0$ . Specifically, we found that a quantum machine with a memory of $\lceil\log(2j+1)\rceil$ qubits outperforms all learning machines with classical memory of arbitrarily large size.

We found that the advantage of the quantum memory persists even when the memory is accessed multiple times, as long as the total number of accesses is at most linear in the spin size. Quite interestingly, we observe a relation between the persistence and the size of the advantage: in the large $j$ limit, the quantum advantage is of size $O(1/j)$ and persists when the memory is accessed for $O(j)$ times. Our results indicate that, as the memory size grows, the quantum advantage is spread over a larger amount of time. This tradeoff achieves the classical limit for spins of infinite size, for which the advantage disappears and the memory can be accessed infinitely many times.

At the fundamental level, our results provide the first example of a quantum memory advantage in a deterministic learning task involving unitary gates as the target operations. Advantages of quantum memories have been known for a longer time for non-deterministic learning tasks, where the learning machine has a non-zero probability of aborting. For example, Refs. Nielsen and Chuang (1997); Vidal et al. (2002); Hillery et al. (2002); Vidal et al. (2002); Brazier et al. (2005); Ishizaka and Hiroshima (2008); Bartlett et al. (2009) provide examples of machines that learn an unknown unitary gate without errors, albeit with a non-unit probability of success. In all these examples, a quantum memory is necessary in order to achieve error-free learning. In practice, however, no real machine is error-free, and in order to experimentally demonstrate the advantage of the quantum memory one needs a benchmark that quantifies the best performance achievable with classical machines. No such benchmark has been derived for the non-deterministic learning tasks considered in Refs. Nielsen and Chuang (1997); Vidal et al. (2002); Hillery et al. (2002); Vidal et al. (2002); Brazier et al. (2005); Ishizaka and Hiroshima (2008); Bartlett et al. (2009), and a rigorous demonstration of the advantage of the quantum memory has not been possible so far. A promising direction of future research is to apply the techniques developed in this paper to the derivation of quantum benchmarks for non-deterministic learning of unitary gates.

Our work calls for the experimental demonstration of quantum-enhanced learning of rotations around an unknown direction. For small values of the spin, a possible testbed is provided by NMR systems, where spin-spin interactions are naturally available Vandersypen and Chuang (2005). Another possibility is to use quantum dots, where one can engineer a coupling between a single spin and an assembly of spins effectively behaving as a single spin $j$ particle Chesi and Coish (2015). This scenario, named the box model, can be achieved through a uniform coupling of a central spin to the neighbouring sites. No matter what platform is adopted, our results provide the rigorous benchmark that can be used to validate the successful demonstration of quantum-enhanced unitary gate learning in realistic scenarios where the implementation is subject to noise and experimental imperfections.

Acknowledgements. The authors thank E Bagan for discussions and feedback on an earlier version of the manuscript. This work is supported by the National Natural Science Foundation of China through grant 11675136, the Hong Kong Research Grant Council through Grant No. 17326616 and 17300317, the Croucher Foundation, the HKU Seed Funding for Basic Research, the Foundational Questions Institute through grant FQXi-RFP3-1325, and the Canadian Institute for Advanced Research (CIFAR).

Appendix A Derivation of the optimal quantum strategy

In order to find the maximum of the fidelity (31) under the constraints (25) we use the method of Lagrange multipliers, setting $\alpha=x^{2}$ and $\beta=y^{2}$ . The search of the stationary points of the fidelity yields the following four cases:

Case 1: $x=y=0$ . In this case, the fidelity is given by

[TABLE]

and is attained by the Choi operator

[TABLE]

with

[TABLE]

The maximum of the fidelity is attained by $m_{\theta}=j$ , independently of $\theta$ . Explicitly, the maximum fidelity is

[TABLE]

Note that the fidelity converges to 1 in the large $j$ limit, meaning that the learning becomes nearly perfect for large spins. Comparison with Cases 2, 3, and 4 in the following shows that the fidelity (71) is optimal for every angle $\theta$ whenever the spin is larger than 1.

Case 2: $x\not=0,y=0$ . In this case, the Lagrangian method yields the fidelity

[TABLE]

achieved by setting

[TABLE]

and $x$ according to Eq. (25). The fidelity does not tend to $1$ in the large $j$ limit, indicating that the Case 2 strategy is suboptimal for large $j$ . Still, it turns out that for $j=1/2$ this strategy is optimal for some values of the angle $\theta$ around $\theta=\pi$ . In this case, the entanglement fidelity becomes

[TABLE]

and the optimal Choi operator is

[TABLE]

with

[TABLE]

The transition from the Case 1 strategy to the Case 2 strategy occurs when the distance $|\pi-\theta|$ is below the critical value $\delta_{\rm c}=\arccos[(4+\sqrt{7})/9]\approx 0.236\pi$ .

Case 3: $x\not=0$ , $y\not=0$ . Note that a strategy with $y\not=0$ can only exist for $j>1/2$ , because for $j=1/2$ there is no subspace with spin $j-1$ , and therefore the coefficient $y$ is not present. The method of Lagrange multipliers implies that, among the strategies with $x\not=0$ and $y\not=0$ , the maximum fidelity is attained when $x$ and $y$ take their maximum values. The corresponding Choi operator $C_{\theta}^{*}$ is

[TABLE]

and its fidelity is

[TABLE]

The maximum, attained for $m_{\theta}=0$ , is

[TABLE]

The fidelity does not reach 1 in the large $j$ limit, indicating that the Case 3 strategy is suboptimal for large $j$ . Nevertheless, we find that for $j=1$ the Case 3 strategy is optimal for rotation angles around $\theta=\pi$ . For $j=1$ , the entanglement fidelity is

[TABLE]

A numerical comparison with the fidelity for Case 1 indicates that the above fidelity is optimal for $|\theta-\pi|\leq\delta_{\rm c}$ , with $\delta_{\rm c}=0.23\pi$ . For $|\pi-\theta|>\delta_{\rm c}$ , instead, the Case 1 strategy is optimal.

Case 4: $x=0,y\not=0$ . This case is similar to Case 3, and the fidelity has the expression

[TABLE]

By comparison with the other cases, we find that the Case 4 fidelity is never optimal.

Note that for Problem 1 with $j=1$ , only Case 1 and Case 2 need to be considered as the memory state is $|1,1\rangle$ . It is easy to check that $|v_{+}|$ in Eq. (73) does not satisfy constraint Eq. (25) for arbitrary $\theta$ , showing that Case 1 is always the optimal solution for Problem 1 when $j=1$ .

Appendix B Heisenberg interaction is the optimal learning strategy

In this section, we prove that the channel $\mathcal{C}_{\theta,\rm Hei}$ in Eq. (35) with unitary gate $U_{\theta}$ in Eq. (36) is the optimal learning channel. To this purpose, we calculate its entanglement fidelity $F^{(\rm e)}_{\rm Hei}(j,\theta)$ , and show that it is equal to the optimal entanglement fidelity given by Eq. (68).

First of all, we note that the unitary gate $U_{\theta}$ can be expanded as

[TABLE]

where $h(\theta)$ is an irrelevant global phase, which we will ignore from now on. Using this expression, we obtain the relations

[TABLE]

and we can get:

[TABLE]

The entanglement fidelity for this physical realization can be written as

[TABLE]

where $\mathcal{I}_{\rm R}$ is the identity map on the reference system ${\rm R}$ . Then by inserting Eq. (B) and

[TABLE]

into Eq. (87), we obtain

[TABLE]

where the equality is attained by choosing $f(\theta)$ as in Eq. (37). The resulting value coincides with the optimal entanglement fidelity in Eq. (68).

Appendix C Worst-case fidelity

Here we show that learning to perform the target gate $V_{\theta,g}$ through the Heisenberg interaction in Eqs. (35) and (36) has an error scaling as $1/j$ in terms of the worst-case fidelity (defined by Eq. (39)).

The worst-case fidelity is the minimum over all gates $g$ and over all input target states $\psi$ :

[TABLE]

where

[TABLE]

is the fidelity for the simulation of $V_{g}$ on the specific input state $|\psi\rangle$ , and

[TABLE]

is calculated according to the optimal physical realization.

Since the trace is invariant under cyclic permutations and $V_{\theta,g}=U_{g}V_{\theta}U_{g}^{\dagger}$ , we can rewrite Eq. (91) as:

[TABLE]
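The conjugation relation $V_{\theta,g}=U_{g}V_{\theta}U_{g}^{\dagger}$ used in this step can be made concrete with a short numerical check. The Python sketch below, an illustration rather than part of the derivation, builds single-qubit rotations in the form $\cos\frac{\theta}{2}\,I-i\sin\frac{\theta}{2}\,\mathbf{n}\cdot\boldsymbol{\sigma}$ and verifies that conjugating a $z$-rotation by $U_{g}$ yields the rotation by the same angle about the rotated axis $g\,\mathbf{e}_{z}$. The helper names and the particular choice of $U_{g}$ are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
PAULIS = (SX, SY, SZ)


def rotation_gate(theta, axis):
    """Qubit rotation cos(theta/2) I - i sin(theta/2) n.sigma about the unit axis n."""
    n = np.asarray(axis, dtype=float)
    n = n / np.linalg.norm(n)
    n_sigma = sum(c * s for c, s in zip(n, PAULIS))
    return np.cos(theta / 2) * I2 - 1j * np.sin(theta / 2) * n_sigma


def rotated_z_axis(u_g):
    """Direction g e_z, with components n_a = (1/2) Tr[sigma_a U_g sigma_z U_g^dagger]."""
    rotated = u_g @ SZ @ u_g.conj().T
    return np.array([0.5 * np.trace(s @ rotated).real for s in PAULIS])


# An arbitrary SU(2) element standing in for U_g (illustrative choice)
u_g = rotation_gate(0.7, [1.0, 2.0, -0.5])
theta = 1.3

v_theta = rotation_gate(theta, [0.0, 0.0, 1.0])     # rotation about e_z
v_theta_g = u_g @ v_theta @ u_g.conj().T            # conjugated gate
n = rotated_z_axis(u_g)

if __name__ == "__main__":
    print("rotated axis:", n)
    print("match:", np.allclose(v_theta_g, rotation_gate(theta, n)))
```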

By expanding $U_{g}^{\dagger}|\psi\rangle$ in basis { $|{\scriptstyle\frac{1}{2}},{\scriptstyle\frac{1}{2}}\rangle$ , $|{\scriptstyle\frac{1}{2}},-{\scriptstyle\frac{1}{2}}\rangle$ }: $U_{g}^{\dagger}|\psi\rangle=\cos\frac{\alpha}{2}|{\scriptstyle\frac{1}{2}},{\scriptstyle\frac{1}{2}}\rangle+e^{i\beta}\sin\frac{\alpha}{2}|{\scriptstyle\frac{1}{2}},-{\scriptstyle\frac{1}{2}}\rangle$ , we find that:

[TABLE]

By inserting Eq. (C) into Eq. (91), we can get

[TABLE]

showing that

[TABLE]

Appendix D Proof of Theorem 3

The proof of the first two items of Theorem 3 is identical to the proof of Lemma 2.

It remains to prove that there exists an optimal MO strategy consisting of a covariant POVM $(P_{\theta,\hat{g}})$ and of conditional operations $\mathcal{O}_{\theta,{\hat{g}}}=\mathcal{U}_{\hat{g}}\circ\mathcal{O}_{\theta}\circ\mathcal{U}_{\hat{g}}^{\dagger}$ .

The MO fidelity for Problem 2 can be expressed as

[TABLE]

For every $y\in\mathsf{Y}$ , we define the probability

[TABLE]

the POVM

[TABLE]

and the quantum channels

[TABLE]

Note that the operators $\left(P_{\theta,\hat{g}}^{(y)}\right)_{g\in\mathsf{G}}$ satisfy the normalization condition

[TABLE]

following from Schur’s lemma.

In terms of the above probabilities, POVMs, and channels, the expression (97) can be rewritten as

[TABLE]

Since the fidelity is a convex combination, we have the upper bound

[TABLE]

It is immediate to check that the bound is attained by the MO strategy consisting of the POVM $\left(P_{\theta,g}^{(y_{*})}\right)_{g\in\mathsf{G}}$ and of the conditional operations $\mathcal{O}_{\theta,y_{*},g}$ , where $y_{*}$ is the outcome that maximizes the expression in the right-hand-side of Equation (103).

Appendix E Optimization of the MO strategy

Our goal is to maximize the fidelity

[TABLE]

over all values of $m$ , all unit vectors $\xi_{\theta}$ , and all unitary gates $V_{\theta^{{}^{\prime}}}$ . Using the relation $\overline{U}_{g}=\sigma_{y}U_{g}\sigma_{y}$ , we can rewrite the fidelity as

[TABLE]

with $|\Phi_{V_{\theta^{{}^{\prime}}}}^{*}\rangle=(I\otimes\sigma_{y})\,|\Phi_{V_{\theta^{{}^{\prime}}}}^{+}\rangle$ and $|\Phi^{*}_{V_{\theta}}\rangle=(I\otimes\sigma_{y})\,|\Phi^{+}_{V_{\theta}}\rangle$ .
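The relation $\overline{U}_{g}=\sigma_{y}U_{g}\sigma_{y}$ used in this step holds for every $U_{g}\in\mathsf{SU}(2)$ and is easy to confirm numerically. The following Python sketch checks it for one arbitrarily chosen rotation; the helper `su2` and the chosen parameters are illustrative assumptions.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)


def su2(theta, axis):
    """SU(2) element cos(theta/2) I - i sin(theta/2) n.sigma for a unit axis n."""
    n = np.asarray(axis, dtype=float)
    n = n / np.linalg.norm(n)
    n_sigma = n[0] * SX + n[1] * SY + n[2] * SZ
    return np.cos(theta / 2) * I2 - 1j * np.sin(theta / 2) * n_sigma


u = su2(0.9, [0.3, -1.0, 0.5])
lhs = u.conj()           # entrywise complex conjugate of U_g
rhs = SY @ u @ SY        # sigma_y U_g sigma_y

if __name__ == "__main__":
    print("conj(U) == sigma_y U sigma_y:", np.allclose(lhs, rhs))
```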

For every angle $\alpha$ , the vector $|\Phi^{*}_{V_{\alpha}}\rangle$ can be expanded as

[TABLE]

having used the notation $|l,n;j_{1},j_{2}\rangle$ for the eigenstates of the $z$ -component of the total spin of a bipartite system consisting of two spins $j_{1}$ and $j_{2}$ , respectively. Hence, we have

[TABLE]

and

[TABLE]

Moreover, the fidelity can be expressed as

[TABLE]

having defined $|\widetilde{\xi}_{\theta}\rangle=e^{i\pi J_{y}}\,|\overline{\xi}_{\theta}\rangle$ . We now insert Equation (106) into the above expression, taking advantage of the orthogonality relation

[TABLE]

In this way, the fidelity becomes

[TABLE]

with

[TABLE]

Expanding $|\xi_{\theta}\rangle$ as $|\xi_{\theta}\rangle=\sum_{n}\,\xi_{\theta,n}\,|j,n\rangle$ , we obtain

[TABLE]

Note that the bound can be attained by choosing $|\xi_{\theta}\rangle$ to be an eigenstate of $J_{z}$ with suitable eigenvalue $n$ .

Now, let $|\Gamma|=\sqrt{\Gamma^{2}}$ be the modulus of $\Gamma$ , and let $\Gamma_{+}:=(|\Gamma|+\Gamma)/2$ and $\Gamma_{-}:=(|\Gamma|-\Gamma)/2$ be the positive and negative part of $\Gamma$ , respectively. Inserting these definitions in Eq. (110), the fidelity can be upper bounded as

[TABLE]

the second inequality following from the Cauchy-Schwarz inequality applied to the vectors $\left(\sqrt{\Gamma_{+}}-\sqrt{\Gamma_{-}}\right)\,\big{(}|j,m\rangle\otimes|j,-m\rangle\big{)}$ and $\left(\sqrt{\Gamma_{+}}+\sqrt{\Gamma_{-}}\right)\,\big{(}|j,n\rangle\otimes|j,-n\rangle\big{)}$ . We will discuss the attainability of the bound (111) at the end of the proof.
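The decomposition of a Hermitian operator into positive and negative parts, $\Gamma_{\pm}=(|\Gamma|\pm\Gamma)/2$, can be computed from its spectral decomposition. The Python sketch below illustrates this for a generic Hermitian matrix; it does not use the specific operator $\Gamma$ of the proof, and the function name and example matrix are hypothetical.

```python
import numpy as np


def positive_negative_parts(gamma):
    """Return (gamma_plus, gamma_minus) for a Hermitian matrix gamma.

    gamma_plus keeps the positive eigenvalues and gamma_minus the absolute values
    of the negative ones, so that gamma = gamma_plus - gamma_minus and
    |gamma| = gamma_plus + gamma_minus.
    """
    evals, evecs = np.linalg.eigh(gamma)
    plus = (evecs * np.clip(evals, 0.0, None)) @ evecs.conj().T
    minus = (evecs * np.clip(-evals, 0.0, None)) @ evecs.conj().T
    return plus, minus


# Illustrative Hermitian matrix with one positive and one negative eigenvalue
example = np.array([[1.0, 2.0], [2.0, -1.0]], dtype=complex)
g_plus, g_minus = positive_negative_parts(example)
modulus = g_plus + g_minus

if __name__ == "__main__":
    print("Gamma = Gamma+ - Gamma-:", np.allclose(g_plus - g_minus, example))
    print("|Gamma|^2 = Gamma^2:    ", np.allclose(modulus @ modulus, example @ example))
    print("orthogonal supports:    ", np.allclose(g_plus @ g_minus, 0))
```

When the operator is positive, as the text later shows for the optimal angles, the negative part vanishes and the modulus coincides with the operator itself.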

Inserting the definition of $\Gamma$ [Eq. (E)] in the bound (111), we obtain

[TABLE]

which becomes

[TABLE]

For $j>1$ , one can easily see that each of the three summands in the above expression has its maximum value for $|m|=j$ , independently of the angles $\theta$ and $\theta^{\prime}$ . Setting $m=j$ and optimizing over $\theta^{\prime}$ we obtain that the maximum is obtained for

[TABLE]

for $\theta$ in $[0,\pi]$ , and by

[TABLE]

for $\theta$ in $(\pi,2\pi)$ (recall that the range of $\mathrm{arccot}\,$ is between $0$ and $\pi$ ). For these values of $\theta^{{}^{\prime}}$ , the entanglement fidelity is

[TABLE]

The same approach works for $j=1/2$ , in which case $|m|=j$ is the only possible choice, and the optimization over $\theta^{\prime}$ yields again the optimal value (114).

Note that the choice of angles $\theta^{\prime}$ in Eqs. (114) and (115) satisfies the condition $\cos\frac{\theta^{{}^{\prime}}}{2}\cos\frac{\theta}{2}\sin\frac{\theta^{{}^{\prime}}}{2}\sin\frac{\theta}{2}\geq 0$ . Hence, the operator $\Gamma$ is positive, and therefore $\Gamma=|\Gamma|$ . As a consequence, the inequality (111) is attained by choosing $|\xi_{\theta}\rangle=|j,m\rangle$ .

For $j=1$ , the optimal MO strategy is determined by a brute-force approach, by setting $m=0$ and $m=1$ , optimizing the right-hand-side of Eq. (113) over $\theta^{\prime}$ . When $|\pi-\theta|>0.303\pi$ , the optimal MO strategy is the same as when $j\neq 1$ . When $|\pi-\theta|\leqslant 0.303\pi$ , the optimal $m$ is $m=0$ , and the optimal angle $\theta^{{}^{\prime}}$ becomes $\theta^{{}^{\prime}}=\pi$ . Also in this case, the operator $\Gamma$ is positive, and therefore the inequality (111) is attained by choosing $|\xi_{\theta}\rangle=|j,m\rangle$ .

Appendix F Persistence of the quantum advantage

The state of the memory spin after the interaction can be obtained by application of the complementary channel $\widetilde{\mathcal{C}}_{\theta}$ , defined by

[TABLE]

where $\operatorname{Tr}_{{\rm S}}$ denotes the partial trace over the target spin, and $U_{\theta}$ is the unitary operator in Eq. (36).

To evaluate this state, it is convenient to look at the evolution of the basis states $|j,m\rangle_{g}:=U_{g}^{(j)}|j,m\rangle$ . By explicit calculation, we obtain the relation

[TABLE]

where the coefficients $c_{m+i,m}$ are given by

[TABLE]

At the first step, the memory starts in the state $|j,j\rangle_{g}$ . By repeatedly applying Eq. (117), we then obtain the memory state at every step. Explicitly, the memory state for the $n$ -th usage is given by

[TABLE]

where $p(n-1,m,\theta)$ is the probability distribution after $n-1$ usages, which is given by

[TABLE]

$U$ being Tricomi’s function (confluent hypergeometric function of the second kind). Using the recursion formula

[TABLE]

we get the asymptotic expression

[TABLE]

Now, Equation (119) gives us the memory state at the $n$ -th iteration. The fidelity obtained by using this state is given by

[TABLE]

where $F_{\rm Hei}(j,\theta,m)$ is the average fidelity when the probe is in the state $|j,m\rangle_{g}$ , namely

[TABLE]

The average over the input states can be easily computed using the relation with the entanglement fidelity, Equation (15). Using Equation (35) for the gate $U_{\theta}$ , we obtain the asymptotic expression

[TABLE]

One can see directly that, asymptotically, $F(j,\theta,m)$ is an arithmetic progression and $p(n,m,\theta)$ is a geometric progression. Inserting the above expressions into Eq. (123) we obtain

[TABLE]

Comparing with the MO fidelity in Eq. (57), we obtain that the persistence of the quantum advantage tends to $N(j,\theta)=j/(1-\cos\theta)$ .

The exact dependence of the fidelity on $n$ is shown in Figure 7 for different values of the spin and for rotation angle $\theta=\pi$ . Interestingly, the persistence of the quantum advantage is exactly equal to the asymptotic value $j/2$ for all the values of $j$ shown in the figure.
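The asymptotic persistence $N(j,\theta)=j/(1-\cos\theta)$ stated above is simple to evaluate. The following Python sketch tabulates it for a few spins and angles; the function name is hypothetical, and the formula is the large-$j$ expression quoted in this appendix, not the exact finite-$j$ result plotted in Figure 7.

```python
import math


def persistence(j: float, theta: float) -> float:
    """Asymptotic persistence of the quantum advantage, N = j / (1 - cos(theta)).

    The expression diverges for theta = 0 (mod 2*pi), where the target gate is
    the identity, so that case is rejected explicitly.
    """
    denominator = 1.0 - math.cos(theta)
    if math.isclose(denominator, 0.0, abs_tol=1e-15):
        raise ValueError("theta must not be a multiple of 2*pi")
    return j / denominator


if __name__ == "__main__":
    for j in (2, 5, 10, 50):
        row = ", ".join(
            f"theta={t:.2f}: N={persistence(j, t):8.2f}"
            for t in (math.pi / 4, math.pi / 2, math.pi)
        )
        print(f"j={j:3d} | {row}")
```

For $\theta=\pi$ the function returns $j/2$, the value quoted in the text for the cases shown in Figure 7.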

We showed the explicit calculation of $F(j,\theta,m)$ and $p(n-1,m,\theta)$ when the interaction time is fixed at every step. More general strategies where the interaction time is optimized at every step can be studied in the same way. In the large $j$ limit, we find that such step-by-step optimization is not needed: the fidelity tends to the same value, no matter whether the interaction time is optimized at every step or once for all. As a result, the persistence of the quantum advantage is the same in both scenarios.

Appendix G Robustness of the quantum strategy

Here we evaluate the fidelity in the execution of the gate $V_{\theta,\mathbf{n}}=\cos\frac{\theta}{2}\,I-i\sin\frac{\theta}{2}\,\mathbf{n}\cdot\boldsymbol{\sigma}$ when the optimal learning strategy for pure states is adopted with a probe in the thermal state $\rho_{\mathbf{n},\gamma}$ . The fidelity of this strategy is

[TABLE]

with $U_{\theta}$ as in Equation (36). Inserting the expression for the state $\rho_{\mathbf{n},\gamma}$ into the above equation, we obtain

[TABLE]

with $F_{\rm Hei}(j,\theta,m)$ defined as in Equation (124). The asymptotic expression for $F_{\rm Hei}(j,\theta,m)$ was computed in Equation (125). Inserting this expression in the above equation, we obtain

[TABLE]

Appendix H Learning higher dimensional rotations for spin- $k$ particle

Following the structure of the optimal learning mechanism for spin $1/2$ , we choose the memory state to be $|j,j\rangle_{g}$ and we let the two spins undergo the Heisenberg interaction

[TABLE]

where $\mathbf{K}=(K_{x},K_{y},K_{z})$ are the spin operators of the target spin.

Using the above strategy, we can explicitly compute the entanglement fidelity, given by

[TABLE]

where $|\Phi^{(k)+}\rangle=\frac{1}{\sqrt{2k+1}}\sum_{m=-k}^{k}|k,m\rangle\otimes|k,m\rangle$ is the canonical maximally entangled state of two spin- $k$ particles, $\rm R$ denotes a reference system entangled with the target spin- $k$ particle, and $V_{\theta}^{(k)}$ is a rotation by $\theta$ about the $z$ axis in the $(2k+1)$ -dimensional representation.

Inserting the formula of $U_{\theta}^{(k)}$ in Eq. (131), using the expressions of the Clebsch-Gordan coefficients, we arrive at the asymptotic expression

[TABLE]

The average fidelity is then given by

[TABLE]

A similar calculation can be done for the MO strategy consisting in measuring the memory state with the POVM $P_{\hat{g}}=(2j+1)\,\mathcal{U}_{\hat{g}}^{\dagger}(|j,j\rangle\langle j,j|)$ and then performing the conditional operation $V_{\theta,\hat{g}}^{(k)}=U_{\hat{g}}^{(k)}V_{\theta}^{(k)}U_{\hat{g}}^{(k)\dagger}$ on the target spin- $k$ particle, that is, rotating it by the angle $\theta$ about the rotated $z$ -axis $\hat{g}\,\mathbf{e}_{z}$ :

[TABLE]

with

[TABLE]

Denoting by $\varphi$ the angle between $|j,j\rangle_{g}$ and $|j,j\rangle_{\hat{g}}$ , and by $\tau$ the rotation angle of the rotation $V_{\theta,{g}}^{(k)\dagger}V_{\theta,{\hat{g}}}^{(k)}$ , the entanglement fidelity can be rewritten as

[TABLE]

Performing the average, we obtain the asymptotic expression

[TABLE]

which can then be used to evaluate the average fidelity as

[TABLE]

By comparing with Eq. (134), we again see that the error is exactly twice the error of the coherent quantum learning strategy.

Bibliography

1. Biamonte et al. (2017): J. Biamonte, P. Wittek, N. Pancotti, P. Rebentrost, N. Wiebe, and S. Lloyd, Nature 549, 195 (2017).
2. Dunjko and Briegel (2018): V. Dunjko and H. J. Briegel, Reports on Progress in Physics 81, 074001 (2018).
3. Aïmeur et al. (2006): E. Aïmeur, G. Brassard, and S. Gambs, in Conference of the Canadian Society for Computational Studies of Intelligence (Springer, 2006) pp. 431–442.
4. Harrow et al. (2009): A. W. Harrow, A. Hassidim, and S. Lloyd, Physical Review Letters 103, 150502 (2009).
5. Rebentrost et al. (2014): P. Rebentrost, M. Mohseni, and S. Lloyd, Physical Review Letters 113, 130503 (2014).
6. Rønnow et al. (2014): T. F. Rønnow, Z. Wang, J. Job, S. Boixo, S. V. Isakov, D. Wecker, J. M. Martinis, D. A. Lidar, and M. Troyer, Science 345, 420 (2014).
7. Wiebe et al. (2014): N. Wiebe, A. Kapoor, and K. M. Svore, arXiv preprint arXiv:1412.3489 (2014).
8. Dunjko et al. (2016): V. Dunjko, J. M. Taylor, and H. J. Briegel, Physical Review Letters 117, 130501 (2016).
