An Artificial Spiking Quantum Neuron
Lasse Bjørn Kristensen, Matthias Degroote, Peter Wittek, Alán Aspuru-Guzik, Nikolaj T. Zinner

TL;DR
This paper introduces an artificial quantum spiking neuron built from the dynamical evolution of two simple Hamiltonians followed by local measurements, and demonstrates it by classifying Bell pairs, a task in which both the input and the output are quantum states.
Contribution
It presents a design for a quantum spiking neuron based on generic qubits, Hamiltonian evolution and measurements, and shows how such neurons can be combined into networks that exploit quantum correlations between inputs and outputs.
Findings
Demonstrated classification of Bell pairs as a quantum certification protocol
Proposed a scalable architecture combining spiking neural features with quantum correlations
Showed potential for quantum neural networks leveraging non-local quantum effects
An Artificial Spiking Quantum Neuron
Lasse Bjørn Kristensen
Department of Physics and Astronomy, Aarhus University, DK-8000 Aarhus C, Denmark
Matthias Degroote
Department of Chemistry, University of Toronto, Toronto, Ontario M5G 1Z8, Canada
Department of Computer Science, University of Toronto, Toronto, Ontario M5G 1Z8, Canada
Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA 02138, USA
Peter Wittek
Rotman School of Management, University of Toronto, Toronto, Ontario, Canada
Creative Destruction Lab, Toronto, Ontario, Canada
Vector Institute for Artificial Intelligence, Toronto, Ontario M5S 1M1, Canada
Perimeter Institute for Theoretical Physics, Waterloo, Ontario N2L 2Y5, Canada
Alán Aspuru-Guzik
Department of Chemistry, University of Toronto, Toronto, Ontario M5G 1Z8, Canada
Department of Computer Science, University of Toronto, Toronto, Ontario M5G 1Z8, Canada
Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA 02138, USA
Vector Institute for Artificial Intelligence, Toronto, Ontario M5S 1M1, Canada
Zapata Computing Inc., Cambridge, MA 02139, USA
Canadian Institute for Advanced Research, Toronto, Ontario M5G 1Z8, Canada
Nikolaj T. Zinner
Department of Physics and Astronomy, Aarhus University, DK-8000 Aarhus C, Denmark
Aarhus Institute of Advanced Study, Aarhus University, DK-8000 Aarhus C, Denmark
Abstract
Artificial spiking neural networks have found applications in areas where the temporal nature of activation offers an advantage, such as time series prediction and signal processing. To improve their efficiency, spiking architectures often run on custom-designed neuromorphic hardware, but, despite their attractive properties, these implementations have been limited to digital systems. We describe an artificial quantum spiking neuron that relies on the dynamical evolution of two easy to implement Hamiltonians and subsequent local measurements. The architecture allows exploiting complex amplitudes and back-action from measurements to influence the input. This approach to learning protocols is advantageous in the case where the input and output of the system are both quantum states. We demonstrate this through the classification of Bell pairs which can be seen as a certification protocol. Stacking the introduced elementary building blocks into larger networks combines the spatiotemporal features of a spiking neural network with the non-local quantum correlations across the graph.
Introduction
As Moore’s law slows down waldrop2016chips , increased attention has been put towards alternative models for solving computationally hard problems and analyzing the ever-growing stream of data gantz2012digital ; hashem2015rise . One significant example has been the reinvigoration of the field of machine learning: neuromorphic models, inspired by biology, found applications in a large host of fields krizhevsky2012imagenet ; sutskever2014sequence . In parallel, quantum computing has been taking significant steps moving from a scientific curiosity towards a practical technology capable of solving real-world problems preskill2018quantum . Given the prominence of both fields, it is not surprising that a lot of work has gone into exploring their parallels, and how one may be used to enhance the other. One such synergy has emerged in the field of quantum machine learning biamonte2017quantum ; dunjko2018machine ; kapoor2016quantum . Recent results aim to mimic the parametric, teachable structure of a neural network with a sequence of gates on a set of qubits schuld2018circuit ; killoran2018continuous ; tacchino2019artificial or on a set of photonic modes steinbrecher2019quantum . A subset of these algorithms focuses on quintessentially quantum problems: the input to the learning model is a quantum state and so is its output. This scenario is relevant in building and scaling experimental devices and it is often referred to as quantum learning monras2016inductive ; albarran2018measurement .
We take a slightly different approach to quantum learning wherein the structure of the network manifests itself as interactions between qubits in space rather than as gates in a circuit diagram. Specifically, we will present a small toolbox of simple spin models that can be combined into larger networks capable of neuromorphic quantum computation. To illustrate the power of such networks, a small example of such a ‘spiking quantum neural network’ capable of comparing two Bell states is presented, a task which could have applications in both state preparation and quantum communication. The term ‘spiking’ refers to the temporal aspect in the functioning of the model during the activation of the neuron, akin to the classical spiking neural networks maass1997networks . As illustrated in the example, a fundamental property of these networks is that they generate entanglement between the inputs and outputs of the network, thus allowing measurement back-action from standard measurements on the output to influence the state of the input in highly non-trivial ways. The proposed model for spiking quantum neural networks is amenable to implementation in a variety of physical systems, e.g., using superconducting qubits kjaergaard2019superconducting ; kounalakis2018tuneable ; wallraff2007sideband .
A key inspiration for the constructions presented in this paper is the advent of dedicated classical hardware for simulating spiking classical neural networks, including implementations from both Intel and IBM Roy2019 ; Merolla2014 ; Davies2018 ; Benjamin2014 . These systems emulate the function of a biological spiking neural network through networks of small neuron-like computing resources. They can be roughly grouped into digital systems simulating the dynamics of spiking networks using binary variables and in discrete time-steps Merolla2014 ; Davies2018 , and analog circuitry emulating spiking behaviour of physical observables in continuous time Benjamin2014 . The main focus of this paper is to extend the concept of the discretized models to the quantum domain, facilitating the use of purpose-built neuromorphic systems for applications within quantum learning, and providing the models access to quantum resources like entanglement. Additionally, an outlook towards the implementation of continuous-time computational quantum dynamics is also briefly discussed.
Note that several similar proposals for real-space quantum neural networks do exist. One prominent example is within the field of quantum memristors, where a spiking quantum neuron was recently constructed, and networks of these objects were proposed gonzalez2019quantized . The central distinction to our scheme is the degrees of freedom under consideration. Whereas our scheme is based on generic qubits, memristive schemes revolve around the dynamics of voltages and currents. This means that the operations of our neural networks below will be more easily interpretable in a quantum computing context in comparison to these dissipative memristive schemes. A more apt comparison may therefore be a recent proposal for implementing a quantum perceptron through adiabatic evolution of an Ising model torrontegui2019unitary . Indeed, the interaction implemented in that proposal very closely resembles the operation of the first spin-model presented below. However, the nature of the adiabatic protocol makes the path towards connecting such building blocks into a larger dynamical network unclear. Neither case fully captures the properties of the spiking quantum neural network proposed here.
Results
Defining the Building Blocks
The first step towards a neuromorphic quantum spin model is the construction of neuron-like building blocks. In other words, we need objects capable of sensing the state of a multi-spin input state and encoding information about relevant properties of this input into the state of an output spin. Additionally, we will require that this operation does not disturb the state of the input. This additional property is partially motivated by a similar property of the neurons used in e.g. classical feed forward networks, which similarly only exert influence on the state of the network through their output. Furthermore, we show in Sec. .3 below that the preservation of inputs allows for interesting and non-trivial effects of the entanglement between the generated output and the preserved input. The cost of this preservation is a set of restrictions on each building block, as described in more detail below.
Inspired by the way classical neurons activate based on a (weighted) sum of their inputs, the first building block will be one that flips the state of its output spin, depending on how many of the input spins are in the ‘active’ -state. The second building block, on the other hand, measures relative phases of components in the computational (i.e. the ) basis, and thus has no classical analogue.
.1 Neuron 1: Counting Excitations
In analogy to the thresholding behaviour of classical spiking neurons, we start by constructing a spin system that is capable of detecting the number of excitations (i.e. the number of inputs in the state ) and exciting its output spin conditional on this information. As shown in Supplementary Note 1, this behaviour can be implemented using dynamical evolution driven by the Hamiltonian
[TABLE]
assuming the evolution is run for a time and that the Hamiltonian fulfils some restrictions on the interaction strengths that will be discussed below. In this model, we label spins 1 and 2 as the input, and spin 3 as the output. The intuition behind this model is that the Heisenberg interaction between the inputs sets up energy differences among the four possible Bell states of the inputs. Through the coupling to the output, these differences then influence the energy cost of flipping the output spin, resulting in the cosine drive on the output only being resonant when the input qubits are in certain states. The result is that the driving induces flips in the output qubit if and only if the input is in a Bell state with an even number of excitations. Since the conditionality is a resonance/off-resonance effect, the detuning of the undesired transitions needs to be much larger than the strength of the driving, which leads to the criterion
[TABLE]
which is naturally fulfilled whenever the driving-strength is much smaller than the chain interaction strength .
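The resonance argument can be illustrated with the textbook Rabi formula for a single driven two-level system. The sketch below is not the three-spin model of the paper: the names `omega` (drive strength) and `delta` (detuning) are generic, hypothetical symbols, and the rotating-wave approximation is assumed. It only shows why a detuning much larger than the drive strength suppresses unwanted flips.

```python
import math

def rabi_flip_probability(omega: float, delta: float, t: float) -> float:
    """Flip probability of a driven two-level system (rotating-wave approximation).

    omega: drive (Rabi) strength, delta: detuning from resonance, t: elapsed time.
    """
    omega_eff = math.hypot(omega, delta)  # generalized Rabi frequency
    if omega_eff == 0.0:
        return 0.0
    return (omega / omega_eff) ** 2 * math.sin(omega_eff * t / 2.0) ** 2

def max_flip_probability(omega: float, delta: float) -> float:
    """Largest flip probability reachable at any time for a given detuning."""
    if omega == 0.0:
        return 0.0
    return omega**2 / (omega**2 + delta**2)

omega = 1.0
t_pi = math.pi / omega  # duration of a resonant pi-pulse

# On resonance the output flips fully; far off resonance it barely moves.
for delta in (0.0, 1.0, 10.0, 100.0):
    print(delta, rabi_flip_probability(omega, delta, t_pi), max_flip_probability(omega, delta))
```

In this simplified picture the residual flip probability of an off-resonant transition is bounded by omega²/(omega² + delta²), so it falls off quadratically once the detuning dominates the drive.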
Due to dynamical phases, the requirement that superpositions of input states should be preserved adds two additional constraints for the parameters of the model. Specifically, conservation of relative phases within the subspaces of inputs that either flip (“above threshold”) or do not flip (“below threshold”) the output yields
[TABLE]
If these constraints are fulfilled, the only non-trivial phase will be a coherent phase between the above-threshold and below-threshold subspaces of , where is the time elapsed during detection. Since these two subspaces are now distinguished by the state of the output, correcting for this phase is just a matter of performing the corresponding phase-gate on the output qubit. In the special case where is an integer, this reduces to performing a -rotation of the output about the -axis (see Supplementary Note 1 for details).
When combined with this subsequent unconditional phase gate, the dynamical evolution induced by the Hamiltonian in (1) is to coherently detect the parity of the number of excitations of the input, and to encode this information in the output spin, i.e. in conventional Bell-state notation:
[TABLE]
where the first ket denotes the state of the inputs and the second ket the state of the output. The output is either fully excited or not excited at all by the evolution—hence we refer to this structure as a spiking quantum neuron, in analogy to similar objects from classical computing. Sample simulations showing the dynamics of the spiking process are shown in Fig. 1. Note that the dynamics implementing this behaviour is linear unitary time evolution, thus the effect on general 2-qubit inputs follows from expressing the input in the Bell-state basis and applying the rules of (4). For the neuron-parameters and , the corresponding operation is performed with an average fidelity of when averaged over all possible 2-qubit inputs.
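The idealized action of the excitation-counting neuron can be written down as bookkeeping on Bell-basis amplitudes. The following is a minimal sketch of that idealized map only, not of the Hamiltonian dynamics; the labels `Phi+`, `Phi-`, `Psi+`, `Psi-` follow the conventional Bell-state notation, and the dictionary representation and function name are hypothetical choices made for illustration.

```python
import math

# Conventional Bell-state labels, grouped by excitation-number parity.
EVEN_PARITY = {"Phi+", "Phi-"}  # (|00> +/- |11>)/sqrt(2): 0 or 2 excitations
ODD_PARITY = {"Psi+", "Psi-"}   # (|01> +/- |10>)/sqrt(2): exactly 1 excitation

def excitation_neuron(state):
    """Apply the idealized neuron-1 map.

    state: dict mapping (bell_label, output_bit) -> complex amplitude.
    The output bit is flipped if and only if the input has even parity;
    the input label and its amplitude are left untouched.
    """
    new_state = {}
    for (label, out), amp in state.items():
        new_out = out ^ 1 if label in EVEN_PARITY else out
        key = (label, new_out)
        new_state[key] = new_state.get(key, 0) + amp
    return new_state

s = 1 / math.sqrt(2)
# Superposition of an even-parity and an odd-parity input, output initially 0.
superposition = {("Phi+", 0): s, ("Psi-", 0): s}
print(excitation_neuron(superposition))
```

Because the map acts term by term, a superposition input ends up entangled with the output: the even-parity component carries a flipped output while the odd-parity component does not.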
.2 Neuron 2: Detecting Phases
While neuron 1 is a fully quantum mechanical object, capable of coherently treating superpositions in the inputs, the property that it detects—the number of excitations in the input—would be similarly well-defined for a classical neuron participating in a classical digital computation. However, the state of the two input qubits will also be characterized by properties that have no classical analogue, such as the relative phases of terms in a superposition state. The goal of the second neuron is to be able to detect these relative phases of states in the computational basis. Specifically, it aims to distinguish the states from the states . Combining this detection-capability with the capabilities of the excitation-counting neuron of the previous section (exemplified by (4)) allows complete discrimination between the four Bell states.
The operational principle of the phase-detection neuron is similar to that of the excitation-detection neuron: it relies on a combination of single-qubit gates and the unitary time-evolution generated by a Hamiltonian of the form:
[TABLE]
As shown in Supplementary Note 2, running the dynamics of this Hamiltonian for a time performs a -gate on the output qubit if and only if the input qubits are in the subspace spanned by and . Thus conjugating this operation with Hadamard gates on the output qubit and correcting for the -phase using a phase-gate (see Figure 2) yields the desired phase-detection operation:
[TABLE]
The fundamental principle of operation is identical to the one of the excitation-counting neuron, in that the Hamiltonian once again contains three terms: a Heisenberg interaction to set up an energy spectrum that distinguishes the Bell states, an interaction that tunes the energy of the output qubit (i.e. qubit 3) dependent on the state of the inputs, and a single-qubit operator attempting to change the state of the output and succeeding if and only if the driving related to this term matches the energy cost of flipping the output. The only difference is that the interaction term now sets up an energy splitting between the spin states rather than the states , hence the need for Hadamard gates to convert between the two bases. Note that all of these operations are again linear, thus (6) specifies the operation also for general 2-qubit inputs. For the neuron-parameters and , the corresponding phase-detection operation is performed with an average fidelity of when averaged over all possible 2-qubit inputs, with higher values achievable through small adjustments (see Supplementary Note 2 for details).
As the operation of the phase-detection neuron also relies on resonance/off-resonance effects, a restriction of similar to (2) is present. Specifically, we require that
[TABLE]
Additionally, the requirement that the state of the inputs should not be distorted by the operation of the neuron yields the requirement that the ratios between and should fulfil:
[TABLE]
with .
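One way to see why the two detected properties suffice to discriminate the four Bell states is to note that they correspond to the eigenvalues of the Pauli products Z⊗Z (excitation parity) and X⊗X (relative sign). This identification is a standard fact about Bell states rather than something stated in the text above, and the sketch below only checks it numerically; it does not model the neuron Hamiltonians.

```python
import math

s = 1 / math.sqrt(2)
# Amplitudes in the computational basis order |00>, |01>, |10>, |11>.
BELL = {
    "Phi+": [s, 0, 0, s],
    "Phi-": [s, 0, 0, -s],
    "Psi+": [0, s, s, 0],
    "Psi-": [0, s, -s, 0],
}

def expect_zz(psi):
    """<Z(x)Z>: +1 for an even number of excitations, -1 for an odd number."""
    signs = [1, -1, -1, 1]
    return sum(sign * abs(a) ** 2 for sign, a in zip(signs, psi))

def expect_xx(psi):
    """<X(x)X>: X(x)X swaps |00> with |11> and |01> with |10>."""
    flipped = psi[::-1]
    return sum((complex(a).conjugate() * b).real for a, b in zip(psi, flipped))

def two_bit_label(psi):
    """Return (parity eigenvalue, relative-sign eigenvalue), each +1 or -1."""
    return round(expect_zz(psi)), round(expect_xx(psi))

for name, psi in BELL.items():
    print(name, two_bit_label(psi))
```

Each Bell state receives a distinct pair of eigenvalues, which is the sense in which one parity detection plus one phase detection yields complete discrimination.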
.3 State Comparison Network
Having introduced a set of computational building blocks above, we now aim to illustrate how these can be combined into larger networks in order to solve computation- and classification-problems. Specifically, we will illustrate how a network of these objects allows one to compare pairs of Bell states to determine if they are the same Bell state. As detailed below, such a network could play a central role in the certification of Bell-pair sources and quantum channels, and may also have potential applications for machine learning and state preparation tasks.
The basic structure of the proposed network is depicted in Fig. 3, and consists of three layers. The first layer constitutes the input to the system. It is in these four qubits that the two Bell states to be compared are stored. Each pair is used as an input to both of the types of neurons detailed in sections .1 and .2, with the output stored in two pairs of qubits in the second layer. In this way, sequentially running each of the two neuron operations extracts both the excitation-number parity and the relative phase of superpositions in the inputs and encodes it into the second layer. In other words, this layer ends up containing exactly the information needed to distinguish among the four Bell states. State comparison therefore boils down to detecting if the information extracted from one input matches that extracted from the other input. Since detecting if two qubits are in the same state (i.e. both or both ) boils down to checking the number of excitations modulo two, this comparison can be done using the neuron of Section .1. The third layer thus encodes two bits of information: whether the excitation parity of the two inputs match, and whether the relative phases match. Detecting if the two inputs were the same Bell state is thus a matter of detecting if both of these bits are in the -state. This can be done using methods similar to those of Section .1 – see Supplementary Note 3 for details.
The result of the manipulations is that the output is put into the -state if the two Bell-state inputs were identical and otherwise. Since all of the operations are achieved through linear, unitary dynamics, the behaviour for superposition inputs follows from this rule and linearity. This also implies that the network cannot compare arbitrary states, as is indeed prohibited by the no-cloning theorem of quantum mechanics. For instance, two identical inputs in a superposition in the Bell basis may return either or as output:
[TABLE]
This illustrates an interesting property of quantum neural networks, namely that entanglement between inputs and outputs means that measurements of the outputs may have dramatic effects on the state of the inputs. In this case, a measurement of in the output will always project the input qubits into the states corresponding to this output, i.e. to identical Bell-state pairs. In this sense, a classifying quantum network like the one above will simultaneously be a projector onto the spaces corresponding to the states it is built to classify, a property that might prove helpful in, for instance, state preparation schemes. Note that this simple interpretation of measurement back-action follows from the fact that no other perturbations on the input state have been performed by the network during the computation, and hence from the corresponding non-disturbance requirement applied to each of the neurons.
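The projection described above can be traced through for a concrete input. The sketch below treats the comparison network as ideal and works directly with Bell-basis amplitudes; the index convention (0 to 3 for the four Bell states) and the function name are hypothetical, and nothing here simulates the underlying dynamics.

```python
import math

def project_on_identical(a, b):
    """Post-measurement input state when the output reports 'identical'.

    a, b: sequences of four Bell-basis amplitudes for the two input pairs.
    Returns (probability of that outcome,
             normalized state as {(i, j): amplitude} over Bell-index pairs).
    """
    # The joint input carries amplitude a[i] * b[j] on |Bell_i>|Bell_j>;
    # the ideal network flags exactly the components with i == j.
    matching = {(i, i): a[i] * b[i] for i in range(4) if a[i] * b[i] != 0}
    probability = sum(abs(amp) ** 2 for amp in matching.values())
    if probability == 0:
        return 0.0, {}
    norm = math.sqrt(probability)
    return probability, {key: amp / norm for key, amp in matching.items()}

s = 1 / math.sqrt(2)
psi = [s, 0, s, 0]  # equal superposition of Bell states 0 and 2
probability, post_state = project_on_identical(psi, psi)
print(probability, post_state)
```

For this input the 'identical' outcome occurs half of the time, and when it does the two pairs are left in an entangled superposition of "both in Bell state 0" and "both in Bell state 2", even though they started out as a product state.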
A more concrete application of the network above is the certification of quantum channels and Bell-state sources. The ability to determine the reliability of resources such as Bell-state sources and quantum channels would be a practical benefit in many quantum communication and quantum cryptography applications. This is an active area of research: for instance, device-independent self-testing through Bell inequalities works for certain multipartite entangled states supic2017simple , or quantum template matching for the case where we have two possible template states sasaki2001quantum ; sentis2012quantumlearning . Since the device presented above allows for the comparison of unknown systems with known-good ones, it is ideally suited to this kind of certification task.
Finally, it is worth noting that the comparative nature of the network means that the output of the network defines a kernel between 2-qubit quantum states. Specifically, given two inputs represented by amplitudes and in the Bell-state basis, the probability of measuring in the output will be given by
[TABLE]
which bears a strong resemblance to the classical “expected likelihood” kernel jebara2004probability . Considering this, comparison networks like the one above may also find applications within kernel-based quantum machine learning approaches schuld2019quantum ; havlicek2019supervised .
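The form of this kernel follows from linearity and the comparison rule stated earlier. As a sketch, write the Bell basis as $|\beta_i\rangle$ and use the placeholder labels $|\mathrm{same}\rangle$ and $|\mathrm{diff}\rangle$ for the two output states (these labels, and the symbols $p_i$ and $q_i$, are introduced here for illustration only):

```latex
\begin{align}
|a\rangle|b\rangle|\mathrm{out}_0\rangle
  &= \sum_{i,j} a_i b_j\, |\beta_i\rangle|\beta_j\rangle|\mathrm{out}_0\rangle \\
  &\longrightarrow \sum_{i} a_i b_i\, |\beta_i\rangle|\beta_i\rangle|\mathrm{same}\rangle
     + \sum_{i \neq j} a_i b_j\, |\beta_i\rangle|\beta_j\rangle|\mathrm{diff}\rangle, \\
P(\mathrm{same}) &= \sum_{i} |a_i|^2 |b_i|^2 = \sum_i p_i q_i,
  \qquad p_i = |a_i|^2,\; q_i = |b_i|^2 .
\end{align}
```

Read this way, the output probability is the overlap of the two Bell-basis probability distributions, which is the discrete analogue of integrating the product of two densities.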
One thing to note is that the design of the structure in Fig. 3 is motivated by a desire for a one-step forward propagation of information between the layers of qubits, in analogy to the propagation of information in artificial classical neural networks. If each layer is allowed to probe the preceding layer more than once, a significant reduction in qubit overhead is possible. An example of this is the network depicted in Fig. 4. This network performs the same operation as the network in Fig. 3 while omitting the entire second layer. This is achieved by using the same qubit as target for both of the neurons detecting a given property. In this way, a shared property between the two input pairs means the two sequential detections result in an even number of flips to the corresponding qubit of the middle layer (either 0 or 2). On the other hand, non-identical properties will instead lead to precisely one flip. Thus the initial state of a qubit in the middle layer is preserved if and only if the property that it detects is identical between the two input pairs. The state of the output qubit is then determined by performing a flip if and only if both qubits in the middle layer have remained in their initial -state. In practice, this can be done using a similar generalization to the final step as the one used in the larger network (see Supplementary Note 3 for details). Thus allowing multiple sequential probings of the inputs by the middle layer allows the qubit-count to be reduced by 4, though at the expense of a slight increase in the complexity of the protocol for forward propagation of information as well as an increase in the maximum number of connections required by a qubit from 3 to 4.
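For Bell-state inputs, the logic of the reduced network amounts to classical bookkeeping with exclusive-or operations, which can be checked exhaustively. The sketch below is such a check under illustrative assumptions: the 0/1 encoding of the two properties, including which property value triggers a flip, is hypothetical, and superposition inputs are not covered.

```python
from itertools import product

# Hypothetical encoding: (excitation-parity bit, relative-sign bit) per Bell state.
BELL_PROPERTIES = {
    "Phi+": (0, 0),
    "Phi-": (0, 1),
    "Psi+": (1, 0),
    "Psi-": (1, 1),
}

def reduced_network(first, second):
    """Return 1 if the reduced network flags the two Bell-state labels as identical."""
    middle = [0, 0]  # one middle-layer qubit per property, both starting in 0
    for label in (first, second):  # the two sequential probings of the inputs
        for k, bit in enumerate(BELL_PROPERTIES[label]):
            middle[k] ^= bit  # each detection conditionally flips the shared target
    # The output is flipped iff both middle qubits kept their initial state.
    return int(middle == [0, 0])

for first, second in product(BELL_PROPERTIES, repeat=2):
    print(first, second, reduced_network(first, second))
```

A shared property produces zero or two flips and therefore leaves the corresponding middle qubit unchanged, so the output is set exactly for the four matching pairs out of sixteen.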
Discussion
We have presented a set of building blocks that detects properties of two-qubit inputs and encodes these properties in a binary and coherent way into the state of an output qubit. To illustrate the power of such spiking quantum neurons, we have presented a network of these building blocks capable of identifying if two Bell states are identical or not, and argued the usefulness of such comparison networks for quantum certification tasks within quantum communication and quantum cryptography. Additionally, we have seen how the entanglement of the inputs and output results in highly non-trivial effects on the inputs when a measurement is performed on the output of the network.
From the considerations above, several interesting questions arise. A main question might be how to scale up structures made from these and similar building blocks into larger networks capable of performing more complex quantum processing tasks. Concerning scaling, it seems reasonable to expect that the kind of intuitive reasoning behind the operation of the network presented here will become ever more challenging. As a result, it might be fruitful to take inspiration from the field of classical neural networks and design quantum networks whose operation depends on parameters. In this way, one can then adjust these parameters to make the network perform a certain task, in a way analogous to how both classical and quantum neural networks are trained. Since the Hamiltonians responsible for the operation of the neurons already contain a number of parameters, the architecture presented in this paper seems well-suited to such an approach. An interesting avenue of further research is therefore the generalization of the dynamical models presented here to tunable models capable of detecting other structures in multi-qubit inputs than the two-qubit Bell-state properties detected here, or to models capable of solving other relevant quantum-computing problems, for instance parity detection for error correction protocols.
A central challenge in such a learning-based approach would be how best to optimize the parameters of the model, and how to identify the class of operations that can be implemented by a given model. Much work is currently being dedicated to these questions in the context of variational algorithms in gate-based quantum computation Peruzzo2014 ; Mcclean2016 ; schuld2018circuit ; Farhi2018 ; Farhi2014 . However, for the dynamical models described in this paper, these problems are best framed within the field of optimal control theory, where a number of methods and results already exist on the optimization of pulses and parameters for quantum models Magann2020 ; Yang2020 . Furthermore, advanced methods such as genetic algorithms Mortimer2020 and reinforcement-learning Niu2019 have recently begun to garner interest within this field. Thus the intersection of quantum optimal control and quantum machine learning seems a fertile avenue of further research, as recently pointed out in Magann2020 . Nevertheless, optimization of the parameters of the model is in general expected to be a hard problem, although the difficulty compared to the corresponding optimizations currently faced by variational algorithms in gate based quantum computers is believed by the authors to be an open question.
Another possible avenue of research towards improving the schemes presented here is to reduce the complexity of operation related to having to turn interactions between the different layers on and off by instead employing autonomous methods similar to those already used within the field of quantum error correction. Using such methods to perform quantum operations in a coherent way can be highly non-trivial, but for networks like the one described above where the latter stages of the network are essentially classical processing, strict coherences should not be needed for the network to operate, thus lowering the bar for autonomous implementations of similar networks. Furthermore, a faithful reproduction of the spiking action of biological neurons necessarily requires non-linearities and non-unitary reset. Thus engineered decoherence would also be an essential resource if more closely reproducing the continuous-time dynamical behaviour of classical spiking neurons is the goal.
Finally, tapping into the temporality of the neurons presented above also holds great promise. Indeed, it has already been shown that the temporal behaviour of comparatively simpler networks of spins allows for universal quantum computation childs2013universal . Thus we believe that augmenting the neurons of this paper with less constrained and clock-like dynamics combined with tunable, teachable behaviour and perhaps partial autonomy would be a promising route towards a neuromorphic architecture capable of solving complicated and interesting problems within quantum learning. Additionally, this approach will further distinguish the spiking quantum neural networks from conventional gate-based approaches, both through temporality and through increased complexity. Indeed, while all neuron operations presented here can be implemented on gate-based quantum computers at a cost of between two and six 2-qubit gates, it is unlikely that the same will hold true for the operation of larger, coupled, parametrized networks. This expectation mirrors the expected advantage of classical special-purpose hardware for neuromorphic computing: it does not perform computations that a general-purpose processor could not perform, but may do so faster and more efficiently. In a similar manner, the operations performed here could be emulated on either gate-based quantum computers or (universal) annealing architectures, but would require some overhead, for instance in the form of requiring multiple control pulses to emulate the single-pulse evolution of the excitation-counting neuron. Conversely, while we expect universal computation with models similar to those presented here to be possible (assuming either sufficiently large networks, like in childs2013universal , or sufficiently reconfigurable couplings), we do not expect such a construction to be an efficient architecture for arbitrary classes of algorithms.
In other words, it seems likely that a spiking quantum neural network will tend to naturally implement different operations, and therefore tackle different classes of problems, compared to annealing-based or gate-based algorithms, thus potentially making it a valuable additional tool for quantum machine learning.
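To give a feel for the lower end of the gate count quoted above, the excitation-parity detection can be emulated with two CNOT gates and one single-qubit X gate. This is one possible decomposition, assumed here for illustration and not necessarily the one the authors counted; the tiny state-vector helpers are hypothetical and written with the standard library only.

```python
import math

N_QUBITS = 3  # qubits 0 and 1 are the inputs, qubit 2 is the output

def apply_cnot(state, control, target):
    """Apply a CNOT to a 3-qubit state vector (qubit 0 is the leftmost bit)."""
    new = [0j] * len(state)
    for idx, amp in enumerate(state):
        control_bit = (idx >> (N_QUBITS - 1 - control)) & 1
        new_idx = idx ^ (1 << (N_QUBITS - 1 - target)) if control_bit else idx
        new[new_idx] += amp
    return new

def apply_x(state, target):
    """Apply a Pauli-X (bit flip) to one qubit of a 3-qubit state vector."""
    new = [0j] * len(state)
    for idx, amp in enumerate(state):
        new[idx ^ (1 << (N_QUBITS - 1 - target))] += amp
    return new

def parity_circuit(state):
    """Flip the output qubit iff the inputs hold an even number of excitations."""
    state = apply_cnot(state, 0, 2)
    state = apply_cnot(state, 1, 2)  # output now holds the XOR of the inputs
    return apply_x(state, 2)         # extra X turns 'flip on odd' into 'flip on even'

s = 1 / math.sqrt(2)
phi_plus_0 = [s, 0, 0, 0, 0, 0, s, 0]    # (|00> + |11>)/sqrt(2) with output |0>
psi_minus_0 = [0, 0, s, 0, -s, 0, 0, 0]  # (|01> - |10>)/sqrt(2) with output |0>
print(parity_circuit(phi_plus_0))
print(parity_circuit(psi_minus_0))
```

The even-parity input comes out with the output qubit flipped and the odd-parity input is returned unchanged, with the input amplitudes and relative signs preserved in both cases.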
Methods
Simulations of the Dynamics
The plots depicted in Fig. 1 and Fig. 2 were generated by numerically simulating the dynamics of the Hamiltonians (1) and (5), respectively, using the Python toolbox QuTiP Johansson2012 . Specifically, the system was initialized in states of the form
[TABLE]
where the two kets in the notation specify the state of the input qubits (qubits 1 and 2, first ket) and the output qubit (qubit 3, second ket) separately. The built-in numerical solver of the QuTiP library was then used to find the trajectories resulting from the time-evolution of each of these states:
[TABLE]
Using these trajectories, the evolution of the quantities depicted in the plot could then be calculated, including the expectation values related to the output qubit:
[TABLE]
and the overlap between the current state of the inputs and their initial state:
[TABLE]
Note that the effect of the second term in this expectation-value is to trace out the dependence on the state of the output. The phase- and Hadamard-gates required for the operation of the neurons were implemented by evolving the system using Hamiltonians of the form:
[TABLE]
for a sufficient amount of time to implement the operation, i.e.:
[TABLE]
Note that all simulations were performed without the simulation of noise and decoherence.
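Implementing single-qubit gates by Hamiltonian evolution can be illustrated with the closed form of the exponential of a Pauli vector. The gate Hamiltonians used in the simulations are not reproduced in this text, so the choices below (a field along (x + z)/√2 for the Hadamard gate and along z for a phase gate, with ħ = 1) are standard textbook assumptions rather than the authors' exact expressions.

```python
import math

def evolve(n, theta):
    """Return U = exp(-i * theta * (n . sigma)) for a unit vector n = (nx, ny, nz).

    This is the evolution under H = omega * (n . sigma) for a time t,
    with theta = omega * t and hbar = 1.
    """
    nx, ny, nz = n
    c, s = math.cos(theta), math.sin(theta)
    return [
        [c - 1j * s * nz, -1j * s * (nx - 1j * ny)],
        [-1j * s * (nx + 1j * ny), c + 1j * s * nz],
    ]

def equal_up_to_phase(u, v, tol=1e-12):
    """Check whether two 2x2 matrices agree up to a global phase factor."""
    for i in range(2):
        for j in range(2):
            if abs(v[i][j]) > tol:
                phase = u[i][j] / v[i][j]
                if abs(abs(phase) - 1.0) > tol:
                    return False
                return all(
                    abs(u[a][b] - phase * v[a][b]) < tol
                    for a in range(2) for b in range(2)
                )
    return False

r = 1 / math.sqrt(2)
HADAMARD = [[r, r], [r, -r]]
S_GATE = [[1, 0], [0, 1j]]

print(equal_up_to_phase(evolve((r, 0.0, r), math.pi / 2), HADAMARD))
print(equal_up_to_phase(evolve((0.0, 0.0, 1.0), math.pi / 4), S_GATE))
```

Both gates are reproduced only up to a global phase, which is unobservable for an isolated qubit but would need to be tracked if the gate were applied conditionally.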
Computation of the average Fidelity
In order to quantify the performance of the neurons, the simulations of the dynamics were combined with tools based on Nielsen2002 for calculating the average fidelity of operations, thus allowing the operations implemented by the neurons to be compared to the idealized operations defined in Eq. (4) and Eq. (6). Each of the fidelities was calculated within the subspace consisting of the 6 states appearing in the corresponding definition, with the following additions made in order to fully specify the desired effects of the neurons within this subspace:
[TABLE]
In other words, the average fidelity was computed by comparing the implemented operation to this idealized operation and then averaging over a uniformly distributed ensemble over the space spanned by the 6 states used in the definition of the gate:
[TABLE]
Note that the 6 states only enter in the ensemble of initial states: the simulations employed the full Hilbert space, and infidelity from leakage out of the 6-state space is fully accounted for by the fidelity metric. In the specific models presented here, this leakage additionally turned out to be relatively negligible, on the order of .
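The subspace average described above can be sketched with a standard closed-form expression for the fidelity averaged over uniformly distributed pure states in a d-dimensional subspace, F = (Tr(M M†) + |Tr M|²) / (d(d+1)). This is an assumption about one convenient way to do the computation, not necessarily the exact procedure used for the paper. Here M compares the images of the subspace under the ideal and the implemented operation, so leakage out of the subspace makes M non-unitary and lowers the fidelity. The function name, the choice of six computational basis states and the leakage rotation are hypothetical.

```python
import numpy as np

def average_fidelity(U_actual, U_ideal, basis):
    """Average fidelity over uniformly distributed states in span(basis).

    basis: (D, d) array whose orthonormal columns span the d-dimensional
    subspace of initial states inside the full D-dimensional space.
    """
    d = basis.shape[1]
    # Overlaps between ideal and actual output states for the basis inputs.
    M = (U_ideal @ basis).conj().T @ (U_actual @ basis)
    return (np.trace(M @ M.conj().T).real + abs(np.trace(M)) ** 2) / (d * (d + 1))

# Three qubits (D = 8) with a hypothetical 6-state subspace: the first six
# computational basis states.
D, d = 8, 6
basis = np.eye(D, dtype=complex)[:, :d]
U_ideal = np.eye(D, dtype=complex)

# Toy "implemented" operation: a small rotation that leaks population from
# basis state 5 (inside the subspace) into state 6 (outside it).
theta = 0.1
U_leaky = np.eye(D, dtype=complex)
U_leaky[5, 5] = U_leaky[6, 6] = np.cos(theta)
U_leaky[5, 6] = -np.sin(theta)
U_leaky[6, 5] = np.sin(theta)

F_perfect = average_fidelity(U_ideal, U_ideal, basis)
F_leaky = average_fidelity(U_leaky, U_ideal, basis)
print(F_perfect, F_leaky)
```

The dynamics are still represented on the full space; only the ensemble of initial states is restricted to the subspace, which mirrors the statement that leakage is accounted for by the metric.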
Data Availability
The code and datasets used in the current study are available from the corresponding author upon request.
Acknowledgments
L.B.K. and N.T.Z. acknowledge funding from the Carlsberg Foundation and the Danish Council for Independent Research (DFF-FNU). M.D. acknowledges support by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, Quantum Algorithms Teams Program. A.A.-G. acknowledges support from the Army Research Office under Award No. W911NF-15-1-0256 and the Vannevar Bush Faculty Fellowship program sponsored by the Basic Research Office of the Assistant Secretary of Defense for Research and Engineering (Award number ONR 00014-16-1-2008). A.A.-G. also acknowledges generous support from Anders G. Frøseth and from the Canada 150 Research Chair Program.
Author Contributions
N.T.Z. and L.B.K. formulated the initial goal of the research, and M.D., P.W. and A.A.-G. subsequently participated in a further refinement of the scope and focus of the research. The bulk of the analytical and numerical work of the study as well as the creation of the initial draft was performed by L.B.K., with further significant contributions to the manuscript by M.D. and P.W. and crucial revisions by A.A.-G. and N.T.Z.
Competing Interests
The authors declare no competing interests.
Supplementary Material
This supplemental material aims to give a more detailed analysis of the spin-models leading to the two neurons presented in the main text, as well as some elaborations on the construction and operation of the Bell-state comparison network.
1 Further details on excitation-parity neuron
As explained in the main text, the goal with the first neuron is to detect the odd/even parity of the number of excitations (i.e. -states) in the input. In other words, we wish to be able to distinguish the Bell-states from the states . It turns out that this can be achieved using three ingredients. First, we add a Heisenberg-XXZ interaction between the two input-qubits:
[TABLE]
where are energies and is a unitless parameter. The intuition behind this interaction is that it tunes the energy-spectrum of the system in such a way that it allows us to distinguish between the - and -states:
[TABLE]
The next ingredient is to couple the output-qubit to the input-qubits in such a way that the state of the input-qubits influences the energy-spacing of the output-qubit, thus allowing us to do conditional flips of the output qubit by only driving at specific frequencies. Since the property we want to detect has to do with the total -component of the input state (when thought of as a spin state), a reasonable interaction for achieving this would be
[TABLE]
As a result of this interaction, the eigenstates of the system are no longer the Bell states, but instead take the general form:
[TABLE]
for suitable coefficients . Assume now that we add as the final ingredient a drive on the output-qubit:
[TABLE]
This drive will resonantly drive the two transitions
[TABLE]
On the other hand, all other transitions that this drive could potentially induce will be detuned due to the structure of the energy-spectrum. Specifically, the energy-differences between other pairs of states connected by the -operator will be
[TABLE]
and as a result the driving with frequency will be detuned by
[TABLE]
As long as the strength of the driving is weak compared to these energy-scales, the driving will be unable to induce the corresponding transitions. Thus if we require
[TABLE]
the only effect of the term in (20) to first order will be to induce the transitions of (21).
Let’s consider one of these subspaces more closely. To be specific, consider the subspace . In the basis of these two states, the Hamiltonian reads:
[TABLE]
Performing the unitary transformation
[TABLE]
yields the transformed Hamiltonian
[TABLE]
Neglecting the rapidly oscillating terms thus yields an effective Hamiltonian
[TABLE]
Letting this run for a time
[TABLE]
yields the transition
[TABLE]
which corresponds to the transition
[TABLE]
if we undo the unitary transformation.
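The rotating-wave argument above can be checked numerically on a generic driven two-level system. The sketch below is not the neuron Hamiltonian, whose explicit form is not reproduced here; it assumes the textbook model H(t) = (splitting/2) σ_z + drive_amp cos(drive_freq t) σ_x with hypothetical parameter values. For a weak resonant drive the rotating-wave approximation predicts a complete flip after a time π/drive_amp, while a drive detuned by much more than its amplitude should leave the state almost unchanged.

```python
import numpy as np

def flip_probability(splitting, drive_freq, drive_amp, t_final, steps=20000):
    """Probability of |0> -> |1> under
    H(t) = (splitting/2) sigma_z + drive_amp * cos(drive_freq * t) sigma_x,
    integrated with midpoint-rule propagators (hbar = 1)."""
    dt = t_final / steps
    psi = np.array([1.0, 0.0], dtype=complex)
    a = 0.5 * splitting
    for k in range(steps):
        b = drive_amp * np.cos(drive_freq * (k + 0.5) * dt)
        r = np.hypot(a, b)
        c, s = np.cos(r * dt), np.sin(r * dt) / r
        # Closed form of exp(-i dt (a sigma_z + b sigma_x)).
        U = np.array([[c - 1j * s * a, -1j * s * b],
                      [-1j * s * b, c + 1j * s * a]])
        psi = U @ psi
    return abs(psi[1]) ** 2

amp = 0.02             # weak drive (hypothetical value)
t_flip = np.pi / amp   # full flip predicted by the rotating-wave approximation

p_resonant = flip_probability(splitting=1.0, drive_freq=1.0, drive_amp=amp, t_final=t_flip)
p_detuned = flip_probability(splitting=1.2, drive_freq=1.0, drive_amp=amp, t_final=t_flip)
print(p_resonant, p_detuned)
```

The small residual deviation from a perfect flip in the resonant case comes from the rapidly oscillating terms that the approximation discards.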
A similar analysis can be performed on the subspace , and the dynamics of the rest of the states are undisturbed by the driving and thus conform to the description in eq. (1). In other words, the effect of waiting the time is that the system performs the operation
[TABLE]
where are the states with a single excitation in the input, as sketched in (1). Note that this already implements an operation akin to the one we want—the output-qubit is flipped if and only if the input-register contains an even number of excitations. However, the fact that different components pick up different phases results in a distortion of the input-states. For instance, the difference in phase between the and -states may partially convert a -input to a . Additionally, the difference in phase between the states where a flip of the output occurs and those where it does not will distort the relative amplitudes when superpositions are used as input. We would like to pick the parameters in such a way that we avoid these effects, i.e. in such a way that all of the phases are identical. Starting with the two first states, we see that this imposes the restriction
[TABLE]
Note that this implies
[TABLE]
Turning to the two other states, requiring identical phases among these implies
[TABLE]
Note that in order to assure is real we are forced to pick so that it is larger than , and that picking as prescribed above gives
[TABLE]
At this point, the phases look as follows:
[TABLE]
The criterion for identical phases therefore reduces to
[TABLE]
with the sign inherited from the sign of . Note that neither nor allows solutions to this equation: in both cases, squaring the expression yields an integer on the left-hand side and not on the right-hand side. However, if we allow the factor of to be corrected by a subsequent phase-gate, the requirement above in the case reduces to
[TABLE]
For this to be fulfilled, we will at the very least need to be an integer, meaning need to be part of a Pythagorean triple. In fact, going through all of the possible combinations of and being even or odd shows that whenever is an integer it is even (odd) whenever is even (odd). In other words, the sum of these objects is always even when both are integers. Thus it is not just necessary but also sufficient to require and to be part of a Pythagorean triple for (23) to be fulfilled. The reason that is especially interesting is that this is required for the operation of the phase-detection neuron (see Sec. 2 below), and thus using another value of for our excitation-detection neuron either prohibits the detection of phase or necessitates an implementation where can be easily tuned.
Returning to the general case where may be arbitrary, we see that the phases match when
[TABLE]
This fully specifies the parameters of the model. However, as alluded to in the case where above, fixing the final phase between the subspace outputting and the subspace outputting is not essential, since this phase can be adjusted separately through a subsequent phase-gate. Thus (24) is a less strict requirement than those determining and .
2 Further details on phase-detection neuron
The goal of the second neuron is to detect the relative phases of the two components of Bell-states in the computational basis. In other words, we wish to be able to distinguish the states from the states . Similarly to the neuron of the previous section we start by adding a Heisenberg-XXZ interaction among the input-qubits in order to set up an energy-spectrum that distinguishes among the four Bell-states:
[TABLE]
yielding the spectrum
[TABLE]
Next, we add an interaction that will tune the energy-spectrum of the output-qubit dependent on the state of the input-qubits. An obvious interaction to use would be one of the form , since that exactly tunes the energy it takes to flip the output-qubit in a way that is conditional on the phase encoded in the input. However, as we will see below it is also possible to achieve the same result using only 2-qubit interactions. Specifically, adding a term of the form
[TABLE]
will also allow us to do the required phase detection. The effect of this term is to couple states with the same phase but different number of excitations. For instance, the state becomes coupled to the state through a Hamiltonian that in the basis of these two states takes the form
[TABLE]
Of course a similar coupling of the states and takes place as well. Indeed, switching the state of the output-qubit is essentially equivalent to switching the sign of . Applying similar arguments to the other four states, we arrive at coupling-Hamiltonians with the following structure:
[TABLE]
For general , these expressions look relatively symmetric, thus making it hard to fulfil our goal of distinguishing the upper and lower subspaces from each other. The exception to this is whenever . In this case, the symmetry of the two expressions is very explicitly broken: one contains a -term while the other does not. Let us for definiteness pick , resulting in the Hamiltonians
[TABLE]
The eigenstates in the subspaces are then straightforward to write down:
[TABLE]
The eigenstates of the subspaces are in principle more involved. However, in the limit where , the effect of the -term in (2) will be negligible, leading to an effective Hamiltonian of the form
[TABLE]
and thus approximate eigenstates
[TABLE]
Note that flipping the state of the output-qubit between and does not change the energy of this second batch of states, while flipping the state of the output-qubit changes the energy of the first batch of states (2) by the amount . In other words, adding a constant driving-term of the form
[TABLE]
to the Hamiltonian will induce resonant flipping of the output-qubit if the input is or , while the same driving will be detuned by an amount
[TABLE]
if the inputs are in the state or . Assuming this detuning is large compared to the driving-strength ensures that nothing happens in the latter case, and thus we have exactly what we want: A flipping of the state of the output-qubit if and only if the phases of the Bell-states of the input have a certain value (in this case: ). Assuming we start the output-qubit in the state , evolving the four possible inputs over a time would yield the transitions
[TABLE]
As described in the section on the excitation-parity neuron, we would like the phases picked up by these terms to match so that the input-amplitudes are not distorted by the evolution of the neuron. Matching the phases on the first two terms yields the criteria
[TABLE]
while matching the two phases of the last line of (2) yields
[TABLE]
Note that when these criteria are fulfilled, the following holds:
[TABLE]
As a result, the phases reduce to
[TABLE]
Note that we have no parameters left to adjust in order to make these phases identical—indeed, there will be a relative factor no matter what value of we use. However, due to our work above the only problematic phases occur between the subspace that flips the output and that which does not. As a result, the phase can be corrected simply by applying a phase-gate on the output-qubit after the operation has finished. Combining this operation with some Hadamards to shift the output-qubit between the computational basis and the basis where the flips occur, the full sequence now fulfils our goal of coherently detecting the sign of the Bell-states in the input.
Before proceeding, let’s briefly review the criteria that need to be fulfilled in order for the neuron to operate in the way explained above. In order for the approximate eigenstates presented in (2) and (28) to be accurate we need
[TABLE]
Additionally, the detuning criterion that allows us to drive transitions in only one subspace reduces to
[TABLE]
Combining this with the phase considerations above yields the requirements
[TABLE]
In other words we need the driving to be weak compared to the input-output coupling, which in turn needs to be weak compared to the coupling among the input-qubits, and this strong coupling needs to be of a Heisenberg-XXX type. In practice, it turns out that in fact does not need to be that large for the scheme to work, probably due to the fact that scales like and the detuning scales like . In other words, a more accurate criterion is that
[TABLE]
which is a much milder criterion on the size of than the original one. This is supported by the fact that parameters such as are able to reach average fidelities of .
It is worth noting that many of the higher-order effects neglected above can have a detrimental effect on the overall fidelity. This is especially true in relation to matching the relative phases of the different input states, since the neglected terms will tend to induce different second-order energy-shifts to the different inputs. As a result, it can at times be fruitful to depart from the criteria described above and tune the interaction-parameters slightly in order to obtain better overall fidelity. For instance, picking instead of increases the fidelity of the operation from to . Even more remarkably, shifting parameters from to increases the average fidelity from all the way to , though achieving higher fidelities than this seems to require a larger to match the relatively large .
As a final aside, it is worth noting that the form of the input-output interaction presented in (25) is far from the only one that would work. Indeed, the only essential part is that it takes the form
[TABLE]
with an operator that acts on the third qubit and which breaks the degeneracy and sets up a spectrum with two distinct energy-eigenstates that we can subsequently drive transitions between. For instance, an interaction similar to the cross-resonance interaction favoured by IBM sheldon2016procedure
[TABLE]
would work equally well if paired with driving of the form
[TABLE]
In fact, in this case the eigenstates that we would be inducing flips between would be and rather than the states , which means the Hadamards from the protocol in the main text would no longer be required.
3 Further details on Bell-state comparison network
The operations of both of the Bell-state comparison networks presented in the main text relied mostly on the application of the two neuron building blocks also presented in the main text. However, both networks required a different operation for the final step of propagating the detected information into the output qubit. This section contains information on how each of these operations can be achieved within the same framework as the neuron models.
Final layer of the main network
As explained in the main text, most of the Bell-state comparison network in Fig. 3 of the main text simply consists of iterative applications of the two types of neuron building blocks. However, the final step of the network is slightly different, since it is no longer trying to determine if two bits are equal but rather trying to determine if they are both , corresponding to the qubits in the third layer having determined both that the phases are equal (blue qubit in -state) and that the excitation parities are equal (orange qubit in -state). In other words, the operation of the last layer of the network differs from the excitation-parity neuron in that the output should only be flipped if the inputs are in the state rather than being flipped for both the input and (in the language of gate-based computation, this corresponds to implementing a Toffoli-gate rather than a pair of CNOTs). One way to achieve this functionality is to adjust the driving in the excitation-counting neuron. To see how this works, we note first that the driving term used in this neuron can be written as
[TABLE]
Looking closely at the arguments in Sec. 1 reveals that only the first of these terms played a role in driving the transition within the subspace , while the second term was neglected due to rotating-wave arguments. Similarly, driving the transition within the subspace turns out to only involve the second term in (30). As a result, using a modified driving of the form
[TABLE]
would drive only the transition , and thus we would only detect the -state. The general intuition behind this is that an operator that changes the (unperturbed) energy of the system by needs to enter in the Hamiltonian as
[TABLE]
in order for the combined term to resonantly drive the transitions that the operator induces. Thus because a induces transitions costing in the -subspace and in the -subspace we can easily use this rule of thumb to identify which term drives which transitions.
At first, the new reduced driving of (31) may look more complicated than the original form in (30). However, it is possible to extract this type of driving from a term very similar to the one in (30) using a term corresponding to a local magnetic field on the output-qubit:
[TABLE]
With this term, flipping the output-qubit from to now costs the energy , depending on whether the input is in the state or . Adding the drive
[TABLE]
the first term will then again resonantly induce the transitions , as predicted by our rule of thumb. On the other hand, the transitions within the -subspace will cost , which does not fit with the driving-frequency of the second term. The conclusion that we can draw from our rule of thumb is in other words that the second term is detuned by an amount
[TABLE]
compared to the required value to drive resonant transitions . Picking sufficiently large compared to the strength of the driving should therefore suppress the effect of the second term, leading to effective dynamics of the type we desire.
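How strongly a given detuning suppresses the unwanted transition can be estimated with the textbook rotating-wave Rabi formula for a driven two-level system, P(t) = (Ω/Ω_eff)² sin²(Ω_eff t / 2) with Ω_eff = sqrt(Ω² + δ²). This is offered as a rough guide under the assumption that each subspace behaves as an isolated two-level system; the function name and the parameter values below are hypothetical and are not taken from the paper.

```python
import math

def rwa_flip_probability(rabi_freq, detuning, t):
    """Flip probability of a two-level system in the rotating-wave
    approximation, for Rabi frequency rabi_freq and detuning `detuning`."""
    omega_eff = math.hypot(rabi_freq, detuning)
    return (rabi_freq / omega_eff) ** 2 * math.sin(0.5 * omega_eff * t) ** 2

rabi = 1.0               # sets the unit of frequency (hypothetical)
t_pi = math.pi / rabi    # duration giving a full flip on resonance

# Flip probability of the detuned transition at the end of the pulse,
# together with its upper bound (rabi / omega_eff)^2.
for ratio in (0, 1, 5, 20):
    detuning = ratio * rabi
    p = rwa_flip_probability(rabi, detuning, t_pi)
    bound = rabi ** 2 / (rabi ** 2 + detuning ** 2)
    print(f"detuning/rabi = {ratio:2d}: P = {p:.4f} (bound {bound:.4f})")
```

The bound falls off as the inverse square of the detuning-to-drive ratio, which gives a sense of how large the detuning must be for the second term to be safely ignored.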
Note that in some platforms adding a local magnetic field of the type (32) would not increase the complexity of the protocol, since such energy-shifts between the computational states would be naturally present within the architecture. Indeed, such terms appear naturally in many implementations of superconducting qubits kjaergaard2019superconducting, and would need to be compensated by changing to a rotating picture in order to implement Hamiltonians of the form presented in the main text.
As with the two main types of neurons, it would be beneficial if the relative phases related to different inputs could be brought to match. An analysis similar to the one in Sec. 1 shows that running the interactions above for the time yields the time-evolution
[TABLE]
where are the one-excitation states also used in Sec. 1. Note that making the phases of the first two states match through clever selection of parameters is not possible: whatever we do, we cannot get rid of the factor that appears on the -state. Luckily, the usual trick of correcting this phase through a subsequent phase-gate on the output-qubit works just as well here as it did in Sec. 1 and 2. Adding this correction, matching the phases reduces to the requirements
[TABLE]
Note that these phases are identical to some of the phases that needed to be matched when dealing with the excitation-parity neuron (see (1) ) when a phase-gate was applied to the output qubit of that system. From the discussion of that system we therefore already know a set of solutions:
[TABLE]
where is the largest number in a Pythagorean triple that also contains . However, looking closer at how this solution came about, we can note two things: Firstly, the problem of matching the phases in Sec. 1 contained an additional constraint compared to our current problem. Indeed, the fact that should be an integer originated from matching the phases of and , a set of phases that in this case are automatically identical once the factor of is taken care of. Secondly, we arrived at the solution (3) by assuming , a requirement that was only necessary if we also wanted to operate with a phase-neuron on the same input-state. Both of these observations indicate that more general solutions than the one in (3) should exist to our phase-matching problem. Finding these more general solutions from (34) is relatively straightforward. From the matching of the phases on the -states we get
[TABLE]
Using the fact that this implies now yields the final criteria
[TABLE]
where the sign is the one appearing in the expression for . We can use this equation to either express in terms of or express in terms of . The first option is relatively straightforward, yielding simply
[TABLE]
analogously to the expression in (24). The second option is a bit more involved. Squaring (36) yields the quadratic equation
[TABLE]
This has real solutions whenever
[TABLE]
in which case they can be expressed as
[TABLE]
where is an integer. Thus because (36) implies (37), we know that needs to be of this form to fulfil our phase-criteria. However, (37) does not in general imply (36), and thus it is not a priori obvious that all of the solutions in (39) will also be solutions to the original equation. Because the right hand side of (36) is real, a necessary condition is that the left hand side of this expression should also be real, or equivalently:
[TABLE]
Indeed, if this is the case you can move from the squared criterion to the original one by taking the square root. Thus solves the original requirement for some sign of if and only if it is of the form (39) and is smaller in norm than . Interestingly, this is automatically the case whenever is real. In other words, it turns out that (38) implies (40). To see this, let . The solutions then take the form
[TABLE]
The interval where this is real follows directly from (38):
[TABLE]
Determining the largest absolute value of the function (41) on this interval is now a question of simple calculus. Taking the derivative and setting it to zero yields
[TABLE]
Squaring this yields the fact that the function can only have local extrema when , i.e. when . At these points, the absolute value of the function reaches
[TABLE]
which is at most equal to one. Similarly, determining the value of the function on the boundary of the interval on which it is defined is straightforward and yields , another number that is at most equal to one. Since we know that the largest absolute values that our function takes must be attained either at points with vanishing derivatives or at the boundaries of the interval on which it is defined, we can now conclude that the absolute value of our function never exceeds one—and thus that the requirement (40) is fulfilled.
Final layer of the reduced network
For the final layer of the reduced network in Fig. 4 in the main text, we would like to perform a gate that detects only the input . The conceptually simplest way of performing this detection would be to first perform a pair of NOT-gates to map this state to the state , then perform the detection of this state in the way detailed in the previous section. Alternatively, running the excitation parity detection from Sec. 1 followed by the -detection of the previous section similarly detects only the state . However, for the sake of practical efficiency and conceptual simplicity, an implementation using similar dynamics to the rest of the paper would be preferable. Fortunately, such an implementation can be constructed using arguments that very closely mirror those of the preceding section. Specifically, inspecting (30) and now keeping only the term
[TABLE]
yields the following mapping after a time has elapsed:
[TABLE]
in analogy to the mapping of (33). Note that the problem of matching the phases of these components is identical to the problem tackled for the detection of above. Furthermore, identical arguments to those contained in the previous section show that the effect of the driving terms in (42) can also be achieved through the addition of a term
[TABLE]
to the Hamiltonian, combined with the application of a simple cosine-drive of the form
[TABLE]
In other words, it is possible to implement this detection using a scheme that precisely mirrors the detection of the except for the application of a different driving frequency on the output qubit.
References
- (1) Waldrop, M. M. The chips are down for Moore’s law. Nature 530, 144–147 (2016).
- (2) Gantz, J. & Reinsel, D. The digital universe in 2020: Big data, bigger digital shadows, and biggest growth in the far east. IDC iView: IDC Analyze the future 2007, 1–16 (2012).
- (3) Hashem, I. A. T. et al. The rise of “big data” on cloud computing: Review and open research issues. Inf. Syst. 47, 98–115 (2015).
- (4) Krizhevsky, A., Sutskever, I. & Hinton, G. E. ImageNet classification with deep convolutional neural networks. In Adv. Neural Inform. Process. Syst. 25, 1097–1105 (2012).
- (5) Sutskever, I., Vinyals, O. & Le, Q. V. Sequence to sequence learning with neural networks. In Adv. Neural Inform. Process. Syst. 27, 3104–3112 (2014).
- (6) Preskill, J. Quantum computing in the NISQ era and beyond. Quantum 2, 79 (2018).
- (7) Biamonte, J. et al. Quantum machine learning. Nature 549, 195–202 (2017).
- (8) Dunjko, V. & Briegel, H. J. Machine learning & artificial intelligence in the quantum domain: a review of recent progress. Rep. Prog. Phys. 81, 074001 (2018).
