Reinforcement learning for semi-autonomous approximate quantum eigensolver
F. Albarrán-Arriagada, J. C. Retamal, E. Solano, L. Lamata

TL;DR
This paper introduces a reinforcement learning-based protocol to approximate eigenvectors of Hermitian quantum operators, achieving high fidelity with minimal iterations, useful for semi-autonomous quantum devices.
Contribution
It presents a novel reinforcement learning protocol for approximating eigenvectors of arbitrary Hermitian operators using measurement and feedback in a quantum setting.
Findings
Achieves over 90% fidelity in less than 10 iterations for single-qubit operators.
Surpasses 98% fidelity in less than 300 iterations for single-qubit operators.
Obtains eigenvectors with over 89% fidelity in 8000 iterations for two-qubit operators.
Reinforcement learning for semi-autonomous approximate quantum eigensolver
F. Albarrán-Arriagada 1,2,3, J. C. Retamal 2,3, E. Solano 1,4,5 and L. Lamata 4,6
1 International Center in Quantum Artificial Intelligence for Science and Technology (QuArtist) and Physics Department, Shanghai University, 200444 Shanghai, China
2 Departamento de Física, Universidad de Santiago de Chile (USACH), Avenida Ecuador 3493, 9170124, Santiago, Chile
3 Center for the Development of Nanoscience and Nanotechnology 9170124, Estación Central, Santiago, Chile
4 Department of Physical Chemistry, University of the Basque Country UPV/EHU, Apartado 644, 48080 Bilbao, Spain
5 IKERBASQUE, Basque Foundation for Science, Maria Diaz de Haro 3, 48013 Bilbao, Spain
6 Departamento de Física Atómica, Molecular y Nuclear, Universidad de Sevilla, 41080 Sevilla, Spain
Abstract
The characterization of an operator by its eigenvectors and eigenvalues allows us to know its action over any quantum state. Here, we propose a protocol to obtain an approximation of the eigenvectors of an arbitrary Hermitian quantum operator. This protocol is based on measurement and feedback processes, which characterize a reinforcement learning protocol. Our proposal is composed of two systems, a black box named environment and a quantum state named agent. The role of the environment is to change any quantum state by a unitary matrix $\hat{E} = e^{-i\tau\hat{\mathcal{O}}}$, where $\hat{\mathcal{O}}$ is a Hermitian operator and $\tau$ is a real parameter. The agent is a quantum state which adapts to some eigenvector of $\hat{\mathcal{O}}$ by repeated interactions with the environment, feedback process, and semi-random rotations. With this proposal, we can obtain an approximation of the eigenvectors of a random qubit operator with average fidelity over 90% in less than 10 iterations, and surpass 98% in less than 300 iterations. Moreover, for the two-qubit cases, the four eigenvectors are obtained with fidelities above 89% in 8000 iterations for a random operator, and with high fidelities for an operator with the Bell states as eigenvectors. This protocol can be useful to implement semi-autonomous quantum devices which should be capable of extracting information and deciding with minimal resources and without human intervention.
1 Introduction
In the past few years, the symbiosis between quantum mechanics and machine learning, in the topic named quantum machine learning (QML), has been a fruitful area [1, 2, 3, 4], either applying classical machine learning techniques to quantum tasks such as quantum metrology [5, 6], quantum state estimation [7, 8], and others [9, 10, 11, 12, 13, 14], or using quantum mechanics to enhance machine learning algorithms for classical applications [15, 16, 17, 18, 19, 3, 20, 21]. Machine learning algorithms can be broadly classified into those that learn from big data and those that learn from interactions.
For the first group, we have two classes of algorithms. One is supervised learning, which uses a previously labeled data set, named training data, to infer a labeling criterion that is then used to classify new data; a remarkable example is pattern recognition [22, 23, 24]. The other class is unsupervised learning. In this case, training data are not necessary, and the approach is to group the unlabeled data into different sets, where each set is characterized by the mean value of some property of its constituents. The groups are constructed to optimize some indicator of the dispersion within each subset with respect to the value that characterizes it, e.g., the standard deviation. An example of these algorithms is the clustering problem [25, 26].
For the second group, we have the reinforcement learning (RL) algorithms [27]. Here, one accessible and manipulable system called agent $A$ interacts with another unknown system called environment $E$. The strategy relies on improving the performance of $A$ in a specific task $T$, which depends on the state of the systems $A$ and $E$. This improvement employs the results of multiple interactions between $A$ and $E$. The general framework of the RL paradigm is composed of three parts: the policy, the reward function (RF), and the value function (VF). The policy defines the main steps of the algorithm, which we can divide into three. First, the information extraction, which considers the interaction between $A$ and $E$ and how to obtain information from it. Second, the feedback loop, which specifies the channel used to communicate the extracted information to $A$. Third, the decision process, where we decide the action on $A$ in order to progress towards the aimed-for goal, and then start with the information extraction again. The RF defines the criterion to reward (punish) the actions which improve (worsen) the performance of $A$ with respect to the task $T$ at each step. Finally, the VF gives us the global performance of the algorithm, ensuring its convergence. One of the most impressive examples of this paradigm is the recent development of master-level chess, Go, and shogi players without a database [28, 29]. This class of algorithms mimics the most primitive form of human learning, commonly named trial and error. It means that a near-future implementation of quantum artificial intelligence may apply this paradigm to a quantum system to enhance a quantum task as the main way to learn. For this reason, the development of the quantum version of the RL paradigm has played an important role in QML in recent years [30, 31, 3, 32, 33, 34].
A crucial task in physics is the characterization of the different interactions among systems. This characterization is helpful to evaluate the risks of our actions and act to minimize them. Therefore, any autonomous artificial intelligence must have this ability.
In quantum mechanics, a physical interaction (observable) is represented by a Hermitian matrix or quantum operator, which is characterized by its eigenvalues and eigenvectors. The calculation of the eigenvectors and eigenvalues of a quantum interaction by a classical computer implies that we need to encode the quantum information into classical bits, which is inconvenient for unknown quantum interactions. Moreover, the implementation of a full quantum eigensolver [35, 36, 37, 38] using near-future quantum computers seems impractical due to the number of resources needed [39]. The emergence of hybrid classical-quantum algorithms in the past few years [40, 41, 42, 43, 44, 45, 46] opens the door to the development of useful eigensolvers. Nevertheless, these works are mainly focused on the eigenvalues, eigenvectors, and properties of quantum systems such as molecules, while the characterization of a physical interaction has been less studied.
In this article, we propose a hybrid quantum-classical algorithm to calculate an approximation to the eigenvectors of any quantum interaction described by a Hermitian matrix with minimal resources [47]. In our proposal, we use single-shot measurements and classical communication given by a feedback loop, which characterizes an RL protocol. The main goal of this proposal is to obtain a high-fidelity approximation (above 98% for the single-qubit case) without measuring fidelities or expectation values, which drastically reduces the number of iterations of the algorithm and decreases the effect of noise sources, and to do so without human intervention. We also show how to extend the algorithm to the multiqubit and high-dimensional situations. This protocol could be useful to implement semi-autonomous quantum devices with the capability to decide using the characterization of an interaction, which is an essential ingredient for the implementation of artificial quantum intelligence [4] and artificial quantum life [48, 49].
2 Quantum eigensolver protocol
Our proposal is related to recent works about a measurement-based algorithm to adapt one known state to another unknown one [50, 51, 52]. Here, we define the general framework of our protocol based on the RL paradigm, and then we explain in detail the single-qubit case, the single-qudit case, and the multiqubit case.
In our protocol, we consider as the agent a manipulable and known quantum system $A$ described by the state $|\phi_0\rangle$, which corresponds to any initialization of a given physical system. The environment $E$ is a black box, which produces an unknown interaction inside it. This interaction is characterized by an unknown Hermitian operator $\hat{\mathcal{O}}$, which generates a unitary transformation $\hat{E} = e^{-i\tau\hat{\mathcal{O}}}$ over the quantum system when it interacts with the system $A$, where $\tau$ is a parameter related to the interaction time with the black box, e.g., a spin particle (agent) traversing a region with a magnetic field (environment) for a time $\tau$.
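As a minimal numerical sketch of the environment described above, the snippet below builds the unitary generated by a single-qubit Hermitian operator and checks that an eigenvector leaves the black box unchanged up to a phase, while a generic state does not. The function names, the Pauli-basis parametrisation `(a, bx, by, bz)` and the value of `tau` are illustrative assumptions and do not come from the paper.

```python
import cmath
import math

def environment_unitary(a, bx, by, bz, tau):
    """Closed-form exp(-i*tau*O) for O = a*I + bx*X + by*Y + bz*Z (all real)."""
    b = math.sqrt(bx * bx + by * by + bz * bz)
    phase = cmath.exp(-1j * tau * a)
    if b == 0.0:
        return [[phase, 0j], [0j, phase]]
    c, s = math.cos(tau * b), math.sin(tau * b)
    nx, ny, nz = bx / b, by / b, bz / b
    # exp(-i*tau*b*(n.sigma)) = cos(tau*b)*I - i*sin(tau*b)*(n.sigma)
    return [
        [phase * (c - 1j * s * nz), phase * (-1j * s) * (nx - 1j * ny)],
        [phase * (-1j * s) * (nx + 1j * ny), phase * (c + 1j * s * nz)],
    ]

def apply(unitary, state):
    """Apply a 2x2 matrix to a two-component state vector."""
    return [unitary[i][0] * state[0] + unitary[i][1] * state[1] for i in range(2)]

def fidelity(target, state):
    """Squared overlap |<target|state>|^2 of two normalized states."""
    overlap = sum(t.conjugate() * s for t, s in zip(target, state))
    return abs(overlap) ** 2

# Hypothetical example: O = Z, so |0> is an eigenvector and |+> is not.
tau = 0.7
env = environment_unitary(0.0, 0.0, 0.0, 1.0, tau)
zero = [1 + 0j, 0j]
plus = [1 / math.sqrt(2) + 0j, 1 / math.sqrt(2) + 0j]
print(fidelity(zero, apply(env, zero)))  # eigenvector: survives the black box
print(fidelity(plus, apply(env, plus)))  # generic state: equals cos(tau)**2 here
```

The second value depends on `tau`, which illustrates why an interaction time that makes the unitary close to the identity would hide the difference between eigenvectors and other states.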
The policy is as follows:
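- Information extraction: The system $A$ interacts with $E$, which acts on it through the unitary $\hat{E}$, changing its state as
$$|\bar{\phi}_0\rangle = \hat{E}\,|\phi_0\rangle. \qquad (1)$$
Next, we perform a measurement process over $A$ in the basis $\{|\phi_k\rangle\}_{k=0}^{d-1}$, where $d$ is the dimension of the Hilbert space of $A$ and $|\phi_0\rangle$ is the state of $A$ before the interaction.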
- Feedback loop: The information from the measurement process is communicated to a command center with the ability to perform a unitary transformation (quantum gate) over the state of $A$ in order to change the possible results in the next information extraction step.
- Decision process: If the outcome of the measurement process is the state $|\phi_m\rangle$, with $m \neq 0$, this means that the state $|\phi_0\rangle$ of $A$ changes when system $A$ interacts with $E$; therefore, $|\phi_0\rangle$ cannot be an eigenvector of the Hermitian operator $\hat{\mathcal{O}}$. In this case, we define the unitary transformation $\hat{D}$ as
[TABLE]
where
[TABLE]
and the angle is drawn at random from a range whose size is set by the searching range $w$ given by the RF. We note that $\hat{D}$ is a pseudo-random rotation in the subspace spanned by $\{|\phi_0\rangle, |\phi_m\rangle\}$. For this outcome we define the new state of $A$ as $\hat{D}|\phi_0\rangle$, and start again with the information extraction step.
If the outcome of the measurement process is $|\phi_0\rangle$, it means that $|\phi_0\rangle$ could be an eigenvector of $\hat{\mathcal{O}}$. We point out that the eigenvectors of an operator remain constant up to a global phase under the action of a function of this operator. In this case, we apply the identity operator, $\hat{D} = \mathbb{1}$. Moreover, we keep the same state and start again with the information extraction step. Figure 1 shows a scheme of the policy of the algorithm.
For the RF we define the reward rate $r$ and the punishment rate $p$, with $0 < r < 1 < p$. If the outcome of the measurement is $|\phi_0\rangle$ we define $R = r$, and $R = p$ otherwise. Finally, we rename $w \to R\,w$ for the next iteration of the algorithm, which means that when we measure $|\phi_0\rangle$ we reduce the searching range, and we increase it otherwise. The initial value for $w$ is chosen according to the problem.
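The reward function described above can be sketched in a few lines. The function name, the default rates and the clamp `w_max` are assumptions made for this illustration; in particular, the text does not state here whether the searching range is capped.

```python
def update_search_range(w, outcome, r, p, w_max=1.0):
    """Return the searching range for the next iteration.

    outcome == 0 means the agent state was recovered (reward, rate r < 1);
    any other outcome means the state changed (punishment, rate p > 1).
    The clamp to w_max is an assumption of this sketch.
    """
    rate = r if outcome == 0 else p
    return min(w_max, rate * w)

w = 1.0
for outcome in [1, 0, 0, 1, 0, 0, 0]:  # hypothetical measurement record
    w = update_search_range(w, outcome, r=0.9, p=1.0 / 0.9)
print(w)
```

With these hypothetical rates a punishment exactly undoes one reward, so the final range only depends on how many more rewards than punishments were collected (apart from the clamp).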
As we can note, the protocol does not need to store the states or the whole history of the algorithm; it only needs to store the final operation $\hat{D}$, by classically storing the parameters that characterize this operation.
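To ensure the convergence of our algorithm, we define the VF as the value of the searching range $w$. This implies that, when $w \to 0$, our protocol converges. For a correct choice of $r$ and $p$ we have that $w \to 0$ only if we obtain, in the measurement process of $A$, the outcome $|\phi_0\rangle$ many times in a row. This means that $|\langle\phi_0|\hat{E}|\phi_0\rangle|^2 \approx 1$; therefore, $|\phi_0\rangle$ is an approximate eigenvector of $\hat{\mathcal{O}}$.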
As this is an iterative protocol, we define the following notation for the remainder of the article: any superscript in parentheses refers to the iteration of the algorithm, e.g., $|\phi_0^{(4)}\rangle$ is the state of $A$ before the interaction with $E$ in the fourth iteration. Similarly, $\hat{D}^{(j)}$ is the unitary transformation defined in the decision process for the iteration $j$. As a special case, the superscript $(0)$ refers to the initial values, e.g., $w^{(0)}$ represents the initial searching range.
It is worth mentioning that our algorithm uses one single-shot measurement per loop, which is an advantage with respect to employing an expectation value or the fidelity. The latter require hundreds of measurements for a two-level system, so our proposal is exposed to noise sources for less time. Also, as we use pseudo-random operations $\hat{D}$, the effect of any noise in the gate can be seen as part of the randomness of the protocol.
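The policy, reward function and value function described in this section can be put together in a small classical simulation of the single-qubit loop. This is a sketch under stated assumptions, not the authors' implementation: the random rotation is taken here as a z-rotation times an x-rotation with angles drawn uniformly from $[-w\pi, w\pi]$, the searching range is clamped to 1, and all names (`run_protocol`, `rot_x`, `rot_z`, `tol`, `max_iter`) are hypothetical. The single-shot measurement is simulated by sampling the Born probability once per iteration.

```python
import cmath
import math
import random

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def apply(u, psi):
    return [u[i][0] * psi[0] + u[i][1] * psi[1] for i in range(2)]

def rot_x(angle):
    c, s = math.cos(angle / 2), math.sin(angle / 2)
    return [[c + 0j, -1j * s], [-1j * s, c + 0j]]

def rot_z(angle):
    return [[cmath.exp(-0.5j * angle), 0j], [0j, cmath.exp(0.5j * angle)]]

def run_protocol(env, r=0.9, p=1.0 / 0.9, w0=1.0, tol=1e-3, max_iter=5000, rng=None):
    """Adapt the agent state d|0> towards an eigenvector of the generator of env."""
    rng = rng or random.Random()
    d = [[1 + 0j, 0j], [0j, 1 + 0j]]  # accumulated rotation; the agent state is d|0>
    w, n = w0, 0
    while w >= tol and n < max_iter:
        n += 1
        agent = apply(d, [1 + 0j, 0j])
        # Information extraction: interact, then rotate back to measure in {|0>, |1>}.
        measured = apply(dagger(d), apply(env, agent))
        if rng.random() < abs(measured[1]) ** 2:
            # Outcome 1: the state changed. Punish and apply a random rotation.
            ax = rng.uniform(-w * math.pi, w * math.pi)
            az = rng.uniform(-w * math.pi, w * math.pi)
            d = matmul(d, matmul(rot_z(az), rot_x(ax)))
            w = min(1.0, p * w)
        else:
            # Outcome 0: the state was recovered. Reward and keep the state.
            w = r * w
    return apply(d, [1 + 0j, 0j]), w, n

# Hypothetical environment exp(-i*tau*X); its generator has eigenvectors |+> and |->.
tau = 1.0
env_x = [[math.cos(tau) + 0j, -1j * math.sin(tau)],
         [-1j * math.sin(tau), math.cos(tau) + 0j]]
state, w, n = run_protocol(env_x, rng=random.Random(7))
plus_overlap = abs(state[0] + state[1]) ** 2 / 2
print(n, w, max(plus_overlap, 1 - plus_overlap))
```

Only the accumulated matrix `d` and the current range `w` are kept between iterations, which mirrors the minimal-storage remark made in the text. The fidelity is printed purely for inspection in the simulation; the loop itself never uses it.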
2.1 Single-qubit case
In the single-qubit case, is described by a Hermitian matrix with eigenvectors and eigenvalues respectively. As these two eigenvectors are orthonormal, we can write
[TABLE]
where , and
[TABLE]
We define and as
[TABLE]
Policy. In this case, we write the state before the black-box as
[TABLE]
and the state after as
[TABLE]
where
[TABLE]
For the explicit form and in terms of , , and the eigenvalues of see appendix A. Moreover, for the explicit form of and , see appendix B. Now, to perform the measurement process over , we apply the basis-rotation matrix
[TABLE]
in order to measure in the basis for all iterations. After the measurement process, the state of is , where is the outcome of the measurement with probabilities and , respectively. If , then we transform the state , using the matrix , and start again the algorithm. If , we transform the state using , where is the Pauli matrix , and apply the pseudo-random operator defined by Eq. (2). Then, after the measurement process, we apply over the operator defined by
[TABLE]
where
[TABLE]
Given that transforms (), we can write , where
[TABLE]
with the spin operators, with the Pauli matrix . Then, the operator reads
[TABLE]
For this case, the RF that defines the value of the searching range $w$ for each step reads
[TABLE]
where $r$ and $p$ are the reward rate and punishment rate, respectively, described previously.
When the algorithm converges, we have , where is the number of iterations. Moreover, in this case is an approximation of the matrix that diagonalizes , that is
[TABLE]
In order to explore the complete space we must choose the initial searching range $w^{(0)} = 1$.
2.2 Single-qudit case
In this case, the agent is a -dimensional system or qudit, the operator is described by a Hermitian matrix with eigenvalues , eigenvectors and . In the th iteration of the algorithm, the state of before reads
[TABLE]
while for simplicity we choose the initial state . After the interaction with , we have
[TABLE]
Subsequently, we apply the operator , which is defined now as
[TABLE]
and perform the measurement process in the basis . After this process, the state of is , where is the outcome of the measurement process. In this case the decision process applies the operator defined by Eq. (11), but with
[TABLE]
where
[TABLE]
with as defined in Eq. (2) and . Also in this case , where
[TABLE]
and
[TABLE]
therefore,
[TABLE]
The state of for the next iteration reads .
Finally, the RF that updates the value of the searching range is given by
[TABLE]
Once the algorithm converges, we have that
[TABLE]
is an approximate eigenvector, therefore,
[TABLE]
In order to find another eigenvector of , we start again the algorithm for the iteration , i.e., , but now the state before is given by . We redefine Eq. (23) as
[TABLE]
Thus, we can calculate the operator as in Eq. (22).
The decision process changes as
[TABLE]
where
[TABLE]
and . Finally, the RF reads,
[TABLE]
These changes mean that we perform the protocol in the subspace orthogonal to the first approximate eigenvector. When the algorithm converges again, after a further run of iterations, we have two approximate eigenvectors. Therefore, to obtain the next eigenvector we perform the algorithm again, but in the subspace orthogonal to the two eigenvectors already found, and so on, until approximations to all $d$ eigenvectors of the operator have been obtained.
2.3 Multiqubit case
For this case, we can suppose that the system is a qudit, where now the basis states $|j\rangle$ correspond to the binary representation of $j$ with $n$ digits. For example, for $d = 4$ we have $n = 2$ digits, where each of them represents the state of a qubit; then $|0\rangle \to |00\rangle$, $|1\rangle \to |01\rangle$, $|2\rangle \to |10\rangle$, and $|3\rangle \to |11\rangle$. Also, we can produce the different operators of the protocol using controlled-NOT gates and single-qubit rotations [53]. Therefore, we can map this problem to the qudit case, obtaining the same algorithm as in the previous case.
As we can see from this section, our protocol does not need to encode quantum information in a classical processor, which is advantageous with respect to classical algorithms that need to characterize the quantum interactions by quantum tomography. The latter requires hundreds of measurements of the quantum system, using more resources in this process than the entire proposed algorithm. Moreover, as our algorithm finds the eigenstate statistically, it is simpler than a full quantum algorithm that finds the eigenstates exactly, which makes our protocol experimentally feasible. References [51, 52] show the experimental implementation of an algorithm that employs the same basic steps on which our current algorithm is based, for the case of quantum states instead of quantum operators, opening the door to the implementation of this work.
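The mapping between qudit basis indices and multiqubit basis states mentioned above amounts to writing each index in binary with one digit per qubit. A minimal sketch, with a hypothetical helper name:

```python
def qudit_to_qubit_labels(n_qubits):
    """Map qudit basis indices 0..2**n_qubits - 1 to n-qubit bit strings."""
    return {j: format(j, "0{}b".format(n_qubits)) for j in range(2 ** n_qubits)}

print(qudit_to_qubit_labels(2))  # {0: '00', 1: '01', 2: '10', 3: '11'}
```

The leftmost digit is taken as the first qubit here; the opposite ordering would work equally well as long as it is used consistently.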
3 Numerical results
It is convenient to define the following quantities for the numerical analysis of the protocol, , with () the reward (punishment) rate, the total number of rewards and the total number of punishments in the algorithm. The VF of our algorithm is the value of where are the total number of iterations. Also, we can rewrite
[TABLE]
where the convergence condition is given by . If , we see from Eq. (32) that the convergence condition can be satisfied even if , which implies that the protocol does not necessarily converge to the eigenstates of . If , we have that . For , the algorithm converges whenever . Moreover, when is larger, the algorithm needs more iterations to converge, but nevertheless it achieves larger fidelities. This is the exploration versus exploitation balance known in reinforcement learning. Here, we perform the simulation for a single- and two-qubit case for different values of and . Remember that for all cases we choose . Also, for simplicity we choose for the single-qubit case and for the two-qubit case, where is the binary representation of , e.g., . Moreover, for all cases.
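The convergence discussion above can be made explicit with a short worked relation. The symbols below are assumptions introduced for this illustration: $w^{(N)}$ is the searching range after $N$ iterations, $n_r$ and $n_p$ count rewards and punishments, and the punishment rate is parametrised as $p = r^{-\ell}$ with $0 < r < 1$. This parametrisation is not taken from the text; it is chosen because it reproduces the three regimes discussed there, and it ignores any cap on the searching range.

```latex
w^{(N)} = w^{(0)}\, r^{\,n_r}\, p^{\,n_p},
\qquad
p = r^{-\ell}
\;\Longrightarrow\;
w^{(N)} = w^{(0)}\, r^{\,n_r - \ell\, n_p},
\qquad
w^{(N)} \to 0 \iff n_r - \ell\, n_p \to \infty .
```

Under this assumption, a value $\ell < 1$ lets the range shrink even when punishments outnumber rewards, whereas $\ell > 1$ demands more than $\ell$ rewards per punishment, which is the exploration versus exploitation trade-off described in the paragraph.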
Finally, as the unitary operator given by Eq. (22) depends on pseudo-random angles, we run the algorithm many times, defining the mean fidelity and the mean searching range as
[TABLE]
where is the th eigenvector of , the index refers to the th repetition of the protocol and is the total number of repetitions. In all subsequent cases we choose .
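The averaging over repetitions described above can be sketched as follows, assuming the fidelity is the squared overlap between the exact eigenvector and the state returned by each repetition. The function names and the sample states are hypothetical.

```python
def fidelity(target, state):
    """Squared overlap |<target|state>|^2 of two normalized states."""
    overlap = sum(t.conjugate() * s for t, s in zip(target, state))
    return abs(overlap) ** 2

def mean_fidelity(target, states):
    """Average fidelity with the target over independent repetitions."""
    return sum(fidelity(target, s) for s in states) / len(states)

def mean_search_range(ranges):
    """Average searching range over the same repetitions."""
    return sum(ranges) / len(ranges)

s = 2 ** -0.5
runs = [[1 + 0j, 0j], [0j, 1 + 0j], [s + 0j, s + 0j]]  # hypothetical final states
print(mean_fidelity([1 + 0j, 0j], runs))
print(mean_search_range([0.2, 0.1, 0.3]))
```

In a simulation these averages are cheap, since the exact eigenvectors are known; they serve only to benchmark the protocol and are not quantities the protocol itself measures.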
3.1 Single-qubit case
For the general performance of our protocol, we start with a described by a random Hermitian matrix. Figure 2 shows the mean fidelity for different values of the reward rate , and the parameter . From this figure, we can see that for and , we obtain with . Also, in all cases we have for . It means that using a reduced number of iterations we can obtain good fidelities for the eigenvector of a completely random single-qubit operator. On the other hand, we observe that when and are larger, the maximum value of increases, but we need more iterations for the convergence of the algorithm. Figure 3 shows the mean searching range for the same cases. From this figure we can clearly see how the algorithm needs less iterations when and decrease, with the extreme case of , , where the algorithm converges before 70 iterations.
Now, we consider a particular example . In this case, the distance in the Bloch sphere between and the eigenstates of is the largest possible. Figure 4 shows that our algorithm converges with few iterations to good approximations of the eigenvectors, we can see that we obtain the eigenvectors with fidelity above 98 in 400 iterations, for the case and .
As we can see, the maximum fidelity for the case has decreased with respect to the random one. This is because the distance between and the eigenvectors of is larger than the distance between and the eigenvectors of in the random case, therefore, the protocol has worse convergence.
3.2 Two-qubit case
This case is analogous to the single-qudit case with . First, for a general performance, we consider as a random two-qubit operator. Moreover, we choose and calculate the mean fidelity and the mean searching range given by Eq. (33). Figure 5 shows the numerical calculation for and . It shows again that for small the convergence is faster but the maximum value of is smaller. Furthermore, with we need iterations such that the four approximate eigenvectors converge. With , we only need iterations. Nevertheless, for we obtain for all , with even and up to . In the other case, with , the maximum values are , and . Also, we can see from the evolution of that the number of iterations needed for the convergence is smaller each time that the algorithm starts again to approximate the next eigenvector, that is, . Finally, we consider as special case , where is an operator given by
[TABLE]
with
[TABLE]
the maximally entangled Bell states. Figure 6 shows the performance of our protocol for this case. We can see that we obtain high fidelities with only 1000 iterations to approximate the four eigenvectors. We obtain this performance because our algorithm is sensitive to the number of product states involved in each subspace (the dimension of the subspace) and not to the total dimension of the operator. In this case, the operator is block-diagonal, where one block acts in the subspace $\{|00\rangle, |11\rangle\}$ and the other in $\{|01\rangle, |10\rangle\}$. This implies that the present case is similar to two independent single-qubit cases. In Fig. 6, we can see that the eigenstates of the first block are approximated at the same time in the first part of the run, and the eigenstates of the second block in the second part, where both cases have a performance similar to the single-qubit case.
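The block-diagonal structure invoked above can be checked directly: any operator whose eigenvectors are the four Bell states has no matrix elements connecting $\{|00\rangle, |11\rangle\}$ with $\{|01\rangle, |10\rangle\}$. The sketch below builds such an operator from its spectral decomposition; the eigenvalues are hypothetical placeholders, since the specific operator used in the paper is not reproduced here.

```python
import math

S = 1 / math.sqrt(2)
# Bell states in the basis order |00>, |01>, |10>, |11>.
BELL = {
    "phi+": [S, 0, 0, S],
    "phi-": [S, 0, 0, -S],
    "psi+": [0, S, S, 0],
    "psi-": [0, S, -S, 0],
}

def operator_from_eigenpairs(eigenvalues, eigenvectors):
    """Return sum_k lambda_k |v_k><v_k| for real eigenvectors."""
    dim = len(eigenvectors[0])
    op = [[0.0] * dim for _ in range(dim)]
    for lam, vec in zip(eigenvalues, eigenvectors):
        for i in range(dim):
            for j in range(dim):
                op[i][j] += lam * vec[i] * vec[j]  # real vectors, no conjugate needed
    return op

eigenvalues = [1.0, 2.0, 3.0, 4.0]  # hypothetical; any real values give the same structure
op = operator_from_eigenpairs(eigenvalues, list(BELL.values()))
cross = [abs(op[i][j]) for i in (0, 3) for j in (1, 2)]
print(max(cross))  # matrix elements between the two blocks
for row in op:
    print(row)
```

Each block is a two-level problem, which is consistent with the observation that the protocol behaves here like two independent single-qubit runs.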
4 Conclusions
We propose and analyze an approximate quantum eigensolver based on reinforcement learning with minimal resources. This proposal can be classified as a hybrid classical-quantum algorithm, in which we use a classical optimization algorithm to change a quantum system to improve a quantum task, using a feedback loop combined with partially random unitary gates. This is in contrast with other hybrid algorithms that measure fidelities or some expectation value in each step. Therefore, our proposal is advantageous with respect to the usual hybrid algorithms, in the sense that our protocol needs minimal storage, saving only the last step of the algorithm, and employs just one single-shot measurement per iteration instead of fidelity or expectation-value measurements, which decreases the effect of noise sources. Moreover, our protocol considers pseudo-random two-level rotations, so that it is not necessary to implement high-fidelity operations, because the randomness of the algorithm absorbs the errors of the gates. For this reason, our algorithm would be experimentally feasible in almost any current quantum platform.
Additionally, we validated our proposal with numerical calculations of four different choices of the operator , random single-qubit operator, operator, random two-qubit operator, and operator defined by Eq. (34), obtaining as a general rule that our algorithm reaches higher fidelities for the approximate eigenvectors for large values of and , but the convergence in this case is slower. This is related to the balance between exploration and exploitation typical from reinforcement learning algorithms. Moreover, our algorithm is sensitive to the size of the different subspaces expanded by product states and not to the size of the total space of the operator . This is the case showed in Fig. (6), where the eigenvectors are the maximally-entangled Bell states. We point out that, in order to improve the performance of the protocol in future extensions, it could be interesting to study dynamical reward rates (r) and dynamical parameter .
Finally, due to its simplicity, the minimal resources it employs, and the fact that it needs only a basic classical processor (command center) capable of performing pseudo-random rotations, our protocol can be useful for the development of near-future semi-autonomous quantum devices, which will have to make decisions with incomplete information obtained by interaction with the external environment.
We acknowledge support from Financiamiento Basal para Centros Científicos y Tecnológicos de Excelencia (Grant No. FB0807), projects QMiCS (820505) and OpenSuperQ (820363) of the EU Flagship on Quantum Technologies, EU FET Open Grant Quromorphic, Basque Government IT986-16, and PGC2018-095113-B-I00 (MCIU/AEI/FEDER, UE).
Data availability statement
The data that support the findings of this study are openly available at https://github.com/PanchoAlbarran/EigenSolver
Appendix A Explicit form of and
Here, we further clarify the protocol developed in the main text.
From Eq. (4), we have
[TABLE]
Replacing Eq. (7) we obtain
[TABLE]
Thus,
[TABLE]
By means of the definition of and given by Eq. (4), we obtain
[TABLE]
We rewrite the eigenvalues as and where and . Then, we rewrite Eq. (39) up to a global phase as
[TABLE]
This state has the form
[TABLE]
with
[TABLE]
Finally, up to a global phase, the state given by Eq. (42) can be written in the form of Eq. (8), where
[TABLE]
Appendix B Explicit form of and
From Eqs. (7) and (9) we have,
[TABLE]
Replacing this expression in the first line of Eq. (8), we obtain
[TABLE]
where
[TABLE]
[TABLE]
and
[TABLE]
Finally, up to a global phase, we can write the state as
[TABLE]
with .
References
- [1] Adcock J C, Allen E, Day M, Frick S, Hinchliff J, Johnson M, Morley-Short S, Pallister S, Price A B and Stanisic S 2015 arXiv:1512.02900 [quant-ph]
- [2] Biamonte J, Wittek P, Pancotti N, Rebentrost P, Wiebe N and Lloyd S 2017 Quantum machine learning Nature 549 195
- [3] Dunjko V, Taylor J M and Briegel H J 2016 Quantum-Enhanced Machine Learning Phys. Rev. Lett. 117 130501
- [4] Dunjko V and Briegel H J 2018 Machine learning & artificial intelligence in the quantum domain: a review of recent progress Rep. Prog. Phys. 81 074001
- [5] Hentschel A and Sanders B C 2010 Machine Learning for Precise Quantum Measurement Phys. Rev. Lett. 104 063603
- [6] Hentschel A and Sanders B C 2011 Efficient Algorithm for Optimizing Adaptive Quantum Metrology Processes Phys. Rev. Lett. 107 233601
- [7] Torlai G, Mazzola G, Carrasquilla J, Troyer M, Melko R and Carleo G 2018 Neural-network quantum state tomography Nat. Phys. 14 447
- [8] Rocchetto A, Aaronson S, Severini S, Carvacho G, Poderini D, Agresti I, Bentivegna M and Sciarrino F 2019 Experimental learning of quantum states Sci. Adv. 5 eaau1946
- [9] Häse F, Kreisbeck C and Aspuru-Guzik A 2017 Machine learning for quantum dynamics: deep learning of excitation energy transfer properties Chem. Sci. 8 8419
- [10] Gao J, Qiao L-F, Jiao Z-Q, Ma Y-C, Hu C-Q, Ren R-J, Yang A-L, Tang H, Yung M-H, and Jin X-M 2018 Experimental Machine Learning of Quantum States Phys. Rev. Lett. 120 240501
- [11] Gupta R S and Biercuk M J 2018 Machine Learning for Predictive Estimation of Qubit Dynamics Subject to Dephasing Phys. Rev. Applied 9 064042
- [12] Bukov M 2018 Reinforcement learning for autonomous preparation of Floquet-engineered states: Inverting the quantum Kapitza oscillator Phys. Rev. B 98 224305
- [13] Bukov M, Day A G R, Sels D, Weinberg P, Polkovnikov A and Mehta P 2018 Reinforcement Learning in Different Phases of Quantum Control Phys. Rev. X 8 031086
- [14] Melnikov A A, Nautrup H P, Krenn M, Dunjko V, Tiersch M, Zeilinger A and Briegel H J 2018 Active learning machine learns to create new quantum experiments Proc. Natl. Acad. Sci. U.S.A. 115 1221
- [15] Aïmeur E, Brassard G and Gambs S 2013 Quantum speed-up for unsupervised learning Mach. Learn. 90 261
- [16] Lloyd S, Mohseni M and Rebentrost P 2013 arXiv:1307.0411 [quant-ph]
- [17] Rebentrost P, Mohseni M and Lloyd S 2014 Quantum Support Vector Machine for Big Data Classification Phys. Rev. Lett. 113 130503
- [18] Li Z, Liu X, Xu N and Du J 2015 Experimental Realization of a Quantum Support Vector Machine Phys. Rev. Lett. 114 140504
- [19] Cai X-D, Wu D, Su Z-E, Chen M-C, Wang X-L, Li L, Liu N-L, Lu C-Y and Pan J-W 2015 Entanglement-Based Machine Learning on a Quantum Computer Phys. Rev. Lett. 114 110504
- [20] Sheng Y-B and Zhou L 2017 Distributed secure quantum machine learning Sci. Bull. 62 1025
- [21] Schuld M and Killoran N 2019 Quantum Machine Learning in Feature Hilbert Spaces Phys. Rev. Lett. 122 040504
- [22] Jain A K 2007 Biometric recognition Nature 449 38
- [23] Carrasquilla J and Melko R G 2017 Machine learning phases of matter Nat. Phys. 13 431
- [24] Schützhold R 2003 Pattern recognition on a quantum computer Phys. Rev. A 67 062311
- [25] Fahad A, Alshatri N, Tari Z, Alamri A, Khalil I, Zomaya A Y, Foufou S and Bouras A 2014 A Survey of Clustering Algorithms for Big Data: Taxonomy and Empirical Analysis IEEE Trans. Emerging Top. Comput. 2 267
- [26] Otterbach J S, Manenti R, Alidoust N, Bestwick A, Block M, Bloom B, Caldwell S, Didier N, Fried E S, Hong S, Karalekas P, Osborn C B, Papageorge A, Peterson E C, Prawiroatmodjo G, Rubin N, Ryan C A, Scarabelli D, Scheer M, Sete E A, Sivarajah P, Smith R S, Staley A, Tezak N, Zeng W J, Hudson A, Johnson B R, Reagor M, da Silva M P and Rigetti C 2017 arXiv:1712.05771 [quant-ph]
- [27] Sutton R S and Barto A G 2018 Reinforcement Learning: An introduction (Cambridge: MIT press)
- [28] Silver D, Schrittwieser J, Simonyan K, Antonoglou I, Huang A, Guez A, Hubert T, Baker L, Lai M, Bolton A, Chen Y, Lillicrap T, Hui F, Sifre L, van den Driessche G, Graepel T and Hassabis D 2017 Mastering the game of Go without human knowledge Nature 550 354
- [29] Silver D, Hubert T, Schrittwieser J, Antonoglou I, Lai M, Guez A, Lanctot M, Sifre L, Kumaran D, Graepel T, Lillicrap T, Simonyan K, and Hassabis D 2018 A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play Science 362 1140
- [30] Dong D, Chen C, Li H and Tarn T J 2008 Quantum Reinforcement Learning IEEE Trans. Syst. Man Cybern. B Cybern. 38 1207
- [31] Paparo G D, Dunjko V, Makmal A, Martin-Delgado M A and Briegel H J 2014 Quantum Speedup for Active Learning Agents Phys. Rev. X 4 031002
- [32] Lamata L 2017 Basic protocols in quantum reinforcement learning with superconducting circuits Sci. Rep. 7 1609
- [33] Cárdenas-López F A, Lamata L, Retamal J C, and Solano E 2018 Multiqubit and multilevel quantum reinforcement learning with quantum technologies PLOS ONE 13 e0200455
- [34] Crawford D, Levit A, Ghadermarzy N, Oberoi J S and Ronagh P 2019 arXiv:1612.05695 [quant-ph]
- [35] Abrams D S and Lloyd S 1999 Quantum Algorithm Providing Exponential Speed Increase for Finding Eigenvalues and Eigenvectors Phys. Rev. Lett. 83 5162
- [36] Jaksch P and Papageorgiou A 2003 Eigenvector Approximation Leading to Exponential Speedup of Quantum Eigenvalue Calculation Phys. Rev. Lett. 91 257902
- [37] Wang H, Wu L-A, Liu Y-X and Nori F 2010 Measurement-based quantum phase estimation algorithm for finding eigenvalues of non-unitary matrices Phys. Rev. A 82 062303
- [38] Wang H 2016 Quantum algorithm for obtaining the eigenstates of a physical system Phys. Rev. A 93 052334
- [39] Wecker D, Bauer B, Clark B K, Hastings M B and Troyer M 2014 Gate-count estimates for performing quantum chemistry on small quantum computers Phys. Rev. A 90 022305
- [40] Peruzzo A, McClean J, Shadbolt P, Yung M-H, Zhou X-Q, Love P J, Aspuru-Guzik A and O’Brien J L 2014 A variational eigenvalue solver on a photonic quantum processor Nat. Commun. 5 4213
- [41] Yung M-H, Casanova J, Mezzacapo A, McClean J, Lamata L, Aspuru-Guzik A and Solano E 2014 From transistor to trapped-ion computers for quantum chemistry Sci. Rep. 4 3589
- [42] McClean J R, Romero J, Babbush R and Aspuru-Guzik A 2016 The theory of variational hybrid quantum-classical algorithms New J. Phys. 18 023023
- [43] O’Malley P J J, Babbush R, Kivlichan I D, Romero J, McClean J R, Barends R, Kelly J, Roushan P, Tranter A, Ding N, Campbell B, Chen Y, Chen Z, Chiaro B, Dunsworth A, Fowler A G, Jeffrey E, Lucero E, Megrant A, Mutus J Y, Neeley M, Neill C, Quintana C, Sank D, Vainsencher A, Wenner J, White T C, Coveney P V, Love P J, Neven H, Aspuru-Guzik A and Martinis J M 2016 Scalable Quantum Simulation of Molecular Energies Phys. Rev. X 6 031007
- [44] Kandala A, Mezzacapo A, Temme K, Takita M, Brink M, Chow J M and Gambetta J M 2017 Hardware-efficient variational quantum eigensolver for small molecules and quantum magnets Nature 549 242
- [45] Hempel C, Maier C, Romero J, McClean J, Monz T, Shen H, Jurcevic P, Lanyon B P, Love P, Babbush R, Aspuru-Guzik A, Blatt R and Roos C F 2018 Quantum Chemistry Calculations on a Trapped-Ion Quantum Simulator Phys. Rev. X 8 031022
- [46] Kokail C, Maier C, van Bijnen R, Brydges T, Joshi M K, Jurcevic P, Muschik C A, Silvi P, Blatt R, Roos C F and Zoller P 2019 Self-verifying variational quantum simulation of lattice models Nature 569 355
- [47] Code repository at https://github.com/PanchoAlbarran/EigenSolver
- [48] Alvarez-Rodriguez U, Sanz M, Lamata L and Solano E 2016 Artificial Life in Quantum Technologies Sci. Rep. 6 20956
- [49] Alvarez-Rodriguez U, Sanz M, Lamata L and Solano E 2018 Quantum Artificial Life in an IBM Quantum Computer Sci. Rep. 8 14793
- [50] Albarrán-Arriagada F, Retamal J C, Solano E and Lamata L 2018 Measurement-based adaptation protocol with quantum reinforcement learning Phys. Rev. A 98 042315
- [51] Yu S, Albarrán-Arriagada F, Retamal J C, Wang Y-T, Liu W, Ke Z-J, Meng Y, Li Z-P, Tang J-S, Solano E, Lamata L, Li C-F and Guo G-C 2019 Reconstruction of a Photonic Qubit State with Reinforcement Learning Adv. Quantum Technol. 2 1800074
- [52] Olivares-Sánchez J, Casanova J, Solano E and Lamata L 2018 arXiv:1811.07594 [quant-ph]
- [53] Nielsen M A and Chuang I L 2010 Quantum Computation and Quantum Information (Cambridge: Cambridge University Press)
