Avoiding barren plateaus via Gaussian Mixture Model
Xiao Shi, Yun Shang

TL;DR
This paper introduces a Gaussian Mixture Model-based initialization strategy for variational quantum algorithms, effectively avoiding barren plateaus and improving trainability regardless of qubit number or circuit depth.
Contribution
The paper presents a novel initialization method using Gaussian Mixture Models that guarantees avoidance of barren plateaus in quantum circuit training.
Findings
Gradient norm lower bound is independent of qubit number
Lower bound increases with circuit depth
Method enhances trainability of quantum circuits
Abstract
Variational quantum algorithms are among the most representative algorithms in quantum computing, with a wide range of applications in quantum machine learning, quantum simulation, and related fields. However, they suffer from the barren plateau phenomenon, especially with large numbers of qubits, deep circuit layers, or global cost functions, which often makes them untrainable. In this paper, we propose a novel parameter initialization strategy based on Gaussian Mixture Models. We rigorously prove that the proposed initialization method consistently avoids the barren plateau problem for hardware-efficient ansatz with arbitrary length and qubits and any given cost function. Specifically, we find that the gradient norm lower bound provided by the proposed method is independent of the number of qubits and increases with the circuit depth. …
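The initialization idea in the abstract can be illustrated with a minimal sketch: each rotation angle is drawn independently from a one-dimensional Gaussian mixture. Everything specific here is an assumption for illustration only: the helper name `gmm_init`, the component means (±π/2), the standard deviation (0.1), the equal weights, and the one-parameter-per-qubit-per-layer layout are placeholders, not the mixture the paper derives.

```python
import math
import random


def gmm_init(num_params, means, sigma, weights=None, seed=None):
    """Draw num_params values independently from a 1-D Gaussian mixture.

    Hypothetical helper: the component means, sigma, and weights are
    illustrative placeholders, not the values used in the paper.
    """
    rng = random.Random(seed)
    if weights is None:
        # Default to an equally weighted mixture.
        weights = [1.0 / len(means)] * len(means)
    params = []
    for _ in range(num_params):
        # Pick a mixture component, then sample from its Gaussian.
        mu = rng.choices(means, weights=weights, k=1)[0]
        params.append(rng.gauss(mu, sigma))
    return params


# Assumed layout: one rotation angle per qubit per layer.
num_qubits, depth = 4, 3
means = [-math.pi / 2, math.pi / 2]
flat = gmm_init(num_qubits * depth, means=means, sigma=0.1, seed=0)

# Group the flat parameter list into layers for a layered ansatz.
theta = [flat[i * num_qubits:(i + 1) * num_qubits] for i in range(depth)]
```

The resulting `theta` is a list of `depth` layers with `num_qubits` angles each, which could be passed to whatever circuit simulator is in use. Using a fixed `seed` makes the draw reproducible.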
Peer Reviews
Decision: ICLR 2025 Conference Withdrawn Submission
A novel parameter initialization strategy is proposed for the barren plateau phenomenon, and a proof is given.
The proposed parameter initialization strategy is not clearly presented in Figure 1. Moreover, no comparison with other methods is given. Furthermore, this is not the first time Gaussian mixture models have been introduced.
This paper proposes a mixture of Gaussians as an initialization scheme and proves that it avoids barren plateaus (at initialization) even when the cost function is global. The experiments, although their settings are somewhat unclear, seem to support the claim.
- My biggest concern is that the relation to the previous work [1] is hardly discussed. The authors should at least mention that [1] was the first to propose Gaussian initialization precisely to prevent barren plateaus at initialization, which is the exact setting this paper addresses. I understand there are some differences discussed briefly starting from line 258, but even there the authors do not mention that [1] uses Gaussian initialization. Such writing gives me the impression that the authors in
1. As far as I am concerned, this is a novel contribution in the field of quantum machine learning. 2. Given that it concerns only parameter initialization, the proposed initialization scheme is simple, easy to implement, and computationally efficient. 3. The method is backed up by solid theoretical guarantees which are also validated empirically through experiments. Also, the theoretical guarantees hold for rather practical situations, not just an idealized case. 4. The proofs in the appendix seem correct
1. Although your method is applied to a popular ansatz, it does not seem to generalize to other ansatz structures. Could you discuss potential for generalizing to different ansatz? 2. I found the paper to be written in a style that hinders understanding. There are various errors/inconsistencies in the notation and long inline math sections (around lines 158, 236, 250 for instance) that I personally found difficult to read. Please see the minor comments section for examples. It would be construct
- The paper addresses a fundamental problem in the context of variational quantum algorithms.
- The major weakness of this paper lies in its claim. In the abstract the authors claim to "*rigorously prove that, the proposed initialization method consistently avoids the barren plateaus problem for hardware-efficient ansatz*". While this may even be true in theory, I do not find the numerical experiments and the PQCs considered in the paper general enough for this claim to hold as currently phrased. This very **strong** claim is reiterated several times throughout the paper.
Taxonomy
Topics: Bayesian Methods and Mixture Models · Soil Geostatistics and Mapping · Geochemistry and Geologic Mapping
