Large Language Models Can Help Mitigate Barren Plateaus in Quantum Neural Networks

Jun Zhuang; Chaowen Guan

arXiv:2502.13166·quant-ph·April 14, 2026

TL;DR

This paper introduces AdaInit, a novel framework using large language models and submartingale properties to adaptively initialize quantum neural networks, effectively mitigating barren plateaus and improving training efficiency.

Contribution

AdaInit is presented as the first adaptive initialization method that combines large language models with theoretical guarantees to address barren plateaus in quantum neural networks.

Findings

1. AdaInit maintains higher gradient variance across various QNN scales.

2. AdaInit outperforms existing static initialization methods in experiments.

3. Theoretical analysis confirms convergence properties of AdaInit.

Abstract

In the era of noisy intermediate-scale quantum (NISQ) computing, Quantum Neural Networks (QNNs) have emerged as a promising approach for various applications, yet their training is often hindered by barren plateaus (BPs), where gradient variance vanishes exponentially as the qubit size increases. Most initialization-based mitigation strategies rely heavily on pre-designed static parameter distributions, thereby lacking adaptability to diverse model sizes or data conditions. To address these limitations, we propose AdaInit, a foundational framework that leverages large language models with the submartingale property to iteratively synthesize initial parameters for QNNs that yield non-negligible gradient variance, thereby mitigating BPs. Unlike conventional one-shot initialization methods, AdaInit adaptively explores the parameter space by incorporating dataset characteristics and…
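
To make the idea of iteratively searching for initial parameters with non-negligible gradient variance concrete, here is a minimal sketch. It is not the authors' implementation. The circuit (layers of RY rotations followed by a chain of CZ gates), the cost (the expectation of Z on the first qubit), the score (the variance of the gradient entries of a single candidate) and the function names are all assumptions made for illustration. In particular, `adaptive_init` uses a Gaussian random proposer as a hypothetical stand-in for the LLM generator described in the abstract.

```python
import numpy as np


def ry(theta):
    """Matrix of RY(theta) = exp(-i * theta * Y / 2), which is real-valued."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])


def apply_1q(state, gate, qubit):
    """Apply a single-qubit gate to one axis of an n-qubit state tensor."""
    state = np.tensordot(gate, state, axes=([1], [qubit]))
    return np.moveaxis(state, 0, qubit)


def apply_cz(state, q1, q2):
    """Apply CZ: flip the sign of amplitudes where both qubits are 1."""
    state = state.copy()
    idx = [slice(None)] * state.ndim
    idx[q1], idx[q2] = 1, 1
    state[tuple(idx)] *= -1
    return state


def cost(theta, n_qubits, n_layers):
    """Toy cost (assumption): <Z> on qubit 0 after a layered RY + CZ circuit."""
    state = np.zeros((2,) * n_qubits)
    state[(0,) * n_qubits] = 1.0
    angles = np.asarray(theta).reshape(n_layers, n_qubits)
    for layer in range(n_layers):
        for q in range(n_qubits):
            state = apply_1q(state, ry(angles[layer, q]), q)
        for q in range(n_qubits - 1):
            state = apply_cz(state, q, q + 1)
    probs = np.abs(state) ** 2
    return probs[0].sum() - probs[1].sum()


def gradient(theta, n_qubits, n_layers):
    """Exact gradient of the toy cost via the parameter-shift rule."""
    theta = np.asarray(theta, dtype=float)
    grad = np.zeros_like(theta)
    for k in range(theta.size):
        shift = np.zeros_like(theta)
        shift[k] = np.pi / 2
        plus = cost(theta + shift, n_qubits, n_layers)
        minus = cost(theta - shift, n_qubits, n_layers)
        grad[k] = 0.5 * (plus - minus)
    return grad


def adaptive_init(n_qubits, n_layers, n_iters, rng, scale=1.0):
    """Keep the candidate with the highest gradient-variance score.

    The Gaussian proposer is a hypothetical stand-in for an LLM generator.
    """
    best_theta, best_score, history = None, -np.inf, []
    size = n_qubits * n_layers
    for _ in range(n_iters):
        center = np.zeros(size) if best_theta is None else best_theta
        candidate = center + rng.normal(0.0, scale, size=size)
        # Score (assumption): variance of the gradient entries of the candidate.
        score = np.var(gradient(candidate, n_qubits, n_layers))
        if score > best_score:
            best_theta, best_score = candidate, score
        history.append(best_score)
    return best_theta, history


if __name__ == "__main__":
    theta0, scores = adaptive_init(4, 3, 20, np.random.default_rng(0))
    print("best gradient-variance score:", scores[-1])
```

Because a candidate is only accepted when it improves the score, the best-so-far value never decreases across iterations. This only loosely echoes the submartingale property mentioned in the abstract; the paper's actual construction and its guarantees are not reproduced here.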
