QFFN-BERT: An Empirical Study of Depth, Performance, and Data Efficiency in Hybrid Quantum-Classical Transformers
Pilsung Kang

TL;DR
This paper introduces QFFN-BERT, a hybrid quantum-classical transformer that replaces classical feedforward networks with parameterized quantum circuits, demonstrating improved accuracy and data efficiency on benchmark tasks.
Contribution
The work systematically investigates PQC depth and design choices in hybrid transformers, showing their potential as efficient alternatives to classical FFNs.
Findings
Achieves up to 102% of baseline accuracy with fewer parameters.
Outperforms classical models in few-shot learning scenarios.
A PQC design with a residual connection and alternating entanglement keeps training stable.
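As a rough illustration of the last finding, the sketch below simulates a small PQC layer with a residual connection in pure Python. It is not the paper's implementation: the angle encoding, the choice of RY and RZ gates, the CNOT entanglers, the specific alternating pairing pattern, the Pauli-Z readout, and all function names (`pqc_ffn`, `apply_1q`, and so on) are assumptions made for this example.

```python
import cmath
import math

def ry(t):
    # Single-qubit rotation about the Y axis.
    c, s = math.cos(t / 2), math.sin(t / 2)
    return [[c, -s], [s, c]]

def rz(t):
    # Single-qubit rotation about the Z axis.
    return [[cmath.exp(-1j * t / 2), 0], [0, cmath.exp(1j * t / 2)]]

def apply_1q(state, gate, q, n):
    # Apply a 2x2 gate to qubit q of an n-qubit statevector.
    new = list(state)
    bit = 1 << q
    for i in range(1 << n):
        if i & bit == 0:
            a, b = state[i], state[i | bit]
            new[i] = gate[0][0] * a + gate[0][1] * b
            new[i | bit] = gate[1][0] * a + gate[1][1] * b
    return new

def apply_cnot(state, control, target, n):
    # Swap amplitudes of basis states that differ only in the target bit,
    # restricted to states where the control bit is set.
    new = list(state)
    for i in range(1 << n):
        if i & (1 << control) and not i & (1 << target):
            j = i | (1 << target)
            new[i], new[j] = state[j], state[i]
    return new

def expect_z(state, q):
    # Expectation value of Pauli-Z on qubit q.
    return sum((-1 if i & (1 << q) else 1) * abs(a) ** 2
               for i, a in enumerate(state))

def pqc_ffn(x, params):
    # x: list of n input features (one per qubit, an assumption).
    # params: list of layers; each layer is a list of n (theta_y, theta_z) pairs.
    n = len(x)
    state = [0j] * (1 << n)
    state[0] = 1 + 0j
    for q in range(n):  # angle encoding of the input (assumed)
        state = apply_1q(state, ry(x[q]), q, n)
    for depth, layer in enumerate(params):
        for q, (ty, tz) in enumerate(layer):
            state = apply_1q(state, ry(ty), q, n)
            state = apply_1q(state, rz(tz), q, n)
        # Alternating entanglement (assumed pattern): even layers couple
        # (0,1), (2,3), ...; odd layers couple (1,2), (3,4), ...
        for q in range(depth % 2, n - 1, 2):
            state = apply_cnot(state, q, q + 1, n)
    out = [expect_z(state, q) for q in range(n)]
    # Residual connection: add the circuit output back onto the input.
    return [xi + oi for xi, oi in zip(x, out)]

if __name__ == "__main__":
    layers = [[(0.1, 0.2)] * 4, [(0.3, 0.4)] * 4]
    print(pqc_ffn([0.5, -0.2, 0.0, 1.0], layers))
```

Because each Pauli-Z expectation lies in [-1, 1], every output of this sketch stays within one unit of its input, which is one informal way to see why a residual path can keep the signal well-scaled regardless of circuit depth.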
Abstract
Parameterized quantum circuits (PQCs) have recently emerged as promising components for enhancing the expressibility of neural architectures. In this work, we introduce QFFN-BERT, a hybrid quantum-classical transformer where the feedforward network (FFN) modules of a compact BERT variant are replaced by PQC-based layers. This design is motivated by the dominant parameter contribution of FFNs, which account for approximately two-thirds of the parameters within standard Transformer encoder blocks. While prior studies have primarily integrated PQCs into self-attention modules, our work focuses on the FFN and systematically investigates the trade-offs between PQC depth, expressibility, and trainability. Our final PQC architecture incorporates a residual connection, both R_Y and R_Z rotations, and an alternating entanglement strategy to ensure stable training and high expressibility. Our…
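The two-thirds figure in the abstract can be checked with a quick parameter count for one standard encoder block. The sketch below assumes the common convention of an FFN hidden width four times the model width, and uses a model width of 768 purely as an example; neither value is taken from the compact BERT variant studied in the paper, and the function name is hypothetical.

```python
def encoder_block_params(d_model, d_ff):
    # Self-attention: Q, K, V, and output projections, each d_model x d_model plus bias.
    attention = 4 * (d_model * d_model + d_model)
    # FFN: two linear layers, d_model -> d_ff -> d_model, each with a bias.
    ffn = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)
    # Two LayerNorms, each with a scale and a shift vector.
    layer_norm = 2 * (2 * d_model)
    return {"attention": attention, "ffn": ffn, "layer_norm": layer_norm}

if __name__ == "__main__":
    d_model = 768          # assumed width, for illustration only
    counts = encoder_block_params(d_model, 4 * d_model)
    total = sum(counts.values())
    print(counts, "FFN share:", counts["ffn"] / total)
```

Ignoring biases and LayerNorm, the attention projections contribute 4d² weights and the FFN contributes 8d², so the FFN share is 8/12, which is where the two-thirds estimate comes from under these assumptions.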
