HQTN-SER: Speech Emotion Recognition with Hybrid Quantum Tensor Networks
Mahad Mohtashim, Nouhaila Innan, Muhammad Shafique

TL;DR
This paper introduces HQTN-SER, a hybrid quantum-classical framework for speech emotion recognition that leverages structured quantum tensor networks to improve accuracy with low qubit counts.
Contribution
It proposes a novel quantum tensor network module and fusion strategy for SER, demonstrating effective performance on multiple benchmarks with small-qubit quantum circuits.
Findings
Achieved over 80% accuracy on the RAVDESS dataset.
Demonstrated stable convergence with low qubit counts.
Provided a reproducible baseline for quantum-assisted SER.
Abstract
Speech emotion recognition (SER) remains fragile in real-world conditions because emotional cues are subtle, speaker-dependent, and easily confounded by recording variability, while high-performing deep models typically rely on large and carefully curated training sets. Quantum machine learning offers an alternative way to introduce nonlinear correlation modeling with compact modules, yet existing quantum SER studies remain limited and the impact of circuit structure is not well understood. This paper presents HQTN-SER, a hybrid quantum-classical framework that investigates how quantum tensor network connectivity can support SER under small-qubit settings. HQTN-SER introduces (i) an MPS-inspired quantum tensor network module that enforces structured interactions to model correlations in speech representations with a small number of trainable parameters, and (ii) a fusion strategy that…
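To make the idea of MPS-style connectivity concrete, here is a minimal NumPy statevector sketch of a "staircase" circuit in which each two-qubit block acts only on neighbouring qubits (0,1), (1,2), and so on. It is an illustration under stated assumptions, not the authors' circuit: the RY angle encoding, the RY plus CNOT block, the Pauli-Z readout on the last qubit, and all function names (`mps_circuit`, `apply_1q`, `apply_2q`) are hypothetical choices made for this example.

```python
import numpy as np

def ry(theta):
    """Single-qubit RY rotation matrix."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

# CNOT with the first qubit as control and the second as target.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)

def apply_1q(state, gate, q):
    """Apply a 2x2 gate to qubit q of a state tensor of shape (2,)*n."""
    state = np.tensordot(gate, state, axes=([1], [q]))
    return np.moveaxis(state, 0, q)

def apply_2q(state, gate, q0, q1):
    """Apply a 4x4 gate to qubits (q0, q1), with q0 < q1."""
    g = gate.reshape(2, 2, 2, 2)  # axes: out0, out1, in0, in1
    state = np.tensordot(g, state, axes=([2, 3], [q0, q1]))
    return np.moveaxis(state, [0, 1], [q0, q1])

def mps_circuit(features, weights):
    """Assumed MPS-style staircase: encode, then chain nearest-neighbour blocks.

    features: n rotation angles (assumed angle encoding of classical inputs)
    weights:  array of shape (n - 1, 2), two trainable angles per block
    Returns the Pauli-Z expectation value on the last qubit.
    """
    n = len(features)
    state = np.zeros((2,) * n)
    state[(0,) * n] = 1.0  # start in |00...0>
    for q in range(n):
        state = apply_1q(state, ry(features[q]), q)
    for i in range(n - 1):  # blocks on (0,1), (1,2), ..., (n-2,n-1)
        state = apply_1q(state, ry(weights[i, 0]), i)
        state = apply_1q(state, ry(weights[i, 1]), i + 1)
        state = apply_2q(state, CNOT, i, i + 1)
    probs = np.abs(state) ** 2
    p_last = probs.sum(axis=tuple(range(n - 1)))  # marginal of last qubit
    return p_last[0] - p_last[1]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.uniform(0, np.pi, size=4)          # stand-in for 4 speech features
    w = rng.uniform(-np.pi, np.pi, size=(3, 2))
    print(mps_circuit(x, w))
```

In this sketch a 4-qubit circuit has only 6 trainable angles, and information from the first qubit can reach the readout qubit only by passing through the chain of neighbouring blocks, which is the kind of structured interaction the abstract describes.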
