Spoken Digit Recognition and Speaker Classification by Nonlinear Interfered Spin Wave-Based Physical Reservoir Computing
Sota Hikasa, Wataru Namiki, Daiki Nishioka, Maki Nishimura, Ryo Iguchi, Kazuya Terabe, Takashi Tsuchiya

TL;DR
This paper demonstrates that a nonlinear interfered spin wave-based physical reservoir computing system can effectively perform speech recognition tasks, including spoken digit recognition and speaker classification, with reduced preprocessing requirements.
Contribution
The study introduces a novel spin wave-based PRC that achieves high accuracy in speech tasks without relying on computationally intensive preprocessing like cochleagrams.
Findings
Cochleagram preprocessing alone achieved ~90% accuracy in both tasks.
Interfered spin wave PRC alone achieved 85.8% accuracy in speaker classification.
The proposed PRC reduces preprocessing needs while maintaining competitive performance.
Abstract
Recently, artificial-intelligence (AI) technologies have been increasingly utilized in a wide range of real-world applications. Speech recognition is one of these practical AI tasks and is regarded as a key application for edge AI systems. Consequently, speech recognition has been widely employed as a representative benchmark task for assessing the performance of physical reservoir computing (PRC). Although many PRCs have performed this task, most of them rely on frequency-extraction preprocessing, such as a cochleagram or mel-frequency cepstrum. The cochleagram in particular enables high-accuracy recognition; however, it incurs a substantial computational cost for preprocessing and is therefore unsuitable for edge computing, where resources are limited. In this study, we employed a nonlinear interfered spin wave-based PRC, which demonstrated superior…
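To make the reservoir computing workflow mentioned in the abstract concrete, the following is a minimal sketch of the general idea: a fixed nonlinear dynamical system transforms the input signal, and only a linear readout is trained. It is not the authors' spin wave device or their evaluation pipeline. The reservoir here is a small software echo state network used as a stand-in, the input is synthetic two-class sine data rather than speech, and every name and parameter (`run_reservoir`, `train_readout`, `predict`, `leak`, `ridge`, the node count) is a hypothetical choice made for illustration.

```python
import numpy as np

def run_reservoir(u, w_in, w_res, leak=0.3):
    """Drive a software stand-in reservoir with a 1-D signal u.

    Returns the time-averaged squared node states as a feature vector.
    (Assumption: a physical reservoir would supply measured responses instead.)
    """
    x = np.zeros(w_res.shape[0])
    energy = np.zeros_like(x)
    for u_t in u:
        # Leaky-integrator update with a tanh nonlinearity.
        x = (1.0 - leak) * x + leak * np.tanh(w_in * u_t + w_res @ x)
        energy += x ** 2
    return energy / len(u)

def train_readout(features, labels, n_classes, ridge=1e-3):
    """Fit the linear readout by ridge regression onto one-hot targets."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])  # append bias column
    Y = np.eye(n_classes)[labels]
    return np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ Y)

def predict(features, w_out):
    """Pick the class whose readout output is largest."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    return np.argmax(X @ w_out, axis=1)

rng = np.random.default_rng(0)
n_nodes = 50
w_in = rng.uniform(-1.0, 1.0, n_nodes)
w_res = rng.normal(0.0, 1.0, (n_nodes, n_nodes))
# Rescale the recurrent weights to a spectral radius of 0.9.
w_res *= 0.9 / np.max(np.abs(np.linalg.eigvals(w_res)))

# Synthetic stand-in data: two classes of noisy sines at different frequencies.
t = np.arange(200)
signals, labels = [], []
for k in range(40):
    c = k % 2
    freq = 0.05 if c == 0 else 0.12
    phase = rng.uniform(0.0, 2.0 * np.pi)
    signals.append(np.sin(2.0 * np.pi * freq * t + phase) + 0.1 * rng.normal(size=t.size))
    labels.append(c)
labels = np.array(labels)

features = np.array([run_reservoir(s, w_in, w_res) for s in signals])
w_out = train_readout(features, labels, n_classes=2)
pred = predict(features, w_out)
print("training accuracy:", np.mean(pred == labels))
```

Only `w_out` is learned; the reservoir weights stay fixed, which is what allows a physical system to take the reservoir's place. In a physical setup, the `run_reservoir` step would be replaced by driving the device and recording its responses, while the readout training would stay the same in spirit.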
