Challenges and Insights: Exploring 3D Spatial Features and Complex Networks on the MISP Dataset
Yiwen Shao

TL;DR
This paper investigates the use of 3D spatial features in multi-channel multi-talker speech recognition on the MISP dataset, highlighting challenges, insights, and the potential of direct 'All-in-one' ASR models for improved accuracy.
Contribution
It extends the application of 3D spatial features and 'All-in-one' models to the MISP dataset, providing new insights and preliminary experiments with complex inputs and models.
Findings
3D spatial features help recognize the target speaker's speech in noisy environments
Direct 'All-in-one' ASR models show promising performance on MISP data
Replacing spatial features with complex inputs requires further investigation
Abstract
Multi-channel multi-talker speech recognition presents formidable challenges in the realm of speech processing, marked by issues such as background noise, reverberation, and overlapping speech. Overcoming these complexities requires leveraging contextual cues to separate target speech from a cacophonous mix, enabling accurate recognition. Among these cues, the 3D spatial feature has emerged as a cutting-edge solution, particularly when equipped with spatial information about the target speaker. Its exceptional ability to discern the target speaker within mixed audio, often rendering intermediate processing redundant, paves the way for the direct training of "All-in-one" ASR models. These models have demonstrated commendable performance on both simulated and real-world data. In this paper, we extend this approach to the MISP dataset to further validate its efficacy. We delve into the…
Taxonomy
Topics: Graph Theory and Algorithms · Geographic Information Systems Studies · Data Management and Algorithms
