Low-Resource Contextual Topic Identification on Speech
Chunxi Liu, Matthew Wiesner, Shinji Watanabe, Craig Harman, Jan Trmal, Najim Dehak, Sanjeev Khudanpur

TL;DR
This paper introduces a novel attention-based contextual model for low-resource spoken topic identification, significantly improving accuracy by leveraging sequential segment dependencies in unstructured audio.
Contribution
It presents a new attention-based approach that exploits contextual information across segments, enhancing topic ID performance in low-resource language settings.
Findings
Attention-based contextual model significantly outperforms the non-contextual models on all but one of the four LORELEI languages tested
Contextual dependencies significantly improve topic classification accuracy
Method is effective for topic ID on spoken segments in low-resource languages
Abstract
In topic identification (topic ID) on real-world unstructured audio, an audio instance of variable topic shifts is first broken into sequential segments, and each segment is independently classified. We first present a general purpose method for topic ID on spoken segments in low-resource languages, using a cascade of universal acoustic modeling, translation lexicons to English, and English-language topic classification. Next, instead of classifying each segment independently, we demonstrate that exploring the contextual dependencies across sequential segments can provide large improvements. In particular, we propose an attention-based contextual model which is able to leverage the contexts in a selective manner. We test both our contextual and non-contextual models on four LORELEI languages, and on all but one our attention-based contextual model significantly outperforms the…
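The abstract does not spell out the architecture of the contextual model, so the following is only a minimal sketch of the general idea of selectively weighting neighboring segments. It assumes each segment has already been turned into a fixed-length feature vector, and it uses plain scaled dot-product attention over a local window, which may differ from what the authors actually use. The function name `attend_over_context` and the `window` parameter are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attend_over_context(segments, window=2):
    """Augment each segment vector with an attention-weighted context vector.

    segments: list of equal-length feature vectors, in temporal order.
    window:   number of segments on each side treated as context
              (hypothetical parameter; the segment itself is included).
    Returns (contextual, all_weights), where contextual[i] is the original
    vector concatenated with its context summary, and all_weights[i] maps
    each neighbor index to its attention weight.
    """
    n = len(segments)
    dim = len(segments[0])
    contextual, all_weights = [], []
    for i, query in enumerate(segments):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        neighbors = segments[lo:hi]
        # Scaled dot-product similarity between this segment and its neighbors.
        scores = [dot(query, key) / math.sqrt(dim) for key in neighbors]
        weights = softmax(scores)
        # Weighted average of neighbor vectors: similar neighbors count more.
        context = [
            sum(w * key[j] for w, key in zip(weights, neighbors))
            for j in range(dim)
        ]
        contextual.append(list(query) + context)
        all_weights.append(dict(zip(range(lo, hi), weights)))
    return contextual, all_weights


# Toy input: two segments about one "topic" followed by two about another.
segments = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
contextual, weights = attend_over_context(segments, window=1)
```

In this toy input, segment 1 places more weight on segment 0 than on segment 2, because segment 0 is more similar to it. A downstream topic classifier would then consume the concatenated vectors in `contextual` rather than each segment's vector alone.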
Taxonomy
Topics: Speech Recognition and Synthesis · Music and Audio Processing · Natural Language Processing Techniques
