Spoken language change detection inspired by speaker change detection
Jagabandhu Mishra, S. R. Mahadeva Prasanna

TL;DR
This paper proposes a language change detection system inspired by speaker change detection, utilizing longer analysis windows and prior language knowledge, resulting in significant performance improvements on code-switched speech datasets.
Contribution
The study adapts speaker change detection techniques for language change detection, incorporating longer analysis windows and language priors, and demonstrates substantial performance gains.
Findings
Increased analysis window length improves detection accuracy.
Prior language knowledge significantly enhances performance.
Performance varies between synthetic and real datasets due to segment duration differences.
Abstract
Spoken language change detection (LCD) refers to identifying the language transitions in a code-switched utterance. Similarly, identifying the speaker transitions in a multispeaker utterance is known as speaker change detection (SCD). Since the two tasks are similar, the architecture/framework developed for the SCD task may be suitable for the LCD task. Hence, the aim of the present work is to develop LCD systems inspired by SCD. Initially, both LCD and SCD are performed by humans. The study suggests that, compared to SCD, humans performing LCD require (a) a larger duration around the change point and (b) language-specific prior exposure. The larger duration requirement is incorporated by increasing the analysis window length of the unsupervised distance-based approach. This leads to a relative performance improvement of 29.1% and 2.4%, and a priori language knowledge provides a…
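The unsupervised distance-based approach mentioned in the abstract can be illustrated with a minimal sketch. The excerpt does not specify the paper's features, distance measure, or window lengths, so everything here is an assumption chosen for illustration: the function name `distance_curve`, the parameter `win`, the symmetric KL divergence between diagonal Gaussians, and the synthetic features are all hypothetical. The sketch compares two adjacent windows at each candidate frame and treats the peak of the resulting curve as the change point.

```python
import numpy as np

def distance_curve(features, win):
    """Distance between the `win` frames before and after each candidate frame.

    Assumption: each window is modeled as a diagonal Gaussian and the two
    windows are compared with a symmetric KL divergence. This is an
    illustrative choice, not necessarily the measure used in the paper.
    """
    n = len(features)
    dist = np.zeros(n)
    for t in range(win, n - win + 1):
        left = features[t - win:t]
        right = features[t:t + win]
        m1, m2 = left.mean(axis=0), right.mean(axis=0)
        # Small floor keeps the variances strictly positive.
        v1 = left.var(axis=0) + 1e-6
        v2 = right.var(axis=0) + 1e-6
        dist[t] = 0.5 * np.sum(
            v1 / v2 + v2 / v1 - 2.0 + (m1 - m2) ** 2 * (1.0 / v1 + 1.0 / v2)
        )
    return dist

if __name__ == "__main__":
    # Synthetic stand-in for frame-level features: two segments with
    # different statistics and a change at frame 200.
    rng = np.random.default_rng(0)
    feats = np.vstack([
        rng.normal(0.0, 1.0, size=(200, 4)),
        rng.normal(3.0, 1.0, size=(200, 4)),
    ])
    # A longer window uses more context around each candidate change point.
    for win in (10, 50):
        curve = distance_curve(feats, win)
        print(f"win={win}: estimated change at frame {int(np.argmax(curve))}")
```

A longer `win` gives each side of the candidate point more frames to estimate its statistics from, which is the intuition behind increasing the analysis window length. The trade-off is that no change can be scored within `win` frames of either end of the utterance, and short language segments may be smoothed over.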
Taxonomy
Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Speech and dialogue systems
