DISPLACE Challenge: DIarization of SPeaker and LAnguage in Conversational Environments
Shikha Baghel, Shreyas Ramoji, Sidharth, Ranjana H, Prachi Singh, Somil Jain, Pratik Roy Chowdhuri, Kaustubh Kulkarni, Swapnil Padhi, Deepu Vijayasenan, Sriram Ganapathy

TL;DR
The DISPLACE challenge introduces a benchmark for speaker and language diarization in multilingual, code-mixed conversational speech, addressing a gap in current speech technology for complex social interactions.
Contribution
It presents the first benchmark dataset and evaluation framework for simultaneous speaker and language diarization in multilingual, code-mixed conversations.
Findings
A baseline system is provided for comparison.
Submitted systems highlight the remaining challenges in speaker and language diarization.
Evaluation results identify key areas for future research.
Abstract
In multilingual societies, social conversations often involve code-mixed speech. Current speech technology may not be well equipped to extract information from multilingual, multi-speaker conversations. The DISPLACE challenge entails a first-of-its-kind task to benchmark speaker and language diarization on the same data, as the data contains multi-speaker conversations in multilingual code-mixed speech. The challenge attempts to highlight outstanding issues in speaker diarization (SD) in multilingual settings with code-mixing. Further, language diarization (LD) in multi-speaker settings also introduces new challenges, where the system has to disambiguate speaker switches from code switches. For this challenge, a natural multilingual, multi-speaker conversational dataset is distributed for development and evaluation purposes. The systems are evaluated on single-channel far-field…
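To make the distinction between speaker switches and code switches concrete, the following is a minimal illustrative sketch, not the challenge's data format or scoring tool. The `Segment` type, the `change_points` helper, the speaker IDs, the language tags, and all timestamps are hypothetical and chosen only for illustration. It annotates one recording twice, once by speaker and once by language, and shows that the two sets of change points need not coincide.

```python
from typing import List, NamedTuple, Set


class Segment(NamedTuple):
    start: float  # seconds
    end: float    # seconds
    label: str    # speaker ID or language tag (hypothetical values)


def change_points(segments: List[Segment]) -> Set[float]:
    """Return the times at which the label differs between consecutive segments."""
    ordered = sorted(segments, key=lambda s: s.start)
    return {
        nxt.start
        for prev, nxt in zip(ordered, ordered[1:])
        if prev.label != nxt.label
    }


# Hypothetical annotations for a single 12-second recording.
speaker_turns = [
    Segment(0.0, 5.0, "S1"),
    Segment(5.0, 12.0, "S2"),
]
language_turns = [
    Segment(0.0, 3.0, "lang1"),
    Segment(3.0, 5.0, "lang2"),   # S1 switches language mid-turn
    Segment(5.0, 9.0, "lang2"),   # S2 starts in the same language S1 ended in
    Segment(9.0, 12.0, "lang1"),  # S2 switches language mid-turn
]

speaker_switches = change_points(speaker_turns)
code_switches = change_points(language_turns)

# Speaker changes with no accompanying language change, and vice versa.
speaker_only = speaker_switches - code_switches
code_only = code_switches - speaker_switches

print("speaker switches:", sorted(speaker_switches))
print("code switches:", sorted(code_switches))
```

In this toy timeline the speaker change at 5.0 s involves no language change, while both language changes fall inside a single speaker's turn, so a system that treats every acoustic change as a speaker change (or as a language change) would mislabel some boundaries.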
Taxonomy
Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Speech and dialogue systems
