STCON System for the CHiME-8 Challenge
Anton Mitrofanov, Tatiana Prisyach, Tatiana Timofeeva, Sergei Novoselov, Maxim Korenevsky, Yuri Khokhlov, Artem Akulov, Alexander Anikin, Roman Khalili, Iurii Lezhenin, Aleksandr Melnikov, Dmitriy Miroshnichenko, Nikita Mamaev, Ilya Odegov, Olga Rudnitskaya, Aleksei Romanenko

TL;DR
This paper presents the STCON system for the CHiME-8 Challenge, focusing on improved diarization, speaker counting, and source separation techniques to enhance distant speech transcription accuracy.
Contribution
It introduces a carefully tuned diarization pipeline, a novel Guided Target speaker Extraction model, and data augmentation methods to improve speech recognition in multi-device recordings.
Findings
Reduced diarization error rate (DER) significantly
Achieved more reliable speech segments for recognition
Enhanced source separation with G-TSE and GSS methods
Abstract
This paper describes the STCON system for the CHiME-8 Challenge Task 1 (DASR), which targets distant automatic speech transcription and diarization with multiple recording devices. We focused mainly on a carefully trained and tuned diarization pipeline and on speaker counting. This allowed us to significantly reduce the diarization error rate (DER) and obtain more reliable segments for speech separation and recognition. To improve source separation, we designed a Guided Target speaker Extraction (G-TSE) model and used it in conjunction with the traditional Guided Source Separation (GSS) method. To train various parts of our pipeline, we investigated several data augmentation and generation techniques, which helped us improve the overall system quality.
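For readers unfamiliar with the DER metric mentioned in the abstract, the following is a minimal sketch of how a frame-level diarization error rate can be computed. It is not the challenge's official scoring tool or the authors' code. It assumes that hypothesis speaker labels have already been mapped to reference labels, that no forgiveness collar is applied, and that both annotations are sampled on the same frame grid. The function name `frame_level_der` and the input layout (one set of active speakers per frame) are hypothetical choices made for illustration.

```python
from typing import List, Set


def frame_level_der(reference: List[Set[str]], hypothesis: List[Set[str]]) -> float:
    """Simplified DER: (missed + false alarm + confusion) / total reference speech.

    Assumptions (illustrative only): labels are already mapped between
    reference and hypothesis, and no collar is applied around boundaries.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same number of frames")

    missed = false_alarm = confusion = total = 0
    for ref, hyp in zip(reference, hypothesis):
        n_ref, n_hyp = len(ref), len(hyp)
        n_correct = len(ref & hyp)  # speakers active in both annotations
        missed += max(0, n_ref - n_hyp)        # reference speech with no hypothesis speaker
        false_alarm += max(0, n_hyp - n_ref)   # hypothesis speech with no reference speaker
        confusion += min(n_ref, n_hyp) - n_correct  # speech attributed to the wrong speaker
        total += n_ref                          # total reference speaker-frames

    if total == 0:
        raise ValueError("reference contains no speech")
    return (missed + false_alarm + confusion) / total


if __name__ == "__main__":
    # Toy example: five frames, two speakers.
    reference = [{"A"}, {"A"}, {"A", "B"}, set(), {"B"}]
    hypothesis = [{"A"}, {"B"}, {"A"}, {"A"}, {"B"}]
    print(frame_level_der(reference, hypothesis))
```

In the toy example there is one confusion frame, one missed speaker in the overlap frame, and one false alarm in the silent frame, against five reference speaker-frames. Real evaluations search for the optimal speaker mapping and work on time segments rather than a fixed frame grid.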
Taxonomy
Topics: Distributed and Parallel Computing Systems · Parallel Computing and Optimization Techniques
Methods: Softmax · Attention Is All You Need
