Leveraging Cross-Lingual Transfer Learning in Spoken Named Entity Recognition Systems
Moncef Benaicha, David Thulke, M. A. Tuğtekin Turan

TL;DR
This paper investigates cross-lingual transfer learning for spoken Named Entity Recognition (NER) across Dutch, English, and German. It finds that End-to-End (E2E) models and cross-lingual transfer improve performance, especially in low-resource settings.
Contribution
The paper applies transfer learning with Wav2Vec2 XLS-R models to spoken NER across multiple languages, showing that E2E models outperform pipeline models and that cross-lingual transfer is beneficial.
Findings
E2E models outperform pipeline models in spoken NER
Transfer learning from German to Dutch improves accuracy by 7%
Cross-lingual transfer is effective in low-resource spoken NER scenarios
Abstract
Recent Named Entity Recognition (NER) advancements have significantly enhanced text classification capabilities. This paper focuses on spoken NER, aimed explicitly at spoken document retrieval, an area not widely studied due to the lack of comprehensive datasets for spoken contexts. Additionally, the potential for cross-lingual transfer learning in low-resource situations deserves further investigation. In our study, we applied transfer learning techniques across Dutch, English, and German using both pipeline and End-to-End (E2E) approaches. We employed Wav2Vec2 XLS-R models on custom pseudo-annotated datasets to evaluate the adaptability of cross-lingual systems. Our exploration of different architectural configurations assessed the robustness of these systems in spoken NER. Results showed that the E2E model was superior to the pipeline model, particularly with limited annotation…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
