Language Portability Strategies for Open-domain Dialogue with Pre-trained Language Models from High to Low Resource Languages
Ahmed Njifenjou, Virgile Sucal, Bassam Jabaian, Fabrice Lefèvre

TL;DR
This paper investigates strategies to adapt large pre-trained language models for open-domain dialogue in low-resource languages, using French as a case study, and compares different translation and adaptation approaches.
Contribution
It evaluates neural machine translation and adapter-based methods to enhance language portability of dialogue models from high-resource to low-resource languages.
Findings
NMT-based approaches improve low-resource language performance.
Adapter-based fine-tuning with BLOOM enhances dialogue quality.
Strategies enable leveraging multilingual models without extensive new data.
Abstract
In this paper we propose a study of linguistic portability strategies for large pre-trained language models (PLMs) used in open-domain dialogue systems in a language that is high-resource for this task. In particular, the target low-resource language (L_T) is simulated with French, as it lacks task-specific resources and allows for our human evaluation, while the source language (L_S) is English. For obvious reasons, recent works using such models for open-domain dialogue are mostly developed in English. Yet building specific PLMs for each possible target language requires collecting new datasets and is costly. For this reason, trying to leverage all existing resources (PLMs and data) in both L_S and L_T, we wish to assess the performance achievable in L_T with different approaches. The first two approaches evaluate the usage of Neural Machine Translation (NMT) at different levels:…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
Methods: Adapter · BLOOM
