MDAPT: Multilingual Domain Adaptive Pretraining in a Single Model
Rasmus Kær Jørgensen, Mareike Hartmann, Xiang Dai, and Desmond Elliott

TL;DR
This paper investigates how to effectively adapt a single multilingual language model to specific domains across multiple languages, demonstrating that such models can outperform general multilingual models and approach monolingual performance.
Contribution
It introduces techniques for domain adaptive pretraining in a multilingual setting, enabling a single model to excel in domain-specific tasks across multiple languages.
Findings
Multilingual domain-adaptive models outperform general multilingual models.
Single models perform close to monolingual models in domain-specific tasks.
Techniques work across adapter-based and full model pretraining methods.
Abstract
Domain adaptive pretraining, i.e. the continued unsupervised pretraining of a language model on domain-specific text, improves the modelling of text for downstream tasks within the domain. Numerous real-world applications are based on domain-specific text, e.g. working with financial or biomedical documents, and these applications often need to support multiple languages. However, large-scale domain-specific multilingual pretraining data for such scenarios can be difficult to obtain, due to regulations, legislation, or simply a lack of language- and domain-specific text. One solution is to train a single multilingual model, taking advantage of the data available in as many languages as possible. In this work, we explore the benefits of domain adaptive pretraining with a focus on adapting to multiple languages within a specific domain. We propose different techniques to compose…
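The idea of pooling domain-specific text from many languages into one pretraining corpus raises a practical question: how often to draw from each language when corpus sizes are very uneven. The sketch below illustrates one common approach from general multilingual pretraining, exponentially smoothed language sampling. It is an assumption for illustration only, not necessarily the composition technique the paper proposes; the corpus sizes, the function names, and the smoothing value `alpha=0.5` are all hypothetical.

```python
import random


def sampling_weights(sizes, alpha=0.5):
    """Return per-language sampling probabilities, smoothed by exponent alpha.

    alpha=1.0 reproduces the raw corpus proportions; smaller values shift
    probability mass toward lower-resource languages.
    """
    total = sum(sizes.values())
    smoothed = {lang: (n / total) ** alpha for lang, n in sizes.items()}
    norm = sum(smoothed.values())
    return {lang: w / norm for lang, w in smoothed.items()}


def sample_batch_languages(sizes, batch_size, alpha=0.5, seed=0):
    """Draw the language of each example in a batch from the smoothed weights."""
    rng = random.Random(seed)
    weights = sampling_weights(sizes, alpha)
    langs = sorted(weights)
    return rng.choices(langs, weights=[weights[l] for l in langs], k=batch_size)


# Hypothetical domain-specific corpus sizes (number of sentences per language).
corpus_sizes = {"en": 1_000_000, "da": 50_000, "es": 200_000}
weights = sampling_weights(corpus_sizes, alpha=0.5)
batch_langs = sample_batch_languages(corpus_sizes, batch_size=8)
```

With these made-up sizes, Danish makes up 4% of the raw data but receives a noticeably larger share of sampled examples, while English is sampled less often than its raw 80% share. Whether such rebalancing helps in a given domain is an empirical question.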
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
