Low-Resource Cross-Lingual Adaptive Training for Nigerian Pidgin
Pin-Jie Lin, Muhammed Saeed, Ernie Chang, Merel Scholman

TL;DR
This paper introduces a cross-lingual adaptive training framework for Nigerian Pidgin, leveraging English pre-trained models and data augmentation to improve language understanding and translation in low-resource settings.
Contribution
The paper collects a large-scale parallel English-Pidgin corpus and proposes a cross-lingual adaptive training framework that combines continual and task adaptive training, demonstrating its effectiveness on Nigerian Pidgin text classification and translation.
Findings
English pre-trained models outperform multilingual models on Pidgin tasks
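Orthographic data augmentation and task adaptive training with back-translation have a significant impact on model performance
English pre-trained models yield improvements of up to 2.38 BLEU over multilingual models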
Abstract
Developing effective spoken language processing systems for low-resource languages poses several challenges due to the lack of parallel data and limited resources for fine-tuning models. In this work, we aim to improve both text classification and translation of Nigerian Pidgin (Naija) by collecting a large-scale parallel English-Pidgin corpus. We further propose a framework of cross-lingual adaptive training that includes both continual and task adaptive training so as to adapt a base pre-trained model to low-resource languages. Our studies show that English pre-trained language models serve as a stronger prior than multilingual language models on English-Pidgin tasks, with up to 2.38 BLEU improvements, and demonstrate that augmenting orthographic data and using task adaptive training with back-translation can have a significant impact on model performance.
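The abstract mentions back-translation as part of task adaptive training. The sketch below illustrates the generic back-translation recipe only: Pidgin-only sentences are translated into English by a reverse model, and the resulting synthetic (English, Pidgin) pairs are added to the real parallel data. It is not the authors' implementation. The function names, the toy lexicon, and the word-lookup "reverse model" are all hypothetical stand-ins for a trained Pidgin-to-English system.

```python
from typing import Callable, List, Tuple

# A training pair is (English source, Pidgin target).
Pair = Tuple[str, str]


def back_translate(
    monolingual_pidgin: List[str],
    pidgin_to_english: Callable[[str], str],
) -> List[Pair]:
    """Create synthetic (English, Pidgin) pairs from Pidgin-only text.

    The Pidgin side is real text; only the English side is machine-generated.
    """
    return [(pidgin_to_english(sentence), sentence) for sentence in monolingual_pidgin]


def build_training_set(
    parallel: List[Pair],
    monolingual_pidgin: List[str],
    pidgin_to_english: Callable[[str], str],
) -> List[Pair]:
    """Concatenate real parallel pairs with back-translated synthetic pairs."""
    return list(parallel) + back_translate(monolingual_pidgin, pidgin_to_english)


# Hypothetical toy reverse model: a word-level lookup used only so the
# example runs. A real setup would call a trained translation model here.
TOY_LEXICON = {"wetin": "what", "dey": "is", "happen": "happening"}


def toy_pidgin_to_english(sentence: str) -> str:
    return " ".join(TOY_LEXICON.get(word, word) for word in sentence.lower().split())


if __name__ == "__main__":
    real_pairs = [("how are you", "how you dey")]
    pidgin_only = ["wetin dey happen"]
    for english, pidgin in build_training_set(real_pairs, pidgin_only, toy_pidgin_to_english):
        print(f"{english}\t{pidgin}")
```

Passing the reverse model in as a callable keeps the augmentation step independent of any particular translation library, so the same function can wrap whichever Pidgin-to-English model is available.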
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems
Methods: Balanced Selection
