Bridging the Gap between Language Models and Cross-Lingual Sequence Labeling
Nuo Chen, Linjun Shou, Ming Gong, Jian Pei, Daxin Jiang

TL;DR
This paper introduces novel pre-training tasks and regularization techniques to improve cross-lingual sequence labeling by bridging the training objective gap and enhancing language alignment, leading to superior performance especially in low-resource scenarios.
Contribution
It proposes Cross-lingual Language Informative Span Masking (CLISM) and ContrAstive-Consistency Regularization (CACR) to address the objective gap and improve cross-lingual alignment in pre-trained language models.
Findings
Achieves superior results on multiple cross-lingual benchmarks.
Outperforms previous methods significantly in few-shot settings.
Effectively bridges the pretrain-finetune gap and enhances language alignment.
Abstract
Large-scale cross-lingual pre-trained language models (xPLMs) have shown effectiveness in cross-lingual sequence labeling tasks (xSL), such as cross-lingual machine reading comprehension (xMRC), by transferring knowledge from a high-resource language to low-resource languages. Despite this success, we observe empirically that there is a training objective gap between the pre-training and fine-tuning stages: for example, the masked language modeling objective requires local understanding of the masked token, whereas the span-extraction objective requires global understanding of and reasoning over the input passage/paragraph and question, leading to a discrepancy between pre-training and xMRC. In this paper, we first design a pre-training task tailored for xSL, named Cross-lingual Language Informative Span Masking (CLISM), to eliminate the objective gap in a self-supervised manner. Second, we present…
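The objective gap described in the abstract can be made concrete with a small sketch. The code below is not the paper's CLISM or CACR method. It only contrasts the generic textbook form of the two losses the abstract mentions: a masked-language-modeling loss, which scores one masked position against the vocabulary, and a span-extraction loss, which scores start and end boundaries against every position in the passage. The function names and toy logits are hypothetical and chosen for illustration only.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax over a list of floats."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def mlm_loss(vocab_logits, target_token_id):
    """Cross-entropy over the vocabulary at ONE masked position.

    The prediction target is a single token, so the loss depends
    on the scores for that one position only.
    """
    return -log_softmax(vocab_logits)[target_token_id]


def span_extraction_loss(start_logits, end_logits, start_idx, end_idx):
    """Mean of two cross-entropies, each taken over ALL passage positions.

    The model must rank every position in the input as a candidate
    start and as a candidate end of the answer span.
    """
    start_nll = -log_softmax(start_logits)[start_idx]
    end_nll = -log_softmax(end_logits)[end_idx]
    return (start_nll + end_nll) / 2.0


# Toy inputs (hypothetical): a 4-word vocabulary and a 5-token passage.
vocab_logits = [0.1, 2.0, -1.0, 0.3]
start_logits = [0.2, 3.0, 0.1, -0.5, 0.0]
end_logits = [0.0, 0.1, 0.2, 2.5, -1.0]

print("MLM loss:", mlm_loss(vocab_logits, target_token_id=1))
print("Span loss:", span_extraction_loss(start_logits, end_logits, 1, 3))
```

In this simplified view, the first loss normalizes over vocabulary entries at a single position, while the second normalizes over positions in the whole input. That difference is the kind of mismatch between pre-training and xMRC fine-tuning that the abstract refers to.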
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
Methods: Contrastive Learning
