Bridging the Gap between Language Models and Cross-Lingual Sequence Labeling

Nuo Chen; Linjun Shou; Ming Gong; Jian Pei; Daxin Jiang

arXiv:2204.05210·cs.CL·April 12, 2022

Open Access

TL;DR

This paper introduces a span-masking pre-training task and a contrastive consistency regularizer that improve cross-lingual sequence labeling by closing the objective gap between pre-training and fine-tuning and by strengthening language alignment, with gains that are most pronounced in low-resource scenarios.

Contribution

The paper proposes Cross-lingual Language Informative Span Masking (CLISM) to address the objective gap between pre-training and fine-tuning, and ContrAstive-Consistency Regularization (CACR) to improve cross-lingual alignment in pre-trained language models.

Findings

01 Achieves superior results on multiple cross-lingual benchmarks.

02 Outperforms previous methods significantly in few-shot settings.

03 Effectively bridges the pretrain-finetune gap and enhances language alignment.

Abstract

Large-scale cross-lingual pre-trained language models (xPLMs) have shown effectiveness in cross-lingual sequence labeling tasks (xSL), such as cross-lingual machine reading comprehension (xMRC), by transferring knowledge from a high-resource language to low-resource languages. Despite this success, we observe empirically that there is a training objective gap between the pre-training and fine-tuning stages: for example, the masked language modeling objective requires local understanding of the masked token, whereas the span-extraction objective requires global understanding of, and reasoning over, the input passage/paragraph and question, leading to a discrepancy between pre-training and xMRC. In this paper, we first design a pre-training task tailored for xSL, named Cross-lingual Language Informative Span Masking (CLISM), to eliminate the objective gap in a self-supervised manner. Second, we present…
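The contrast the abstract draws between token-level prediction and span extraction can be illustrated with a toy sketch. This is not the paper's CLISM procedure, which the truncated abstract does not spell out; the function names `mask_tokens` and `mask_span`, the `[MASK]` string, and the example sentence are all hypothetical and chosen only for illustration.

```python
import random

MASK = "[MASK]"  # placeholder symbol; an assumption, not taken from the paper


def mask_tokens(tokens, ratio, rng):
    """Token-level masking: hide independently sampled positions.

    Each target is a single token, so a model can often recover it
    from nearby context alone.
    """
    n = max(1, round(len(tokens) * ratio))
    positions = set(rng.sample(range(len(tokens)), n))
    masked = [MASK if i in positions else t for i, t in enumerate(tokens)]
    targets = {i: tokens[i] for i in sorted(positions)}
    return masked, targets


def mask_span(tokens, span_len, rng):
    """Span-level masking: hide one contiguous span.

    The target is the (start, end) boundary pair, which has the same
    shape as a span-extraction label in reading comprehension.
    """
    if not 0 < span_len <= len(tokens):
        raise ValueError("span_len must be between 1 and len(tokens)")
    start = rng.randrange(0, len(tokens) - span_len + 1)
    end = start + span_len - 1  # inclusive end index
    masked = tokens[:start] + [MASK] * span_len + tokens[end + 1:]
    return masked, (start, end), tokens[start:end + 1]


tokens = "where is the answer ? the answer is in paris".split()
print(mask_tokens(tokens, 0.2, random.Random(0)))
print(mask_span(tokens, 3, random.Random(0)))
```

In the first function the supervision is a set of isolated tokens; in the second it is a pair of boundaries over a contiguous region, which is closer in form to what a span-extraction fine-tuning task asks for.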

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications

Methods: Contrastive Learning