From Good to Best: Two-Stage Training for Cross-lingual Machine Reading Comprehension

Nuo Chen; Linjun Shou; Min Gong; Jian Pei; Daxin Jiang

arXiv:2112.04735·cs.LG·December 10, 2021

TL;DR

This paper introduces a two-stage training method for cross-lingual machine reading comprehension that improves answer recall and precision, significantly outperforming existing baselines on benchmark datasets.

Contribution

It proposes a novel two-stage training approach combining hard-learning and contrastive learning to enhance cross-lingual MRC performance.

Findings

1. Significant performance improvements over strong baselines.

2. Effective recall and precision enhancement through two-stage training.

3. Robust results on multiple cross-lingual MRC benchmarks.

Abstract

Cross-lingual Machine Reading Comprehension (xMRC) is challenging due to the lack of training data in low-resource languages. Recent approaches fine-tune large-scale cross-lingual pre-trained language models using training data only in a resource-rich language such as English. Because languages differ substantially, a model fine-tuned only on a source language may not perform well on target languages. Interestingly, we observe that while the top-1 results predicted by previous approaches often fail to hit the ground-truth answers, the correct answers are often contained in the top-k predicted results. Based on this observation, we develop a two-stage approach to enhance model performance. The first stage targets recall: we design a hard-learning (HL) algorithm to maximize the likelihood that the top-k predictions contain the accurate answer. The second stage…
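The top-1 versus top-k observation in the abstract can be made concrete with a small sketch. The code below is not the paper's hard-learning algorithm; it only illustrates, under assumptions, how one might measure whether the gold answer span appears among the k best candidate spans. The function names `top_k_spans` and `hit_at_k`, the span scoring rule (start score plus end score), and the `max_len` limit are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple

Span = Tuple[int, int]  # (start_token, end_token), both inclusive


def top_k_spans(start_scores: Sequence[float],
                end_scores: Sequence[float],
                k: int,
                max_len: int = 30) -> List[Span]:
    """Return the k highest-scoring spans.

    Assumption: a span's score is start_scores[s] + end_scores[e],
    a common convention for extractive readers, not taken from the paper.
    """
    candidates = []
    for s, s_score in enumerate(start_scores):
        # Only consider spans that end at or after the start, up to max_len tokens.
        for e in range(s, min(s + max_len, len(end_scores))):
            candidates.append((s_score + end_scores[e], (s, e)))
    candidates.sort(key=lambda c: c[0], reverse=True)
    return [span for _, span in candidates[:k]]


def hit_at_k(predictions: Sequence[Sequence[Span]],
             gold: Sequence[Span],
             k: int) -> float:
    """Fraction of examples whose gold span is among the first k predictions."""
    hits = sum(1 for preds, g in zip(predictions, gold) if g in preds[:k])
    return hits / len(gold)


# Toy example with made-up scores for a single four-token passage.
preds = [top_k_spans([0.0, 2.0, 0.0, 1.0], [0.0, 0.5, 1.5, 2.0], k=3)]
gold = [(1, 2)]
print(hit_at_k(preds, gold, k=1), hit_at_k(preds, gold, k=3))
```

In this toy case the best-scoring span is (1, 3), so the top-1 prediction misses the gold span (1, 2), while the top-3 list contains it. That gap between top-1 accuracy and top-k recall is the situation the abstract describes.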

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques

Methods: Contrastive Learning