Unicoder: A Universal Language Encoder by Pre-training with Multiple Cross-lingual Tasks
Haoyang Huang, Yaobo Liang, Nan Duan, Ming Gong, Linjun Shou, Daxin Jiang, Ming Zhou

TL;DR
Unicoder is a universal language encoder trained with novel cross-lingual tasks, enabling effective transfer of NLP models across multiple languages with improved accuracy on inference and question answering tasks.
Contribution
The paper introduces three new cross-lingual pre-training tasks that enhance multilingual understanding and demonstrates improved performance over baseline models like XLM.
Findings
Achieved a 1.8% average accuracy improvement over the XLM baseline on XNLI across 15 languages.
Achieved a 5.5% accuracy improvement over the XLM baseline on XQA for French and German.
Fine-tuning on multiple languages simultaneously further boosts performance.
Abstract
We present Unicoder, a universal language encoder that is insensitive to different languages. Given an arbitrary NLP task, a model can be trained with Unicoder using training data in one language and directly applied to inputs of the same task in other languages. Compared to similar efforts such as Multilingual BERT and XLM, three new cross-lingual pre-training tasks are proposed, including cross-lingual word recovery, cross-lingual paraphrase classification and cross-lingual masked language model. These tasks help Unicoder learn the mappings among different languages from more perspectives. We also find that doing fine-tuning on multiple languages together can bring further improvement. Experiments are performed on two tasks: cross-lingual natural language inference (XNLI) and cross-lingual question answering (XQA), where XLM is our baseline. On XNLI, 1.8% averaged accuracy…
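The abstract names cross-lingual paraphrase classification but does not say how its training examples are built. The sketch below shows one plausible construction, stated here purely as an assumption: aligned sentence pairs from a parallel corpus serve as positives, and mismatched pairs sampled from the same corpus serve as negatives. The function name build_paraphrase_pairs, the seed parameter, and the toy English-French sentences are all hypothetical and are not taken from the paper.

```python
import random


def build_paraphrase_pairs(parallel, seed=0):
    """Return (sentence_a, sentence_b, label) triples from aligned sentence pairs.

    label 1: the two sentences are translations of each other.
    label 0: the second sentence was taken from a different aligned pair.
    This sampling scheme is an assumption made for illustration.
    """
    if len(parallel) < 2:
        raise ValueError("need at least two aligned pairs to sample negatives")
    rng = random.Random(seed)
    examples = []
    for i, (src, tgt) in enumerate(parallel):
        # Positive example: the aligned translation.
        examples.append((src, tgt, 1))
        # Negative example: pick any index other than i.
        j = rng.randrange(len(parallel) - 1)
        if j >= i:
            j += 1
        examples.append((src, parallel[j][1], 0))
    return examples


# Toy English-French data, invented for illustration only.
toy_corpus = [
    ("The cat sleeps.", "Le chat dort."),
    ("I like tea.", "J'aime le thé."),
    ("It is raining.", "Il pleut."),
]
pairs = build_paraphrase_pairs(toy_corpus)
for sentence_a, sentence_b, label in pairs:
    print(label, "|", sentence_a, "|", sentence_b)
```

Each triple would then be fed to the encoder as a sentence pair with a binary label. How the paper actually selects negatives may differ from the uniform sampling assumed here.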
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Linear Layer · Residual Connection · Attention Dropout · Linear Warmup With Linear Decay · Byte Pair Encoding · Weight Decay · XLM · Dense Connections · Adam
