Lightweight Cross-Lingual Sentence Representation Learning
Zhuoyuan Mao, Prakhar Gupta, Pei Wang, Chenhui Chu, Martin Jaggi, and Sadao Kurohashi

TL;DR
This paper presents a lightweight, 2-layer cross-lingual sentence embedding model trained with novel tasks, including cross-lingual token-level reconstruction and contrastive learning, that achieves competitive performance with reduced memory requirements.
Contribution
Introduces a novel lightweight dual-transformer architecture with new training tasks for efficient cross-lingual sentence representation learning.
Findings
Effective on cross-lingual sentence retrieval tasks
Improved multilingual document classification accuracy
More memory-efficient than larger models
Abstract
Large-scale models for learning fixed-dimensional cross-lingual sentence representations like LASER (Artetxe and Schwenk, 2019b) lead to significant improvement in performance on downstream tasks. However, further increases and modifications based on such large-scale models are usually impractical due to memory limitations. In this work, we introduce a lightweight dual-transformer architecture with just 2 layers for generating memory-efficient cross-lingual sentence representations. We explore different training tasks and observe that current cross-lingual training tasks leave a lot to be desired for this shallow architecture. To ameliorate this, we propose a novel cross-lingual language model, which combines the existing single-word masked language model with the newly proposed cross-lingual token-level reconstruction task. We further augment the training task by the introduction of…
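The abstract is cut off before it describes the additional training tasks, and the TL;DR names contrastive learning as one of them without giving the exact formulation. As a rough illustration only, the sketch below shows a generic symmetric in-batch contrastive objective over parallel sentence embeddings, where each source sentence should score highest against its own translation. The function name `in_batch_contrastive_loss`, the temperature value, and the toy vectors are all assumptions made for this example and are not taken from the paper.

```python
import math

def _normalize(vec):
    # Scale a vector to unit length so dot products become cosine similarities.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def _row_cross_entropy(logits, target):
    # Numerically stable cross-entropy for one row of logits.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[target]

def in_batch_contrastive_loss(src, tgt, temperature=0.1):
    """Hypothetical symmetric in-batch contrastive loss.

    src[i] and tgt[i] are assumed to be embeddings of a translation pair;
    every other row in the batch acts as a negative example.
    """
    src = [_normalize(v) for v in src]
    tgt = [_normalize(v) for v in tgt]
    # Cosine similarity matrix scaled by the (assumed) temperature.
    sim = [[sum(a * b for a, b in zip(s, t)) / temperature for t in tgt]
           for s in src]
    n = len(sim)
    # Source-to-target direction: rows of the similarity matrix.
    s2t = sum(_row_cross_entropy(sim[i], i) for i in range(n)) / n
    # Target-to-source direction: columns of the similarity matrix.
    t2s = sum(_row_cross_entropy([sim[j][i] for j in range(n)], i)
              for i in range(n)) / n
    return 0.5 * (s2t + t2s)

# Toy embeddings: aligned pairs point in nearly the same direction.
src = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
tgt_aligned = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0], [0.0, 0.1, 0.9]]
tgt_shuffled = [tgt_aligned[1], tgt_aligned[2], tgt_aligned[0]]

aligned_loss = in_batch_contrastive_loss(src, tgt_aligned)
shuffled_loss = in_batch_contrastive_loss(src, tgt_shuffled)
```

In this toy setup the loss is lower when translation pairs are correctly aligned than when the targets are shuffled, which is the behavior a sentence-level contrastive task is meant to reward.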
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Contrastive Learning
