TL;DR
CLEAR introduces a reverse-training loss function that enhances cross-lingual retrieval by using English as a bridge, improving performance significantly, especially in low-resource languages.
Contribution
The paper proposes a novel reverse-training scheme that improves cross-lingual alignment and retrieval performance, addressing limitations of contrastive learning methods.
Findings
Achieves up to 15% improvement in cross-lingual retrieval tasks.
Enhances performance in low-resource languages without degrading English results.
Effective even in multilingual training scenarios.
Abstract
Existing multilingual embedding models often encounter challenges in cross-lingual scenarios due to imbalanced linguistic resources and less consideration of cross-lingual alignment during training. Although standardized contrastive learning approaches for cross-lingual adaptation are widely adopted, they may struggle to capture fundamental alignment between languages and degrade performance in well-aligned languages such as English. To address these challenges, we propose Cross-Lingual Enhancement in Retrieval via Reverse-training (CLEAR), a novel loss function utilizing a reverse training scheme to improve retrieval performance across diverse cross-lingual retrieval scenarios. CLEAR leverages an English passage as a bridge to strengthen alignments between the target language and English, ensuring robust performance in the cross-lingual retrieval task. Our extensive experiments…
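To make the "English passage as a bridge" idea concrete, here is a minimal sketch of a bidirectional contrastive objective. It is not the CLEAR loss itself, whose exact form is not given in this excerpt. The sketch assumes a standard in-batch InfoNCE term from target-language queries to English passages, plus a hypothetical "reverse" term in which the English passage is the anchor and the target-language query is the candidate. The names `info_nce`, `bridged_loss`, `reverse_weight`, and `temperature` are illustrative and do not come from the paper.

```python
import math


def _normalize(vec):
    # L2-normalize so dot products become cosine similarities.
    # Assumes a non-zero vector.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def info_nce(anchors, candidates, temperature=0.05):
    """In-batch InfoNCE: anchors[i] is treated as matching candidates[i];
    every other candidate in the batch serves as a negative."""
    anchors = [_normalize(a) for a in anchors]
    candidates = [_normalize(c) for c in candidates]
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [
            sum(x * y for x, y in zip(anchor, cand)) / temperature
            for cand in candidates
        ]
        # Numerically stable log-sum-exp over the batch.
        peak = max(logits)
        log_z = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_z - logits[i]
    return total / len(anchors)


def bridged_loss(target_queries, english_passages,
                 reverse_weight=1.0, temperature=0.05):
    """Hypothetical forward + reverse objective (an assumption, not the
    paper's formula): the forward term retrieves the English passage from
    the target-language query, the reverse term swaps the two roles."""
    forward = info_nce(target_queries, english_passages, temperature)
    reverse = info_nce(english_passages, target_queries, temperature)
    return forward + reverse_weight * reverse
```

With toy embeddings, a batch whose queries line up with their English passages yields a near-zero loss, while a batch with the pairs swapped yields a large one. Setting `reverse_weight=0.0` recovers plain one-directional contrastive training, the baseline the abstract describes as widely adopted.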
