Cross-Lingual Word Alignment for ASEAN Languages with Contrastive Learning
Jingshen Zhang, Xinying Qiu, Teng Shen, Wenyu Wang, Kailin Zhang, Wenhe Feng

TL;DR
This paper introduces a contrastive learning approach within a BiLSTM encoder-decoder model to improve cross-lingual word alignment accuracy for ASEAN languages, especially in low-resource settings.
Contribution
It proposes a novel contrastive learning method with multi-view negative sampling to explicitly model differences in word embeddings for better alignment.
Findings
Contrastive learning improves alignment accuracy across datasets.
The method outperforms previous models in low-resource scenarios.
The approach is validated on five bilingual datasets for ASEAN languages.
Abstract
Cross-lingual word alignment plays a crucial role in various natural language processing tasks, particularly for low-resource languages. A recent study proposes a BiLSTM-based encoder-decoder model that outperforms pre-trained language models in low-resource settings. However, that model only considers the similarity of word embedding spaces and does not explicitly model the differences between word embeddings. To address this limitation, we propose incorporating contrastive learning into the BiLSTM-based encoder-decoder framework. Our approach introduces a multi-view negative sampling strategy to learn the differences between word pairs in the shared cross-lingual embedding space. We evaluate our model on five bilingual aligned datasets spanning four ASEAN languages: Lao, Vietnamese, Thai, and Indonesian. Experimental results demonstrate that integrating contrastive learning…
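The abstract does not spell out the loss function, so the following is only a minimal sketch of how a contrastive objective over word pairs could look, not the authors' implementation. It assumes an InfoNCE-style loss on cosine similarity, and it assumes that "multi-view" means negatives are drawn from both sides of a pair (non-matching target words for a source anchor, and non-matching source words for a target anchor). The names cosine, info_nce, and two_view_loss, the temperature value, and the toy vectors are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (assumes non-zero norms)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss: low when the anchor is closer to the positive
    than to every negative. This loss form is an assumption, not taken
    from the paper."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


def two_view_loss(src, tgt, src_negatives, tgt_negatives, temperature=0.1):
    """Hypothetical two-view variant.

    View 1: the source word is the anchor, contrasted with non-matching
            target words.
    View 2: the target word is the anchor, contrasted with non-matching
            source words.
    """
    view_1 = info_nce(src, tgt, tgt_negatives, temperature)
    view_2 = info_nce(tgt, src, src_negatives, temperature)
    return view_1 + view_2


# Toy 2-d embeddings in a shared space (made up for illustration).
aligned = two_view_loss(
    src=[1.0, 0.0], tgt=[0.9, 0.1],
    src_negatives=[[0.0, 1.0]], tgt_negatives=[[0.1, 0.9]],
)
misaligned = two_view_loss(
    src=[1.0, 0.0], tgt=[0.0, 1.0],
    src_negatives=[[0.9, 0.1]], tgt_negatives=[[0.9, 0.1]],
)
print(aligned < misaligned)
```

In this toy setup the well-aligned pair receives a much smaller loss than the misaligned one, which is the behavior a contrastive term is meant to encourage: translation pairs are pulled together while non-matching words are pushed apart in the shared embedding space.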
Taxonomy
Topics: Natural Language Processing Techniques · Speech and dialogue systems · Text Readability and Simplification
Methods: Sparse Evolutionary Training · Contrastive Learning
