Cross-Domain Bilingual Lexicon Induction via Pretrained Language Models
Qiuyu Ding, Zhiqiang Cao, Hailong Cao, Tiejun Zhao

TL;DR
This paper introduces a new cross-domain bilingual lexicon induction method leveraging pretrained language models and code-switching techniques, improving translation accuracy in specialized fields.
Contribution
Proposes a novel approach combining pretrained models and code-switching for better cross-domain bilingual lexicon induction, addressing domain-specific challenges.
Findings
Improved performance over baseline methods on three domain datasets
Achieved an average increase of 0.78 points in translation accuracy
Demonstrated effectiveness in specialized and medical domains
Abstract
Bilingual Lexicon Induction (BLI) generally trains monolingual word embeddings on general-domain data, then aligns them to obtain cross-lingual embeddings from which word translation pairs are retrieved. In this paper, we propose a new BLI task: using monolingual corpora from the general domain and the target domain to extract domain-specific bilingual dictionaries. Motivated by the capabilities of pre-trained models, we propose a method that builds on recent work on BLI to obtain better word embeddings. We are also the first to introduce Code Switch (Qin et al., 2020) into the cross-domain BLI task, which can match differ… It is yet to be seen whether these methods are suitable for bilingual lexicon extraction in professional fields. As Table 1 shows, the classic and efficient BLI approaches, MUSE and VecMap, perform much worse on the…
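The align-then-retrieve pipeline that the abstract describes for standard BLI can be illustrated with a minimal sketch. It assumes a small seed dictionary and uses orthogonal Procrustes alignment followed by cosine nearest-neighbour retrieval, a common baseline formulation rather than the method proposed in this paper. The function names `procrustes_align` and `induce_lexicon` are hypothetical, and the embeddings are synthetic toy data.

```python
import numpy as np


def procrustes_align(X, Y):
    """Return the orthogonal matrix W that minimizes ||X @ W - Y||_F.

    X and Y hold the embeddings of seed translation pairs, row-aligned.
    """
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt


def induce_lexicon(src_emb, tgt_emb, W):
    """Map source embeddings with W, then return, for every source word,
    the index of the target word with the highest cosine similarity."""
    mapped = src_emb @ W
    mapped = mapped / np.linalg.norm(mapped, axis=1, keepdims=True)
    tgt = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
    return np.argmax(mapped @ tgt.T, axis=1)


# Toy data (assumption): the "target" space is an exact rotation of the
# "source" space, so word i in the source translates to word i in the target.
rng = np.random.default_rng(0)
src = rng.normal(size=(50, 8))
Q, _ = np.linalg.qr(rng.normal(size=(8, 8)))  # random orthogonal matrix
tgt = src @ Q

# Learn the mapping from the first 20 pairs only, then translate all 50 words.
W = procrustes_align(src[:20], tgt[:20])
pred = induce_lexicon(src, tgt, W)
```

On real data the two embedding spaces are only approximately related by a rotation, and the mismatch tends to grow when the embeddings come from a different domain than the seed dictionary, which is the setting this paper targets.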
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text and Document Classification Technologies
Methods: Sparse Evolutionary Training
