Cross-lingual Transfer for Text Classification with Dictionary-based Heterogeneous Graph
Nuttapong Chairatanakul, Noppayut Sriwatanasakdi, Nontawat Charoenphakdee, Xin Liu, Tsuyoshi Murata

TL;DR
This paper introduces a novel graph neural network approach using bilingual dictionaries and word embeddings for cross-lingual text classification, outperforming pretrained models without relying on large corpora.
Contribution
It proposes DHGNet, a new method that effectively handles heterogeneous bilingual graph data for cross-lingual transfer, even with noisy or automatically generated dictionaries.
Findings
Outperforms pretrained models in cross-lingual tasks
Maintains robustness with noisy or imperfect dictionaries
Requires only task-independent embeddings and dictionaries
Abstract
In cross-lingual text classification, it is required that task-specific training data in high-resource source languages are available, where the task is identical to that of a low-resource target language. However, collecting such training data can be infeasible because of the labeling cost, task characteristics, and privacy concerns. This paper proposes an alternative solution that uses only task-independent word embeddings of high-resource languages and bilingual dictionaries. First, we construct a dictionary-based heterogeneous graph (DHG) from bilingual dictionaries. This opens the possibility to use graph neural networks for cross-lingual transfer. The remaining challenge is the heterogeneity of DHG because multiple languages are considered. To address this challenge, we propose dictionary-based heterogeneous graph neural network (DHGNet) that effectively handles the heterogeneity…
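As a rough illustration of the graph-construction step described in the abstract, the sketch below builds a small heterogeneous graph from bilingual dictionary entries. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `build_dhg` and `neighbors_by_language`, the `(language, word)` node representation, and the toy dictionary entries are all hypothetical choices made for this example.

```python
from collections import defaultdict


def build_dhg(dictionaries):
    """Build a toy dictionary-based heterogeneous graph (illustrative only).

    dictionaries maps a (source_lang, target_lang) pair to a list of
    (source_word, target_word) translation pairs.
    Nodes are (lang, word) tuples, so the node type is the language.
    """
    adjacency = defaultdict(set)
    for (src_lang, tgt_lang), pairs in dictionaries.items():
        for src_word, tgt_word in pairs:
            src_node = (src_lang, src_word)
            tgt_node = (tgt_lang, tgt_word)
            # Assumption: translation edges are stored in both directions.
            adjacency[src_node].add(tgt_node)
            adjacency[tgt_node].add(src_node)
    return adjacency


def neighbors_by_language(adjacency, node):
    """Group a node's neighbors by language, exposing the graph's heterogeneity."""
    grouped = defaultdict(list)
    for lang, word in sorted(adjacency[node]):
        grouped[lang].append(word)
    return dict(grouped)


# Hypothetical toy dictionaries: a target language (th) linked to two
# high-resource languages (en, ja).
toy_dictionaries = {
    ("th", "en"): [("แมว", "cat"), ("สุนัข", "dog")],
    ("th", "ja"): [("แมว", "猫")],
}

graph = build_dhg(toy_dictionaries)
print(neighbors_by_language(graph, ("th", "แมว")))
```

Grouping neighbors by language makes the heterogeneity concrete: a model that aggregates over this graph sees translations from several languages, each of which may live in a different embedding space and so may need separate handling.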
Taxonomy
Topics: Topic Modeling · Text and Document Classification Technologies · Natural Language Processing Techniques
Methods: Graph Neural Network
