T3L: Translate-and-Test Transfer Learning for Cross-Lingual Text Classification
Inigo Jauregi Unanue, Gholamreza Haffari, Massimo Piccardi

TL;DR
This paper introduces T3L, a translate-and-test transfer learning approach for cross-lingual text classification. It keeps translation and classification as separate stages while still allowing the two to be trained end-to-end, and reports improved performance on XNLI, MLDoc, and MultiEURLEX.
Contribution
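The paper proposes a translate-and-test pipeline that couples a neural machine translator, translating from the target language into a high-resource language, with a text classifier trained in that high-resource language. The two components are connected so that gradients can be backpropagated end-to-end, which is intended to improve cross-lingual transfer.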
Findings
Reports performance improvements over baseline methods.
Evaluated on multiple datasets: XNLI, MLDoc, and MultiEURLEX.
Shows that soft translations benefit end-to-end training.
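To make the idea of soft translations concrete, here is a minimal NumPy sketch of one common way to keep a translate-then-classify pipeline differentiable: instead of committing to the single most likely token at each position, the classifier receives the expected embedding under the translator's token distribution. This illustrates the general technique and is not the authors' implementation. The function names, the toy sizes, and the assumption that the translator and classifier share one vocabulary are all hypothetical.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def soft_embed(token_logits, embedding_matrix):
    """Soft translation: expected classifier embedding per position.

    token_logits:     (seq_len, vocab_size) translator scores (hypothetical)
    embedding_matrix: (vocab_size, dim) classifier input embeddings
    Assumes translator and classifier share one vocabulary.
    """
    probs = softmax(token_logits)      # (seq_len, vocab_size)
    return probs @ embedding_matrix    # (seq_len, dim), smooth in the logits


def hard_embed(token_logits, embedding_matrix):
    """Hard translation: look up the argmax token (not differentiable)."""
    token_ids = token_logits.argmax(axis=-1)
    return embedding_matrix[token_ids]


# Toy sizes chosen only for illustration.
rng = np.random.default_rng(0)
seq_len, vocab_size, dim = 4, 10, 8
token_logits = rng.normal(size=(seq_len, vocab_size))
embedding_matrix = rng.normal(size=(vocab_size, dim))

soft = soft_embed(token_logits, embedding_matrix)
hard = hard_embed(token_logits, embedding_matrix)

# Sharpening the distribution recovers the hard lookup in the limit.
sharp = soft_embed(token_logits * 1e6, embedding_matrix)
```

Because `soft_embed` is a smooth function of the translator's scores, a classification loss computed on its output can in principle send gradients back into the translator, whereas the `argmax` in `hard_embed` blocks them.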
Abstract
Cross-lingual text classification leverages text classifiers trained in a high-resource language to perform text classification in other languages with no or minimal fine-tuning (zero/few-shot cross-lingual transfer). Nowadays, cross-lingual text classifiers are typically built on large-scale, multilingual language models (LMs) pretrained on a variety of languages of interest. However, the performance of these models varies significantly across languages and classification tasks, suggesting that the superposition of the language modelling and classification tasks is not always effective. For this reason, in this paper we propose revisiting the classic "translate-and-test" pipeline to neatly separate the translation and classification stages. The proposed approach couples 1) a neural machine translator translating from the targeted language to a high-resource language, with 2) a text…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis
