Defining Boundaries: The Impact of Domain Specification on Cross-Language and Cross-Domain Transfer in Machine Translation
Lia Shahnazaryan, Meriem Beloucif

TL;DR
This paper explores how domain boundaries and linguistic factors affect zero-shot cross-lingual transfer in neural machine translation, highlighting the importance of domain specification for effective in-domain adaptation across multiple languages.
Contribution
It provides an empirical analysis of the influence of domain and language-specific factors on zero-shot cross-lingual NMT transfer, emphasizing the role of domain boundaries.
Findings
Domain characteristics significantly impact transfer effectiveness.
Well-defined domain boundaries improve in-domain transfer.
Cross-lingual transfer varies across different target languages and domains.
Abstract
Recent advancements in neural machine translation (NMT) have revolutionized the field, yet the dependency on extensive parallel corpora limits progress for low-resource languages and domains. Cross-lingual transfer learning offers a promising solution by utilizing data from high-resource languages but often struggles with in-domain NMT. This paper investigates zero-shot cross-lingual domain adaptation for NMT, focusing on the impact of domain specification and linguistic factors on transfer effectiveness. Using English as the source language and Spanish for fine-tuning, we evaluate multiple target languages, including Portuguese, Italian, French, Czech, Polish, and Greek. We demonstrate that both language-specific and domain-specific factors influence transfer effectiveness, with domain characteristics playing a crucial role in determining cross-domain transfer potential. We also…
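The evaluation setup described in the abstract can be pictured as a grid of fine-tuning and evaluation conditions. The following is a minimal sketch, not the authors' code: it assumes one model fine-tuned on English-Spanish data per domain, then scored zero-shot on each listed target language for every domain. The domain names (`domain_a`, `domain_b`) and the helper `build_eval_grid` are hypothetical placeholders, since the abstract does not name the domains.

```python
from itertools import product

SOURCE = "en"            # source language stated in the abstract
FINE_TUNE_TARGET = "es"  # fine-tuning target stated in the abstract
# Zero-shot target languages listed in the abstract (ISO 639-1 codes).
ZERO_SHOT_TARGETS = ["pt", "it", "fr", "cs", "pl", "el"]
# Hypothetical placeholders: the abstract does not name the domains.
DOMAINS = ["domain_a", "domain_b"]


def build_eval_grid(domains, targets):
    """Return one record per (fine-tune domain, eval domain, target language)."""
    grid = []
    for ft_domain, eval_domain, tgt in product(domains, domains, targets):
        grid.append({
            "fine_tune_pair": (SOURCE, FINE_TUNE_TARGET),
            "fine_tune_domain": ft_domain,
            "eval_pair": (SOURCE, tgt),
            "eval_domain": eval_domain,
            # In-domain when the evaluation domain matches the fine-tuning domain.
            "in_domain": ft_domain == eval_domain,
        })
    return grid


grid = build_eval_grid(DOMAINS, ZERO_SHOT_TARGETS)
```

Splitting the records on the `in_domain` flag separates in-domain from cross-domain transfer, which are the two conditions the abstract contrasts.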
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
