ZmBART: An Unsupervised Cross-lingual Transfer Framework for Language Generation

Kaushal Kumar Maurya; Maunendra Sankar Desarkar; Yoshinobu Kano; Kumari Deepshikha

arXiv:2106.01597 · cs.CL · June 4, 2021

TL;DR

ZmBART is an unsupervised cross-lingual transfer framework that enables natural language generation in low-resource languages without using parallel data, leveraging monolingual pre-training and task-specific fine-tuning.

Contribution

This work introduces ZmBART, a novel unsupervised transfer method for NLG that does not rely on parallel data and effectively transfers from high-resource to low-resource languages.

Findings

1. Effective zero-shot transfer to low-resource languages.
2. Improved performance with few-shot training.
3. Robustness demonstrated through ablations and analyses.

Abstract

Despite recent advances in NLP research, cross-lingual transfer for natural language generation is relatively understudied. In this work, we transfer supervision from a high-resource language (HRL) to multiple low-resource languages (LRLs) for natural language generation (NLG). We consider four NLG tasks (text summarization, question generation, news headline generation, and distractor generation) and three syntactically diverse languages: English, Hindi, and Japanese. We propose an unsupervised cross-lingual language generation framework (called ZmBART) that does not use any parallel or pseudo-parallel/back-translated data. In this framework, we further pre-train the mBART sequence-to-sequence denoising auto-encoder model with an auxiliary task using monolingual data from the three languages. The objective function of the auxiliary task is close to the target tasks, which enriches the…
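To make the "denoising auto-encoder" idea in the abstract concrete, here is a minimal sketch of how a (noisy input, clean target) training pair can be built from monolingual text by masking spans of tokens. It is an illustration only: the function name `mask_spans`, the fixed span length, the whitespace tokenization, and the default ratio are all assumptions made for this example. It is a simplified form of span masking and is not ZmBART's auxiliary task, whose details are not given in the truncated abstract above.

```python
import random

MASK = "<mask>"  # placeholder mask symbol (assumption, not tied to any tokenizer)


def mask_spans(tokens, mask_ratio=0.35, span_len=3, seed=0):
    """Replace contiguous token spans with a single MASK symbol each.

    Hypothetical helper: masks at most int(len(tokens) * mask_ratio) tokens,
    using a fixed maximum span length for simplicity.
    """
    rng = random.Random(seed)
    budget = int(len(tokens) * mask_ratio)  # how many tokens may be hidden
    noisy, i = [], 0
    while i < len(tokens):
        if budget > 0 and rng.random() < mask_ratio:
            # Hide up to span_len tokens behind one mask symbol.
            length = min(span_len, budget, len(tokens) - i)
            noisy.append(MASK)
            budget -= length
            i += length
        else:
            noisy.append(tokens[i])
            i += 1
    return noisy


# Build one denoising pair from a monolingual sentence:
# the model would read `source` and be trained to reconstruct `target`.
target = "cross lingual transfer for language generation is understudied".split()
source = mask_spans(target, seed=1)
print(" ".join(source), "->", " ".join(target))
```

Because the pair is derived from a single sentence, no parallel or translated data is needed, which is the property the abstract emphasizes.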

Code & Models

Repositories

kaushal0494/ZmBART (PyTorch, official)

Taxonomy

Methods: mBART