Towards Unifying Multi-Lingual and Cross-Lingual Summarization

Jiaan Wang; Fandong Meng; Duo Zheng; Yunlong Liang; Zhixu Li; Jianfeng Qu; Jie Zhou

arXiv:2305.09220 · cs.CL · May 17, 2023 · 1 citation

Open Access

TL;DR

This paper unifies multi-lingual and cross-lingual summarization into a single many-to-many summarization (M2MS) setting and introduces Pisces, a pre-trained M2MS model whose three-stage pre-training aims to improve cross-lingual transfer and zero-shot summarization performance.

Contribution

The paper unifies multi-lingual summarization (MLS) and cross-lingual summarization (CLS) into a single many-to-many summarization (M2MS) setting and proposes Pisces, a pre-trained model for this setting.

Findings

01 Pisces outperforms state-of-the-art baselines in zero-shot settings.

02 The unified M2MS setting transfers task knowledge across languages better than MLS or CLS.

03 Three-stage pre-training improves multilingual summarization capabilities.

Abstract

To adapt text summarization to the multilingual world, previous work proposes multi-lingual summarization (MLS) and cross-lingual summarization (CLS). However, these two tasks have been studied separately due to the different definitions, which limits the compatible and systematic research on both of them. In this paper, we aim to unify MLS and CLS into a more general setting, i.e., many-to-many summarization (M2MS), where a single model could process documents in any language and generate their summaries also in any language. As the first step towards M2MS, we conduct preliminary studies to show that M2MS can better transfer task knowledge across different languages than MLS and CLS. Furthermore, we propose Pisces, a pre-trained M2MS model that learns language modeling, cross-lingual ability and summarization ability via three-stage pre-training. Experimental results indicate that our…
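The three settings named in the abstract differ in which (document language, summary language) pairs a model is expected to handle. The sketch below enumerates those pairs for a small language set to make the relationship concrete. The function name `directions` and the language codes are illustrative assumptions, not taken from the paper, and the sketch says nothing about how Pisces itself is trained.

```python
from itertools import product


def directions(languages, setting):
    """Return the (source, target) language pairs covered by a setting.

    Hypothetical helper for illustration only:
      MLS  - document and summary share the same language
      CLS  - document and summary are in different languages
      M2MS - any document language to any summary language
    """
    pairs = list(product(languages, repeat=2))
    if setting == "MLS":
        return [(src, tgt) for src, tgt in pairs if src == tgt]
    if setting == "CLS":
        return [(src, tgt) for src, tgt in pairs if src != tgt]
    if setting == "M2MS":
        return pairs
    raise ValueError(f"unknown setting: {setting}")


# Illustrative language codes (an assumption, not the paper's language set).
langs = ["en", "fr", "zh"]
for name in ("MLS", "CLS", "M2MS"):
    print(name, directions(langs, name))
```

For n languages this gives n same-language directions (MLS), n·(n−1) cross-language directions (CLS), and n² directions in total (M2MS), so the M2MS pairs are exactly the union of the other two sets.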

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis