Unifying Cross-lingual Summarization and Machine Translation with Compression Rate
Yu Bai, Heyan Huang, Kai Fan, Yang Gao, Yiming Zhu, Jiaao Zhan, Zewen Chi, and Boxing Chen

TL;DR
This paper introduces a unified framework for cross-lingual summarization and machine translation using a novel compression rate concept, enhancing training efficiency and summary controllability.
Contribution
The paper proposes the Cross-lingual Summarization with Compression rate (CSC) task, which brings machine translation (MT) data into cross-lingual summarization (CLS) training, together with a data augmentation method covering different compression rates.
Findings
Outperforms strong baselines on three datasets
Improves cross-lingual summarization performance
Enables controllable summary length generation
Abstract
Cross-Lingual Summarization (CLS) is a task that extracts important information from a source document and summarizes it in another language. It is a challenging task that requires a system to understand, summarize, and translate at the same time, making it highly related to Monolingual Summarization (MS) and Machine Translation (MT). In practice, the training resources for Machine Translation are far more abundant than those for cross-lingual and monolingual summarization. Thus, incorporating the Machine Translation corpus into CLS would be beneficial for its performance. However, existing work only leverages a simple multi-task framework to bring Machine Translation in, lacking deeper exploration. In this paper, we propose a novel task, Cross-lingual Summarization with Compression rate (CSC), so that Cross-Lingual Summarization can benefit from large-scale Machine Translation corpora.…
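The truncated abstract above does not define the compression rate, so the following is only a rough sketch under stated assumptions, not the authors' implementation. It assumes the compression rate is the ratio of target length to source length (so an MT pair sits near 100% and a summary pair well below it), uses whitespace tokens as a crude length proxy, and prepends a hypothetical control tag such as `<cr_30>` to the source so one model could be trained on both kinds of data. The function names and the tag format are illustrative inventions.

```python
def compression_rate(source_tokens, target_tokens):
    """Assumed definition: target length divided by source length."""
    if not source_tokens:
        raise ValueError("source must be non-empty")
    return len(target_tokens) / len(source_tokens)


def rate_tag(rate, step_percent=10):
    """Map a rate to a hypothetical control tag, e.g. 0.3 -> '<cr_30>'.

    Rates are clamped to [0, 1] and bucketed to the nearest step.
    """
    clamped = min(max(rate, 0.0), 1.0)
    bucket = round(clamped * 100 / step_percent) * step_percent
    return f"<cr_{bucket}>"


def tag_example(source, target):
    """Prefix the source text with its compression-rate tag.

    Whitespace tokenization is a stand-in; lengths in different
    languages are not directly comparable without a real tokenizer.
    """
    rate = compression_rate(source.split(), target.split())
    return f"{rate_tag(rate)} {source}", target


# A summarization-like pair (10 source tokens, 3 target tokens).
cls_pair = tag_example("a b c d e f g h i j", "x y z")

# A translation-like pair (equal lengths), treated as rate ~100%.
mt_pair = tag_example("a b c d", "w x y z")
```

Under these assumptions, asking for a shorter or longer summary at inference time would amount to choosing a different tag, which is one way the controllable length mentioned in the findings could be exposed.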
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
