Differentiable N-gram Objective on Abstractive Summarization
Yunqi Zhu, Xuebing Yang, Yuanyuan Wu, Mingjin Zhu, and Wensheng Zhang

TL;DR
This paper introduces a differentiable n-gram objective for abstractive summarization that better aligns training with evaluation metrics, leading to improved ROUGE scores on benchmark datasets.
Contribution
It proposes a novel differentiable n-gram objective that weights matched sub-sequences equally, and it optimizes this objective jointly with the cross-entropy loss.
Findings
Achieves higher ROUGE scores on CNN/DM and XSum datasets.
Outperforms existing n-gram objectives in summarization tasks.
Enhances alignment between training objectives and evaluation metrics.
Abstract
ROUGE is a standard n-gram-based automatic evaluation metric for sequence-to-sequence tasks, whereas cross-entropy loss, an essential objective of neural network language models, optimizes at the unigram level. We present differentiable n-gram objectives that attempt to alleviate the discrepancy between the training criterion and the evaluation criterion. The objective maximizes the probabilistic weight of matched sub-sequences. The novelty of our work is that the objective weights the matched sub-sequences equally and does not cap the number of matched sub-sequences at the ground-truth count of n-grams in the reference sequence. We jointly optimize the cross-entropy loss and the proposed objective, which yields a decent ROUGE score improvement on the abstractive summarization datasets CNN/DM and XSum and outperforms alternative n-gram objectives.
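The idea in the abstract can be illustrated with a small sketch. This is one plausible reading of the description, not the authors' formulation: the function names, the normalization by the number of hypothesis n-gram slots, the `1 - score` loss form, and the `weight` parameter are all assumptions made for illustration. It uses plain Python lists in place of tensors; in an autodiff framework the same arithmetic would be differentiable with respect to the token probabilities.

```python
import math

def ngram_match_score(probs, ref, n):
    """Soft n-gram match between predicted token distributions and a reference.

    probs: list of per-position probability lists (one distribution per step).
    ref:   list of reference token ids.
    Assumption: every hypothesis slot may match any distinct reference n-gram,
    each match has equal weight, and matches are NOT clipped to the number of
    times the n-gram occurs in the reference.
    """
    ref_ngrams = {tuple(ref[j:j + n]) for j in range(len(ref) - n + 1)}
    slots = len(probs) - n + 1
    if slots <= 0 or not ref_ngrams:
        return 0.0
    total = 0.0
    for i in range(slots):
        for gram in ref_ngrams:
            # Probability that positions i..i+n-1 emit exactly this n-gram.
            p = 1.0
            for k, tok in enumerate(gram):
                p *= probs[i + k][tok]
            total += p
    # Hypothetical normalization: average over hypothesis slots, so score <= 1.
    return total / slots

def cross_entropy(probs, ref):
    """Mean token-level negative log-likelihood under teacher forcing."""
    return -sum(math.log(p[tok]) for p, tok in zip(probs, ref)) / len(ref)

def joint_loss(probs, ref, n=2, weight=1.0):
    """Cross-entropy plus a weighted n-gram term (weight is an assumed knob)."""
    return cross_entropy(probs, ref) + weight * (1.0 - ngram_match_score(probs, ref, n))

# Usage: a uniform model over a 3-token vocabulary against reference [0, 1, 2].
uniform = [[1 / 3] * 3 for _ in range(3)]
loss = joint_loss(uniform, [0, 1, 2], n=2)
```

Under these assumptions, a hypothesis that confidently emits `0 1 0 1` against the reference `0 1 2` gets a bigram score of 2/3: both occurrences of `(0, 1)` count, even though the reference contains that bigram only once. A clipped, ROUGE-style count would credit it once.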
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Advanced Text Analysis Techniques
