Noised Consistency Training for Text Summarization
Junnan Liu, Qianren Mao, Bang Liu, Hao Peng, Hongdong Zhu, Jianxin Li

TL;DR
This paper introduces a semi-supervised consistency training method that uses noised unlabeled data to improve neural abstractive summarization, reducing reliance on large labeled datasets.
Contribution
It proposes a novel semi-supervised framework using noise-based consistency regularization to enhance summarization models with limited labeled data.
Findings
Adding noised unlabeled data improves summarization performance.
Consistency training regularizes the model so that its predictions are invariant to small noise in the input article.
The method achieves comparable results with less labeled data.
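The findings above describe a consistency-regularization objective. As a rough orientation, one common generic form of such an objective is written out here; this is an assumption for illustration, not the loss stated in the paper. The symbols $D_L$ (labeled set), $D_U$ (unlabeled set), $\tilde{x}$ (noised article), $\lambda$ (weighting factor), and the choice of KL divergence are all hypothetical.

```latex
\mathcal{L}(\theta)
  = \underbrace{\mathbb{E}_{(x,y)\in D_L}\bigl[-\log p_\theta(y \mid x)\bigr]}_{\text{supervised loss on labeled pairs}}
  + \lambda\,
    \underbrace{\mathbb{E}_{x\in D_U}\Bigl[\mathrm{KL}\bigl(p_\theta(\cdot \mid x)\,\big\|\,p_\theta(\cdot \mid \tilde{x})\bigr)\Bigr]}_{\text{consistency loss on unlabeled articles}},
  \qquad \tilde{x} = \mathrm{noise}(x)
```

The second term is zero when the model predicts the same distribution for an article and its noised copy, which is the invariance the findings refer to.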
Abstract
Neural abstractive summarization methods often require large quantities of labeled training data. However, labeling large amounts of summarization data is often prohibitive due to time, financial, and expertise constraints, which has limited the usefulness of summarization systems in practical applications. In this paper, we argue that this limitation can be overcome by a semi-supervised approach: consistency training, which leverages large amounts of unlabeled data to improve the performance of supervised learning over a small corpus. Consistency regularization in semi-supervised learning regularizes model predictions to be invariant to small noise applied to input articles. By adding a noised unlabeled corpus to help regularize consistency training, this framework obtains comparable performance without using the full dataset. In particular, we have verified that leveraging…
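To make the idea in the abstract concrete, here is a minimal sketch of a consistency term computed on one unlabeled article. Everything in it is an assumption for illustration: `toy_model` is a hypothetical bag-of-words stand-in for a real summarizer, word dropout is just one possible noise function, and KL divergence is one possible consistency measure. None of these is confirmed as the paper's actual choice.

```python
import math
import random


def softmax(logits):
    """Turn a list of scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def add_word_dropout(tokens, drop_prob, rng):
    """Hypothetical noise function: drop each token with probability drop_prob."""
    kept = [t for t in tokens if rng.random() >= drop_prob]
    # Never return an empty article; fall back to the first token.
    return kept if kept else tokens[:1]


def toy_model(tokens, vocab):
    """Hypothetical stand-in for a summarizer: token counts used as logits."""
    return softmax([float(tokens.count(w)) for w in vocab])


def consistency_loss(tokens, vocab, drop_prob, rng):
    """Divergence between predictions on the clean and the noised article."""
    clean = toy_model(tokens, vocab)
    noised = toy_model(add_word_dropout(tokens, drop_prob, rng), vocab)
    return kl_divergence(clean, noised)


def total_loss(supervised_nll, consistency, lam=1.0):
    """Supervised loss on labeled data plus weighted consistency loss."""
    return supervised_nll + lam * consistency


if __name__ == "__main__":
    rng = random.Random(0)
    vocab = ["model", "data", "noise", "summary"]
    article = "model data data noise summary model data".split()
    cons = consistency_loss(article, vocab, drop_prob=0.3, rng=rng)
    print("consistency loss:", cons)
    print("total loss:", total_loss(supervised_nll=2.0, consistency=cons, lam=0.5))
```

In a real training loop the supervised term would come from labeled article-summary pairs and the consistency term from a separate, larger batch of unlabeled articles; the weight `lam` controls how strongly the invariance is enforced.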
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
