Leverage Unlabeled Data for Abstractive Speech Summarization with Self-Supervised Learning and Back-Summarization
Paul Tardy, Louis de Seynes, François Hernandez, Vincent Nguyen, David Janiszek, Yannick Estève

TL;DR
This paper explores how to leverage large amounts of unaligned data (reports with no corresponding transcription) for French abstractive meeting summarization, using self-supervised pre-training and back-summarization; both improve markedly on the baseline models.
Contribution
It introduces two approaches for using unaligned data in speech summarization, self-supervised pre-training and back-summarization, and reports substantial performance gains.
Findings
Large ROUGE score improvements over baseline
Effective use of unaligned reports for training
Combining methods yields best results
Abstract
Supervised approaches for Neural Abstractive Summarization require large annotated corpora that are costly to build. We present a French meeting summarization task where reports are predicted based on the automatic transcription of the meeting audio recordings. In order to build a corpus for this task, it is necessary to obtain the (automatic or manual) transcription of each meeting, and then to segment and align it with the corresponding manual report to produce training examples suitable for training. On the other hand, we have access to a very large amount of unaligned data, in particular reports without corresponding transcription. Reports are professionally written and well formatted making pre-processing straightforward. In this context, we study how to take advantage of this massive amount of unaligned data using two approaches (i) self-supervised pre-training using a target-side…
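The back-summarization idea named in the title can be illustrated with a minimal sketch. The abstract above is cut off before it describes the method, so this assumes it works by analogy with back-translation in machine translation: a reverse model (report to transcription) generates a synthetic transcription for each unaligned report, and the resulting synthetic pairs are added to the aligned training data. All names here (`back_summarize`, `build_training_set`, `toy_reverse_model`) are hypothetical and not taken from the paper; the reverse model is a trivial stand-in for a trained sequence-to-sequence model.

```python
from typing import Callable, List, Tuple

# A training example is a (transcription, report) pair: source, then target.
Pair = Tuple[str, str]


def back_summarize(
    unaligned_reports: List[str],
    reverse_model: Callable[[str], str],
) -> List[Pair]:
    """Pair each unaligned report with a synthetic transcription.

    `reverse_model` is assumed to map a report to a plausible transcription.
    """
    return [(reverse_model(report), report) for report in unaligned_reports]


def build_training_set(
    aligned: List[Pair],
    unaligned_reports: List[str],
    reverse_model: Callable[[str], str],
) -> List[Pair]:
    """Concatenate the real aligned pairs with the synthetic ones."""
    return list(aligned) + back_summarize(unaligned_reports, reverse_model)


def toy_reverse_model(report: str) -> str:
    # Stand-in only: a real reverse model would be trained on aligned pairs
    # with source and target swapped. Here we just fake spoken-style text.
    return "euh " + report.lower()


aligned = [("bonjour à tous euh on commence", "La réunion commence.")]
reports = ["Le budget est voté."]
train = build_training_set(aligned, reports, toy_reverse_model)
```

The target side of every synthetic pair is a genuine, professionally written report; only the source side is generated. That is the usual motivation for this kind of augmentation, since the forward summarization model still learns to produce real reports.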
