On Uncertainty Calibration and Selective Generation in Probabilistic Neural Summarization: A Benchmark Study

Polina Zablotskaia; Du Phan; Joshua Maynez; Shashi Narayan; Jie Ren; Jeremiah Liu

arXiv:2304.08653·cs.CL·April 19, 2023·1 cites

Open Access

TL;DR

This study evaluates various probabilistic deep learning methods for neural summarization, demonstrating their ability to improve uncertainty calibration and selective generation, while highlighting their limitations across diverse benchmarks.

Contribution

It provides a comprehensive benchmark analysis of probabilistic methods in neural summarization, revealing their strengths and failure modes in uncertainty calibration and selective abstention.

Findings

01  Probabilistic methods improve uncertainty calibration and summary quality.

02  They enhance selective generation by abstaining from low-quality outputs.

03  Certain methods like Deep Ensemble and Monte Carlo Dropout have notable failure patterns.
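
The second finding, abstaining from low-quality outputs, can be illustrated with a minimal sketch. It assumes each generated summary comes with a scalar confidence score (for example a length-normalized log-probability) and that the system abstains on a fixed fraction of the least confident inputs. The function name, the `abstain_rate` parameter, and the example scores are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def selective_generation(
    scored: List[Tuple[str, float]], abstain_rate: float
) -> List[Tuple[str, float]]:
    """Keep the most confident summaries and abstain on the rest.

    `scored` holds (summary, confidence) pairs; higher confidence is better.
    `abstain_rate` is the fraction of inputs to abstain on (hypothetical knob).
    """
    if not 0.0 <= abstain_rate <= 1.0:
        raise ValueError("abstain_rate must be in [0, 1]")
    # Number of outputs that are actually returned to the user.
    n_keep = len(scored) - int(len(scored) * abstain_rate)
    # Rank by confidence, most confident first, and drop the tail.
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return ranked[:n_keep]


# Illustrative scores only (assumed log-probabilities, not real results).
examples = [
    ("summary A", -0.2),
    ("summary B", -1.7),
    ("summary C", -0.5),
    ("summary D", -3.1),
]
kept = selective_generation(examples, abstain_rate=0.5)
```

Whether this helps depends on how well the confidence score tracks actual summary quality: if the score is miscalibrated, the ranking may discard good summaries and keep poor ones.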

Abstract

Modern deep models for summarization attain impressive benchmark performance, but they are prone to generating miscalibrated predictive uncertainty. This means that they assign high confidence to low-quality predictions, leading to compromised reliability and trustworthiness in real-world applications. Probabilistic deep learning methods are common solutions to the miscalibration problem. However, their relative effectiveness in complex autoregressive summarization tasks is not well understood. In this work, we thoroughly investigate the effectiveness of different state-of-the-art probabilistic methods in improving the uncertainty quality of neural summarization models, across three large-scale benchmarks of varying difficulty. We show that the probabilistic methods consistently improve the model's generation and uncertainty quality, leading to improved selective generation…
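
Miscalibration in the sense described above (high confidence assigned to low-quality predictions) is often quantified with expected calibration error (ECE). The sketch below is a generic binned ECE, not necessarily the metric used in the paper. It makes a simplifying assumption: each prediction has a confidence in [0, 1] and a binary label saying whether the output was judged acceptable. The function name and `n_bins` default are illustrative choices.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float], correct: Sequence[bool], n_bins: int = 10
) -> float:
    """Binned ECE: weighted mean gap between confidence and accuracy."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have equal length")
    n = len(confidences)
    if n == 0:
        return 0.0
    # Group predictions into equal-width confidence bins over [0, 1].
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in last bin
        bins[idx].append((conf, ok))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1.0 for _, ok in members if ok) / len(members)
        # Weight each bin's gap by the share of predictions it holds.
        ece += (len(members) / n) * abs(avg_conf - accuracy)
    return ece


# An overconfident model: 90% confidence, but only 1 of 4 outputs acceptable.
overconfident = expected_calibration_error(
    [0.9, 0.9, 0.9, 0.9], [True, False, False, False]
)
```

A well-calibrated model would yield a value near zero; the overconfident example yields a large gap because stated confidence far exceeds observed accuracy.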

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques