Revisiting Zero-Shot Abstractive Summarization in the Era of Large Language Models from the Perspective of Position Bias
Anshuman Chhabra, Hadi Askari, Prasant Mohapatra

TL;DR
This paper investigates position bias in large language models during zero-shot abstractive summarization: the tendency of a model to unfairly prioritize information from certain parts of the input text over others. It measures this bias across several models and diverse datasets.
Contribution
It introduces a general formulation of position bias, systematically measures it across multiple models and datasets, and provides insights into its impact on zero-shot summarization performance.
Findings
Position bias varies significantly across models and datasets.
Large language models exhibit a tendency to favor certain input positions.
Understanding position bias can guide improvements in summarization models.
Abstract
We characterize and study zero-shot abstractive summarization in Large Language Models (LLMs) by measuring position bias, which we propose as a general formulation of the more restrictive lead bias phenomenon studied previously in the literature. Position bias captures the tendency of a model to unfairly prioritize information from certain parts of the input text over others, leading to undesirable behavior. Through numerous experiments on four diverse real-world datasets, we study position bias in multiple LLMs such as GPT 3.5-Turbo, Llama-2, and Dolly-v2, as well as state-of-the-art pretrained encoder-decoder abstractive summarization models such as Pegasus and BART. Our findings lead to novel insights and discussion on performance and position bias of models for zero-shot summarization tasks.
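The abstract does not say how position bias is computed, so the following is only a minimal sketch of one way such a measurement could be set up, not the authors' procedure. It assumes (hypothetically) that each summary sentence is mapped to its most similar source sentence by Jaccard token overlap, that the source is divided into k equal segments, and that a model's bias is read off as the 1-D Wasserstein distance between its segment histogram and that of a reference summary. All function names here are illustrative.

```python
import re


def split_sentences(text):
    # Naive splitter on ., !, ? followed by whitespace (an assumption for illustration).
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]


def tokens(sentence):
    # Lowercased alphanumeric tokens as a set.
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def map_to_source(summary_sentence, source_sentences):
    # Index of the source sentence with the highest Jaccard overlap (ties go to the earliest).
    s = tokens(summary_sentence)
    best_idx, best_score = 0, -1.0
    for i, src in enumerate(source_sentences):
        t = tokens(src)
        union = s | t
        score = len(s & t) / len(union) if union else 0.0
        if score > best_score:
            best_idx, best_score = i, score
    return best_idx


def position_histogram(summary, source, k=3):
    # Fraction of summary sentences that map into each of k equal source segments.
    src = split_sentences(source)
    counts = [0] * k
    if not src:
        return [0.0] * k
    for sent in split_sentences(summary):
        idx = map_to_source(sent, src)
        segment = min(idx * k // len(src), k - 1)
        counts[segment] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * k


def wasserstein_1d(p, q):
    # For histograms on the same unit-spaced ordered bins, W1 is the sum of |CDF_p - CDF_q|.
    dist, cp, cq = 0.0, 0.0, 0.0
    for a, b in zip(p, q):
        cp += a
        cq += b
        dist += abs(cp - cq)
    return dist


# Toy data, invented for this example only.
source = (
    "The council approved the budget on Monday. "
    "Officials debated road repairs for hours. "
    "Schools will receive new funding next year. "
    "Critics say the plan ignores housing. "
    "A final vote on housing is expected in June."
)
model_summary = "The council approved the budget on Monday. Officials debated road repairs."
reference_summary = "The council approved the budget. A final vote on housing is expected in June."

model_hist = position_histogram(model_summary, source)
reference_hist = position_histogram(reference_summary, source)
print(model_hist, reference_hist, wasserstein_1d(model_hist, reference_hist))
```

In this toy case the model summary draws only on the first segment of the source (histogram [1.0, 0.0, 0.0]) while the reference covers the beginning and the end ([0.5, 0.0, 0.5]), giving a distance of 1.0. A distance of 0 would mean the two summaries draw on the source segments in the same proportions. Lead bias is the special case in which the extra mass sits in the first segment.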
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Cosine Annealing · Adam · Discriminative Fine-Tuning · Layer Normalization · Residual Connection · Attention Dropout
