Revisiting Zero-Shot Abstractive Summarization in the Era of Large Language Models from the Perspective of Position Bias

Anshuman Chhabra; Hadi Askari; Prasant Mohapatra

arXiv:2401.01989·cs.CL·March 20, 2024·1 cites

TL;DR

This paper studies position bias in large language models during zero-shot abstractive summarization: the tendency of a model to unfairly prioritize information from certain parts of the input text over others. The bias is measured for several models on four real-world datasets.

Contribution

The paper proposes position bias as a general formulation of the more restrictive lead bias phenomenon, measures it across multiple models and four datasets, and discusses how it relates to zero-shot summarization performance.

Findings

01 Position bias varies significantly across models and datasets.

02 Large language models exhibit a tendency to favor certain input positions.

03 Understanding position bias can guide improvements in summarization models.

Abstract

We characterize and study zero-shot abstractive summarization in Large Language Models (LLMs) by measuring position bias, which we propose as a general formulation of the more restrictive lead bias phenomenon studied previously in the literature. Position bias captures the tendency of a model to unfairly prioritize information from certain parts of the input text over others, leading to undesirable behavior. Through numerous experiments on four diverse real-world datasets, we study position bias in multiple LLMs such as GPT 3.5-Turbo, Llama-2, and Dolly-v2, as well as state-of-the-art pretrained encoder-decoder abstractive summarization models such as Pegasus and BART. Our findings lead to novel insights and discussion on the performance and position bias of models for zero-shot summarization tasks.
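The abstract does not spell out how position bias is measured, so the following is only an illustrative sketch of one way the idea could be made concrete, not the authors' method. It assumes each summary sentence can be attributed to the input sentence it overlaps most (using Jaccard token overlap as a stand-in similarity) and then reports the share of summary sentences that map to each segment of the input. All function names, the naive sentence splitter, and the choice of three bins are hypothetical.

```python
import re


def split_sentences(text):
    """Naive sentence splitter (assumption: sentences end in ., ! or ?)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokens(sentence):
    """Lowercased set of alphanumeric tokens in a sentence."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def jaccard(a, b):
    """Jaccard overlap between two token sets (0.0 if either is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def position_histogram(article, summary, num_bins=3):
    """Fraction of summary sentences attributed to each input segment.

    Each summary sentence is mapped to the most similar article sentence,
    and that sentence's index is bucketed into one of `num_bins` equal
    segments of the article (first segment = start of the input).
    """
    article_sents = split_sentences(article)
    if not article_sents:
        return [0.0] * num_bins
    article_tokens = [tokens(s) for s in article_sents]

    counts = [0] * num_bins
    for sent in split_sentences(summary):
        sent_tokens = tokens(sent)
        # Index of the best-matching article sentence (first one on ties).
        best = max(
            range(len(article_sents)),
            key=lambda i: jaccard(sent_tokens, article_tokens[i]),
        )
        bin_idx = min(best * num_bins // len(article_sents), num_bins - 1)
        counts[bin_idx] += 1

    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * num_bins


article = (
    "The cat sat on the mat. Dogs bark loudly at night. "
    "Birds fly south in winter. Fish swim in the deep sea. "
    "Horses run across open fields. Bees make honey in hives."
)
summary = "The cat sat on a mat. Dogs bark at night."
print(position_histogram(article, summary))  # [1.0, 0.0, 0.0]
```

In this toy example both summary sentences trace back to the first third of the input, which is the pattern lead bias describes. A histogram concentrated in the middle or final segment would be a position bias that the narrower notion of lead bias does not cover.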

Code & Models

Repositories

anshuman23/llm_position_bias (PyTorch, official)

Videos

Revisiting Zero-Shot Abstractive Summarization in the Era of Large Language Models from the Perspective of Position Bias · Underline

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Cosine Annealing · Adam · Discriminative Fine-Tuning · Layer Normalization · Residual Connection · Attention Dropout