Multi-Dimensional Evaluation of Text Summarization with In-Context Learning

Sameer Jain; Vaishakh Keshava; Swarnashree Mysore Sathyendra; Patrick Fernandes; Pengfei Liu; Graham Neubig; Chunting Zhou

arXiv:2306.01200 · cs.CL · February 20, 2024 · 5 cites

TL;DR

This paper investigates using large language models with in-context learning as multi-dimensional evaluators for text summarization, achieving competitive results without extensive training data and analyzing factors affecting their performance.

Contribution

The paper demonstrates that in-context learning enables effective multi-dimensional evaluation of summaries, reducing reliance on large annotated datasets, and analyzes how the selection and number of in-context examples affect evaluator performance.

Findings

01 In-context learning evaluators are competitive with learned evaluation frameworks for text summarization.

02 They establish state-of-the-art on dimensions such as relevance and factual consistency.

03 Performance is influenced by the selection and number of in-context examples.

Abstract

Evaluation of natural language generation (NLG) is complex and multi-dimensional. Generated text can be evaluated for fluency, coherence, factuality, or any other dimensions of interest. Most frameworks that perform such multi-dimensional evaluation require training on large manually or synthetically generated datasets. In this paper, we study the efficacy of large language models as multi-dimensional evaluators using in-context learning, obviating the need for large training datasets. Our experiments show that in-context learning-based evaluators are competitive with learned evaluation frameworks for the task of text summarization, establishing state-of-the-art on dimensions such as relevance and factual consistency. We then analyze the effects of factors such as the selection and number of in-context examples on performance. Finally, we study the efficacy of in-context learning based…
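As a rough illustration of the idea in the abstract, the sketch below assembles a few-shot prompt that asks a language model to score a summary on a single dimension. The `ScoredExample` type, the `build_prompt` function, the prompt template, and the parameter `k` are all hypothetical names chosen for this example; they are not taken from the paper or its repository, and the actual prompt format used by the authors may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScoredExample:
    """One in-context demonstration: a source text, a summary, and a human score."""
    source: str
    summary: str
    score: float


def build_prompt(dimension: str, examples: List[ScoredExample],
                 source: str, summary: str, k: int = 4) -> str:
    """Assemble a few-shot scoring prompt for one evaluation dimension.

    The template is an assumption made for illustration. Only the first k
    demonstrations are used, so k controls the number of in-context examples.
    """
    label = dimension.capitalize()
    blocks = [
        f"Text: {ex.source}\nSummary: {ex.summary}\n{label}: {ex.score}"
        for ex in examples[:k]
    ]
    # The test instance ends with the bare label so the model completes the score.
    blocks.append(f"Text: {source}\nSummary: {summary}\n{label}:")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    demos = [
        ScoredExample("The council met on Monday.", "Council met Monday.", 4.5),
        ScoredExample("Rain is expected all week.", "It will be sunny.", 1.0),
    ]
    print(build_prompt("coherence", demos,
                       "Prices rose sharply in May.", "Prices went up in May."))
```

Calling `build_prompt` once per dimension (for example fluency, coherence, relevance, factual consistency) yields one prompt per dimension; varying `k` or the order of `examples` is one simple way to probe the sensitivity to in-context examples that the abstract mentions.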

Code & Models

Repositories

jainsameer06/ice (Official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques

Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Residual Connection · Cosine Annealing · Layer Normalization · Byte Pair Encoding · Softmax