Can Unconfident LLM Annotations Be Used for Confident Conclusions?

Kristina Gligorić; Tijana Zrnic; Cinoo Lee; Emmanuel J. Candès; and Dan Jurafsky

arXiv:2408.15204 · cs.CL · February 11, 2025 · 3 citations

TL;DR

This paper introduces Confidence-Driven Inference, a method that combines LLM annotations and confidence indicators to reduce human annotation needs while maintaining valid, accurate statistical estimates in computational social science tasks.

Contribution

The paper presents a novel approach that strategically integrates LLM confidence metrics with annotations to optimize data collection and ensure valid conclusions.

Findings

01. Reduces human annotations by over 25% in CSS tasks

02. Guarantees valid and accurate conclusions despite using LLM annotations

03. Applicable to a broad range of NLP estimation problems

Abstract

Large language models (LLMs) have shown high agreement with human raters across a variety of tasks, demonstrating potential to ease the challenges of human data collection. In computational social science (CSS), researchers are increasingly leveraging LLM annotations to complement slow and expensive human annotations. Still, guidelines for collecting and using LLM annotations, without compromising the validity of downstream conclusions, remain limited. We introduce Confidence-Driven Inference: a method that combines LLM annotations and LLM confidence indicators to strategically select which human annotations should be collected, with the goal of producing accurate statistical estimates and provably valid confidence intervals while reducing the number of human annotations needed. Our approach comes with safeguards against LLM annotations of poor quality, guaranteeing that the conclusions…
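The idea described in the abstract, using LLM confidence to decide which items get a human label while keeping the final estimate statistically valid, can be illustrated with a minimal sketch. This is not the authors' implementation. The function name `confidence_driven_mean`, the callback `get_human_label`, the `budget_frac` parameter, and the rule "sample in proportion to one minus the LLM confidence" are all assumptions made for illustration. The estimator is a standard inverse-probability-weighted correction of the LLM labels, with a normal-approximation confidence interval for a mean.

```python
import math
import random


def confidence_driven_mean(llm_labels, llm_conf, get_human_label,
                           budget_frac=0.3, seed=0, z=1.96):
    """Estimate the mean human label from LLM labels plus a sampled subset
    of human labels. Illustrative sketch only; not the paper's algorithm.

    llm_labels:      numeric LLM annotations, one per item
    llm_conf:        LLM confidence in [0, 1], one per item
    get_human_label: hypothetical callback, item index -> human label
    budget_frac:     approximate target fraction of items sent to humans
    """
    rng = random.Random(seed)
    n = len(llm_labels)

    # Assumed sampling rule: less confident items are more likely to be
    # sent to a human annotator. Probabilities are clipped away from zero
    # so that every item can be sampled and the correction stays unbiased.
    uncertainty = [max(1.0 - c, 1e-3) for c in llm_conf]
    scale = budget_frac * n / sum(uncertainty)
    probs = [min(1.0, max(0.05, scale * u)) for u in uncertainty]

    terms = []
    n_human = 0
    for i in range(n):
        if rng.random() < probs[i]:
            # Human label collected: correct the LLM label, reweighting the
            # error by the inverse of the sampling probability.
            y = get_human_label(i)
            n_human += 1
            terms.append(llm_labels[i] + (y - llm_labels[i]) / probs[i])
        else:
            # No human label: fall back on the LLM label alone.
            terms.append(llm_labels[i])

    estimate = sum(terms) / n
    variance = sum((t - estimate) ** 2 for t in terms) / (n - 1)
    half_width = z * math.sqrt(variance / n)
    return estimate, (estimate - half_width, estimate + half_width), n_human
```

A call might look like `estimate, interval, n_human = confidence_driven_mean(llm_labels, llm_conf, ask_annotator)`, where `ask_annotator` is a hypothetical function that returns the human label for a given item index. Because each corrected term has the human label as its expectation, poor LLM labels widen the interval rather than bias the estimate, which is the kind of safeguard the abstract alludes to. Clipping the probabilities means the realized number of human labels only approximately matches `budget_frac`.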

Code & Models

Repositories

kristinagligoric/confidence-driven-inference
Official

Videos

Can Unconfident LLM Annotations Be Used for Confident Conclusions? · Underline

Taxonomy

Topics: Natural Language Processing Techniques · Semantic Web and Ontologies · Library Science and Information Systems