What GPT Knows About Who is Who

Xiaohan Yang; Eduardo Peynetti; Vasco Meerman; Chris Tanner

arXiv:2205.07407 · cs.CL · May 17, 2022

TL;DR

This paper investigates the ability of large language models like GPT-2 and GPT-Neo to perform coreference resolution using prompt engineering, revealing their limited and inconsistent capabilities in identifying coreferent mentions.

Contribution

It introduces a QA-based prompt-engineering approach to assess LLMs' coreference resolution abilities, highlighting their limitations and sensitivity to prompts.

Findings

01. GPT-2 and GPT-Neo can produce valid answers
02. Their coreference identification is limited and inconsistent
03. Performance is highly prompt-sensitive

Abstract

Coreference resolution, a crucial task for understanding discourse and language at large, has yet to witness widespread benefits from large language models (LLMs). Moreover, coreference resolution systems largely rely on supervised labels, which are highly expensive and difficult to annotate, making the task ripe for prompt engineering. In this paper, we introduce a QA-based prompt-engineering method and discern generative, pre-trained LLMs' abilities and limitations toward the task of coreference resolution. Our experiments show that GPT-2 and GPT-Neo can return valid answers, but that their capabilities to identify coreferent mentions are limited and prompt-sensitive, leading to inconsistent results.
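
To make the idea of a QA-based prompt concrete, here is a minimal sketch of how a coreference query might be phrased as a question for a generative model. The template wording, the function names `build_qa_prompt` and `is_valid_answer`, and the notion of "valid" used here (a non-empty span that appears in the passage) are all assumptions for illustration. The abstract does not specify the paper's actual prompts or validity criteria, and no model is called.

```python
def build_qa_prompt(passage, mention, examples=()):
    """Format a coreference query as a question-answering prompt.

    `examples` is an optional sequence of (passage, mention, answer)
    triples used as few-shot demonstrations. The template is hypothetical.
    """
    blocks = []
    for ex_passage, ex_mention, ex_answer in examples:
        blocks.append(
            f'Context: {ex_passage}\n'
            f'Question: Who does "{ex_mention}" refer to?\n'
            f'Answer: {ex_answer}'
        )
    # The final block leaves the answer open for the model to complete.
    blocks.append(
        f'Context: {passage}\n'
        f'Question: Who does "{mention}" refer to?\n'
        f'Answer:'
    )
    return "\n\n".join(blocks)


def is_valid_answer(answer, passage):
    """Loose, assumed validity check: a non-empty span found in the passage."""
    answer = answer.strip()
    return bool(answer) and answer.lower() in passage.lower()


if __name__ == "__main__":
    passage = "Alice told Bob that she was late."
    demo = [("Tom said he would come.", "he", "Tom")]
    print(build_qa_prompt(passage, "she", demo))
```

Because the abstract reports that results are prompt-sensitive, a sketch like this would typically be run with several alternative question wordings, comparing the answers each one elicits.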

Code & Models

Repositories

awesomecoref/prompt-coref (Official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification

Methods: Attention Is All You Need · Linear Layer · Cosine Annealing · Weight Decay · Discriminative Fine-Tuning · Linear Warmup With Cosine Annealing · Softmax · Multi-Head Attention · Attention Dropout