Testing the limits of natural language models for predicting human language judgments

Tal Golan; Matthew Siegelman; Nikolaus Kriegeskorte; Christopher Baldassano

arXiv:2204.03592·cs.CL·September 15, 2023·1 cites

Open Access 1 Repo

TL;DR

This study evaluates how well various neural network language models, including GPT-2, predict human language judgments using controversial sentence pairs, revealing strengths and limitations in model-human alignment.

Contribution

Introduces a novel experimental method using controversial sentence pairs to assess model-human consistency across diverse language models.

Findings

01. GPT-2 showed the highest alignment with human judgments.

02. Controversial sentence pairs effectively reveal model failures.

03. All models exhibited notable shortcomings in matching human perception.
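The findings rank models by how well they align with human judgments. One simple way to quantify that alignment is the fraction of trials in which a model's preferred sentence matches the sentence the human chose. The sketch below assumes this plain agreement rate; the paper's exact metric is not stated in this listing, and the sentence labels, log-probabilities, and trial data are made-up placeholders.

```python
def human_consistency(trials, logp):
    """Fraction of trials where the model's preferred sentence matches the human choice.

    trials: list of (sentence_1, sentence_2, human_choice) tuples.
    logp:   dict mapping each sentence to the model's log-probability.
    NOTE: an illustrative agreement rate, not necessarily the paper's metric.
    """
    hits = 0
    for s1, s2, human_choice in trials:
        # The model "chooses" whichever sentence it assigns higher log-probability.
        model_choice = s1 if logp[s1] > logp[s2] else s2
        hits += int(model_choice == human_choice)
    return hits / len(trials)


# Hypothetical log-probabilities for three placeholder sentences under two models.
toy_logp_a = {"s_a": -10.0, "s_b": -20.0, "s_c": -15.0}
toy_logp_b = {"s_a": -18.0, "s_b": -12.0, "s_c": -15.0}

# Hypothetical human choices for four sentence-pair trials.
toy_trials = [
    ("s_a", "s_b", "s_a"),
    ("s_a", "s_c", "s_c"),
    ("s_c", "s_b", "s_c"),
    ("s_a", "s_b", "s_a"),
]

score_a = human_consistency(toy_trials, toy_logp_a)
score_b = human_consistency(toy_trials, toy_logp_b)
```

On this toy data, model A agrees with the human choice more often than model B, which is the kind of comparison a ranking of models would rest on.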

Abstract

Neural network language models can serve as computational hypotheses about how humans process language. We compared the model-human consistency of diverse language models using a novel experimental approach: controversial sentence pairs. For each controversial sentence pair, two language models disagree about which sentence is more likely to occur in natural text. Considering nine language models (including n-gram, recurrent neural networks, and transformer models), we created hundreds of such controversial sentence pairs by either selecting sentences from a corpus or synthetically optimizing sentence pairs to be highly controversial. Human subjects then provided judgments indicating for each pair which of the two sentences is more likely. Controversial sentence pairs proved highly effective at revealing model failures and identifying models that aligned most closely with human…
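The selection of controversial pairs from a corpus can be sketched as follows. A pair counts as controversial when one model prefers the first sentence and the other model prefers the second. Scoring a pair by the smaller of the two models' log-probability margins is an assumption made for illustration, since the paper's actual selection objective is not given in this abstract; the sentence labels and log-probabilities are made-up placeholders.

```python
from itertools import combinations


def controversiality(logp_a, logp_b, s1, s2):
    """Score how strongly models A and B disagree about the pair (s1, s2).

    Positive only when A prefers s1 AND B prefers s2.
    NOTE: taking the minimum of the two margins is an illustrative choice.
    """
    margin_a = logp_a[s1] - logp_a[s2]  # how much A prefers s1 over s2
    margin_b = logp_b[s2] - logp_b[s1]  # how much B prefers s2 over s1
    return min(margin_a, margin_b)


def most_controversial_pairs(sentences, logp_a, logp_b, top_k=1):
    """Return the top_k (score, s1, s2) tuples on which the models disagree."""
    scored = []
    for s1, s2 in combinations(sentences, 2):
        # Check both orderings, since either model may prefer either sentence.
        for x, y in ((s1, s2), (s2, s1)):
            score = controversiality(logp_a, logp_b, x, y)
            if score > 0:
                scored.append((score, x, y))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


# Hypothetical log-probabilities for three placeholder sentences.
sentences = ["s_a", "s_b", "s_c"]
logp_a = {"s_a": -10.0, "s_b": -20.0, "s_c": -15.0}
logp_b = {"s_a": -18.0, "s_b": -12.0, "s_c": -15.0}

best = most_controversial_pairs(sentences, logp_a, logp_b, top_k=1)
```

Here `best` holds the single pair on which the two models disagree most strongly. In practice the log-probabilities would come from the language models under comparison rather than from hand-written dictionaries.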

Code & Models

Repositories

dpmlab/contstimlang (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Explainable Artificial Intelligence (XAI)

Methods: Attention Is All You Need · Linear Layer · Dropout · Discriminative Fine-Tuning · Weight Decay · Cosine Annealing · Softmax · Layer Normalization · Adam