Large Language Models for Psycholinguistic Plausibility Pretesting
Samuel Joseph Amouyal, Aya Meltzer-Asscher, Jonathan Berant

TL;DR
This paper explores whether large language models, especially GPT-4, can be used to pretest linguistic materials in psycholinguistics, by comparing their plausibility judgments with those of human evaluators.
Contribution
It demonstrates that GPT-4's plausibility judgments strongly correlate with human judgments and can replace humans for coarse-grained pretesting but not for fine-grained assessments.
Findings
GPT-4's plausibility judgments align closely with human judgments.
Other LMs correlate well with human judgments on commonly used syntactic structures.
GPT-4 is effective for coarse but not fine-grained pretesting.
Abstract
In psycholinguistics, the creation of controlled materials is crucial to ensure that research outcomes are solely attributed to the intended manipulations and not influenced by extraneous factors. To achieve this, psycholinguists typically pretest linguistic materials, where a common pretest is to solicit plausibility judgments from human evaluators on specific sentences. In this work, we investigate whether Language Models (LMs) can be used to generate these plausibility judgements. We investigate a wide range of LMs across multiple linguistic structures and evaluate whether their plausibility judgements correlate with human judgements. We find that GPT-4 plausibility judgements highly correlate with human judgements across the structures we examine, whereas other LMs correlate well with humans on commonly used syntactic structures. We then test whether this correlation implies that…
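The comparison described in the abstract, checking whether LM plausibility judgements correlate with human judgements, can be illustrated with a small rank-correlation computation. The sketch below is only an illustration under stated assumptions: the abstract does not say which correlation measure the authors use, so Spearman's rho is an assumed choice, and the rating scale, the rating values, and the function names (`average_ranks`, `pearson`, `spearman`) are hypothetical and not taken from the paper.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: mean human ratings and LM ratings for five sentences,
# on an assumed 1-7 plausibility scale. These numbers are made up.
human_ratings = [6.5, 2.1, 5.8, 1.4, 4.0]
model_ratings = [7, 3, 5, 1, 6]

rho = spearman(human_ratings, model_ratings)
print(f"Spearman rho = {rho:.3f}")
```

A rank-based measure is used here because plausibility ratings are ordinal, so only the ordering of sentences is compared. A high correlation over a whole item set does not by itself show that the model can separate items whose plausibility differs only slightly, which is the distinction between coarse-grained and fine-grained pretesting drawn in the findings.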
Taxonomy
Topics: Natural Language Processing Techniques
Methods: Attention Is All You Need · Dense Connections · Position-Wise Feed-Forward Layer · Label Smoothing · Softmax · Absolute Position Encodings · Linear Layer · Byte Pair Encoding · Multi-Head Attention · Residual Connection
