LLMs Learn Constructions That Humans Do Not Know
Jonathan Dunn, Mai Mohamed Eida

TL;DR
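This paper shows that large language models hallucinate grammatical constructions that human introspection does not support. Both behavioural (embedding-based) and meta-linguistic (prompt-based) probing would have confirmed false hypotheses about these constructions, which points to a confirmation bias in construction probing.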
Contribution
It uses behavioural and meta-linguistic probing to detect hallucinated constructions in LLMs, then simulates hypothesis testing to show that a false hypothesis that these constructions exist would have been overwhelmingly confirmed.
Findings
Models hallucinate constructions that human introspection does not support
Construction probing methods are susceptible to confirmation bias
Simulated hypothesis tests would have overwhelmingly confirmed false hypotheses about these constructions
Abstract
This paper investigates false positive constructions: grammatical structures which an LLM hallucinates as distinct constructions but which human introspection does not support. Both a behavioural probing task using contextual embeddings and a meta-linguistic probing task using prompts are included, allowing us to distinguish between implicit and explicit linguistic knowledge. Both methods reveal that models do indeed hallucinate constructions. We then simulate hypothesis testing to determine what would have happened if a linguist had falsely hypothesized that these hallucinated constructions do exist. The high accuracy obtained shows that such false hypotheses would have been overwhelmingly confirmed. This suggests that construction probing methods suffer from a confirmation bias and raises the issue of what unknown and incorrect syntactic knowledge these models also possess.
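The confirmation-bias argument in the abstract can be illustrated with a toy probe. The sketch below is not the authors' pipeline: random vectors stand in for contextual embeddings, a nearest-centroid classifier stands in for the probe, and the function names `make_synthetic_embeddings` and `probe_accuracy` are hypothetical. It assumes only that the model's embedding space separates some category of sentences, whether or not that category corresponds to a construction a human linguist would accept.

```python
import numpy as np


def make_synthetic_embeddings(n_per_class=200, dim=16, separation=6.0, seed=0):
    """Return toy 'embeddings' (assumption: stand-ins for real contextual embeddings).

    Class 1 is the hypothesized construction, class 0 is everything else.
    `separation` shifts class 1 along the first dimension, mimicking a
    category that the model itself has carved out.
    """
    rng = np.random.default_rng(seed)
    offset = np.zeros(dim)
    offset[0] = separation
    candidates = rng.normal(size=(n_per_class, dim)) + offset
    others = rng.normal(size=(n_per_class, dim))
    X = np.vstack([candidates, others])
    y = np.array([1] * n_per_class + [0] * n_per_class)
    return X, y


def probe_accuracy(X, y, train_fraction=0.5, seed=0):
    """Fit a nearest-centroid probe on a random split; return held-out accuracy."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))
    cut = int(len(y) * train_fraction)
    train, test = order[:cut], order[cut:]
    centroid_pos = X[train][y[train] == 1].mean(axis=0)
    centroid_neg = X[train][y[train] == 0].mean(axis=0)
    dist_pos = np.linalg.norm(X[test] - centroid_pos, axis=1)
    dist_neg = np.linalg.norm(X[test] - centroid_neg, axis=1)
    predictions = (dist_pos < dist_neg).astype(int)
    return float((predictions == y[test]).mean())


# A category the model separates: the probe "confirms" the hypothesis.
X, y = make_synthetic_embeddings()
accuracy = probe_accuracy(X, y)

# A control with no separation: the probe stays near chance.
X_control, y_control = make_synthetic_embeddings(separation=0.0)
control_accuracy = probe_accuracy(X_control, y_control)

print(f"hypothesized construction: {accuracy:.2f}")
print(f"no-separation control:     {control_accuracy:.2f}")
```

High probe accuracy here shows only that the embedding space separates the category. It cannot show that the category is a construction humans recognize, which is the gap the abstract describes.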
