Can Language Models Induce Grammatical Knowledge from Indirect Evidence?
Miyu Oba, Yohei Oseki, Akiyo Fukatsu, Akari Haga, Hiroki Ouchi, Taro Watanabe, Saku Sugawara

TL;DR
This paper examines whether language models can efficiently learn grammatical rules from indirect evidence, revealing that current models struggle with this task and suggesting future research directions for improving their inductive capabilities.
Contribution
The paper introduces the Wug InDirect Evidence Test (WIDET) dataset to evaluate models' ability to learn grammar from indirect evidence, highlighting limitations in current models' data efficiency.
Findings
Language models do not induce grammatical knowledge from indirect evidence.
Models struggle even after repeated exposure to structured instances.
Future work should focus on models utilizing latent indirect evidence.
Abstract
What kinds of and how much data is necessary for language models to induce grammatical knowledge to judge sentence acceptability? Recent language models still have much room for improvement in their data efficiency compared to humans. This paper investigates whether language models efficiently use indirect data (indirect evidence), from which they infer sentence acceptability. In contrast, humans use indirect evidence efficiently, which is considered one of the inductive biases contributing to efficient language acquisition. To explore this question, we introduce the Wug InDirect Evidence Test (WIDET), a dataset consisting of training instances inserted into the pre-training data and evaluation instances. We inject synthetic instances with newly coined wug words into pretraining data and explore the model's behavior on evaluation data that assesses grammatical acceptability regarding…
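The setup described in the abstract (inserting synthetic wug-word sentences into pretraining data, then checking acceptability judgments) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the example sentences, and the choice to evaluate by comparing scores on acceptable/unacceptable sentence pairs are all assumptions made for illustration, and `score` stands in for whatever sentence-level score a trained model would provide.

```python
import random
from typing import Callable, List, Tuple


def inject_instances(
    corpus: List[str],
    instances: List[str],
    n_copies: int,
    seed: int = 0,
) -> List[str]:
    """Return a copy of `corpus` with each synthetic instance inserted
    `n_copies` times at random positions (hypothetical helper)."""
    rng = random.Random(seed)
    augmented = list(corpus)  # leave the original corpus untouched
    for sentence in instances:
        for _ in range(n_copies):
            # randint is inclusive, so len(augmented) means "append at the end"
            augmented.insert(rng.randint(0, len(augmented)), sentence)
    return augmented


def pair_accuracy(
    pairs: List[Tuple[str, str]],
    score: Callable[[str], float],
) -> float:
    """Fraction of (acceptable, unacceptable) pairs in which the acceptable
    sentence receives the strictly higher score (assumed evaluation scheme)."""
    if not pairs:
        raise ValueError("pairs must not be empty")
    correct = sum(1 for good, bad in pairs if score(good) > score(bad))
    return correct / len(pairs)


if __name__ == "__main__":
    # Toy pretraining corpus and one made-up training instance with a wug word.
    corpus = ["the dog sleeps .", "a cat eats fish .", "birds fly south ."]
    training_instances = ["the wug sleeps ."]
    augmented = inject_instances(corpus, training_instances, n_copies=5)
    print(len(augmented))

    # Placeholder scorer; a real experiment would use model log-probabilities.
    def toy_score(sentence: str) -> float:
        return -float(len(sentence.split()))

    eval_pairs = [("the wug sleeps .", "the wug sleep the .")]
    print(pair_accuracy(eval_pairs, toy_score))
```

Varying `n_copies` is one way to probe how many exposures a model needs, which is the data-efficiency question the abstract raises.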
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems
