Frequency, Acceptability, and Selection: A case study of clause-embedding
Aaron Steven White, Kyle Rawlins

TL;DR
This study examines how well verb frequency in subcategorization frames predicts acceptability, finding that frequency alone is a weak predictor and that common modeling techniques offer limited improvement.
Contribution
The paper shows that the frequency with which verbs occur in particular subcategorization frames is a poor predictor of how acceptable those verbs are in those frames, highlighting the need for alternative models of verb-frame compatibility.
Findings
Frame frequency explains, at best, less than 1/3 of the total information about acceptability across the lexicon.
Common matrix factorization techniques only marginally improve predictions.
Data and code are publicly available at http://megaattitude.io.
Abstract
We investigate the relationship between the frequency with which verbs are found in particular subcategorization frames and the acceptability of those verbs in those frames, focusing in particular on subordinate clause-taking verbs, such as "think", "want", and "tell". We show that verbs' subcategorization frame frequency distributions are poor predictors of their acceptability in those frames---explaining, at best, less than 1/3 of the total information about acceptability across the lexicon---and, further, that common matrix factorization techniques used to model the acquisition of verbs' acceptability in subcategorization frames fare only marginally better. All data and code are available at http://megaattitude.io.
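The comparison described in the abstract can be sketched in a few lines. The following is a minimal illustration only, not the authors' method: the verb-by-frame counts and acceptability ratings are invented toy numbers, squared Pearson correlation stands in for the paper's measure of explained information, and a rank-one approximation computed by power iteration stands in for the unspecified "common matrix factorization techniques". All function and variable names are hypothetical.

```python
import math


def pearson_r_squared(xs, ys):
    """Squared Pearson correlation: share of variance in ys explained by a linear fit on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov * cov / (var_x * var_y)


def rank_one_approx(matrix, iterations=200):
    """Best rank-one approximation of a matrix, found by power iteration on M^T M."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    v = [1.0] * n_cols
    for _ in range(iterations):
        # u = M v, then v = M^T u, renormalized to unit length.
        u = [sum(row[j] * v[j] for j in range(n_cols)) for row in matrix]
        v = [sum(matrix[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    # u = M v carries the leading singular value, so the outer product u v^T is the approximation.
    u = [sum(row[j] * v[j] for j in range(n_cols)) for row in matrix]
    return [[u_i * v_j for v_j in v] for u_i in u]


def flatten(matrix):
    """Concatenate the rows of a matrix into one list."""
    return [value for row in matrix for value in row]


# Invented toy data (assumption): rows are verbs, columns are subcategorization frames.
counts = [
    [120, 3, 0, 15],
    [2, 80, 40, 0],
    [30, 10, 5, 60],
    [0, 1, 25, 7],
]
# Invented toy acceptability ratings (assumption), same layout as counts.
ratings = [
    [6.5, 3.0, 1.5, 5.0],
    [2.0, 6.8, 6.0, 2.5],
    [5.5, 4.5, 3.0, 6.2],
    [1.8, 4.0, 5.5, 4.8],
]

# Log-transform the counts so that a few very frequent pairs do not dominate the fit.
log_freq = [[math.log1p(c) for c in row] for row in counts]

r2_raw = pearson_r_squared(flatten(log_freq), flatten(ratings))
r2_rank_one = pearson_r_squared(flatten(rank_one_approx(log_freq)), flatten(ratings))

print(f"R^2, raw log frequency:      {r2_raw:.3f}")
print(f"R^2, rank-one approximation: {r2_rank_one:.3f}")
```

Comparing the two printed values mirrors the question the abstract asks: how much of acceptability does frequency account for directly, and how much does a low-rank model of the frequency matrix add. The toy numbers say nothing about the paper's actual results.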
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
