Learned Construction Grammars Converge Across Registers Given Increased Exposure
Jonathan Dunn, Harish Tayyar Madabushi

TL;DR
This study investigates whether construction grammars learned from different registers converge as the amount of exposure increases. It shows that more exposure yields more similar grammars across registers, while a core set of register-universal constructions remains stable.
Contribution
The paper provides empirical evidence that increased exposure leads learned construction grammars to converge across registers, and it highlights the stability of a core set of constructions.
Findings
Increased exposure leads to converging grammars across registers.
A core set of register-universal constructions remains stable.
Convergence occurs across multiple languages and registers.
Abstract
This paper measures the impact of increased exposure on whether learned construction grammars converge onto shared representations when trained on data from different registers. Register influences the frequency of constructions, with some structures common in formal but not informal usage. We expect that a grammar induction algorithm exposed to different registers will acquire different constructions. To what degree does increased exposure lead to the convergence of register-specific grammars? The experiments in this paper simulate language learning in 12 languages (half Germanic and half Romance) with corpora representing three registers (Twitter, Wikipedia, Web). These simulations are repeated with increasing amounts of exposure, from 100k to 2 million words, to measure the impact of exposure on the convergence of grammars. The results show that increased exposure does lead to…
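As a rough illustration of the experimental design described in the abstract, convergence can be pictured as the overlap between the construction inventories learned from each register at a given exposure level. The sketch below is an assumption, not the authors' method: it treats each grammar as a set of construction identifiers, uses Jaccard similarity as a stand-in overlap measure, and runs on invented toy inventories (the labels `c1` to `c6` and the function names are hypothetical).

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two collections of construction IDs."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mean_pairwise_overlap(grammars):
    """Average Jaccard similarity over all pairs of register grammars."""
    pairs = list(combinations(sorted(grammars), 2))
    return sum(jaccard(grammars[x], grammars[y]) for x, y in pairs) / len(pairs)


def core_constructions(grammars):
    """Constructions shared by every register grammar."""
    return set.intersection(*(set(g) for g in grammars.values()))


# Toy inventories (hypothetical): one grammar per register at two exposure levels.
low_exposure = {
    "twitter": {"c1", "c2", "c3"},
    "wikipedia": {"c1", "c4", "c5"},
    "web": {"c1", "c2", "c6"},
}
high_exposure = {
    "twitter": {"c1", "c2", "c3", "c4"},
    "wikipedia": {"c1", "c2", "c4", "c5"},
    "web": {"c1", "c2", "c4", "c6"},
}

for label, grammars in [("low", low_exposure), ("high", high_exposure)]:
    overlap = mean_pairwise_overlap(grammars)
    core = sorted(core_constructions(grammars))
    print(f"{label} exposure: mean overlap = {overlap:.2f}, core = {core}")
```

In this toy setup, a rising mean overlap between the low and high exposure levels would correspond to convergence, and the intersection across registers would correspond to a register-universal core. The measure actually used in the paper may differ.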
Taxonomy
Topics: Natural Language Processing Techniques · Speech and dialogue systems · Topic Modeling
