Unsupervised Grammar Induction with Depth-bounded PCFG
Lifeng Jin, Finale Doshi-Velez, Timothy Miller, William Schuler, Lane Schwartz

TL;DR
This paper introduces a depth-bounded probabilistic context-free grammar (DB-PCFG) induction model. On transcribed child-directed speech and newswire text, its parse accuracy exceeds or is competitive with that of other models, and the grammars it acquires use category labels consistently.
Contribution
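It extends depth-bounding to PCFG induction. The resulting model exceeds or is competitive with previous models on parse accuracy and acquires grammars that use category labels consistently, which other acquisition models have not demonstrated.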
Findings
Outperforms or matches existing models in parse accuracy.
Achieves consistent use of category labels in acquired grammars.
Effective on both child-directed speech and newswire text.
Abstract
There has been recent interest in applying cognitively or empirically motivated bounds on recursion depth to limit the search space of grammar induction models (Ponvert et al., 2011; Noji and Johnson, 2016; Shain et al., 2016). This work extends this depth-bounding approach to probabilistic context-free grammar induction (DB-PCFG), which has a smaller parameter space than hierarchical sequence models, and therefore more fully exploits the space reductions of depth-bounding. Results for this model on grammar acquisition from transcribed child-directed speech and newswire text exceed or are competitive with those of other models when evaluated on parse accuracy. Moreover, grammars acquired from this model demonstrate a consistent use of category labels, something which has not been demonstrated by other acquisition models.
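To make the idea of limiting the search space concrete, the following is a minimal sketch, not the paper's model. It enumerates all unlabeled binary bracketings of a short sentence and counts how many survive a depth bound. The depth measure used here is an assumption chosen for illustration: depth grows by one each time a branching left child hangs under a right child (a center embedding). The paper's exact definition of depth may differ, and the function names are hypothetical.

```python
def all_binary_trees(n):
    """Return every binary bracketing over n leaves as nested 2-tuples."""
    if n == 1:
        return ["w"]  # a single placeholder word
    result = []
    for split in range(1, n):
        for left in all_binary_trees(split):
            for right in all_binary_trees(n - split):
                result.append((left, right))
    return result


def embedding_depth(tree, is_right_child=False):
    """Illustrative depth measure (an assumption, not the paper's definition).

    Depth increases by one whenever a branching left child sits under a
    node that is itself a right child; the result is the maximum over
    all root-to-leaf paths.
    """
    if not isinstance(tree, tuple):
        return 0  # leaves add no depth
    left, right = tree
    step = 1 if is_right_child and isinstance(left, tuple) else 0
    return max(step + embedding_depth(left, False),
               embedding_depth(right, True))


def count_within_bound(n, max_depth):
    """Count bracketings of n words whose depth does not exceed max_depth."""
    return sum(1 for t in all_binary_trees(n)
               if embedding_depth(t) <= max_depth)


if __name__ == "__main__":
    # Columns: sentence length, all bracketings, bracketings within bounds 0, 1, 2.
    for n in range(2, 9):
        total = len(all_binary_trees(n))
        print(n, total, [count_within_bound(n, d) for d in (0, 1, 2)])
```

Under this toy measure, purely left-branching and purely right-branching trees have depth 0, so a tight bound removes only the center-embedded analyses. An induction model that enforces such a bound never has to assign probability mass to the excluded trees.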
Taxonomy
Topics: Natural Language Processing Techniques · Speech and dialogue systems · Speech Recognition and Synthesis
