Unsupervised Grammar Induction with Depth-bounded PCFG

Lifeng Jin; Finale Doshi-Velez; Timothy Miller; William Schuler; Lane Schwartz

arXiv:1802.08545·cs.CL·February 27, 2018

TL;DR

This paper introduces a depth-bounded probabilistic context-free grammar induction model that improves parse accuracy and category label consistency in grammar acquisition from speech and text.

Contribution

It extends depth-bounding to PCFG induction, demonstrating enhanced accuracy and category label consistency over previous models.

Findings

01 Outperforms or matches existing models in parse accuracy.

02 Achieves consistent use of category labels in acquired grammars.

03 Effective on both child-directed speech and newswire text.

Abstract

There has been recent interest in applying cognitively or empirically motivated bounds on recursion depth to limit the search space of grammar induction models (Ponvert et al., 2011; Noji and Johnson, 2016; Shain et al., 2016). This work extends this depth-bounding approach to probabilistic context-free grammar induction (DB-PCFG), which has a smaller parameter space than hierarchical sequence models, and therefore more fully exploits the space reductions of depth-bounding. Results for this model on grammar acquisition from transcribed child-directed speech and newswire text exceed or are competitive with those of other models when evaluated on parse accuracy. Moreover, grammars acquired from this model demonstrate a consistent use of category labels, something which has not been demonstrated by other acquisition models.
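To make the idea of a recursion-depth bound concrete, here is a minimal sketch in Python. It is not the paper's induction algorithm. It assumes a simplified notion of center-embedding depth in which the depth grows by one at every non-leaf left child of a right child, and it represents binary trees as nested tuples with strings as leaves. The names `embedding_depth` and `within_bound` are hypothetical and chosen only for this illustration.

```python
# Toy illustration only: a simplified center-embedding depth measure.
# Assumption: depth increases at each non-leaf left child of a right child.
# Trees are nested 2-tuples; leaves are plain strings.

def embedding_depth(tree, side="root", parent_side="root", depth=1):
    """Return the maximum depth reached anywhere in a binary tree."""
    if isinstance(tree, str):
        # Leaves never open a new embedding level in this simplified measure.
        return depth
    if side == "L" and parent_side == "R":
        # A complex left child under a right child is a center embedding.
        depth += 1
    left, right = tree
    return max(
        embedding_depth(left, "L", side, depth),
        embedding_depth(right, "R", side, depth),
    )


def within_bound(tree, bound):
    """True if the tree never exceeds the given depth bound."""
    return embedding_depth(tree) <= bound


right_branching = ("a", ("b", ("c", "d")))
left_branching = ((("a", "b"), "c"), "d")
center_embedded = ("a", (("b", "c"), "d"))

# Keep only the candidate trees that respect a depth bound of 1.
candidates = [right_branching, left_branching, center_embedded]
allowed = [t for t in candidates if within_bound(t, 1)]
```

Under this measure, purely left-branching and purely right-branching trees stay at depth 1, while the center-embedded tree reaches depth 2 and is excluded by a bound of 1. Discarding such analyses is the sense in which a depth bound shrinks the space of trees a model has to consider.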

Code & Models

Repositories

lifengjin/db-pcfg (official)

Taxonomy

Topics: Natural Language Processing Techniques · Speech and dialogue systems · Speech Recognition and Synthesis