Identifiability and Unmixing of Latent Parse Trees
Daniel Hsu, Sham M. Kakade, Percy Liang

TL;DR
This paper studies which unsupervised parsing models are identifiable from infinite data, and proposes a strategy called unmixing for efficiently estimating the parameters of identifiable models.
Contribution
It applies a general numerical technique for checking identifiability to several standard constituency and dependency parsing models, and develops unmixing, a parameter estimation strategy for restricted classes of parsing models.
Findings
Identifiability can be checked numerically from the rank of a Jacobian matrix.
Unmixing enables efficient parameter estimation for restricted classes of parsing models.
Existing spectral methods cannot be applied directly because the topology of the parse tree varies across sentences.
Abstract
This paper explores unsupervised learning of parsing models along two directions. First, which models are identifiable from infinite data? We use a general technique for numerically checking identifiability based on the rank of a Jacobian matrix, and apply it to several standard constituency and dependency parsing models. Second, for identifiable models, how do we estimate the parameters efficiently? EM suffers from local optima, while recent work using spectral methods cannot be directly applied since the topology of the parse tree varies across sentences. We develop a strategy, unmixing, which deals with this additional complexity for restricted classes of parsing models.
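The Jacobian rank check mentioned in the abstract can be illustrated on a toy model. The sketch below is an assumption-laden illustration, not the paper's code or one of its parsing models: it uses a two-state latent class model with conditionally independent binary observations. The function names (`observed_distribution`, `numerical_jacobian`, `jacobian_rank`) and the tolerance are hypothetical choices. The idea is to differentiate the map from parameters to the observed distribution at a randomly chosen parameter point. Full column rank there suggests local identifiability, while a rank deficit means some parameter directions leave the observed distribution unchanged.

```python
import itertools
import numpy as np

def observed_distribution(theta, n_obs):
    """Map parameters to the joint distribution over n_obs binary observations.

    theta[0] is P(h = 1); the remaining entries are P(x_i = 1 | h) for h in {0, 1}.
    This toy latent class model is an assumption made for illustration.
    """
    pi = theta[0]
    emit = theta[1:].reshape(2, n_obs)  # emit[h, i] = P(x_i = 1 | h)
    prior = np.array([1.0 - pi, pi])
    probs = []
    for x in itertools.product([0, 1], repeat=n_obs):
        x = np.array(x)
        # Likelihood of x under each hidden state, shape (2,)
        lik = np.prod(np.where(x == 1, emit, 1.0 - emit), axis=1)
        probs.append(prior @ lik)
    return np.array(probs)

def numerical_jacobian(f, theta, eps=1e-6):
    """Central-difference Jacobian of f at theta."""
    base = f(theta)
    J = np.zeros((base.size, theta.size))
    for j in range(theta.size):
        step = np.zeros_like(theta)
        step[j] = eps
        J[:, j] = (f(theta + step) - f(theta - step)) / (2 * eps)
    return J

def jacobian_rank(n_obs, seed=0):
    """Return (rank of Jacobian, number of parameters) at a random interior point."""
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0.2, 0.8, size=1 + 2 * n_obs)
    J = numerical_jacobian(lambda t: observed_distribution(t, n_obs), theta)
    return int(np.linalg.matrix_rank(J, tol=1e-6)), theta.size

if __name__ == "__main__":
    for n_obs in (2, 3):
        rank, n_params = jacobian_rank(n_obs)
        status = "full rank" if rank == n_params else "rank deficient"
        print(f"{n_obs} observations: rank {rank} of {n_params} parameters ({status})")
```

With two observations the toy model has five parameters but the observed distribution has only three degrees of freedom, so the Jacobian must be rank deficient. With three observations the rank can match the seven parameters. Full rank only speaks to local identifiability, so symmetries such as swapping the two hidden states are not ruled out by this check.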
Taxonomy
Topics: Natural Language Processing Techniques · Machine Learning and Algorithms · Topic Modeling
