Identifiability and Unmixing of Latent Parse Trees

Daniel Hsu; Sham M. Kakade; Percy Liang

arXiv:1206.3137 · stat.ML · June 15, 2012 · 23 cites

TL;DR

This paper asks which unsupervised parsing models are identifiable from infinite data, and proposes an unmixing strategy for estimating parameters efficiently in restricted classes of identifiable parsing models.

Contribution

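It applies a general numerical technique, based on the rank of a Jacobian matrix, to check the identifiability of several standard constituency and dependency parsing models, and it develops an unmixing strategy for parameter estimation in restricted classes of parsing models.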

Findings

01. Identifiability can be checked numerically from the rank of a Jacobian matrix.

02. Unmixing enables efficient parameter estimation for restricted classes of parsing models.

03. Recent spectral methods cannot be applied directly because the topology of the parse tree varies across sentences.

Abstract

This paper explores unsupervised learning of parsing models along two directions. First, which models are identifiable from infinite data? We use a general technique for numerically checking identifiability based on the rank of a Jacobian matrix, and apply it to several standard constituency and dependency parsing models. Second, for identifiable models, how do we estimate the parameters efficiently? EM suffers from local optima, while recent work using spectral methods cannot be directly applied since the topology of the parse tree varies across sentences. We develop a strategy, unmixing, which deals with this additional complexity for restricted classes of parsing models.
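
The Jacobian-rank check mentioned in the abstract can be illustrated on a toy model. The sketch below is not the paper's implementation and does not use a parsing model. It assumes a hypothetical two-coin mixture (a hidden coin is chosen, flipped n times, and only the number of heads is observed), approximates the Jacobian of the map from parameters to the observed distribution by central differences, and compares its numerical rank with the number of parameters. The function names, the test point `theta0`, the step size, and the rank tolerance are all illustrative choices. Full column rank at a generic parameter point is used here as evidence of local identifiability only.

```python
from math import comb

import numpy as np


def binomial_mixture_probs(theta, n):
    """Observed distribution of a toy latent-variable model (an assumption).

    theta = (pi, p1, p2): coin 1 is chosen with probability pi, coin 2
    otherwise; the chosen coin is flipped n times and only the number of
    heads k in 0..n is observed.
    """
    pi, p1, p2 = theta
    ks = np.arange(n + 1)
    binom = np.array([comb(n, k) for k in ks], dtype=float)
    comp1 = binom * p1**ks * (1 - p1) ** (n - ks)
    comp2 = binom * p2**ks * (1 - p2) ** (n - ks)
    return pi * comp1 + (1 - pi) * comp2


def numerical_jacobian(f, theta, eps=1e-6):
    """Central-difference Jacobian of f at theta (one column per parameter)."""
    theta = np.asarray(theta, dtype=float)
    cols = []
    for i in range(theta.size):
        step = np.zeros_like(theta)
        step[i] = eps
        cols.append((f(theta + step) - f(theta - step)) / (2 * eps))
    return np.stack(cols, axis=1)


def jacobian_rank(f, theta, tol=1e-6):
    """Numerical rank: singular values below tol are treated as zero."""
    return int(np.linalg.matrix_rank(numerical_jacobian(f, theta), tol=tol))


# An arbitrary generic point: pi not in {0, 1} and p1 != p2.
theta0 = np.array([0.3, 0.2, 0.7])

for n in (1, 2, 3):
    rank = jacobian_rank(lambda t: binomial_mixture_probs(t, n), theta0)
    verdict = "full rank" if rank == theta0.size else "rank deficient"
    print(f"n={n}: rank {rank} of {theta0.size} parameters ({verdict})")
```

In this toy model the rank falls short of the three parameters for n = 1 and n = 2 and reaches it at n = 3, which matches the intuition that too few observations per latent draw leave the parameters underdetermined. A rank test at a single point says nothing about global symmetries such as swapping the two coin labels.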

Taxonomy

Topics: Natural Language Processing Techniques · Machine Learning and Algorithms · Topic Modeling