Learning the Structure of Generative Models without Labeled Data

Stephen H. Bach; Bryan He; Alexander Ratner; Christopher Ré

arXiv:1703.00854·cs.LG·September 12, 2017·71 cites

Open Access

TL;DR

This paper introduces a new method for automatically estimating the dependency structure of generative models used in weak supervision, improving efficiency and accuracy without labeled data.

Contribution

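It proposes an $\ell_1$-regularized marginal pseudolikelihood approach to structure learning that is faster than a maximum likelihood approach, selects fewer extraneous dependencies, and, for a broad class of models, needs an amount of unlabeled data that scales sublinearly in the number of possible dependencies.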

Findings

01

In simulations, the method is 100× faster than a maximum likelihood approach.

02

In simulations, it selects 1/4 as many extraneous dependencies, reducing false positives.

03

Improves F1 score by 1.5 points on real-world data.

Abstract

Curating labeled training data has become the primary bottleneck in machine learning. Recent frameworks address this bottleneck with generative models to synthesize labels at scale from weak supervision sources. The generative model's dependency structure directly affects the quality of the estimated labels, but selecting a structure automatically without any labeled data is a distinct challenge. We propose a structure estimation method that maximizes the $\ell_1$-regularized marginal pseudolikelihood of the observed data. Our analysis shows that the amount of unlabeled data required to identify the true structure scales sublinearly in the number of possible dependencies for a broad class of models. Simulations show that our method is $100\times$ faster than a maximum likelihood approach and selects $1/4$ as many extraneous dependencies. We also show that our method provides an average…
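As a rough illustration of what "maximizing the $\ell_1$-regularized marginal pseudolikelihood" can look like, the sketch below writes one such objective for a single weak supervision source $j$. The notation is an assumption made here for illustration and is not taken from the text above: $\Lambda^{(i)}_j$ is the output of source $j$ on unlabeled example $i$, $\Lambda^{(i)}_{\setminus j}$ collects the outputs of the other sources, $y$ is the unobserved true label, $\theta$ holds the candidate dependency weights, $m$ is the number of unlabeled examples, and $\epsilon > 0$ is the regularization strength.

```latex
% Assumed notation (illustrative, not quoted from the abstract):
%   \Lambda^{(i)}_j            output of source j on example i
%   \Lambda^{(i)}_{\setminus j} outputs of all other sources on example i
%   y                          latent true label, summed out ("marginal")
%   \epsilon                   strength of the \ell_1 penalty
\hat{\theta}_j
  = \arg\min_{\theta}\;
    -\sum_{i=1}^{m}
      \log \sum_{y}
        p_{\theta}\!\left(\Lambda^{(i)}_j,\, y \,\middle|\, \Lambda^{(i)}_{\setminus j}\right)
    \;+\; \epsilon\,\lVert \theta \rVert_1
```

Under this reading, the sum over $y$ is what removes the need for labels, conditioning on the other sources is what makes it a pseudolikelihood rather than a full likelihood, and the $\ell_1$ penalty pushes the weights of unsupported dependencies toward zero so that only dependencies with sufficiently large estimated weights are kept.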

Taxonomy

Topics: Music and Audio Processing · Machine Learning and Data Classification · Generative Adversarial Networks and Image Synthesis