On the challenges of learning with inference networks on sparse, high-dimensional data
Rahul G. Krishnan, Dawen Liang, Matthew Hoffman

TL;DR
This paper investigates the challenges of using inference networks for parameter estimation in deep neural network-based nonlinear factor analysis models, especially on sparse, high-dimensional data, and proposes methods to mitigate underfitting.
Contribution
It identifies underfitting as a key issue in modeling sparse, high-dimensional data with inference networks and introduces iterative optimization techniques to improve model fitting.
Findings
Achieves state-of-the-art results on a benchmark text-count dataset.
Demonstrates excellent performance on top-N recommendation tasks.
Shows that the proposed methods significantly reduce underfitting when modeling sparse data.
Abstract
We study parameter estimation in Nonlinear Factor Analysis (NFA) where the generative model is parameterized by a deep neural network. Recent work has focused on learning such models using inference (or recognition) networks; we identify a crucial problem when modeling large, sparse, high-dimensional datasets -- underfitting. We study the extent of underfitting, highlighting that its severity increases with the sparsity of the data. We propose methods to tackle it via iterative optimization inspired by stochastic variational inference \citep{hoffman2013stochastic} and improvements in the sparse data representation used for inference. The proposed techniques drastically improve the ability of these powerful models to fit sparse data, achieving state-of-the-art results on a benchmark text-count dataset and excellent results on the task of top-N recommendation.
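The iterative optimization mentioned in the abstract can be illustrated with a minimal sketch: start from an initial guess for the local variational parameters (the kind of guess an inference network would supply) and refine it with a few gradient steps on the ELBO. This is not the authors' implementation. For brevity it assumes a toy linear-Gaussian factor model, x ~ N(Wz, I) with prior z ~ N(0, I), in place of the deep nonlinear model the paper studies, and the names `elbo`, `refine`, `W`, `mu0`, and `log_var0` are hypothetical.

```python
import numpy as np

def elbo(x, W, mu, log_var):
    """ELBO (up to an additive constant) for the assumed toy model
    x ~ N(W z, I), z ~ N(0, I), with q(z) = N(mu, diag(exp(log_var)))."""
    var = np.exp(log_var)
    col_sq = np.sum(W ** 2, axis=0)  # squared norm of each column of W
    # Expected log-likelihood under q, dropping the -D/2 * log(2*pi) constant.
    recon = -0.5 * (np.sum((x - W @ mu) ** 2) + np.sum(var * col_sq))
    # KL divergence from q(z) to the standard normal prior.
    kl = 0.5 * np.sum(mu ** 2 + var - 1.0 - log_var)
    return recon - kl

def refine(x, W, mu0, log_var0, steps=50, lr=0.1):
    """Refine local variational parameters by gradient ascent on the ELBO,
    starting from an initial guess (mu0, log_var0)."""
    mu, log_var = mu0.copy(), log_var0.copy()
    col_sq = np.sum(W ** 2, axis=0)
    for _ in range(steps):
        # Analytic ELBO gradients for the toy linear-Gaussian model.
        grad_mu = W.T @ (x - W @ mu) - mu
        grad_lv = 0.5 * (1.0 - np.exp(log_var) * (col_sq + 1.0))
        mu = mu + lr * grad_mu
        log_var = log_var + lr * grad_lv
    return mu, log_var

rng = np.random.default_rng(0)
D, K = 20, 3                                   # observed and latent dimensions (arbitrary)
W = 0.3 * rng.standard_normal((D, K))          # hypothetical fixed decoder weights
x = W @ rng.standard_normal(K) + rng.standard_normal(D)

# Stand-in for an inference network's output: a crude initial guess.
mu0, log_var0 = np.zeros(K), np.zeros(K)
mu, log_var = refine(x, W, mu0, log_var0, steps=200)

print("ELBO at initial guess:  ", elbo(x, W, mu0, log_var0))
print("ELBO after refinement:  ", elbo(x, W, mu, log_var))
```

In this toy setting the ELBO is concave in `mu` and `log_var`, so the refined parameters approach the exact posterior. With a deep generative model the same loop would typically rely on automatic differentiation and would only reach a local optimum.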
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Gaussian Processes and Bayesian Inference · Bayesian Methods and Mixture Models
