DOLDA - a regularized supervised topic model for high-dimensional multi-class regression

Måns Magnusson; Leif Jonsson; Mattias Villani

arXiv:1602.00260·stat.ML·October 21, 2016·Comput. Stat.

TL;DR

DOLDA is a supervised topic model designed for high-dimensional multi-class classification, combining the Diagonal Orthant probit model with a Horseshoe prior for efficient variable selection and interpretability.

Contribution

The paper introduces DOLDA, a supervised topic model for multi-class classification that handles both many classes and many covariates, together with a computationally efficient parallel Gibbs sampler for inference.

Findings

01 DOLDA's predictive accuracy is evaluated on two datasets.

02 Learned topics are connected directly to individual classes, with no reference class required, which makes predictions easier to interpret.

03 A parallel Gibbs sampler keeps inference computationally efficient.

Abstract

Generating user-interpretable multi-class predictions in data-rich environments with many classes and explanatory covariates is a daunting task. We introduce Diagonal Orthant Latent Dirichlet Allocation (DOLDA), a supervised topic model for multi-class classification that can handle both many classes and many covariates. To handle many classes, we use the recently proposed Diagonal Orthant (DO) probit model (Johndrow et al., 2013) together with an efficient Horseshoe prior for variable selection/shrinkage (Carvalho et al., 2010). We propose a computationally efficient parallel Gibbs sampler for the new model. An important advantage of DOLDA is that learned topics are directly connected to individual classes without the need for a reference class. We evaluate the model's predictive accuracy on two datasets and demonstrate DOLDA's advantage in interpreting the generated predictions.
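
The two building blocks named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not include the topic model or the Gibbs sampler. It assumes the DO probit class probability has the form P(y = k) proportional to Phi(eta_k) * prod_{j != k} (1 - Phi(eta_j)), following Johndrow et al. (2013), and the standard Horseshoe form beta_j ~ N(0, lambda_j^2 tau^2) with lambda_j ~ half-Cauchy(0, 1), following Carvalho et al. (2010). The function names `do_probit_probs` and `sample_horseshoe` are hypothetical and chosen for illustration only.

```python
import math
import random


def normal_cdf(x):
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def do_probit_probs(eta):
    """Class probabilities under a Diagonal Orthant probit link (assumed form).

    eta[k] is the linear predictor for class k. Class k is selected when its
    latent variable is the only positive one, so the unnormalized weight is
    Phi(eta_k) * prod_{j != k} (1 - Phi(eta_j)). No reference class is needed.
    Note: for very extreme eta values the weights can underflow to zero.
    """
    cdf = [normal_cdf(e) for e in eta]
    weights = []
    for k in range(len(eta)):
        w = cdf[k]
        for j in range(len(eta)):
            if j != k:
                w *= 1.0 - cdf[j]
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]


def sample_horseshoe(n_coef, tau, rng):
    """Draw n_coef coefficients from a Horseshoe prior with global scale tau."""
    coefs = []
    for _ in range(n_coef):
        # Local scale lambda ~ half-Cauchy(0, 1), drawn by inverting its CDF
        # F(x) = (2 / pi) * atan(x) at a uniform draw.
        lam = math.tan(math.pi * rng.random() / 2.0)
        # Coefficient is normal with standard deviation lambda * tau.
        coefs.append(rng.gauss(0.0, lam * tau))
    return coefs


if __name__ == "__main__":
    # Three classes with different linear predictors.
    print(do_probit_probs([1.5, -0.5, -1.0]))
    # Five coefficients; most draws are shrunk near zero, a few can be large.
    print(sample_horseshoe(5, 0.1, random.Random(0)))
```

Because each class has its own linear predictor, every coefficient can be read against a single class, which is the interpretability property the abstract highlights. The heavy-tailed local scales are what let the Horseshoe shrink most coefficients strongly while leaving a few large.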

Code & Models

Repositories

lejon/DiagonalOrthantLDA (Official)
