Topic Analysis with Side Information: A Neural-Augmented LDA Approach

Biyi Fang; Truong Vo; Kripa Rajshekhar; Diego Klabjan

arXiv:2510.24918·cs.LG·November 4, 2025

TL;DR

This paper introduces nnLDA, a neural-augmented topic model that effectively incorporates auxiliary information to improve topic coherence, interpretability, and downstream task performance over traditional LDA.

Contribution

The paper proposes nnLDA, a novel neural-augmented probabilistic model that dynamically integrates side information via a neural prior, enhancing traditional topic modeling capabilities.

Findings

1. nnLDA outperforms LDA and DMR in topic coherence.
2. The model achieves lower perplexity on benchmark datasets.
3. It improves downstream classification accuracy.

Abstract

Traditional topic models such as Latent Dirichlet Allocation (LDA) have been widely used to uncover latent structures in text corpora, but they often struggle to integrate auxiliary information such as metadata, user attributes, or document labels. These limitations restrict their expressiveness, personalization, and interpretability. To address this, we propose nnLDA, a neural-augmented probabilistic topic model that dynamically incorporates side information through a neural prior mechanism. nnLDA models each document as a mixture of latent topics, where the prior over topic proportions is generated by a neural network conditioned on auxiliary features. This design allows the model to capture complex nonlinear interactions between side information and topic distributions that static Dirichlet priors cannot represent. We develop a stochastic variational Expectation-Maximization…
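The generative story described in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes the neural prior is a one-hidden-layer network that outputs Dirichlet concentration parameters, and every name here (`neural_prior`, `generate_document`, `make_toy_model`), the layer sizes, and the random weights are hypothetical choices for illustration. The paper's actual parameterization and its variational EM training procedure may differ and are not shown.

```python
import math
import random


def softplus(x):
    # Numerically stable log(1 + exp(x)); the result is always positive.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def affine(weights, bias, inputs):
    # Dense layer without activation: one output per weight row.
    return [sum(w * v for w, v in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]


def neural_prior(side_info, params):
    # ASSUMPTION: a one-hidden-layer MLP maps side information to Dirichlet
    # concentration parameters; the small offset keeps them strictly positive.
    w1, b1, w2, b2 = params
    hidden = [math.tanh(h) for h in affine(w1, b1, side_info)]
    return [softplus(a) + 1e-3 for a in affine(w2, b2, hidden)]


def sample_dirichlet(alpha, rng):
    # Normalized independent Gamma(alpha_k, 1) draws follow Dirichlet(alpha).
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(side_info, params, topic_word, n_words, rng):
    alpha = neural_prior(side_info, params)   # document-specific prior
    theta = sample_dirichlet(alpha, rng)      # topic proportions
    words = []
    for _ in range(n_words):
        # Draw a topic for this position, then a word id from that topic.
        z = rng.choices(range(len(theta)), weights=theta)[0]
        w = rng.choices(range(len(topic_word[z])), weights=topic_word[z])[0]
        words.append(w)
    return alpha, theta, words


def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def make_toy_model(n_side, n_hidden, n_topics, n_vocab, rng):
    # Hypothetical untrained weights and random topic-word distributions.
    params = (random_matrix(n_hidden, n_side, rng), [0.0] * n_hidden,
              random_matrix(n_topics, n_hidden, rng), [0.0] * n_topics)
    topic_word = [sample_dirichlet([0.5] * n_vocab, rng)
                  for _ in range(n_topics)]
    return params, topic_word


rng = random.Random(0)
params, topic_word = make_toy_model(2, 4, 3, 6, rng)
for side_info in ([1.0, 0.0], [0.0, 1.0]):
    alpha, theta, words = generate_document(side_info, params, topic_word, 8, rng)
    print(side_info, [round(t, 3) for t in theta], words)
```

In this sketch, two documents with different side-information vectors receive different concentration parameters and therefore different priors over topic proportions. A single static Dirichlet prior, as in standard LDA, would give every document the same prior regardless of its metadata.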
