Neural Attention-Aware Hierarchical Topic Model

Yuan Jin; He Zhao; Ming Liu; Lan Du; Wray Buntine

arXiv:2110.07161 · cs.CL · October 15, 2021

TL;DR

This paper introduces a neural hierarchical topic model that incorporates sentence-level information and external semantic knowledge, improving topic coherence and reconstruction accuracy in document modeling.

Contribution

It proposes a variational autoencoder-based model that integrates sentence-level data and pre-trained embeddings with hierarchical regularization, a novel approach in neural topic modeling.

Findings

01. Lowered reconstruction errors at sentence and document levels.
02. Discovered more coherent and meaningful topics.
03. Enhanced utilization of semantic knowledge in topic modeling.

Abstract

Neural topic models (NTMs) apply deep neural networks to topic modelling. Despite their success, NTMs generally have two limitations: (1) only document-level word count information is used for training, while more fine-grained sentence-level information is ignored, and (2) external semantic knowledge regarding documents, sentences and words is not exploited for training. To address these issues, we propose a variational autoencoder (VAE) NTM that jointly reconstructs the sentence and document word counts using combinations of bag-of-words (BoW) topical embeddings and pre-trained semantic embeddings. The pre-trained embeddings are first transformed into a common latent topical space to align their semantics with the BoW embeddings. Our model also features hierarchical KL divergence to leverage embeddings of each document to regularize those of its sentences,…
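The abstract does not spell out the form of the hierarchical KL term, so the following is only a minimal sketch of one way such a regularizer could look. It assumes diagonal Gaussian posteriors, a standard normal prior on the document embedding, and a KL direction in which each sentence posterior is pulled toward its document's posterior. The function names `kl_diag_gaussians` and `hierarchical_kl` are hypothetical and are not taken from the paper.

```python
import numpy as np


def kl_diag_gaussians(mu_q, logvar_q, mu_p, logvar_p):
    """KL(q || p) between diagonal Gaussians, summed over the last axis."""
    var_q = np.exp(logvar_q)
    var_p = np.exp(logvar_p)
    return 0.5 * np.sum(
        logvar_p - logvar_q + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0,
        axis=-1,
    )


def hierarchical_kl(doc_mu, doc_logvar, sent_mu, sent_logvar):
    """Illustrative hierarchical KL penalty for one document.

    doc_mu, doc_logvar:   shape (K,), document posterior parameters.
    sent_mu, sent_logvar: shape (S, K), one row per sentence.
    Assumption: the document posterior acts as the prior for its sentences.
    """
    # Document posterior against a standard normal prior (assumed).
    doc_term = kl_diag_gaussians(
        doc_mu, doc_logvar, np.zeros_like(doc_mu), np.zeros_like(doc_logvar)
    )
    # Each sentence posterior against the document posterior (broadcast over S).
    sent_terms = kl_diag_gaussians(sent_mu, sent_logvar, doc_mu, doc_logvar)
    return doc_term + sent_terms.sum()
```

Under these assumptions the penalty is zero when every sentence posterior matches the document posterior and the document posterior matches the standard normal, and it grows as sentence embeddings drift away from their document's embedding.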

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Computational and Text Analysis Methods