Multi-view and Multi-source Transfers in Neural Topic Modeling with Pretrained Topic and Word Embeddings

Pankaj Gupta; Yatin Chaudhary; Hinrich Schütze

arXiv:1909.06563·cs.CL·September 18, 2019

Open Access

TL;DR

This paper introduces a transfer learning framework for neural topic modeling that leverages pre-trained latent topics and word embeddings from source corpora to improve topic quality and handle data sparsity in target domains.

Contribution

It proposes a novel method to transfer latent topics and word representations from source to target, enhancing neural topic models especially for sparse or short texts.

Findings

01. Achieved state-of-the-art results on multiple datasets.

02. Improved topic coherence and interpretability.

03. Enhanced generalization in various text domains.

Abstract

Though word embeddings and topics are complementary representations, several past works have only used pre-trained word embeddings in (neural) topic modeling to address the data sparsity problem in short texts or small collections of documents. However, no prior work has employed (pre-trained latent) topics in a transfer learning paradigm. In this paper, we propose an approach to (1) perform knowledge transfer using latent topics obtained from a large source corpus, and (2) jointly transfer knowledge via the two representations (or views) in neural topic modeling to improve topic quality and better deal with polysemy and data sparsity issues in a target corpus. In doing so, we first accumulate topics and word representations from one or many source corpora to build a pool of topics and word vectors. Then, we identify one or multiple relevant source domain(s) and take advantage of corresponding…

Taxonomy

Topics: Topic Modeling · Computational and Text Analysis Methods · Advanced Text Analysis Techniques

Methods: Interpretability