TL;DR
This paper introduces a neural topic modeling framework that leverages multi-view, multi-source embeddings, including pretrained topic and word embeddings, to enhance topic quality and handle polysemy across diverse document collections.
Contribution
It proposes a novel multi-source, multi-view embedding approach for neural topic modeling, improving performance on various datasets by integrating multiple pretrained embeddings.
Findings
Achieved state-of-the-art results on multiple datasets.
Improved topic coherence and perplexity scores.
Effective handling of polysemy and data sparsity.
Abstract
Though word embeddings and topics are complementary representations, several past works have only used pretrained word embeddings in (neural) topic modeling to address data sparsity in short-text or small collections of documents. This work presents a novel neural topic modeling framework using multi-view embedding spaces: (1) pretrained topic-embeddings, and (2) pretrained word-embeddings (context-insensitive from GloVe and context-sensitive from BERT models) jointly from one or many sources to improve topic quality and better deal with polysemy. In doing so, we first build respective pools of pretrained topic (i.e., TopicPool) and word embeddings (i.e., WordPool). We then identify one or more relevant source domain(s) and transfer knowledge to guide meaningful learning in the sparse target domain. Within neural topic modeling, we quantify the quality of topics and document…
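The step of identifying relevant source domains from a pool of pretrained topic embeddings can be illustrated with a minimal sketch. It is not the paper's procedure: it assumes each topic embedding is a row of a topic-word matrix over a shared vocabulary, and it assumes cosine similarity as the relevance measure. The names `rank_sources`, `topic_pool`, `source_a`, and `source_b` are hypothetical, and the data is random toy data.

```python
import numpy as np


def cosine_matrix(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a_norm = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_norm = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_norm @ b_norm.T


def rank_sources(target_topics, topic_pool):
    """Rank sources by how well their topics match the target topics.

    Assumed relevance score (not taken from the paper): for each target
    topic, take its best cosine match among the source topics, then average.
    """
    scores = {}
    for name, source_topics in topic_pool.items():
        sims = cosine_matrix(target_topics, source_topics)
        scores[name] = float(sims.max(axis=1).mean())
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data: 4 target topics over a shared 50-word vocabulary.
rng = np.random.default_rng(0)
vocab_size = 50
target_topics = rng.random((4, vocab_size))

# Hypothetical pool: one source that nearly copies the target, one unrelated.
topic_pool = {
    "source_a": target_topics + 0.01 * rng.random((4, vocab_size)),
    "source_b": rng.random((6, vocab_size)),
}

ranking = rank_sources(target_topics, topic_pool)
best_source = ranking[0][0]
print(ranking)
```

In this toy setup the near-copy source ranks first. A real pipeline would need topics from different sources aligned to a common vocabulary before any such comparison.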
Taxonomy
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dropout · Adam · Dense Connections · GloVe Embeddings · Softmax · Linear Warmup With Linear Decay
