A Framework for Neural Topic Modeling of Text Corpora
Shayan Fazeli, Majid Sarrafzadeh

TL;DR
FAME is an open-source framework that leverages traditional and transformer-based embeddings for improved neural topic modeling and document clustering in text corpora.
Contribution
The paper introduces FAME, a novel framework integrating diverse textual features, including BERT embeddings, for enhanced topic discovery and clustering.
Findings
Effective extraction of topics using BERT embeddings
Improved clustering of semantically similar documents
Open-source availability of the FAME framework
Abstract
Topic modeling refers to the problem of discovering the main topics that occur in corpora of textual data, with solutions finding crucial applications in numerous fields. In this work, inspired by recent advancements in the Natural Language Processing domain, we introduce FAME, an open-source framework that provides an efficient mechanism for extracting and incorporating textual features and using them to discover topics and cluster semantically similar text documents in a corpus. These features range from traditional approaches (e.g., frequency-based) to the most recent auto-encoding embeddings from transformer-based language models such as the BERT model family. To demonstrate the effectiveness of this library, we conducted experiments on the well-known News-Group dataset. The library is available online.
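To make the pipeline described in the abstract concrete, here is a minimal sketch of its frequency-based end: each document becomes a TF-IDF vector, and documents are grouped by cosine similarity. This is not FAME's API. The function names (`tfidf_vectors`, `cosine`, `greedy_cluster`), the toy documents, the greedy grouping rule, and the 0.2 threshold are all assumptions chosen for illustration, and only the Python standard library is used.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF keeps a positive weight even for very common terms.
        vectors.append({
            term: (count / len(tokens)) * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def greedy_cluster(vectors, threshold=0.2):
    """Assign each document to the first cluster whose seed is similar enough.

    Hypothetical grouping rule for illustration only; a real pipeline would
    more likely use k-means or another standard clustering algorithm.
    """
    seeds, labels = [], []
    for vec in vectors:
        for cluster_id, seed in enumerate(seeds):
            if cosine(vec, seed) >= threshold:
                labels.append(cluster_id)
                break
        else:
            # No existing seed is close enough: start a new cluster.
            seeds.append(vec)
            labels.append(len(seeds) - 1)
    return labels

# Toy corpus (made up for this sketch, not taken from the News-Group dataset).
docs = [
    "the rocket launch reached orbit",
    "nasa plans another rocket launch",
    "hockey team won the game",
    "the hockey season starts with a home game",
]
labels = greedy_cluster(tfidf_vectors(docs))
print(labels)  # [0, 0, 1, 1]
```

Under the same assumptions, swapping `tfidf_vectors` for a function that returns dense sentence embeddings from a BERT-style model would exercise the transformer-based end of the feature range, with the similarity and grouping steps adapted to dense vectors.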
Taxonomy
Topics: Topic Modeling · Advanced Text Analysis Techniques · Computational and Text Analysis Methods
Methods: Attention Is All You Need · Linear Layer · Dropout · Dense Connections · Layer Normalization · Adam · Weight Decay · Linear Warmup With Linear Decay · Softmax
