Deep Autoencoder-based Fuzzy C-Means for Topic Detection

Hendri Murfi; Natasha Rosaline; Nora Hariadi

arXiv:2102.02636·cs.IR·December 28, 2021

TL;DR

This paper introduces a novel deep autoencoder-based fuzzy c-means clustering method for topic detection in text data, combining deep learning with fuzzy clustering to improve topic coherence and interpretability.

Contribution

It proposes DFCM, a new hybrid approach that leverages deep autoencoders for feature learning and fuzzy c-means for clustering, enhancing existing topic detection techniques.

Findings

01

DFCM improves coherence scores over eigenspace fuzzy c-means.

02

DFCM is comparable to NMF and LDA in performance.

03

The method effectively captures meaningful topics from textual data.

Abstract

Topic detection is the process of determining topics from a collection of textual data. One family of topic detection methods is clustering-based and assumes that the centroids are the topics. Clustering has the advantage that it can process data with negative representations, so it can be combined with a broader range of representation learning methods. In this paper, we adopt deep learning for topic detection by combining a deep autoencoder with fuzzy c-means, an approach called deep autoencoder-based fuzzy c-means (DFCM). The encoder of the autoencoder learns a lower-dimensional representation. Fuzzy c-means groups the lower-dimensional representations to identify the centroids. The autoencoder's decoder transforms the centroids back into the original representation, where they are interpreted as the topics. Our simulation shows that DFCM improves the coherence…
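The three steps in the abstract (encode, cluster, decode the centroids) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the autoencoder here has a single tanh hidden layer rather than a deep stack, the toy document-term matrix, vocabulary, and hyperparameters (`n_hidden`, `lr`, `epochs`, fuzzifier `m`) are invented for the example, and the helper names `train_autoencoder`, `fuzzy_c_means`, and `top_words` are hypothetical.

```python
import numpy as np

def train_autoencoder(X, n_hidden, epochs=500, lr=0.02, seed=0):
    """Tiny autoencoder (tanh encoder, linear decoder) trained by gradient descent.
    Assumption: one hidden layer stands in for the paper's deep autoencoder."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0, 0.1, (d, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(0, 0.1, (n_hidden, d)); b2 = np.zeros(d)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)          # encoder output (may be negative)
        R = H @ W2 + b2                   # reconstruction
        err = (R - X) / n                 # gradient of the squared error w.r.t. R (up to a constant)
        dH = (err @ W2.T) * (1 - H ** 2)  # backpropagate through tanh
        W2 -= lr * (H.T @ err); b2 -= lr * err.sum(axis=0)
        W1 -= lr * (X.T @ dH);  b1 -= lr * dH.sum(axis=0)
    encode = lambda A: np.tanh(A @ W1 + b1)
    decode = lambda Z: Z @ W2 + b2
    return encode, decode

def fuzzy_c_means(Z, n_clusters, m=2.0, n_iter=100, seed=0):
    """Standard fuzzy c-means: alternate centroid and membership updates."""
    rng = np.random.default_rng(seed)
    U = rng.random((Z.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)     # each row of memberships sums to 1
    for _ in range(n_iter):
        Um = U ** m
        centroids = (Um.T @ Z) / Um.sum(axis=0)[:, None]
        dist = np.linalg.norm(Z[:, None, :] - centroids[None, :, :], axis=2) + 1e-12
        inv = dist ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
    return centroids, U

def top_words(topic_matrix, vocab, k=3):
    """Read each decoded centroid as a topic: its k highest-weighted words."""
    return [[vocab[i] for i in np.argsort(row)[::-1][:k]] for row in topic_matrix]

# Toy document-term matrix (rows: documents, columns: words); purely illustrative.
vocab = ["ball", "goal", "team", "vote", "law", "party"]
X = np.array([
    [2, 1, 1, 0, 0, 0],
    [1, 2, 1, 0, 0, 0],
    [1, 1, 2, 0, 0, 0],
    [0, 0, 0, 2, 1, 1],
    [0, 0, 0, 1, 2, 1],
    [0, 0, 0, 1, 1, 2],
], dtype=float)

encode, decode = train_autoencoder(X, n_hidden=2)       # 1. learn a low-dimensional representation
centroids, U = fuzzy_c_means(encode(X), n_clusters=2)   # 2. cluster in the encoded space
topics = decode(centroids)                              # 3. map centroids back to word space

for words in top_words(topics, vocab):
    print(words)
```

Because the tanh encoder produces negative values, the clustering step runs on a representation that nonnegative methods could not handle directly, which is the property the abstract highlights. The membership matrix `U` gives each document a soft degree of membership in every topic rather than a single hard assignment.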
