Clustering and Classification in Text Collections Using Graph Modularity
Grigory Pivovarov, Sergei Trunov

TL;DR
The paper introduces a fast algorithm for clustering and classifying large text collections by optimizing the modularity of the bipartite word-document graph, reporting competitive quality and record-breaking speed.
Contribution
It presents a novel algorithm that leverages bipartite graph modularity for efficient text clustering and classification, outperforming existing methods in speed.
Findings
Competitive clustering and classification quality.
Record-breaking processing speed.
Effective use of bipartite graph modularity.
Abstract
A new fast algorithm for clustering and classification of large collections of text documents is introduced. The new algorithm employs the bipartite graph that realizes the word-document matrix of the collection. Namely, the modularity of the bipartite graph is used as the optimization functional. Experiments performed with the new algorithm on a number of text collections have shown competitive clustering (classification) quality and record-breaking speed.
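To make the optimization functional concrete, here is a minimal sketch of how a bipartite modularity score could be computed from a word-document count matrix. The abstract does not give the exact formula, so this assumes the commonly used Barber form, Q = (1/m) Σ_ij (A_ij − k_i d_j / m) δ(c_i, c_j), where A_ij is the count of word i in document j, k_i and d_j are the word and document degrees, and m is the total count. The function name, the toy matrix, and the label lists are hypothetical illustrations, not the authors' implementation.

```python
def bipartite_modularity(counts, word_labels, doc_labels):
    """Barber-style bipartite modularity (assumed form, not taken from the paper).

    counts[i][j]   -- occurrences of word i in document j
    word_labels[i] -- cluster id assigned to word i
    doc_labels[j]  -- cluster id assigned to document j
    """
    m = sum(sum(row) for row in counts)               # total edge weight
    word_deg = [sum(row) for row in counts]           # k_i: row sums
    doc_deg = [sum(col) for col in zip(*counts)]      # d_j: column sums

    q = 0.0
    for i, row in enumerate(counts):
        for j, a_ij in enumerate(row):
            # Only word-document pairs in the same cluster contribute.
            if word_labels[i] == doc_labels[j]:
                q += a_ij - word_deg[i] * doc_deg[j] / m
    return q / m


# Toy collection: 4 words x 4 documents with two loosely separated topics.
counts = [
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 3, 1],
    [0, 0, 1, 3],
]

# Two clusters that match the block structure.
q_split = bipartite_modularity(counts, [0, 0, 1, 1], [0, 0, 1, 1])

# Everything in one cluster: observed and expected weights cancel.
q_merged = bipartite_modularity(counts, [0, 0, 0, 0], [0, 0, 0, 0])

print(round(q_split, 4), round(q_merged, 4))
```

Under this assumed definition, the partition that follows the block structure scores higher than the trivial single-cluster partition, which scores zero. A clustering algorithm of the kind described would search for the label assignment that maximizes this score. The double loop here is for readability only and is not meant to reflect the speed reported in the abstract.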
Taxonomy
Topics: Complex Network Analysis Techniques · Web Data Mining and Analysis · Advanced Graph Neural Networks
