Clustering via Content-Augmented Stochastic Blockmodels
J. Massey Cashore, Xiaoting Zhao, Alexander A. Alemi, Yujia Liu, Peter I. Frazier

TL;DR
This paper introduces content-augmented stochastic blockmodels (CASB) that integrate item content with interaction data to improve clustering accuracy in bipartite graphs, outperforming existing methods.
Contribution
The paper proposes CASB, a novel model combining content and interaction data for better community detection in bipartite graphs.
Findings
CASB achieves higher clustering accuracy than benchmark methods.
Content integration significantly improves community detection.
Validated on datasets of scientists interacting with scientific articles.
Abstract
Much of the data being created on the web contains interactions between users and items. Stochastic blockmodels, and other methods for community detection and clustering of bipartite graphs, can infer latent user communities and latent item clusters from this interaction data. These methods, however, typically ignore the items' contents and the information they provide about item clusters, despite the tendency of items in the same latent cluster to share commonalities in content. We introduce content-augmented stochastic blockmodels (CASB), which use item content together with user-item interaction data to enhance the user communities and item clusters learned. Comparisons to several state-of-the-art benchmark methods, on datasets arising from scientists interacting with scientific articles, show that content-augmented stochastic blockmodels provide highly accurate clusters with respect…
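To make the idea in the abstract concrete, the sketch below samples from a toy generative model in which user-item interactions depend on the latent user community and latent item cluster, and each item's content (here, word counts) depends on its cluster. This is an illustrative assumption, not the authors' CASB specification: the function name `sample_toy_content_sbm`, the Bernoulli interaction model, the Dirichlet-multinomial content model, and all parameter values are hypothetical.

```python
import numpy as np

def sample_toy_content_sbm(n_users=30, n_items=40, n_communities=2,
                           n_clusters=3, vocab_size=12, words_per_item=20,
                           seed=0):
    """Sample a toy bipartite blockmodel with per-cluster item content.

    Hypothetical illustration only; not the model defined in the paper.
    """
    rng = np.random.default_rng(seed)

    # Latent assignments: one community per user, one cluster per item.
    user_comm = rng.integers(n_communities, size=n_users)
    item_clust = rng.integers(n_clusters, size=n_items)

    # Interaction probability for each (community, cluster) block.
    block_prob = rng.uniform(0.05, 0.9, size=(n_communities, n_clusters))

    # User-item interactions: Bernoulli draws with block-specific probability.
    p = block_prob[user_comm][:, item_clust]      # shape (n_users, n_items)
    interactions = rng.binomial(1, p)

    # Content: each cluster has its own word distribution, so items in the
    # same cluster tend to share vocabulary.
    word_dist = rng.dirichlet(np.full(vocab_size, 0.3), size=n_clusters)
    content = np.vstack([rng.multinomial(words_per_item, word_dist[c])
                         for c in item_clust])    # shape (n_items, vocab_size)

    return user_comm, item_clust, interactions, content
```

In a model of this shape, both the interaction matrix and the content matrix carry information about the same item clusters, which is why using them together can sharpen the inferred clusters compared with using interactions alone.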
Taxonomy
Topics: Complex Network Analysis Techniques · Data Stream Mining Techniques · Advanced Clustering Algorithms Research
