Folksonomies and clustering in the collaborative system CiteULike
Andrea Capocci, Guido Caldarelli

TL;DR
This paper analyzes the semantic relationships among tags in CiteULike, a collaborative tagging system for scientific papers. It finds that the clustering coefficient reflects those relationships, which suggests ideas for more efficient data classification and spam detection.
Contribution
It models the system as a tripartite graph of papers, users, and tags, and shows that the clustering coefficient reflects semantic patterns among tags.
Findings
The clustering coefficient reflects semantic patterns among tags
These semantic patterns suggest more efficient methods of data classification
The same patterns suggest more efficient methods of spam detection
Abstract
We analyze CiteULike, an online collaborative tagging system where users bookmark and annotate scientific papers. Such a system can be naturally represented as a tripartite graph whose nodes represent papers, users and tags connected by individual tag assignments. The semantics of tags is studied here in order to uncover the hidden relationships between tags. We find that the clustering coefficient reflects the semantic patterns among tags, providing useful ideas for designing more efficient methods of data classification and spam detection.
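The representation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not say how tags are linked, so the sketch assumes two tags are connected when they are assigned to the same paper, and it uses the standard unweighted local clustering coefficient. The triples in `assignments` and the function names are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical tag assignments, one (user, paper, tag) triple per assignment.
# Each triple is one hyperedge of the tripartite user-paper-tag graph.
assignments = [
    ("u1", "p1", "networks"),
    ("u1", "p1", "graphs"),
    ("u2", "p1", "clustering"),
    ("u2", "p2", "networks"),
    ("u2", "p2", "clustering"),
    ("u3", "p3", "graphs"),
    ("u3", "p3", "biology"),
]


def tag_cooccurrence_graph(triples):
    """Project the tripartite graph onto tags.

    Assumption (not from the paper): two tags are linked if at least
    one paper carries both of them, regardless of which user tagged it.
    """
    tags_by_paper = defaultdict(set)
    for _user, paper, tag in triples:
        tags_by_paper[paper].add(tag)

    neighbors = defaultdict(set)
    for tags in tags_by_paper.values():
        for tag in tags:
            neighbors[tag]  # make sure tags without links still appear
        for a, b in combinations(sorted(tags), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return dict(neighbors)


def local_clustering(neighbors, node):
    """Fraction of pairs of a node's neighbors that are themselves linked."""
    nbrs = neighbors[node]
    k = len(nbrs)
    if k < 2:
        return 0.0  # undefined for fewer than two neighbors; 0 by convention
    links = sum(
        1 for a, b in combinations(sorted(nbrs), 2) if b in neighbors[a]
    )
    return 2.0 * links / (k * (k - 1))


graph = tag_cooccurrence_graph(assignments)
for tag in sorted(graph):
    print(tag, round(local_clustering(graph, tag), 3))
```

In this toy data, "networks", "graphs" and "clustering" all appear on paper p1 and so form a triangle, while "biology" is attached only to "graphs". A tag whose neighbors are mostly linked to each other gets a high coefficient; a tag used alongside many otherwise unrelated tags gets a low one, which is the kind of signal the abstract connects to classification and spam detection.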
