Ground truth? Concept-based communities versus the external classification of physics manuscripts

Vasyl Palchykov; Valerio Gemmetto; Alexey Boyarsky; Diego Garlaschelli

arXiv:1602.08451·cs.DL·August 25, 2016

TL;DR

This study compares community detection results on scientific publication networks with expert-made classifications, revealing discrepancies that highlight interdisciplinary overlaps and methodological similarities not captured by external labels.

Contribution

It demonstrates that external classifications may not fully represent the thematic structure of scientific communities, emphasizing the importance of analyzing intrinsic network-based communities.

Findings

01 Community detection partly aligns with expert classifications

02 Discrepancies reveal interdisciplinary overlaps

03 Methodological similarities are often overlooked

Abstract

Community detection techniques are widely used to infer hidden structures within interconnected systems. Despite demonstrating high accuracy on benchmarks, they reproduce the external classification for many real-world systems with a significant level of discrepancy. A widely accepted reason behind such an outcome is the unavoidable loss of non-topological information (such as node attributes) encountered when the original complex system is represented as a network. In this article we emphasize that the observed discrepancies may also be caused by a different reason: the external classification itself. To this end we use scientific publication data which i) exhibit a well-defined modular structure and ii) come with an expert-made classification of research articles. Having represented the articles and the extracted scientific concepts both as a bipartite network and as its unipartite…
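The setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the toy articles, concepts, and external labels are invented, connected components stand in for a real community detection algorithm, and normalized mutual information (NMI) is an assumed choice of comparison measure that the abstract does not name.

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import log

# Hypothetical toy data: article -> concepts extracted from it.
article_concepts = {
    "a1": {"dark matter", "galaxy"},
    "a2": {"dark matter", "neutrino"},
    "a3": {"graphene", "band gap"},
    "a4": {"graphene", "superconductivity"},
}
# Hypothetical expert-made (external) classification of the same articles.
external_labels = {"a1": "astro", "a2": "hep", "a3": "cond-mat", "a4": "cond-mat"}


def project_onto_articles(article_concepts):
    """Unipartite projection of the bipartite article-concept network:
    two articles are linked with weight = number of shared concepts."""
    concept_articles = defaultdict(set)
    for article, concepts in article_concepts.items():
        for concept in concepts:
            concept_articles[concept].add(article)
    weights = Counter()
    for articles in concept_articles.values():
        for u, v in combinations(sorted(articles), 2):
            weights[(u, v)] += 1
    return dict(weights)


def connected_components(nodes, edges):
    """Placeholder for community detection (assumption): label each
    connected component of the projected network as one community."""
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    label, next_id = {}, 0
    for start in sorted(nodes):
        if start in label:
            continue
        label[start] = next_id
        stack = [start]
        while stack:
            node = stack.pop()
            for nb in neighbours[node]:
                if nb not in label:
                    label[nb] = next_id
                    stack.append(nb)
        next_id += 1
    return label


def normalized_mutual_information(labels_a, labels_b):
    """NMI = 2 I(A;B) / (H(A) + H(B)) between two partitions of the same nodes."""
    nodes = sorted(labels_a)
    n = len(nodes)
    count_a = Counter(labels_a[x] for x in nodes)
    count_b = Counter(labels_b[x] for x in nodes)
    joint = Counter((labels_a[x], labels_b[x]) for x in nodes)
    h_a = -sum(c / n * log(c / n) for c in count_a.values())
    h_b = -sum(c / n * log(c / n) for c in count_b.values())
    mi = sum(
        c / n * log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )
    if h_a + h_b == 0:
        return 1.0  # both partitions are trivial and therefore identical
    return 2 * mi / (h_a + h_b)


edges = project_onto_articles(article_concepts)
detected = connected_components(article_concepts, edges)
nmi = normalized_mutual_information(detected, external_labels)
print(f"NMI between detected communities and external labels: {nmi:.2f}")
```

In this toy case the two partitions only partly agree: the articles sharing the concept "dark matter" fall into one detected community although their external labels differ, which is the kind of discrepancy the abstract attributes to the external classification itself.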
