Representing the Disciplinary Structure of Physics: A Comparative Evaluation of Graph and Text Embedding Methods
Isabel Constantino, Sadamori Kojaku, Santo Fortunato, Yong-Yeol Ahn

TL;DR
This study compares graph and text embedding methods for representing the hierarchical structure of physics research classifications, finding neural network-based approaches outperform traditional methods in capturing scientific knowledge structures.
Contribution
It provides a comparative evaluation of graph and text embeddings in representing the PACS hierarchy, highlighting the superiority of neural network-based methods.
Findings
Neural network-based embeddings outperform traditional methods.
Graph embedding methods such as node2vec capture the PACS structure better than the other methods compared.
Results suggest potential for combining methods for better interpretability.
Abstract
Recent advances in machine learning offer new ways to represent and study scholarly works and the space of knowledge. Graph and text embeddings provide a convenient vector representation of scholarly works based on citations and text. Yet, it is unclear whether their representations are consistent or provide different views of the structure of science. Here, we compare graph and text embedding by testing their ability to capture the hierarchical structure of the Physics and Astronomy Classification Scheme (PACS) of papers published by the American Physical Society (APS). We also provide a qualitative comparison of the overall structure of the graph and text embeddings for reference. We find that neural network-based methods outperform traditional methods and graph embedding methods such as node2vec are better than other methods at capturing the PACS structure. Our results call for…
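As an illustration of what "capturing the hierarchical structure" of a classification scheme can mean in practice, the sketch below checks whether papers that share more levels of a PACS code also sit closer together in an embedding space. This is not the authors' evaluation protocol, which the abstract does not specify. The function names, the toy vectors, the code assignments, and the rule that the first digit of a PACS code marks the top level and the first two digits the second level are all assumptions made for this example.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def shared_depth(code_a, code_b):
    """Number of leading hierarchy levels two codes share (0, 1, or 2).

    Assumption: level 1 is the first digit, level 2 is the first two digits.
    """
    if code_a[0] != code_b[0]:
        return 0
    if code_a[:2] != code_b[:2]:
        return 1
    return 2


def mean_similarity_by_depth(embeddings, codes):
    """Average pairwise cosine similarity, grouped by shared hierarchy depth."""
    sums, counts = {}, {}
    for a, b in combinations(sorted(embeddings), 2):
        depth = shared_depth(codes[a], codes[b])
        sums[depth] = sums.get(depth, 0.0) + cosine(embeddings[a], embeddings[b])
        counts[depth] = counts.get(depth, 0) + 1
    return {depth: sums[depth] / counts[depth] for depth in sums}


# Hypothetical toy data: four papers with 2-d embeddings and one code each.
toy_embeddings = {
    "p1": [1.0, 0.0],
    "p2": [0.9, 0.1],
    "p3": [0.5, 0.5],
    "p4": [0.0, 1.0],
}
toy_codes = {"p1": "05.45.-a", "p2": "05.70.Ln", "p3": "03.65.Ud", "p4": "89.75.Hc"}

result = mean_similarity_by_depth(toy_embeddings, toy_codes)
for depth in sorted(result):
    print(f"shared levels = {depth}: mean cosine similarity = {result[depth]:.3f}")
```

If an embedding reflects the hierarchy, the mean similarity should rise with the number of shared levels, as it does for this toy data. The same check could be run on graph-based and text-based vectors for the same papers to compare them on equal terms.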
Taxonomy
Topics: scientometrics and bibliometrics research · Advanced Text Analysis Techniques · Biomedical Text Mining and Ontologies
