Topic Analysis of Superconductivity Literature by Semantic Non-negative Matrix Factorization
Valentin Stanev, Erik Skau, Ichiro Takeuchi, Boian S. Alexandrov

TL;DR
This paper applies SeNMFk, a topic modeling method that extends NMF with semantic structure and robust determination of the number of topics, to superconductivity literature, revealing layered, coherent scientific topics.
Contribution
The paper applies SeNMFk, a recently developed extension of NMF that integrates semantic information and robustly determines the number of topics, to a corpus of superconductivity abstracts.
Findings
SeNMFk extracts coherent, validated topics from superconductivity abstracts.
Topics vary from broad concepts to specific effects or techniques.
Most topics are specialized and dominate only a small subset of abstracts; only three topics are prevalent in almost 40 percent of the abstracts.
Abstract
We utilize a recently developed topic modeling method called SeNMFk, which extends the standard Non-negative Matrix Factorization (NMF) methods by incorporating the semantic structure of the text and adding a robust system for determining the number of topics. With SeNMFk, we were able to extract coherent topics validated by human experts. Of these topics, a few are relatively general and cover broad concepts, while the majority can be precisely mapped to specific scientific effects or measurement techniques. The topics also differ by ubiquity, with only three topics prevalent in almost 40 percent of the abstracts, while each specific topic tends to dominate a small subset of the abstracts. These results demonstrate the ability of SeNMFk to produce a layered and nuanced analysis of large scientific corpora.
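To make the starting point concrete, the sketch below shows plain NMF topic extraction on a toy term-by-document matrix, using standard multiplicative updates. It is not SeNMFk: the semantic extension and the automatic choice of the number of topics described in the abstract are not reproduced here. The vocabulary, the counts, the fixed `k=2`, and the helper names (`matmul`, `transpose`, `nmf`) are all illustrative assumptions, not taken from the paper.

```python
import random

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    cols_B = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols_B] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def nmf(X, k, n_iter=500, seed=0, eps=1e-9):
    """Factor X (terms x docs) into non-negative W (terms x k) and H (k x docs)
    with multiplicative updates for the squared-error objective."""
    rng = random.Random(seed)
    n_terms, n_docs = len(X), len(X[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n_terms)]
    H = [[rng.random() + 0.1 for _ in range(n_docs)] for _ in range(k)]
    for _ in range(n_iter):
        # H <- H * (W^T X) / (W^T W H), element-wise
        Wt = transpose(W)
        num = matmul(Wt, X)
        den = matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n_docs)]
             for i in range(k)]
        # W <- W * (X H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num = matmul(X, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n_terms)]
    return W, H

# Hypothetical toy corpus: rows are terms, columns are four "abstracts".
vocab = ["vortex", "flux", "pinning", "gap", "tunneling", "spectroscopy"]
X = [
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 3, 2],
    [0, 0, 2, 3],
    [0, 0, 1, 2],
]

W, H = nmf(X, k=2)  # k is fixed by hand here; SeNMFk estimates it

# Dominant topic of each document: the row of H with the largest weight.
dominant = [max(range(len(H)), key=lambda t: H[t][d]) for d in range(len(X[0]))]

# Fraction of documents dominated by each topic (a simple ubiquity measure).
share = {t: dominant.count(t) / len(dominant) for t in range(len(H))}

for t in range(len(H)):
    top = sorted(range(len(vocab)), key=lambda i: W[i][t], reverse=True)[:3]
    print(f"topic {t}: {[vocab[i] for i in top]}, dominant in {share[t]:.0%} of docs")
```

The columns of `W` give term weights per topic, and the columns of `H` give topic weights per document. Counting the dominant topic per document is one simple way to compare how ubiquitous each topic is, in the spirit of the prevalence comparison in the abstract.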
Taxonomy
Topics: Topic Modeling · Computational and Text Analysis Methods · Expert Finding and Q&A Systems
