Quantifying Knowledge Production Efficiency with Thermodynamics: A Data-Driven Study of Scientific Concepts
Artem Chumachenko, Brett Buttliere

TL;DR
This paper applies thermodynamic concepts to study how scientific concepts evolve over time, using their in-text frequency distributions in more than 500,000 physics papers published between 2000 and 2018.
Contribution
It introduces a data-driven framework using maximum entropy and thermodynamic principles to quantify concept dynamics.
Findings
Two characteristic regimes of concept dynamics, stable and driven, are identified, separated by a transition point near criticality.
A residual-information measure quantifies departures from equilibrium in concept evolution.
Efficiency indicators describe how concepts maintain or reorganize their structure over time.
Abstract
We develop a data-driven framework for analyzing how scientific concepts evolve through their empirical in-text frequency distributions in large text corpora. For each concept, the observed distribution is paired with a maximum entropy equilibrium reference, which takes a generalized Boltzmann form determined by two measurable statistical moments. Using data from more than 500,000 physics papers (about 13,000 concepts, 2000–2018), we reconstruct the temporal trajectories of the associated MaxEnt parameters and entropy measures, and we identify two characteristic regimes of concept dynamics, stable and driven, separated by a transition point near criticality. Departures from equilibrium are quantified using a residual-information measure that captures how much structure a concept exhibits beyond its equilibrium baseline. To analyze temporal change, we adapt the Hatano–Sasa and…
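The pairing of an observed distribution with a maximum entropy reference can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration, not the authors' method: the abstract does not say which two moments are used, so the sketch constrains the mean of k and the mean of ln k (giving a reference of the form q(k) ∝ exp(-a·k - b·ln k)), restricts the support to k = 1..K, and takes the residual information to be the Kullback–Leibler divergence from the observed distribution to that reference. The function names `fit_maxent`, `entropy`, and `residual_information` are hypothetical.

```python
import math

def fit_maxent(p, iters=200, tol=1e-10):
    """Fit q(k) ∝ exp(-a*k - b*ln k) on k = 1..len(p) so that q matches
    the mean of k and the mean of ln k under p (assumed moment choice).
    Uses damped Newton steps on the convex dual. Returns (a, b, q)."""
    f1 = [float(k) for k in range(1, len(p) + 1)]
    f2 = [math.log(x) for x in f1]
    m1 = sum(pi * x for pi, x in zip(p, f1))
    m2 = sum(pi * y for pi, y in zip(p, f2))

    def model(a, b):
        logits = [-a * x - b * y for x, y in zip(f1, f2)]
        mx = max(logits)                      # subtract max for stability
        w = [math.exp(l - mx) for l in logits]
        z = sum(w)
        return [wi / z for wi in w], mx + math.log(z)

    def dual(a, b):
        # log Z(a, b) + a*m1 + b*m2, minimized at the moment-matching point
        return model(a, b)[1] + a * m1 + b * m2

    a = b = 0.0
    for _ in range(iters):
        q, _ = model(a, b)
        e1 = sum(qi * x for qi, x in zip(q, f1))
        e2 = sum(qi * y for qi, y in zip(q, f2))
        g1, g2 = m1 - e1, m2 - e2             # gradient of the dual
        if abs(g1) + abs(g2) < tol:
            break
        # Hessian of the dual is the covariance of (k, ln k) under q
        c11 = sum(qi * (x - e1) ** 2 for qi, x in zip(q, f1))
        c22 = sum(qi * (y - e2) ** 2 for qi, y in zip(q, f2))
        c12 = sum(qi * (x - e1) * (y - e2) for qi, x, y in zip(q, f1, f2))
        det = c11 * c22 - c12 * c12
        if det <= 1e-300:
            break
        d1 = -(c22 * g1 - c12 * g2) / det     # Newton direction
        d2 = -(-c12 * g1 + c11 * g2) / det
        step, base = 1.0, dual(a, b)
        while step > 1e-8 and dual(a + step * d1, b + step * d2) > base + 1e-14:
            step *= 0.5                       # backtrack until dual decreases
        a, b = a + step * d1, b + step * d2
    return a, b, model(a, b)[0]

def entropy(p):
    """Shannon entropy in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def residual_information(p, q):
    """KL divergence D(p || q) in nats (assumed residual measure)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

if __name__ == "__main__":
    # Made-up counts: number of papers mentioning a concept k times, k = 1..8
    counts = [120, 80, 30, 25, 10, 9, 4, 2]
    total = sum(counts)
    p = [c / total for c in counts]
    a, b, q = fit_maxent(p)
    print("multipliers:", a, b)
    print("residual information (nats):", residual_information(p, q))
```

Because the reference matches the constrained moments of the observed distribution, the divergence computed here equals the entropy of the reference minus the entropy of the data, so under these assumptions it reads directly as the structure a concept carries beyond its equilibrium baseline.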
Taxonomy
Topics: Language and cultural evolution · Authorship Attribution and Profiling · Computational and Text Analysis Methods
