Text as Statistical Mechanics Object
K. Koroutchev, E. Korutcheva

TL;DR
This paper models human written text using a statistical mechanics framework, deriving potential energy from large corpora, and finds that specific heat distinguishes between closed class words and specific terms.
Contribution
It introduces a novel statistical mechanics model for text analysis, linking thermodynamic concepts to linguistic features.
Findings
The specific heat parameter separates closed class words from the specific terms used in the text.
The results were checked numerically.
The potential energy for different parts of the text is derived from a large text corpus.
Abstract
In this article we present a model of human written text based on a statistical mechanics approach, deriving the potential energy for different parts of the text from a large text corpus. We have checked the results numerically and found that the specific heat parameter effectively separates the closed class words from the specific terms used in the text.
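The abstract does not spell out how the potential energy or the specific heat is computed, so the following is only a minimal illustrative sketch and not the authors' model. It assumes a Boltzmann-style energy E(w) = -ln p(w), where p(w) is a word's relative frequency in the corpus, and uses the standard canonical-ensemble fluctuation formula C = (<E^2> - <E>^2) / T^2 with k_B = 1. The function names `word_energies` and `thermodynamics` are hypothetical.

```python
import math
from collections import Counter


def word_energies(tokens):
    """Assumed energy per word type: E(w) = -ln p(w), p(w) = relative frequency."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: -math.log(count / total) for word, count in counts.items()}


def thermodynamics(energies, temperature=1.0):
    """Return (Z, <E>, C) for states weighted by exp(-E/T), with k_B = 1.

    C is the specific heat from the fluctuation formula
    C = (<E^2> - <E>^2) / T^2. This is textbook statistical mechanics,
    applied here to the assumed energies above.
    """
    values = list(energies.values())
    weights = [math.exp(-e / temperature) for e in values]
    z = sum(weights)
    mean_e = sum(e * w for e, w in zip(values, weights)) / z
    mean_e2 = sum(e * e * w for e, w in zip(values, weights)) / z
    return z, mean_e, (mean_e2 - mean_e ** 2) / temperature ** 2


# Toy corpus, for illustration only.
tokens = "the cat sat on the mat the end".split()
energies = word_energies(tokens)
z, mean_energy, heat = thermodynamics(energies, temperature=1.0)
print(z, mean_energy, heat)
```

Under this assumed energy, the weights at T = 1 reduce to the word frequencies themselves, so Z = 1 and the mean energy equals the Shannon entropy of the frequency distribution in nats. A corpus in which every word is equally frequent has zero energy variance and therefore zero specific heat in this sketch.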
Taxonomy
Topics: Advanced Text Analysis Techniques · Topic Modeling · Computational Physics and Python Applications
