Identifying Quantum Mechanical Statistics in Italian Corpora
Diederik Aerts, Jonito Aerts Arguëlles, Lester Beltran, Massimiliano Sassoli de Bianchi, Sandro Sozzo

TL;DR
This paper investigates the statistical distribution of words in Italian texts, revealing they follow Bose-Einstein statistics and exhibit quantum-like clustering due to meaning and context, with implications for understanding language and physics.
Contribution
It generalizes a theoretical framework to identify quantum statistical patterns in language and demonstrates their presence in Italian texts, extending previous findings from English.
Findings
Words follow Bose-Einstein statistics in Italian texts
Word clustering is linked to meaning and context, resembling quantum entanglement
Word randomization reduces quantum effects, akin to increasing temperature in physics
Abstract
We present a theoretical and empirical investigation of the statistical behaviour of the words in a text produced by human language. To this aim, we analyse the word distribution of various texts of Italian language selected from a specific literary corpus. We firstly generalise a theoretical framework elaborated by ourselves to identify 'quantum mechanical statistics' in large-size texts. Then, we show that, in all analysed texts, words distribute according to 'Bose-Einstein statistics' and show significant deviations from 'Maxwell-Boltzmann statistics'. Next, we introduce an effect of 'word randomization' which instead indicates that the difference between the two statistical models is not as pronounced as in the original cases. These results confirm the empirical patterns obtained in texts of English language and strongly indicate that identical words tend to 'clump together' as a…
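The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes whitespace tokenization and assumes that the word of frequency rank i is assigned energy level i, neither of which is stated in this summary. The parameter names a, b, c, d and the helper names are hypothetical. The two occupation functions are the textbook Bose-Einstein and Maxwell-Boltzmann forms. The fitting step that would choose the parameters is deliberately left out.

```python
import math
from collections import Counter


def rank_frequencies(text):
    """Return word counts ordered from most to least frequent.

    Assumption: lowercase whitespace tokenization, chosen for illustration.
    """
    words = text.lower().split()
    return [count for _, count in Counter(words).most_common()]


def bose_einstein(energy, a, b):
    """Textbook Bose-Einstein occupation: 1 / (a * exp(energy / b) - 1)."""
    return 1.0 / (a * math.exp(energy / b) - 1.0)


def maxwell_boltzmann(energy, c, d):
    """Textbook Maxwell-Boltzmann occupation: 1 / (c * exp(energy / d))."""
    return 1.0 / (c * math.exp(energy / d))


def squared_error(counts, model, p, q):
    """Sum of squared differences between observed counts and a model.

    Assumption: the word at rank i is treated as sitting at energy level i.
    """
    return sum((n - model(i, p, q)) ** 2 for i, n in enumerate(counts))


if __name__ == "__main__":
    sample = "la casa la porta la casa"
    counts = rank_frequencies(sample)
    # Parameters here are arbitrary placeholders, not fitted values.
    print(counts)
    print(squared_error(counts, bose_einstein, 1.3, 1.0))
    print(squared_error(counts, maxwell_boltzmann, 0.3, 1.0))
```

With fitted parameters, a lower error for `bose_einstein` than for `maxwell_boltzmann` on the same ranked counts would correspond to the kind of deviation the abstract reports. The toy sample above is far too small to show it.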
Taxonomy
Topics: Natural Language Processing Techniques
