Using Complex Networks to Quantify Consistency in the Use of Words
Diego R. Amancio, Osvaldo N. Oliveira Jr., Luciano da F. Costa

TL;DR
This study introduces a complex network approach to measure word usage consistency in texts, revealing patterns related to familiarity, ambiguity, and author identification, with potential applications in emotion recognition.
Contribution
The paper presents a novel method using complex networks to quantify word usage consistency and indicates that the resulting consistency indices may be employed for author recognition.
Findings
Words ranked by consistency of use follow a log-normal distribution, in contrast to Zipf's law, which applies to frequency of use.
Highly consistent words are used in limited semantic contexts.
Consistency indices can distinguish authors of different text types.
Abstract
In this paper we quantify the consistency of word usage in written texts represented by complex networks, where words were taken as nodes, by measuring the degree of preservation of the node neighborhood. Words were considered highly consistent if the authors used them with the same neighborhood. When ranked according to the consistency of use, the words obeyed a log-normal distribution, in contrast to Zipf's law, which applies to the frequency of use. Consistency correlated positively with familiarity and frequency of use, and negatively with ambiguity and age of acquisition. An inspection of some highly consistent words confirmed that they are used in very limited semantic contexts. A comparison of consistency indices for 8 authors indicated that these indices may be employed for author recognition. Indeed, as expected, authors of novels could be distinguished from those who…
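The idea of "preservation of the node neighborhood" can be illustrated with a minimal Python sketch. This is not the authors' consistency index, whose exact definition does not appear in this summary. It assumes, purely for illustration, that each text becomes a word-adjacency network (two words are linked when they appear next to each other) and that preservation is scored as the mean pairwise Jaccard overlap of a word's neighbor sets across texts. The function names `neighbor_sets`, `jaccard`, and `consistency` are hypothetical.

```python
from itertools import combinations


def neighbor_sets(tokens):
    """Map each word to the set of words adjacent to it in the token sequence."""
    neighbors = {}
    for left, right in zip(tokens, tokens[1:]):
        if left == right:
            continue  # skip self-loops from immediate repetitions
        neighbors.setdefault(left, set()).add(right)
        neighbors.setdefault(right, set()).add(left)
    return neighbors


def jaccard(a, b):
    """Jaccard overlap of two sets (an assumed stand-in for 'preservation')."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def consistency(word, texts):
    """Mean pairwise Jaccard overlap of a word's neighbor sets across texts.

    Returns None when the word occurs in fewer than two texts.
    Tokenization is a naive lowercase whitespace split (an assumption).
    """
    sets = [neighbor_sets(t.lower().split()).get(word) for t in texts]
    sets = [s for s in sets if s]
    if len(sets) < 2:
        return None
    pairs = list(combinations(sets, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


texts = ["the cat sat on the mat", "the cat sat on the rug"]
print(consistency("sat", texts))  # same neighbors in both texts: 1.0
print(consistency("the", texts))  # neighbors partly differ: 0.5
```

Under this toy measure, a word that always appears beside the same words scores near 1, while a word used in varied contexts scores lower, which mirrors the abstract's description of highly consistent words as those used with the same neighborhood.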
