Modeling the average shortest path length in growth of word-adjacency networks
Andrzej Kulig, Stanislaw Drozdz, Jaroslaw Kwapien, Pawel Oswiecimka

TL;DR
This paper studies how word-adjacency networks built from texts grow, finds that their average shortest path length depends on network size in a way that existing models do not reproduce, and proposes an extended model that captures this behavior.
Contribution
It introduces an extended network model incorporating local chain-like growth to accurately replicate empirical shortest path length asymptotics in linguistic networks.
Findings
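In networks built from literary texts, the shortest path length depends on network size through a functional form different from those previously known.
None of the tested variants of the Dorogovtsev-Mendes model reproduces the observed asymptotics.
The extended model, which adds local chain-like linear growth induced by grammar and style, agrees satisfactorily with the empirical results.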
Abstract
We investigate properties of evolving linguistic networks defined by the word-adjacency relation. Such networks belong to the category of networks with accelerated growth, but their shortest path length appears to depend on network size through a functional form different from those known so far. We therefore compare the networks created from literary texts with artificial substitutes based on different variants of the Dorogovtsev-Mendes model and observe that none of them properly simulates the novel asymptotics of the shortest path length. We then identify the local chain-like linear growth induced by grammar and style as a missing element in this model and extend the model by incorporating such effects. In this way, satisfactory agreement with the empirical results is obtained.
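The quantity at the center of the abstract can be illustrated with a minimal Python sketch. It assumes an undirected, unweighted network in which two words are linked whenever they appear next to each other in the text, and it measures the average shortest path length at successive text prefixes to trace how it changes with network size. The function names (`build_adjacency`, `average_shortest_path_length`, `growth_curve`) and the whitespace tokenization are illustrative assumptions, not the authors' actual procedure.

```python
from collections import deque


def build_adjacency(words):
    """Link each pair of consecutive words (undirected, unweighted; an assumption)."""
    graph = {}
    for word in words:
        graph.setdefault(word, set())
    for a, b in zip(words, words[1:]):
        if a != b:  # skip self-loops from immediate repetitions
            graph[a].add(b)
            graph[b].add(a)
    return graph


def average_shortest_path_length(graph):
    """Mean distance over all ordered node pairs, via breadth-first search.

    A network built from one continuous text is connected, so every pair
    of nodes is reachable.
    """
    n = len(graph)
    if n < 2:
        return 0.0
    total = 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    return total / (n * (n - 1))


def growth_curve(words, step):
    """Return (network size N, average shortest path length) per text prefix."""
    curve = []
    for t in range(step, len(words) + 1, step):
        graph = build_adjacency(words[:t])
        curve.append((len(graph), average_shortest_path_length(graph)))
    return curve


if __name__ == "__main__":
    text = "the cat sat on the mat and the dog sat on the cat"
    for size, length in growth_curve(text.split(), step=4):
        print(size, round(length, 3))
```

Plotting the second column against the first for a long text gives the empirical curve that a growth model would need to match. The all-pairs breadth-first search costs on the order of N times the number of edges, so a large vocabulary would call for sampling source nodes or a dedicated graph library.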
