Inflection system of a language as a complex network
Henryk Fukś

TL;DR
This paper models the inflection structure of Latin as a bipartite network of dictionary headwords and inflected forms. Its connected components (word groups) are used to build coverage curves for Latin texts, and their size distribution resembles that of percolation clusters near the critical point.
Contribution
It introduces a novel network-based approach to analyze inflection structures, including the construction of bipartite graphs and the study of their component distributions.
Findings
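The sparse bipartite graph decomposes into a large number of connected components, called word groups.
When all theoretically possible inflected forms are included, the distribution of component sizes resembles the cluster distribution in lattice percolation near the critical point.
Word groups can be used to construct coverage curves of selected Latin texts.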
Abstract
We investigate the inflection structure of a synthetic language using Latin as an example. We construct a bipartite graph in which one group of vertices corresponds to dictionary headwords and the other group to inflected forms encountered in a given text. Each inflected form is connected to its corresponding headword, which in some cases is non-unique. The resulting sparse graph decomposes into a large number of connected components, to be called word groups. We then show how the concept of the word group can be used to construct coverage curves of selected Latin texts. We also investigate a version of the inflection graph in which all theoretically possible inflected forms are included. The distribution of sizes of connected components of this graph resembles the cluster distribution in lattice percolation near the critical point.
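The construction described in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the tiny lemmatization table `FORM_TO_HEADWORDS`, the helper names `word_groups` and `coverage_curve`, and in particular the reading of a coverage curve as the cumulative fraction of text tokens covered by the most frequent word groups. The paper's own data and exact definitions may differ.

```python
from collections import Counter, defaultdict

# Toy, hand-written table mapping inflected forms to candidate headwords.
# Hypothetical illustration data; some forms are deliberately ambiguous.
FORM_TO_HEADWORDS = {
    "amat": ["amo"],
    "amor": ["amo", "amor"],    # ambiguous: verb form or noun
    "amoris": ["amor"],
    "regis": ["rex", "rego"],   # ambiguous: noun form or verb form
    "regit": ["rego"],
    "puella": ["puella"],
}


def word_groups(form_to_headwords):
    """Return the connected components of the bipartite form/headword graph.

    Vertices are tagged tuples so that a form and a headword with the
    same spelling (e.g. "amor") remain distinct vertices.
    """
    adj = defaultdict(set)
    for form, heads in form_to_headwords.items():
        for head in heads:
            adj[("form", form)].add(("head", head))
            adj[("head", head)].add(("form", form))

    seen = set()
    groups = []
    for start in list(adj):
        if start in seen:
            continue
        # Depth-first search collecting one component.
        seen.add(start)
        stack = [start]
        component = set()
        while stack:
            vertex = stack.pop()
            component.add(vertex)
            for neighbour in adj[vertex]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        groups.append(component)
    return groups


def coverage_curve(tokens, groups):
    """Cumulative fraction of tokens covered by the k most frequent groups.

    This definition of coverage is an assumption of the sketch.
    Tokens missing from every group count toward the total but are
    never covered.
    """
    group_of = {}
    for index, component in enumerate(groups):
        for kind, name in component:
            if kind == "form":
                group_of[name] = index

    counts = Counter(group_of[t] for t in tokens if t in group_of)
    covered = 0
    curve = []
    for _, count in counts.most_common():
        covered += count
        curve.append(covered / len(tokens))
    return curve


if __name__ == "__main__":
    groups = word_groups(FORM_TO_HEADWORDS)
    print(sorted(len(g) for g in groups))
    text = ["amat", "amor", "regis", "puella", "amat", "regit"]
    print(coverage_curve(text, groups))
```

In this toy table the ambiguous forms are what merge otherwise separate headwords into a single word group, which is the effect of the non-unique form-to-headword links mentioned in the abstract.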
