Initial Comparison of Linguistic Networks Measures for Parallel Texts
Kristina Ban, Ana Meštrović, Sanda Martinčić-Ipšić

TL;DR
This study explores Croatian syllable networks using complex network analysis, revealing small-world properties and similarities with Portuguese and Chinese syllable networks, contributing to understanding language structure through network measures.
Contribution
It introduces the analysis of Croatian syllable networks and compares their properties with those of Portuguese and Chinese, highlighting their small-world characteristics.
Findings
Croatian syllable networks have high clustering coefficients.
They exhibit properties of small-world networks.
Croatian networks are similar to Portuguese and Chinese networks.
Abstract
This paper presents preliminary results of an analysis of Croatian syllable networks. A syllable network is a network in which nodes are syllables and links between them are constructed according to their connections within words. We analyze networks of syllables generated from texts collected from the Croatian Wikipedia and blogs. As the main tool we use complex network analysis methods, which provide mechanisms that can reveal new patterns in language structure. We aim to show that syllable networks have a much higher clustering coefficient than Erdős-Rényi random networks. The results indicate that Croatian syllable networks exhibit certain properties of small-world networks. Furthermore, we compared Croatian syllable networks with Portuguese and Chinese syllable networks and showed that they have similar properties.
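The construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that two syllables are linked when they appear consecutively inside a word, that links are undirected and unweighted, and that words arrive already split into syllables. The word list is a hand-made toy example, not data from the paper, and all function names are hypothetical. The random-network baseline uses the standard fact that the expected clustering coefficient of an Erdős-Rényi graph is approximately its edge density.

```python
from collections import defaultdict, deque
from itertools import combinations


def build_syllable_network(words):
    """Build an undirected adjacency map from pre-syllabified words.

    Assumption: consecutive syllables inside a word are linked.
    """
    adj = defaultdict(set)
    for syllables in words:
        for s in syllables:
            adj[s]  # touching the key registers the syllable as a node
        for a, b in zip(syllables, syllables[1:]):
            if a != b:  # skip self-loops such as "ma-ma"
                adj[a].add(b)
                adj[b].add(a)
    return dict(adj)


def average_clustering(adj):
    """Mean local clustering coefficient; nodes of degree < 2 count as 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def average_path_length(adj):
    """Mean shortest-path length over all reachable ordered node pairs (BFS)."""
    total = 0
    pairs = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


def er_expected_clustering(adj):
    """Edge density 2m / (n(n-1)), the expected clustering of an
    Erdős-Rényi graph with the same number of nodes and edges."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    return 2.0 * m / (n * (n - 1))


# Toy, hand-syllabified input (illustrative only, not the paper's corpus).
words = [
    ["ma", "li", "na"],
    ["na", "ma"],
    ["li", "ma"],
    ["ko", "li", "ko"],
    ["ko", "ma"],
    ["ku", "la"],
]
net = build_syllable_network(words)
print("nodes:", len(net))
print("average clustering:", average_clustering(net))
print("ER baseline clustering:", er_expected_clustering(net))
print("average path length:", average_path_length(net))
```

A network is usually called small-world when its clustering coefficient is much higher than the random baseline while its average path length stays short. A toy input this small cannot show that effect convincingly; the sketch only shows how the two quantities would be computed and compared.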
Taxonomy
Topics: Complex Network Analysis Techniques · Opinion Dynamics and Social Influence · Topological and Geometric Data Analysis
