# Open-access network science: Investigating phonological similarity networks based on the SUBTLEX-US lexicon

**Authors:** John Alderete, Sarbjot Mann, Paul Tupper

PMC · DOI: 10.3758/s13428-025-02610-9 · Behavior Research Methods · 2025-02-18

## TL;DR

This paper builds and releases open-access phonological similarity networks from the SUBTLEX-US English corpus, shows that they have small-world structure, broad degree distributions, and robustness to node removal, and provides adjacency lists and scripts for further exploration.

## Contribution

The novel contribution is the construction and open release of phonological similarity networks based on the SUBTLEX-US corpus, in configurations that differ by network size and word representation (lemma vs. word form).

## Key findings

- The networks exhibit small-world properties, broad degree distributions, and robustness to node removal, regardless of network size and word representation.
- The networks show contrasts in degree and clustering coefficient comparable to those found in prior studies.
- Extracting a backbone network of the nodes important to network centrality yields familiar trends.
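
The measures named in these findings can be illustrated on a toy graph. The sketch below is not the authors' code: the `TOY_ADJ` adjacency dictionary and both function names are hypothetical, and the removal test shown (deleting one chosen hub node) is only one possible way to probe robustness. It computes the local clustering coefficient of a node and the size of the largest connected component before and after node removal, using only the standard library.

```python
# Hypothetical toy network: each word maps to the set of its neighbors.
TOY_ADJ = {
    "cat": {"bat", "cut", "at", "cats"},
    "bat": {"cat", "at"},
    "cut": {"cat"},
    "at": {"cat", "bat"},
    "cats": {"cat"},
    "dog": set(),
}


def local_clustering(adj, node):
    """Fraction of a node's neighbor pairs that are themselves neighbors."""
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0  # undefined for fewer than two neighbors; reported as 0 here
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in adj[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))


def largest_component_size(adj, removed=frozenset()):
    """Size of the largest connected component, ignoring nodes in `removed`."""
    seen = set(removed)  # removed nodes are treated as already visited
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        size = 0
        while stack:  # iterative depth-first search
            current = stack.pop()
            size += 1
            for nbr in adj[current]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        best = max(best, size)
    return best


degree = {word: len(nbrs) for word, nbrs in TOY_ADJ.items()}
print(degree["cat"])                               # 4
print(local_clustering(TOY_ADJ, "bat"))            # 1.0
print(largest_component_size(TOY_ADJ))             # 5
print(largest_component_size(TOY_ADJ, {"cat"}))    # 2
```

In this toy graph, removing the single hub "cat" shrinks the largest component from five nodes to two. A robustness analysis on a real network would repeat this over many removed nodes and track how the largest component shrinks.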

## Abstract

Network science tools are becoming increasingly important to psycholinguistics, but few open-access data sets exist for exploring network properties of even well-studied languages like English. We constructed several phonological similarity networks (neighbors differ in exactly one consonant or vowel phoneme) using words from a lexicon based on the SUBTLEX-US English corpus, distinguishing networks by size and word representation (i.e., lemma vs. word form). The resulting networks are shown to exhibit many familiar characteristics, including small-world properties, broad degree distributions, and robustness to node removal, regardless of network size and word representation. We also validated the SUBTLEX phonological networks by showing that they exhibit contrasts in degree and clustering coefficient comparable to the same contrasts found in prior studies and exhibit familiar trends after extraction of a backbone network of nodes important to network centrality. The data release (https://github.com/aldo-git-bit/phonological-similarity-networks-SUBTLEX) includes 17 adjacency lists that can be further explored using the networkX package in Python, a package of files for building new adjacency lists from scratch, and several scripts that allow users to analyze and extend these results.
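
The neighbor definition in the abstract can be sketched as follows. This is a minimal illustration, not the released build scripts. It assumes that "differ in exactly one phoneme" covers substitution, insertion, or deletion of a single phoneme, which the abstract does not spell out. The toy lexicon, its phoneme symbols, and the function names are all hypothetical.

```python
from itertools import combinations


def is_neighbor(a, b):
    """True if phoneme tuples a and b differ by one substitution,
    insertion, or deletion (an assumed reading of 'one phoneme')."""
    if a == b:
        return False
    if len(a) == len(b):
        # Same length: exactly one position may differ (substitution).
        return sum(x != y for x, y in zip(a, b)) == 1
    if abs(len(a) - len(b)) != 1:
        return False
    short, long_ = (a, b) if len(a) < len(b) else (b, a)
    # Lengths differ by one: deleting one phoneme from the longer
    # form must yield the shorter form.
    return any(long_[:i] + long_[i + 1:] == short for i in range(len(long_)))


def build_adjacency(lexicon):
    """Map each word to the set of words that are its phonological neighbors."""
    adj = {word: set() for word in lexicon}
    for w1, w2 in combinations(lexicon, 2):
        if is_neighbor(lexicon[w1], lexicon[w2]):
            adj[w1].add(w2)
            adj[w2].add(w1)
    return adj


# Hypothetical toy lexicon with illustrative phoneme symbols.
TOY_LEXICON = {
    "cat": ("k", "ae", "t"),
    "bat": ("b", "ae", "t"),
    "cut": ("k", "ah", "t"),
    "at": ("ae", "t"),
    "cats": ("k", "ae", "t", "s"),
    "dog": ("d", "ao", "g"),
}

adjacency = build_adjacency(TOY_LEXICON)
print(sorted(adjacency["cat"]))  # ['at', 'bat', 'cats', 'cut']
print(sorted(adjacency["dog"]))  # []
```

The pairwise comparison is quadratic in lexicon size, which is fine for a toy example but slow for a full lexicon. An adjacency dictionary of this shape could also be passed to a graph library such as NetworkX for further analysis.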

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC11836074/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11836074/full.md

## References

10 references — full list in the complete paper: https://tomesphere.com/paper/PMC11836074/full.md

---
Source: https://tomesphere.com/paper/PMC11836074