# Characterizing the Large‐Scale Structure of Multimodal Semantic Networks

**Authors:** Raja Marjieh, Pol van Rijn, Ilia Sucholutsky, Harin Lee, Nori Jacoby, Thomas L. Griffiths

PMC · DOI: 10.1111/cogs.70131 · Cognitive Science · 2025-10-23

## TL;DR

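This paper builds semantic networks from human descriptions of naturalistic visual, auditory, and audio-visual stimuli. It shows that these networks have a small-world structure with a truncated power-law degree distribution, and that they predict human sensory judgments and reaction times in a lexical decision task.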

## Contribution

The study introduces multimodal semantic networks, built by connecting concepts that co-occur as descriptors of the same naturalistic stimuli, and shows that their degree distribution is best captured by a truncated power law.

## Key findings

- Multimodal semantic networks exhibit a small-world structure with truncated power-law degree distributions.
- These networks predict human sensory judgments and reaction times in lexical tasks.
- Multimodal networks share overlapping themes with previously analyzed lexical networks, which a more rigorous reanalysis shows to be truncated as well.
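
For reference, a truncated power law is often written as a power law with an exponential cutoff. The form below is a common parameterization and an assumption here: this summary does not state which functional form the authors fit, and the symbols $\alpha$ (exponent) and $\lambda$ (cutoff rate) are illustrative.

```latex
p(k) \propto k^{-\alpha} e^{-\lambda k}, \qquad \alpha > 0,\ \lambda > 0
```

Setting $\lambda = 0$ recovers a pure power law. For $\lambda > 0$ the high-degree tail falls off faster, which matches the statement that the most-connected concepts are less common than a perfect power law would predict.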

## Abstract

Humans organize semantic knowledge into complex networks that encode relations between concepts. The structure of those networks has broad implications for human cognitive processes, and for theories of semantic development. Evidence from large lexical networks such as those derived from word associations suggests that semantic networks are characterized by high sparsity and clustering while maintaining short average paths between concepts, a phenomenon known as a “small‐world” network. It has also been argued that those networks are “scale‐free,” meaning that the number of connections (or degree) between concepts follows a power‐law distribution, whereby most concepts have few connections, while a few have many. However, the scale‐free property is still debated, and the extent to which the lexical evidence reflects the naturally occurring semantic regularities of the environment has not been investigated systematically. To address this, we collected and analyzed semantic descriptors, human evaluations, and similarity judgments from four large datasets of naturalistic stimuli across three modalities (visual, auditory, and audio‐visual) comprising 7916 stimuli and 610,841 human responses. By connecting concepts that co‐occur as descriptors of the same stimuli, we construct “multimodal” semantic networks. We show that these networks exhibit a clear small‐world structure with a degree distribution that is best captured by a truncated power law (i.e., the most‐connected concepts are less common than predicted by a perfect power law). We further show that these networks are predictive of human sensory judgments on these domains, as well as reaction times in an independent lexical decision task. Finally, we show that multimodal networks also share overlapping themes with previously analyzed lexical networks, which upon a more rigorous reanalysis are revealed to be truncated too. Our findings shed new light on the origins of the structure of semantic networks by tying it to the semantic regularities of the environment.
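
The construction described in the abstract (linking concepts that co-occur as descriptors of the same stimulus) and the two small-world indicators it mentions (clustering and short average paths) can be illustrated with a minimal sketch. This is not the authors' pipeline: the unweighted, unthresholded graph, the function names, and the toy descriptors are all assumptions made for illustration, and only the Python standard library is used.

```python
from collections import deque
from itertools import combinations


def build_cooccurrence_graph(descriptors_per_stimulus):
    """Link two concepts whenever they describe the same stimulus."""
    graph = {}
    for descriptors in descriptors_per_stimulus:
        unique = sorted(set(descriptors))
        for concept in unique:
            graph.setdefault(concept, set())
        for a, b in combinations(unique, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def average_clustering(graph):
    """Mean fraction of each node's neighbor pairs that are themselves linked."""
    total = 0.0
    for neighbors in graph.values():
        k = len(neighbors)
        if k < 2:
            continue  # nodes with fewer than two neighbors contribute 0
        links = sum(1 for a, b in combinations(neighbors, 2) if b in graph[a])
        total += links / (k * (k - 1) / 2)
    return total / len(graph)


def average_path_length(graph):
    """Mean shortest-path length over all reachable ordered pairs (BFS)."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Hypothetical toy data: each inner list holds the descriptors of one stimulus.
toy_descriptors = [
    ["dog", "bark", "loud"],
    ["dog", "fur", "soft"],
    ["bell", "loud", "metal"],
]
graph = build_cooccurrence_graph(toy_descriptors)
degrees = {concept: len(neighbors) for concept, neighbors in graph.items()}
clustering = average_clustering(graph)
path_length = average_path_length(graph)
```

In this sketch a small-world signature would show up as high `clustering` together with a low `path_length` relative to a random graph of the same size and density; the `degrees` dictionary is the input one would histogram to examine the degree distribution.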

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12550224/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12550224/full.md

## References

70 references — full list in the complete paper: https://tomesphere.com/paper/PMC12550224/full.md

---
Source: https://tomesphere.com/paper/PMC12550224