Emergence and Localisation of Semantic Role Circuits in LLMs
Nura Aljaafari, Danilo S. Carvalho, André Freitas

TL;DR
This paper investigates how large language models internally represent semantic roles, revealing compact, causally isolated circuits that develop gradually and partially transfer across models and scales.
Contribution
It introduces a novel method combining role-cross minimal pairs, emergence analysis, and cross-model comparison to characterize semantic role circuits in LLMs.
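To make the idea of a role-cross minimal pair concrete, here is a minimal sketch in Python. It assumes that such a pair consists of two sentences that share every word but swap which participant fills the agent and patient roles; the paper's actual construction may differ. The names `MinimalPair` and `role_swap_pair` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MinimalPair:
    """Two sentences with identical vocabulary but swapped semantic roles."""
    clean: str
    corrupted: str


def role_swap_pair(agent: str, verb: str, patient: str) -> MinimalPair:
    """Build a pair by exchanging the agent and patient noun phrases.

    Assumption: a simple transitive template is enough for illustration.
    """
    clean = f"The {agent} {verb} the {patient}."
    corrupted = f"The {patient} {verb} the {agent}."
    return MinimalPair(clean=clean, corrupted=corrupted)


pair = role_swap_pair("chef", "thanked", "waiter")
print(pair.clean)
print(pair.corrupted)
```

Because both sentences contain the same tokens, any difference in model behaviour between them can be attributed to role assignment rather than to lexical content, which is what makes pairs of this kind useful for isolating a circuit.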
Findings
Highly concentrated semantic role circuits (89-94% attribution within 28 nodes)
Gradual structural refinement rather than phase transitions, with larger models sometimes bypassing localised circuits
Moderate cross-scale component overlap (24-59%) with high spectral similarity
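The concentration and overlap figures above can be read as simple ratios. The sketch below shows one plausible way to compute them, under two assumptions that the listing does not confirm: that concentration is the share of total absolute attribution held by the top-k nodes, and that component overlap is a Jaccard ratio over sets of component names. The function names, node labels, and scores are illustrative toy values, not data from the paper.

```python
from typing import Dict, Set


def top_k_attribution_share(scores: Dict[str, float], k: int) -> float:
    """Fraction of total absolute attribution carried by the k largest nodes."""
    magnitudes = sorted((abs(v) for v in scores.values()), reverse=True)
    total = sum(magnitudes)
    if total == 0:
        return 0.0
    return sum(magnitudes[:k]) / total


def component_overlap(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap |a & b| / |a | b| between two circuits' component sets.

    Assumption: the paper may use a different overlap definition.
    """
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Toy attribution scores for four hypothetical nodes.
scores = {"L3.H1": 0.5, "L5.H2": 0.3, "L7.MLP": 0.15, "L9.H0": 0.05}
share = top_k_attribution_share(scores, k=2)

# Toy circuits for a smaller and a larger model.
small = {"L3.H1", "L5.H2", "L7.MLP"}
large = {"L3.H1", "L7.MLP", "L9.H0", "L11.H4"}
overlap = component_overlap(small, large)

print(f"top-2 share: {share:.2f}, overlap: {overlap:.2f}")
```

On this reading, a result such as "89-94% attribution within 28 nodes" corresponds to `top_k_attribution_share(scores, k=28)` returning a value between 0.89 and 0.94.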
Abstract
Although large language models display semantic competence, the internal mechanisms that ground abstract semantic structure remain insufficiently characterised. We propose a method integrating role-cross minimal pairs, temporal emergence analysis, and cross-model comparison to study how LLMs implement semantic roles. Our analysis uncovers: (i) highly concentrated circuits (89-94% attribution within 28 nodes); (ii) gradual structural refinement rather than phase transitions, with larger models sometimes bypassing localised circuits; and (iii) moderate cross-scale conservation (24-59% component overlap) alongside high spectral similarity. These findings suggest that LLMs form compact, causally isolated mechanisms for abstract semantic structure, and that these mechanisms exhibit partial transfer across scales and architectures.
Taxonomy
Topics: Language and cultural evolution · Topic Modeling · Natural Language Processing Techniques
