Punctuation and Predicates in Language Models
Sonakshi Chauhan, Maheep Chaudhary, Koby Choy, Samuel Nellessen, Nandi Schoots

TL;DR
This paper investigates how large language models process punctuation and reasoning components, revealing model-specific roles of punctuation and different processing strategies for reasoning rules, with implications for interpretability.
Contribution
It provides the first detailed analysis of punctuation's role in LLMs and compares how different reasoning components are internally processed across models.
Findings
In GPT-2, punctuation is both necessary and sufficient for preserving model performance in multiple layers; this holds far less in DeepSeek and not at all in Gemma.
Models process reasoning components like conditionals differently across layers.
These differences in how punctuation and reasoning components are represented internally have implications for interpretability.
Abstract
In this paper we explore where information is collected and how it is propagated throughout layers in large language models (LLMs). We begin by examining the surprising computational importance of punctuation tokens, which previous work has identified as attention sinks and memory aids. Using intervention-based techniques, we evaluate the necessity and sufficiency (for preserving model performance) of punctuation tokens across layers in GPT-2, DeepSeek, and Gemma. Our results show stark model-specific differences: for GPT-2, punctuation is both necessary and sufficient in multiple layers, while this holds far less in DeepSeek and not at all in Gemma. Extending beyond punctuation, we ask whether LLMs process different components of input (e.g., subjects, adjectives, punctuation, full sentences) by forming early static summaries reused across the network, or if the model remains sensitive…
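The abstract does not spell out how the necessity and sufficiency interventions are implemented. The following is a minimal sketch of one common way to set up such tests, assuming per-token hidden vectors at a single layer and a zero baseline for overwritten positions. The helper names (`punctuation_mask`, `ablate`), the toy vectors, and the choice of baseline are all hypothetical and are not taken from the paper.

```python
import string


def punctuation_mask(tokens):
    """Return True for tokens made up only of punctuation characters."""
    return [len(t) > 0 and all(ch in string.punctuation for ch in t) for t in tokens]


def ablate(hidden, mask, mode, baseline=0.0):
    """Copy per-token vectors, overwriting some positions with a baseline value.

    mode="necessity":   overwrite punctuation positions
                        (does performance survive without punctuation?)
    mode="sufficiency": overwrite all non-punctuation positions
                        (does performance survive with punctuation alone?)
    The zero baseline is an assumption; mean or resampled activations
    are other possible choices.
    """
    if mode not in ("necessity", "sufficiency"):
        raise ValueError("mode must be 'necessity' or 'sufficiency'")
    out = []
    for vec, is_punct in zip(hidden, mask):
        overwrite = is_punct if mode == "necessity" else not is_punct
        out.append([baseline] * len(vec) if overwrite else list(vec))
    return out


# Toy stand-in for one layer's hidden states: one 2-d vector per token.
tokens = ["If", "it", "rains", ",", "stay", "home", "."]
hidden = [[float(i + 1), float(i + 1) * 2] for i in range(len(tokens))]

mask = punctuation_mask(tokens)
necessity_input = ablate(hidden, mask, "necessity")
sufficiency_input = ablate(hidden, mask, "sufficiency")
```

In an actual experiment, the modified vectors would be patched back into the model at the chosen layer and the resulting performance compared with the unmodified run. Little degradation under the sufficiency setting would suggest that punctuation positions carry enough information at that layer, while large degradation under the necessity setting would suggest they are required.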
Taxonomy
Topics: Natural Language Processing Techniques
