Linear socio-demographic representations emerge in Large Language Models from indirect cues
Paul Bouchaud, Pedro Ramaciotti

TL;DR
This paper shows that large language models infer sociodemographic attributes of their users from indirect cues such as names and occupations, encode those attributes along linear, interpretable directions in activation space, and that these representations influence model responses and can introduce bias.
Contribution
It demonstrates that LLMs develop linear demographic representations from implicit cues and shows how these representations influence downstream tasks and biases.
Findings
Linear demographic representations are encoded in LLM activation spaces.
Names activate gender and race representations aligned with census data, and occupations activate representations correlated with real-world workforce statistics.
Implicit demographic biases affect LLM behavior and decision-making.
Abstract
We investigate how LLMs encode sociodemographic attributes of human conversational partners inferred from indirect cues such as names and occupations. We show that LLMs develop linear representations of user demographics within activation space, wherein stereotypically associated attributes are encoded along interpretable geometric directions. We first probe residual streams across layers of four open transformer-based LLMs (Magistral 24B, Qwen3 14B, GPT-OSS 20B, OLMo2-1B) prompted with explicit demographic disclosure. We show that the same probes predict demographics from implicit cues: names activate census-aligned gender and race representations, while occupations trigger representations correlated with real-world workforce statistics. These linear representations allow us to explain demographic inferences implicitly formed by LLMs during conversation. We demonstrate that these…
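The probing setup described in the abstract (fit a linear probe on activations from prompts with explicit disclosure, then apply the same probe to activations from implicit cues) can be sketched as follows. This is a minimal illustration, not the authors' code: the activations are synthetic Gaussian stand-ins for residual-stream vectors, and `hidden_dim`, `shift`, `make_activations`, `train_linear_probe`, and `probe_accuracy` are hypothetical names and values chosen for the example. The probe here is a plain logistic regression fitted by gradient descent, which may differ from the probe used in the paper.

```python
import numpy as np

def make_activations(rng, n, direction, shift, noise=1.0):
    """Synthetic stand-in for residual-stream activations (an assumption):
    Gaussian noise plus a label-dependent shift along one direction."""
    labels = rng.integers(0, 2, size=n)
    signs = 2.0 * labels - 1.0  # map {0, 1} -> {-1, +1}
    acts = rng.normal(0.0, noise, size=(n, direction.shape[0]))
    acts += shift * signs[:, None] * direction[None, :]
    return acts, labels

def train_linear_probe(acts, labels, lr=0.1, steps=500):
    """Logistic-regression probe fitted by full-batch gradient descent."""
    n, d = acts.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(steps):
        logits = acts @ w + b
        probs = 1.0 / (1.0 + np.exp(-logits))
        grad = probs - labels  # gradient of the log-loss w.r.t. the logits
        w -= lr * (acts.T @ grad) / n
        b -= lr * grad.mean()
    return w, b

def probe_accuracy(w, b, acts, labels):
    """Fraction of examples whose label the probe predicts correctly."""
    preds = (acts @ w + b > 0).astype(int)
    return float((preds == labels).mean())

rng = np.random.default_rng(0)
hidden_dim = 64  # hypothetical; real residual streams are far wider
direction = rng.normal(size=hidden_dim)
direction /= np.linalg.norm(direction)

# "Explicit disclosure" activations: strong signal along the direction.
explicit_acts, explicit_labels = make_activations(rng, 2000, direction, shift=2.0)
# "Implicit cue" activations: same direction, weaker signal.
implicit_acts, implicit_labels = make_activations(rng, 500, direction, shift=1.0)

w, b = train_linear_probe(explicit_acts, explicit_labels)
explicit_acc = probe_accuracy(w, b, explicit_acts, explicit_labels)
implicit_acc = probe_accuracy(w, b, implicit_acts, implicit_labels)
# Cosine similarity between the probe weights and the planted direction.
cosine = float(w @ direction / np.linalg.norm(w))

print(f"explicit accuracy: {explicit_acc:.3f}")
print(f"implicit accuracy: {implicit_acc:.3f}")
print(f"cosine(probe, planted direction): {cosine:.3f}")
```

In this toy setting the probe transfers because both sets of activations share one planted direction by construction. With a real model, the activations would instead be read from a chosen layer's residual stream, and whether the probe transfers is the empirical question the paper investigates.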
Taxonomy
Topics: Authorship Attribution and Profiling · Topic Modeling · Language and cultural evolution
