Task Ecologies and the Evolution of World-Tracking Representations in Large Language Models
Giulio Valentino Dalla Riva

TL;DR
This paper develops a theoretical framework for when autoregressive language models develop world-tracking representations, using a formal decomposition of next-token cross-entropy to define ecological veridicality and characterize failure modes.
Contribution
It introduces a precise notion of ecological veridicality, characterizes when fixed encodings suffice, and validates the theory with experiments on small language models.
Findings
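The excess cross-entropy vanishes if and only if the encoding preserves the training ecology's equivalence classes.
Frozen dense and frozen Mixture-of-Experts transformers satisfy the fixed-encoding analysis; in-context learning does not enlarge the model's separation set, while per-task adaptation breaks the fixed-encoding premise.
A model with zero excess on its training ecology can still incur positive excess on a deployment ecology that refines it.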
Abstract
We study language models as evolving model organisms and ask when autoregressive next-token learning selects for world-tracking representations. For any encoding of latent world states, the Bayes-optimal next-token cross-entropy decomposes into the irreducible conditional entropy plus a Jensen-Shannon excess term. That excess vanishes if and only if the encoding preserves the training ecology's equivalence classes. This yields a precise notion of ecological veridicality for language models and identifies the minimum-complexity zero-excess solution as the quotient partition by training equivalence. We then determine when this fixed-encoding analysis applies to transformer families: frozen dense and frozen Mixture-of-Experts transformers satisfy it, in-context learning does not enlarge the model's separation set, and per-task adaptation breaks the premise. The framework predicts two…
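Illustration (not from the paper)
The decomposition stated in the abstract can be checked numerically on a toy example. The sketch below is one reading of that statement, not the paper's code: it assumes a finite set of world states with a strictly positive prior, a fixed next-token distribution per state, and an encoding given as a plain mapping from states to codes. The names `prior`, `next_token`, `encoding`, and `decompose` are hypothetical. For each encoding cell, the Bayes-optimal predictor is taken to be the prior-weighted mixture of the next-token distributions in that cell, and the excess is the prior-weighted generalized Jensen-Shannon divergence within each cell.

```python
import math
from collections import defaultdict


def entropy(p):
    """Shannon entropy in nats of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)


def cross_entropy(p, q):
    """Cross-entropy in nats of predictor q under true distribution p."""
    return -sum(x * math.log(y) for x, y in zip(p, q) if x > 0)


def decompose(prior, next_token, encoding):
    """Return (bayes_optimal_cross_entropy, conditional_entropy, js_excess).

    Assumes prior[w] > 0 for every state w (an assumption of this sketch).
    """
    # Group world states by the code the encoding assigns them.
    cells = defaultdict(list)
    for w in prior:
        cells[encoding[w]].append(w)

    total_ce = 0.0
    js_excess = 0.0
    for members in cells.values():
        mass = sum(prior[w] for w in members)
        vocab = len(next_token[members[0]])
        # Best predictor that sees only the code: the within-cell mixture.
        mixture = [
            sum(prior[w] * next_token[w][t] for w in members) / mass
            for t in range(vocab)
        ]
        total_ce += sum(
            prior[w] * cross_entropy(next_token[w], mixture) for w in members
        )
        # Generalized Jensen-Shannon divergence of the cell, weighted by its mass.
        js_excess += mass * entropy(mixture) - sum(
            prior[w] * entropy(next_token[w]) for w in members
        )

    cond_entropy = sum(prior[w] * entropy(next_token[w]) for w in prior)
    return total_ce, cond_entropy, js_excess


# Toy ecology: states "a" and "b" induce the same next-token distribution,
# so they are equivalent under this (hypothetical) training ecology.
prior = {"a": 0.25, "b": 0.25, "c": 0.5}
next_token = {"a": [0.9, 0.1], "b": [0.9, 0.1], "c": [0.2, 0.8]}

quotient = {"a": 0, "b": 0, "c": 1}   # merges only equivalent states
identity = {"a": 0, "b": 1, "c": 2}   # separates everything
collapsed = {"a": 0, "b": 0, "c": 0}  # merges inequivalent states

ce_q, h_q, js_q = decompose(prior, next_token, quotient)
ce_i, h_i, js_i = decompose(prior, next_token, identity)
ce_c, h_c, js_c = decompose(prior, next_token, collapsed)
```

In this toy case the quotient and identity encodings both give zero excess, but the quotient uses two codes rather than three, which matches the abstract's description of the quotient partition as the minimum-complexity zero-excess solution. The collapsed encoding merges inequivalent states and pays a strictly positive excess.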
