Task Ecologies and the Evolution of World-Tracking Representations in Large Language Models

Giulio Valentino Dalla Riva

arXiv:2604.05469·stat.ME·April 8, 2026

TL;DR

This paper develops a theoretical framework for when autoregressive language models develop world-tracking representations, analyzing their ecological veridicality and failure modes through a formal decomposition of the Bayes-optimal next-token cross-entropy.

Contribution

It introduces a precise notion of ecological veridicality, characterizes when the fixed-encoding analysis applies to transformer families, and validates the theory with experiments on small language models.

Findings

01

The excess cross-entropy vanishes if and only if the encoding preserves the training ecology's equivalence classes.

02

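Frozen dense and frozen Mixture-of-Experts transformers satisfy the fixed-encoding analysis; in-context learning does not enlarge the model's separation set, and per-task adaptation breaks the premise.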

03

Models can still incur positive excess on deployment ecologies that refine the training ecology.

Abstract

We study language models as evolving model organisms and ask when autoregressive next-token learning selects for world-tracking representations. For any encoding of latent world states, the Bayes-optimal next-token cross-entropy decomposes into the irreducible conditional entropy plus a Jensen–Shannon excess term. That excess vanishes if and only if the encoding preserves the training ecology's equivalence classes. This yields a precise notion of ecological veridicality for language models and identifies the minimum-complexity zero-excess solution as the quotient partition by training equivalence. We then determine when this fixed-encoding analysis applies to transformer families: frozen dense and frozen Mixture-of-Experts transformers satisfy it, in-context learning does not enlarge the model's separation set, and per-task adaptation breaks the premise. The framework predicts two…
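The decomposition stated in the abstract can be checked numerically on a toy example. The sketch below is not the paper's code or notation. It assumes a finite set of latent world states with a prior, a next-token distribution per state, and a deterministic encoding that maps each state to a cell label. It also assumes that two states are training-equivalent when their next-token distributions coincide, and that the excess is the prior-weighted generalized Jensen-Shannon divergence within each cell. The names `entropy`, `decompose`, `prior`, `cond`, and the three encodings are hypothetical.

```python
import numpy as np


def entropy(p):
    """Shannon entropy in nats, ignoring zero-probability entries."""
    p = np.asarray(p, dtype=float)
    nz = p > 0
    return float(-np.sum(p[nz] * np.log(p[nz])))


def decompose(prior, cond, encoding):
    """Return (cross_entropy, conditional_entropy, excess) in nats.

    prior:    p(w), shape (n_states,)
    cond:     p(y | w), shape (n_states, n_tokens)
    encoding: cell label for each world state, shape (n_states,)
    """
    prior = np.asarray(prior, dtype=float)
    cond = np.asarray(cond, dtype=float)
    encoding = np.asarray(encoding)

    # Irreducible term: H(Y | W), the loss of a predictor that sees w itself.
    h_cond = sum(pw * entropy(row) for pw, row in zip(prior, cond))

    cross_entropy = 0.0
    excess = 0.0
    for z in np.unique(encoding):
        members = encoding == z
        pz = prior[members].sum()
        if pz == 0:
            continue
        weights = prior[members] / pz      # p(w | z)
        mixture = weights @ cond[members]  # Bayes-optimal q(y | z)

        # Cross-entropy computed directly, state by state, against q(. | z).
        for w, row in zip(weights, cond[members]):
            nz = (row > 0) & (w > 0)
            cross_entropy += pz * w * float(-np.sum(row[nz] * np.log(mixture[nz])))

        # Generalized Jensen-Shannon divergence of the cell's components.
        jsd = entropy(mixture) - sum(
            w * entropy(row) for w, row in zip(weights, cond[members])
        )
        excess += pz * jsd
    return cross_entropy, h_cond, excess


# Toy ecology: states 0 and 1 predict the same next-token distribution.
prior = [0.25, 0.25, 0.5]
cond = [[0.9, 0.1], [0.9, 0.1], [0.2, 0.8]]

quotient = [0, 0, 1]   # merges only equivalent states
identity = [0, 1, 2]   # separates everything
coarse = [0, 1, 1]     # merges two non-equivalent states

results = {name: decompose(prior, cond, enc)
           for name, enc in [("quotient", quotient),
                             ("identity", identity),
                             ("coarse", coarse)]}
for name, (ce, h, ex) in results.items():
    print(f"{name:9s} cross-entropy={ce:.4f}  H(Y|W)={h:.4f}  excess={ex:.4f}")
```

In this toy setup the quotient and identity encodings both reach zero excess, but the quotient uses two cells instead of three, which is the sense in which it is the minimum-complexity zero-excess solution. The coarse encoding merges non-equivalent states and pays a strictly positive excess.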
