Directional Optimization Asymmetry in Transformers: A Synthetic Stress Test
Mihir Sahasrabudhe

TL;DR
This paper introduces a synthetic benchmark that isolates the directional optimization asymmetry intrinsic to Transformer models, revealing a semantics-free bias that persists even in controlled, language-agnostic settings.
Contribution
The study provides a synthetic, entropy-controlled benchmark to systematically investigate directional learning biases in Transformers, demonstrating an inherent optimization gap independent of linguistic data.
Findings
Transformers exhibit a significant directional optimization gap even in synthetic, language-agnostic tasks.
Pre-training shifts but does not eliminate the directional bias in Transformers.
LoRA hits a capacity limit on high-entropy inverse mappings.
Abstract
Transformers are theoretically reversal-invariant: their function class does not prefer left-to-right over right-to-left mappings. Yet empirical studies on natural language repeatedly report a "reversal curse," and recent work on temporal asymmetry in LLMs suggests that real-world corpora carry their own arrow of time. This leaves an unresolved question: do directional failures stem from linguistic statistics, or from the architecture itself? We cut through this ambiguity with a fully synthetic, entropy-controlled benchmark designed as a clean-room stress test for directional learning. Using random string mappings with tunable branching factor K, we construct forward tasks with zero conditional entropy and inverse tasks with analytically determined entropy floors. Excess loss above these floors reveals that even scratch-trained GPT-2 models exhibit a strong, reproducible directional…
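The benchmark construction described in the abstract can be illustrated with a minimal sketch. It is not the paper's code. It assumes that each target string has exactly K source strings mapped to it and that all K sources are equally likely, so the forward direction (source to target) is deterministic and the inverse direction (target to source) has an entropy floor of ln K nats per mapping. The function names, string length, and alphabet are hypothetical choices made for illustration.

```python
import math
import random
import string


def make_mapping(num_targets, k, length=8, seed=0):
    """Build a random K-to-1 string mapping (illustrative assumption).

    Returns (forward, inverse):
      forward: source string -> its single target (zero conditional entropy)
      inverse: target string -> list of its K sources (K-way ambiguity)
    """
    rng = random.Random(seed)
    alphabet = string.ascii_lowercase

    # Draw enough distinct random strings for all targets and sources.
    needed = num_targets * (k + 1)
    pool = set()
    while len(pool) < needed:
        pool.add("".join(rng.choice(alphabet) for _ in range(length)))
    pool = sorted(pool)  # fix the order so the seed fully determines the result
    rng.shuffle(pool)

    targets = pool[:num_targets]
    sources = pool[num_targets:]

    forward, inverse = {}, {}
    for i, target in enumerate(targets):
        group = sources[i * k:(i + 1) * k]
        inverse[target] = group
        for source in group:
            forward[source] = target
    return forward, inverse


def inverse_entropy_floor(k):
    """Entropy floor in nats per mapping, assuming K equiprobable sources."""
    return math.log(k)


def excess_loss(measured_loss, k):
    """Loss above the floor; measured_loss must be in nats per mapping."""
    return measured_loss - inverse_entropy_floor(k)
```

With K = 1 the floor is zero and both directions are deterministic. For larger K, only the part of the inverse loss above ln K reflects an optimization gap rather than irreducible ambiguity, which is why the abstract reports excess loss instead of raw loss.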
Taxonomy
Topics: Neurobiology of Language and Bilingualism · Language and cultural evolution · Language Development and Disorders
