A transformer architecture alteration to incentivise externalised reasoning

Elizabeth Pavlova; Mariia Koroliuk; Karthik Viswanathan; Cameron Tice; Edward James Young; Puria Radmard

arXiv:2603.21376·cs.AI·March 25, 2026

Open Access

TL;DR

This paper adds an early-exit mechanism to the transformer and uses reinforcement learning to train the model to exit at shallower layers, making large language models more verbose reasoners and cutting per-token internal computation while maintaining task performance.

Contribution

The paper presents an architectural change and post-training pipeline that incentivise a model to truncate forward passes early, exiting at shallower layers when the next token can be predicted without deep computation.

Findings

01. Models learn to adaptively reduce computation across tokens

02. Early-exit mechanism maintains task performance

03. Potential to minimize excess computation in reasoning models

Abstract

We propose a new architectural change, and post-training pipeline, for making LLMs more verbose reasoners by teaching a model to truncate forward passes early. We augment an existing transformer architecture with an early-exit mechanism at intermediate layers and train the model to exit at shallower layers when the next token can be predicted without deep computation. After a calibration stage, we incentivise the model to exit as early as possible while maintaining task performance using reinforcement learning. We provide preliminary results to this effect for small reasoning models, showing that they learn to adaptively reduce computations across tokens. We predict that, applied at the right scale, our approach can minimise the amount of excess computation that reasoning models have at their disposal to perform non-myopic planning using their internal activations, reserving this only…
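The early-exit idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the residual layers, the logistic exit gate, the fixed threshold, and the depth-penalised reward (`make_layer`, `apply_layer`, `exit_probability`, `forward_with_early_exit`, `depth_penalised_reward`) are all hypothetical stand-ins, chosen only to show how a per-token forward pass might stop at a shallower layer and how a reward might trade task performance against depth used.

```python
import math
import random


def make_layer(dim, rng):
    """Random dim x dim weights standing in for one transformer block (assumption)."""
    scale = 1.0 / math.sqrt(dim)
    return [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(dim)]


def apply_layer(weights, hidden):
    """Residual update h <- h + tanh(W h); a stand-in for attention + MLP."""
    return [
        h + math.tanh(sum(w * x for w, x in zip(row, hidden)))
        for h, row in zip(hidden, weights)
    ]


def exit_probability(gate, hidden):
    """Hypothetical exit head: a logistic gate on the current hidden state."""
    score = sum(g * x for g, x in zip(gate, hidden))
    return 1.0 / (1.0 + math.exp(-score))


def forward_with_early_exit(hidden, layers, gates, threshold=0.5):
    """Run layers in order and stop at the first intermediate layer whose
    exit probability reaches the threshold.

    Returns (final_hidden, number_of_layers_used).
    """
    num_layers = len(layers)
    for depth, (weights, gate) in enumerate(zip(layers, gates), start=1):
        hidden = apply_layer(weights, hidden)
        # Only intermediate layers may exit early; the last layer always ends the pass.
        if depth < num_layers and exit_probability(gate, hidden) >= threshold:
            return hidden, depth
    return hidden, num_layers


def depth_penalised_reward(task_reward, depth_used, num_layers, penalty=0.1):
    """Assumed reward shape: task reward minus a cost proportional to depth used."""
    return task_reward - penalty * depth_used / num_layers


if __name__ == "__main__":
    rng = random.Random(0)
    dim, num_layers = 4, 6
    layers = [make_layer(dim, rng) for _ in range(num_layers)]
    gates = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_layers)]
    token_state = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    _, depth = forward_with_early_exit(token_state, layers, gates)
    print("layers used:", depth)
    print("reward:", depth_penalised_reward(1.0, depth, num_layers))
```

In this sketch, an equal task reward earned at a shallower exit scores higher than one earned at full depth, which is the kind of pressure the abstract describes applying with reinforcement learning after a calibration stage. The actual exit criterion and reward used in the paper may differ.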

Taxonomy

Topics: AI-based Problem Solving and Planning · Reinforcement Learning in Robotics · Multimodal Machine Learning Applications