In-Place Test-Time Training

Guhao Feng; Shengjie Luo; Kai Hua; Ge Zhang; Di He; Wenhao Huang; Tianle Cai

arXiv:2604.06169 · cs.LG · April 8, 2026

TL;DR

In-Place Test-Time Training (In-Place TTT) enables large language models to adapt dynamically during inference by updating their final projection matrices with a tailored objective, improving performance on long-context tasks.

Contribution

The paper introduces a scalable, in-place TTT framework that adapts LLMs during inference without retraining, using a theoretically grounded objective aligned with language modeling.

Findings

01. Enables a 4B-parameter model to perform well on 128k-context tasks.

02. Outperforms existing TTT approaches when trained from scratch.

03. Provides insights through ablation studies on design choices.

Abstract

The static "train then deploy" paradigm fundamentally limits Large Language Models (LLMs) from dynamically adapting their weights in response to continuous streams of new information inherent in real-world tasks. Test-Time Training (TTT) offers a compelling alternative by updating a subset of model parameters (fast weights) at inference time, yet its potential in the current LLM ecosystem is hindered by critical barriers including architectural incompatibility, computational inefficiency, and misaligned fast weight objectives for language modeling. In this work, we introduce In-Place Test-Time Training (In-Place TTT), a framework that seamlessly endows LLMs with Test-Time Training ability. In-Place TTT treats the final projection matrix of the ubiquitous MLP blocks as its adaptable fast weights, enabling a "drop-in" enhancement for LLMs without costly retraining from scratch.…
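
The idea of treating an MLP block's final projection matrix as fast weights can be illustrated with a toy sketch. This is not the authors' implementation: the squared-error loss, the target vector, the learning rate, and every name below (W_up, W_down, in_place_step) are assumptions chosen for illustration, and the paper's tailored language-modeling objective is not reproduced here. The sketch only shows the general pattern: the up-projection stays frozen while the final projection is updated in place during inference.

```python
# Toy sketch of "fast weights" in an MLP block (pure Python, no dependencies).
# ASSUMPTION: the squared-error loss and the target vector are stand-ins;
# they are NOT the objective used in the In-Place TTT paper.

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def relu(v):
    return [max(0.0, a) for a in v]

def mlp_forward(W_up, W_down, x):
    """Two-layer MLP: h = relu(W_up x), y = W_down h."""
    h = relu(matvec(W_up, x))
    return h, matvec(W_down, h)

def in_place_step(W_down, h, target, lr):
    """One gradient step on 0.5 * ||W_down h - target||^2.

    Mutates W_down in place and returns the loss measured before the update.
    """
    y = matvec(W_down, h)
    err = [yi - ti for yi, ti in zip(y, target)]
    for i, e in enumerate(err):
        for j, hj in enumerate(h):
            W_down[i][j] -= lr * e * hj  # gradient of the loss is err * h^T
    return 0.5 * sum(e * e for e in err)

# Slow weights (frozen at inference time) and fast weights (updated in place).
W_up = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W_down = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

x = [1.0, 2.0]          # hypothetical input activation
target = [1.0, -1.0]    # hypothetical adaptation target

losses = []
for _ in range(20):
    h, _ = mlp_forward(W_up, W_down, x)
    losses.append(in_place_step(W_down, h, target, lr=0.05))

_, y_final = mlp_forward(W_up, W_down, x)
```

In this toy setting the loss shrinks at every step because only W_down moves and the hidden activation h is fixed. A real system would apply such updates over a stream of tokens and with the objective described in the paper.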

Code & Models

Repositories

bytedance-seed/In-Place-TTT
github

Models

🤗 Lgr54HFi/chimera
model · 52 dl

Videos

In-Place Test-Time Training · slideslive