Skip a Layer or Loop it? Test-Time Depth Adaptation of Pretrained LLMs

Ziyue Li; Yang Li; Tianyi Zhou

arXiv:2507.07996·cs.LG·July 11, 2025

TL;DR

This paper adapts the architecture of a pretrained large language model at test time, without finetuning, by treating its layers as modules that can be skipped, repeated, or reordered for each input, which improves both efficiency and accuracy.

Contribution

It proposes a novel approach using Monte Carlo Tree Search to optimize layer configurations per sample, enabling flexible, sample-specific model architectures.

Findings

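01. For over 75% of the samples the original model already predicts correctly, a shorter layer configuration gives the same correct prediction, improving efficiency.

02. For over 60% of the samples the original model predicts incorrectly, an optimized layer configuration yields the correct prediction.

03. Adapting the layer configuration at test time, without finetuning, improves the accuracy of the pretrained LLM.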

Abstract

Can a pretrained neural network adapt its architecture to different inputs without any finetuning? Do we need all layers for simple tasks, and are they adequate for challenging tasks? We found that the layers of a pretrained large language model (LLM) can be manipulated as separate modules to build a better and even shallower model customized for each test sample. In particular, each layer from the pretrained model can be skipped/pruned or repeated multiple times as recurrent neural networks (RNN), and stacked with others in arbitrary orders, yielding a chain-of-layers (CoLa) per sample. This compositional space greatly expands the scope of existing works on looped/recurrent pretrained modules, layer pruning, or early-exit networks. We develop a Monte Carlo Tree Search (MCTS) protocol to explore and identify the optimal CoLa for each sample from math and commonsense reasoning…
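The chain-of-layers idea in the abstract can be illustrated with a minimal sketch. It assumes each pretrained layer is a function from hidden state to hidden state, and represents a CoLa as a list of layer indices in which an index may be omitted (skip) or appear more than once (loop). The toy layers and the names `make_toy_layer` and `run_chain` are hypothetical stand-ins, not the authors' code, and the search over chains is not shown.

```python
from typing import Callable, List, Sequence

# A layer maps a hidden state to a hidden state of the same shape.
Layer = Callable[[List[float]], List[float]]


def make_toy_layer(scale: float, shift: float) -> Layer:
    """Hypothetical stand-in for a pretrained block with a residual connection."""
    def layer(hidden: List[float]) -> List[float]:
        # residual form: h + f(h), where f(h) = scale * h + shift
        return [h + scale * h + shift for h in hidden]
    return layer


def run_chain(layers: Sequence[Layer], chain: Sequence[int],
              hidden: List[float]) -> List[float]:
    """Apply layers in the order listed in `chain`.

    An index missing from `chain` means that layer is skipped;
    an index that appears several times means that layer is looped.
    """
    for idx in chain:
        hidden = layers[idx](hidden)
    return hidden


# Three toy "pretrained" layers (assumed values, for illustration only).
layers = [make_toy_layer(0.5, 0.0), make_toy_layer(0.0, 1.0), make_toy_layer(1.0, 0.0)]
x = [1.0, 2.0]

full = run_chain(layers, [0, 1, 2], x)       # original forward pass
skipped = run_chain(layers, [0, 2], x)       # layer 1 skipped
looped = run_chain(layers, [0, 1, 1, 2], x)  # layer 1 applied twice
```

Because a chain is just an index sequence, the original model, layer pruning, and looped layers are all points in the same search space, which is what a per-sample search procedure would explore.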

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Explainable Artificial Intelligence (XAI)