Adaptive Blockwise Search: Inference-Time Alignment for Large Language Models

Mohammad Atif Quamar; Mohammad Areeb; Nishant Sharma; Ananth Shreekumar; Jonathan Rosenthal; Muslum Ozgur Ozmen; Mikhail Kuznetsov; Z. Berkay Celik

arXiv:2510.23334·cs.CL·October 28, 2025

TL;DR

AdaSearch is an inference-time alignment method for large language models. Rather than spreading compute uniformly over a response, it concentrates a fixed computational budget on the initial tokens, which the authors hypothesize are disproportionately critical for many alignment tasks.

Contribution

The paper introduces AdaSearch, a blockwise search strategy that uses a sampling schedule to allocate a fixed computational budget adaptively during sequential decoding, and AdaBeam, its tree-search counterpart.

Findings

01. AdaSearch outperforms strong Best-of-N and fine-tuning baselines.

02. Win-rates improve by over 10% on harmlessness generation, controlled sentiment generation, and mathematical reasoning tasks.

03. The evaluation covers eight large language models.

Abstract

LLM alignment remains a critical challenge. Inference-time methods provide a flexible alternative to fine-tuning, but their uniform computational effort often yields suboptimal alignment. We hypothesize that for many alignment tasks, the initial tokens of a response are disproportionately more critical. To leverage this principle, we introduce AdaSearch, a novel blockwise search strategy. It adaptively allocates a fixed computational budget using a sampling schedule, focusing search effort on these critical tokens. We apply AdaSearch to sequential decoding and introduce its tree-search counterpart, AdaBeam. Our comprehensive evaluation across eight LLMs demonstrates that AdaSearch outperforms strong Best-of-N and fine-tuning baselines. Specifically, win-rates improve by over 10% for harmlessness generation, controlled sentiment generation, and for mathematical reasoning tasks relative…
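
The abstract describes the idea only at a high level, so the following is a minimal sketch of blockwise search with a front-loaded sampling schedule, not the authors' implementation. The geometric decay, the function names (`decaying_schedule`, `blockwise_search`), and the toy proposer and reward are all assumptions made for illustration; in practice the proposer would be an LLM sampling a block of tokens and the reward would come from a reward model.

```python
import random
from typing import Callable, List

Block = List[str]


def decaying_schedule(num_blocks: int, total_budget: int, decay: float = 0.5) -> List[int]:
    """Split a fixed sample budget across blocks, favoring early blocks.

    Hypothetical geometric schedule (an assumption, not taken from the paper).
    Every block receives at least one sample.
    """
    if num_blocks <= 0 or total_budget < num_blocks:
        raise ValueError("need at least one sample per block")
    weights = [decay ** i for i in range(num_blocks)]
    spare = total_budget - num_blocks          # budget left after 1 sample per block
    scale = spare / sum(weights)
    counts = [1 + int(w * scale) for w in weights]
    # Samples lost to rounding go to the earliest blocks.
    for i in range(total_budget - sum(counts)):
        counts[i % num_blocks] += 1
    return counts


def blockwise_search(
    propose_block: Callable[[Block, random.Random], Block],
    reward: Callable[[Block], float],
    schedule: List[int],
    rng: random.Random,
) -> Block:
    """Greedy blockwise search: for each block, draw schedule[i] candidates,
    keep the one whose extended prefix scores highest, then move on."""
    prefix: Block = []
    for n_samples in schedule:
        candidates = [propose_block(prefix, rng) for _ in range(n_samples)]
        best = max(candidates, key=lambda block: reward(prefix + block))
        prefix = prefix + best
    return prefix


# Toy stand-ins for an LLM sampler and a reward model (illustrative only).
VOCAB = ["good", "fine", "bad"]


def toy_propose(prefix: Block, rng: random.Random, block_size: int = 2) -> Block:
    return [rng.choice(VOCAB) for _ in range(block_size)]


def toy_reward(tokens: Block) -> float:
    return float(tokens.count("good") - tokens.count("bad"))


if __name__ == "__main__":
    schedule = decaying_schedule(num_blocks=4, total_budget=16)
    print(schedule, blockwise_search(toy_propose, toy_reward, schedule, random.Random(0)))
```

Under this sketch, a uniform Best-of-N style allocation corresponds to `decay=1.0`, while smaller values of `decay` shift more of the same total budget toward the first blocks of the response.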
