TRIM: Hybrid Inference via Targeted Stepwise Routing in Multi-Step Reasoning Tasks

Vansh Kapoor; Aman Gupta; Hao Chen; Anurag Beniwal; Jing Huang; Aviral Kumar

arXiv:2601.10245·cs.AI·April 16, 2026

TL;DR

TRIM introduces stepwise routing for multi-step reasoning tasks, invoking larger models only on critical steps to improve efficiency and accuracy.

Contribution

The paper proposes a step-level routing strategy that improves inference efficiency by sending only problematic reasoning steps to larger models.

Findings

01. TRIM achieves 5x higher cost efficiency on MATH-500.

02. Advanced policies match the performance of expensive models with 80% fewer tokens.

03. The method generalizes well across different math reasoning benchmarks.

Abstract

Multi-step reasoning tasks like mathematical problem solving are vulnerable to cascading failures, where a single incorrect step leads to complete solution breakdown. Current LLM routing methods assign entire queries to one model, treating all reasoning steps as equal. We propose TRIM (Targeted routing in multi-step reasoning tasks), which routes only critical steps – those likely to derail the solution – to larger models while letting smaller models handle routine continuations. Our key insight is that targeted step-level interventions can fundamentally transform inference efficiency by confining expensive calls to precisely those steps where stronger models prevent cascading errors. TRIM operates at the step level: it uses process reward models to identify erroneous steps and makes routing decisions based on step-level uncertainty and budget constraints. We…
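The step-level routing described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simple threshold rule, in which a step drafted by the small model is regenerated by the large model when a process reward model (PRM) scores it below a cutoff and an escalation budget remains. The names `route_stepwise`, `small_step`, `large_step`, `prm_score`, `is_final`, `threshold`, and `max_escalations` are all hypothetical, and the models are replaced by toy stubs so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, List

Step = str


@dataclass
class RoutingResult:
    steps: List[Step]      # the accepted reasoning steps, in order
    escalated: List[int]   # indices of steps regenerated by the large model


def route_stepwise(
    question: str,
    small_step: Callable[[str, List[Step]], Step],
    large_step: Callable[[str, List[Step]], Step],
    prm_score: Callable[[str, List[Step], Step], float],
    is_final: Callable[[Step], bool],
    threshold: float = 0.5,     # assumed cutoff on the PRM score
    max_escalations: int = 2,   # assumed budget on large-model calls
    max_steps: int = 16,
) -> RoutingResult:
    """Threshold-based stepwise routing (illustrative assumption only)."""
    steps: List[Step] = []
    escalated: List[int] = []
    for i in range(max_steps):
        # Draft the next step with the cheap model.
        candidate = small_step(question, steps)
        # Score the drafted step given the partial solution so far.
        score = prm_score(question, steps, candidate)
        # Escalate only if the step looks wrong and budget remains.
        if score < threshold and len(escalated) < max_escalations:
            candidate = large_step(question, steps)
            escalated.append(i)
        steps.append(candidate)
        if is_final(candidate):
            break
    return RoutingResult(steps, escalated)


# --- Toy stubs standing in for real models (purely hypothetical) ---
def toy_small(question: str, steps: List[Step]) -> Step:
    if len(steps) == 0:
        return "a = 2"
    if len(steps) == 1:
        return "b = a + 1 = 4"  # deliberate arithmetic slip
    # Later steps build on earlier ones, so an uncorrected slip cascades.
    return "answer: " + steps[1].split("= ")[-1]


def toy_large(question: str, steps: List[Step]) -> Step:
    if len(steps) == 1:
        return "b = a + 1 = 3"  # corrected step
    return toy_small(question, steps)


def toy_prm(question: str, steps: List[Step], candidate: Step) -> float:
    # Low score for the known-bad step, high score otherwise.
    return 0.1 if candidate.endswith("= 4") else 0.9


def toy_is_final(step: Step) -> bool:
    return step.startswith("answer")


result = route_stepwise("What is b?", toy_small, toy_large, toy_prm, toy_is_final)
print(result.steps, result.escalated)
```

In this toy run only the second step is escalated, and the small model then finishes from the corrected prefix with "answer: 3". Setting `max_escalations=0` leaves the slip in place and it propagates to the final answer, which is the cascading failure the abstract describes.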
