Latent Action Reparameterization for Efficient Agent Inference

Wenhao Huang; Qingwen Zeng; Qiyue Chen; Zijie Guo; Yu Sun; Cheng Yang; Siru Ouyang; Jiri Gesi; Fang Wu; Jiayi Zhang; Huaming Chen; Bang Liu; Xiangru Tang; Chenglin Wu

arXiv:2605.18597 · cs.AI · May 20, 2026

TL;DR

The paper introduces Latent Action Reparameterization (LAR), a method that learns compact, semantic latent actions to reduce decision horizons and improve inference efficiency in large language model agents.

Contribution

LAR is a novel framework that learns latent action spaces from agent trajectories, enabling more efficient decision-making over shorter horizons without sacrificing expressiveness.

Findings

01 LAR significantly reduces the effective action horizon across benchmarks.

02 LAR decreases action tokens and inference time while maintaining or improving success rates.

03 LAR is more efficient than traditional macro or hierarchical approaches.

Abstract

Large language model (LLM) agents often rely on long sequences of low-level textual actions, resulting in large effective decision horizons and high inference cost. While prior work has focused on improving inference efficiency through system-level optimizations or prompt engineering, we argue that a key bottleneck lies in the representation of the action space itself. We propose Latent Action Reparameterization (LAR), a framework that learns a compact latent action space in which each latent action corresponds to a multi-step semantic behavior. By reparameterizing agent actions into latent units, LAR enables decision making over a shorter effective horizon while preserving the expressiveness of the original action space. Unlike hand-crafted macros or hierarchical controllers, latent actions are learned from agent trajectories and integrated directly into the model, allowing both…
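
To make the horizon argument concrete, here is a minimal toy sketch of the reparameterization idea. It is not the paper's method: LAR learns its latent actions from agent trajectories and integrates them into the model, whereas this example uses a hand-written lookup table. The codebook entries, the latent token names, and the `encode`/`decode` helpers are all hypothetical and exist only to show how one latent unit can stand in for a multi-step behavior without losing the ability to emit primitive actions.

```python
from typing import Dict, List, Tuple

# Hypothetical codebook: each latent action expands to a multi-step sequence
# of low-level textual actions. LAR learns such units from trajectories;
# these entries are hard-coded purely for illustration.
CODEBOOK: Dict[str, Tuple[str, ...]] = {
    "<LA_OPEN_AND_SEARCH>": (
        "open_browser", "focus_search_box", "type_query", "press_enter",
    ),
    "<LA_SELECT_FIRST>": ("scroll_to_results", "click_first_result"),
}


def decode(latent_plan: List[str]) -> List[str]:
    """Expand latent actions into primitives; unknown tokens pass through."""
    primitives: List[str] = []
    for token in latent_plan:
        primitives.extend(CODEBOOK.get(token, (token,)))
    return primitives


def encode(primitives: List[str]) -> List[str]:
    """Greedily re-encode a primitive trajectory using the codebook."""
    # Try longer expansions first so the longest match wins.
    entries = sorted(CODEBOOK.items(), key=lambda kv: len(kv[1]), reverse=True)
    plan: List[str] = []
    i = 0
    while i < len(primitives):
        for token, expansion in entries:
            n = len(expansion)
            if tuple(primitives[i:i + n]) == expansion:
                plan.append(token)
                i += n
                break
        else:
            # No latent action matches here: keep the primitive action,
            # so the original action space stays fully expressible.
            plan.append(primitives[i])
            i += 1
    return plan


trajectory = [
    "open_browser", "focus_search_box", "type_query", "press_enter",
    "scroll_to_results", "click_first_result", "read_page",
]
latent_plan = encode(trajectory)

print(f"primitive horizon: {len(trajectory)} steps")
print(f"latent horizon:    {len(latent_plan)} steps")
```

In this toy trajectory the agent would make three decisions instead of seven, and `decode(latent_plan)` recovers the original sequence exactly. A learned latent space would not be a fixed table, so the sketch only illustrates the bookkeeping, not how the latent actions are obtained.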
