CARL: Criticality-Aware Agentic Reinforcement Learning
Leyang Shen, Yang Zhang, Chun Kai Ling, Xiaoyan Zhao, Tat-Seng Chua

TL;DR
CARL is a reinforcement learning algorithm that uses entropy to identify critical states, focusing training on them to improve performance and efficiency in long-horizon tasks.
Contribution
The paper introduces CARL, a criticality-aware RL method that restricts model updates to actions taken from high-criticality states and excludes actions taken from low-criticality states, with the aim of improving learning efficiency and effectiveness.
Findings
CARL outperforms traditional methods in diverse tasks.
It achieves higher efficiency by focusing on critical states.
Experimental results confirm improved performance and resource utilization.
Abstract
Agents capable of accomplishing complex tasks through multiple interactions with the environment have emerged as a popular research direction. However, in such multi-step settings, the conventional group-level policy optimization algorithm becomes suboptimal because of its underlying assumption that each step holds equal contribution, which deviates significantly from reality. Our analysis reveals that only the action choices on a small fraction of states are critical in determining the final outcome. Building on this insight, we propose CARL, a criticality-aware reinforcement learning algorithm tailored for long-horizon agentic reasoning. CARL leverages entropy as a heuristic proxy for state criticality and achieves focused training by assigning rewards to actions taken from high-criticality states while excluding actions taken from low-criticality states from model updates, avoiding…
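The selection idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the "top fraction of steps by entropy" selection rule, the `top_fraction` parameter, and the use of a single trajectory-level advantage value are all assumptions made for illustration. The sketch computes the policy entropy at each step, treats the highest-entropy steps as critical, and zeroes the learning signal everywhere else so that low-criticality actions do not contribute to the update.

```python
import math
from typing import List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def critical_mask(step_entropies: Sequence[float], top_fraction: float) -> List[bool]:
    """Mark the highest-entropy steps as critical.

    Hypothetical selection rule: keep the top `top_fraction` of steps
    (at least one). The paper may use a different criterion.
    """
    n = len(step_entropies)
    if n == 0:
        return []
    k = max(1, math.ceil(top_fraction * n))
    # Rank step indices by entropy, highest first; sorted() is stable,
    # so ties keep the earlier step.
    ranked = sorted(range(n), key=lambda i: -step_entropies[i])
    chosen = set(ranked[:k])
    return [i in chosen for i in range(n)]


def masked_advantages(trajectory_advantage: float, mask: Sequence[bool]) -> List[float]:
    """Give the trajectory-level signal only to critical steps.

    Non-critical steps receive 0.0, so they drop out of a
    policy-gradient style update (assumed update form).
    """
    return [trajectory_advantage if m else 0.0 for m in mask]


# Toy trajectory: one action distribution per step (made-up numbers).
policy_outputs = [
    [0.97, 0.01, 0.01, 0.01],  # confident step, low entropy
    [0.25, 0.25, 0.25, 0.25],  # uniform step, maximal entropy
    [0.90, 0.05, 0.03, 0.02],  # fairly confident step
    [0.40, 0.30, 0.20, 0.10],  # uncertain step
]
entropies = [entropy(p) for p in policy_outputs]
mask = critical_mask(entropies, top_fraction=0.5)
advantages = masked_advantages(1.0, mask)
print(mask)
print(advantages)
```

In this toy trajectory the second and fourth steps have the highest entropy, so only they keep the learning signal. A fixed fraction is only one possible thresholding choice; an absolute entropy threshold would behave differently on trajectories where the policy is uniformly confident or uniformly uncertain.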
