Learning CLI Agents with Structured Action Credit under Selective Observation

Haoyang Su; Ying Wen

arXiv:2605.08013 · cs.AI · May 11, 2026

TL;DR

This paper trains CLI agents by combining selective observation, which surfaces task-relevant context, with credit assignment that leverages the structured attributes of CLI actions.

Contribution

It proposes $\sigma$-Reveal for context selection and $\text{A}^3$ for credit assignment, addressing key bottlenecks in CLI agent learning.

Findings

01 $\sigma$-Reveal enhances context relevance in CLI tasks.

02 $\text{A}^3$ improves credit assignment in multi-turn trajectories.

03 ShellOps dataset enables benchmarking of CLI agent performance.

Abstract

Command line interface (CLI) agents are emerging as a practical paradigm for agent-computer interaction over evolving filesystems, executable command line programs, and online execution feedback. Recent work has used reinforcement learning (RL) to learn these interaction abilities from verifiable task feedback, yet few methods exploit the native structured attributes of CLI actions as learning signals. Beyond this underused action structure, CLI learning also couples two bottlenecks for coding agents. First, the agent must identify task-relevant evidence in a large codebase from partial observations. Second, sparse terminal rewards must be assigned to the actions that shape a long multi-turn trajectory. We study these bottlenecks through shell-driven information extraction and file editing tasks. For selective observation, we introduce $\sigma$-Reveal, an inference-time mechanism that…

Code & Models

Datasets

Hoyant-Su/ShellOps (dataset, 1.3k downloads)
