Traversal-as-Policy: Log-Distilled Gated Behavior Trees as Externalized, Verifiable Policies for Safe, Robust, and Efficient Agents

Peiran Li; Jiashuo Sun; Fangzhou Lin; Shuo Xing; Tianfu Fu; Suofei Feng; Chaoqun Ni; Zhengzhong Tu

arXiv:2603.05517·cs.LG·March 9, 2026

Open Access

TL;DR

This paper introduces Traversal-as-Policy, a method that distills execution logs into a Gated Behavior Tree and uses traversal of that tree to control task execution, improving the safety, robustness, and efficiency of autonomous agents.

Contribution

The paper externalizes implicit long-horizon policies into an executable, verifiable behavior tree, with the aim of improving the safety and success rates of autonomous agents.

Findings

01. Success rate on SWE-bench Verified increased from 34.6% to 73.6%.

02. Violations reduced from 2.8% to 0.2%.

03. Token usage decreased significantly, improving efficiency.

Abstract

Autonomous LLM agents fail because long-horizon policy remains implicit in model weights and transcripts, while safety is retrofitted post hoc. We propose Traversal-as-Policy: distill sandboxed OpenHands execution logs into a single executable Gated Behavior Tree (GBT) and treat tree traversal -- rather than unconstrained generation -- as the control policy whenever a task is in coverage. Each node encodes a state-conditioned action macro mined and merge-checked from successful trajectories; macros implicated by unsafe traces attach deterministic pre-execution gates over structured tool context and bounded history, updated under experience-grounded monotonicity so previously rejected unsafe contexts cannot be re-admitted. At runtime, a lightweight traverser matches the base model's intent to child macros, executes one macro at a time under global and node-local gating, and when stalled…
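
The gating and traversal ideas in the abstract can be illustrated with a minimal sketch. Every name here (`Gate`, `Node`, `step`, `context_key`) is hypothetical and is not taken from the paper or from OpenHands. Two simplifying assumptions: intent matching is an exact string comparison on the child's name, and a gate is a grow-only set of rejected contexts. The grow-only set is one simple way to make sure a previously rejected context is never re-admitted. Bounded history and the stalled-traversal behavior are left out because the excerpt does not describe them.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, FrozenSet, List, Optional, Set, Tuple

Context = Dict[str, str]  # assumed shape of the structured tool context


def context_key(ctx: Context) -> FrozenSet[Tuple[str, str]]:
    """Hashable fingerprint of a context, used to remember rejections."""
    return frozenset(ctx.items())


@dataclass
class Gate:
    """Deterministic pre-execution check backed by a grow-only rejection set."""
    rejected: Set[FrozenSet[Tuple[str, str]]] = field(default_factory=set)

    def reject(self, ctx: Context) -> None:
        # Monotone update: contexts are only ever added, never removed.
        self.rejected.add(context_key(ctx))

    def admits(self, ctx: Context) -> bool:
        return context_key(ctx) not in self.rejected


@dataclass
class Node:
    """One tree node: a named action macro, its local gate, and its children."""
    name: str
    macro: Callable[[Context], str]
    gate: Gate = field(default_factory=Gate)
    children: List["Node"] = field(default_factory=list)


def step(node: Node, intent: str, ctx: Context,
         global_gate: Gate) -> Tuple[str, Optional[Node]]:
    """Match the intent to a child and run its macro if both gates admit ctx."""
    for child in node.children:
        if child.name == intent:  # stand-in for the real intent matcher
            if global_gate.admits(ctx) and child.gate.admits(ctx):
                child.macro(ctx)
                return "executed", child
            return "blocked", None
    return "no_match", None  # task is out of coverage at this node
```

In this sketch, a caller would treat `"no_match"` as the signal that the task is out of coverage at the current node, and `"blocked"` as a gate refusal that must not be retried with the same context.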

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Software Engineering Methodologies · Reinforcement Learning in Robotics