Learning Correct Behavior from Examples: Validating Sequential Execution in Autonomous Agents

Reshabh K Sharma; Gaurav Mittal; Yu Hu

arXiv:2605.03159·cs.AI·May 6, 2026

TL;DR

This paper introduces a novel algorithm that learns correct sequential behavior of autonomous agents from minimal examples and validates new executions using semantic understanding and topological matching.

Contribution

It combines dominator analysis from compiler theory with multimodal large language model semantics to learn and validate agent behavior automatically from as few as 2-10 passing execution traces.
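To make the dominator idea concrete, here is a minimal sketch (not the authors' implementation) that merges a few passing traces into a graph and treats the dominators of a synthetic END node as the essential states. The state names, the `START`/`END` sentinels, and the helpers `build_graph` and `dominators` are all hypothetical, and states are merged by exact label equality here, whereas the paper relies on semantic equivalence detection.

```python
from typing import Dict, List, Set


def build_graph(traces: List[List[str]]) -> Dict[str, Set[str]]:
    """Merge traces into a successor map with synthetic START/END nodes.

    Assumption: two states are the same node iff their labels are equal.
    """
    succ: Dict[str, Set[str]] = {"START": set(), "END": set()}
    for trace in traces:
        path = ["START", *trace, "END"]
        for src, dst in zip(path, path[1:]):
            succ.setdefault(src, set()).add(dst)
            succ.setdefault(dst, set())
    return succ


def dominators(succ: Dict[str, Set[str]], entry: str = "START") -> Dict[str, Set[str]]:
    """Classic iterative data-flow computation of dominator sets."""
    nodes = set(succ)
    preds: Dict[str, Set[str]] = {n: set() for n in nodes}
    for src, dsts in succ.items():
        for dst in dsts:
            preds[dst].add(src)

    # Start from "everything dominates everything" and shrink to a fixed point.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            if not preds[n]:
                continue  # unreachable node, leave as-is
            new = {n} | set.intersection(*(dom[p] for p in preds[n]))
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


# Three hypothetical passing traces of a shopping agent.
traces = [
    ["open_app", "login", "search", "checkout"],
    ["open_app", "login", "browse", "checkout"],
    ["open_app", "login", "search", "apply_coupon", "checkout"],
]

# States that every path from START to END must pass through.
essential = dominators(build_graph(traces))["END"] - {"START", "END"}
print(sorted(essential))
```

In this toy input, `search`, `browse`, and `apply_coupon` each appear on only some paths, so they are not dominators of `END` and would be treated as optional rather than essential.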

Findings

01. High accuracy in detecting bugs with only 3 training traces

02. Effective across domains like UI testing, code generation, and robotics

03. Provides explainable validation with coverage metrics

Abstract

As autonomous agents become increasingly sophisticated, validating their sequential behavior presents a significant challenge. Traditional testing approaches require manual specification, exact sequence matching, or thousands of training examples. We present a novel algorithm that automatically learns correct behavior from just 2-10 passing execution traces and validates new executions against this learned model. Our approach combines dominator analysis from compiler theory with multimodal large language model-powered semantic understanding to identify essential states and handle non-deterministic behavior. The system constructs a generalized ground truth model using Prefix Tree Acceptors, merges traces through multi-tiered equivalence detection, and validates new executions via topological subsequence matching. In controlled experiments, our system achieved high accuracy in detecting…
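The pipeline named in the abstract can be illustrated with a rough sketch, under several assumptions: a Prefix Tree Acceptor is built from passing traces by exact label match, and validation is approximated by plain ordered subsequence matching against a list of essential states. The paper's multi-tiered equivalence detection and its exact notion of topological matching are not reproduced here; `PTANode`, `build_pta`, `accepts`, `is_subsequence`, and `coverage` are hypothetical names.

```python
from typing import Dict, List


class PTANode:
    """One node of a Prefix Tree Acceptor (a trie over state labels)."""

    def __init__(self) -> None:
        self.children: Dict[str, "PTANode"] = {}
        self.accepting = False


def build_pta(traces: List[List[str]]) -> PTANode:
    """Insert every passing trace; shared prefixes share nodes."""
    root = PTANode()
    for trace in traces:
        node = root
        for state in trace:
            node = node.children.setdefault(state, PTANode())
        node.accepting = True
    return root


def accepts(root: PTANode, trace: List[str]) -> bool:
    """Exact acceptance: the trace must replay a training trace."""
    node = root
    for state in trace:
        node = node.children.get(state)
        if node is None:
            return False
    return node.accepting


def is_subsequence(required: List[str], trace: List[str]) -> bool:
    """True if `required` occurs in `trace` in order, gaps allowed."""
    it = iter(trace)
    return all(state in it for state in required)


def coverage(required: List[str], trace: List[str]) -> float:
    """Fraction of required states matched in order before the first miss."""
    if not required:
        return 1.0
    it = iter(trace)
    matched = 0
    for state in required:
        if state in it:
            matched += 1
        else:
            break
    return matched / len(required)


# Hypothetical essential states and two new executions to validate.
required = ["open_app", "login", "checkout"]
pta = build_pta([["open_app", "login", "search", "checkout"]])

ok_trace = ["open_app", "dismiss_popup", "login", "search", "checkout"]
bad_trace = ["open_app", "search", "checkout"]  # skips login

print(accepts(pta, ok_trace), is_subsequence(required, ok_trace))
print(is_subsequence(required, bad_trace), coverage(required, bad_trace))
```

The contrast between `accepts` and `is_subsequence` shows why exact sequence matching is brittle: the first execution contains an extra, harmless state and fails exact acceptance, yet still satisfies the ordered essential states.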
