Detecting Non-Optimal Decisions of Embodied Agents via Diversity-Guided Metamorphic Testing
Wenzhao Wu, Yahui Tang, Mingfei Cheng, Wenbing Tang, Yuan Zhou, Yang Liu

TL;DR
This paper introduces NoD-DGMT, a framework using diversity-guided metamorphic testing to detect non-optimal decisions in embodied agents, improving evaluation of their decision quality beyond functional correctness.
Contribution
It formalizes the problem of non-optimal decisions in embodied agents and proposes four metamorphic relations along with a diversity-guided selection strategy for effective detection.
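As a rough illustration of what a diversity-guided selection step could look like, the sketch below greedily picks the candidate test cases that are farthest from those already chosen. It is an assumption-laden example, not the paper's strategy: the function name `select_diverse`, the numeric feature vectors, and the Euclidean max-min criterion are all hypothetical choices made for illustration.

```python
import math
from typing import List, Sequence


def select_diverse(features: Sequence[Sequence[float]], k: int) -> List[int]:
    """Greedy max-min selection: return indices of up to k mutually distant items.

    `features` holds one hypothetical feature vector per candidate test case.
    Euclidean distance is an assumed diversity measure, not the paper's.
    """
    n = len(features)
    if k <= 0 or n == 0:
        return []
    selected = [0]  # seed with the first candidate
    # min_dist[i] = distance from candidate i to its nearest selected candidate
    min_dist = [math.dist(f, features[0]) for f in features]
    while len(selected) < min(k, n):
        remaining = [i for i in range(n) if i not in selected]
        # pick the unselected candidate that is farthest from the selected set
        nxt = max(remaining, key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, f in enumerate(features):
            min_dist[i] = min(min_dist[i], math.dist(f, features[nxt]))
    return selected


if __name__ == "__main__":
    candidates = [(0, 0), (0.1, 0), (5, 5), (5, 5.1), (10, 0)]
    print(select_diverse(candidates, 3))
```

Near-duplicate candidates such as `(0, 0)` and `(0.1, 0)` are unlikely to be selected together, which is the intuition behind filtering a test suite for diversity.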
Findings
Achieves a 31.9% violation detection rate on average.
Diversity-guided filtering improves detection rates by 4.3%.
Outperforms six baseline methods, with a 16.8% relative improvement.
Abstract
As embodied agents advance toward real-world deployment, ensuring optimal decisions becomes critical for resource-constrained applications. Current evaluation methods focus primarily on functional correctness, overlooking the non-functional optimality of generated plans. This gap can lead to significant performance degradation and resource waste. We identify and formalize the problem of Non-optimal Decisions (NoDs), where agents complete tasks successfully but inefficiently. We present NoD-DGMT, a systematic framework for detecting NoDs in embodied agent task planning via diversity-guided metamorphic testing. Our key insight is that optimal planners should exhibit invariant behavioral properties under specific transformations. We design four novel metamorphic relations capturing fundamental optimality properties: position detour suboptimality, action optimality completeness, condition…
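The idea that an optimal planner should behave consistently under transformations can be made concrete with a toy check. The sketch below is an illustrative assumption, not the paper's definition of any of its four relations: it tests whether a planner's direct route ever costs more than the same planner's route through an intermediate waypoint. Since the waypoint route is itself a valid route to the goal, a cheaper waypoint route shows the direct plan was non-optimal, with no ground-truth oracle needed. The grid world, both planners, and all function names are hypothetical.

```python
from typing import Callable, List, Tuple

Pos = Tuple[int, int]
Planner = Callable[[Pos, Pos], List[Pos]]


def manhattan_planner(start: Pos, goal: Pos) -> List[Pos]:
    """Optimal on an obstacle-free grid: move along x, then along y."""
    x, y = start
    gx, gy = goal
    path = [start]
    while x != gx:
        x += 1 if gx > x else -1
        path.append((x, y))
    while y != gy:
        y += 1 if gy > y else -1
        path.append((x, y))
    return path


def flawed_planner(start: Pos, goal: Pos) -> List[Pos]:
    """Hypothetical flawed planner: reaches the goal, but routes through the
    origin whenever start and goal differ on both axes."""
    if start[0] != goal[0] and start[1] != goal[1]:
        first = manhattan_planner(start, (0, 0))
        second = manhattan_planner((0, 0), goal)
        return first + second[1:]  # drop the duplicated origin cell
    return manhattan_planner(start, goal)


def cost(path: List[Pos]) -> int:
    """Number of moves in a path."""
    return len(path) - 1


def detour_violation(planner: Planner, start: Pos, waypoint: Pos, goal: Pos) -> bool:
    """True if the direct plan costs more than the plan forced through waypoint."""
    direct = cost(planner(start, goal))
    via = cost(planner(start, waypoint)) + cost(planner(waypoint, goal))
    return direct > via


if __name__ == "__main__":
    s, w, g = (3, 3), (5, 3), (5, 5)
    print(detour_violation(manhattan_planner, s, w, g))
    print(detour_violation(flawed_planner, s, w, g))
```

Both planners always reach the goal, so a purely functional check would pass them; only the cost comparison separates them.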
Taxonomy
Topics: AI-based Problem Solving and Planning · Robotic Path Planning Algorithms · Reinforcement Learning in Robotics
