Detecting Non-Optimal Decisions of Embodied Agents via Diversity-Guided Metamorphic Testing

Wenzhao Wu; Yahui Tang; Mingfei Cheng; Wenbing Tang; Yuan Zhou; Yang Liu

arXiv:2512.20083 · cs.SE · December 24, 2025

Open Access

TL;DR

This paper introduces NoD-DGMT, a framework using diversity-guided metamorphic testing to detect non-optimal decisions in embodied agents, improving evaluation of their decision quality beyond functional correctness.

Contribution

It formalizes the problem of non-optimal decisions in embodied agents and proposes four metamorphic relations along with a diversity-guided selection strategy for effective detection.
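
As a rough illustration of what a diversity-guided selection step could look like, the sketch below greedily picks test cases that are far apart in a feature space (max-min, or farthest-point, selection). This summary does not describe the paper's actual diversity metric or selection procedure, so the feature vectors, the Euclidean distance, and the names `euclidean` and `select_diverse` are all assumptions made for illustration.

```python
from typing import List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equally sized feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def select_diverse(candidates: List[Sequence[float]], k: int) -> List[int]:
    """Greedy max-min selection (an assumed stand-in for the paper's strategy).

    Starts from the first candidate, then repeatedly adds the candidate whose
    nearest already-selected neighbour is farthest away. Returns indices.
    """
    if k <= 0 or not candidates:
        return []
    selected = [0]
    while len(selected) < min(k, len(candidates)):
        remaining = (i for i in range(len(candidates)) if i not in selected)
        best = max(
            remaining,
            key=lambda i: min(euclidean(candidates[i], candidates[j]) for j in selected),
        )
        selected.append(best)
    return selected


# Hypothetical 2-D features for five generated test cases.
features = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (0.0, 0.1), (5.0, 0.0)]
chosen = select_diverse(features, k=3)
```

Here the near-duplicate cases at indices 1 and 3 are skipped in favour of cases that cover different regions of the feature space, which is the general intuition behind filtering test cases for diversity.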

Findings

01. NoD-DGMT achieves a 31.9% violation detection rate on average.

02. Diversity-guided filtering improves detection rates by 4.3%.

03. NoD-DGMT outperforms six baseline methods with a 16.8% relative improvement.

Abstract

As embodied agents advance toward real-world deployment, ensuring optimal decisions becomes critical for resource-constrained applications. Current evaluation methods focus primarily on functional correctness, overlooking the non-functional optimality of generated plans. This gap can lead to significant performance degradation and resource waste. We identify and formalize the problem of Non-optimal Decisions (NoDs), where agents complete tasks successfully but inefficiently. We present NoD-DGMT, a systematic framework for detecting NoDs in embodied agent task planning via diversity-guided metamorphic testing. Our key insight is that optimal planners should exhibit invariant behavioral properties under specific transformations. We design four novel metamorphic relations capturing fundamental optimality properties: position detour suboptimality, action optimality completeness, condition…
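
The idea that an optimal planner should behave consistently under a transformation can be sketched with a simple detour-style check on a grid. This is not the paper's definition of its metamorphic relations. It assumes a hypothetical open 5x5 grid, unit step costs, and a relation based on the triangle inequality: a direct plan from start to goal should never cost more than the same planner's plans chained through an intermediate waypoint. All names (`bfs_planner`, `flaky_planner`, `violates_detour_relation`) are illustrative.

```python
from collections import deque
from typing import Callable, List, Tuple

Cell = Tuple[int, int]
Planner = Callable[[Cell, Cell], List[Cell]]

GRID_SIZE = 5  # hypothetical open 5x5 grid with no obstacles


def bfs_planner(start: Cell, goal: Cell) -> List[Cell]:
    """Reference planner: breadth-first search, optimal for unit step costs."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            break
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            in_grid = 0 <= nxt[0] < GRID_SIZE and 0 <= nxt[1] < GRID_SIZE
            if in_grid and nxt not in parents:
                parents[nxt] = cell
                queue.append(nxt)
    path, node = [], goal
    while node is not None:  # walk parent links back to the start
        path.append(node)
        node = parents[node]
    return path[::-1]


def flaky_planner(start: Cell, goal: Cell) -> List[Cell]:
    """Hypothetical faulty planner: always reaches the goal, but routes
    trips longer than 3 steps through the corner (0, 0)."""
    distance = abs(start[0] - goal[0]) + abs(start[1] - goal[1])
    if distance <= 3:
        return bfs_planner(start, goal)
    corner = (0, 0)
    return bfs_planner(start, corner) + bfs_planner(corner, goal)[1:]


def plan_cost(path: List[Cell]) -> int:
    """Number of moves in a path."""
    return len(path) - 1


def violates_detour_relation(planner: Planner, start: Cell, waypoint: Cell, goal: Cell) -> bool:
    """True if the direct plan costs more than the plan chained via waypoint,
    which proves the direct plan was not optimal."""
    direct = plan_cost(planner(start, goal))
    detour = plan_cost(planner(start, waypoint)) + plan_cost(planner(waypoint, goal))
    return direct > detour


start, waypoint, goal = (4, 4), (4, 2), (4, 0)
bfs_violation = violates_detour_relation(bfs_planner, start, waypoint, goal)
flaky_violation = violates_detour_relation(flaky_planner, start, waypoint, goal)
```

Both planners complete the task, so a functional-correctness check passes for each. Only the relation exposes the faulty planner: its direct plan takes 12 moves while its own two-leg plan through the waypoint takes 4, and no ground-truth optimal cost is needed to flag the inconsistency.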

Taxonomy

Topics: AI-based Problem Solving and Planning · Robotic Path Planning Algorithms · Reinforcement Learning in Robotics