Rethink Repeatable Measures of Robot Performance with Statistical Query

Bowen Weng; Linda Capito; Guillermo A. Castillo; Dylan Khor

arXiv:2505.08216·cs.RO·December 22, 2025

TL;DR

This paper introduces a provably repeatable modification for statistical query (SQ) algorithms used in robot performance testing, so that different stakeholders testing the same robot at different times or locations obtain the same outcome, while keeping accuracy and efficiency bounded.

Contribution

It proposes a lightweight, adaptive modification to any statistical query routine that guarantees repeatability with bounded accuracy and efficiency, applicable across various robot testing scenarios.
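To make the idea of wrapping a statistical query concrete, here is a minimal Python sketch of one well-known way to make an estimate repeatable: snapping it to a randomly shifted grid whose shift comes from a seed shared by all testers. This is an illustrative assumption, not the paper's algorithm; the names `statistical_query`, `repeatable_query`, `grid_width`, and `shared_seed` are hypothetical, and the fixed grid here is not adaptive.

```python
import random
import statistics


def statistical_query(samples):
    """Plain statistical query: the empirical mean of test outcomes."""
    return statistics.fmean(samples)


def repeatable_query(samples, grid_width, shared_seed):
    """Hypothetical wrapper: snap the estimate to a randomly shifted grid.

    The grid offset is derived from a seed that every tester shares, so two
    testers whose raw estimates fall in the same grid cell report the same value.
    """
    offset = random.Random(shared_seed).uniform(0.0, grid_width)
    estimate = statistical_query(samples)
    # Index of the nearest grid point of the form offset + k * grid_width.
    k = round((estimate - offset) / grid_width)
    return offset + k * grid_width


# Two testers observe slightly different pass rates for the same robot.
tester_a = [1] * 700 + [0] * 300   # raw estimate 0.700
tester_b = [1] * 705 + [0] * 295   # raw estimate 0.705

result_a = repeatable_query(tester_a, grid_width=0.1, shared_seed=42)
result_b = repeatable_query(tester_b, grid_width=0.1, shared_seed=42)
print(result_a == result_b)
```

In this sketch the reported value moves by at most half of `grid_width` from the raw estimate, which is the accuracy cost. The two reports disagree only when the raw estimates fall on opposite sides of a cell boundary, which becomes less likely as the grid grows coarser relative to the sampling noise.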

Findings

01. Proven repeatability guarantees for SQ algorithms in robot testing.

02. Effective across manipulator, vehicle, and humanoid robot evaluation scenarios.

03. Maintains accuracy and efficiency bounds in diverse testing conditions.

Abstract

For a general standardized testing algorithm designed to evaluate a specific aspect of a robot's performance, several key expectations are commonly imposed. Beyond accuracy (i.e., closeness to a typically unknown ground-truth reference) and efficiency (i.e., feasibility within acceptable testing costs and equipment constraints), one particularly important attribute is repeatability. Repeatability refers to the ability to consistently obtain the same testing outcome when similar testing algorithms are executed on the same subject robot by different stakeholders, across different times or locations. However, achieving repeatable testing has become increasingly challenging as the components involved grow more complex, intelligent, diverse, and, most importantly, stochastic. While related efforts have addressed repeatability at ethical, hardware, and procedural levels, this study focuses…
