Distracted Robot: How Visual Clutter Undermine Robotic Manipulation
Amir Rasouli, Montgomery Alban, Sajjad Pakdamansavoji, Zhiyuan Li, Zhanguang Zhang, Aaron Wu, Xuan Zhao

TL;DR
This paper introduces a psychophysical evaluation protocol for robotic manipulation in cluttered scenes, revealing how environmental factors and distractor arrangements significantly impair policy performance, with implications for improving robustness.
Contribution
It proposes a unified clutter measure for evaluation, systematically tests manipulation policies in realistic scenarios, and analyzes how clutter affects performance and model vulnerabilities.
Findings
Clutter reduces manipulation success by up to 34%.
Different policies show unique vulnerabilities to clutter.
Finetuning improves performance but does not fully mitigate clutter effects.
Abstract
In this work, we propose an evaluation protocol for examining the performance of robotic manipulation policies in cluttered scenes. Contrary to prior works, we approach evaluation from a psychophysical perspective; we therefore use a unified measure of clutter that accounts for environmental factors as well as the distractors' quantity, characteristics, and arrangement. Using this measure, we systematically construct evaluation scenarios in both hyper-realistic simulation and the real world and conduct extensive experiments on manipulation policies, in particular vision-language-action (VLA) models. Our experiments highlight the significant impact of scene clutter, which lowers the performance of the policies by as much as 34%, and show that, despite achieving similar average performance across the tasks, different VLA policies have unique vulnerabilities and a relatively low agreement on…
Taxonomy
Topics: Multimodal Machine Learning Applications · Robot Manipulation and Learning · Reinforcement Learning in Robotics
