Distracted Robot: How Visual Clutter Undermine Robotic Manipulation

Amir Rasouli; Montgomery Alban; Sajjad Pakdamansavoji; Zhiyuan Li; Zhanguang Zhang; Aaron Wu; Xuan Zhao

arXiv:2511.22780 · cs.RO · December 1, 2025

Open Access

TL;DR

This paper introduces a psychophysical evaluation protocol for robotic manipulation in cluttered scenes, revealing how environmental factors and distractor arrangements significantly impair policy performance, with implications for improving robustness.

Contribution

It proposes a unified clutter measure for evaluation, systematically tests manipulation policies in realistic scenarios, and analyzes how clutter affects performance and model vulnerabilities.

Findings

01 Clutter reduces manipulation success by up to 34%.

02 Different policies show unique vulnerabilities to clutter.

03 Finetuning improves performance but does not fully mitigate clutter effects.

Abstract

In this work, we propose an evaluation protocol for examining the performance of robotic manipulation policies in cluttered scenes. Contrary to prior works, we approach evaluation from a psychophysical perspective and therefore use a unified measure of clutter that accounts for environmental factors as well as the distractors' quantity, characteristics, and arrangement. Using this measure, we systematically construct evaluation scenarios in both hyper-realistic simulation and the real world and conduct extensive experimentation on manipulation policies, in particular vision-language-action (VLA) models. Our experiments highlight the significant impact of scene clutter, which lowers the performance of the policies by as much as 34%, and show that, despite achieving similar average performance across the tasks, different VLA policies have unique vulnerabilities and a relatively low agreement on…

Taxonomy

Topics: Multimodal Machine Learning Applications · Robot Manipulation and Learning · Reinforcement Learning in Robotics