Agent-Centric Observation Adaptation for Robust Visual Control under Dynamic Perturbations

Zhengru Fang; Yu Guo; Fei Liu; Yuang Zhang; Yihang Tao; Senkang Hu; Wenbo Ding; Yuguang Fang

arXiv:2604.24661 · cs.RO · May 11, 2026


TL;DR

This paper introduces ACO-MoE, a plug-and-play observation adapter that enhances visual control robustness under dynamic, non-stationary corruptions by focusing on foreground information and using a mixture-of-experts approach.

Contribution

The paper proposes an agent-centric, foreground-focused observation adaptation method that is pretrained offline and improves downstream control performance under diverse, switching visual perturbations.

Findings

01. Achieves 95.3% recovery of clean-input performance under challenging corruptions.

02. Generalizes zero-shot to unseen visual perturbations.

03. Consistently improves control across multiple benchmarks.

Abstract

Real-world visual systems face time-varying perturbations, including weather, sensor noise, compression artifacts, and background distractions. Existing image restoration methods are typically designed for fixed corruption types and optimized for pixel-level fidelity, leaving open two questions: how restoration behaves under non-stationary corruption switching, and whether pixel-level fidelity preserves the task-relevant information needed by downstream models. To study this setting, we introduce the Visual Degraded Control Suite (VDCS), a benchmark that injects Markov-switching physical degradations into rendered scenes. We further identify a fundamental failure mode of reconstruction-based representations: faithfully reconstructing corrupted observations forces the latent state to encode corruption-specific nuisance information, thereby contaminating downstream models. From an…
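As a rough illustration of what "Markov-switching" corruption means in this setting, the sketch below samples a per-frame corruption schedule from a small Markov chain. It is not the VDCS implementation: the corruption labels, the `stay_prob` parameter, the uniform switching rule, and the helper names `make_transition_matrix` and `sample_schedule` are all assumptions made for the example.

```python
import random

# Hypothetical corruption labels; the actual VDCS degradation set is not given here.
CORRUPTIONS = ["clean", "rain", "sensor_noise", "compression"]


def make_transition_matrix(n, stay_prob):
    """Build a row-stochastic matrix: stay with stay_prob, else switch uniformly."""
    switch_prob = (1.0 - stay_prob) / (n - 1)
    return [[stay_prob if i == j else switch_prob for j in range(n)]
            for i in range(n)]


def sample_schedule(num_steps, transition, seed=0):
    """Sample one corruption index per frame from the Markov chain."""
    rng = random.Random(seed)
    state = 0  # assumption: every episode starts in the first state ("clean")
    schedule = []
    for _ in range(num_steps):
        schedule.append(state)
        # The next state depends only on the current one (Markov property).
        state = rng.choices(range(len(transition)), weights=transition[state])[0]
    return schedule


P = make_transition_matrix(len(CORRUPTIONS), stay_prob=0.9)
schedule = sample_schedule(20, P)
labels = [CORRUPTIONS[s] for s in schedule]
print(labels)
```

With a high `stay_prob`, a corruption tends to persist for several frames before switching, which is what makes the perturbation non-stationary over an episode rather than fixed or independently resampled per frame.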
