TL;DR
VISION-SLS is a scalable output-feedback control method that pairs learned low-dimensional visual features with System Level Synthesis (SLS) to provide robust constraint satisfaction, under calibrated uncertainty bounds, in visuomotor tasks driven by high-resolution images.
Contribution
It combines learned low-dimensional visual representations with an efficient SLS-based output-feedback control approach for safe, scalable visuomotor control.
Findings
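Applied to two simulated visuomotor tasks (a 4D car and a 10D quadrotor) with images of at least 512 x 512 pixels, to a 59D humanoid task with partial observability, and to real hardware.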
Enables safe, information-gathering behavior with reduced uncertainty.
Outperforms baselines in safety and solve times on hardware.
Abstract
We propose VISION-SLS, a method for nonlinear output-feedback control from high-resolution RGB images which provides robust constraint satisfaction guarantees under calibrated uncertainty bounds despite partial observability, sensor noise, and nonlinear dynamics. To enable scalability while retaining guarantees, we propose: (i) a learned low-dimensional observation map from pretrained visual features with state-dependent error bounds, and (ii) a causal affine time-varying output-feedback policy optimized via System Level Synthesis (SLS). We develop a scalable, novel solver for the resulting nonconvex program that leverages sequential convex programming coupled with efficient Riccati recursions. On two simulated visuomotor tasks (a 4D car and a 10D quadrotor) with >= 512 x 512 pixels and a 59D humanoid task with partial observability, our method enables safe, information-gathering…
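To make the phrase "causal affine time-varying output-feedback policy" concrete, here is a minimal NumPy sketch of that policy class only. It is not the paper's solver: the gains `K` and offsets `k` are random placeholders, whereas in VISION-SLS they would come from the SLS optimization. The names `causal_mask` and `affine_output_feedback`, the dimensions, and the convention that the control at step t may use the observation at step t are all assumptions made for illustration.

```python
import numpy as np

def causal_mask(T, nu, ny):
    """0/1 mask of shape (T*nu, T*ny); block (t, j) is ones iff j <= t."""
    block_pattern = np.tril(np.ones((T, T)))
    return np.kron(block_pattern, np.ones((nu, ny)))

def affine_output_feedback(K, k, y, mask):
    """Stacked controls u = k + (K * mask) @ y for stacked observations y.

    Masking K makes it block lower-triangular, so the control at step t
    depends only on observations from steps 0..t (causality).
    """
    return k + (K * mask) @ y

# Hypothetical sizes: horizon T, control dimension nu, observation dimension ny.
T, nu, ny = 4, 2, 3
rng = np.random.default_rng(0)
mask = causal_mask(T, nu, ny)
K = rng.normal(size=(T * nu, T * ny))  # placeholder gains, not optimized
k = rng.normal(size=T * nu)            # placeholder feedforward terms
y = rng.normal(size=T * ny)            # stacked observations y_0, ..., y_{T-1}

u = affine_output_feedback(K, k, y, mask)

# Perturb only the final observation; earlier controls must not change.
y_perturbed = y.copy()
y_perturbed[-ny:] += 1.0
u_perturbed = affine_output_feedback(K, k, y_perturbed, mask)
```

Comparing `u[:-nu]` with `u_perturbed[:-nu]` shows the causal structure: changing the last observation affects only the last control block. The time-varying aspect is that each block row of `K` is a different gain, one per time step.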
