KOROL: Learning Visualizable Object Feature with Koopman Operator Rollout for Manipulation
Hongyi Chen, Abulikemu Abuduweili, Aviral Agrawal, Yunhai Han, Harish Ravichandar, Changliu Liu, Jeffrey Ichnowski

TL;DR
KOROL introduces a vision-based manipulation framework using Koopman operators to learn interpretable object features, enabling accurate system state propagation and improved task success in robotic manipulation.
Contribution
The paper presents a novel method that uses Koopman operators with learned visual features for manipulation, eliminating the need for ground-truth object states in real-world applications.
Findings
Outperforms existing model-based imitation learning and diffusion policies.
Maintains high task success rates with learned visual features.
Effective in both simulated and real-world robot tasks.
Abstract
Learning dexterous manipulation skills presents significant challenges due to complex nonlinear dynamics that underlie the interactions between objects and multi-fingered hands. Koopman operators have emerged as a robust method for modeling such nonlinear dynamics within a linear framework. However, current methods rely on runtime access to ground-truth (GT) object states, making them unsuitable for vision-based practical applications. Unlike image-to-action policies that implicitly learn visual features for control, we use a dynamics model, specifically the Koopman operator, to learn visually interpretable object features critical for robotic manipulation within a scene. We construct a Koopman operator using object features predicted by a feature extractor and utilize it to auto-regressively advance system states. We train the feature extractor to embed scene information into object…
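The idea of fitting a Koopman operator to lifted states and then advancing them auto-regressively can be illustrated with a minimal NumPy sketch. This is a generic least-squares (EDMD-style) fit, not the authors' implementation: the lifting function `lift`, the names `fit_koopman` and `rollout`, and the toy decay dynamics are all assumptions made for illustration, and the object feature is a plain vector standing in for the output of a learned feature extractor.

```python
import numpy as np


def lift(robot_state, object_feature):
    """Hypothetical lifting: stack the raw state with its element-wise squares."""
    x = np.concatenate([robot_state, object_feature])
    return np.concatenate([x, x ** 2])


def fit_koopman(lifted):
    """Least-squares fit of K such that lifted[t + 1] ≈ K @ lifted[t].

    `lifted` has shape (T, d): one lifted state per time step.
    """
    X, Y = lifted[:-1], lifted[1:]
    # Solve X @ K.T ≈ Y in the least-squares sense.
    K_T, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return K_T.T


def rollout(K, z0, steps):
    """Auto-regressively advance the lifted state: z_{t+1} = K @ z_t."""
    zs = [z0]
    for _ in range(steps):
        zs.append(K @ zs[-1])
    return np.stack(zs)


# Toy demonstration (assumed dynamics): each coordinate decays geometrically,
# so the squared coordinates also evolve linearly in the lifted space.
decay = np.array([0.9, 0.6, 0.3])
states = [np.ones(3)]
for _ in range(20):
    states.append(decay * states[-1])

# First two coordinates play the role of the robot state, the last one the
# object feature that a vision model would predict.
lifted = np.stack([lift(s[:2], s[2:]) for s in states])
K = fit_koopman(lifted)
predicted = rollout(K, lifted[0], steps=20)
print("max rollout error:", np.abs(predicted - lifted).max())
```

In this sketch the operator is fit once from a demonstration trajectory and only the initial lifted state is needed at rollout time. In a vision-based setting, the object-feature part of that state would come from an image rather than from ground-truth object states.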
Taxonomy
Topics: Human Motion and Animation · Human Pose and Action Recognition · Advanced Neural Network Applications
