Multi-Modal Fusion in Contact-Rich Precise Tasks via Hierarchical Policy Learning
Piaopiao Jin, Yinjie Lin, Yanchao Tan, Tiefeng Li, Wei Yang

TL;DR
This paper introduces a hierarchical policy learning framework that fuses visual, force, and proprioceptive data for contact-rich robotic manipulation, enabling precise assembly and generalization to new objects.
Contribution
It presents a novel multi-modal sensor fusion method using hierarchical RL, improving precision and adaptability in contact-rich tasks.
Findings
Achieved 0.25 mm clearance in simulation assembly tasks
Successfully transferred the system from simulation to real robots without fine-tuning
Demonstrated robustness across varied initial configurations and object shapes
Abstract
Combined visual and force feedback plays an essential role in contact-rich robotic manipulation tasks. Current methods focus on developing the feedback control around a single modality while underrating the synergy of the sensors. Fusing different sensor modalities is necessary but remains challenging. A key challenge is to achieve an effective multi-modal control scheme that generalizes to novel objects with precision. This paper proposes a practical multi-modal sensor fusion mechanism using hierarchical policy learning. To begin with, we use a self-supervised encoder that extracts multi-view visual features and a hybrid motion/force controller that regulates force behaviors. Next, the multi-modality fusion is simplified by hierarchical integration of the vision, force, and proprioceptive data in the reinforcement learning (RL) algorithm. Moreover, with hierarchical policy learning, the…
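To make the "hybrid motion/force controller" mentioned in the abstract more concrete, the following is a minimal sketch of the textbook selection-based hybrid position/force law, in which each Cartesian axis is either position-controlled or force-controlled. It is not taken from the paper and may differ from the authors' controller. The names `HybridGains` and `hybrid_command`, the gain values, and the sign convention (a positive force error produces a positive velocity along that axis) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class HybridGains:
    """Hypothetical proportional gains, chosen only for illustration."""
    kp: float = 2.0    # position gain: velocity per unit position error
    kf: float = 0.001  # force gain: velocity per unit force error


def hybrid_command(
    pos: Sequence[float],
    pos_des: Sequence[float],
    force: Sequence[float],
    force_des: Sequence[float],
    force_axes: Sequence[bool],
    gains: HybridGains = HybridGains(),
) -> List[float]:
    """Return one Cartesian velocity command per axis.

    force_axes[i] is True where axis i is force-controlled; every other
    axis is position-controlled. This plays the role of the selection
    matrix in the classical hybrid position/force formulation.
    """
    cmd = []
    for x, x_d, f, f_d, is_force in zip(pos, pos_des, force, force_des, force_axes):
        if is_force:
            # Force-controlled axis: drive the measured force toward its setpoint.
            cmd.append(gains.kf * (f_d - f))
        else:
            # Position-controlled axis: drive the position toward its setpoint.
            cmd.append(gains.kp * (x_d - x))
    return cmd


# Example: regulate contact force along z while moving in x.
command = hybrid_command(
    pos=[0.0, 0.0, 0.1],
    pos_des=[0.1, 0.0, 0.1],
    force=[0.0, 0.0, 2.0],
    force_des=[0.0, 0.0, 5.0],
    force_axes=[False, False, True],
)
```

In a hierarchical setup of the kind the abstract describes, a learned policy would presumably supply the setpoints for such a low-level controller; how the paper divides that responsibility is not stated in the text available here.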
Taxonomy
Topics: Robot Manipulation and Learning · Tactile and Sensory Interactions · Neuroscience and Neural Engineering
