Play to the Score: Stage-Guided Dynamic Multi-Sensory Fusion for Robotic Manipulation
Ruoxuan Feng, Di Hu, Wenke Ma, Xuelong Li

TL;DR
This paper introduces MS-Bot, a stage-guided multi-sensory fusion method that dynamically adjusts sensory modality priorities based on task stages, improving robotic manipulation performance and interpretability.
Contribution
The paper presents a novel stage-aware fusion approach that mimics human sensory integration, enabling robots to adaptively prioritize senses during complex tasks.
Findings
Enhanced manipulation accuracy in pouring and peg insertion tasks.
More effective and explainable sensory fusion compared to existing methods.
Robust performance across different stages of task execution.
Abstract
Humans possess a remarkable talent for flexibly alternating to different senses when interacting with the environment. Picture a chef skillfully gauging the timing of ingredient additions and controlling the heat according to the colors, sounds, and aromas, seamlessly navigating through every stage of the complex cooking process. This ability is founded upon a thorough comprehension of task stages, as achieving the sub-goal within each stage can necessitate the utilization of different senses. In order to endow robots with similar ability, we incorporate the task stages divided by sub-goals into the imitation learning process to accordingly guide dynamic multi-sensory fusion. We propose MS-Bot, a stage-guided dynamic multi-sensory fusion method with coarse-to-fine stage understanding, which dynamically adjusts the priority of modalities based on the fine-grained state within the…
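The idea of letting the task stage set modality priorities can be illustrated with a minimal sketch. This is not the authors' architecture: the function name stage_guided_fusion, the modality names, the two stages, and the hand-written per-stage priority scores are all assumptions made for illustration, and a plain softmax-weighted sum stands in for whatever learned fusion MS-Bot uses.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def stage_guided_fusion(features, stage_logits, stage_priorities):
    """Fuse per-modality features with weights driven by a soft stage estimate.

    features:         dict modality -> feature vector (lists of equal length)
    stage_logits:     one score per stage (hypothetical stage predictor output)
    stage_priorities: one dict per stage, modality -> priority score (assumed)
    """
    stage_probs = softmax(stage_logits)
    modalities = list(features)
    # Expected priority of each modality under the current stage belief.
    scores = [
        sum(p * prio[m] for p, prio in zip(stage_probs, stage_priorities))
        for m in modalities
    ]
    weights = softmax(scores)
    dim = len(features[modalities[0]])
    # Weighted sum of modality features, dimension by dimension.
    fused = [
        sum(w * features[m][i] for w, m in zip(weights, modalities))
        for i in range(dim)
    ]
    return dict(zip(modalities, weights)), fused


# Hypothetical two-stage task: stage 0 favours vision, stage 1 favours audio.
features = {"vision": [1.0, 0.0], "audio": [0.0, 1.0], "touch": [0.5, 0.5]}
stage_priorities = [
    {"vision": 2.0, "audio": 0.0, "touch": 0.0},
    {"vision": 0.0, "audio": 2.0, "touch": 0.5},
]

# The stage predictor is confident the task is in stage 1.
weights, fused = stage_guided_fusion(features, [0.0, 4.0], stage_priorities)
print(weights)
print(fused)
```

Because the weights are computed from a soft stage belief rather than a hard stage label, they shift gradually as the belief moves from one stage to the next, and the weights themselves can be inspected to see which sense the policy is relying on at a given moment.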
Taxonomy
Topics: Robot Manipulation and Learning
