ForceSight: Text-Guided Mobile Manipulation with Visual-Force Goals
Jeremy A. Collins, Cody Houff, You Liang Tan, Charles C. Kemp

TL;DR
ForceSight is a deep learning system that predicts visual-force goals from a single RGBD image and a text prompt, improving dexterous mobile manipulation tasks such as precision grasps, drawer opening, and object handovers.
Contribution
The paper presents a system for text-guided manipulation that pairs a kinematic goal (a target end-effector pose in the camera frame) with a force goal, so that forces are predicted alongside the visual goal rather than relegated to lower-level execution.
Findings
Achieved an 81% success rate in unseen environments on tasks such as precision grasps, drawer opening, and object handovers.
Including force goals raised the success rate from 45% to 90%.
Demonstrated on a real mobile manipulator equipped with an eye-in-hand RGBD camera, using diverse objects.
Abstract
We present ForceSight, a system for text-guided mobile manipulation that predicts visual-force goals using a deep neural network. Given a single RGBD image combined with a text prompt, ForceSight determines a target end-effector pose in the camera frame (kinematic goal) and the associated forces (force goal). Together, these two components form a visual-force goal. Prior work has demonstrated that deep models outputting human-interpretable kinematic goals can enable dexterous manipulation by real robots. Forces are critical to manipulation, yet have typically been relegated to lower-level execution in these systems. When deployed on a mobile manipulator equipped with an eye-in-hand RGBD camera, ForceSight performed tasks such as precision grasps, drawer opening, and object handovers with an 81% success rate in unseen environments with object instances that differed significantly from…
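The visual-force goal described in the abstract can be pictured as a small data structure. The following is a minimal sketch, not the authors' implementation: the class name `VisualForceGoal`, its field names, the tolerances, and the helper `goal_reached` are all hypothetical, and the pose is reduced to a 3D position for brevity. It only illustrates how checking a force goal in addition to a kinematic goal changes when a goal counts as reached.

```python
import math
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class VisualForceGoal:
    """Hypothetical container for one visual-force goal (names are assumptions)."""
    position_m: Vec3  # kinematic goal: target end-effector position, camera frame
    force_n: Vec3     # force goal: target force, in newtons


def goal_reached(goal: VisualForceGoal,
                 position_m: Vec3,
                 force_n: Vec3,
                 pos_tol_m: float = 0.01,
                 force_tol_n: float = 1.0,
                 use_force: bool = True) -> bool:
    """Return True when the measured state is within tolerance of the goal.

    With use_force=False only the kinematic goal is checked, which mimics
    relying on visual servoing alone. Tolerances are illustrative values.
    """
    pos_ok = math.dist(goal.position_m, position_m) <= pos_tol_m
    if not use_force:
        return pos_ok
    force_ok = math.dist(goal.force_n, force_n) <= force_tol_n
    return pos_ok and force_ok


# Example: the gripper is at the target position but is not yet pressing.
goal = VisualForceGoal(position_m=(0.10, 0.00, 0.30), force_n=(0.0, 0.0, 5.0))
at_target = (0.10, 0.00, 0.30)
kinematic_only = goal_reached(goal, at_target, (0.0, 0.0, 0.0), use_force=False)
with_force = goal_reached(goal, at_target, (0.0, 0.0, 0.0))
```

In this toy example the kinematic-only check reports success as soon as the position matches, while the combined check waits until the measured force is also close to the target.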
Taxonomy
Topics: Robot Manipulation and Learning · Hand Gesture Recognition Systems · Tactile and Sensory Interactions
