Semantic-Contact Fields for Category-Level Generalizable Tactile Tool Manipulation
Kevin Yuchen Ma, Heng Zhang, Weisi Lin, Mike Zheng Shou, Yan Wu

TL;DR
This paper introduces Semantic-Contact Fields (SCFields), a novel 3D representation combining visual semantics with tactile contact data, enabling category-level generalization in soft tactile tool manipulation through a two-stage sim-to-real learning pipeline.
Contribution
The work presents SCFields, a unified tactile-visual representation learned via simulation and minimal real data, improving generalization across diverse tools in manipulation tasks.
Findings
SCFields enables robust category-level generalization in tactile manipulation.
The two-stage sim-to-real pipeline effectively transfers tactile representations from simulation to the real world.
Experiments show significant performance improvements over vision-only and raw tactile baselines.
Abstract
Generalizing tool manipulation requires both semantic planning and precise physical control. Modern generalist robot policies, such as Vision-Language-Action (VLA) models, often lack the physical grounding required for contact-rich tool manipulation. Conversely, existing contact-aware policies that leverage tactile or haptic sensing are typically instance-specific and fail to generalize across diverse tool geometries. Bridging this gap requires learning representations that are both semantically transferable and physically grounded, yet a fundamental barrier remains: diverse real-world tactile data are prohibitive to collect at scale, while direct zero-shot sim-to-real transfer is challenging due to the complex nonlinear deformation of soft tactile sensors. To address this, we propose Semantic-Contact Fields (SCFields), a unified 3D representation that fuses visual semantics with…
