TouchAnything: A Dataset and Framework for Bimanual Tactile Estimation from Egocentric Video
Jianyi Zhou, Ziteng Gao, Feiyang Hong, Zirui Liu, Guannan Zhang, Weisheng Dai, Ruichen Zhen, Chuqiao Lyu, Haotian Wu, Yinian Mao, Xushi Wang, Yuxiang Jiang, Wenbo Ding, and Shuo Yang

TL;DR
This paper introduces EgoTouch, a large-scale egocentric dataset with tactile supervision, and TouchAnything, a framework for predicting tactile information from visual data to enhance embodied interaction models.
Contribution
The paper presents the EgoTouch dataset, which provides synchronized visual and tactile data, and a vision-to-touch prediction framework that leverages multi-view inputs to improve tactile inference.
Findings
Incorporating wrist-mounted views improves tactile prediction accuracy.
TouchAnything improves Contact IoU by up to 5.0%.
The dataset and code will be publicly released.
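The summary above does not define Contact IoU. The following is a minimal sketch, assuming the metric is the standard intersection-over-union between binarized predicted and ground-truth contact maps. The function name `contact_iou`, the 0.5 threshold, and the flat-list layout are illustrative assumptions, not details taken from the paper.

```python
from typing import Sequence


def contact_iou(
    pred: Sequence[float],
    target: Sequence[float],
    threshold: float = 0.5,  # assumed binarization threshold, not from the paper
) -> float:
    """Intersection-over-union of two contact maps given as flat sequences.

    A cell counts as "in contact" when its value exceeds `threshold`.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    pred_on = [p > threshold for p in pred]
    target_on = [t > threshold for t in target]

    # Cells marked as contact in both maps.
    intersection = sum(p and t for p, t in zip(pred_on, target_on))
    # Cells marked as contact in at least one map.
    union = sum(p or t for p, t in zip(pred_on, target_on))

    # Convention chosen here: two empty maps agree perfectly.
    return 1.0 if union == 0 else intersection / union


# Hypothetical 4-cell example: two cells overlap, one is a false positive.
predicted = [0.9, 0.8, 0.1, 0.7]
ground_truth = [1.0, 1.0, 0.0, 0.0]
score = contact_iou(predicted, ground_truth)  # 2 overlapping / 3 in union
```

Under this reading, a reported gain in Contact IoU means the predicted contact region overlaps the measured one more closely. The summary does not say whether the 5.0% figure is an absolute or a relative change.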
Abstract
Egocentric human video data, which captures rich human-environment interactions and can be collected at scale, has become a key driver of embodied intelligence research. However, existing egocentric datasets typically lack tactile sensing, a critical modality that provides direct cues about contact, force, and pressure in human-object interaction. Without such signals, models struggle to learn physically grounded representations of real-world interaction dynamics. While tactile sensors provide these cues, deploying high-quality tactile hardware at scale remains expensive and cumbersome. This raises a central question: can tactile feedback be inferred directly from visual observations, enabling scalable tactile supervision for egocentric video data and supporting physically grounded embodied learning? To enable research in this direction, we introduce EgoTouch, a large-scale multi-view…
