Egocentric Hand-object Interaction Detection and Application
Yao Lu, Walterio W. Mayol-Cuevas

TL;DR
This paper introduces a real-time egocentric hand-object interaction detection method that uses hand and object cues, achieving high accuracy and efficiency suitable for activity segmentation.
Contribution
The workflow trains networks to predict hand pose, hand mask, and in-hand object mask, and combines these cues to predict the interaction status, enabling real-time detection with competitive accuracy and supporting activity segmentation.
Findings
Achieves 89% HOI detection accuracy on EPIC-KITCHENS
Runs at over 30 FPS, faster than the method of Shan et al. on the same machine
Attains 68.2% and 82.8% F1 scores on the GTEA and UTGrasp datasets, respectively
Abstract
In this paper, we present a method to detect hand-object interaction from an egocentric perspective. In contrast to massive data-driven, discriminator-based methods such as \cite{Shan20}, we propose a novel workflow that utilises hand and object cues. Specifically, we train networks that predict hand pose, hand mask and in-hand object mask, and use these outputs to jointly predict the hand-object interaction status. We compare our method with the most recent work from Shan et al. \cite{Shan20} on selected images from the EPIC-KITCHENS \cite{damen2018scaling} dataset and achieve 89% accuracy on HOI (hand-object interaction) detection, which is comparable to Shan's. However, for real-time performance on the same machine, our method runs at over 30 FPS, which is much more efficient than Shan's. Furthermore, with our approach, we are able to segment script-less…
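As a rough illustration of how a per-frame interaction status could feed activity segmentation, the sketch below groups consecutive "interacting" frames into temporal spans. This is not the authors' procedure, which the truncated abstract does not describe; the function name `interaction_segments`, the `min_length` parameter, and the boolean per-frame input are all assumptions made for this example.

```python
from itertools import groupby


def interaction_segments(flags, min_length=1):
    """Group consecutive interacting frames into half-open (start, end) spans.

    flags: per-frame booleans, a hypothetical output of an HOI detector.
    min_length: hypothetical filter that drops runs shorter than this many frames.
    """
    segments = []
    index = 0
    for is_interacting, run in groupby(flags):
        run_length = sum(1 for _ in run)
        # Keep only runs of interacting frames that are long enough.
        if is_interacting and run_length >= min_length:
            segments.append((index, index + run_length))
        index += run_length
    return segments


# Illustrative input only, not data from the paper.
example_flags = [False, True, True, False, True, False, True, True, True]
print(interaction_segments(example_flags))
print(interaction_segments(example_flags, min_length=2))
```

Raising `min_length` is one simple way to suppress single-frame flickers in the detector output before treating a span as an activity.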
Taxonomy
Topics: Human Pose and Action Recognition · Hand Gesture Recognition Systems · Multimodal Machine Learning Applications
