Egocentric Hand-object Interaction Detection
Yao Lu, Yanan Liu

TL;DR
This paper introduces a real-time egocentric hand-object interaction detection method using multi-view hand pose estimation, achieving high accuracy and efficiency for understanding human activities.
Contribution
It proposes a multi-camera approach to improve hand pose estimation under occlusion, enabling fast and accurate detection of hand-object interactions in egocentric videos.
Findings
Achieves 89% accuracy on HOI detection, comparable to recent methods.
Runs at over 30 FPS, significantly faster than prior approaches.
Uses multi-view data to mitigate occlusion issues in hand pose estimation.
Abstract
In this paper, we propose a method to jointly determine the status of hand-object interaction, which is crucial for egocentric human activity understanding and interaction. From a computer vision perspective, we believe that determining whether a hand is interacting with an object depends on whether there is an interactive hand pose and whether the hand is touching the object. Thus, we extract the hand pose and the hand-object masks to jointly determine the interaction status. To address the difficulty of estimating hand pose under in-hand object occlusion, we use a multi-camera system to capture hand pose data from multiple perspectives. We evaluate and compare our method with the most recent work from Shan et al. \cite{Shan20} on selected images from the EPIC-KITCHENS \cite{damen2018scaling} dataset and achieve accuracy on HOI (hand-object interaction) detection which is comparable to…
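The two cues named in the abstract (an interactive hand pose, and contact between the hand and the object) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the cues are fused, so the AND rule, the `pose_score` input, the `pose_threshold` value, and the dilation-based contact test below are all assumptions made for illustration, and every function name is hypothetical.

```python
import numpy as np


def dilate(mask: np.ndarray, radius: int = 1) -> np.ndarray:
    """Binary dilation with a (2*radius+1) square window, using NumPy only."""
    mask = mask.astype(bool)
    h, w = mask.shape
    padded = np.pad(mask, radius, mode="constant", constant_values=False)
    out = np.zeros_like(mask)
    # OR together every shifted copy of the mask inside the window.
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out |= padded[dy:dy + h, dx:dx + w]
    return out


def masks_touch(hand_mask: np.ndarray, object_mask: np.ndarray,
                radius: int = 1) -> bool:
    """Assumed contact test: the masks overlap or lie within `radius` pixels."""
    return bool(np.any(dilate(hand_mask, radius) & object_mask.astype(bool)))


def is_interacting(pose_score: float, hand_mask: np.ndarray,
                   object_mask: np.ndarray, pose_threshold: float = 0.5,
                   radius: int = 1) -> bool:
    """Assumed fusion rule: interactive pose AND hand-object contact.

    `pose_score` stands in for the output of some hand pose classifier
    (hypothetical; the abstract does not describe one).
    """
    pose_is_interactive = pose_score >= pose_threshold
    return bool(pose_is_interactive and masks_touch(hand_mask, object_mask, radius))


# Toy example: a hand region with one adjacent and one distant object region.
hand = np.zeros((8, 8), dtype=bool)
hand[2:5, 0:3] = True
obj_near = np.zeros((8, 8), dtype=bool)
obj_near[2:5, 3:6] = True   # shares a border with the hand region
obj_far = np.zeros((8, 8), dtype=bool)
obj_far[2:5, 5:8] = True    # separated from the hand by two empty columns
```

With these toy masks, `is_interacting(0.9, hand, obj_near)` is true, while a low pose score or the distant object makes it false. A learned fusion of the two cues would be an equally plausible reading of "jointly determine"; the hard AND is used here only because it is the simplest rule consistent with the abstract's wording.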
Taxonomy
Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Hand Gesture Recognition Systems
