Two-stream convolutional neural network for accurate RGB-D fingertip detection using depth and edge information
Hengkai Guo, Guijin Wang, Xinghao Chen

TL;DR
This paper introduces a two-stream CNN that combines depth and edge information for RGB-D fingertip detection. It runs in real time, outperforms prior methods on the HandNet dataset with an average 3D error of 9.9 mm, and is comparably accurate on the NYU hand dataset.
Contribution
A novel two-stream CNN architecture with a slow fusion strategy that effectively integrates depth and edge data for fingertip detection.
Findings
Outperforms state-of-the-art fingertip detection methods on the HandNet dataset
Achieves an average 3D error of 9.9 mm on HandNet
Provides comparable accuracy on the NYU hand dataset
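The average 3D error quoted above is commonly computed as the mean Euclidean distance between predicted and ground-truth fingertip positions. The following is a minimal sketch under that assumption; the function name `mean_3d_error` and the sample coordinates are hypothetical and are not taken from the paper or its evaluation code.

```python
import math


def mean_3d_error(pred, truth):
    """Mean Euclidean distance between paired 3D points (same units as the input)."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("pred and truth must be non-empty and the same length")
    return sum(math.dist(p, t) for p, t in zip(pred, truth)) / len(pred)


# Hypothetical fingertip positions in millimetres (illustrative, not paper data).
predicted = [(10.0, 20.0, 300.0), (15.0, 25.0, 310.0)]
ground_truth = [(13.0, 24.0, 300.0), (15.0, 25.0, 322.0)]

# Per-point distances are 5.0 and 12.0, so the mean is 8.5 mm.
print(mean_3d_error(predicted, ground_truth))
```

Because the metric is a plain average of per-fingertip distances, a few large misses can dominate it, which is worth keeping in mind when comparing a single number across datasets.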
Abstract
Accurate detection of fingertips in depth images is critical for human-computer interaction. In this paper, we present a novel two-stream convolutional neural network (CNN) for RGB-D fingertip detection. First, an edge image is extracted from the raw depth image using a random forest. Then the edge information is combined with the depth information in our CNN structure. We study several fusion approaches and suggest a slow fusion strategy as a promising approach to fingertip detection. As shown in our experiments, our real-time algorithm outperforms state-of-the-art fingertip detection methods on the public HandNet dataset with an average 3D error of 9.9 mm, and shows comparable accuracy of fingertip estimation on the NYU hand dataset.
Taxonomy
Topics: Hand Gesture Recognition Systems · Human Pose and Action Recognition · Advanced Neural Network Applications
