TL;DR
Duodepth presents two dual-depth-sensor methods for static gesture recognition: point cloud fusion and a dual PointNet architecture. Compared with a standard single-camera pipeline, they reduce misclassification caused by occlusion and pose variation by 39.2% and 53.4%, respectively.
Contribution
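The paper proposes two approaches that use synchronized recordings from two depth cameras to improve static gesture recognition accuracy: point cloud fusion via iterative closest point (ICP) registration followed by a single PointNet classifier, and a dual PointNet architecture that classifies without registration.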
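To make the registration-free variant concrete, here is a minimal NumPy sketch of a two-branch, PointNet-style classifier. It is an illustration under stated assumptions, not the authors' network: the layer sizes, `NUM_CLASSES`, the random untrained weights, and the choice to merge the two branches by concatenating their global features are all hypothetical, and the input and feature transform networks of the original PointNet are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_CLASSES = 10  # hypothetical; the summary does not state the number of gestures


def init_mlp(sizes):
    """Random (untrained) weights for a small fully connected stack."""
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]


def mlp(x, layers, final_relu=True):
    """Apply the layers in order; ReLU after each, optionally not after the last."""
    for i, (w, b) in enumerate(layers):
        x = x @ w + b
        if final_relu or i < len(layers) - 1:
            x = np.maximum(x, 0.0)
    return x


def global_feature(cloud, layers):
    """Shared per-point MLP followed by max pooling over points.

    Max pooling makes the feature independent of point ordering,
    which is the core idea of PointNet.
    """
    return mlp(cloud, layers).max(axis=0)


# One branch per depth camera (assumed design), plus a classification head.
branch_a = init_mlp([3, 64, 128])
branch_b = init_mlp([3, 64, 128])
head = init_mlp([256, 64, NUM_CLASSES])


def dual_pointnet_logits(cloud_a, cloud_b):
    """Class scores from two unregistered (N, 3) clouds of the same gesture."""
    feat = np.concatenate([global_feature(cloud_a, branch_a),
                           global_feature(cloud_b, branch_b)])
    return mlp(feat, head, final_relu=False)
```

Because each branch sees its own camera's cloud in that camera's coordinate frame, no alignment step is needed before classification; the two clouds may also contain different numbers of points.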
Findings
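39.2% reduction in misclassification with fused point clouds, relative to a standard single-camera pipeline
53.4% reduction in misclassification with the dual PointNet architecture, relative to the same baseline
Both methods reduce errors caused by occlusion and by variation in hand pose relative to the capture device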
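The percentages above are relative reductions in error rate, not absolute accuracy gains. The short sketch below shows the arithmetic; the 10% baseline error is a hypothetical value chosen for illustration, since this summary does not report the baseline error rate.

```python
def relative_error_reduction(baseline_error, new_error):
    """Fraction by which the error rate drops relative to the baseline."""
    if not 0.0 < baseline_error <= 1.0:
        raise ValueError("baseline_error must be in (0, 1]")
    return (baseline_error - new_error) / baseline_error


def error_after_reduction(baseline_error, reduction):
    """Error rate that remains after a given relative reduction."""
    return baseline_error * (1.0 - reduction)


# Hypothetical baseline: a single-camera pipeline misclassifying 10% of samples.
baseline = 0.10
fused_error = error_after_reduction(baseline, 0.392)  # fused point clouds
dual_error = error_after_reduction(baseline, 0.534)   # dual PointNet
```

With that assumed 10% baseline, a 39.2% reduction would leave a 6.08% error rate and a 53.4% reduction would leave 4.66%.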
Abstract
Static gesture recognition is an effective non-verbal communication channel between a user and their devices; however, many modern methods are sensitive to the relative pose of the user's hands with respect to the capture device, as parts of the gesture can become occluded. We present two methodologies for gesture recognition via synchronized recording from two depth cameras to alleviate this occlusion problem. One is a more classic approach using iterative closest point registration to accurately fuse point clouds and a single PointNet architecture for classification, and the other is a dual PointNet architecture for classification without registration. On a manually collected dataset of 20,100 point clouds we show a 39.2% reduction in misclassification for the fused point cloud method, and 53.4% for the dual PointNet, when compared to a standard single-camera pipeline.
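The fusion step described in the abstract can be sketched as a basic point-to-point ICP in NumPy. This is a minimal illustration under stated assumptions, not the paper's pipeline: the abstract does not specify the ICP variant, initialization, or stopping criteria, so the brute-force nearest-neighbour search, the fixed iteration count, and the function names `best_rigid_transform`, `icp`, and `fuse_clouds` are all hypothetical choices.

```python
import numpy as np


def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t with R @ src_i + t ~ dst_i.

    Uses the SVD-based (Kabsch) solution for paired (N, 3) point sets.
    """
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    h = (src - src_c).T @ (dst - dst_c)
    u, _, vt = np.linalg.svd(h)
    # Guard against a reflection so that det(R) = +1.
    d = np.sign(np.linalg.det(vt.T @ u.T))
    r = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    t = dst_c - r @ src_c
    return r, t


def icp(src, dst, iters=30):
    """Point-to-point ICP aligning src (N, 3) onto dst (M, 3)."""
    r_total, t_total = np.eye(3), np.zeros(3)
    cur = src.copy()
    for _ in range(iters):
        # Brute-force nearest neighbour in dst for every current point.
        d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(axis=2)
        matched = dst[d2.argmin(axis=1)]
        r, t = best_rigid_transform(cur, matched)
        cur = cur @ r.T + t
        # Compose the incremental transform with the running total.
        r_total, t_total = r @ r_total, r @ t_total + t
    return r_total, t_total


def fuse_clouds(cloud_a, cloud_b, iters=30):
    """Register cloud_b into cloud_a's frame and stack both clouds."""
    r, t = icp(cloud_b, cloud_a, iters)
    return np.vstack([cloud_a, cloud_b @ r.T + t])
```

The fused cloud returned by `fuse_clouds` would then be passed to a single classifier. ICP only converges from a reasonable starting alignment, so with two cameras at widely different viewpoints an initial transform from extrinsic calibration would likely be needed first.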
