GrabAR: Occlusion-aware Grabbing Virtual Objects in AR
Xiao Tang, Xiaowei Hu, Chi-Wing Fu, Daniel Cohen-Or

TL;DR
GrabAR introduces a novel neural network-based method to predict occlusion between real hands and virtual objects in AR, improving interaction realism without requiring explicit depth information.
Contribution
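It presents a new approach that predicts the occlusion mask directly from paired images of the real hand and the virtual object, bypassing depth acquisition and inference. The network is pre-trained on a synthetic dataset and fine-tuned on a real one, which reduces manual labeling and addresses the domain difference between the two.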
Findings
Effective occlusion prediction improves AR interaction realism.
System supports natural hand grabbing and manipulation of virtual objects.
Quantitative and qualitative evaluations demonstrate system robustness.
Abstract
Existing augmented reality (AR) applications often ignore occlusion between real hands and virtual objects when incorporating virtual objects in our views. The challenges come from the lack of accurate depth and mismatch between real and virtual depth. This paper presents GrabAR, a new approach that directly predicts the real-and-virtual occlusion, and bypasses the depth acquisition and inference. Our goal is to enhance AR applications with interactions between hand (real) and grabbable objects (virtual). With paired images of hand and object as inputs, we formulate a neural network that learns to generate the occlusion mask. To train the network, we compile a synthetic dataset to pre-train it and a real dataset to fine-tune it, thus reducing the burden of manual labels and addressing the domain difference. Then, we embed the trained network in a prototyping AR system that supports hand…
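To make the role of the occlusion mask concrete, here is a minimal sketch of how a predicted mask could be used to composite a rendered virtual object onto a camera frame. It does not reproduce the paper's network: the mask is taken as given, and the function name `composite`, the separate `object_mask`, and the convention that a mask value of 1.0 means "the hand is in front of the object here" are all assumptions made for illustration.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]
Mask = List[List[float]]


def composite(camera: Image, virtual: Image,
              object_mask: Mask, occlusion_mask: Mask) -> Image:
    """Blend a rendered virtual object onto a camera frame.

    object_mask[y][x]    is 1.0 where the virtual object was rendered.
    occlusion_mask[y][x] is 1.0 where the real hand should hide the object
                         (assumed convention, not taken from the paper).
    """
    out: Image = []
    for y, row in enumerate(camera):
        out_row: List[Pixel] = []
        for x, cam_px in enumerate(row):
            # The object is visible only where it was rendered
            # and is not covered by the hand.
            alpha = object_mask[y][x] * (1.0 - occlusion_mask[y][x])
            virt_px = virtual[y][x]
            blended = tuple(
                round(alpha * v + (1.0 - alpha) * c)
                for v, c in zip(virt_px, cam_px)
            )
            out_row.append(blended)
        out.append(out_row)
    return out


# A 1x3 toy frame: background, hand pixel, background.
camera = [[(10, 10, 10), (200, 150, 120), (10, 10, 10)]]
virtual = [[(0, 0, 255), (0, 0, 255), (0, 0, 255)]]
object_mask = [[0.0, 1.0, 1.0]]     # object covers the last two pixels
occlusion_mask = [[0.0, 1.0, 0.0]]  # hand is in front at the middle pixel

frame = composite(camera, virtual, object_mask, occlusion_mask)
```

In this toy frame the middle pixel keeps the hand color even though the object was rendered there, while the last pixel shows the object. Working per pixel in image space is what lets this kind of compositing avoid any depth comparison between real and virtual content.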
Taxonomy
Topics: Human Pose and Action Recognition · Augmented Reality Applications · Robotics and Sensor-Based Localization
