Learning to Remove Multipath Distortions in Time-of-Flight Range Images for a Robotic Arm Setup
Kilho Son, Ming-Yu Liu, Yuichi Taguchi

TL;DR
This paper introduces a deep learning method that removes multipath distortions from ToF range images captured in a robotic arm setup, reducing range estimation error by 55%.
Contribution
It presents a learning-based approach in which a robotic arm automatically collects ToF range images for training and a high-precision structured light sensor labels them, so the learned model can remove multipath distortions.
Findings
Achieves 55% error reduction in range estimation
Outperforms baseline algorithms in experimental validation
Utilizes robotic arm for automatic data collection and labeling
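To make the "55% error reduction" figure concrete, here is a minimal sketch of how a relative error reduction can be computed. The summary does not say which error metric the paper uses, so mean absolute error is an assumption, the function names are hypothetical, and the sample values are invented for illustration rather than taken from the paper.

```python
def mean_abs_error(estimated, reference):
    """Mean absolute range error over paired measurements (same units)."""
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(e - r) for e, r in zip(estimated, reference)) / len(estimated)


def relative_error_reduction(error_before, error_after):
    """Fraction of the original error that was removed (0.55 means 55%)."""
    if error_before <= 0:
        raise ValueError("error_before must be positive")
    return (error_before - error_after) / error_before


# Invented ranges in meters, for illustration only (not data from the paper).
reference = [1.00, 1.20, 0.80, 1.50]   # e.g. structured light measurements
raw_tof = [1.04, 1.26, 0.83, 1.57]     # distorted ToF ranges
corrected = [1.02, 1.22, 0.81, 1.54]   # ranges after a hypothetical correction

reduction = relative_error_reduction(
    mean_abs_error(raw_tof, reference),
    mean_abs_error(corrected, reference),
)
print(f"relative error reduction: {reduction:.0%}")
```

A relative figure like this depends on the baseline error, so it is only meaningful alongside the metric and the test scenes it was measured on.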
Abstract
Range images captured by Time-of-Flight (ToF) cameras are corrupted by multipath distortions caused by interaction between the modulated light signals and the scene. This interaction is often complicated, which makes a model-based solution elusive. We propose a learning-based approach, built on deep learning, for removing multipath distortions for a ToF camera in a robotic arm setup. We use the robotic arm to automatically collect a large number of ToF range images containing various multipath distortions. The training images are labeled automatically using a high-precision structured light sensor that is available only at training time. At test time, we apply the learned model to remove the multipath distortions. This allows our robotic arm setup to enjoy the speed and compact form of the ToF camera without being compromised by its range measurement errors. We conduct…
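The automatic labeling step described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's pipeline: it assumes the ToF and structured light range images are already registered to the same pixel grid, that missing measurements are marked with `None`, and that the learning target is the per-pixel residual (ToF range minus reference range). The function names are hypothetical, and the deep network itself is left out because this summary does not describe its architecture.

```python
def make_residual_labels(tof_image, reference_image):
    """Pair each valid ToF pixel with its residual against the reference.

    Returns a list of (row, col, tof_range, residual) tuples, where
    residual = tof_range - reference_range. Pixels missing in either
    image are skipped, since no label can be derived for them.
    """
    samples = []
    for r, (tof_row, ref_row) in enumerate(zip(tof_image, reference_image)):
        for c, (tof_range, ref_range) in enumerate(zip(tof_row, ref_row)):
            if tof_range is None or ref_range is None:
                continue
            samples.append((r, c, tof_range, tof_range - ref_range))
    return samples


def apply_correction(tof_image, predicted_residual):
    """Subtract a model's predicted residual from the raw ToF ranges."""
    return [
        [None if t is None else t - d for t, d in zip(tof_row, res_row)]
        for tof_row, res_row in zip(tof_image, predicted_residual)
    ]


# Toy 2x2 images in meters, invented for illustration.
tof = [[1.0, 2.0], [None, 4.0]]
ref = [[0.5, None], [3.0, 3.0]]
labels = make_residual_labels(tof, ref)

# At test time only the ToF image and the model's prediction are available.
restored = apply_correction(tof, [[0.5, 0.0], [0.0, 1.0]])
```

Predicting a residual rather than the absolute range is one common design choice for correction tasks; the paper may formulate the target differently.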
Taxonomy
Topics: Advanced Optical Sensing Technologies · Advanced Vision and Imaging · Optical measurement and interference techniques
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
