Task-driven Compression for Collision Encoding based on Depth Images
Mihir Kulkarni, Kostas Alexis

TL;DR
This paper introduces a learning-based, task-driven compression method for depth images tailored to collision prediction in robotics, outperforming classical methods at high compression ratios by encoding collision-relevant information.
Contribution
A novel 3D image processing and neural network-based encoding approach that incorporates robot size for effective collision prediction from compressed depth images.
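To make the role of robot size concrete, here is a minimal brute-force sketch of how a collision image could be derived from a depth image. It is not the authors' implementation: it assumes a pinhole camera, z-depth values, and a spherical robot, and the function name `collision_image` and all its parameters are hypothetical. For every pixel ray it returns the distance the robot's centre can travel before coming within `robot_radius` of any observed point.

```python
import numpy as np

def collision_image(depth, fx, fy, cx, cy, robot_radius, max_range):
    """Per-ray collision-free travel distance for a spherical robot (brute force).

    Assumptions (not from the paper): pinhole intrinsics fx, fy, cx, cy;
    `depth` holds z-depth in metres; non-finite or non-positive depth means
    "no return" and is ignored.
    """
    h, w = depth.shape
    us, vs = np.meshgrid(np.arange(w), np.arange(h))
    # Un-normalised pinhole rays through each pixel: (x/z, y/z, 1).
    rays = np.stack([(us - cx) / fx, (vs - cy) / fy, np.ones((h, w))], axis=-1)
    rays = rays.reshape(-1, 3)
    dirs = rays / np.linalg.norm(rays, axis=1, keepdims=True)

    # Back-project valid pixels to 3D obstacle points.
    z = depth.reshape(-1)
    valid = np.isfinite(z) & (z > 0)
    points = rays[valid] * z[valid, None]
    if points.shape[0] == 0:
        return np.full((h, w), float(max_range))

    # Robot centre at t * d touches point p when |t d - p| = r, i.e.
    # t^2 - 2 t (d.p) + |p|^2 - r^2 = 0; the smaller root is the first contact.
    b = dirs @ points.T                                   # (rays, points)
    c = np.sum(points ** 2, axis=1) - robot_radius ** 2   # (points,)
    disc = b ** 2 - c
    hit = (disc >= 0) & ((b > 0) | (c <= 0))
    t = np.where(hit, b - np.sqrt(np.maximum(disc, 0.0)), np.inf)
    t = np.maximum(t, 0.0)  # already overlapping an obstacle: zero travel
    return np.minimum(t.min(axis=1), max_range).reshape(h, w)
```

With a single obstacle point 2 m straight ahead and a 0.5 m robot radius, the centre ray yields 1.5 m, and neighbouring rays are shortened as well even though their own depth readings are empty. This is the "inflation" effect. The all-pairs computation is quadratic in the number of pixels, so it only suits small images.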
Findings
Superior collision prediction accuracy at high compression ratios
Effective encoding of complex scenes with thin obstacles
Outperforms classical task-agnostic compression methods
Abstract
This paper contributes a novel learning-based method for aggressive task-driven compression of depth images and their encoding as images tailored to collision prediction for robotic systems. A novel 3D image processing methodology is proposed that accounts for the robot's size in order to appropriately "inflate" the obstacles represented in the depth image and thus obtain the distance that can be traversed by the robot in a collision-free manner along any given ray within the camera frustum. Such depth-and-collision image pairs are used to train a neural network that follows the architecture of Variational Autoencoders to compress-and-transform the information in the original depth image to derive a latent representation that encodes the collision information for the given depth image. We compare our proposed task-driven encoding method with classical task-agnostic methods and…
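The compress-and-transform idea can be illustrated with a task-driven variant of the usual VAE objective: the encoder sees the depth image, but the decoder output is scored against the collision image rather than the input. The sketch below is an assumption-laden illustration, not the paper's loss. The mean-squared-error reconstruction term, the `beta` weight, and the function name `task_driven_vae_loss` are all hypothetical; only the closed-form KL divergence between a diagonal Gaussian and a standard normal is standard.

```python
import numpy as np

def task_driven_vae_loss(decoded, collision_target, mu, logvar, beta=1.0):
    """Reconstruction-against-task-target plus KL regulariser.

    decoded:          decoder output for a latent sampled from the depth image
    collision_target: collision image paired with that depth image
    mu, logvar:       encoder outputs (diagonal Gaussian posterior)
    beta:             hypothetical weight on the KL term
    """
    # Assumed reconstruction term: mean squared error against the collision image.
    recon = np.mean((decoded - collision_target) ** 2)
    # KL( N(mu, diag(exp(logvar))) || N(0, I) ), closed form.
    kl = -0.5 * np.sum(1.0 + logvar - mu ** 2 - np.exp(logvar))
    return recon + beta * kl, recon, kl
```

Scoring the output against the collision image is what makes the latent task-driven: the latent code is pushed to retain what matters for collision prediction rather than every detail of the raw depth image.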
Taxonomy
Topics: Advanced Vision and Imaging · Video Surveillance and Tracking Methods · Human Pose and Action Recognition
