Obstacle Avoidance through Deep Networks based Intermediate Perception
Shichao Yang, Sandeep Konam, Chen Ma, Stephanie Rosenthal, Manuela Veloso, Sebastian Scherer

TL;DR
This paper introduces a novel obstacle avoidance method using deep networks that predict depth and surface normals from monocular images to generate trajectories, improving accuracy and generalization in indoor environments.
Contribution
The proposed approach combines intermediate perception of depth and surface normals with trajectory prediction, enhancing obstacle avoidance performance over direct methods.
Findings
20% increase in accuracy compared to direct prediction
Effective generalization to various indoor datasets
Successful application in robot flight simulations and experiments
Abstract
Obstacle avoidance from monocular images is a challenging problem for robots. Though multi-view structure-from-motion could build 3D maps, it is not robust in textureless environments. Some learning-based methods exploit human demonstration to predict a steering command directly from a single image. However, this method is usually biased towards certain tasks or demonstration scenarios and also biased by human understanding. In this paper, we propose a new method to predict a trajectory from images. We train our system on the more diverse NYUv2 dataset. The ground truth trajectory is computed from the designed cost functions automatically. The Convolutional Neural Network perception is divided into two stages: first, predict a depth map and surface normals from RGB images, which are two important geometric properties related to 3D obstacle representation. Second, predict the trajectory from…
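The abstract says the ground truth trajectory is computed automatically from designed cost functions, but this summary does not give those functions. The sketch below is only an illustration of the general idea, under loudly stated assumptions: each candidate trajectory is approximated by a vertical band of image columns, and a hypothetical cost (the inverse of the nearest depth in that band) penalizes directions with close obstacles. The names `trajectory_costs` and `best_trajectory`, the band layout, and the cost itself are all invented here and are not the paper's method.

```python
from typing import List, Sequence

DepthMap = Sequence[Sequence[float]]  # rows of per-pixel depth in meters


def trajectory_costs(depth: DepthMap, num_candidates: int) -> List[float]:
    """Return one cost per candidate direction.

    Assumption (not from the paper): candidate k corresponds to the k-th
    vertical band of image columns, and its cost is 1 / (nearest depth in
    that band), so closer obstacles give a higher cost.
    """
    width = len(depth[0])
    if not 1 <= num_candidates <= width:
        raise ValueError("num_candidates must be between 1 and the image width")
    costs = []
    for k in range(num_candidates):
        lo = k * width // num_candidates
        hi = (k + 1) * width // num_candidates
        nearest = min(row[c] for row in depth for c in range(lo, hi))
        # Clamp to avoid division by zero for invalid (zero) depth readings.
        costs.append(1.0 / max(nearest, 1e-6))
    return costs


def best_trajectory(depth: DepthMap, num_candidates: int) -> int:
    """Index of the lowest-cost candidate, usable as a training label."""
    costs = trajectory_costs(depth, num_candidates)
    return min(range(len(costs)), key=costs.__getitem__)


# Toy 2x6 depth map: obstacles close on the left, free space in the middle.
toy_depth = [
    [0.5, 0.6, 3.0, 3.2, 1.0, 1.1],
    [0.4, 0.7, 2.8, 3.1, 0.9, 1.2],
]
label = best_trajectory(toy_depth, num_candidates=3)
```

On the toy depth map, the middle band has the largest nearest depth, so the middle candidate is selected as the label. In a pipeline like the one the abstract describes, such labels would be generated from dataset depth maps rather than from human demonstration.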
Taxonomy
Topics: Robotics and Sensor-Based Localization · Robotic Path Planning Algorithms · Human Pose and Action Recognition
