The Foreseeable Future: Self-Supervised Learning to Predict Dynamic Scenes for Indoor Navigation
Hugues Thomas, Jian Zhang, Timothy D. Barfoot

TL;DR
This paper introduces a self-supervised method for predicting how dynamic scenes will evolve, using Spatiotemporal Occupancy Grid Maps (SOGMs). The predictions are made from lidar data and are used to improve indoor robot navigation.
Contribution
It presents a novel self-supervised pipeline combining 3D and 2D neural networks to predict future scene occupancy, enabling lifelong learning for indoor navigation systems.
Findings
Effective prediction of future scenes, demonstrated both in simulation and in real-robot tests.
Automated annotation process for creating SOGMs from noisy real-world data.
Introduction of a new indoor 3D lidar dataset with annotations.
Abstract
We present a method for generating, predicting, and using Spatiotemporal Occupancy Grid Maps (SOGM), which embed future semantic information of real dynamic scenes. We present an auto-labeling process that creates SOGMs from noisy real navigation data. We use a 3D-2D feedforward architecture, trained to predict the future time steps of SOGMs, given 3D lidar frames as input. Our pipeline is entirely self-supervised, thus enabling lifelong learning for real robots. The network is composed of a 3D back-end that extracts rich features and enables the semantic segmentation of the lidar frames, and a 2D front-end that predicts the future information embedded in the SOGM representation, potentially capturing the complexities and uncertainties of real-world multi-agent, multi-future interactions. We also design a navigation system that uses these predicted SOGMs within planning, after they have…
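As a rough illustration of the representation the abstract describes, the sketch below rasterizes labelled 2D positions at successive future time steps into a stack of occupancy grids, one layer per step. It is not the authors' auto-labeling process: the function name `build_sogm`, the grid size, the resolution, and the single binary "dynamic obstacle" class are all assumptions made for this example, whereas the paper's SOGMs carry semantic information derived from real lidar data.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (x, y) in metres, in the grid frame (assumed)
Grid = List[List[int]]       # one 2D layer: grid[row][col] is 0 or 1


def build_sogm(
    future_points: Sequence[Sequence[Point]],
    origin: Point = (0.0, 0.0),
    resolution: float = 0.5,
    size: int = 4,
) -> List[Grid]:
    """Rasterize per-time-step 2D points into a stack of occupancy grids.

    future_points[t] holds the points labelled as dynamic obstacles at
    future time step t (hypothetical input format). The result is
    indexed as sogm[t][row][col], with 1 marking an occupied cell.
    """
    sogm: List[Grid] = []
    for points in future_points:
        layer = [[0] * size for _ in range(size)]
        for x, y in points:
            col = int((x - origin[0]) // resolution)
            row = int((y - origin[1]) // resolution)
            # Points that fall outside the grid are ignored.
            if 0 <= row < size and 0 <= col < size:
                layer[row][col] = 1
        sogm.append(layer)
    return sogm


# A hypothetical pedestrian moving along +x, seen at three future steps.
trajectory = [[(0.2, 0.2)], [(0.7, 0.2)], [(1.2, 0.2)]]
sogm = build_sogm(trajectory)

# List occupied cells as (time step, row, column).
occupied = [
    (t, r, c)
    for t, layer in enumerate(sogm)
    for r, row in enumerate(layer)
    for c, value in enumerate(row)
    if value
]
print(occupied)
```

In this toy setting the occupied cell shifts one column per time step, which is the kind of future motion a network trained on such grids would be asked to predict from the current lidar frames.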
Taxonomy
Topics: Robotic Path Planning Algorithms · Video Analysis and Summarization · Robotics and Sensor-Based Localization
