VidLoc: A Deep Spatio-Temporal Model for 6-DoF Video-Clip Relocalization
Ronald Clark, Sen Wang, Andrew Markham, Niki Trigoni, Hongkai Wen

TL;DR
This paper introduces VidLoc, a deep recurrent model that leverages temporal information in video sequences to improve 6-DoF camera localization accuracy and smoothness, outperforming single-image methods.
Contribution
The paper presents a recurrent neural network approach to 6-DoF localization of video clips that exploits temporal smoothness across frames, which the authors report drastically reduces localization error.
Findings
Even short sequences (20 frames) yield smoother pose estimates and lower localization error.
The model outperforms single-image localization methods.
Probabilistic pose estimates are feasible with the proposed approach.
Abstract
Machine learning techniques, namely convolutional neural networks (CNN) and regression forests, have recently shown great promise in performing 6-DoF localization of monocular images. However, in most cases image-sequences, rather than only single images, are readily available. To this extent, none of the proposed learning-based approaches exploit the valuable constraint of temporal smoothness, often leading to situations where the per-frame error is larger than the camera motion. In this paper we propose a recurrent model for performing 6-DoF localization of video-clips. We find that, even by considering only short sequences (20 frames), the pose estimates are smoothed and the localization error can be drastically reduced. Finally, we consider means of obtaining probabilistic pose estimates from our model. We evaluate our method on openly-available real-world autonomous driving and indoor…
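The abstract's remark that per-frame error can exceed the camera motion, and that using neighbouring frames helps, can be illustrated with a toy 1-D example. This is only a sketch under stated assumptions: it is not the VidLoc model, a plain centered moving average stands in for a learned temporal model, and every number except the 20-frame clip length (`STEP`, `NOISE`, `WINDOW`) is hypothetical. The alternating error pattern is deliberately contrived to keep the example deterministic.

```python
# Toy 1-D illustration; all values except FRAMES are made up, not from the paper.
FRAMES = 20    # clip length mentioned in the abstract
STEP = 0.05    # hypothetical camera motion per frame (metres)
NOISE = 0.20   # hypothetical per-frame error magnitude (metres)
WINDOW = 5     # hypothetical smoothing window (frames)

# Ground-truth positions of a camera moving at constant speed along a line.
true_pos = [i * STEP for i in range(FRAMES)]

# Independent per-frame estimates with a contrived alternating error.
raw_est = [p + (NOISE if i % 2 == 0 else -NOISE) for i, p in enumerate(true_pos)]


def moving_average(values, window):
    """Centered moving average; the window is truncated at the clip edges."""
    half = window // 2
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def mean_abs_error(est, truth):
    """Mean absolute error between estimates and ground truth."""
    return sum(abs(e - t) for e, t in zip(est, truth)) / len(truth)


# Stand-in for temporal modelling: average each estimate with its neighbours.
smoothed_est = moving_average(raw_est, WINDOW)

raw_mae = mean_abs_error(raw_est, true_pos)
smoothed_mae = mean_abs_error(smoothed_est, true_pos)

print(f"camera motion per frame: {STEP:.2f} m")
print(f"raw per-frame MAE:       {raw_mae:.3f} m")
print(f"smoothed MAE:            {smoothed_mae:.3f} m")
```

In this setup the raw per-frame error is larger than the distance the camera moves between frames, and averaging over neighbouring frames lowers the mean error. A learned recurrent model can in principle do more than fixed averaging, but the example only shows why temporal context is a useful constraint.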
Taxonomy
Topics: Robotics and Sensor-Based Localization · Advanced Vision and Imaging · Advanced Image and Video Retrieval Techniques
