TL;DR
This paper introduces a novel real-time method for relocalizing the 6DOF pose of event cameras using a stacked spatial LSTM network that learns spatial dependencies from event images, significantly improving accuracy over previous methods.
Contribution
The paper presents a new SP-LSTM-based approach that captures spatial dependencies in event images for accurate pose relocalization and outperforms existing methods.
Findings
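Reduces position error by a factor of approximately 6 compared with the previous state of the art.
Reduces orientation error by a factor of approximately 3 compared with the previous state of the art.
Generalizes well on a publicly available dataset and outperforms recent methods by a substantial margin.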
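The summary does not say how the two errors are measured. As a minimal sketch, assuming the common convention for 6DOF pose evaluation (Euclidean distance for position, and the rotation angle between unit quaternions for orientation), the quantities could be computed as follows. The function names and the `(w, x, y, z)` quaternion order are illustrative assumptions, not taken from the paper.

```python
import math


def position_error(p_pred, p_true):
    """Euclidean distance between predicted and true positions (same units as input)."""
    return math.dist(p_pred, p_true)


def orientation_error_deg(q_pred, q_true):
    """Angle in degrees between two orientations given as quaternions (w, x, y, z).

    Assumed convention: both quaternions are normalized first, and q and -q
    are treated as the same rotation (hence the absolute value of the dot product).
    """
    def normalize(q):
        norm = math.sqrt(sum(c * c for c in q))
        return [c / norm for c in q]

    a, b = normalize(q_pred), normalize(q_true)
    dot = abs(sum(x * y for x, y in zip(a, b)))
    dot = min(1.0, dot)  # guard against rounding slightly above 1
    return math.degrees(2.0 * math.acos(dot))
```

Under this convention, a "6 times" reduction in position error means the ratio of the baseline distance to the new distance is about 6; the same reading applies to the orientation angle.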
Abstract
We present a new method to relocalize the 6DOF pose of an event camera solely based on the event stream. Our method first creates the event image from a list of events that occur in a very short time interval, then a Stacked Spatial LSTM Network (SP-LSTM) is used to learn the camera pose. Our SP-LSTM is composed of a CNN to learn deep features from the event images and a stack of LSTMs to learn spatial dependencies in the image feature space. We show that the spatial dependency plays an important role in the relocalization task and that the SP-LSTM can effectively learn this information. The experimental results on a publicly available dataset show that our approach generalizes well and outperforms recent methods by a substantial margin. Overall, our proposed method reduces the position error by approximately 6 times and the orientation error by approximately 3 times compared to the current state of the art. The…
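The first step in the abstract, building an event image from the events in a short time interval, can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: the event layout `(x, y, t, polarity)`, the grey/white/black encoding, and the name `events_to_image` are all hypothetical choices made for the example.

```python
from typing import Iterable, List, Tuple

# Assumed event layout: (x, y, timestamp_seconds, polarity), polarity > 0 or <= 0.
Event = Tuple[int, int, float, int]


def events_to_image(
    events: Iterable[Event],
    width: int,
    height: int,
    t_start: float,
    window: float,
) -> List[List[float]]:
    """Accumulate events with t in [t_start, t_start + window) into a 2D grid.

    Assumed encoding: pixels without events stay at 0.5 (grey); a positive
    event sets the pixel to 1.0 and a negative event sets it to 0.0. If a
    pixel receives several events, the last one in iteration order wins.
    """
    image = [[0.5] * width for _ in range(height)]
    for x, y, t, polarity in events:
        if not (t_start <= t < t_start + window):
            continue  # outside the time interval
        if not (0 <= x < width and 0 <= y < height):
            continue  # outside the sensor area
        image[y][x] = 1.0 if polarity > 0 else 0.0
    return image
```

For example, `events_to_image(events, 3, 2, 0.0, 0.01)` keeps only events from the first 10 ms and returns a 2-row by 3-column grid. An image of this kind is what the CNN stage described above would take as input.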
