Neuromorphic Monocular Depth Estimation with Uncertainty Modeling
Viktor Bergkvist, Felix Rydell, Per-Erik Forssén, David Gustafsson, Johan Rideg

TL;DR
This paper introduces a deep learning approach for monocular depth estimation from event camera data, incorporating uncertainty modeling to identify reliable depth predictions.
Contribution
It presents a novel framework combining multiple event representations with uncertainty estimation methods, trained on synthetic data and fine-tuned on real sequences.
Findings
10-bin log-normal and 5-bin evidential models perform best across metrics
Uncertainty estimation effectively indicates pixels with reliable depth
Different event representations yield similar performance
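The second finding is usually checked with a sparsification analysis: pixels are removed in order of decreasing predicted uncertainty, and the error of the remaining pixels should fall about as fast as it would under an oracle that removes the worst pixels first. The sketch below illustrates one common way to compute the area under the sparsification error from flat lists of per-pixel errors and uncertainties. The function names, the number of removal steps, and the lack of curve normalization are assumptions made for illustration, not details taken from the paper.

```python
def sparsification_curve(errors, ranking, steps):
    """Mean error of the pixels that remain after removing the top-ranked
    fraction k/steps of pixels, for k = 0 .. steps-1."""
    n = len(errors)
    # Indices sorted so that the highest-ranked (most uncertain) pixel comes first.
    order = sorted(range(n), key=lambda i: ranking[i], reverse=True)
    curve = []
    for k in range(steps):
        removed = (n * k) // steps
        kept = order[removed:]
        curve.append(sum(errors[i] for i in kept) / len(kept))
    return curve


def ause(errors, uncertainties, steps=10):
    """Area under the sparsification error (hypothetical helper).

    The sparsification error is the gap between the curve ranked by
    predicted uncertainty and the oracle curve ranked by the true error.
    """
    by_uncertainty = sparsification_curve(errors, uncertainties, steps)
    oracle = sparsification_curve(errors, errors, steps)
    gaps = [u - o for u, o in zip(by_uncertainty, oracle)]
    # Trapezoidal rule over removal fractions 0, 1/steps, ..., (steps-1)/steps.
    return sum((gaps[i] + gaps[i + 1]) / 2 for i in range(len(gaps) - 1)) / steps


if __name__ == "__main__":
    per_pixel_error = [4.0, 3.0, 2.0, 1.0]
    good_uncertainty = [0.9, 0.7, 0.4, 0.1]  # ranks pixels exactly like the error
    bad_uncertainty = [0.1, 0.4, 0.7, 0.9]   # ranks pixels in the opposite order
    print(ause(per_pixel_error, good_uncertainty, steps=4))
    print(ause(per_pixel_error, bad_uncertainty, steps=4))
```

A value near zero means the predicted uncertainty orders pixels almost as well as the true error does; larger values mean the uncertainty is less informative about which depth predictions to trust.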
Abstract
Event cameras offer distinct advantages over conventional frame-based sensors, including microsecond-level temporal resolution, high dynamic range, and low bandwidth. In this paper, we predict per-pixel depth distributions from monocular event streams using deep neural networks. We estimate uncertainty using Gaussian, log-normal, and evidential learning frameworks. We compare six event representations: spatio-temporal voxel grids with 1, 5, 10, and 20 temporal bins, the Compact Spatio-Temporal Representation (CSTR), and Time-Ordered Recent Event (TORE) volumes. Our U-Net-based models are trained on synthetic data and then fine-tuned on real sequences. We evaluate performance using absolute relative error, root mean squared error, and the area under the sparsification error. Quantitative results show that the representations perform similarly, while 10 bin log-normal and 5 bin evidential…
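To make the voxel-grid representations mentioned in the abstract concrete, here is a minimal sketch of how an event stream can be binned into a spatio-temporal voxel grid with a chosen number of temporal bins. It assumes events are `(t, x, y, polarity)` tuples with polarity in {-1, +1} and splits each event linearly between its two nearest temporal bins, which is one widely used construction; the paper's exact variant is not described in this summary, and the function name `events_to_voxel_grid` is hypothetical.

```python
def events_to_voxel_grid(events, num_bins, height, width):
    """Accumulate signed event polarities into a num_bins x height x width grid.

    events: iterable of (t, x, y, polarity), polarity in {-1, +1} (assumed format).
    """
    events = list(events)
    grid = [[[0.0] * width for _ in range(height)] for _ in range(num_bins)]
    if not events:
        return grid

    t_first = min(e[0] for e in events)
    t_last = max(e[0] for e in events)
    span = t_last - t_first

    for t, x, y, polarity in events:
        # Map the timestamp onto the continuous bin axis [0, num_bins - 1].
        if span > 0:
            t_norm = (t - t_first) / span * (num_bins - 1)
        else:
            t_norm = 0.0
        lower = int(t_norm)
        frac = t_norm - lower
        # Split the polarity between the two neighbouring temporal bins.
        grid[lower][y][x] += polarity * (1.0 - frac)
        if lower + 1 < num_bins:
            grid[lower + 1][y][x] += polarity * frac
    return grid


if __name__ == "__main__":
    toy_events = [(0.0, 0, 0, 1), (0.25, 1, 0, 1), (1.0, 1, 1, -1)]
    voxels = events_to_voxel_grid(toy_events, num_bins=2, height=2, width=2)
    for bin_index, plane in enumerate(voxels):
        print(bin_index, plane)
```

With `num_bins=1` this collapses to a single signed event-count image, while larger values such as 5, 10, or 20 keep progressively more temporal detail at the cost of more input channels for the network.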
