Robust Frame-to-Frame Camera Rotation Estimation in Crowded Scenes

Fabien Delattre; David Dirnfeld; Phat Nguyen; Stephen Scarano; Michael J. Jones; Pedro Miraldo; Erik Learned-Miller

arXiv:2309.08588·cs.CV·September 18, 2023

TL;DR

This paper introduces a novel, efficient method for estimating camera rotation in crowded scenes from monocular video, achieving high accuracy and robustness where previous methods struggled, supported by a new dataset and benchmark.

Contribution

We propose a new generalization of the Hough transform on SO(3) for robust camera rotation estimation in crowded scenes, outperforming existing methods in accuracy and speed.
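To make the idea of Hough-style voting over rotations concrete, here is a minimal sketch. It is a generic illustration and not the authors' implementation: it assumes small rotations, normalized image coordinates, one common sign convention for the rotational optical-flow model, and a plain cubic grid over the rotation vector. The function names `rotational_flow` and `hough_rotation`, the grid size, and the synthetic data are all hypothetical. Each flow vector constrains the rotation to a line in rotation space, so it votes for every grid cell along that line, and the cell with the most votes is taken as the estimate.

```python
import numpy as np

def rotational_flow(points, omega):
    """Flow at normalized image points (x, y) caused by a small rotation omega (radians).
    Sign convention is an assumption; others appear in the literature."""
    x, y = points[:, 0], points[:, 1]
    wx, wy, wz = omega
    u = x * y * wx - (1.0 + x**2) * wy + y * wz
    v = (1.0 + y**2) * wx - x * y * wy - x * wz
    return np.stack([u, v], axis=1)

def hough_rotation(points, flows, half_range=0.05, bins=41, samples=400):
    """Vote in a bins^3 grid covering [-half_range, half_range]^3 (hypothetical parameters)."""
    acc = np.zeros((bins, bins, bins), dtype=np.int64)
    width = 2.0 * half_range / bins
    ts = np.linspace(-2.0 * half_range, 2.0 * half_range, samples)
    for (x, y), flow in zip(points, flows):
        # Two linear equations in three unknowns: A @ omega = flow.
        A = np.array([[x * y, -(1.0 + x * x), y],
                      [1.0 + y * y, -x * y, -x]])
        w0 = np.linalg.pinv(A) @ flow          # minimum-norm compatible rotation
        d = np.array([x, y, 1.0])              # null direction of A (the viewing ray)
        d /= np.linalg.norm(d)
        line = w0 + ts[:, None] * d            # sampled line of compatible rotations
        idx = np.floor((line + half_range) / width).astype(int)
        inside = np.all((idx >= 0) & (idx < bins), axis=1)
        if not inside.any():
            continue
        idx = np.unique(idx[inside], axis=0)   # one vote per cell per flow vector
        acc[idx[:, 0], idx[:, 1], idx[:, 2]] += 1
    best = np.array(np.unravel_index(np.argmax(acc), acc.shape))
    return -half_range + (best + 0.5) * width  # center of the winning cell

# Synthetic check: 200 flow vectors, roughly 40% replaced by random outliers.
rng = np.random.default_rng(0)
true_omega = np.array([0.01, -0.02, 0.005])
points = rng.uniform(-1.0, 1.0, size=(200, 2))
flows = rotational_flow(points, true_omega)
outliers = rng.random(200) < 0.4
flows[outliers] = rng.uniform(-0.05, 0.05, size=(int(outliers.sum()), 2))
estimate = hough_rotation(points, flows)
```

Because outlier flow vectors vote along unrelated lines, their votes scatter across the grid while inlier votes pile up in one cell, which is why voting of this kind tolerates many moving objects without random sampling.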

Findings

1. Our method reduces rotation estimation error by nearly 50% compared to the next best approach.

2. It is more accurate than every compared method, regardless of that method's speed.

3. The approach is effective in crowded, real-world scenes, as demonstrated on a new dataset.

Abstract

We present an approach to estimating camera rotation in crowded, real-world scenes from handheld monocular video. While camera rotation estimation is a well-studied problem, no previous methods exhibit both high accuracy and acceptable speed in this setting. Because the setting is not addressed well by other datasets, we provide a new dataset and benchmark, with high-accuracy, rigorously verified ground truth, on 17 video sequences. Methods developed for wide baseline stereo (e.g., 5-point methods) perform poorly on monocular video. On the other hand, methods used in autonomous driving (e.g., SLAM) leverage specific sensor setups, specific motion models, or local optimization strategies (lagging batch processing) and do not generalize well to handheld video. Finally, for dynamic scenes, commonly used robustification techniques like RANSAC require large numbers of iterations, and become…
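The remark about RANSAC iteration counts can be illustrated with the standard textbook estimate N = log(1 - p) / log(1 - w^s), where w is the inlier ratio, s the minimal sample size, and p the desired probability of drawing at least one all-inlier sample. The sketch below is not from the paper; the function name `ransac_iterations` and the example ratios are assumptions chosen only to show how quickly N grows as moving objects reduce the inlier ratio.

```python
import math

def ransac_iterations(inlier_ratio, sample_size, confidence=0.99):
    """Iterations needed so that, with probability `confidence`, at least one
    random minimal sample contains only inliers (standard RANSAC estimate)."""
    if not 0.0 < inlier_ratio < 1.0:
        raise ValueError("inlier_ratio must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    p_all_inliers = inlier_ratio ** sample_size
    # log1p keeps precision when p_all_inliers is very small.
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-p_all_inliers))

# Hypothetical inlier ratios for a 5-point minimal solver (sample_size=5).
for ratio in (0.8, 0.5, 0.2):
    print(ratio, ransac_iterations(ratio, sample_size=5))
```

The count grows roughly as 1 / w^s, so halving the inlier ratio with a 5-point sample multiplies the required iterations by about 32.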

Taxonomy

Topics: Advanced Vision and Imaging · Image and Object Detection Techniques · Robotics and Sensor-Based Localization

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings