Random Forests versus Neural Networks - What's Best for Camera Localization?
Daniela Massiceti, Alexander Krull, Eric Brachmann, Carsten Rother, Philip H.S. Torr

TL;DR
This paper compares Random Forests (RFs) and Neural Networks (NNs) for camera localization, introduces a ForestNet architecture derived from an RF, and proposes a differentiable robust ensemble averaging method. NNs are more accurate at scene coordinate regression, while RFs and ForestNets are slightly better in final pose accuracy.
Contribution
It introduces a test-time efficient ForestNet architecture and a differentiable ensemble averaging technique for regression, advancing camera localization methods.
Findings
Neural Networks outperform RFs in scene coordinate regression.
RFs and ForestNets perform slightly better in final pose estimation.
ForestNet with robust averaging improves state-of-the-art accuracy.
Abstract
This work addresses the task of camera localization in a known 3D scene given a single input RGB image. State-of-the-art approaches accomplish this in two steps: firstly, regressing for every pixel in the image its 3D scene coordinate and subsequently, using these coordinates to estimate the final 6D camera pose via RANSAC. To solve the first step, Random Forests (RFs) are typically used. On the other hand, Neural Networks (NNs) reign in many dense regression tasks, but are not test-time efficient. We ask the question: which of the two is best for camera localization? To address this, we make two method contributions: (1) a test-time efficient NN architecture which we term a ForestNet that is derived and initialized from a RF, and (2) a new fully-differentiable robust averaging technique for regression ensembles which can be trained end-to-end with a NN. Our experimental findings show…
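The abstract does not spell out how the robust averaging in contribution (2) is formulated, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each ensemble member (a tree or a sub-network) predicts one 3D scene coordinate for a pixel, and it down-weights predictions that lie far from the rest using a softmax over negative mean distances. The function name `robust_average` and the `temperature` parameter are hypothetical. Every step is built from operations that are differentiable almost everywhere, which is what would allow such a layer to be trained end-to-end in an autodiff framework.

```python
import math

def robust_average(predictions, temperature=0.5):
    """Soft, outlier-resistant average of ensemble predictions.

    Illustrative assumption, not the paper's formulation: each prediction
    is weighted by a softmax over its negative mean distance to the other
    ensemble members, so isolated predictions receive almost no weight.
    """
    n = len(predictions)
    # Mean Euclidean distance from each prediction to all the others.
    scores = []
    for i, p in enumerate(predictions):
        total = sum(math.dist(p, q) for j, q in enumerate(predictions) if j != i)
        scores.append(total / max(n - 1, 1))
    # Softmax over negative scores; subtracting the minimum keeps exp() stable.
    best = min(scores)
    weights = [math.exp(-(s - best) / temperature) for s in scores]
    norm = sum(weights)
    weights = [w / norm for w in weights]
    # Weighted average, one coordinate at a time.
    dims = len(predictions[0])
    average = tuple(
        sum(w * p[d] for w, p in zip(weights, predictions)) for d in range(dims)
    )
    return average, weights

# Three ensemble members roughly agree; the fourth is an outlier.
ensemble = [(1.0, 2.0, 3.0), (1.1, 2.0, 3.0), (0.9, 2.0, 3.0), (10.0, 10.0, 10.0)]
average, weights = robust_average(ensemble)
plain_mean_x = sum(p[0] for p in ensemble) / len(ensemble)
print(average, weights, plain_mean_x)
```

In this toy ensemble the plain mean is pulled strongly toward the outlier, while the weighted result stays close to the three agreeing predictions. Lowering `temperature` makes the weighting sharper; raising it moves the result back toward the plain mean.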
