OFVL-MS: Once for Visual Localization across Multiple Indoor Scenes
Tao Xie, Kun Dai, Siyi Lu, Ke Wang, Zhiqiang Jiang, Jinghan Gao, Dedong Liu, Jie Xu, Lijun Zhao, Ruifeng Li

TL;DR
OFVL-MS introduces a unified multi-task learning framework for visual localization across multiple indoor scenes, reducing storage and improving accuracy through adaptive layer sharing and gradient normalization.
Contribution
It proposes a novel adaptive sharing policy and gradient normalization technique enabling efficient multi-scene localization without training separate models for each scene.
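To make the gradient normalization idea concrete, here is a minimal sketch of one generic way to balance per-scene gradients on shared parameters: rescale each scene's gradient to the mean L2 norm before summing. This is an illustrative assumption, not the procedure used in OFVL-MS; the function names `l2_norm` and `normalize_and_sum` and the toy gradients are hypothetical.

```python
import math


def l2_norm(grad):
    """Euclidean norm of a flat gradient vector."""
    return math.sqrt(sum(g * g for g in grad))


def normalize_and_sum(task_grads, eps=1e-12):
    """Rescale every task gradient to the mean norm, then sum them.

    Assumption: a simple magnitude-balancing rule chosen for illustration,
    not the paper's exact algorithm.
    """
    norms = [l2_norm(g) for g in task_grads]
    target = sum(norms) / len(norms)  # common magnitude for all tasks
    combined = [0.0] * len(task_grads[0])
    for grad, norm in zip(task_grads, norms):
        scale = target / (norm + eps)  # eps guards against zero gradients
        for i, g in enumerate(grad):
            combined[i] += scale * g
    return combined


# Two hypothetical scenes whose gradients on the shared weights differ
# by a factor of 100 in magnitude.
grads = [[3.0, 4.0], [0.03, -0.04]]
combined = normalize_and_sum(grads)
```

Without rescaling, the first scene would dominate the update of the shared weights; after rescaling, both scenes contribute with equal magnitude, so their disagreement on the second coordinate cancels.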
Findings
Outperforms state-of-the-art methods on multiple benchmarks.
Achieves high localization accuracy with fewer parameters.
Generalizes well to new scenes with minimal additional parameters.
Abstract
In this work, we seek to predict camera poses across scenes in a multi-task learning manner, where we view the localization of each scene as a new task. We propose OFVL-MS, a unified framework that dispenses with the traditional practice of training a model for each individual scene and relieves the gradient conflict induced by optimizing multiple scenes collectively, enabling efficient storage yet precise visual localization for all scenes. Technically, in the forward pass of OFVL-MS, we design a layer-adaptive sharing policy with a learnable score for each layer to automatically determine whether the layer is shared or not. This sharing policy empowers us to acquire task-shared parameters for a reduction of storage cost and task-specific parameters for learning scene-related features to alleviate gradient conflict. In the backward pass of OFVL-MS, we introduce a gradient normalization…
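The layer-adaptive sharing policy described in the abstract can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the sigmoid-plus-threshold rule for turning a score into a shared/specific decision, the scalar "layers", and all names (`select_layers`, `forward`, `count_parameters`) are hypothetical. In the actual method the scores are learned during training; here they are fixed by hand.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def is_shared(score, threshold=0.5):
    """Assumed decision rule: share the layer when sigmoid(score) >= threshold."""
    return sigmoid(score) >= threshold


def select_layers(scores, shared_layers, scene_layers):
    """Pick the shared or the scene-specific copy of each layer from its score."""
    return [
        shared if is_shared(score) else specific
        for score, shared, specific in zip(scores, shared_layers, scene_layers)
    ]


def forward(x, layers):
    """Toy forward pass: each layer is a scalar (weight, bias) pair with ReLU."""
    for weight, bias in layers:
        x = max(0.0, weight * x + bias)
    return x


def count_parameters(scores, num_scenes, params_per_layer=2):
    """Shared layers are stored once; specific layers once per scene."""
    return sum(
        params_per_layer if is_shared(s) else params_per_layer * num_scenes
        for s in scores
    )


# Hand-picked per-layer scores (learned in practice).
scores = [2.0, -1.5, 0.3]
shared_layers = [(1.0, 0.0), (2.0, 1.0), (0.5, 0.0)]
scene_a_layers = [(9.0, 9.0), (3.0, -1.0), (9.0, 9.0)]

layers_for_scene_a = select_layers(scores, shared_layers, scene_a_layers)
output = forward(2.0, layers_for_scene_a)
stored = count_parameters(scores, num_scenes=4)
separate = len(scores) * 2 * 4  # one full model per scene
```

In this toy setup only the second layer is scene-specific, so four scenes need 12 stored parameters instead of the 24 required by four separate models, which mirrors the storage argument in the abstract.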
Taxonomy
Topics: Robotics and Sensor-Based Localization · Advanced Vision and Imaging · Advanced Image and Video Retrieval Techniques
Methods: Gradient Normalization
