YOWO: You Only Walk Once to Jointly Map An Indoor Scene and Register Ceiling-mounted Cameras

Fan Yang; Sosuke Yamao; Ikuo Kusajima; Atsunori Moteki; Shoichi Masui; and Shan Jiang

arXiv:2511.16521·cs.CV·November 21, 2025

Open Access

TL;DR

This paper presents a unified method that maps an indoor scene and registers ceiling-mounted cameras (CMCs) in a single traversal by a mobile agent wearing a head-mounted RGB-D camera, improving the accuracy and efficiency of indoor localization.

Contribution

It introduces a novel joint optimization framework that simultaneously maps the scene and registers ceiling-mounted cameras from a single traversal, along with a new dataset and benchmark.

Findings

01. Effective joint mapping and registration within a unified framework

02. Improved accuracy over separate methods

03. New dataset and benchmark for collaborative scene mapping

Abstract

Using ceiling-mounted cameras (CMCs) for indoor visual capturing opens up a wide range of applications. However, registering CMCs to the target scene layout presents a challenging task. While manual registration with specialized tools is inefficient and costly, automatic registration with visual localization may yield poor results when visual ambiguity exists. To alleviate these issues, we propose a novel solution for jointly mapping an indoor scene and registering CMCs to the scene layout. Our approach involves equipping a mobile agent with a head-mounted RGB-D camera to traverse the entire scene once and synchronize CMCs to capture this mobile agent. The egocentric videos generate world-coordinate agent trajectories and the scene layout, while the videos of CMCs provide pseudo-scale agent trajectories and CMC relative poses. By correlating all the trajectories with their corresponding…
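
The abstract is cut off before it describes how the trajectories are correlated, so the following is only an illustrative sketch of the general idea, not the authors' optimization. It assumes a simplified setting: the agent's path is projected onto the 2D ground plane, the pseudo-scale trajectory seen by one CMC and the world-coordinate trajectory from the egocentric video are already matched by timestamp, and a single similarity transform (scale, rotation, translation) relates them. The function names `fit_similarity_2d` and `apply_similarity_2d` and all numeric values are hypothetical.

```python
import cmath
import math


def fit_similarity_2d(source, target):
    """Least-squares 2D similarity transform mapping source onto target.

    Both inputs are lists of (x, y) tuples paired by index (same timestamps).
    Returns (scale, rotation_in_radians, (tx, ty)).
    """
    if len(source) != len(target) or len(source) < 2:
        raise ValueError("need at least two time-matched point pairs")

    # Treat each 2D point as a complex number so that scale and rotation
    # combine into a single complex gain.
    p = [complex(x, y) for x, y in source]
    q = [complex(x, y) for x, y in target]
    p_mean = sum(p) / len(p)
    q_mean = sum(q) / len(q)

    # Centre both trajectories; the closed-form gain minimises
    # sum |q_i - (a * p_i + b)|^2 over complex a and b.
    num = sum((pi - p_mean).conjugate() * (qi - q_mean) for pi, qi in zip(p, q))
    den = sum(abs(pi - p_mean) ** 2 for pi in p)
    if den == 0:
        raise ValueError("source trajectory is degenerate (all points identical)")

    a = num / den            # |a| is the scale, arg(a) is the rotation
    b = q_mean - a * p_mean  # translation
    return abs(a), cmath.phase(a), (b.real, b.imag)


def apply_similarity_2d(points, scale, rotation, translation):
    """Apply scale, rotation, and translation to a list of (x, y) points."""
    a = cmath.rect(scale, rotation)
    b = complex(*translation)
    mapped = [a * complex(x, y) + b for x, y in points]
    return [(z.real, z.imag) for z in mapped]


# Hypothetical data: a pseudo-scale trajectory from one CMC, and the same
# path in world coordinates (synthesised here from a known transform).
pseudo_scale_track = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 2.0)]
world_track = apply_similarity_2d(pseudo_scale_track, 2.5, math.pi / 6, (1.0, -2.0))

scale, rotation, translation = fit_similarity_2d(pseudo_scale_track, world_track)
```

In this toy setup the fitted `scale`, `rotation`, and `translation` recover the transform used to synthesise `world_track` up to floating-point error. A real system would work in 3D with full camera poses and noisy, partially overlapping observations, which this sketch does not attempt to model.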

Taxonomy

Topics: Robotics and Sensor-Based Localization · Advanced Vision and Imaging · Multimodal Machine Learning Applications