MultiCam: On-the-fly Multi-Camera Pose Estimation Using Spatiotemporal Overlaps of Known Objects

Shiyu Li; Hannah Schieber; Kristoffer Waldow; Benjamin Busam; Julian Kreimeier; Daniel Roth

arXiv:2603.22839·cs.CV·March 25, 2026

Open Access

TL;DR

This paper presents a real-time, marker-less multi-camera pose estimation method for AR that uses scene understanding and spatiotemporal overlaps of known objects, and reports better camera pose accuracy than existing approaches on the YCB-V and T-LESS datasets.

Contribution

The paper introduces an approach that uses scene graph updates and object overlaps to estimate camera poses without markers, in both static and dynamic multi-camera setups.

Findings

01 Outperforms the state of the art in camera pose accuracy on the YCB-V and T-LESS datasets.

02 Effective in both static and dynamic camera scenarios.

03 Provides a new dataset featuring multiple cameras, multiple objects, and temporal FoV overlap.

Abstract

Multi-camera dynamic Augmented Reality (AR) applications require camera pose estimation to leverage the individual information from each camera in one common system. This can be achieved by combining contextual information, such as markers or objects, across multiple views. While cameras are commonly calibrated in an initial step or updated through the constant use of markers, another option is to leverage information already present in the scene, such as known objects. A further downside of marker-based tracking is that markers have to be tracked inside the field-of-view (FoV) of the cameras. To overcome these limitations, we propose a constant dynamic camera pose estimation leveraging spatiotemporal FoV overlaps of known objects on the fly. To achieve that, we enhance the state-of-the-art object pose estimator to update our spatiotemporal scene graph, enabling a relation even among…
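The geometric idea behind relating cameras through a known object can be illustrated with standard rigid-transform algebra. The following is a minimal sketch, not the paper's implementation: it assumes an object pose estimator has already returned the 4x4 pose of one shared object in each camera's frame. All names (`relative_camera_pose`, `pose_z`, `T_camA_obj`, and so on) are hypothetical, and the sketch uses plain Python lists only to stay dependency-free.

```python
import math

Pose = list[list[float]]  # 4x4 homogeneous rigid transform, row-major


def matmul(a: Pose, b: Pose) -> Pose:
    """Multiply two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def invert_rigid(T: Pose) -> Pose:
    """Invert [R t; 0 1] as [R^T, -R^T t; 0 1] (valid for rigid transforms only)."""
    Rt = [[T[j][i] for j in range(3)] for i in range(3)]
    t = [T[i][3] for i in range(3)]
    ti = [-sum(Rt[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [Rt[0] + [ti[0]], Rt[1] + [ti[1]], Rt[2] + [ti[2]],
            [0.0, 0.0, 0.0, 1.0]]


def pose_z(yaw_deg: float, tx: float, ty: float, tz: float) -> Pose:
    """Helper for the demo: rotation about z by yaw_deg plus a translation."""
    c, s = math.cos(math.radians(yaw_deg)), math.sin(math.radians(yaw_deg))
    return [[c, -s, 0.0, tx], [s, c, 0.0, ty],
            [0.0, 0.0, 1.0, tz], [0.0, 0.0, 0.0, 1.0]]


def relative_camera_pose(T_camA_obj: Pose, T_camB_obj: Pose) -> Pose:
    """Pose of camera B in camera A's frame, chained through one shared object."""
    return matmul(T_camA_obj, invert_rigid(T_camB_obj))


# Synthetic ground truth (made-up values) used to check the chaining.
T_world_A = pose_z(30.0, 1.0, 0.0, 0.5)
T_world_B = pose_z(-45.0, -0.5, 2.0, 0.5)
T_world_obj = pose_z(10.0, 0.2, 0.8, 0.0)

# What an ideal object pose estimator would report in each camera.
T_camA_obj = matmul(invert_rigid(T_world_A), T_world_obj)
T_camB_obj = matmul(invert_rigid(T_world_B), T_world_obj)

T_est = relative_camera_pose(T_camA_obj, T_camB_obj)
T_true = matmul(invert_rigid(T_world_A), T_world_B)
max_err = max(abs(T_est[i][j] - T_true[i][j])
              for i in range(4) for j in range(4))
```

If the shared object does not move, the two observations need not be simultaneous, which is one way to read the "spatiotemporal" overlap described in the abstract. That static-object condition is an assumption of this sketch, and real estimates would also carry noise that the sketch ignores.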

Taxonomy

Topics: Robot Manipulation and Learning · Robotics and Sensor-Based Localization · Human Pose and Action Recognition