LAMP: Localization Aware Multi-camera People Tracking in Metric 3D World

Nan Yang; Julian Straub; Fan Zhang; Richard Newcombe; Jakob Engel; Lingni Ma

arXiv:2605.05390·cs.CV·May 8, 2026

TL;DR

LAMP is a multi-camera 3D human tracking framework for egocentric headsets. It uses the known device motion and calibration to convert 2D body keypoints from all cameras into a 3D ray cloud in a shared world frame, then fits 3D human motion to that cloud with a spatio-temporal transformer.

Contribution

The paper introduces a simple, end-to-end-trained framework that disentangles observer and target motion early, enabling effective multi-view, localized 3D human tracking in egocentric scenarios.

Findings

01

Achieves state-of-the-art results on monocular benchmarks.

02

Significantly outperforms baselines in egocentric multi-camera settings.

03

Effectively leverages multi-view, asynchronous camera data.

Abstract

Tracking 3D human motion from an egocentric multi-camera headset is challenged by severe egomotion, partial visibility or occlusion, and a lack of training data. Existing methods designed for monocular video often require static or slowly moving cameras and cannot efficiently leverage multi-view, calibrated, and localized input. This makes them brittle and prone to failure on dynamic egocentric captures. We propose LAMP (Localization Aware Multi-camera People Tracking): a novel, simple framework that addresses this through early disentanglement of observer and target motion. LAMP introduces a two-step process. First, we leverage the known device 6 DoF motion and calibration to convert detected 2D body keypoints from all cameras over a temporal window into a unified 3D world reference frame. Second, an end-to-end-trained spatio-temporal transformer fits 3D human motion directly to this 3D ray cloud. This…
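
The first step described in the abstract, lifting 2D keypoints into a shared world frame, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes an ideal pinhole camera with intrinsic matrix K and a 4x4 camera-to-world pose per frame, whereas real headset cameras may need a fisheye or other distortion model. The function name `keypoints_to_world_rays` and all parameter names are hypothetical, and the transformer of the second step is not sketched here.

```python
import numpy as np


def keypoints_to_world_rays(keypoints_px, K, T_world_camera):
    """Turn 2D pixel keypoints into world-frame rays (hypothetical helper).

    keypoints_px:   (N, 2) pixel coordinates of detected body keypoints.
    K:              (3, 3) pinhole intrinsics (assumption: no lens distortion).
    T_world_camera: (4, 4) camera-to-world pose from device localization.
    Returns ray origins (N, 3) and unit ray directions (N, 3) in world frame.
    """
    keypoints_px = np.asarray(keypoints_px, dtype=float)
    ones = np.ones((keypoints_px.shape[0], 1))
    pixels_h = np.hstack([keypoints_px, ones])        # homogeneous pixels
    dirs_cam = (np.linalg.inv(K) @ pixels_h.T).T      # back-project to camera frame
    R = T_world_camera[:3, :3]
    t = T_world_camera[:3, 3]
    dirs_world = dirs_cam @ R.T                       # rotate directions into world frame
    dirs_world /= np.linalg.norm(dirs_world, axis=1, keepdims=True)
    origins = np.tile(t, (dirs_world.shape[0], 1))    # every ray starts at the camera center
    return origins, dirs_world


# Example: two cameras observing one keypoint each at the principal point.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])

T_a = np.eye(4)
T_a[:3, 3] = [1.0, 2.0, 3.0]                          # translated, not rotated

T_b = np.eye(4)
T_b[:3, :3] = [[0.0, 0.0, 1.0],                       # 90 degree rotation about y
               [0.0, 1.0, 0.0],
               [-1.0, 0.0, 0.0]]

rays = [keypoints_to_world_rays([[320.0, 240.0]], K, T) for T in (T_a, T_b)]
ray_origins = np.vstack([o for o, _ in rays])         # stacked rays from all cameras
ray_directions = np.vstack([d for _, d in rays])
```

Because each ray carries its own origin and direction in the world frame, rays from different cameras and different timestamps can be stacked into one set without requiring synchronized captures. The observer's motion is already accounted for at this point, which is the sense in which observer and target motion are separated early.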
