Adaptive Geodesic Conformal Prediction for Egocentric Camera Pose Estimation
Aishani Pathak, Hasti Seifi

TL;DR
This paper introduces an adaptive conformal prediction method for egocentric camera pose estimation that provides uncertainty regions with coverage guarantees and identifies harder frames so that coverage on them improves, targeting AR applications.
Contribution
It proposes DINOv2-Bridge, a two-stage adaptive conformal prediction approach whose difficulty estimator is trained on a single source participant and transfers across participants without additional images, improving coverage on the hardest frames.
Findings
Standard CP with a single fixed threshold achieves the nominal 90% overall coverage but covers only about 60% of the hardest 25% of frames (Q4).
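A geodesic SE(3) nonconformity score identifies physically harder frames than Euclidean scoring: the two Q4 sets overlap by only 15-26%, and geodesic Q4 frames show 2-3x higher ground-truth camera displacement.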
DINOv2-Bridge improves hardest-frame coverage from 75% to 93% while maintaining 90% overall coverage.
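To make the geodesic versus Euclidean distinction in the findings concrete, here is a minimal sketch of two nonconformity scores for a predicted camera pose. It is an illustration under assumptions, not the paper's definition: the way rotation and translation error are combined, the `rot_weight` parameter, and all function names are hypothetical. The rotation term uses the standard geodesic angle on SO(3).

```python
import numpy as np

def rotation_angle(R_pred, R_true):
    """Geodesic distance on SO(3): angle (radians) of the relative rotation."""
    R_rel = R_pred.T @ R_true
    cos_theta = (np.trace(R_rel) - 1.0) / 2.0
    # Clip to guard against values slightly outside [-1, 1] from rounding.
    return float(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

def geodesic_score(R_pred, t_pred, R_true, t_true, rot_weight=1.0):
    """Assumed SE(3)-style score: translation and weighted rotation error combined.

    rot_weight (hypothetical) converts radians into the translation unit.
    """
    rot_err = rotation_angle(R_pred, R_true)
    trans_err = float(np.linalg.norm(t_pred - t_true))
    return float(np.sqrt(trans_err ** 2 + (rot_weight * rot_err) ** 2))

def euclidean_score(t_pred, t_true):
    """Euclidean score: translation error only, blind to orientation error."""
    return float(np.linalg.norm(t_pred - t_true))

def rot_z(theta):
    """Rotation by theta radians about the z-axis (helper for the example)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# A frame where position is predicted exactly but orientation is off by 90 degrees.
t = np.array([0.2, -0.1, 1.5])
eucl = euclidean_score(t, t)
geo = geodesic_score(np.eye(3), t, rot_z(np.pi / 2), t)
```

In this toy frame the Euclidean score is zero while the geodesic score equals the rotation error, which is one way the two scores can rank frame difficulty differently.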
Abstract
Egocentric pose estimation for Augmented Reality (AR) and assistive devices requires not just accurate predictions but guaranteed uncertainty regions. Conformal prediction (CP) provides such guarantees without retraining, but we show that standard CP with a single fixed threshold achieves nominal 90% overall coverage while covering only ~60% of the hardest 25% of frames (Q4) -- a ~30 percentage-point conditional coverage gap consistent across 12 participants, 3 predictors, and 3 horizons (108 evaluations) on EPIC-Fields. We further show that a geodesic SE(3) nonconformity score identifies physically harder frames than Euclidean scoring, with only 15-26% Q4 overlap and 2-3x higher ground-truth camera displacement for geodesic Q4 frames. To close the coverage gap, we propose DINOv2-Bridge adaptive CP: a two-stage difficulty estimator trained on a single source participant that transfers…
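The conditional coverage gap described in the abstract can be reproduced in miniature with split conformal prediction on synthetic scores. The sketch below is an assumption-laden illustration, not the paper's pipeline: the data are simulated, the `difficulty` variable is a hypothetical stand-in for frame hardness, and it is not EPIC-Fields. It calibrates a single fixed threshold at the 90% level and then measures coverage overall and on the hardest quartile (Q4).

```python
import numpy as np

def conformal_quantile(cal_scores, alpha=0.1):
    """Split-conformal threshold with the usual finite-sample correction."""
    n = len(cal_scores)
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    return float(np.quantile(cal_scores, level, method="higher"))

def coverage(scores, threshold):
    """Fraction of frames whose nonconformity score falls inside the region."""
    return float(np.mean(scores <= threshold))

rng = np.random.default_rng(0)
n = 4000

# Synthetic frames: error scale grows with a latent difficulty in [0, 1].
difficulty = rng.uniform(0.0, 1.0, size=2 * n)
scores = np.abs(rng.normal(0.0, 0.1 + difficulty))

# Exchangeable split into calibration and test halves.
cal_scores, test_scores = scores[:n], scores[n:]
test_difficulty = difficulty[n:]

# One fixed threshold for every frame, as in standard CP.
threshold = conformal_quantile(cal_scores, alpha=0.1)

overall_cov = coverage(test_scores, threshold)

# Q4: the hardest 25% of test frames according to the latent difficulty.
is_q4 = test_difficulty >= np.quantile(test_difficulty, 0.75)
q4_cov = coverage(test_scores[is_q4], threshold)

print(f"overall coverage: {overall_cov:.3f}")
print(f"Q4 coverage:      {q4_cov:.3f}")
```

Because the threshold is shared across all frames, the marginal guarantee holds on average while the hardest frames are under-covered. An adaptive method would instead scale the threshold by an estimate of per-frame difficulty.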
