TL;DR
InFlux is a real-world video benchmark with per-frame ground-truth camera intrinsics, built for evaluating and developing algorithms that must handle videos whose camera parameters change over time.
Contribution
The paper presents InFlux, the first large-scale benchmark with per-frame intrinsic annotations for videos with dynamic intrinsics, covering diverse scenes and variations.
Findings
Existing methods struggle with dynamic intrinsics prediction.
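The InFlux dataset captures a wider range of intrinsic variation and scene diversity than prior benchmarks, with 143K+ annotated frames from 386 high-resolution indoor and outdoor videos.
The benchmark enables more accurate evaluation of intrinsics estimation algorithms on videos with dynamic intrinsics.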
Abstract
Accurately tracking camera intrinsics is crucial for achieving 3D understanding from 2D video. However, most 3D algorithms assume that camera intrinsics stay constant throughout a video, which is often not true for many real-world in-the-wild videos. A major obstacle in this field is a lack of dynamic camera intrinsics benchmarks--existing benchmarks typically offer limited diversity in scene content and intrinsics variation, and none provide per-frame intrinsic changes for consecutive video frames. In this paper, we present Intrinsics in Flux (InFlux), a real-world benchmark that provides per-frame ground truth intrinsics annotations for videos with dynamic intrinsics. Compared to prior benchmarks, InFlux captures a wider range of intrinsic variations and scene diversity, featuring 143K+ annotated frames from 386 high-resolution indoor and outdoor videos with dynamic camera intrinsics.…
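To make the constant-intrinsics assumption concrete, here is a minimal sketch of how much a projected point can drift when the focal length changes but an algorithm keeps using the first frame's intrinsics. It assumes a standard pinhole model without distortion. The `Intrinsics` class, the `project` helper, the zoom schedule, and the 3D point are all hypothetical values chosen for illustration; none of them come from the InFlux dataset or its tooling.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Intrinsics:
    """Pinhole intrinsics for a single frame (hypothetical container)."""
    fx: float  # focal length along x, in pixels
    fy: float  # focal length along y, in pixels
    cx: float  # principal point x, in pixels
    cy: float  # principal point y, in pixels


def project(point, k):
    """Project a camera-frame 3D point (X, Y, Z) to pixel coordinates."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return (k.fx * x / z + k.cx, k.fy * y / z + k.cy)


# Assumed zoom: focal length grows by 100 px per frame over 5 frames.
frames = [
    Intrinsics(fx=1000 + 100 * t, fy=1000 + 100 * t, cx=960, cy=540)
    for t in range(5)
]
point = (0.5, 0.25, 4.0)  # illustrative 3D point, in camera coordinates

# Projection with the correct per-frame intrinsics.
per_frame = [project(point, k) for k in frames]
# Projection that wrongly reuses the first frame's intrinsics throughout.
constant = [project(point, frames[0]) for _ in frames]

# Pixel error caused by the constant-intrinsics assumption, per frame.
errors = [hypot(u - u0, v - v0) for (u, v), (u0, v0) in zip(per_frame, constant)]

for t, err in enumerate(errors):
    print(f"frame {t}: error = {err:.2f} px")
```

In this toy setup the error is zero at the first frame and grows as the zoom increases, which is the kind of drift that per-frame ground truth lets a benchmark measure directly.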