Scale-aware Insertion of Virtual Objects in Monocular Videos

Songhai Zhang; Xiangli Li; Yingtian Liu; Hongbo Fu

arXiv:2012.02371·cs.CV·December 7, 2020

TL;DR

This paper introduces a scale-aware approach for inserting virtual objects into monocular videos by estimating global scene scale using a Bayesian method and a new object size dataset, improving realism and robustness.

Contribution

It presents a novel Bayesian scale estimation method incorporating object size priors and introduces Metric-Tree, a hierarchical dataset of object sizes for over 900 categories.
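A hierarchical size dataset suggests one natural lookup pattern: when a detected category has no size statistics of its own, fall back to its nearest ancestor that does. The sketch below illustrates that idea only. The `SizeNode` class, its fields, the `lookup_prior` helper, and all numbers are hypothetical and are not taken from Metric-Tree or the paper.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class SizeNode:
    """One category in a hypothetical size hierarchy (not the real Metric-Tree schema)."""
    name: str
    mean_m: Optional[float] = None   # illustrative mean size in metres
    std_m: Optional[float] = None    # illustrative standard deviation in metres
    parent: Optional["SizeNode"] = None


def lookup_prior(node: Optional[SizeNode]) -> Tuple[str, float, float]:
    """Return (category, mean, std) from the nearest node that carries statistics."""
    while node is not None:
        if node.mean_m is not None and node.std_m is not None:
            return node.name, node.mean_m, node.std_m
        node = node.parent  # climb one level toward the root
    raise KeyError("no size prior found on the path to the root")


# Made-up values, for illustration only.
furniture = SizeNode("furniture", mean_m=1.0, std_m=0.5)
chair = SizeNode("chair", mean_m=0.9, std_m=0.1, parent=furniture)
stool = SizeNode("stool", parent=chair)  # no statistics of its own

prior = lookup_prior(stool)
```

Here `stool` has no statistics, so the lookup returns the prior stored on `chair`, its closest ancestor with data.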

Findings

01. Outperforms state-of-the-art scale estimation methods

02. Demonstrates robustness across various video scenes

03. Provides a new dataset for object size priors

Abstract

In this paper, we propose a scale-aware method for inserting virtual objects with proper sizes into monocular videos. To tackle the scale ambiguity problem of geometry recovery from monocular videos, we estimate the global scale of the objects in a video with a Bayesian approach that incorporates size priors of objects: the sizes of all scene objects must strictly conform to the same global scale, and the probability of the global scale is maximized according to the size distributions of the object categories. To supply these priors, we propose a dataset of object category sizes: Metric-Tree, a hierarchical representation of the sizes of more than 900 object categories with corresponding images. To handle the incompleteness of objects recovered from videos, we propose a novel scale estimation method that extracts plausible dimensions of objects for scale optimization. Experiments have shown that our method…
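The idea of a single global scale shared by all objects can be illustrated with a simplified maximum-likelihood sketch. It assumes each object's metric size follows an independent Gaussian prior and that the prior over the scale itself is flat, which yields a closed-form estimate. This is an assumption made for illustration, not the paper's formulation; the function name `estimate_global_scale` and all numbers are hypothetical.

```python
from typing import Iterable, Tuple


def estimate_global_scale(observations: Iterable[Tuple[float, float, float]]) -> float:
    """Estimate one scale s shared by all objects.

    Each observation is (d, mu, sigma):
      d     - object dimension measured in the unscaled reconstruction
      mu    - prior mean of that dimension in metres (illustrative)
      sigma - prior standard deviation in metres (illustrative)

    Assuming s * d ~ N(mu, sigma^2) independently per object and a flat
    prior on s, the log-likelihood is -sum((s*d - mu)^2 / (2*sigma^2)),
    whose maximiser is s = sum(d*mu/sigma^2) / sum(d^2/sigma^2).
    """
    obs = list(observations)
    num = sum(d * mu / sigma ** 2 for d, mu, sigma in obs)
    den = sum(d * d / sigma ** 2 for d, mu, sigma in obs)
    if den == 0:
        raise ValueError("need at least one observation with non-zero size")
    return num / den


# Two made-up objects whose priors both imply a scale of 0.5.
scale = estimate_global_scale([(2.0, 1.0, 0.1), (4.0, 2.0, 0.2)])
```

Objects with tighter priors (smaller `sigma`) pull the estimate more strongly, which is one reason category-level size statistics matter for this kind of estimate.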

Taxonomy

Topics: Advanced Vision and Imaging · Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization