COIN: Control-Inpainting Diffusion Prior for Human and Camera Motion Estimation
Jiefeng Li, Ye Yuan, Davis Rempe, Haotian Zhang, Pavlo Molchanov, Cewu Lu, Jan Kautz, Umar Iqbal

TL;DR
COIN introduces a novel control-inpainting diffusion prior and a joint optimization framework to improve global human and camera motion estimation from RGB videos, outperforming existing methods on challenging benchmarks.
Contribution
The paper proposes COIN, a control-inpainting diffusion prior, together with a new score distillation sampling method and a human-scene relation loss, to disentangle and accurately estimate human and camera motions.
Findings
Outperforms state-of-the-art methods in global human and camera motion estimation
Achieves a 33% improvement in world joint position error on the RICH dataset
Demonstrates effectiveness on three challenging benchmarks
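To make the reported metric concrete, the sketch below shows one common way a world-frame joint position error and a relative improvement can be computed. It is an illustration under stated assumptions, not the paper's evaluation protocol: the function names, the array layout (frames, joints, 3), and the absence of any trajectory alignment step are all hypothetical choices made here.

```python
import numpy as np


def world_joint_position_error(pred, gt):
    """Mean Euclidean distance between predicted and ground-truth joints.

    Assumption: pred and gt have shape (frames, joints, 3) and are already
    expressed in the same world coordinate frame (no alignment is applied).
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    if pred.shape != gt.shape:
        raise ValueError("pred and gt must have the same shape")
    # Per-joint, per-frame distance, averaged over all joints and frames.
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


def relative_improvement(baseline_error, new_error):
    """Fractional error reduction relative to a baseline error."""
    return (baseline_error - new_error) / baseline_error
```

With made-up numbers, a baseline error of 150 and a new error of 100 give `relative_improvement(150.0, 100.0)` of about 0.33, which is the kind of ratio a "33% improvement" refers to. These values are illustrative only and are not taken from the paper.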
Abstract
Estimating global human motion from moving cameras is challenging due to the entanglement of human and camera motions. To mitigate the ambiguity, existing methods leverage learned human motion priors, which, however, often result in oversmoothed motions with misaligned 2D projections. To tackle this problem, we propose COIN, a control-inpainting motion diffusion prior that enables fine-grained control to disentangle human and camera motions. Although pre-trained motion diffusion models encode rich motion priors, we find it non-trivial to leverage such knowledge to guide global motion estimation from RGB videos. COIN introduces a novel control-inpainting score distillation sampling method to ensure well-aligned, consistent, and high-quality motion from the diffusion prior within a joint optimization framework. Furthermore, we introduce a new human-scene relation loss to alleviate the scale…
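For readers unfamiliar with score distillation sampling (SDS), the sketch below shows the generic SDS gradient that the abstract's method builds on. It is a minimal illustration only, not COIN's control-inpainting variant, whose details are not given here. The names `sds_gradient` and `toy_denoiser`, the motion array shape, and the noise level are hypothetical; the toy denoiser stands in for a pre-trained motion diffusion model and assumes a prior concentrated at zero motion.

```python
import numpy as np


def sds_gradient(x, denoiser, alpha_bar_t, rng, weight=1.0):
    """Generic SDS gradient with respect to the optimized variable x.

    x is optimized directly, so the Jacobian of x with respect to the
    parameters is the identity and is omitted.
    """
    eps = rng.standard_normal(x.shape)
    # Forward diffusion: noise the current estimate to level alpha_bar_t.
    x_t = np.sqrt(alpha_bar_t) * x + np.sqrt(1.0 - alpha_bar_t) * eps
    eps_pred = denoiser(x_t, alpha_bar_t)
    # SDS uses the weighted residual between predicted and injected noise.
    return weight * (eps_pred - eps)


def toy_denoiser(x_t, alpha_bar_t):
    """Hypothetical noise predictor for a prior whose only sample is zero."""
    return x_t / np.sqrt(1.0 - alpha_bar_t)


rng = np.random.default_rng(0)
motion = rng.standard_normal((4, 3))  # hypothetical (frames, dims) motion
initial_norm = np.linalg.norm(motion)

for _ in range(50):
    grad = sds_gradient(motion, toy_denoiser, alpha_bar_t=0.5, rng=rng)
    motion = motion - 0.1 * grad  # plain gradient descent step

final_norm = np.linalg.norm(motion)
```

In this toy setting the gradient pulls the estimate toward the prior's single mode, so `final_norm` ends up far smaller than `initial_norm`. In a joint optimization framework, a term like this would be combined with data terms such as 2D reprojection losses; how COIN does so is not specified in this summary.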
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Image Processing Techniques · Image and Signal Denoising Methods
Methods: Diffusion
