Generating Fit Check Videos with a Handheld Camera
Bowei Chen, Brian Curless, Ira Kemelmacher-Shlizerman, Steven M. Seitz

TL;DR
This paper introduces a novel method for generating realistic full-body videos from static selfies and motion data, enabling convenient, handheld camera-based video capture with scene consistency.
Contribution
It presents a new video diffusion model with multi-reference attention and an image-based fine-tuning strategy for realistic human video synthesis from minimal input.
Findings
Achieves realistic full-body videos from selfies and IMU data
Enables scene-aware rendering with consistent illumination and shadows
Improves frame sharpness and realism through fine-tuning
Abstract
Self-captured full-body videos are popular, but most deployments require mounted cameras, carefully-framed shots, and repeated practice. We propose a more convenient solution that enables full-body video capture using handheld mobile devices. Our approach takes as input two static photos (front and back) of you in a mirror, along with an IMU motion reference that you perform while holding your mobile phone, and synthesizes a realistic video of you performing a similar target motion. We enable rendering into a new scene, with consistent illumination and shadows. We propose a novel video diffusion-based model to achieve this. Specifically, we propose a parameter-free frame generation strategy and a multi-reference attention mechanism to effectively integrate appearance information from both the front and back selfies into the video diffusion model. Further, we introduce an image-based…
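The abstract names a multi-reference attention mechanism but does not spell out its form here. As a rough illustration only, one common way to let a generated frame draw on two reference images is to concatenate the tokens of both references and attend over them jointly with scaled dot-product attention. The sketch below assumes that formulation. The function name `multi_reference_attention`, the toy 2-dimensional features, and the absence of learned projections are all assumptions made for brevity, not details taken from the paper.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def multi_reference_attention(queries, front_tokens, back_tokens):
    """Hypothetical sketch: each query token (from the frame being generated)
    attends jointly over tokens from the front and back reference images.
    Reference tokens serve as both keys and values (no learned projections)."""
    refs = front_tokens + back_tokens          # joint key/value set
    d = len(queries[0])
    scale = 1.0 / math.sqrt(d)                 # standard scaled dot-product factor
    outputs, all_weights = [], []
    for q in queries:
        weights = softmax([dot(q, k) * scale for k in refs])
        out = [sum(w * v[i] for w, v in zip(weights, refs)) for i in range(d)]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Toy example: two front tokens, one back token, one query aligned with front[0].
front = [[1.0, 0.0], [0.0, 1.0]]
back = [[-1.0, 0.0]]
queries = [[4.0, 0.0]]
outputs, weights = multi_reference_attention(queries, front, back)
```

In this toy case the query puts most of its weight on the first front token, since that token has the largest dot product with it. A real model would apply learned query, key, and value projections and operate on feature maps from the diffusion backbone.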
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Evacuation and Crowd Dynamics · Video Analysis and Summarization
Methods: Softmax · Attention Is All You Need · Diffusion
