DEVIS-GRPO: Unleashing GRPO on Dynamic Extreme View Synthesis
Yi Zuo, Huimin Wu, Lingling Li, Fang Liu, Licheng Jiao, Qing Li

TL;DR
DEVIS-GRPO introduces an online policy gradient framework with a novel sampling strategy for efficient, diverse, and high-quality extreme-view video synthesis, outperforming existing methods on multiple datasets.
Contribution
It presents the first online policy gradient approach for extreme-view video generation, using a progressive sampling strategy that improves training efficiency and sample diversity.
Findings
Achieves 21.57% PSNR improvement on Kubric-4D.
Reduces LPIPS by 18.56% on the iPhone dataset.
Outperforms prior methods in non-occlusion areas.
Abstract
Trajectory-controlled video generation has become essential for controllable video generation. While current methods perform well under small-view camera motions, they degrade significantly with large-view motions. Existing solutions for extreme-view synthesis typically require dedicated video pairs, demanding substantial annotation effort. To address these limitations, we propose Dynamic Extreme VIew Synthesis-GRPO (DEVIS-GRPO), a GRPO-based framework for trajectory-controlled video generation, the first online policy gradient method for extreme view video generation. Central to our approach is a novel sampling strategy: Accumulative Dynamic Extreme VIew Synthesis (ADEVIS), which achieves large-view camera motions by progressively accumulating small-view increments. This method delivers two key advantages: 1) enhanced training efficiency, as it eliminates the need to warm-start the…
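As a rough illustration of the two ideas named in the abstract, the sketch below shows (a) how a large camera rotation could be reached by accumulating small increments, and (b) the group-relative advantage normalization commonly associated with GRPO. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `accumulate_views` and `group_relative_advantages`, the use of a single rotation angle in degrees, and the example reward values are all hypothetical, and the summary above does not specify the paper's actual reward or sampling details.

```python
import math
from statistics import mean, pstdev


def accumulate_views(target_deg, step_deg):
    """Return cumulative camera angles that reach target_deg in small steps.

    Hypothetical helper: a camera pose is reduced to one rotation angle,
    and each step adds at most step_deg until the target is reached.
    """
    if step_deg <= 0:
        raise ValueError("step_deg must be positive")
    n_steps = math.ceil(abs(target_deg) / step_deg)
    sign = 1 if target_deg >= 0 else -1
    # Clamp the final step so the schedule ends exactly at the target.
    return [sign * min(step_deg * (i + 1), abs(target_deg)) for i in range(n_steps)]


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one sampled group (zero mean, unit scale).

    Each sample's advantage is its reward minus the group mean, divided by
    the group standard deviation; eps avoids division by zero when all
    rewards in the group are equal.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Example: reach a 70-degree view in increments of at most 30 degrees.
schedule = accumulate_views(70, 30)

# Example: hypothetical rewards for a group of three sampled videos.
advantages = group_relative_advantages([1.0, 2.0, 3.0])
```

In this sketch, samples that score above their group's mean receive positive advantages and those below receive negative ones, so no separately learned value function is needed to rank samples within a group.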
