ImmersePro: End-to-End Stereo Video Synthesis Via Implicit Disparity Learning

Jian Shi; Zhenyu Li; Peter Wonka

arXiv:2410.00262·cs.CV·October 2, 2024

TL;DR

ImmersePro is a novel end-to-end framework that converts single-view videos into stereo videos using implicit disparity learning, leveraging a new dataset and attention mechanisms for high-quality stereo synthesis.

Contribution

The paper introduces ImmersePro, a dual-branch architecture with implicit disparity guidance and a large-scale stereo video dataset for improved stereo video generation.

Findings

1. Significant quantitative improvements over existing methods.
2. Effective stereo video synthesis from monocular videos.
3. Introduction of the large-scale YouTube-SBS dataset.

Abstract

We introduce ImmersePro, an innovative framework specifically designed to transform single-view videos into stereo videos. This framework utilizes a novel dual-branch architecture comprising a disparity branch and a context branch on video data by leveraging spatial-temporal attention mechanisms. ImmersePro employs implicit disparity guidance, enabling the generation of stereo pairs from video sequences without the need for explicit disparity maps, thus reducing potential errors associated with disparity estimation models. In addition to the technical advancements, we introduce the YouTube-SBS dataset, a comprehensive collection of 423 stereo videos sourced from YouTube. This dataset is unprecedented in its scale, featuring over 7 million stereo pairs, and is designed to facilitate training and benchmarking of stereo video generation models. Our experiments demonstrate…
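As a rough illustration of how one stereo pair might be obtained from a video frame in such a dataset, the sketch below splits a single frame into left and right views. It assumes that "SBS" refers to side-by-side packing, with the left-eye view in the left half of the frame; the abstract does not state the packing, and `split_sbs_frame` is a hypothetical helper, not part of the ImmersePro repository. Plain nested lists stand in for an image array so the example runs without dependencies.

```python
from typing import List, Tuple

# Rows of pixel values; a dependency-free stand-in for an image array.
Frame = List[List[int]]


def split_sbs_frame(frame: Frame) -> Tuple[Frame, Frame]:
    """Split a side-by-side stereo frame into (left, right) views.

    Assumption: the left-eye view occupies the left half of every row.
    This packing is assumed for illustration, not taken from the paper.
    """
    if not frame:
        raise ValueError("frame is empty")
    width = len(frame[0])
    if any(len(row) != width for row in frame):
        raise ValueError("all rows must have the same width")
    if width % 2 != 0:
        raise ValueError("side-by-side frame width must be even")
    half = width // 2
    left = [row[:half] for row in frame]
    right = [row[half:] for row in frame]
    return left, right


# Toy 2x4 frame: the left half holds 1s, the right half holds 2s.
toy = [[1, 1, 2, 2],
       [1, 1, 2, 2]]
left_view, right_view = split_sbs_frame(toy)
```

With an array library, the same split would be a column slice at half the frame width. A model trained on such pairs would take one view as input and the other as the synthesis target.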

Code & Models

Repositories

shijianjian/ImmersePro
PyTorch · Official

Taxonomy

Topics: Advanced Vision and Imaging · Video Coding and Compression Technologies

Methods: Softmax · Attention Is All You Need