Cross-View Exocentric to Egocentric Video Synthesis

Gaowen Liu; Hao Tang; Hugo Latapie; Jason Corso; Yan Yan

arXiv:2107.03120 · cs.CV · July 8, 2021

TL;DR

This paper introduces STA-GAN, a novel bi-directional attention-based generative model that synthesizes egocentric videos from exocentric views by capturing spatial and temporal features, outperforming existing methods.

Contribution

The paper proposes a new STA-GAN model with dual discriminators and attention fusion for cross-view video synthesis, addressing the challenge of view transformation.

Findings

1. STA-GAN significantly outperforms existing methods on Side2Ego and Top2Ego datasets.

2. The bi-directional attention fusion improves the quality of generated egocentric videos.

3. Dual discriminators enhance the robustness of network training.

Abstract

The cross-view video synthesis task seeks to generate video sequences of one view from another, dramatically different view. In this paper, we investigate the exocentric (third-person) view to egocentric (first-person) view video generation task. This is challenging because the egocentric view is sometimes remarkably different from the exocentric view, so transforming appearances across the two views is non-trivial. In particular, we propose a novel Bi-directional Spatial Temporal Attention Fusion Generative Adversarial Network (STA-GAN) that learns both spatial and temporal information to generate egocentric video sequences from the exocentric view. The proposed STA-GAN consists of three parts: a temporal branch, a spatial branch, and attention fusion. First, the temporal and spatial branches generate a sequence of fake frames and their corresponding features. The fake frames…
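As a rough illustration of what "attention fusion" of two branch outputs can look like, the sketch below blends one candidate frame from a temporal branch with one from a spatial branch using per-pixel softmax weights. The abstract is cut off before it describes the fusion step, so the blending rule, the function name `fuse_frames`, and the attention-logit inputs are all assumptions made for illustration, not the paper's actual formulation.

```python
import math


def fuse_frames(temporal_frame, spatial_frame, temporal_logits, spatial_logits):
    """Blend two candidate frames pixel by pixel with softmax attention weights.

    Hypothetical helper: each argument is a 2D list (rows of floats) of the
    same shape. The logits are assumed to come from the two branches; how
    STA-GAN actually computes or combines them is not stated in the abstract.
    """
    fused = []
    for t_row, s_row, a_row, b_row in zip(
        temporal_frame, spatial_frame, temporal_logits, spatial_logits
    ):
        fused_row = []
        for t, s, a, b in zip(t_row, s_row, a_row, b_row):
            # Two-way softmax, shifted by the max logit for numerical stability.
            m = max(a, b)
            w_t = math.exp(a - m)
            w_s = math.exp(b - m)
            fused_row.append((w_t * t + w_s * s) / (w_t + w_s))
        fused.append(fused_row)
    return fused


# Equal logits give a plain average of the two candidate frames.
example = fuse_frames([[0.0, 1.0]], [[1.0, 1.0]], [[0.0, 0.0]], [[0.0, 0.0]])
print(example)
```

With equal logits the two candidates are simply averaged; a much larger logit on one branch makes the fused pixel follow that branch almost exactly.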

Taxonomy

Topics: Advanced Vision and Imaging · Advanced Image Processing Techniques · Generative Adversarial Networks and Image Synthesis