Switch-a-View: View Selection Learned from Unlabeled In-the-wild Videos

Sagnik Majumder; Tushar Nagarajan; Ziad Al-Halah; Kristen Grauman

arXiv:2412.18386·cs.CV·April 23, 2025

Open Access

TL;DR

Switch-a-View is a model that learns to select which viewpoint (egocentric or exocentric) to display at each timepoint of a multi-view how-to video. It is trained on unlabeled but human-edited videos, so it can be applied to new multi-view settings even when labels are limited.

Contribution

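We propose a training method that pseudo-labels segments of unlabeled, human-edited how-to videos with their primary viewpoint (egocentric or exocentric) and learns how the visual and spoken content relates to view-switch moments, so that view selection in multi-view video can be learned with limited supervision.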

Findings

01. Effective view selection on real-world videos
02. Outperforms baseline methods in view switching tasks
03. Applicable to various multi-view video settings

Abstract

We introduce SWITCH-A-VIEW, a model that learns to automatically select the viewpoint to display at each timepoint when creating a how-to video. The key insight of our approach is how to train such a model from unlabeled -- but human-edited -- video samples. We pose a pretext task that pseudo-labels segments in the training videos for their primary viewpoint (egocentric or exocentric), and then discovers the patterns between the visual and spoken content in a how-to video on the one hand and its view-switch moments on the other hand. Armed with this predictor, our model can be applied to new multi-view video settings for orchestrating which viewpoint should be displayed when, even when such settings come with limited labels. We demonstrate our idea on a variety of real-world videos from HowTo100M and Ego-Exo4D, and rigorously validate its advantages. Project:…
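
As a rough illustration of the pretext task described in the abstract, the sketch below takes per-segment viewpoint pseudo-labels and extracts the view-switch moments they imply. It is not the authors' implementation: the function name `view_switch_moments`, the string labels `"ego"` and `"exo"`, and the fixed segment length are all assumptions made for this example, and the step that produces the pseudo-labels in the first place is not shown.

```python
from typing import List, Tuple


def view_switch_moments(
    segment_views: List[str], segment_seconds: float = 1.0
) -> List[Tuple[float, str, str]]:
    """Return (time, from_view, to_view) for each change in pseudo-labeled view.

    segment_views: hypothetical per-segment pseudo-labels, "ego" or "exo".
    segment_seconds: assumed fixed duration of each segment, in seconds.
    """
    switches = []
    for i in range(1, len(segment_views)):
        prev_view, curr_view = segment_views[i - 1], segment_views[i]
        if prev_view != curr_view:
            # The switch happens at the start of segment i.
            switches.append((i * segment_seconds, prev_view, curr_view))
    return switches


# Hypothetical pseudo-labels for six consecutive 2-second segments.
labels = ["exo", "exo", "ego", "ego", "ego", "exo"]
moments = view_switch_moments(labels, segment_seconds=2.0)
print(moments)
```

Switch moments of this kind could then serve as training targets for a predictor that looks at the visual and spoken content around each timepoint; how that predictor is built is beyond what this listing states.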

Taxonomy

Topics: Video Analysis and Summarization · Advanced Image and Video Retrieval Techniques · Video Coding and Compression Technologies