V-Trans4Style: Visual Transition Recommendation for Video Production Style Adaptation

Pooja Guhan; Tsung-Wei Huang; Guan-Ming Su; Subhadra Gopalakrishnan; Dinesh Manocha

arXiv:2501.07983·cs.CV·January 15, 2025

TL;DR

V-Trans4Style is a transformer-based algorithm that recommends visually seamless transitions for videos, so that a video can be adapted to different production styles. Its improved recommendation accuracy is demonstrated on the newly introduced AutoTransition++ dataset.

Contribution

The paper introduces a transformer-based model with style conditioning for adaptive video transition recommendation. The model outperforms existing methods, and the paper also provides a new dataset, AutoTransition++, for style-specific evaluation.

Findings

01. Outperforms state-of-the-art by 10-80% in Recall@K and mean rank

02. Style conditioning improves style similarity by around 12%

03. Introduces AutoTransition++ dataset with 6k videos categorized by style
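
The first finding is reported in Recall@K and mean rank. As a reminder of what these two retrieval metrics measure, here is a minimal sketch using their standard definitions. The function names, the score layout (one list of candidate-transition scores per query), and the tie-breaking rule are assumptions made for illustration; the paper's exact evaluation protocol is not given on this page.

```python
from typing import Sequence


def rank_of_target(scores: Sequence[float], target: int) -> int:
    """Return the 1-based rank of the target item, sorting by descending score."""
    target_score = scores[target]
    # Count candidates scored strictly higher than the target.
    # Assumption: ties are resolved in the target's favor.
    return 1 + sum(1 for s in scores if s > target_score)


def recall_at_k(all_scores: Sequence[Sequence[float]],
                targets: Sequence[int], k: int) -> float:
    """Fraction of queries whose ground-truth item is ranked in the top k."""
    hits = sum(
        1 for scores, t in zip(all_scores, targets)
        if rank_of_target(scores, t) <= k
    )
    return hits / len(targets)


def mean_rank(all_scores: Sequence[Sequence[float]],
              targets: Sequence[int]) -> float:
    """Average 1-based rank of the ground-truth item (lower is better)."""
    ranks = [rank_of_target(s, t) for s, t in zip(all_scores, targets)]
    return sum(ranks) / len(ranks)


# Toy example: three queries, three hypothetical candidate transitions each.
toy_scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
toy_targets = [1, 1, 0]  # index of the ground-truth transition per query
```

Higher Recall@K and lower mean rank both indicate that the ground-truth transition is placed closer to the top of the recommended list.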

Abstract

We introduce V-Trans4Style, an innovative algorithm tailored for dynamic video content editing needs. It is designed to adapt videos to different production styles, such as documentaries, dramas, feature films, or a specific YouTube channel's video-making technique. Our algorithm recommends optimal visual transitions to help achieve this flexibility using a more bottom-up approach. We first employ a transformer-based encoder-decoder network that learns to recommend temporally consistent and visually seamless sequences of visual transitions using only the input videos. We then introduce a style conditioning module that leverages this model to iteratively adjust the visual transitions obtained from the decoder through activation maximization. We demonstrate the efficacy of our method through experiments conducted on our newly introduced AutoTransition++ dataset. It is a 6k video version of…
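
The abstract says the style conditioning module iteratively adjusts the decoder's output through activation maximization. The sketch below illustrates only the general technique: gradient ascent on an input vector so that a fixed classifier assigns a higher probability to a target class. Everything specific here is an assumption for illustration and is not taken from the paper: the linear softmax "style classifier" `W`, the vector `x0` standing in for a transition representation, and the step count and learning rate.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def style_probs(W: List[List[float]], x: List[float]) -> List[float]:
    """Hypothetical linear style classifier: p = softmax(W x)."""
    logits = [sum(w * v for w, v in zip(row, x)) for row in W]
    return softmax(logits)


def activation_maximization(W: List[List[float]], x0: List[float],
                            target_style: int, steps: int = 50,
                            lr: float = 0.5) -> List[float]:
    """Gradient ascent on x to raise log p(target_style | x); W stays fixed."""
    x = list(x0)
    for _ in range(steps):
        p = style_probs(W, x)
        # Gradient of log softmax: d/dx log p[c] = W[c] - sum_j p[j] * W[j]
        grad = [
            W[target_style][d] - sum(p[j] * W[j][d] for j in range(len(W)))
            for d in range(len(x))
        ]
        x = [v + lr * g for v, g in zip(x, grad)]
    return x


# Toy setup: two styles, two-dimensional representation (all values illustrative).
W_toy = [[1.0, 0.0], [0.0, 1.0]]
x_start = [1.0, 0.0]  # initially favors style 0
x_adapted = activation_maximization(W_toy, x_start, target_style=1)
```

In this toy setup the adjusted vector `x_adapted` receives a higher probability for style 1 than `x_start` does, which is the sense in which the input has been pushed toward the target style. How the paper applies the idea to sequences of transitions is not described in the truncated abstract.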

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Human Motion and Animation · Video Analysis and Summarization