Beyond Audio and Pose: A General-Purpose Framework for Video Synchronization

Yosub Shin; Igor Molybog

arXiv:2506.15937·cs.CV·June 23, 2025

TL;DR

This paper introduces VideoSync, a versatile video synchronization framework that operates independently of specific feature extraction methods, providing a new, more generalizable approach with rigorous evaluation and improved performance.

Contribution

The work presents VideoSync, a general-purpose video synchronization framework evaluated on diverse datasets, correcting biases in prior methods and establishing reproducible benchmarks.

Findings

01 · VideoSync outperforms existing methods like SeSyn-Net under fair conditions.

02 · A CNN-based model is identified as the most effective for offset prediction.

03 · The framework is applicable across single-human, multi-human, and non-human scenarios.

Abstract

Video synchronization, the alignment of multiple video streams capturing the same event from different angles, is crucial for applications such as reality TV show production, sports analysis, surveillance, and autonomous systems. Prior work has heavily relied on audio cues or specific visual events, limiting applicability in diverse settings where such signals may be unreliable or absent. Additionally, existing benchmarks for video synchronization lack generality and reproducibility, restricting progress in the field. In this work, we introduce VideoSync, a video synchronization framework that operates independently of specific feature extraction methods, such as human pose estimation, enabling broader applicability across different content types. We evaluate our system on newly composed datasets covering single-human, multi-human, and non-human scenarios, providing both the methodology and code…
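To make the offset-prediction task concrete, here is a minimal sketch of a feature-agnostic baseline. It is not the VideoSync method or its CNN model. It assumes each video has already been reduced to one feature vector per frame by some extractor, and it picks the frame offset that maximizes the mean cosine similarity over the overlapping frames. The function name `estimate_offset`, the `max_offset` parameter, and the synthetic features are all hypothetical and chosen for illustration only.

```python
import numpy as np


def estimate_offset(feats_a, feats_b, max_offset):
    """Return the frame offset d such that feats_a[t + d] best matches feats_b[t].

    feats_a, feats_b: arrays of shape (num_frames, feature_dim). The features
    may come from any per-frame extractor (hypothetical; not specified here).
    """
    # L2-normalize each frame so the dot product below is a cosine similarity.
    a = feats_a / (np.linalg.norm(feats_a, axis=1, keepdims=True) + 1e-8)
    b = feats_b / (np.linalg.norm(feats_b, axis=1, keepdims=True) + 1e-8)

    best_offset, best_score = 0, -np.inf
    for d in range(-max_offset, max_offset + 1):
        # Frames t of b for which t + d is a valid frame index of a.
        start = max(0, -d)
        end = min(len(b), len(a) - d)
        if end <= start:
            continue  # no overlap at this candidate offset
        seg_a = a[start + d:end + d]
        seg_b = b[start:end]
        score = float(np.mean(np.sum(seg_a * seg_b, axis=1)))
        if score > best_score:
            best_offset, best_score = d, score
    return best_offset


# Synthetic check: two "cameras" observe the same random feature stream,
# with the second one starting 7 frames later.
rng = np.random.default_rng(0)
base = rng.normal(size=(120, 16))
feats_a = base[0:100]
feats_b = base[7:107]
print(estimate_offset(feats_a, feats_b, max_offset=20))
```

An exhaustive search like this only works when the two feature streams are directly comparable frame by frame. A learned offset predictor, such as the CNN-based model mentioned in the findings, would replace the similarity search rather than extend it.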

Code & Models

Repositories

videosyncai/videosync
PyTorch · Official

Taxonomy

Topics: Multimedia Communication and Technology · Video Analysis and Summarization · Subtitles and Audiovisual Media