Video-ReTime: Learning Temporally Varying Speediness for Time Remapping
Simon Jenni, Markus Woodson, Fabian Caba Heilbron

TL;DR
This paper introduces a method for temporally remapping a video to a desired target duration: a neural network detects temporally varying playback speed, and an optimization selects the frame sub-sampling to agree with those predictions, so the re-timed video preserves natural dynamics.
Contribution
It presents a self-supervised neural model for localizing speed changes and an optimization technique for robust, precise video re-timing, outperforming prior methods in accuracy and efficiency.
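One way to picture the self-supervision is to build training clips whose playback speed changes partway through, so the per-frame speed label comes for free. The sketch below is only an illustration under assumptions: the function name `sample_varying_speed_indices`, the speed set `(1, 2, 4)`, and the single change point are hypothetical and are not taken from the paper.

```python
import numpy as np


def sample_varying_speed_indices(num_frames, clip_len, speeds=(1, 2, 4), seed=None):
    """Pick source-frame indices for one synthetic clip with a speed change.

    Hypothetical helper (not the paper's procedure). Returns a pair
    (indices, labels): `indices` are the source frames to read and
    `labels[i]` is the frame stride (playback speed) in effect at frame i.
    """
    if clip_len < 2:
        raise ValueError("clip_len must be at least 2")
    rng = np.random.default_rng(seed)

    # Frames before `change` use speed s1, frames from `change` on use s2.
    change = int(rng.integers(1, clip_len))
    s1, s2 = rng.choice(speeds, size=2, replace=False)
    labels = np.where(np.arange(clip_len) < change, s1, s2)

    # The stride at frame i decides how far to jump to reach frame i + 1.
    offsets = np.concatenate([[0], np.cumsum(labels[:-1])])
    span = int(offsets[-1])
    if span >= num_frames:
        raise ValueError("video too short for this clip length and speed set")

    # Random start so the whole clip stays inside the source video.
    start = int(rng.integers(0, num_frames - span))
    return start + offsets, labels


indices, labels = sample_varying_speed_indices(num_frames=300, clip_len=16, seed=0)
```

A model trained on such clips would predict `labels` from the frames at `indices`; no human annotation is involved because the labels follow from how the clip was sampled.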
Findings
More accurate detection of playback speed variations than prior approaches
Orders of magnitude more efficient than prior approaches
More robust re-timing on longer videos, with precise control over the target duration
Abstract
We propose a method for generating a temporally remapped video that matches the desired target duration while maximally preserving natural video dynamics. Our approach trains a neural network through self-supervision to recognize and accurately localize temporally varying changes in the video playback speed. To re-time videos, we (1) use the model to infer the slowness of individual video frames, and (2) optimize the temporal frame sub-sampling to be consistent with the model's slowness predictions. We demonstrate that this model can detect playback speed variations more accurately while also being orders of magnitude more efficient than prior approaches. Furthermore, we propose an optimization for video re-timing that enables precise control over the target duration and performs more robustly on longer videos than prior methods. We evaluate the model quantitatively on artificially…
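The second re-timing step, choosing which frames to keep given per-frame slowness, can be illustrated with a much simpler stand-in than the optimization the abstract describes. The sketch below uses inverse-CDF sampling: each source frame receives an output-time budget inversely proportional to its predicted slowness, and output frames are placed at evenly spaced quantiles of that budget. The function name `retime_indices` and the assumption that slowness scores are strictly positive are hypothetical choices for this example, not the paper's formulation.

```python
import numpy as np


def retime_indices(slowness, target_len):
    """Choose `target_len` source-frame indices from per-frame slowness scores.

    Simplified stand-in (inverse-CDF sampling), not the paper's optimization.
    Frames with higher slowness are assumed to tolerate more speed-up,
    so they receive a smaller share of the output frames.
    """
    slowness = np.asarray(slowness, dtype=float)
    if slowness.ndim != 1 or slowness.size == 0:
        raise ValueError("slowness must be a non-empty 1-D sequence")
    if np.any(slowness <= 0):
        raise ValueError("slowness scores are assumed to be strictly positive")
    if not 1 <= target_len <= slowness.size:
        raise ValueError("target_len must lie in [1, number of source frames]")

    # Output-time budget per source frame, normalized into a CDF.
    weight = 1.0 / slowness
    cdf = np.cumsum(weight)
    cdf /= cdf[-1]

    # Evenly spaced quantiles (bin centers) mapped back to source frames.
    quantiles = (np.arange(target_len) + 0.5) / target_len
    idx = np.searchsorted(cdf, quantiles, side="left")
    return np.minimum(idx, slowness.size - 1)


# First half looks like slow motion (slowness 4), second half like normal speed.
scores = np.concatenate([np.full(50, 4.0), np.full(50, 1.0)])
kept = retime_indices(scores, target_len=50)
```

Because the number of returned indices equals `target_len`, the output duration is fixed exactly. The indices are non-decreasing and may repeat when the scores are very uneven, which corresponds to holding a frame rather than skipping ahead.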
Taxonomy
Topics: Advanced Vision and Imaging · Human Pose and Action Recognition · Anomaly Detection Techniques and Applications
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
