Seeing Fast and Slow: Learning the Flow of Time in Videos
Yen-Siang Wu, Rundong Luo, Jingsen Zhu, Tao Tu, Ali Farhadi, Matthew Wallingford, Yu-Chiang Frank Wang, Steve Marschner, Wei-Chiu Ma

TL;DR
This paper introduces models that perceive, estimate, and manipulate the flow of time in videos, enabling applications like speed detection, slow-motion dataset creation, and speed-conditioned video generation.
Contribution
The paper presents a self-supervised approach to temporal reasoning in videos. The learned models are then used to curate the largest slow-motion video dataset to date and to train new models for speed-conditioned video generation and temporal super-resolution.
Findings
Self-supervised models can detect speed changes and estimate playback speed.
These models make it possible to curate the largest slow-motion video dataset to date from noisy in-the-wild sources.
The curated data supports models for speed-conditioned video generation and temporal super-resolution.
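As a rough illustration of how playback-speed estimation can be posed as a self-supervised task, the sketch below resamples a clip at a known speed factor and uses that factor as a free label. This is a generic frame-resampling pretext task, not the paper's method: the summary only says the authors exploit multimodal cues and temporal structure. The function names (`resample_clip`, `make_speed_example`) and the candidate speed set are hypothetical, chosen for illustration only.

```python
import numpy as np


def resample_clip(frames, speed, clip_len):
    """Sample `clip_len` frames from `frames` at a given playback speed.

    speed > 1 skips frames (fast-forward); speed < 1 repeats frames,
    which is only a crude stand-in for real high-speed footage.
    `frames` is any array whose first axis indexes time.
    """
    idx = np.floor(np.arange(clip_len) * speed).astype(int)
    if idx[-1] >= len(frames):
        raise ValueError("source video too short for this speed and clip length")
    return frames[idx]


def make_speed_example(frames, speeds, clip_len, rng):
    """Pick a random speed from `speeds` and return (clip, class label).

    The label is the index into `speeds` (an assumed, illustrative set),
    so a classifier can be trained without manual annotation.
    """
    label = int(rng.integers(len(speeds)))
    return resample_clip(frames, speeds[label], clip_len), label
```

For example, `resample_clip(np.arange(32), 2.0, 4)` selects frame indices 0, 2, 4, 6, while a speed of 0.5 selects 0, 0, 1, 1. A classifier trained on such pairs learns to predict the speed class from appearance and motion alone; real slow-motion footage carries finer temporal detail than duplicated frames can imitate.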
Abstract
How can we tell whether a video has been sped up or slowed down? How can we generate videos at different speeds? Although videos have been central to modern computer vision research, little attention has been paid to perceiving and controlling the passage of time. In this paper, we study time as a learnable visual concept and develop models for reasoning about and manipulating the flow of time in videos. We first exploit the multimodal cues and temporal structure naturally present in videos to learn, in a self-supervised manner, to detect speed changes and estimate playback speed. We then show that these learned temporal reasoning models enable us to curate the largest slow-motion video dataset to date from noisy in-the-wild sources. Such slow-motion footage, typically filmed by high-speed cameras, contains substantially richer temporal detail than standard videos. Using this data, we…
