Retro-Actions: Learning 'Close' by Time-Reversing 'Open' Videos
Will Price, Dima Damen

TL;DR
This paper explores how video transforms such as time-reversal and horizontal-flipping consistently maintain or change the class labels of videos, and how this label behaviour can be exploited for data augmentation and zero-shot learning.
Contribution
It introduces a general approach that, for a given video transform, discovers invariant classes, pairs of equivariant classes, and novel-generating classes, and demonstrates the utility of the resulting label transforms in zero-shot learning and data augmentation.
Findings
Time-reversed videos are perceived as realistic by humans.
Label transforms enable zero-shot learning of a class from transformed videos of its counterpart.
Naively using horizontal-flipping as data augmentation introduces errors.
Abstract
We investigate video transforms that result in class-homogeneous label-transforms. These are video transforms that consistently maintain or modify the labels of all videos in each class. We propose a general approach to discover invariant classes, whose transformed examples maintain their label; pairs of equivariant classes, whose transformed examples exchange their labels; and novel-generating classes, whose transformed examples belong to a new class outside the dataset. Label transforms offer additional supervision previously unexplored in video recognition benefiting data augmentation and enabling zero-shot learning opportunities by learning a class from transformed videos of its counterpart. Amongst such video transforms, we study horizontal-flipping, time-reversal, and their composition. We highlight errors in naively using horizontal-flipping as a form of data augmentation in…
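The three class categories described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `(T, H, W, C)` array layout, and the label map are all assumptions made for illustration. Only the open/close pairing comes from the paper's title; the "shake" and "drop" entries are hypothetical examples of an invariant and a novel-generating class.

```python
import numpy as np

# Hypothetical label transform for time-reversal (illustrative only).
# A value of None marks a novel-generating class: the transformed video
# belongs to a class outside the dataset's label set.
TIME_REVERSAL_LABELS = {
    "open": "close",   # equivariant pair (from the paper's title)
    "close": "open",
    "shake": "shake",  # assumed invariant class
    "drop": None,      # assumed novel-generating class
}


def time_reverse(video):
    """Reverse the frame order of a video shaped (T, H, W, C)."""
    return video[::-1]


def horizontal_flip(video):
    """Mirror every frame left-to-right along the width axis."""
    return video[:, :, ::-1]


def categorise(label_map):
    """Split classes into invariant, equivariant pairs, and novel-generating.

    A non-reciprocal mapping (a -> b without b -> a) is ignored here.
    """
    invariant, pairs, novel = [], set(), []
    for src, dst in label_map.items():
        if dst is None:
            novel.append(src)
        elif dst == src:
            invariant.append(src)
        elif label_map.get(dst) == src:
            pairs.add(tuple(sorted((src, dst))))
    return sorted(invariant), sorted(pairs), sorted(novel)


def augment(video, label, transform, label_map):
    """Return (transformed video, transformed label), or None when the
    transformed example falls outside the label set."""
    new_label = label_map.get(label)
    if new_label is None:
        return None
    return transform(video), new_label
```

Under these assumptions, calling `augment(video, "open", time_reverse, TIME_REVERSAL_LABELS)` yields a reversed clip labelled "close", which is the sense in which a class can be learned from transformed videos of its counterpart. Keeping the label map separate from the pixel transform makes it explicit that a transform is only safe as augmentation for classes whose label behaviour under it is known.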
