Recovering Spatiotemporal Correspondence between Deformable Objects by Exploiting Consistent Foreground Motion in Video
Luca Del Pero, Susanna Ricco, Rahul Sukthankar, Vittorio Ferrari

TL;DR
This paper presents a method to recover spatiotemporal correspondences between deformable objects in videos by leveraging consistent foreground motion, enabling accurate alignment despite appearance variations.
Contribution
It introduces a motion-based approach that automatically finds and aligns deformable objects in videos using consistent motion cues, outperforming appearance-based methods.
Findings
Successfully aligns thousands of frame pairs in tiger and horse videos
Outperforms the popular SIFT Flow algorithm in accuracy
Effectively handles challenging deformable object scenarios
Abstract
Given unstructured videos of deformable objects (such as animals in the wild), we automatically recover spatiotemporal correspondences to map one object to another. While traditional methods based on appearance fail in such challenging conditions, we exploit consistency in object motion between instances. Our approach discovers pairs of short video intervals where the object moves in a consistent manner and uses these candidates as seeds for spatial alignment. We model the spatial correspondence between the point trajectories on the object in one interval and those in the other using a time-varying Thin Plate Spline deformation model. On a large dataset of tiger and horse videos, our method automatically aligns thousands of pairs of frames with high accuracy, and outperforms the popular SIFT Flow algorithm.
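To make the Thin Plate Spline (TPS) model mentioned in the abstract more concrete, the following is a minimal sketch of a standard 2D TPS fitted to point correspondences for a single time step. It is not the authors' implementation: the function names (`tps_kernel`, `fit_tps`, `apply_tps`), the `reg` smoothing parameter, and the idea of fitting one spline per time step are all assumptions made for illustration. The paper's time-varying formulation may couple the frames in ways this sketch does not capture.

```python
import numpy as np


def tps_kernel(r2):
    """TPS radial basis U(r) = r^2 log r, evaluated from squared distances.

    Uses the identity r^2 log r = 0.5 * r^2 * log(r^2) and sets U(0) = 0.
    """
    out = np.zeros_like(r2, dtype=float)
    mask = r2 > 0
    out[mask] = 0.5 * r2[mask] * np.log(r2[mask])
    return out


def fit_tps(src, dst, reg=0.0):
    """Fit a 2D TPS mapping src points (n, 2) onto dst points (n, 2).

    `reg` is a hypothetical smoothing weight; reg=0 interpolates exactly.
    Returns (w, a): non-affine weights (n, 2) and affine coefficients (3, 2).
    """
    n = src.shape[0]
    d2 = np.sum((src[:, None, :] - src[None, :, :]) ** 2, axis=-1)
    K = tps_kernel(d2) + reg * np.eye(n)
    P = np.hstack([np.ones((n, 1)), src])  # rows are [1, x, y]

    # Standard TPS linear system: [[K, P], [P^T, 0]] [w; a] = [dst; 0]
    L = np.zeros((n + 3, n + 3))
    L[:n, :n] = K
    L[:n, n:] = P
    L[n:, :n] = P.T
    rhs = np.zeros((n + 3, 2))
    rhs[:n] = dst

    params = np.linalg.solve(L, rhs)
    return params[:n], params[n:]


def apply_tps(src, w, a, pts):
    """Warp query points pts (m, 2) with a TPS fitted on control points src."""
    d2 = np.sum((pts[:, None, :] - src[None, :, :]) ** 2, axis=-1)
    U = tps_kernel(d2)
    P = np.hstack([np.ones((pts.shape[0], 1)), pts])
    return U @ w + P @ a


if __name__ == "__main__":
    # Toy correspondences standing in for trajectory points in one frame pair.
    src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
    dst = src + np.array([[0.0, 0.0], [0.1, 0.0], [0.0, -0.1], [0.05, 0.05], [0.0, 0.1]])
    w, a = fit_tps(src, dst)
    print(apply_tps(src, w, a, np.array([[0.25, 0.75]])))
```

Under these assumptions, a time-varying alignment could be approximated by calling `fit_tps` once per time step on the trajectory positions at that step. Setting `reg` above zero trades exact interpolation for a smoother warp, which may help when trajectories are noisy.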
Taxonomy
Topics: Human Pose and Action Recognition · Advanced Vision and Imaging · Human Motion and Animation
