Learning Transformations To Reduce the Geometric Shift in Object Detection
Vidit Vidit, Martin Engilberge, Mathieu Salzmann

TL;DR
This paper proposes a self-training method that learns geometric transformations to reduce the drop in object detection performance caused by geometric shifts, without requiring labeled target data or camera information.
Contribution
It introduces a novel self-training approach to learn geometric transformations that mitigate geometric domain shifts in object detection.
Findings
Improved detection performance under camera FoV changes
Enhanced robustness to viewpoint variations
Effective without labeled target data or camera details
Abstract
The performance of modern object detectors drops when the test distribution differs from the training one. Most of the methods that address this focus on object appearance changes caused by, e.g., different illumination conditions, or gaps between synthetic and real images. Here, by contrast, we tackle geometric shifts emerging from variations in the image capture process, or due to the constraints of the environment causing differences in the apparent geometry of the content itself. We introduce a self-training approach that learns a set of geometric transformations to minimize these shifts without leveraging any labeled data in the new domain, nor any information about the cameras. We evaluate our method on two different shifts, i.e., a camera's field of view (FoV) change and a viewpoint change. Our results evidence that learning geometric transformations helps detectors to perform…
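As a rough illustration of what a geometric transformation does to detections, the sketch below warps a bounding box through a 3x3 homography and back. The abstract does not say which family of transformations is learned, so the use of a homography here is an assumption. The function names `fov_scaling_homography` and `warp_box` are hypothetical, and the centre-zoom matrix is only a simple stand-in for an FoV change, not the authors' method.

```python
import numpy as np

def fov_scaling_homography(scale, width, height):
    """Zoom about the image centre by `scale` (assumed stand-in for an FoV change)."""
    cx, cy = width / 2.0, height / 2.0
    return np.array([
        [scale, 0.0, cx * (1.0 - scale)],
        [0.0, scale, cy * (1.0 - scale)],
        [0.0, 0.0, 1.0],
    ])

def warp_box(box, H):
    """Map an (x1, y1, x2, y2) box through H; return the enclosing axis-aligned box."""
    x1, y1, x2, y2 = box
    # Four corners in homogeneous coordinates, one corner per column (3x4).
    corners = np.array([
        [x1, x2, x2, x1],
        [y1, y1, y2, y2],
        [1.0, 1.0, 1.0, 1.0],
    ])
    warped = H @ corners
    warped = warped[:2] / warped[2]  # back to Cartesian coordinates
    xs, ys = warped
    return (xs.min(), ys.min(), xs.max(), ys.max())

# Warp a box into the transformed view, then map it back with the inverse.
H = fov_scaling_homography(2.0, 100, 100)
box_warped = warp_box((40.0, 40.0, 60.0, 60.0), H)
box_restored = warp_box(box_warped, np.linalg.inv(H))
```

In a pipeline of this kind, a detector would presumably run on the warped image and its boxes would be mapped back with the inverse transformation, as in the last line. That step is also an assumption about usage rather than something the abstract states.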
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Video Surveillance and Tracking Methods · Advanced Neural Network Applications
Methods: Test
