TL;DR
This paper introduces a Representation Invariance Loss (RIL) for oriented object detection that addresses representation ambiguity, leading to improved regression accuracy and consistency across datasets.
Contribution
The paper proposes RIL, a loss function that treats the multiple representations of an oriented object as equivalent local minima and uses Hungarian matching for optimal regression, improving detection performance.
Findings
Achieves significant improvement on remote sensing and scene text datasets.
Addresses representation ambiguity with a novel loss function.
Demonstrates consistent performance gains across multiple datasets.
Abstract
Arbitrary-oriented objects exist widely in natural scenes, and thus oriented object detection has received extensive attention in recent years. Mainstream rotation detectors use oriented bounding boxes (OBB) or quadrilateral bounding boxes (QBB) to represent rotating objects. However, these methods suffer from representation ambiguity in the definition of oriented objects, which leads to suboptimal regression optimization and to inconsistency between the loss metric and the localization accuracy of the predictions. In this paper, we propose a Representation Invariance Loss (RIL) to optimize bounding box regression for rotating objects. Specifically, RIL treats multiple representations of an oriented object as multiple equivalent local minima, and hence transforms bounding box regression into an adaptive matching process with these local minima. Then, the Hungarian…
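The representation ambiguity described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract is cut off before it explains how Hungarian matching is applied, so the code assumes vertex-level matching on a QBB with an L1 cost. The names `l1`, `naive_loss`, and `matched_loss` are hypothetical, and the optimal assignment is found by brute force over the 24 vertex permutations, which has the same minimum cost that the Hungarian algorithm would return for a 4x4 cost matrix.

```python
from itertools import permutations


def l1(p, q):
    """L1 distance between two 2-D points."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def naive_loss(pred, gt):
    """Order-sensitive loss: vertex i of pred is compared with vertex i of gt."""
    return sum(l1(p, g) for p, g in zip(pred, gt))


def matched_loss(pred, gt):
    """Order-invariant loss: minimum total L1 cost over all one-to-one
    vertex assignments (brute force; an assumption standing in for
    Hungarian matching, which yields the same minimum cost)."""
    best_cost, best_perm = float("inf"), None
    for perm in permutations(range(len(gt))):
        # perm[i] is the ground-truth vertex assigned to predicted vertex i.
        cost = sum(l1(pred[i], gt[j]) for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return best_cost, best_perm


# Ground-truth quadrilateral and a near-perfect prediction whose vertex
# list starts one corner later (an equivalent representation of the box).
gt = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]
pred = [(4.1, 0.0), (4.0, 2.1), (0.0, 1.9), (0.1, 0.0)]

naive = naive_loss(pred, gt)
cost, perm = matched_loss(pred, gt)
print("order-sensitive loss:", naive)
print("matched loss:", cost, "assignment:", perm)
```

In this example the order-sensitive loss is large even though the predicted box nearly coincides with the ground truth, while the matched loss stays small. That gap is the inconsistency between loss and localization accuracy that the abstract attributes to representation ambiguity.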
