Explore Intrinsic Geometry for Query-based Tiny and Oriented Object Detector with Momentum-based Bipartite Matching
Junpeng Zhang, Zewei Yang, Jie Feng, Yuhui Zheng, Ronghua Shang, Mengxuan Zhang

TL;DR
This paper introduces IGOFormer, a novel query-based oriented object detector that incorporates intrinsic geometry and momentum-based matching to improve tiny and arbitrarily oriented object detection, especially in aerial imagery.
Contribution
The paper proposes a new detector that explicitly models intrinsic geometry and stabilizes inter-stage matching, enhancing performance on tiny, oriented objects.
Findings
IGOFormer achieves 78.00% AP50 on DOTA-V1.0 with a Swin-T backbone.
Outperforms existing methods in aerial oriented object detection.
Demonstrates the effectiveness of geometric embeddings and momentum-based matching.
Abstract
Recent query-based detectors have achieved remarkable progress, yet their performance remains constrained when handling objects with arbitrary orientations, especially for tiny objects capturing limited texture information. This limitation primarily stems from the underutilization of intrinsic geometry during pixel-based feature decoding and the occurrence of inter-stage matching inconsistency caused by stage-wise bipartite matching. To tackle these challenges, we present IGOFormer, a novel query-based oriented object detector that explicitly integrates intrinsic geometry into feature decoding and enhances inter-stage matching stability. Specifically, we design an Intrinsic Geometry-aware Decoder, which enhances the object-related features conditioned on an object query by injecting complementary geometric embeddings extrapolated from their correlations to capture the geometric layout…
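The inter-stage inconsistency mentioned in the abstract can be illustrated with a small sketch. The excerpt above does not give the paper's actual momentum formulation, so the code below makes an assumption: each decoder stage matches on an exponential moving average of the matching costs from earlier stages. The names `min_cost_assignment`, `momentum_matching`, and `STAGE_COSTS`, as well as the cost values, are hypothetical, and an exhaustive search stands in for the Hungarian algorithm to keep the example dependency-free.

```python
from itertools import permutations


def min_cost_assignment(cost):
    """Exhaustive one-to-one matching (a stand-in for the Hungarian algorithm).

    cost[q][t] is the cost of assigning query q to target t.
    Returns a list whose t-th entry is the query matched to target t.
    """
    n_queries, n_targets = len(cost), len(cost[0])
    best, best_total = None, float("inf")
    for perm in permutations(range(n_queries), n_targets):
        total = sum(cost[q][t] for t, q in enumerate(perm))
        if total < best_total:
            best, best_total = perm, total
    return list(best)


def momentum_matching(stage_costs, momentum=0.5):
    """Match at every decoder stage on a running average of the costs.

    Assumption: the smoothing is a plain exponential moving average;
    momentum=0.0 reduces to independent stage-wise matching.
    """
    smoothed, assignments = None, []
    for cost in stage_costs:
        if smoothed is None:
            smoothed = [row[:] for row in cost]
        else:
            smoothed = [
                [momentum * s + (1.0 - momentum) * c for s, c in zip(s_row, c_row)]
                for s_row, c_row in zip(smoothed, cost)
            ]
        assignments.append(min_cost_assignment(smoothed))
    return assignments


# Hypothetical costs: 3 queries x 2 targets over 3 decoder stages.
STAGE_COSTS = [
    [[0.1, 0.9], [0.8, 0.2], [0.7, 0.7]],
    [[0.5, 0.9], [0.8, 0.2], [0.4, 0.7]],  # query 2 briefly looks best for target 0
    [[0.2, 0.9], [0.8, 0.2], [0.6, 0.7]],
]

print(momentum_matching(STAGE_COSTS, momentum=0.0))  # [[0, 1], [2, 1], [0, 1]]
print(momentum_matching(STAGE_COSTS, momentum=0.5))  # [[0, 1], [0, 1], [0, 1]]
```

With independent matching, target 0 switches from query 0 to query 2 at the second stage and back again at the third, so the supervision that query 0 receives changes between stages. With the smoothed costs, the assignment stays the same across all three stages.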
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications
