TL;DR
This paper introduces Sequence Feature Alignment (SFA), a method for domain adaptive detection transformers. It improves cross-domain performance by aligning global and token-wise sequence features, and adds a bipartite matching loss to make the features more discriminative.
Contribution
The paper proposes SFA, which combines a domain query-based feature alignment (DQFA) module with a token-wise feature alignment (TDA) module, both designed specifically for detection transformers. It also introduces a bipartite matching loss to improve feature discriminability.
Findings
SFA outperforms state-of-the-art methods on three benchmarks.
Global and token-wise feature alignment reduces domain gaps effectively.
The bipartite matching loss enhances detection robustness.
Abstract
Detection transformers have recently shown promising object detection results and attracted increasing attention. However, how to develop effective domain adaptation techniques to improve their cross-domain performance remains unexplored and unclear. In this paper, we delve into this topic and empirically find that direct feature distribution alignment on the CNN backbone only brings limited improvements, as it does not guarantee domain-invariant sequence features in the transformer for prediction. To address this issue, we propose a novel Sequence Feature Alignment (SFA) method that is specially designed for the adaptation of detection transformers. Technically, SFA consists of a domain query-based feature alignment (DQFA) module and a token-wise feature alignment (TDA) module. In DQFA, a novel domain query is used to aggregate and align global context from the token sequence of both…
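The idea of a domain query aggregating global context from a token sequence, alongside a per-token alignment signal, can be illustrated with a rough sketch. This is not the authors' implementation: the attention pooling, the linear domain discriminator (`disc_w`, `disc_b`), the shared discriminator for both branches, and all function names are assumptions made for illustration. The adversarial training step, which would push the features to fool the discriminator, is omitted.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate_with_domain_query(domain_query, tokens):
    """Attention-pool a token sequence into one global vector.

    The query scores every token; the softmax weights decide how much
    each token contributes to the pooled (global context) feature.
    """
    scale = math.sqrt(len(domain_query))
    weights = softmax([dot(domain_query, t) / scale for t in tokens])
    dim = len(tokens[0])
    pooled = [sum(w * t[i] for w, t in zip(weights, tokens)) for i in range(dim)]
    return pooled, weights

def domain_bce(feature, disc_w, disc_b, is_target):
    """Binary cross-entropy of a hypothetical linear domain discriminator."""
    p = 1.0 / (1.0 + math.exp(-(dot(disc_w, feature) + disc_b)))
    p = min(max(p, 1e-7), 1.0 - 1e-7)  # clamp to avoid log(0)
    return -math.log(p) if is_target else -math.log(1.0 - p)

def sequence_alignment_losses(domain_query, tokens, disc_w, disc_b, is_target):
    """Return (global loss on the pooled feature, mean per-token loss).

    Assumption: one discriminator is reused for both branches to keep
    the sketch short; separate discriminators would be equally plausible.
    """
    pooled, _ = aggregate_with_domain_query(domain_query, tokens)
    global_loss = domain_bce(pooled, disc_w, disc_b, is_target)
    token_loss = sum(domain_bce(t, disc_w, disc_b, is_target) for t in tokens) / len(tokens)
    return global_loss, token_loss
```

With an all-zero query the attention weights are uniform, so the pooled vector is simply the mean of the tokens. A learned query would instead weight tokens by how informative they are about the domain, which is the role the abstract assigns to the domain query.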
