Text-Phase Synergy Network with Dual Priors for Unsupervised Cross-Domain Image Retrieval

Jing Yang; Hui Xue; Shipeng Zhu; Pengfei Fang

arXiv:2603.12711·cs.CV·April 6, 2026

TL;DR

This paper introduces TPSNet, a novel unsupervised cross-domain image retrieval method that uses dual priors from CLIP-generated text prompts and domain-invariant phase features to enhance semantic guidance and domain alignment.

Contribution

The paper proposes TPSNet, which integrates text and phase priors to improve unsupervised cross-domain image retrieval beyond existing pseudo-label-based methods.

Findings

01. TPSNet outperforms state-of-the-art methods on unsupervised cross-domain image retrieval (UCDIR) benchmarks.

02. CLIP-based class prompts provide more accurate semantic supervision than clustering-derived pseudo-labels.

03. Domain-invariant phase features effectively bridge domain gaps while preserving semantics.
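The idea that Fourier phase carries structure while amplitude carries much of the domain-specific appearance can be illustrated with a minimal NumPy sketch. It is not TPSNet's implementation: the function names `phase_only` and `amplitude_swap` are hypothetical, and the inputs are assumed to be single-channel 2D arrays.

```python
import numpy as np


def phase_only(image):
    """Rebuild an image from its Fourier phase alone (amplitude set to 1).

    Hypothetical helper for illustration; not taken from the paper.
    """
    spectrum = np.fft.fft2(image)
    phase = np.angle(spectrum)
    # Unit amplitude everywhere, original phase kept.
    return np.real(np.fft.ifft2(np.exp(1j * phase)))


def amplitude_swap(content, style):
    """Combine the phase of `content` with the amplitude of `style`."""
    content_spectrum = np.fft.fft2(content)
    style_spectrum = np.fft.fft2(style)
    mixed = np.abs(style_spectrum) * np.exp(1j * np.angle(content_spectrum))
    return np.real(np.fft.ifft2(mixed))


rng = np.random.default_rng(0)
content = rng.random((16, 16))  # stand-in for an image from one domain
style = rng.random((16, 16))    # stand-in for an image from another domain

# A global contrast change rescales the amplitude but leaves the phase alone.
same_phase = np.allclose(phase_only(content), phase_only(3.0 * content))
swapped = amplitude_swap(content, style)
```

In this toy setting, `same_phase` is true because scaling an image by a positive constant scales only the amplitude spectrum. Real domain shifts (sketch versus photo, for example) are far less clean, so this shows only the intuition behind a phase prior.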

Abstract

This paper studies unsupervised cross-domain image retrieval (UCDIR), which aims to retrieve images of the same category across different domains without relying on labeled data. Existing methods typically utilize pseudo-labels, derived from clustering algorithms, as supervisory signals for intra-domain representation learning and cross-domain feature alignment. However, these discrete pseudo-labels often fail to provide accurate and comprehensive semantic guidance. Moreover, the alignment process frequently overlooks the entanglement between domain-specific and semantic information, leading to semantic degradation in the learned representations and ultimately impairing retrieval performance. This paper addresses these limitations by proposing a Text-Phase Synergy Network with Dual Priors (TPSNet). Specifically, we first employ CLIP to generate a set of class-specific prompts per domain,…
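To make the contrast between discrete pseudo-labels and text-derived guidance concrete, the sketch below turns similarities between image features and prompt embeddings into a soft distribution over classes. It is an illustration under stated assumptions, not the paper's procedure: the features are random placeholders rather than CLIP outputs, and `soft_prompt_targets` and the temperature value are hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def soft_prompt_targets(image_feats, prompt_feats, temperature=0.07):
    """Softmax over cosine similarities between images and class prompts.

    Hypothetical helper; the temperature is an assumed value.
    """
    sims = l2_normalize(image_feats) @ l2_normalize(prompt_feats).T
    logits = sims / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)
prompt_feats = rng.normal(size=(5, 32))  # placeholder for 5 class-prompt embeddings
image_feats = prompt_feats[[2, 0, 4]] + 0.1 * rng.normal(size=(3, 32))

soft = soft_prompt_targets(image_feats, prompt_feats)  # shape (3, 5), rows sum to 1
hard = soft.argmax(axis=1)                             # what a discrete label would keep
```

A hard label keeps only the `argmax`, whereas the soft rows also record how close each image is to every other class prompt.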
