Cross-domain Image Retrieval with a Dual Attribute-aware Ranking Network
Junshi Huang, Rogerio S. Feris, Qiang Chen, Shuicheng Yan

TL;DR
This paper introduces a Dual Attribute-aware Ranking Network (DARN) for cross-domain clothing image retrieval, effectively bridging the gap between online shopping images and user photos by leveraging semantic attributes and a large-scale dataset.
Contribution
The paper proposes a novel dual-network architecture with attribute-guided learning and ranking constraints, along with a large dataset for training and evaluation.
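The "ranking constraints" mentioned above are not spelled out in this summary. As a rough illustration only, one common way to impose a ranking constraint on retrieval features is a triplet hinge loss, where a user-photo embedding (the anchor) should lie closer to the matching shop image (the positive) than to a non-matching one (the negative) by some margin. The function names, the squared Euclidean distance, and the margin value below are all assumptions for the sketch, not details taken from the paper.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length feature vectors."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_ranking_loss(anchor, positive, negative, margin=0.5):
    """Hinge loss that is zero once the positive is closer than the negative by `margin`.

    anchor:   embedding of a user photo (hypothetical output of one sub-network)
    positive: embedding of the matching shop image (hypothetical output of the other)
    negative: embedding of a non-matching shop image
    margin:   assumed value, chosen only for illustration
    """
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(0.0, margin + d_pos - d_neg)
```

With such a loss, a triplet that is already ranked correctly by a wide enough margin contributes nothing to training, so the gradient signal comes from the triplets the network still gets wrong.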
Findings
DARN roughly doubles top-20 retrieval accuracy compared with using pre-trained CNN features only (0.570 vs. 0.268)
Attribute-guided learning improves retrieval performance
Large-scale dataset enables effective network training
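To make the top-20 figure concrete, here is a minimal sketch of how a top-k retrieval accuracy could be computed. It assumes one common definition: a query counts as a hit if at least one relevant item appears among its first k results. The function name and input layout are hypothetical, and the paper's exact evaluation protocol may differ.

```python
def top_k_accuracy(ranked_ids, relevant_ids, k=20):
    """Fraction of queries with at least one relevant item in the top k results.

    ranked_ids:   list of ranked result lists, one per query (best match first)
    relevant_ids: list of sets of relevant item ids, one per query
    """
    if len(ranked_ids) != len(relevant_ids):
        raise ValueError("need one ranking and one relevant set per query")
    if not ranked_ids:
        return 0.0
    hits = 0
    for ranking, relevant in zip(ranked_ids, relevant_ids):
        # Only the first k retrieved items are inspected.
        if any(item in relevant for item in ranking[:k]):
            hits += 1
    return hits / len(ranked_ids)
```

For example, with two queries where only the first has its relevant item ranked third, `top_k_accuracy(..., k=3)` gives 0.5 while `k=2` gives 0.0.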
Abstract
We address the problem of cross-domain image retrieval, considering the following practical application: given a user photo depicting a clothing image, our goal is to retrieve the same or attribute-similar clothing items from online shopping stores. This is a challenging problem due to the large discrepancy between online shopping images, usually taken in ideal lighting/pose/background conditions, and user photos captured in uncontrolled conditions. To address this problem, we propose a Dual Attribute-aware Ranking Network (DARN) for retrieval feature learning. More specifically, DARN consists of two sub-networks, one for each domain, whose retrieval feature representations are driven by semantic attribute learning. We show that this attribute-guided learning is a key factor for retrieval accuracy improvement. In addition, to further align with the nature of the retrieval problem, we…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications
