MMFL-Net: Multi-scale and Multi-granularity Feature Learning for Cross-domain Fashion Retrieval
Chen Bao, Xudong Zhang, Jiazhou Chen, Yongwei Miao

TL;DR
This paper introduces MMFL-Net, a novel multi-scale, multi-granularity feature learning network for cross-domain fashion retrieval, effectively bridging domain gaps and capturing detailed clothing features for improved retrieval accuracy.
Contribution
The paper proposes a unified framework that combines semantic-spatial feature fusion, a multi-branch architecture, and combined loss functions to enhance cross-domain fashion image retrieval.
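To make the multi-branch idea concrete, here is a minimal sketch of how per-branch embeddings could be fused into one descriptor and used to rank shop images for a consumer query. It is an illustration under stated assumptions, not the paper's implementation: the function names (`fuse_branches`, `rank_gallery`), the choice of concatenation with L2 normalization, and the toy vectors are all hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse_branches(global_feat, part_feat, local_feat):
    """Assumed fusion: normalize each branch, concatenate, normalize again."""
    fused = (
        l2_normalize(global_feat)
        + l2_normalize(part_feat)
        + l2_normalize(local_feat)
    )
    return l2_normalize(fused)


def cosine(a, b):
    """Dot product; equals cosine similarity because inputs are unit length."""
    return sum(x * y for x, y in zip(a, b))


def rank_gallery(query, gallery):
    """Return gallery ids sorted from most to least similar to the query."""
    scores = {item_id: cosine(query, emb) for item_id, emb in gallery.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy consumer query and two shop images (made-up 2-D branch features).
query = fuse_branches([1.0, 0.0], [0.5, 0.5], [0.0, 1.0])
gallery = {
    "shop_a": fuse_branches([0.9, 0.1], [0.6, 0.4], [0.1, 0.9]),
    "shop_b": fuse_branches([0.0, 1.0], [0.1, 0.9], [1.0, 0.0]),
}
ranking = rank_gallery(query, gallery)
```

Normalizing each branch before concatenation keeps any single granularity from dominating the similarity score; whether MMFL-Net weights its branches this way is not stated in the text above.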
Findings
Achieves significant improvements over state-of-the-art methods on the DeepFashion-C2S and Street2Shop datasets.
Effectively captures global, part-informed, and local features for robust clothing representation.
Jointly optimizes intra-class and inter-class distances for better discriminability.
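The joint optimization of intra-class and inter-class distances can be sketched as a weighted sum of two terms. The summary does not say which losses the paper combines, so the pairing below (a margin-based triplet term for inter-class separation plus a center-style term for intra-class compactness) is an assumption, as are the function names and the `margin` and `weight` values.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_term(anchor, positive, negative, margin=0.3):
    """Inter-class term: the negative must be farther than the positive by `margin`."""
    return max(0.0, sq_dist(anchor, positive) - sq_dist(anchor, negative) + margin)


def center_term(embedding, class_center):
    """Intra-class term: pull an embedding toward its class center."""
    return 0.5 * sq_dist(embedding, class_center)


def combined_loss(anchor, positive, negative, class_center, margin=0.3, weight=0.01):
    """Assumed combination: triplet term plus a down-weighted center term."""
    return (
        triplet_term(anchor, positive, negative, margin)
        + weight * center_term(anchor, class_center)
    )


anchor, positive, center = [0.0, 0.0], [0.0, 1.0], [0.0, 0.5]

# Easy negative: already far from the anchor, so only the center term remains.
easy = combined_loss(anchor, positive, [3.0, 0.0], center)

# Hard negative: closer to the anchor than the positive, so the triplet term fires.
hard = combined_loss(anchor, positive, [0.5, 0.0], center)
```

In the toy case the hard negative produces a much larger loss than the easy one, which is the behavior a joint objective of this kind is meant to have.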
Abstract
Instance-level image retrieval in fashion is a challenging problem of growing importance in real-world visual fashion search. Cross-domain fashion retrieval aims to match unconstrained customer images, used as queries, against photographs provided by retailers; the task is difficult because of the wide range of consumer-to-shop (C2S) domain discrepancies and because clothing images are vulnerable to various non-rigid deformations. To this end, we propose a novel multi-scale and multi-granularity feature learning network (MMFL-Net), which jointly learns global-local aggregation feature representations of clothing images in a unified framework, with the aim of training a cross-domain model for C2S fashion visual similarity. First, a new semantic-spatial feature fusion part is designed to bridge the semantic-spatial gap by applying top-down and bottom-up bidirectional…
