MMFL-Net: Multi-scale and Multi-granularity Feature Learning for Cross-domain Fashion Retrieval

Chen Bao; Xudong Zhang; Jiazhou Chen; Yongwei Miao

arXiv:2210.15128·cs.CV·October 28, 2022

TL;DR

This paper introduces MMFL-Net, a multi-scale, multi-granularity feature learning network for cross-domain fashion retrieval. It aims to bridge the consumer-to-shop domain gap and capture fine-grained clothing detail to improve retrieval accuracy.

Contribution

The paper proposes a unified framework with semantic-spatial feature fusion, multi-branch architecture, and combined loss functions to enhance cross-domain fashion image retrieval.
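As a rough illustration of what "multi-granularity" can mean in a multi-branch architecture, the sketch below pools one feature map at two granularities and concatenates the results. It is an assumption-laden toy, not the paper's design: horizontal stripe pooling is a common technique borrowed here for illustration, and the names `global_feature`, `stripe_features`, and `multi_granularity_descriptor` are hypothetical.

```python
from typing import List

Vector = List[float]
FeatureMap = List[List[Vector]]  # indexed as [row][col][channel]


def average_pool(cells: List[Vector]) -> Vector:
    """Average a list of equal-length channel vectors into one vector."""
    channels = len(cells[0])
    return [sum(cell[c] for cell in cells) / len(cells) for c in range(channels)]


def global_feature(fmap: FeatureMap) -> Vector:
    """Coarsest granularity: pool over every spatial position."""
    return average_pool([cell for row in fmap for cell in row])


def stripe_features(fmap: FeatureMap, num_stripes: int) -> List[Vector]:
    """Finer granularity (assumed, not from the paper): pool each horizontal stripe."""
    height = len(fmap)
    if height % num_stripes != 0:
        raise ValueError("height must be divisible by num_stripes")
    rows_per_stripe = height // num_stripes
    stripes = []
    for s in range(num_stripes):
        rows = fmap[s * rows_per_stripe:(s + 1) * rows_per_stripe]
        stripes.append(average_pool([cell for row in rows for cell in row]))
    return stripes


def multi_granularity_descriptor(fmap: FeatureMap, num_stripes: int = 2) -> Vector:
    """Concatenate the global vector with the per-stripe vectors."""
    descriptor = list(global_feature(fmap))
    for stripe in stripe_features(fmap, num_stripes):
        descriptor.extend(stripe)
    return descriptor


# Toy 4x2 feature map with 2 channels: top half is all [1, 0], bottom half all [0, 1].
toy_map = [[[1.0, 0.0]] * 2] * 2 + [[[0.0, 1.0]] * 2] * 2
descriptor = multi_granularity_descriptor(toy_map, num_stripes=2)
```

In this toy case the global vector averages away the difference between the top and bottom halves, while the stripe vectors keep it, which is the intuition behind combining coarse and fine branches.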

Findings

01

Reports a significant improvement over state-of-the-art methods on the DeepFashion-C2S and Street2Shop datasets.

02

Effectively captures global, part-informed, and local features for robust clothing representation.

03

Jointly optimizes intra-class and inter-class distances for better discriminability.
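To make the idea of jointly optimizing intra-class and inter-class distances concrete, here is a minimal sketch that adds a triplet-style margin term (pushing different items apart) to a compactness term (pulling same-item embeddings toward their class mean). The summary above does not name the paper's actual loss functions, so both terms, the `margin` and `weight` parameters, and all function names are assumptions chosen for illustration.

```python
import math
from collections import defaultdict
from typing import Hashable, Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two embeddings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_term(anchor: Vector, positive: Vector, negative: Vector,
                 margin: float) -> float:
    """Inter-class term: the negative should be farther than the positive by `margin`."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def compactness_term(embeddings: Sequence[Vector],
                     labels: Sequence[Hashable]) -> float:
    """Intra-class term: mean squared distance of each embedding to its class mean."""
    groups = defaultdict(list)
    for emb, label in zip(embeddings, labels):
        groups[label].append(emb)
    total = 0.0
    for members in groups.values():
        dim = len(members[0])
        center = [sum(m[d] for m in members) / len(members) for d in range(dim)]
        total += sum(euclidean(m, center) ** 2 for m in members)
    return total / len(embeddings)


def combined_loss(anchor: Vector, positive: Vector, negative: Vector,
                  embeddings: Sequence[Vector], labels: Sequence[Hashable],
                  margin: float = 1.0, weight: float = 0.5) -> float:
    """Weighted sum of the two terms; `weight` is a hypothetical balancing factor."""
    return (triplet_term(anchor, positive, negative, margin)
            + weight * compactness_term(embeddings, labels))


# Toy example: a street photo (anchor), the matching shop photo (positive),
# and a shop photo of a different item (negative) that is currently too close.
street, shop_match, shop_other = [0.0, 0.0], [3.0, 4.0], [0.0, 1.0]
batch = [[0.0, 0.0], [2.0, 0.0], [0.0, 4.0], [0.0, 6.0]]
batch_labels = ["dress", "dress", "coat", "coat"]
loss = combined_loss(street, shop_match, shop_other, batch, batch_labels)
```

Here the triplet term is positive because the wrong item sits closer to the query than the right one, and the compactness term penalizes spread within each item class; minimizing the sum addresses both at once.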

Abstract

Instance-level image retrieval in fashion is a challenging problem of increasing importance in real-world visual fashion search. Cross-domain fashion retrieval aims to match unconstrained customer images, used as queries, against photographs provided by retailers; however, it is a difficult task because of the wide range of consumer-to-shop (C2S) domain discrepancies and because clothing images are vulnerable to various non-rigid deformations. To this end, we propose a novel multi-scale and multi-granularity feature learning network (MMFL-Net), which jointly learns global-local aggregated feature representations of clothing images in a unified framework, with the aim of training a cross-domain model for C2S fashion visual similarity. First, a new semantic-spatial feature fusion part is designed to bridge the semantic-spatial gap by applying top-down and bottom-up bidirectional…
