Attribute-Guided Multi-Level Attention Network for Fine-Grained Fashion Retrieval
Ling Xiao, Toshihiko Yamasaki

TL;DR
This paper introduces AG-MAN, a novel attribute-guided multi-level attention network that enhances feature extraction for fine-grained fashion retrieval, addressing the feature gap problem and outperforming existing methods on multiple datasets.
Contribution
The paper proposes a new attribute-guided multi-level attention network with enhanced feature extraction and a classification scheme to improve fine-grained fashion retrieval accuracy.
Findings
Outperforms existing attention-based methods on the FashionAI, DeepFashion, and Zappos50k datasets.
Achieves up to a 2.12% improvement in MAP on the FashionAI dataset.
Effectively alleviates the feature gap problem in fine-grained fashion retrieval.
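For readers unfamiliar with the MAP (mean average precision) metric cited in the findings, the following is a minimal sketch of how it is commonly computed for retrieval. It is a generic illustration, not the paper's evaluation code. The function names are hypothetical, and the sketch assumes that each ranked list contains every relevant item for its query, so average precision is normalized by the number of relevant items found in the list.

```python
def average_precision(relevance):
    """Average precision for one query.

    relevance: list of 0/1 flags in ranked order, where 1 means the
    retrieved item shares the queried attribute value (assumed labeling).
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            # Precision at this rank, counted only at relevant positions.
            precision_sum += hits / rank
    # A query with no relevant results contributes 0 by convention here.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(ranked_relevance_lists):
    """Mean of per-query average precision over all queries."""
    aps = [average_precision(r) for r in ranked_relevance_lists]
    return sum(aps) / len(aps) if aps else 0.0


if __name__ == "__main__":
    # Two toy queries: relevant items at ranks 1 and 3, then at rank 2.
    queries = [[1, 0, 1], [0, 1]]
    print(round(mean_average_precision(queries), 4))
```

Under this definition, an improvement in MAP means that relevant items are, on average, ranked closer to the top of the retrieved list.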
Abstract
Fine-grained fashion retrieval searches for items that share a similar attribute with the query image. Most existing methods use a pre-trained feature extractor (e.g., ResNet-50) to capture image representations. However, a pre-trained feature backbone is typically trained for image classification and object detection, which are fundamentally different tasks from fine-grained fashion retrieval. Therefore, existing methods suffer from a feature gap problem when directly using the pre-trained backbone for fine-tuning. To solve this problem, we introduce an attribute-guided multi-level attention network (AG-MAN). Specifically, we first enhance the pre-trained feature extractor to capture multi-level image embeddings, thereby enriching the low-level features within these representations. Then, we propose a classification scheme where images with the same attribute, albeit with different…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · 3D Shape Modeling and Analysis · Image Retrieval and Classification Techniques
Methods: Triplet Loss
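The triplet loss named under Methods can be sketched in its generic form as follows. This is an illustration of the standard margin-based formulation, not necessarily the exact loss used in AG-MAN. The function names, the Euclidean distance, and the margin value of 0.2 are all assumptions made for the example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard margin-based triplet loss for one (anchor, positive, negative).

    The loss is zero once the negative is farther from the anchor than
    the positive by at least `margin` (an assumed hyperparameter value).
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


if __name__ == "__main__":
    anchor = [0.0, 0.0]
    same_attribute = [0.0, 1.0]   # distance 1 from the anchor
    other_attribute = [3.0, 4.0]  # distance 5 from the anchor
    print(triplet_loss(anchor, same_attribute, other_attribute))
```

In an attribute-specific retrieval setting, the positive would typically be an image sharing the anchor's value for the queried attribute and the negative an image with a different value, so minimizing the loss pulls same-attribute embeddings together.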
