Attr2Style: A Transfer Learning Approach for Inferring Fashion Styles via Apparel Attributes
Rajdeep Hazra Banerjee, Abhinav Ravi, Ujjal Kr Dutta

TL;DR
This paper introduces Attr2Style, a transfer learning model that generates fashion style captions from apparel images by leveraging attribute-based annotations, reducing the need for costly style-specific labels.
Contribution
It presents a novel transfer learning approach that uses attribute-based training to infer style captions, addressing annotation challenges in fashion image captioning.
Findings
The generated captions align closely with the actual style information.
The model effectively transfers attribute knowledge to style captioning.
Qualitative and quantitative evaluations demonstrate improved caption accuracy.
Abstract
Popular fashion e-commerce platforms mostly provide details about the low-level attributes of an apparel item (e.g., neck type, dress length, collar type) on their product detail pages. However, customers usually prefer to buy apparel based on its style information or, simply put, the occasion (e.g., party/sports/casual wear). Applying a supervised image-captioning model to generate style-based image captions is limited because ground-truth annotations in the form of style-based captions are difficult to obtain: annotating them requires a certain amount of fashion domain expertise and adds to cost and manual effort. In contrast, low-level attribute-based annotations are much more readily available. To address this issue, we propose a transfer-learning-based image captioning model that is trained on a source dataset with sufficient attribute-based…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Video Analysis and Summarization
