VICTOR: Visual Incompatibility Detection with Transformers and Fashion-specific contrastive pre-training

Stefanos-Iordanis Papadopoulos; Christos Koutlis; Symeon Papadopoulos; Ioannis Kompatsiaris

arXiv:2207.13458 · cs.CV · September 9, 2022 · 1 citation

TL;DR

VICTOR is a transformer-based model that detects visual incompatibility in fashion outfits. Using fashion-specific contrastive pre-training and a new dataset, it surpasses the state of the art on Polyvore datasets while cutting floating-point operations by 88%.

Contribution

The paper introduces VICTOR, a novel transformer architecture for fashion compatibility detection, and a new dataset, Polyvore-MISFITs, improving accuracy and efficiency over existing methods.

Findings

01 VICTOR surpasses the state of the art on Polyvore datasets.

02 Reduces floating-point operations by 88% compared to previous models.

03 Effective in both overall compatibility regression and item mismatch detection.

Abstract

For fashion outfits to be considered aesthetically pleasing, the garments that constitute them need to be compatible in terms of visual aspects, such as style, category and color. Previous works have defined visual compatibility as a binary classification task, with the items in an outfit being considered either fully compatible or fully incompatible. However, this is not applicable to Outfit Maker applications, where users create their own outfits and need to know which specific items may be incompatible with the rest of the outfit. To address this, we propose the Visual InCompatibility TransfORmer (VICTOR), which is optimized for two tasks: 1) overall compatibility as regression and 2) the detection of mismatching items. We also utilize fashion-specific contrastive language-image pre-training for fine-tuning computer vision neural networks on fashion imagery. We build upon the Polyvore outfit…
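The two tasks named in the abstract (an outfit-level compatibility score and per-item mismatch detection) can be pictured as one combined training objective. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `multitask_loss`, the squared-error and binary cross-entropy terms, the [0, 1] score range, and the weight `alpha` are all hypothetical choices made for illustration.

```python
import math


def multitask_loss(pred_score, true_score, item_probs, item_labels, alpha=1.0):
    """Combine an outfit-level regression loss with a per-item mismatch loss.

    Hypothetical sketch; the loss forms and `alpha` are assumptions.
    pred_score, true_score: overall compatibility, assumed to lie in [0, 1].
    item_probs: predicted probability that each item is a mismatch.
    item_labels: 1 if the item is a mismatch, else 0.
    alpha: assumed weight balancing the two terms.
    """
    if not item_probs or len(item_probs) != len(item_labels):
        raise ValueError("item_probs and item_labels must be non-empty and equal in length")

    # Task 1: squared error on the overall compatibility score.
    regression = (pred_score - true_score) ** 2

    # Task 2: mean binary cross-entropy over items, clamped to avoid log(0).
    eps = 1e-7
    bce = 0.0
    for p, y in zip(item_probs, item_labels):
        p = min(max(p, eps), 1.0 - eps)
        bce -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    bce /= len(item_probs)

    return regression + alpha * bce


# Usage: a four-item outfit where the third item is labeled as the mismatch.
loss = multitask_loss(
    pred_score=0.70,
    true_score=0.75,
    item_probs=[0.1, 0.2, 0.8, 0.1],
    item_labels=[0, 0, 1, 0],
)
```

Keeping the per-item term separate from the outfit-level term is what would let an application point at a specific garment rather than only rating the outfit as a whole.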

Code & Models

Repositories

stevejpapad/visual-incompatibility-transformer (PyTorch, official)

Taxonomy

Topics: Visual Attention and Saliency Detection · Image Enhancement Techniques · Generative Adversarial Networks and Image Synthesis