Garments2Look: A Multi-Reference Dataset for High-Fidelity Outfit-Level Virtual Try-On with Clothing and Accessories

Junyao Hu; Zhongwei Cheng; Waikeung Wong; Xingxing Zou

arXiv:2603.14153 · cs.CV · March 17, 2026

TL;DR

Garments2Look is a large-scale dataset for outfit-level virtual try-on. It supports research on full outfits with multiple garments and accessories, addressing the limited category coverage and outfit diversity of existing datasets.

Contribution

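This paper presents the first large-scale multimodal dataset for outfit-level virtual try-on (VTON). It covers diverse garment and accessory categories, provides detailed item and try-on annotations, and is built with a synthesis pipeline that uses automated filtering to control data quality. The paper also reports baseline evaluations.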

Findings

01. Current VTON methods struggle with full outfit try-on

02. Layering and styling inference remains challenging

03. Dataset enables future research on complex outfit virtual try-on

Abstract

Virtual try-on (VTON) has advanced single-garment visualization, yet real-world fashion centers on full outfits with multiple garments, accessories, fine-grained categories, layering, and diverse styling, all of which remain beyond current VTON systems. Existing datasets are category-limited and lack outfit diversity. We introduce Garments2Look, the first large-scale multimodal dataset for outfit-level VTON, comprising 80K many-garments-to-one-look pairs across 40 major categories and 300+ fine-grained subcategories. Each pair includes an outfit with 3-12 reference garment images (4.48 on average), an image of a model wearing the outfit, and detailed item and try-on textual annotations. To balance authenticity and diversity, we propose a synthesis pipeline that heuristically constructs outfit lists before generating try-on results, with the entire process subjected to strict automated filtering…
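
As a rough illustration of the pair structure the abstract describes, one many-garments-to-one-look sample could be modeled as below. This is only a sketch: the class names, field names, and helper function are hypothetical and are not taken from the released dataset. Only the 3-12 garment range comes from the abstract.

```python
from dataclasses import dataclass
from typing import List

# Bounds on reference garments per outfit, as stated in the abstract.
MIN_GARMENTS = 3
MAX_GARMENTS = 12


@dataclass
class GarmentItem:
    """One reference garment or accessory (all field names are hypothetical)."""
    image_path: str    # path to the reference garment image
    category: str      # major category label
    subcategory: str   # fine-grained subcategory label
    description: str   # item-level textual annotation


@dataclass
class OutfitPair:
    """One many-garments-to-one-look sample (hypothetical layout)."""
    garments: List[GarmentItem]
    look_image_path: str      # image of a model wearing the full outfit
    tryon_description: str    # try-on textual annotation

    def __post_init__(self) -> None:
        # Reject samples outside the 3-12 garment range from the abstract.
        n = len(self.garments)
        if not MIN_GARMENTS <= n <= MAX_GARMENTS:
            raise ValueError(
                f"expected {MIN_GARMENTS}-{MAX_GARMENTS} garments, got {n}"
            )


def average_garments(pairs: List[OutfitPair]) -> float:
    """Mean number of reference garments per outfit across the given pairs."""
    return sum(len(p.garments) for p in pairs) / len(pairs)
```

A loader built this way would fail fast on malformed samples, and `average_garments` gives a quick check of a loaded split against the reported average of 4.48 garments per outfit.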

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques · 3D Shape Modeling and Analysis