OpenVTON-Bench: A Large-Scale High-Resolution Benchmark for Controllable Virtual Try-On Evaluation

Jin Li; Tao Chen; Shuai Jiang; Weijie Wang; Jingwen Luo; Chenhui Wu

arXiv:2601.22725 · cs.CV · May 7, 2026

TL;DR

OpenVTON-Bench is a comprehensive high-resolution benchmark dataset and evaluation protocol for virtual try-on systems, addressing the need for reliable, fine-grained assessment of visual quality and semantic consistency.

Contribution

The paper introduces a large-scale, high-resolution dataset with a novel multi-modal evaluation protocol combining semantic reasoning and multi-scale metrics for virtual try-on evaluation.

Findings

01. The evaluation protocol correlates strongly with human judgment (Kendall's τ of 0.833).

02. The dataset covers 20 fine-grained garment categories with approximately 100K high-resolution image pairs.

03. The proposed metrics effectively distinguish boundary errors from texture artifacts.
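
For readers unfamiliar with the statistic in the first finding, the following is a minimal sketch of how Kendall's τ measures rank agreement between an automatic metric and human ratings. The function name `kendall_tau` and the example scores are hypothetical and are not taken from the paper. The sketch implements the simple tau-a variant, which assumes no tied scores, and the paper's exact variant is not stated in this summary.

```python
from itertools import combinations


def kendall_tau(metric_scores, human_scores):
    """Kendall's tau-a: (concordant - discordant) / total pairs.

    A pair of items is concordant when both score lists order the two
    items the same way, and discordant when they order them oppositely.
    Tied pairs count as neither (tau-a does not correct for ties).
    """
    if len(metric_scores) != len(human_scores):
        raise ValueError("score lists must have the same length")
    n = len(metric_scores)
    if n < 2:
        raise ValueError("need at least two items to compare")

    concordant = 0
    discordant = 0
    for i, j in combinations(range(n), 2):
        # The sign of the product tells us whether both lists agree
        # on the relative order of items i and j.
        sign = (metric_scores[i] - metric_scores[j]) * (
            human_scores[i] - human_scores[j]
        )
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1

    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical scores for four try-on systems (illustration only).
metric = [0.91, 0.85, 0.78, 0.60]
human = [4.5, 4.1, 4.3, 2.0]
tau = kendall_tau(metric, human)
```

In this toy example, five of the six item pairs are ordered the same way by both lists and one pair is reversed, so τ = (5 - 1) / 6 ≈ 0.667. A value of 1 would mean the metric ranks every pair exactly as the human raters do.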

Abstract

Recent advances in diffusion models have significantly elevated the visual fidelity of Virtual Try-On (VTON) systems, yet reliable evaluation remains a persistent bottleneck. Traditional metrics struggle to quantify fine-grained texture details and semantic consistency, while existing datasets fail to meet commercial standards in scale and diversity. We present OpenVTON-Bench, a large-scale benchmark comprising approximately 100K high-resolution image pairs (up to $1536 \times 1536$). The dataset is constructed using DINOv3-based hierarchical clustering for semantically balanced sampling and Gemini-powered dense captioning, ensuring a uniform distribution across 20 fine-grained garment categories. To support reliable evaluation, we propose a multi-modal protocol that measures VTON quality along five interpretable dimensions: background consistency, identity fidelity, texture fidelity,…
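
The abstract mentions cluster-based, semantically balanced sampling but this summary does not describe the procedure. As a generic illustration of the idea only, the sketch below draws an equal quota from each cluster once cluster labels are available. The function `balanced_sample`, its parameters, and the toy labels are all hypothetical, and the feature extraction and hierarchical clustering steps are assumed to have happened upstream.

```python
import random
from collections import defaultdict


def balanced_sample(cluster_labels, per_cluster, seed=0):
    """Pick up to `per_cluster` item indices from every cluster.

    `cluster_labels[i]` is the (assumed precomputed) cluster label of
    item i. Clusters smaller than the quota contribute all members.
    Returns the selected indices in ascending order.
    """
    rng = random.Random(seed)  # fixed seed for reproducible draws

    # Group item indices by their cluster label.
    members_by_cluster = defaultdict(list)
    for index, label in enumerate(cluster_labels):
        members_by_cluster[label].append(index)

    selected = []
    for label in sorted(members_by_cluster):
        members = members_by_cluster[label]
        quota = min(per_cluster, len(members))
        selected.extend(rng.sample(members, quota))
    return sorted(selected)


# Toy labels: one large cluster and two small ones (illustration only).
labels = ["dress"] * 10 + ["coat"] * 3 + ["scarf"] * 1
picked = balanced_sample(labels, per_cluster=2)
picked_labels = [labels[i] for i in picked]
```

With a quota of two, the over-represented "dress" cluster is cut down to two items while the single "scarf" item is kept, which flattens the category distribution relative to uniform random sampling.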
