OmniVTON: Training-Free Universal Virtual Try-On

Zhaotong Yang; Yuhui Li; Shengfeng He; Xinzhe Li; Yangyang Xu; Junyu Dong; Yong Du

arXiv:2507.15037 · cs.CV · July 22, 2025

TL;DR

OmniVTON is a training-free virtual try-on framework that decouples garment and pose conditioning to transfer garments with high fidelity across diverse scenarios, including scenes with multiple people.

Contribution

The paper introduces a training-free, universal VTON method that decouples garment and pose constraints, which lets it handle multiple conditions and multi-human scenarios.

Findings

01. Achieves superior results across various datasets and garment types.

02. First framework capable of multi-human virtual try-on in a single scene.

03. Demonstrates high texture fidelity and pose accuracy without training.

Abstract

Image-based Virtual Try-On (VTON) techniques rely on either supervised in-shop approaches, which ensure high fidelity but struggle with cross-domain generalization, or unsupervised in-the-wild methods, which improve adaptability but remain constrained by data biases and limited universality. A unified, training-free solution that works across both scenarios remains an open challenge. We propose OmniVTON, the first training-free universal VTON framework that decouples garment and pose conditioning to achieve both texture fidelity and pose consistency across diverse settings. To preserve garment details, we introduce a garment prior generation mechanism that aligns clothing with the body, followed by a continuous boundary stitching technique to achieve fine-grained texture retention. For precise pose alignment, we utilize DDIM inversion to capture structural cues while suppressing texture…
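
For readers unfamiliar with the DDIM inversion mentioned in the abstract, the following is a minimal sketch of the generic deterministic (eta = 0) DDIM update run in both directions. It is not OmniVTON's implementation: the noise schedule, the function names (`make_alpha_bars`, `ddim_step`, `ddim_invert`, `ddim_sample`), and the toy noise predictor `toy_eps` are all assumptions made for illustration, and a real pipeline would use a pretrained diffusion model as the noise predictor.

```python
import numpy as np

def make_alpha_bars(num_steps=50, beta_start=1e-4, beta_end=2e-2):
    # Assumed linear beta schedule; returns cumulative products of (1 - beta).
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def ddim_step(x, eps, abar_from, abar_to):
    # Deterministic DDIM update: estimate the clean sample, then re-noise it
    # to the target noise level using the same predicted noise.
    x0_pred = (x - np.sqrt(1.0 - abar_from) * eps) / np.sqrt(abar_from)
    return np.sqrt(abar_to) * x0_pred + np.sqrt(1.0 - abar_to) * eps

def ddim_invert(x0, eps_model, alpha_bars):
    # Inversion: walk from the clean sample toward noise (level t -> t + 1).
    abars = np.concatenate([[1.0], alpha_bars])
    x = x0
    for t in range(len(alpha_bars)):
        x = ddim_step(x, eps_model(x, t), abars[t], abars[t + 1])
    return x

def ddim_sample(x_T, eps_model, alpha_bars):
    # Sampling: walk from noise back toward the clean sample (level t + 1 -> t).
    abars = np.concatenate([[1.0], alpha_bars])
    x = x_T
    for t in reversed(range(len(alpha_bars))):
        x = ddim_step(x, eps_model(x, t), abars[t + 1], abars[t])
    return x

# Toy stand-in for a noise predictor (hypothetical): it depends only on the
# step index, so inversion followed by sampling is an exact round trip.
rng = np.random.default_rng(0)
noise_table = rng.standard_normal((50, 4, 4))
toy_eps = lambda x, t: noise_table[t]

alpha_bars = make_alpha_bars(50)
x0 = rng.standard_normal((4, 4))
x_T = ddim_invert(x0, toy_eps, alpha_bars)
x0_rec = ddim_sample(x_T, toy_eps, alpha_bars)
print(np.allclose(x0_rec, x0, atol=1e-6))
```

With a learned noise predictor the prediction also depends on the current sample, so the round trip is only approximate. The inverted latent still carries the layout of the input image, which is the general property that makes inversion useful for preserving structure.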
