DM-VTON: Distilled Mobile Real-time Virtual Try-On
Khoi-Nguyen Nguyen-Ngoc, Thanh-Tung Phan-Nguyen, Khanh-Duy Le, Tam V. Nguyen, Minh-Triet Tran, and Trung-Nghia Le

TL;DR
DM-VTON is a lightweight, real-time virtual try-on framework that uses knowledge distillation and an efficient mobile module to deliver high-quality results at 40 fps on an Nvidia Tesla T4 GPU, enabling practical AR shopping experiences.
Contribution
The paper introduces DM-VTON, a novel, efficient virtual try-on system that combines knowledge distillation with a mobile generator to achieve real-time performance without human parsing.
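As a rough illustration of the Teacher-Student idea, the sketch below combines a reconstruction term with a distillation term that pulls the Student's output toward the Teacher's. This summary does not state the paper's actual loss functions, so the L1 distance, the `distill_weight` parameter, and the function names are all assumptions made for the example.

```python
from typing import Sequence


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute difference between two equally sized flat sequences."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def student_loss(
    student_out: Sequence[float],
    teacher_out: Sequence[float],
    target: Sequence[float],
    distill_weight: float = 0.5,  # hypothetical weight, not taken from the paper
) -> float:
    """Reconstruction loss plus a weighted distillation loss (assumed form)."""
    # How far the Student's output is from the ground-truth image.
    reconstruction = l1_distance(student_out, target)
    # How far the Student's output is from the Teacher's output.
    distillation = l1_distance(student_out, teacher_out)
    return reconstruction + distill_weight * distillation


if __name__ == "__main__":
    # Toy flattened "images" with two pixels each.
    print(student_loss([1.0, 1.0], [0.0, 1.0], [1.0, 1.0]))
```

In this toy setup only the Student is trained; the Teacher's output is treated as a fixed supervision signal, which mirrors the general distillation pattern described above rather than any specific detail of DM-VTON.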
Findings
Achieves 40 fps on an Nvidia Tesla T4 GPU
Consumes only 37 MB of memory
Maintains high output quality comparable to state-of-the-art methods
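To put the 40 fps figure in context, a throughput of 40 frames per second leaves a budget of 1000 / 40 = 25 ms per frame. The sketch below shows that arithmetic and one simple way to time an inference callable; `run_once` is a hypothetical stand-in for a single model forward pass, and the timing loop is an assumption for illustration, not the paper's benchmarking protocol.

```python
import time
from typing import Callable


def frame_budget_ms(fps: float) -> float:
    """Per-frame time budget in milliseconds for a target frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def measure_fps(run_once: Callable[[], None], n_frames: int = 50) -> float:
    """Average frames per second over n_frames calls of run_once."""
    if n_frames <= 0:
        raise ValueError("n_frames must be positive")
    start = time.perf_counter()
    for _ in range(n_frames):
        run_once()  # hypothetical: one try-on inference on one frame
    elapsed = time.perf_counter() - start
    # Guard against a zero elapsed time on very coarse clocks.
    return n_frames / max(elapsed, 1e-9)


if __name__ == "__main__":
    print(frame_budget_ms(40))  # 25.0
    print(measure_fps(lambda: time.sleep(0.001), n_frames=20))
```

When timing a real GPU model, warm-up iterations and device synchronization before reading the clock are usually needed for the measurement to be meaningful; both are omitted here for brevity.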
Abstract
The fashion e-commerce industry has witnessed significant growth in recent years, prompting the exploration of image-based virtual try-on techniques to incorporate Augmented Reality (AR) experiences into online shopping platforms. However, existing research has largely overlooked a crucial aspect: the runtime of the underlying machine-learning model. While existing methods prioritize enhancing output quality, they often disregard execution time, which restricts their application to a limited range of devices. To address this gap, we propose Distilled Mobile Real-time Virtual Try-On (DM-VTON), a novel virtual try-on framework designed to achieve simplicity and efficiency. Our approach is based on a knowledge distillation scheme that leverages a strong Teacher network as supervision to guide a Student network without relying on human parsing. Notably, we introduce an efficient Mobile…
Taxonomy
Topics: Advanced Vision and Imaging · Augmented Reality Applications · Human Pose and Action Recognition
Methods: Knowledge Distillation
