Vanast: Virtual Try-On with Human Image Animation via Synthetic Triplet Supervision

Hyunsoo Cha; Wonjung Woo; Byungjun Kim; Hanbyul Joo

arXiv:2604.04934 · cs.CV · May 5, 2026

TL;DR

Vanast is a unified framework that generates identity-preserving, garment-transferred human animation videos from a single human image, garment images, and a pose guidance video, avoiding the identity drift, garment distortion, and front-back inconsistency common in two-stage pipelines.

Contribution

Vanast introduces a single-step model trained with synthetic triplet supervision, together with a dual-module architecture, to improve virtual try-on and animation quality.

Findings

01 Produces high-fidelity, identity-consistent animations

02 Supports zero-shot garment interpolation

03 Reduces garment distortion and inconsistency

Abstract

We present Vanast, a unified framework that generates garment-transferred human animation videos directly from a single human image, garment images, and a pose guidance video. Conventional two-stage pipelines treat image-based virtual try-on and pose-driven animation as separate processes, which often results in identity drift, garment distortion, and front-back inconsistency. Our model addresses these issues by performing the entire process in a single unified step to achieve coherent synthesis. To enable this setting, we construct large-scale triplet supervision. Our data generation pipeline includes generating identity-preserving human images in alternative outfits that differ from garment catalog images, capturing full upper and lower garment triplets to overcome the single-garment-posed video pair limitation, and assembling diverse in-the-wild triplets without requiring garment…
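The inputs named in the abstract (a single human image, one or more garment images, and a pose guidance video) can be pictured as one training sample paired with a target video. The sketch below is only an illustration of that layout and is not taken from the paper or its repository: the class name `TripletShapes`, the function `check_triplet`, the field names, and the channel-last tensor shapes are all hypothetical, and the exact composition of a "triplet" in Vanast may differ.

```python
from dataclasses import dataclass
from typing import List, Tuple

Shape = Tuple[int, ...]


@dataclass(frozen=True)
class TripletShapes:
    """Hypothetical shape record for one try-on animation sample (assumed layout)."""
    human_image: Shape           # (H, W, 3): reference person
    garment_images: List[Shape]  # one (H, W, 3) entry per garment, e.g. upper and lower
    pose_video: Shape            # (T, H, W, 3): pose guidance frames
    target_video: Shape          # (T, H, W, 3): person wearing the garments


def check_triplet(t: TripletShapes) -> bool:
    """Return True if the conditioning inputs and the target are mutually consistent."""
    # The human image must be a single RGB image.
    if len(t.human_image) != 3 or t.human_image[-1] != 3:
        return False
    # At least one garment image is required, each a single RGB image.
    if not t.garment_images:
        return False
    if any(len(g) != 3 or g[-1] != 3 for g in t.garment_images):
        return False
    # Both videos must be 4-D (frames, height, width, channels).
    if len(t.pose_video) != 4 or len(t.target_video) != 4:
        return False
    # Pose guidance and target video must have the same number of frames.
    return t.pose_video[0] == t.target_video[0]


sample = TripletShapes(
    human_image=(512, 384, 3),
    garment_images=[(512, 384, 3), (512, 384, 3)],
    pose_video=(16, 512, 384, 3),
    target_video=(16, 512, 384, 3),
)
print(check_triplet(sample))
```

Checking shapes up front like this is one simple way to catch misaligned pose and target clips before they reach a model; the resolutions and frame count used here are placeholders.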

Code & Models

Repositories

snuvclab/vanast (GitHub)
