# Dress&Dance: Dress up and Dance as You Like It - Technical Preview

**Authors:** Jun-Kun Chen, Aayush Bansal, Minh Phuoc Vo, Yu-Xiong Wang

arXiv: 2508.21070 · 2025-08-29

## TL;DR

Dress&Dance is a video diffusion framework that generates virtual try-on videos from a single user image: the user appears wearing the chosen garments while moving in accordance with a given reference video. Generation is conditioned on multi-modal inputs (text, images, and videos).

## Contribution

The paper introduces CondNet, a conditioning network that uses attention to unify multi-modal inputs (text, images, and videos). According to the authors, this enhances garment registration and motion fidelity in virtual try-on videos.
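To make the idea of attention-based conditioning concrete, here is a minimal sketch of generic cross-attention in which video latent tokens attend over a concatenated sequence of text, garment-image, and motion-video tokens. This is not CondNet itself: the summary does not describe its architecture, so the token counts, the embedding size, the random projection matrices, and every name below (`cross_attention`, `video_latents`, and so on) are assumptions chosen for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, context, w_q, w_k, w_v):
    """Scaled dot-product cross-attention (generic, not the paper's exact layer).

    queries: (n_q, dim) tokens to be conditioned
    context: (n_c, dim) conditioning tokens from all modalities
    """
    q = queries @ w_q
    k = context @ w_k
    v = context @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])   # (n_q, n_c) similarity scores
    weights = softmax(scores, axis=-1)        # each query row sums to 1
    return weights @ v, weights

rng = np.random.default_rng(0)
dim = 16  # hypothetical embedding size

# Hypothetical token sequences, one per modality, already embedded to `dim`.
text_tokens = rng.normal(size=(4, dim))
garment_tokens = rng.normal(size=(8, dim))
motion_tokens = rng.normal(size=(6, dim))

# "Unifying" the modalities here simply means concatenating their tokens
# into one context sequence that the video latents can attend over.
context = np.concatenate([text_tokens, garment_tokens, motion_tokens], axis=0)

video_latents = rng.normal(size=(10, dim))  # hypothetical video latent tokens
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) / np.sqrt(dim) for _ in range(3))

attended, weights = cross_attention(video_latents, context, w_q, w_k, w_v)
print(attended.shape)  # (10, 16)
print(weights.shape)   # (10, 18)
```

Each row of `weights` shows how much one video latent token draws from each text, garment, or motion token, which is one simple way a single attention layer can mix several modalities at once.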

## Key findings

- The authors report that it outperforms existing open-source and commercial solutions.
- Generates 5-second videos at 24 FPS and 1152x720 resolution.
- Supports tops, bottoms, and one-piece garments, as well as simultaneous top-and-bottom try-on in a single pass.
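As a quick sanity check on the output format listed above, the arithmetic below works out how many frames and raw pixels one clip contains. It uses only the duration, frame rate, and resolution stated in this summary; the 8-bit RGB storage assumption and the helper names are illustrative and do not come from the paper.

```python
# Output spec quoted in the summary: 5 seconds, 24 FPS, 1152x720.
DURATION_S = 5
FPS = 24
# The summary only says "1152x720"; which number is the width is an assumption
# here and does not affect the pixel count.
WIDTH, HEIGHT = 1152, 720

def frame_count(duration_s: int, fps: int) -> int:
    """Number of frames in a clip of the given duration and frame rate."""
    return duration_s * fps

def raw_rgb_bytes(frames: int, width: int, height: int, channels: int = 3) -> int:
    """Uncompressed size assuming one byte per channel (8-bit RGB, an assumption)."""
    return frames * width * height * channels

frames = frame_count(DURATION_S, FPS)
pixels_per_frame = WIDTH * HEIGHT
total_bytes = raw_rgb_bytes(frames, WIDTH, HEIGHT)

print(frames)            # 120
print(pixels_per_frame)  # 829440
print(total_bytes)       # 298598400
```

So each generated clip is 120 frames of about 0.83 megapixels each, or roughly 299 MB of uncompressed 8-bit RGB data.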

## Abstract

We present Dress&Dance, a video diffusion framework that generates high-quality 5-second-long 24 FPS virtual try-on videos at 1152x720 resolution of a user wearing desired garments while moving in accordance with a given reference video. Our approach requires a single user image and supports a range of tops, bottoms, and one-piece garments, as well as simultaneous tops and bottoms try-on in a single pass. Key to our framework is CondNet, a novel conditioning network that leverages attention to unify multi-modal inputs (text, images, and videos), thereby enhancing garment registration and motion fidelity. CondNet is trained on heterogeneous training data, combining limited video data and a larger, more readily available image dataset, in a multistage progressive manner. Dress&Dance outperforms existing open-source and commercial solutions and enables a high-quality and flexible try-on experience.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2508.21070/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/2508.21070/full.md

## References

35 references — full list in the complete paper: https://tomesphere.com/paper/2508.21070/full.md

---
Source: https://tomesphere.com/paper/2508.21070