Sketch2Colab: Sketch-Conditioned Multi-Human Animation via Controllable Flow Distillation
Divyanshu Daiya, Aniket Bera

TL;DR
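Sketch2Colab converts storyboard-style 2D sketches into controllable 3D multi-human motion. It distills a sketch-conditioned diffusion prior into a rectified-flow student for faster sampling than diffusion-only methods, and pairs it with a continuous-time Markov chain (CTMC) planner that sequences interactions between agents.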
Contribution
It introduces a sketch-conditioned diffusion prior, a distilled flow model for fast sampling, and a CTMC-based interaction planner for synchronized multi-human motion.
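As a rough illustration of what a CTMC over interaction phases can look like, the sketch below simulates a continuous-time Markov chain with the standard Gillespie procedure: wait an exponentially distributed time in the current phase, then jump to a neighbour with probability proportional to its transition rate. The phase names are borrowed from the abstract; the rate matrix `Q`, the function `simulate_ctmc`, and all numeric values are hypothetical and are not taken from the paper, whose planner may be parameterized quite differently.

```python
import numpy as np

# Interaction phases named in the abstract; the ordering is an assumption.
PHASES = ["converge", "contact", "transport", "disengage"]

# Hypothetical generator matrix (rates per second). Off-diagonal entries are
# jump rates; each diagonal entry is minus the row's total exit rate, so
# every row sums to zero.
Q = np.array([
    [-1.0,  1.0,  0.0,  0.0],
    [ 0.0, -2.0,  1.5,  0.5],
    [ 0.0,  0.0, -0.8,  0.8],
    [ 0.5,  0.0,  0.0, -0.5],
])

def simulate_ctmc(Q, start, t_end, rng):
    """Sample one phase trajectory as a list of (jump_time, phase_index)."""
    t, state = 0.0, start
    path = [(t, state)]
    while True:
        exit_rate = -Q[state, state]
        if exit_rate <= 0.0:          # absorbing phase: no further jumps
            break
        t += rng.exponential(1.0 / exit_rate)   # holding time in this phase
        if t >= t_end:
            break
        probs = Q[state].copy()
        probs[state] = 0.0
        probs /= exit_rate            # jump distribution over other phases
        state = int(rng.choice(len(probs), p=probs))
        path.append((t, state))
    return path

path = simulate_ctmc(Q, start=0, t_end=10.0, rng=np.random.default_rng(0))
for t, s in path:
    print(f"{t:6.2f}s  {PHASES[s]}")
```

In a planner of this kind, the sampled phase schedule would presumably condition the continuous motion generator, but how Sketch2Colab couples the two is not described in the text available here.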
Findings
Outperforms baselines in constraint adherence and perceptual quality.
Samples significantly faster than diffusion-only methods.
Effectively models discrete interaction changes in multi-human motion.
Abstract
We present Sketch2Colab, which turns storyboard-style 2D sketches into coherent, object-aware 3D multi-human motion with fine-grained control over agents, joints, timing, and contacts. Diffusion-based motion generators offer strong realism but often rely on costly guidance for multi-entity control and degrade under strong conditioning. Sketch2Colab instead learns a sketch-conditioned diffusion prior and distills it into a rectified-flow student in latent space for fast, stable sampling. To make motion follow storyboards closely, we guide the student with differentiable objectives that enforce keyframes, paths, contacts, and physical consistency. Collaborative motion naturally involves discrete changes in interaction, such as converging, forming contact, cooperative transport, or disengaging, and a continuous flow alone struggles to sequence these shifts cleanly. We address this with a…
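The idea of guiding a flow-based sampler with differentiable objectives can be sketched on a toy problem. The example below integrates a velocity field with a few Euler steps and subtracts the gradient of a keyframe penalty at each step. Everything in it is an assumption made for illustration: `student_velocity` is a closed-form stand-in for a learned rectified-flow student, the 4-dimensional vector stands in for a motion latent, and the single quadratic keyframe cost stands in for the keyframe, path, contact, and physics objectives the abstract mentions.

```python
import numpy as np

def student_velocity(x, t, target):
    # Toy stand-in for a learned student: a velocity that carries x to
    # `target` along a straight line by t = 1.
    return (target - x) / (1.0 - t)

def keyframe_grad(x, index, value):
    # Gradient of the hypothetical cost 0.5 * (x[index] - value) ** 2.
    g = np.zeros_like(x)
    g[index] = x[index] - value
    return g

def sample(x0, target, num_steps=10, guidance_scale=0.0,
           key_index=0, key_value=0.0):
    """Euler integration of the flow with an optional guidance term."""
    x = x0.astype(float).copy()
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        v = student_velocity(x, t, target)
        # Steer the step downhill on the keyframe cost.
        v = v - guidance_scale * keyframe_grad(x, key_index, key_value)
        x = x + dt * v
    return x

target = np.array([1.0, 2.0, 3.0, 4.0])
x0 = np.zeros(4)

unguided = sample(x0, target)
guided = sample(x0, target, guidance_scale=5.0)
print("unguided:", unguided)
print("guided:  ", guided)
```

With guidance switched off the sampler lands on the unconstrained target; with guidance on, only the constrained coordinate is pulled toward the keyframe value, which shows the trade-off between the prior and the constraint that a guidance scale controls.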
