AssetDropper: Asset Extraction via Diffusion Models with Reward-Driven Optimization
Lanjiong Li, Guanhua Zhao, Lingting Zhu, Zeyu Cai, Lequan Yu, Jian Zhang, Zeyu Wang

TL;DR
AssetDropper is a novel framework that leverages diffusion models and reward-driven optimization to accurately extract standardized assets from images, aiding designers in accessing high-quality asset libraries.
Contribution
The paper introduces the first framework for extracting assets from reference images, combining diffusion models with a reward-based feedback loop, and supports it with a synthetic dataset of more than 200,000 image-subject pairs and a real-world evaluation benchmark.
Findings
Achieves state-of-the-art asset extraction accuracy.
Handles complex scenarios like perspective distortion and occlusion.
Utilizes a reward model to improve extraction consistency.
Abstract
Recent research on generative models has primarily focused on creating product-ready visual outputs; however, designers often favor access to standardized asset libraries, a domain that has yet to be significantly enhanced by generative capabilities. Although open-world scenes provide ample raw materials for designers, efficiently extracting high-quality, standardized assets remains a challenge. To address this, we introduce AssetDropper, the first framework designed to extract assets from reference images, providing artists with an open-world asset palette. Our model adeptly extracts a front view of selected subjects from input images, effectively handling complex scenarios such as perspective distortion and subject occlusion. We establish a synthetic dataset of more than 200,000 image-subject pairs and a real-world benchmark with thousands more for evaluation, facilitating the…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Aesthetic Perception and Analysis · Visual Attention and Saliency Detection
