AssetDropper: Asset Extraction via Diffusion Models with Reward-Driven Optimization

Lanjiong Li; Guanhua Zhao; Lingting Zhu; Zeyu Cai; Lequan Yu; Jian Zhang; Zeyu Wang

arXiv:2506.07738 · cs.CV · June 10, 2025

TL;DR

AssetDropper is a novel framework that leverages diffusion models and reward-driven optimization to accurately extract standardized assets from images, aiding designers in accessing high-quality asset libraries.

Contribution

It introduces the first method for asset extraction from images using diffusion models combined with a reward-based feedback loop, supported by large synthetic and real-world datasets.

Findings

01

Achieves state-of-the-art asset extraction accuracy.

02

Handles complex scenarios like perspective distortion and occlusion.

03

Utilizes a reward model to improve extraction consistency.

Abstract

Recent research on generative models has primarily focused on creating product-ready visual outputs; however, designers often favor access to standardized asset libraries, a domain that has yet to be significantly enhanced by generative capabilities. Although open-world scenes provide ample raw materials for designers, efficiently extracting high-quality, standardized assets remains a challenge. To address this, we introduce AssetDropper, the first framework designed to extract assets from reference images, providing artists with an open-world asset palette. Our model adeptly extracts a front view of selected subjects from input images, effectively handling complex scenarios such as perspective distortion and subject occlusion. We establish a synthetic dataset of more than 200,000 image-subject pairs and a real-world benchmark with thousands more for evaluation, facilitating the…

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Aesthetic Perception and Analysis · Visual Attention and Saliency Detection