Progressive Photorealistic Simplification
Adi Rosenthal, Dana Berman, Yedid Hoshen, Ariel Shamir

TL;DR
This paper introduces a progressive, semantics-based image simplification method that preserves photorealism by iteratively removing scene elements and inpainting over them, using vision-language models to choose what to remove and a learned verifier to check each result.
Contribution
The paper presents a framework that combines semantic understanding with generative editing for photorealistic image simplification, and distills it into a video prediction model.
Findings
Produces high-quality, photorealistic simplification trajectories
Enables applications like decluttering and semantic layer decomposition
Offers a more structured alternative to traditional abstraction methods
Abstract
Existing image simplification techniques often rely on Non-Photorealistic Rendering (NPR), transforming photographs into stylized sketches, cartoons, or paintings. While effective at reducing visual complexity, such approaches typically sacrifice photographic realism. In this work, we explore a complementary direction: simplifying images while preserving their photorealistic appearance. We introduce progressive semantic image simplification, a framework that iteratively reduces scene complexity by removing and inpainting elements in a controlled manner. At each step, the resulting image remains a plausible natural photograph. Our method combines semantic understanding with generative editing, leveraging Vision-Language Models (VLMs) to identify and prioritize elements for removal, and a learned verifier to ensure photorealism and coherence throughout the process. This is implemented via…
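The iterative loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the image is replaced by a toy set of element labels, and the three callables (rank_for_removal, remove_and_inpaint, is_plausible) are hypothetical stand-ins for the VLM prioritization, the generative edit, and the learned verifier. The retry-on-rejection policy and the max_steps cap are likewise assumptions made for illustration.

```python
from typing import Callable, FrozenSet, List, Sequence

# Toy stand-in for an image: the set of scene elements it contains.
Scene = FrozenSet[str]


def simplify_progressively(
    scene: Scene,
    rank_for_removal: Callable[[Scene], Sequence[str]],  # hypothetical VLM stand-in
    remove_and_inpaint: Callable[[Scene, str], Scene],   # hypothetical editor stand-in
    is_plausible: Callable[[Scene], bool],               # hypothetical verifier stand-in
    max_steps: int = 10,                                 # assumed safety cap
) -> List[Scene]:
    """Return the trajectory of scenes, from the input to the simplest accepted one."""
    trajectory = [scene]
    for _ in range(max_steps):
        current = trajectory[-1]
        accepted = None
        # Try candidates in priority order; keep the first edit the verifier accepts.
        for element in rank_for_removal(current):
            candidate = remove_and_inpaint(current, element)
            if is_plausible(candidate):
                accepted = candidate
                break
        if accepted is None:
            break  # no removal passes verification, so the trajectory ends here
        trajectory.append(accepted)
    return trajectory


# Toy stand-ins: lower number means "remove earlier".
PRIORITY = {"litter": 0, "parked car": 1, "tree": 2, "building": 3}


def rank(scene: Scene) -> List[str]:
    return sorted(scene, key=PRIORITY.__getitem__)


def remove(scene: Scene, element: str) -> Scene:
    return scene - {element}


def plausible(scene: Scene) -> bool:
    # Toy rule: the main subject must survive every edit.
    return "building" in scene


demo = simplify_progressively(frozenset(PRIORITY), rank, remove, plausible)
for step, state in enumerate(demo):
    print(step, sorted(state))
```

In this toy run each accepted step removes exactly one element, and the loop stops once the only remaining candidate is rejected by the verifier, which mirrors the requirement that every intermediate result remain a plausible photograph.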
