PixLens: A Novel Framework for Disentangled Evaluation in Diffusion-Based Image Editing with Object Detection + SAM

Stefan Stefanache; Lluís Pastor Pérez; Julen Costa Watanabe; Ernesto Sanchez Tejedor; Thomas Hofmann; Enis Simsar

arXiv:2410.05710·cs.CV·October 10, 2024

TL;DR

PixLens introduces a comprehensive evaluation framework for diffusion-based image editing models, focusing on edit quality and disentanglement, addressing the lack of standardized benchmarks in the field.

Contribution

The paper presents PixLens, a novel benchmark that evaluates image editing models on content preservation, realism, and disentanglement without relying solely on human judgment or existing models.

Findings

1. PixLens effectively assesses edit quality and disentanglement.

2. It provides a standardized benchmark for diffusion-based image editing.

3. The framework enhances understanding of model performance in diverse editing tasks.

Abstract

Evaluating diffusion-based image-editing models is a crucial task in the field of Generative AI. Specifically, it is imperative to assess their capacity to execute diverse editing tasks while preserving the image content and realism. While recent developments in generative models have opened up previously unheard-of possibilities for image editing, conducting a thorough evaluation of these models remains a challenging and open task. The absence of a standardized evaluation benchmark, primarily due to the inherent need for a post-edit reference image for evaluation, further complicates this issue. Currently, evaluations often rely on established models such as CLIP or require human intervention for a comprehensive understanding of the performance of these image editing models. Our benchmark, PixLens, provides a comprehensive evaluation of both edit quality and latent representation…

Code & Models

Repositories

thesstefan/pixlens (PyTorch, official)

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Advanced Steganography and Watermarking Techniques · Currency Recognition and Detection

Methods: Contrastive Language-Image Pre-training