CLPIPS: A Personalized Metric for AI-Generated Image Similarity
Khoi Trinh, Jay Rothenberger, Scott Seidenberger, Dimitrios Diochnos, Anindya Maiti

TL;DR
This paper introduces CLPIPS, a customized perceptual image similarity metric fine-tuned with human judgments to better align with human perception in image generation workflows.
Contribution
The work presents a lightweight fine-tuning approach for LPIPS that enhances its alignment with human judgments in image similarity assessments.
Findings
CLPIPS outperforms baseline LPIPS in correlation with human rankings.
Limited fine-tuning of LPIPS layer weights improves perceptual alignment.
The approach demonstrates the potential for adaptive similarity metrics in human-in-the-loop workflows.
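The first finding compares metrics by how well their scores track human rankings. As a rough illustration of that kind of comparison, the sketch below computes a Spearman rank correlation in plain Python. The choice of Spearman is an assumption, since the summary does not say which correlation measure the paper uses, and every number in the example is made up for illustration rather than taken from the paper.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data: human rank of five generated images against a target
# (1 = judged most similar), and two sets of metric distances
# (lower distance = more similar). None of these values come from the paper.
human_ranks = [1, 2, 3, 4, 5]
baseline_distances = [0.30, 0.10, 0.25, 0.50, 0.40]
tuned_distances = [0.12, 0.20, 0.31, 0.45, 0.60]

print("baseline:", spearman(human_ranks, baseline_distances))
print("tuned:   ", spearman(human_ranks, tuned_distances))
```

Because lower distance means more similar and rank 1 means most similar, a metric that agrees with the human ordering yields a correlation close to 1. In this toy data the hypothetical tuned distances reproduce the human ordering exactly, while the hypothetical baseline swaps several items and scores lower.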
Abstract
Iterative prompt refinement is central to reproducing target images with text-to-image generative models. Previous studies have incorporated image similarity metrics (ISMs) as additional feedback to human users. Existing ISMs such as LPIPS and CLIP provide objective measures of image likeness but often fail to align with human judgments, particularly in context-specific or user-driven tasks. In this paper, we introduce Customized Learned Perceptual Image Patch Similarity (CLPIPS), a customized extension of LPIPS that adapts a metric's notion of similarity directly to human judgments. We aim to explore whether lightweight, human-augmented fine-tuning can meaningfully improve perceptual alignment, positioning similarity metrics as adaptive components for human-in-the-loop workflows with text-to-image tools. We evaluate CLPIPS on a human-subject dataset in which participants iteratively…
