Seeing Like a Designer Without One: A Study on Unsupervised Slide Quality Assessment via Designer Cue Augmentation

Tai Inui; Steven Oh; Magdeline Kuan

arXiv:2508.19289 · cs.CV · August 28, 2025

TL;DR

This paper introduces an unsupervised method that combines visual-design metrics with multimodal embeddings to assess slide quality. Its scores correlate with human visual-quality ratings at up to 0.83 (Pearson), more strongly than scores from leading vision-language models.

Contribution

The study presents an unsupervised slide-quality assessment pipeline that integrates seven expert-inspired visual-design metrics with CLIP-ViT embeddings and scores slides with Isolation Forest-based anomaly detection, achieving high correlation with human ratings.

Findings

01

Achieved Pearson correlations of up to 0.83 with human visual-quality ratings

02

Correlated with human ratings 1.79x to 3.23x more strongly than scores from leading vision-language models

03

Demonstrated convergent validity with visual ratings and discriminant validity against speaker-delivery scores
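
The headline figure above is a Pearson correlation between method scores and human ratings. As a reminder of what that statistic computes, here is a minimal sketch using only the standard library. The function name `pearson_r` and the toy numbers are illustrative assumptions, not data or code from the paper.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of products of deviations (covariance up to a constant factor).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Root sums of squared deviations (standard deviations up to the same factor).
    dev_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


# Made-up example values: per-slide method scores and human ratings.
method_scores = [0.2, 0.5, 0.4, 0.9]
human_ratings = [2, 3, 3, 5]
r = pearson_r(method_scores, human_ratings)
```

A value near 1 means the method ranks and spaces slides much as the human raters do; a value near 0 means no linear relationship.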

Abstract

We present an unsupervised slide-quality assessment pipeline that combines seven expert-inspired visual-design metrics (whitespace, colorfulness, edge density, brightness contrast, text density, color harmony, layout balance) with CLIP-ViT embeddings, using Isolation Forest-based anomaly scoring to evaluate presentation slides. Trained on 12k professional lecture slides and evaluated on six academic talks (115 slides), our method achieved Pearson correlations up to 0.83 with human visual-quality ratings, 1.79x to 3.23x stronger than scores from leading vision-language models (ChatGPT o4-mini-high, ChatGPT o3, Claude Sonnet 4, Gemini 2.5 Pro). We demonstrate convergent validity with visual ratings, discriminant validity against speaker-delivery scores, and exploratory alignment with overall impressions. Our results show that augmenting low-level design cues with multimodal embeddings…
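
The pipeline described in the abstract (design cues concatenated with an image embedding, then scored by an Isolation Forest fitted on reference slides) can be sketched roughly as follows. This is a toy illustration under several assumptions, not the authors' implementation: it computes only three of the seven cues, the 0.95 whiteness threshold and the Hasler-Süsstrunk-style colorfulness formula are choices made here, random 8-dimensional vectors stand in for CLIP-ViT embeddings, and the slides are synthetic. The names `design_cues`, `slide_features`, `synthetic_slide`, and `EMB_DIM` are hypothetical.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

EMB_DIM = 8  # placeholder size; real CLIP-ViT embeddings are much larger


def design_cues(img: np.ndarray) -> np.ndarray:
    """Three illustrative cues for an RGB image of shape (H, W, 3) in [0, 1]."""
    gray = img.mean(axis=2)
    whitespace = (gray > 0.95).mean()  # fraction of near-white pixels (assumed threshold)
    contrast = gray.std()              # spread of pixel brightness
    rg = img[..., 0] - img[..., 1]                         # red-green opponent channel
    yb = 0.5 * (img[..., 0] + img[..., 1]) - img[..., 2]   # yellow-blue opponent channel
    colorfulness = np.hypot(rg.std(), yb.std()) + 0.3 * np.hypot(rg.mean(), yb.mean())
    return np.array([whitespace, contrast, colorfulness])


def slide_features(img: np.ndarray, embedding: np.ndarray) -> np.ndarray:
    """Concatenate the design cues with an (assumed precomputed) embedding."""
    return np.concatenate([design_cues(img), embedding])


rng = np.random.default_rng(0)


def synthetic_slide() -> np.ndarray:
    """Hypothetical stand-in for a clean slide: white canvas, one colored band."""
    img = np.ones((32, 32, 3))
    height = rng.integers(2, 7)   # band height of 2 to 6 pixels
    top = rng.integers(2, 24)     # band start row
    img[top:top + height, 4:28] = rng.random(3) * 0.5
    return img


# Reference set: 200 synthetic "professional" slides with random stand-in embeddings.
reference = np.stack([
    slide_features(synthetic_slide(), rng.normal(size=EMB_DIM))
    for _ in range(200)
])

forest = IsolationForest(n_estimators=100, random_state=0).fit(reference)

# A deliberately off-distribution slide: pure noise image, shifted embedding.
noisy = slide_features(rng.random((32, 32, 3)), rng.normal(size=EMB_DIM) + 4.0)

# score_samples returns higher values for points that look more like the training data.
reference_scores = forest.score_samples(reference)
noisy_score = forest.score_samples(noisy.reshape(1, -1))[0]
```

In this sketch a higher `score_samples` value means a slide looks more typical of the reference set, which is treated here as a proxy for quality. How the paper maps anomaly scores to a quality score, and whether features are normalized first, is not stated in the abstract.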
