SlideAudit: A Dataset and Taxonomy for Automated Evaluation of Presentation Slides
Zhuohao Jerry Zhang, Ruiqi Chen, Mingyuan Zhong, Jacob O. Wobbrock

TL;DR
This paper introduces SlideAudit, a comprehensive dataset and taxonomy for evaluating presentation slide designs, and assesses AI's ability to identify and improve slide flaws using various prompting strategies.
Contribution
It provides a new annotated dataset and taxonomy for slide design flaws, and evaluates AI models' effectiveness in flaw detection and slide improvement.
Findings
AI models have limited accuracy in identifying slide flaws (F1 scores ranging from 0.331 to 0.655)
Prompting with the taxonomy improves AI performance
AI can significantly improve slides, especially when guided by the taxonomy
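For readers unfamiliar with the metric, here is a minimal sketch of how an F1 score can be computed for flaw detection on a single slide, treating the model's output and the human annotation as sets of flaw labels. The function name and the flaw labels are hypothetical and are not taken from the SlideAudit taxonomy, and the paper may aggregate scores differently (for example, per category or across the whole dataset).

```python
def f1_score(predicted: set[str], annotated: set[str]) -> float:
    """F1 between a predicted and an annotated set of flaw labels."""
    true_positives = len(predicted & annotated)
    if true_positives == 0:
        # Also covers empty predictions or annotations (avoids dividing by zero).
        return 0.0
    precision = true_positives / len(predicted)  # share of predictions that are correct
    recall = true_positives / len(annotated)     # share of annotated flaws that were found
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for illustration only (an assumption, not the paper's taxonomy).
annotated = {"low_contrast", "text_overflow", "misaligned_elements"}
predicted = {"low_contrast", "too_many_fonts"}

score = f1_score(predicted, annotated)
print(f"F1 = {score:.3f}")
```

In this example one of the two predicted flaws is correct (precision 0.5) and one of the three annotated flaws is found (recall about 0.333), so a model can miss most flaws or raise false alarms and still land within the reported range.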
Abstract
Automated evaluation of specific graphic designs like presentation slides is an open problem. We present SlideAudit, a dataset for automated slide evaluation. We collaborated with design experts to develop a thorough taxonomy of slide design flaws. Our dataset comprises 2400 slides collected and synthesized from multiple sources, including a subset intentionally modified with specific design problems. We then had rigorously trained crowdworkers recruited from Prolific fully annotate the slides using our taxonomy. To evaluate whether AI is capable of identifying design flaws, we compared multiple large language models under different prompting strategies, as well as against an existing design critique pipeline. We show that AI models struggle to accurately identify slide design flaws, with F1 scores ranging from 0.331 to 0.655. Notably, prompting techniques leveraging our taxonomy achieved the highest…
