TL;DR
The paper introduces A^3 (Advertising Aesthetic Assessment), a framework for evaluating advertising images that combines a theory-driven paradigm (A^3-Law), a large annotated dataset (A^3-Dataset), a multimodal large language model (A^3-Align), and a benchmark (A^3-Bench). It aims to make evaluation more scalable, standardized, and interpretable than subjective judgment.
Contribution
The paper proposes the A^3-Law paradigm, constructs a large annotated dataset (A^3-Dataset), and develops A^3-Align, a multimodal model aligned with that paradigm for evaluating advertising images.
Findings
A^3-Align outperforms existing models in aligning with A^3-Law.
The framework generalizes well to advertisement selection and critique.
The dataset contains 120K instruction-response pairs with rich annotations.
Abstract
Advertising images significantly impact commercial conversion rates and brand equity, yet current evaluation methods rely on subjective judgments, lacking scalability, standardized criteria, and interpretability. To address these challenges, we present A^3 (Advertising Aesthetic Assessment), a comprehensive framework encompassing four components: a paradigm (A^3-Law), a dataset (A^3-Dataset), a multimodal large language model (A^3-Align), and a benchmark (A^3-Bench). Central to A^3 is a theory-driven paradigm, A^3-Law, comprising three hierarchical stages: (1) Perceptual Attention, evaluating perceptual image signals for their ability to attract attention; (2) Formal Interest, assessing formal composition of image color and spatial layout in evoking interest; and (3) Desire Impact, measuring desire evocation from images and their persuasive impact. Building on A^3-Law, we construct…
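To make the three-stage structure of A^3-Law concrete, here is a minimal sketch of how per-stage assessments could be represented and combined. The stage names come from the abstract. Everything else is an assumption for illustration only: the `StageScore` record, the 0.0 to 1.0 score scale, the `summarize` helper, and the unweighted mean used as an overall score are hypothetical and are not the paper's scoring method.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    # Stage names follow the A^3-Law description in the abstract;
    # the integer values only encode the hierarchical order.
    PERCEPTUAL_ATTENTION = 1
    FORMAL_INTEREST = 2
    DESIRE_IMPACT = 3


@dataclass(frozen=True)
class StageScore:
    stage: Stage
    score: float     # hypothetical scale: 0.0 (worst) to 1.0 (best)
    rationale: str   # free-text explanation, kept for interpretability


def summarize(scores: list[StageScore]) -> dict:
    """Order scores by stage and report an unweighted mean.

    The unweighted mean is an assumption made for this sketch,
    not an aggregation rule stated in the source.
    """
    stages_seen = {s.stage for s in scores}
    if stages_seen != set(Stage) or len(scores) != len(Stage):
        raise ValueError("expected exactly one score per stage")
    ordered = sorted(scores, key=lambda s: s.stage.value)
    overall = sum(s.score for s in ordered) / len(ordered)
    return {
        "stages": [(s.stage.name, s.score) for s in ordered],
        "overall": overall,
    }
```

Keeping a rationale next to each score mirrors the abstract's emphasis on interpretability: a reader can see which stage drove a low overall score and why.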
