Preferences Order, Ratings Anchor: From Fused Expert Aesthetic Ground Truth to Self-Distillation
Yuanpei Zhao, Jie Lin, Chao Zhang, Yilin Wang, Mao Li, Chenhui Li, Jie Hou, and Tangjie Lv

TL;DR
This paper introduces PPaint, a dual-protocol benchmark combining pairwise preferences and pointwise ratings for image aesthetic assessment, demonstrating how fusing both improves aesthetic scoring accuracy and transferability.
Contribution
The paper contributes PPaint, a matched dual-protocol benchmark, and a method for fusing pairwise preferences with pointwise ratings into a single expert ground truth, improving the accuracy and generalization of aesthetic assessment models.
Findings
Pairwise preferences yield more consistent ordinal rankings than pointwise ratings, while ratings anchor the absolute score scale.
Fusing preferences and ratings improves scoring accuracy.
The distilled vision-language model (VLM) achieves state-of-the-art performance across categories.
Abstract
Pairwise preferences and pointwise ratings are the two dominant annotation protocols in image aesthetic assessment (IAA), yet existing benchmarks adopt only one, leaving their complementarity unmeasured under controlled conditions. We introduce PPaint, a matched dual-protocol benchmark in which 15 domain experts, 5 per category, annotate 150 Chinese paintings under both protocols across five aesthetic dimensions, collecting 45,900 pairwise expert judgments through a locally dense preference design alongside the matched ratings. The matched design reveals complementary strengths: preferences yield more consistent ordinal rankings, while ratings anchor the absolute score scale. Fusing both signals via two independent preference-to-score methods yields a fused expert ground truth on which the two constructions converge to nearly identical scores. The same preference-to-score principle…
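As a rough illustration of what a preference-to-score step followed by rating anchoring could look like, the sketch below fits a Bradley-Terry model to a pairwise win matrix and then rescales the resulting log-strengths to the mean and spread of the expert ratings. This is an assumption made for illustration: the abstract does not name the two preference-to-score methods, and the function names `bradley_terry` and `anchor_to_ratings`, the toy win counts, and the toy ratings are all hypothetical.

```python
import math
from statistics import mean, pstdev


def bradley_terry(wins, n_iter=200):
    """Estimate Bradley-Terry strengths with the standard MM update.

    wins[i][j] is the number of times item i was preferred over item j.
    Assumes every item wins at least one comparison, so no strength is zero.
    """
    n = len(wins)
    p = [1.0] * n
    for _ in range(n_iter):
        new_p = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            denom = sum(
                (wins[i][j] + wins[j][i]) / (p[i] + p[j])
                for j in range(n)
                if j != i
            )
            new_p.append(total_wins / denom if denom > 0 else p[i])
        # Normalize so the strengths keep a fixed overall scale.
        scale = n / sum(new_p)
        p = [x * scale for x in new_p]
    return p


def anchor_to_ratings(strengths, mean_ratings):
    """Map log-strengths onto the rating scale by matching mean and std.

    The ordering comes from the preferences; the absolute scale comes
    from the ratings. Assumes the log-strengths are not all identical.
    """
    logs = [math.log(s) for s in strengths]
    mu_l, sd_l = mean(logs), pstdev(logs)
    mu_r, sd_r = mean(mean_ratings), pstdev(mean_ratings)
    return [mu_r + sd_r * (l - mu_l) / sd_l for l in logs]


# Hypothetical toy data: three paintings, pairwise wins and mean expert ratings.
wins = [
    [0, 7, 9],
    [3, 0, 6],
    [1, 4, 0],
]
ratings = [4.2, 3.1, 2.5]

fused = anchor_to_ratings(bradley_terry(wins), ratings)
print(fused)
```

In this sketch the fused scores inherit their ordering from the pairwise data and their location and spread from the ratings, which mirrors the "preferences order, ratings anchor" idea in the title without claiming to reproduce the paper's actual construction.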
