Enhancing Zero-shot Personalized Image Aesthetics Assessment with Profile-aware Multimodal LLM
Chun Wang, Chenfeng Wei, Chenyang Liu, Weihong Deng

TL;DR
This paper introduces P-MLLM, a profile-aware multimodal language model that enhances zero-shot personalized image aesthetics assessment by integrating user profiles and visual information for better subjective rating predictions.
Contribution
The paper proposes a profile-based personalization paradigm for zero-shot PIAA, together with a multimodal LLM architecture that augments a frozen LLM with selective fusion modules to control how visual information is incorporated.
Findings
P-MLLM achieves competitive zero-shot performance on PIAA benchmarks.
The model remains effective with coarse profile information.
Profile-aware integration improves personalization in image aesthetics assessment.
Abstract
Personalized image aesthetics assessment (PIAA) aims to predict an individual user's subjective rating of an image, which requires modeling user-specific aesthetic preferences. Existing methods rely on historical user ratings for this modeling and therefore struggle when such data are unavailable. We address this zero-shot setting by using user profiles as contextual signals for personalization and adopting a profile-based personalization paradigm. We introduce P-MLLM, a profile-aware multimodal LLM that augments a frozen LLM with selective fusion modules for controlled visual integration. These modules selectively integrate visual information into the model's evolving hidden states during profile-conditioned reasoning, allowing visual information to be incorporated in a profile-aware manner. Experiments on recent PIAA benchmarks show that P-MLLM achieves competitive zero-shot…
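The abstract does not specify how the selective fusion modules are built. As a rough illustration only, the idea of letting profile-conditioned hidden states decide how much visual information to admit can be sketched as a gated residual update. Everything below is an assumption made for illustration, not the authors' design: the scalar sigmoid gate, the function name `selective_fusion`, and the parameters `gate_weights` and `gate_bias` are all hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def selective_fusion(hidden, visual, gate_weights, gate_bias=0.0):
    """Hypothetical gated fusion step (not the paper's actual module).

    hidden:       hidden state of the frozen LLM at some layer, assumed to
                  already reflect the user-profile text (list of floats).
    visual:       visual feature, assumed to be projected to the same size.
    gate_weights: illustrative trainable weights, one per hidden dimension.
    gate_bias:    illustrative trainable bias of the gate.

    Returns the fused state and the gate value that was applied.
    """
    if not (len(hidden) == len(visual) == len(gate_weights)):
        raise ValueError("hidden, visual and gate_weights must have equal length")

    # The gate is computed from the hidden state alone, so the profile
    # context carried by that state controls how much visual signal passes.
    gate = sigmoid(sum(w * h for w, h in zip(gate_weights, hidden)) + gate_bias)

    # Residual update: the frozen LLM's state is kept and the gated visual
    # feature is added on top of it.
    fused = [h + gate * v for h, v in zip(hidden, visual)]
    return fused, gate


if __name__ == "__main__":
    state, g = selective_fusion([1.0, 2.0], [2.0, 4.0], [0.0, 0.0])
    print(g, state)
```

In this sketch a gate near 0 leaves the hidden state essentially unchanged, while a gate near 1 adds the full visual feature; a real module would more likely use learned projections and a per-dimension or attention-based gate.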
