Image Score: Learning and Evaluating Human Preferences for Mercari Search
Chingis Oinar, Miao Cao, Shanshan Fu

TL;DR
This paper introduces a cost-effective LLM-based method for evaluating image quality in e-commerce, demonstrating its correlation with user behavior and its positive impact on sales at Mercari.
Contribution
The paper presents a novel LLM-driven approach for assessing image aesthetics that correlates with human preferences and improves e-commerce performance.
Findings
LLM-generated image labels correlate with user behavior.
The approach is cost-effective for large-scale image quality assessment.
Online experiments show increased sales using the proposed method.
Abstract
Mercari is the largest C2C e-commerce marketplace in Japan, with more than 20 million monthly active users. Because search is the fundamental way to discover desired items, we have always had a substantial amount of implicit-feedback data. Although we actively take advantage of it to provide the best service for our users, the correlation of implicit feedback with tasks such as image quality assessment is not trivial. Many traditional lines of research in Machine Learning (ML) are similarly motivated by the insatiable appetite of Deep Learning (DL) models for well-labelled training data. Weak supervision is about leveraging higher-level and/or noisier supervision over unlabeled data. Large Language Models (LLMs) are being actively studied and used for data labelling tasks. We present how we leverage Chain-of-Thought (CoT) prompting to enable an LLM to produce image aesthetics labels that…
