An Empirical Analysis of GPT-4V's Performance on Fashion Aesthetic Evaluation

Yuki Hirakawa; Takashi Wada; Kazuya Morishita; Ryotaro Shimizu; Takuya Furusawa; Sai Htaung Kham; Yuki Saito

arXiv:2410.23730·cs.CV·November 1, 2024

TL;DR

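This paper examines GPT-4V's zero-shot ability to evaluate fashion aesthetics, that is, to estimate how well outfits suit the people wearing them in images. Its predictions align fairly well with human judgments on the authors' datasets, but it struggles to rank outfits in similar colors.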

Contribution

First empirical study of GPT-4V's performance on fashion aesthetic evaluation, highlighting its strengths and weaknesses in zero-shot settings.

Findings

01 GPT-4V's predictions align fairly well with human judgments on the authors' datasets

02 GPT-4V struggles to rank outfits in similar colors

03 The results provide a baseline for future research

Abstract

Fashion aesthetic evaluation is the task of estimating how well the outfits worn by individuals in images suit them. In this work, we examine the zero-shot performance of GPT-4V on this task for the first time. We show that its predictions align fairly well with human judgments on our datasets, and also find that it struggles with ranking outfits in similar colors. The code is available at https://github.com/st-tech/gpt4v-fashion-aesthetic-evaluation.

Taxonomy

Topics: Consumer Perception and Purchasing Behavior · Cultural and Historical Studies · Diverse Topics in Contemporary Research

Methods: ALIGN