Urban Visual Appeal According to ChatGPT: Contrasting AI and Human Insights
Milad Malekzadeh, Elias Willberg, Jussi Torkko, Tuuli Toivonen

TL;DR
This study evaluates GPT-4's ability to assess urban visual appeal from street view images, comparing its ratings with human judgments to explore AI's potential in urban planning.
Contribution
It demonstrates that GPT-4 can effectively approximate human assessments of urban visual appeal, highlighting both its strengths and limitations in capturing contextual nuances.
Findings
GPT-4 aligns well with human ratings in many areas.
GPT-4 favors greener suburban areas over urban centers.
Human judgment considers contextual factors GPT-4 often misses.
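One way to quantify the kind of alignment described in these findings is a rank correlation between the mean human rating per image and the model's rating. The sketch below is illustrative only: the image IDs, the rating scale, and all values are hypothetical, and Spearman's rho is an assumed choice of agreement measure, not necessarily the statistic the authors used.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical ratings: several participants per image, one model score per image.
human_ratings = {
    "img_a": [4, 5, 4],
    "img_b": [2, 3, 2],
    "img_c": [3, 3, 4],
    "img_d": [1, 2, 1],
}
model_ratings = {"img_a": 5, "img_b": 2, "img_c": 4, "img_d": 1}

image_ids = sorted(human_ratings)
human_means = [mean(human_ratings[i]) for i in image_ids]
model_scores = [model_ratings[i] for i in image_ids]

rho = spearman(human_means, model_scores)
print(f"Spearman rho between human means and model scores: {rho:.2f}")
```

Rank correlation is used here because ratings are ordinal, so only the ordering of images matters, not the spacing between scores. Computing the same statistic separately per district would be one way to surface geographic disparities such as the suburban preference noted above.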
Abstract
The visual appeal of urban environments significantly impacts residents' satisfaction with their living spaces and their overall mood, which, in turn, affects their health and well-being. Given the resource-intensive nature of gathering evaluations on urban visual appeal through surveys or inquiries from residents, there is a constant quest for automated solutions to streamline this process and support spatial planning. In this study, we applied an off-the-shelf AI model to automate the analysis of urban visual appeal, using over 1,800 Google Street View images of Helsinki, Finland. We assessed these images with the GPT-4 model against specified criteria. Simultaneously, 24 participants were asked to rate the images. Our results demonstrated a strong alignment between GPT-4 and participant ratings, although geographic disparities were noted. Specifically, GPT-4 showed a…
Taxonomy
Topics: Impact of AI and Big Data on Business and Society · Digital Media and Visual Art
Methods: Adam · Label Smoothing · Linear Layer · Byte Pair Encoding · Layer Normalization · Softmax · Attention Is All You Need · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Dense Connections
