Zero-shot Building Age Classification from Facade Image Using GPT-4
Zichao Zeng, June Moh Goo, Xinglei Wang, Bin Chi, Meihui Wang, and Jan Boehm

TL;DR
This paper explores using GPT-4 Vision in a zero-shot manner to classify building ages from facade images, demonstrating potential without training but with limitations in accuracy and granularity.
Contribution
It introduces a novel zero-shot classification approach using GPT-4 Vision prompts for building age estimation from facade images.
Findings
Achieved 39.69% accuracy in classifying building age epochs.
A mean absolute error of 0.85 decades indicates that the coarse predictions are usually close to the true age.
The model struggles with very old buildings and with fine-grained age distinctions within two decades.
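The two metrics reported above can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: it assumes each age epoch is represented by a single year, and the function names and toy labels are hypothetical.

```python
from typing import Sequence


def epoch_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of images whose predicted epoch matches the true epoch exactly."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)


def mae_in_decades(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean absolute error between true and predicted years, expressed in decades."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    total_years = sum(abs(t - p) for t, p in zip(y_true, y_pred))
    return total_years / len(y_true) / 10.0


# Toy labels for illustration only (assumed, not taken from FI-London).
true_years = [1900, 1930, 1960, 1990]
pred_years = [1900, 1950, 1960, 1980]

acc = epoch_accuracy(true_years, pred_years)
mae = mae_in_decades(true_years, pred_years)
print(f"accuracy = {acc:.2%}, MAE = {mae:.2f} decades")
```

Reading the two numbers together explains the findings: exact-match accuracy can be low while the error in decades stays small, because a prediction that lands in a neighbouring epoch counts as a miss for accuracy but adds only a small amount to the mean absolute error.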
Abstract
A building's age of construction is crucial for supporting many geospatial applications. Much current research focuses on estimating building age from facade images using deep learning. However, building an accurate deep learning model requires a considerable amount of labelled training data, and the trained models often have geographical constraints. Recently, large pre-trained vision language models (VLMs) such as GPT-4 Vision, which demonstrate significant generalisation capabilities, have emerged as potential training-free tools for dealing with specific vision tasks, but their applicability and reliability for building information remain unexplored. In this study, a zero-shot building age classifier for facade images is developed using prompts that include logical instructions. Taking London as a test case, we introduce a new dataset, FI-London, comprising facade images and…
Taxonomy
Topics: 3D Surveying and Cultural Heritage
Methods: Attention Is All You Need · Dropout · Adam · Position-Wise Feed-Forward Layer · Linear Layer · Layer Normalization · Byte Pair Encoding · Absolute Position Encodings · Multi-Head Attention · Dense Connections
