Building Damage Detection using Satellite Images and Patch-Based Transformer Methods
Smriti Siva, Jan Cross-Zamirski

TL;DR
This paper evaluates the effectiveness of patch-based Vision Transformer models, specifically DINOv2-small and DeiT, for building damage classification using satellite images, addressing challenges of noise and class imbalance in the xBD dataset.
Contribution
The study introduces a novel patch-based pre-processing pipeline and a frozen-head fine-tuning strategy for ViT models, improving damage classification performance on satellite imagery.
Findings
ViT models achieve F1 scores competitive with CNN baselines.
Patch-based pre-processing reduces background noise and enhances structural feature detection.
Small ViT architectures perform well with the proposed training method.
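The patch-based pre-processing idea can be illustrated with a minimal sketch. The summary does not describe the authors' actual pipeline, so everything here is an assumption for illustration: the helper names `crop_box` and `crop_patch`, the `(x_min, y_min, x_max, y_max)` bounding-box convention, and the `margin` parameter are all hypothetical. The sketch only shows the general principle of cropping a tight window around one building so that most of the surrounding background is excluded.

```python
def crop_box(bbox, image_size, margin=8):
    """Expand a building bounding box by `margin` pixels and clip it to the image.

    bbox: (x_min, y_min, x_max, y_max) in pixels (assumed convention).
    image_size: (width, height) of the full satellite tile.
    """
    x_min, y_min, x_max, y_max = bbox
    width, height = image_size
    left = max(0, x_min - margin)
    top = max(0, y_min - margin)
    right = min(width, x_max + margin)
    bottom = min(height, y_max + margin)
    if right <= left or bottom <= top:
        raise ValueError("bounding box lies outside the image")
    return left, top, right, bottom


def crop_patch(image_rows, box):
    """Cut the patch out of an image stored as a list of pixel rows."""
    left, top, right, bottom = box
    return [row[left:right] for row in image_rows[top:bottom]]


# Toy 16x16 "image" whose pixel value encodes its position.
image = [[r * 16 + c for c in range(16)] for r in range(16)]
box = crop_box((0, 0, 5, 5), image_size=(16, 16), margin=4)
patch = crop_patch(image, box)
```

A small margin keeps some context around the footprint; how much context helps, if any, is something the paper itself would have to answer.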
Abstract
Rapid building damage assessment is critical for post-disaster response. Damage classification models built on satellite imagery provide a scalable means of obtaining situational awareness. However, label noise and severe class imbalance in satellite data create major challenges. The xBD dataset offers a standardized benchmark for building-level damage across diverse geographic regions. In this study, we evaluate Vision Transformer (ViT) performance on the xBD dataset, investigating how these models distinguish between types of structural damage when trained on noisy, imbalanced data. Specifically, we evaluate DINOv2-small and DeiT for multi-class damage classification. We propose a targeted patch-based pre-processing pipeline to isolate structural features and minimize background noise in training. We adopt a frozen-head fine-tuning strategy to keep…
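To make the class-imbalance problem concrete, the sketch below computes inverse-frequency class weights, a common way to stop a dominant class from swamping the loss. This is not taken from the paper: the abstract does not say how the authors handle imbalance, the function name `inverse_frequency_weights` is hypothetical, and the label counts are made up for illustration.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return a weight per class, larger for rarer classes.

    Uses n_samples / (n_classes * class_count), so a perfectly
    balanced dataset gives every class a weight of 1.0.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Illustrative, fabricated label distribution (not xBD statistics).
labels = ["no-damage"] * 8 + ["destroyed"] * 2
weights = inverse_frequency_weights(labels)
```

Weights like these are typically passed to a weighted cross-entropy loss; macro-averaged F1 is then the natural metric, since it scores each damage class equally regardless of its frequency.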
Taxonomy
Topics: Remote-Sensing Image Classification · Infrastructure Maintenance and Monitoring · Advanced Neural Network Applications
