When are Foundation Models Effective? Understanding the Suitability for Pixel-Level Classification Using Multispectral Imagery
Yiqun Xie, Zhihao Wang, Weiye Chen, Zhili Li, Xiaowei Jia, Yanhua Li, Ruichen Wang, Kangyang Chai, Ruohan Li, Sergii Skakun

TL;DR
This study evaluates the effectiveness of foundation models for pixel-level multispectral imagery classification, revealing they are not always superior to traditional methods and their suitability depends on task-specific factors.
Contribution
It provides a comparative analysis of foundation models versus traditional and deep learning models for remote sensing classification, highlighting when foundation models are appropriate.
Findings
Traditional ML models perform comparably or better in texture-insensitive tasks.
Deep learning models excel in texture-dependent tasks like burn scar detection.
Foundation models' effectiveness depends on the alignment of self-supervised tasks with real applications.
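To make the "texture-insensitive" case concrete, here is a minimal sketch of pixel-level classification that labels each pixel from its own spectral values only, without looking at neighboring pixels. It is not the method evaluated in the paper: the nearest-centroid rule, the two bands (red, near-infrared), the class names, the reflectance values, and the helper names `fit_centroids` and `classify_image` are all assumptions made for illustration.

```python
from math import dist

def fit_centroids(pixels, labels):
    """Average the band vectors of each class (hypothetical helper)."""
    sums, counts = {}, {}
    for vec, lab in zip(pixels, labels):
        acc = sums.setdefault(lab, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def classify_image(image, centroids):
    """Label each pixel from its own band vector only; neighbors are ignored."""
    return [
        [min(centroids, key=lambda lab: dist(px, centroids[lab])) for px in row]
        for row in image
    ]

# Toy training pixels as (red, near-infrared) reflectance; values are made up.
train_pixels = [(0.05, 0.40), (0.06, 0.45), (0.20, 0.22), (0.25, 0.25)]
train_labels = ["vegetation", "vegetation", "bare", "bare"]
centroids = fit_centroids(train_pixels, train_labels)

# A 2x2 "image" where every pixel carries the same two bands.
image = [
    [(0.05, 0.42), (0.22, 0.24)],
    [(0.07, 0.38), (0.24, 0.20)],
]
label_map = classify_image(image, centroids)
```

Because the decision for each pixel depends only on its band vector, a classifier of this kind cannot use spatial texture. A task such as burn scar detection, which the findings describe as texture-dependent, would need a model that also sees each pixel's spatial context.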
Abstract
Foundation models, i.e., very large deep learning models, have demonstrated impressive performance in various language and vision tasks that is otherwise difficult to reach using smaller models. The major success of GPT-type language models is particularly exciting and raises expectations for the potential of foundation models in other domains, including satellite remote sensing. In this context, great efforts have been made to build foundation models and to test their capabilities in broader applications; examples include Prithvi by NASA-IBM, Segment-Anything-Model, ViT, etc. This leads to an important question: Are foundation models always a suitable choice for different remote sensing tasks, and when are they not? This work aims to enhance the understanding of the status and suitability of foundation models for pixel-level classification using multispectral imagery at moderate…
Taxonomy
Topics: Remote-Sensing Image Classification
