When are Foundation Models Effective? Understanding the Suitability for Pixel-Level Classification Using Multispectral Imagery

Yiqun Xie; Zhihao Wang; Weiye Chen; Zhili Li; Xiaowei Jia; Yanhua Li; Ruichen Wang; Kangyang Chai; Ruohan Li; Sergii Skakun

arXiv:2404.11797 · cs.CV · April 19, 2024 · 1 citation

TL;DR

This study evaluates foundation models for pixel-level classification of multispectral imagery and finds that they are not always superior to traditional methods; their suitability depends on task-specific factors.

Contribution

The paper provides a comparative analysis of foundation models against traditional ML and deep learning models for remote sensing classification, highlighting when foundation models are an appropriate choice.

Findings

01. Traditional ML models perform comparably or better in texture-insensitive tasks.

02. Deep learning models excel in texture-dependent tasks like burn scar detection.

03. Foundation models' effectiveness depends on the alignment of self-supervised tasks with real applications.

Abstract

Foundation models, i.e., very large deep learning models, have demonstrated impressive performance in various language and vision tasks that is otherwise difficult to reach using smaller models. The major success of GPT-type language models is particularly exciting and raises expectations for the potential of foundation models in other domains, including satellite remote sensing. In this context, great efforts have been made to build foundation models and test their capabilities in broader applications; examples include Prithvi by NASA-IBM, Segment-Anything-Model, ViT, etc. This leads to an important question: Are foundation models always a suitable choice for different remote sensing tasks, and if not, when are they suitable? This work aims to enhance the understanding of the status and suitability of foundation models for pixel-level classification using multispectral imagery at moderate…

Taxonomy

Topics: Remote-Sensing Image Classification