OReole-FM: successes and challenges toward billion-parameter foundation models for high-resolution satellite imagery

Philipe Dias; Aristeidis Tsaris; Jordan Bowman; Abhishek Potnis; Jacob Arndt; H. Lexie Yang; Dalton Lunga

arXiv:2410.19965·cs.CV·October 29, 2024

TL;DR

This paper demonstrates the successful pretraining of billion-parameter foundation models for high-resolution satellite imagery using exascale computing resources, highlighting the importance of data scaling and providing best practices for the geospatial community.

Contribution

It introduces a scalable pretraining approach for large vision transformers on satellite data, utilizing high-performance computing and a novel dataset, with insights into technical challenges and model performance.

Findings

01. Data scaling is crucial for effective model scaling.
02. Pretrained models improve performance on classification, segmentation, detection.
03. Technical challenges in training large models are discussed.

Abstract

While the pretraining of Foundation Models (FMs) for remote sensing (RS) imagery is on the rise, models remain restricted to a few hundred million parameters. Scaling models to billions of parameters has been shown to yield unprecedented benefits including emergent abilities, but requires data scaling and computing resources typically not available outside industry R&D labs. In this work, we pair high-performance computing resources including the Frontier supercomputer, America's first exascale system, and high-resolution optical RS data to pretrain billion-scale FMs. Our study assesses performance of different pretrained variants of vision Transformers across image classification, semantic segmentation and object detection benchmarks, which highlight the importance of data scaling for effective model scaling. Moreover, we discuss construction of a novel TIU pretraining dataset, model…
