Mars-Bench: A Benchmark for Evaluating Foundation Models for Mars Science Tasks

Mirali Purohit; Bimal Gajera; Vatsal Malaviya; Irish Mehta; Kunal Kasodekar; Jacob Adler; Steven Lu; Umaa Rebbapragada; Hannah Kerner

arXiv:2510.24010 · cs.CV · October 29, 2025

TL;DR

Mars-Bench is the first comprehensive benchmark for evaluating foundation models on Mars science tasks, enabling systematic assessment across diverse datasets and fostering development of domain-specific models.

Contribution

The paper introduces Mars-Bench, a standardized benchmark with datasets and evaluation protocols for Mars-related tasks, addressing a critical gap in Mars science AI research.

Findings

01 Mars-specific models outperform general models on benchmark tasks.

02 Models pre-trained on Earth and natural images show limited transferability.

03 Domain-adapted pre-training improves Mars model performance.

Abstract

Foundation models have enabled rapid progress across many specialized domains by leveraging large-scale pre-training on unlabeled data, demonstrating strong generalization to a variety of downstream tasks. While such models have gained significant attention in fields like Earth Observation, their application to Mars science remains limited. A key enabler of progress in other domains has been the availability of standardized benchmarks that support systematic evaluation. In contrast, Mars science lacks such benchmarks and standardized evaluation frameworks, which has limited progress toward developing foundation models for Martian tasks. To address this gap, we introduce Mars-Bench, the first benchmark designed to systematically evaluate models across a broad range of Mars-related tasks using both orbital and surface imagery. Mars-Bench comprises 20 datasets spanning classification,…

Videos

Mars-Bench: A Benchmark for Evaluating Foundation Models for Mars Science Tasks · SlidesLive