TL;DR
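MOMO is a multi-sensor foundation model for Mars remote sensing. It merges models pre-trained independently on HiRISE, CTX, and THEMIS imagery, aligning their checkpoints by validation loss before merging, and achieves better overall performance than the compared baselines on 9 Mars-Bench downstream tasks.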
Contribution
Introduces MOMO, the first multi-sensor foundation model for Mars, together with a novel Equal Validation Loss (EVL) strategy that aligns per-sensor checkpoints by validation loss similarity before merging them via task arithmetic.
Findings
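MOMO achieves better overall performance on 9 Mars-Bench downstream tasks than ImageNet pre-trained, Earth observation foundation model, and sensor-specific pre-training baselines.
The EVL strategy improves model stability and generalization.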
Model merging at compatible checkpoints enhances multi-resolution data integration.
Abstract
We introduce MOMO, the first multi-sensor foundation model for Mars remote sensing. MOMO uses model merging to integrate representations learned independently from three key Martian sensors (HiRISE, CTX, and THEMIS), spanning resolutions from 0.25 m/pixel to 100 m/pixel. Central to our method is our novel Equal Validation Loss (EVL) strategy, which aligns checkpoints across sensors based on validation loss similarity before fusion via task arithmetic. This ensures models are merged at compatible convergence stages, leading to improved stability and generalization. We train MOMO on a large-scale, high-quality corpus of million samples curated from Mars orbital data and evaluate it on 9 downstream tasks from Mars-Bench. MOMO achieves better overall performance compared to ImageNet pre-trained, Earth observation foundation model, sensor-specific pre-training, and fully-supervised…
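The two steps the abstract names, checkpoint alignment by validation loss and fusion via task arithmetic, can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not give the exact selection rule, so the sketch assumes "validation loss similarity" means choosing one checkpoint per sensor so that the spread (max minus min) of their validation losses is smallest. The function names `select_evl_checkpoints` and `task_arithmetic_merge`, the shared `base` weights, the `scale` coefficient, and all numbers are hypothetical and chosen only for illustration.

```python
from itertools import product


def select_evl_checkpoints(val_losses):
    """Pick one checkpoint per sensor so that validation losses are as close as possible.

    val_losses maps a sensor name to a list of (training_step, validation_loss).
    Assumption (not from the paper): "similar" means minimal max-min spread.
    """
    sensors = sorted(val_losses)
    best_combo, best_spread = None, float("inf")
    # Exhaustive search is fine for a handful of saved checkpoints per sensor.
    for combo in product(*(val_losses[s] for s in sensors)):
        losses = [loss for _, loss in combo]
        spread = max(losses) - min(losses)
        if spread < best_spread:
            best_combo, best_spread = combo, spread
    return dict(zip(sensors, best_combo))


def task_arithmetic_merge(base, models, scale=1.0):
    """Task arithmetic: merged = base + scale * sum_i (model_i - base).

    base and every entry of models map a parameter name to a flat list of floats.
    """
    merged = {}
    for name, base_w in base.items():
        merged[name] = [
            b + scale * sum(m[name][i] - b for m in models)
            for i, b in enumerate(base_w)
        ]
    return merged


# Toy validation curves (illustrative values only, not results from the paper).
toy_losses = {
    "HiRISE": [(1000, 0.90), (2000, 0.60), (3000, 0.40)],
    "CTX": [(1000, 0.70), (2000, 0.50), (3000, 0.35)],
    "THEMIS": [(1000, 0.55), (2000, 0.42), (3000, 0.30)],
}
chosen = select_evl_checkpoints(toy_losses)

# Toy weights standing in for the selected per-sensor checkpoints.
toy_base = {"w": [1.0, 2.0]}
toy_models = [{"w": [2.0, 2.0]}, {"w": [1.0, 4.0]}]
toy_merged = task_arithmetic_merge(toy_base, toy_models, scale=0.5)
```

In this toy setup the sensors are merged at different training steps, which is the point of aligning on validation loss rather than on step count: a sensor that converges faster contributes an earlier checkpoint.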
