The Moon's Many Faces: A Single Unified Transformer for Multimodal Lunar Reconstruction

Tom Sander; Moritz Tenthoff; Kay Wohlfarth; Christian Wöhler

arXiv:2505.05644·cs.CV·May 5, 2026

TL;DR

This paper introduces a unified transformer model for multimodal lunar surface reconstruction that translates flexibly between grayscale images, Digital Elevation Models (DEMs), surface normals, and albedo maps.

Contribution

It presents a single transformer architecture that learns shared representations across multiple lunar data modalities, bringing multimodal learning to planetary science, where it has rarely been applied.

Findings

01 The model learns physically plausible relations across the four modalities.

02 It demonstrates image-based lunar 3D reconstruction and albedo estimation.

03 Multimodal learning shows potential for solving Shape and Albedo from Shading on lunar images.

Abstract

Multimodal learning is an emerging research topic across multiple disciplines but has rarely been applied to planetary science. In this contribution, we propose a single, unified transformer architecture trained to learn shared representations between multiple sources like grayscale images, Digital Elevation Models (DEMs), surface normals, and albedo maps. The architecture supports flexible translation from any input modality to any target modality. Our results demonstrate that our foundation model learns physically plausible relations across these four modalities. We further identify that image-based 3D reconstruction and albedo estimation (Shape and Albedo from Shading) of lunar images can be formulated as a multimodal learning problem. Our results demonstrate the potential of multimodal learning to solve Shape and Albedo from Shading and provide a new approach for large-scale…
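
To make the "physically plausible relations" between the four modalities concrete, the following minimal sketch shows the classical forward chain that Shape and Albedo from Shading tries to invert: a DEM yields surface normals, and normals plus albedo plus an illumination direction yield an image. It assumes a simple Lambertian reflectance model, which is a textbook simplification and not necessarily the model used by the authors. The function names `normals_from_dem` and `render_lambertian`, the grid spacing, and the sun vector are all hypothetical and chosen only for illustration.

```python
import math


def normals_from_dem(dem, spacing=1.0):
    """Unit surface normals from a DEM given as a list of rows.

    Uses central differences in the interior and one-sided differences
    at the borders. Assumes the DEM has at least 2 rows and 2 columns.
    """
    rows, cols = len(dem), len(dem[0])
    normals = []
    for r in range(rows):
        row = []
        for c in range(cols):
            r0, r1 = max(r - 1, 0), min(r + 1, rows - 1)
            c0, c1 = max(c - 1, 0), min(c + 1, cols - 1)
            dzdx = (dem[r][c1] - dem[r][c0]) / ((c1 - c0) * spacing)
            dzdy = (dem[r1][c] - dem[r0][c]) / ((r1 - r0) * spacing)
            norm = math.sqrt(dzdx ** 2 + dzdy ** 2 + 1.0)
            # Normal of the surface z(x, y) is proportional to (-dz/dx, -dz/dy, 1).
            row.append((-dzdx / norm, -dzdy / norm, 1.0 / norm))
        normals.append(row)
    return normals


def render_lambertian(normals, albedo, sun):
    """Lambertian image: albedo * max(0, n . s), with s the unit sun vector.

    This is an illustrative assumption, not the paper's reflectance model.
    """
    length = math.sqrt(sum(v * v for v in sun))
    s = tuple(v / length for v in sun)
    image = []
    for n_row, a_row in zip(normals, albedo):
        image.append([
            a * max(0.0, n[0] * s[0] + n[1] * s[1] + n[2] * s[2])
            for n, a in zip(n_row, a_row)
        ])
    return image


# Toy example: a 3x3 ramp rising along x, uniform albedo, sun at zenith.
ramp = [[float(c) for c in range(3)] for _ in range(3)]
albedo = [[0.5] * 3 for _ in range(3)]
image = render_lambertian(normals_from_dem(ramp), albedo, sun=(0.0, 0.0, 1.0))
```

Going from image back to DEM and albedo is the ill-posed inverse of this chain, which is why the abstract frames it as a translation problem between modalities rather than a closed-form computation.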
