MapFM: Foundation Model-Driven HD Mapping with Multi-Task Contextual Learning

Leonid Ivanov; Vasily Yuryev; Dmitry Yudin

arXiv:2506.15313 · cs.CV · June 19, 2025

TL;DR

MapFM is an end-to-end model for online vectorized HD map generation in autonomous driving. It encodes camera images with a foundation model and adds auxiliary BEV semantic segmentation heads as multi-task supervision, which improves scene understanding and map accuracy.

Contribution

The paper introduces MapFM, an end-to-end HD map prediction model for autonomous driving that combines a foundation model image encoder with multi-task learning through auxiliary BEV semantic segmentation heads.

Findings

01

Encoding camera images with a foundation model significantly improves feature representation quality.

02

Multi-task learning with auxiliary BEV semantic segmentation heads improves map prediction accuracy.

03

Effective online vectorized HD map generation demonstrated.

Abstract

In autonomous driving, high-definition (HD) maps and semantic maps in bird's-eye view (BEV) are essential for accurate localization, planning, and decision-making. This paper introduces an enhanced end-to-end model named MapFM for online vectorized HD map generation. We significantly boost feature representation quality by incorporating a powerful foundation model for encoding camera images. To further enrich the model's understanding of the environment and improve prediction quality, we integrate auxiliary prediction heads for semantic segmentation in the BEV representation. This multi-task learning approach provides richer contextual supervision, leading to a more comprehensive scene representation and, ultimately, to more accurate and higher-quality vectorized HD maps. The source code is available at https://github.com/LIvanoff/MapFM.
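
The multi-task idea in the abstract can be illustrated with a small sketch: the main vectorized-map loss is combined with an auxiliary BEV semantic segmentation loss. Everything in this example is an assumption made for illustration and is not taken from the paper or its repository: the function names `bev_bce_loss` and `multitask_loss`, the choice of binary cross-entropy for the segmentation term, the `seg_weight` parameter, and the use of nested Python lists in place of tensors.

```python
import math


def bev_bce_loss(logits, targets):
    """Mean binary cross-entropy over a BEV grid given as nested lists (H x W).

    Hypothetical stand-in for an auxiliary segmentation loss; the loss
    actually used by MapFM may differ.
    """
    total, count = 0.0, 0
    for row_logits, row_targets in zip(logits, targets):
        for z, t in zip(row_logits, row_targets):
            # Numerically stable BCE with logits:
            # max(z, 0) - z * t + log(1 + exp(-|z|))
            total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
            count += 1
    return total / count


def multitask_loss(map_loss, seg_logits, seg_targets, seg_weight=1.0):
    """Add a weighted auxiliary BEV segmentation term to the main map loss.

    `seg_weight` is an assumed hyperparameter balancing the two tasks.
    """
    return map_loss + seg_weight * bev_bce_loss(seg_logits, seg_targets)


if __name__ == "__main__":
    # Toy 2 x 2 BEV grid: logits from a segmentation head, binary lane mask.
    seg_logits = [[2.0, -1.0], [0.0, 3.0]]
    seg_targets = [[1, 0], [1, 1]]
    print(multitask_loss(map_loss=1.5, seg_logits=seg_logits,
                         seg_targets=seg_targets, seg_weight=0.5))
```

Setting `seg_weight` to zero recovers the single-task objective, which makes it easy to compare training with and without the auxiliary supervision.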

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques