MGMapNet: Multi-Granularity Representation Learning for End-to-End Vectorized HD Map Construction
Jing Yang, Minyue Jiang, Sen Yang, Xiao Tan, Yingying Li, Errui Ding, Hanli Wang, Jingdong Wang

TL;DR
MGMapNet introduces a multi-granularity representation framework for end-to-end vectorized HD map construction, effectively capturing both category and geometry information by integrating point-level and instance-level features.
Contribution
The paper proposes MGMapNet, a novel multi-granularity approach that models map elements with combined coarse and fine features, enhancing HD map accuracy.
Findings
Achieves state-of-the-art mAP scores on the nuScenes and Argoverse2 datasets.
Outperforms previous methods like MapTRv2 by significant margins.
Effectively models intrinsic relationships between points and instances.
Abstract
The construction of vectorized High-Definition (HD) maps typically requires capturing both category and geometry information of map elements. Current state-of-the-art methods often adopt either a point-level or an instance-level representation alone, overlooking the strong intrinsic relationships between points and instances. In this work, we propose a simple yet efficient framework named MGMapNet (Multi-Granularity Map Network) to model map elements with a multi-granularity representation, integrating both coarse-grained instance-level and fine-grained point-level queries. Specifically, these two granularities of queries are generated from the multi-scale bird's eye view (BEV) features using a proposed Multi-Granularity Aggregator. In this module, the instance-level query aggregates features over the entire scope covered by an instance, and the point-level query aggregates features locally.…
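The two granularities described in the abstract can be illustrated with a toy NumPy sketch. This is not the paper's Multi-Granularity Aggregator: it assumes a single-scale BEV feature map, bilinear sampling for the point-level features, and plain mean pooling for the instance-level feature, and the names `bilinear_sample` and `toy_multi_granularity` are hypothetical. It only shows the idea that point-level queries read features locally while the instance-level query summarizes the whole extent of the instance.

```python
import numpy as np


def bilinear_sample(bev, xy):
    """Bilinearly sample a BEV map of shape (H, W, C) at pixel coords xy (N, 2), given as (x, y)."""
    h, w, _ = bev.shape
    x = np.clip(xy[:, 0], 0, w - 1)
    y = np.clip(xy[:, 1], 0, h - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = (x - x0)[:, None]  # horizontal interpolation weight
    wy = (y - y0)[:, None]  # vertical interpolation weight
    top = bev[y0, x0] * (1 - wx) + bev[y0, x1] * wx
    bottom = bev[y1, x0] * (1 - wx) + bev[y1, x1] * wx
    return top * (1 - wy) + bottom * wy


def toy_multi_granularity(bev, points_xy):
    """Return (instance_query, point_queries) for one map element.

    Assumption: mean pooling stands in for the learned instance-level
    aggregation; the paper's module is not reproduced here.
    """
    point_queries = bilinear_sample(bev, points_xy)  # (N, C): local features
    instance_query = point_queries.mean(axis=0)      # (C,): whole-instance summary
    return instance_query, point_queries


# Example: a 4x4 BEV map whose two channels store each cell's (x, y) position.
ys, xs = np.mgrid[0:4, 0:4]
bev = np.stack([xs, ys], axis=-1).astype(float)
points = np.array([[0.0, 0.0], [1.5, 2.0], [3.0, 3.0]])
instance_query, point_queries = toy_multi_granularity(bev, points)
```

Because the example map stores coordinates in its channels, each point-level feature reproduces the point's own position, and the instance-level feature is the centroid of the points, which makes the local versus whole-instance distinction easy to inspect.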
Peer Reviews
Decision · ICLR 2025 Poster
1. The contributions are highlighted. The novel contributions compared with previous approaches are also discussed properly. 2. Both quantitative and qualitative results are shown and discussed. Ablation studies are conducted in a meaningful way.
1. The citations throughout the paper use the wrong command: they should use \citep{} instead of \cite{}. 2. From Figures 1 and 3, we can see the advantages of MGMapNet over other models. However, I can still see that the lanes extracted by MGMapNet are sometimes zigzagged while the ground-truth lines are straight. I wonder whether you can add some regularization or loss terms to avoid this. Maybe for those straight lines, resample their vertices along the straight lines every time during model training so t
1. The problem studied in the paper is very important in practice and finds applications in the real world. 2. The paper is clearly written and easy to follow. 3. Experiments are conducted to verify the performance of the proposed method.
1. The challenges and contributions of the proposed techniques require further elaboration. What are the specific challenges in designing the techniques in Section 3? 2. The encoders and decoders are mostly MLP-based. It is difficult to understand the logic, rationale, and difficulty of applying the techniques. 3. Some evaluation metrics in the experiments are not explained, e.g., AP_ped, AP_div, and AP_bou in Table 1. 4. How are the proposed techniques related to High-Definition? 5. Quality of figure
S1. The paper introduces a method that combines both coarse-grained instance-level and fine-grained point-level queries, effectively capturing both global category information and local geometric details of map elements. S2. The design of the Multi-Granularity Aggregator and Point Instance Interaction modules facilitates efficient and effective information sharing between instance-level and point-level queries. S3. The proposed MGMapNet framework outperforms several baseline models, achieving
W1. The paper’s description can be overwhelming for readers who are not deeply familiar with the HD map construction topic (e.g., me). For example, it lacks a formal problem formulation, which would help in grounding the research context. Additionally, the method's explanation is a bit difficult to follow. W2. The paper could be strengthened by providing a detailed analysis of the time and space complexity of MGMapNet compared to baseline models. Given that efficiency is a key motivation, under
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques · Multimodal Machine Learning Applications
