VQ-Map: Bird's-Eye-View Map Layout Estimation in Tokenized Discrete Space via Vector Quantization
Yiwei Zhang, Jin Gao, Fudong Ge, Guan Luo, Bing Li, Zhaoxiang Zhang, Haibin Ling, Weiming Hu

TL;DR
VQ-Map introduces a novel approach using vector quantization and tokenized discrete space to improve bird's-eye-view map layout estimation, effectively handling occlusion and low-resolution challenges.
Contribution
The paper presents a new method leveraging VQ-VAE for high-level BEV semantics in a tokenized space, enabling better alignment and map generation from perspective view features.
Findings
Achieved state-of-the-art IoU scores on nuScenes and Argoverse benchmarks.
Demonstrated effective alignment of PV features with BEV tokens.
Generated high-quality BEV maps under challenging conditions.
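For readers unfamiliar with the metric behind the first finding, the following is a minimal sketch of intersection-over-union (IoU) for one binary BEV layout class. The function name `binary_iou`, the toy grids, and the choice to return 1.0 when both masks are empty are illustrative assumptions; this summary does not describe the paper's evaluation protocol (thresholds, class list, averaging).

```python
from typing import Sequence


def binary_iou(pred: Sequence[Sequence[int]], target: Sequence[Sequence[int]]) -> float:
    """IoU between two binary grids of the same shape (1 = cell belongs to the class)."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


# Toy 2x3 BEV grids for a single class (hypothetical data).
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]

iou = binary_iou(pred, target)  # 2 shared cells / 4 cells in the union
```

In this toy case two cells are marked in both grids and four cells are marked in at least one, so the score is 0.5.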
Abstract
Bird's-eye-view (BEV) map layout estimation requires an accurate and full understanding of the semantics of the environmental elements around the ego car to make the results coherent and realistic. Due to the challenges posed by occlusion, unfavourable imaging conditions and low resolution, generating the BEV semantic maps corresponding to corrupted or invalid areas in the perspective view (PV) has recently become appealing. The question is how to align the PV features with the generative models to facilitate the map estimation. In this paper, we propose to utilize a generative model similar to the Vector Quantized-Variational AutoEncoder (VQ-VAE) to acquire prior knowledge for the high-level BEV semantics in the tokenized discrete space. Thanks to the obtained BEV tokens accompanied by a codebook embedding encapsulating the semantics for different BEV elements in the…
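The tokenization idea in the abstract can be illustrated with the generic nearest-neighbour lookup used in VQ-VAE-style models: each continuous feature vector is replaced by the index of its closest codebook entry. This is a minimal sketch of that general technique, not the authors' implementation; the function names, the 2-dimensional toy codebook, and the feature values are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    features: Sequence[Vector], codebook: Sequence[Vector]
) -> Tuple[List[int], List[List[float]]]:
    """Map each feature to its nearest codebook entry.

    Returns the discrete token indices and the corresponding
    codebook embeddings (the quantized features).
    """
    tokens: List[int] = []
    quantized: List[List[float]] = []
    for feature in features:
        # Nearest-neighbour search over all codebook entries.
        index = min(
            range(len(codebook)),
            key=lambda k: squared_distance(feature, codebook[k]),
        )
        tokens.append(index)
        quantized.append(list(codebook[index]))
    return tokens, quantized


# Hypothetical codebook with 3 entries of dimension 2.
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
# Hypothetical encoder outputs, one vector per BEV grid cell.
features = [[0.1, -0.2], [0.9, 1.2], [0.2, 1.7], [0.6, 0.6]]

tokens, quantized = quantize(features, codebook)
```

Here `tokens` plays the role of the discrete BEV tokens and `codebook` the role of the codebook embedding. A real model would learn the codebook jointly with an encoder and decoder, which this sketch omits.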
Taxonomy
Topics: Advanced Vision and Imaging · Digital Image Processing Techniques · Retinal Imaging and Analysis
Methods: ALIGN · Sparse Evolutionary Training
