GroupLane: End-to-End 3D Lane Detection with Channel-wise Grouping

Zhuoling Li; Chunrui Han; Zheng Ge; Jinrong Yang; En Yu; Haoqian Wang; Hengshuang Zhao; Xiangyu Zhang

arXiv:2307.09472 · cs.CV · July 19, 2023 · 1 citation

Open Access · 3 Reviews

TL;DR

GroupLane is a fast, end-to-end 3D lane detection method that performs row-wise classification in bird's-eye view (BEV) and supports both vertical and horizontal lanes with high accuracy and efficiency.

Contribution

The paper introduces a fully convolutional, end-to-end 3D lane detection framework that combines a novel row-wise classification approach with lane-aware channel grouping and outperforms state-of-the-art methods.

Findings

01. Outperforms PersFormer by 13.6% F1 score on OpenLane.

02. Achieves nearly 7x faster inference with significantly fewer FLOPs.

03. Supports detection of both vertical and horizontal lanes in bird's-eye view.

Abstract

Efficiency matters for 3D lane detection because of practical deployment demands. In this work, we propose a simple, fast, and end-to-end detector that still maintains high detection precision. Specifically, we devise a set of fully convolutional heads based on row-wise classification. In contrast to previous counterparts, ours supports recognizing both vertical and horizontal lanes. Moreover, our method is the first to perform row-wise classification in bird's-eye view. In the heads, we split the features into multiple groups, and each group of features corresponds to a lane instance. During training, the predictions are associated with lane labels using the proposed single-win one-to-one matching to compute the loss, and no post-processing is required at inference. In this way, our proposed fully convolutional detector, GroupLane, realizes end-to-end detection like DETR.…
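To make the two ideas in the abstract concrete, here is a minimal pure-Python sketch of channel-wise grouping and row-wise classification on a toy BEV grid. It is an illustration under stated assumptions, not the authors' implementation: the function names `split_into_groups` and `row_wise_decode`, the toy shapes, and the argmax decoding are all hypothetical.

```python
# Hypothetical sketch: names, shapes, and the argmax decoding are assumptions,
# not taken from the paper's implementation.


def split_into_groups(channels, num_groups):
    """Split a list of per-channel feature maps into equal channel groups.

    Each group is meant to stand for one lane candidate.
    """
    if len(channels) % num_groups != 0:
        raise ValueError("channel count must be divisible by num_groups")
    size = len(channels) // num_groups
    return [channels[i * size:(i + 1) * size] for i in range(num_groups)]


def row_wise_decode(row_scores):
    """For each BEV row, pick the column index with the highest score."""
    return [max(range(len(row)), key=row.__getitem__) for row in row_scores]


# Toy example: 4 channels split into 2 groups, i.e. at most 2 lane candidates.
channels = ["c0", "c1", "c2", "c3"]
groups = split_into_groups(channels, num_groups=2)

# Toy column scores for one lane candidate on a 3-row x 4-column BEV grid.
scores = [
    [0.1, 0.7, 0.1, 0.1],
    [0.1, 0.2, 0.6, 0.1],
    [0.0, 0.1, 0.2, 0.7],
]
lane_columns = row_wise_decode(scores)  # one column index per row
```

In this toy setup the number of groups fixes the number of lane candidates, and each candidate is decoded as one column index per BEV row.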

Peer Reviews

Decision · Submitted to ICLR 2024

Reviewer 01 · Rating 5 (marginally below the acceptance threshold) · Confidence 5

Strengths

- Reports test results on several datasets and achieves high performance.

Weaknesses

- The Ultra Fast Deep Lane Detection method (Qin et al., 2022) has already introduced hybrid anchor-based lane detection, which predicts row and column anchors corresponding to lanes.
- Table 1 and Table 2 show inconsistent results when different backbone models are used. It would be good if the authors investigated this issue further.

Reference: Z. Qin, P. Zhang, and X. Li, "Ultra Fast Deep Lane Detection With Hybrid Anchor Driven Ordinal Classification," IEEE TPAMI, 2022.

Reviewer 02 · Rating 6 (marginally above the acceptance threshold) · Confidence 3

Strengths

The paper is nicely written and easy to read. The main contribution, in my view, is splitting the BEV features into two groups of candidates: horizontal candidates and vertical candidates. Each group has 6 heads that predict existence confidence, visibility, category, row-wise classification index, x-axis offset, and z-axis offset. Since the proposed model splits the candidates into horizontal and vertical groups, the authors propose an adapted technique called single-win one-to-o

Weaknesses

In the experimental part, GroupLane is evaluated on three datasets. The selected baseline model is PersFormer (described as the best published model). Can you give details on this choice? According to the benchmark webpage, the best 2022 model reaches a 58% F1 score and PersFormer is currently ranked 9th. Figure 2 is therefore not a fair comparison and should be updated with newer models. Moreover, can you add the two following references (ranked 1 and 2) from ICCV2023: LATR: 3D Lane Detection from

Reviewer 03 · Rating 5 (marginally below the acceptance threshold) · Confidence 4

Strengths

1. The idea seems fundamentally sound.
2. Splitting the BEV features for vertical and horizontal lane detection would be valuable, especially when the model is deployed on an edge device.
3. The paper is well written and very easy to read.

Weaknesses

1. Simply dividing each group into N outputs limits the maximum number of lanes the model can output.
2. The SOM strategy is simple yet effective, but its details are not well explained or are missing altogether, e.g. the matching cost definition.
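Both weaknesses can be illustrated with a generic one-to-one matching sketch. This is not the paper's SOM strategy: the matching cost is not defined in the text available here, so the `cost` matrix below is a hypothetical placeholder, and the brute-force search over permutations is a stand-in chosen only because it needs no external library. The function name `one_to_one_match` is likewise an assumption.

```python
from itertools import permutations


def one_to_one_match(cost):
    """Assign each label to a distinct prediction slot, minimising total cost.

    cost[i][j] is a hypothetical cost of matching label i to prediction j.
    Brute force over permutations, which is acceptable for a handful of lanes.
    """
    num_labels = len(cost)
    num_preds = len(cost[0])
    if num_labels > num_preds:
        # With only N prediction slots, extra lane labels cannot be matched.
        raise ValueError("more labels than prediction slots")
    best = min(
        permutations(range(num_preds), num_labels),
        key=lambda perm: sum(cost[i][j] for i, j in enumerate(perm)),
    )
    return list(best)


# Toy cost matrix: 2 lane labels, 3 prediction slots (values are made up).
cost = [
    [0.9, 0.2, 0.8],
    [0.1, 0.7, 0.6],
]
assignment = one_to_one_match(cost)  # assignment[i] is the slot for label i
```

The `ValueError` branch mirrors the first weakness: a fixed number of output slots caps how many lanes can be matched and therefore detected.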

Taxonomy

Topics: Autonomous Vehicle Technology and Safety · Advanced Neural Network Applications · Anomaly Detection Techniques and Applications

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings