TL;DR
GaussianAD introduces a Gaussian-centric framework for autonomous driving that uses 3D semantic Gaussians for scene representation, enabling efficient perception and planning in an end-to-end manner, validated on the nuScenes dataset.
Contribution
The paper proposes a Gaussian-based scene representation and an end-to-end training framework for autonomous driving. It targets the trade-off between comprehensiveness and efficiency that affects dense representations (e.g., bird's eye view) and sparse representations (e.g., instance boxes).
Findings
Effective 3D perception with sparse convolutions
Accurate motion planning and scene forecasting
Effectiveness validated on the nuScenes dataset
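To make the first finding concrete, here is a minimal sketch of why a sparse convolution is cheap on a mostly empty scene: it visits only occupied voxels. This is a toy, pure-Python illustration and not the paper's implementation. The names `voxelize`, `sparse_conv3d`, and `box_kernel` are hypothetical, features are single scalars, and outputs are restricted to already-active voxels (the "submanifold" variant), which is an assumption made here for simplicity.

```python
from itertools import product


def voxelize(points, voxel_size):
    """Map 3D points to integer voxel coordinates, counting points per voxel."""
    active = {}
    for x, y, z in points:
        key = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        active[key] = active.get(key, 0.0) + 1.0
    return active


def sparse_conv3d(active, kernel):
    """Toy sparse convolution: compute outputs only at active voxels,
    summing kernel-weighted features of neighbours that are also active."""
    out = {}
    for (i, j, k) in active:
        total = 0.0
        for (di, dj, dk), weight in kernel.items():
            total += weight * active.get((i + di, j + dj, k + dk), 0.0)
        out[(i, j, k)] = total
    return out


# Hypothetical 3x3x3 box kernel with unit weights.
box_kernel = {offset: 1.0 for offset in product((-1, 0, 1), repeat=3)}

# Stand-ins for Gaussian centers: two nearby clusters and one far-away point.
points = [(0.1, 0.1, 0.1), (0.2, 0.3, 0.1), (1.5, 0.2, 0.2), (9.0, 9.0, 9.0)]
active = voxelize(points, voxel_size=1.0)
out = sparse_conv3d(active, box_kernel)
```

The cost scales with the number of occupied voxels times the kernel size, not with the full grid volume, which is the property that makes a sparse scene description attractive for perception heads.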
Abstract
Vision-based autonomous driving shows great potential due to its satisfactory performance and low costs. Most existing methods adopt dense representations (e.g., bird's eye view) or sparse representations (e.g., instance boxes) for decision-making, which suffer from the trade-off between comprehensiveness and efficiency. This paper explores a Gaussian-centric end-to-end autonomous driving (GaussianAD) framework and exploits 3D semantic Gaussians to extensively yet sparsely describe the scene. We initialize the scene with uniform 3D Gaussians and use surrounding-view images to progressively refine them to obtain the 3D Gaussian scene representation. We then use sparse convolutions to efficiently perform 3D perception (e.g., 3D detection, semantic map construction). We predict 3D flows for the Gaussians with dynamic semantics and plan the ego trajectory accordingly with an objective of…
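The abstract's first and last steps (uniform initialization, then moving Gaussians with dynamic semantics by a predicted 3D flow) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the `Gaussian` fields, the grid-based placement, and the rule "a Gaussian is dynamic if its highest-scoring class is in `dynamic_classes`" are all hypothetical choices. The image-based refinement step is omitted because the abstract does not specify it.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class Gaussian:
    mean: tuple       # (x, y, z) center
    scale: tuple      # per-axis extent
    rotation: tuple   # unit quaternion (w, x, y, z)
    semantics: list   # per-class scores (hypothetical layout)


def init_uniform_gaussians(bounds, counts, num_classes):
    """Place one Gaussian at the center of each cell of a regular 3D grid."""
    axes, cell = [], []
    for (lo, hi), n in zip(bounds, counts):
        step = (hi - lo) / n
        cell.append(step)
        axes.append([lo + (i + 0.5) * step for i in range(n)])
    return [
        Gaussian(mean=m, scale=tuple(cell), rotation=(1.0, 0.0, 0.0, 0.0),
                 semantics=[0.0] * num_classes)
        for m in product(*axes)
    ]


def apply_flow(gaussians, flows, dynamic_classes):
    """Shift Gaussians whose top-scoring class is dynamic by their 3D flow."""
    moved = []
    for g, flow in zip(gaussians, flows):
        top = max(range(len(g.semantics)), key=g.semantics.__getitem__)
        if top in dynamic_classes:
            mean = tuple(m + d for m, d in zip(g.mean, flow))
        else:
            mean = g.mean  # static Gaussians stay in place
        moved.append(Gaussian(mean, g.scale, g.rotation, g.semantics))
    return moved


# Usage: a 2 x 2 x 1 grid; mark the first Gaussian as class 1 (dynamic).
gs = init_uniform_gaussians([(-2, 2), (-2, 2), (0, 2)], (2, 2, 1), num_classes=3)
gs[0].semantics = [0.0, 5.0, 0.0]
flows = [(1.0, 0.0, 0.0)] * len(gs)
moved = apply_flow(gs, flows, dynamic_classes={1})
```

In this sketch only the Gaussian tagged as dynamic moves, which mirrors the abstract's idea of forecasting the scene by predicting flow for Gaussians with dynamic semantics.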
Taxonomy
Methods: ADaptive gradient method with the OPTimal convergence rate · Sparse Convolutions
