Technical Report for Argoverse Challenges on Unified Sensor-based Detection, Tracking, and Forecasting
Zhepeng Wang, Feng Chen, Kanokphan Lertniphonphan, Siwei Chen, Jinyao Bao, Pengfei Zheng, Jinbao Zhang, Kaer Huang, Tao Zhang

TL;DR
This paper introduces a unified sensor-based detection, tracking, and forecasting network for autonomous driving, achieving top results in the Argoverse Challenges by integrating multiple tasks into a single model with a BEV encoder.
Contribution
A novel unified network architecture that combines detection, tracking, and forecasting tasks using a BEV encoder with spatial and temporal fusion.
Findings
Achieved 1st place in Detection, Tracking, and Forecasting on the E2E Forecasting track of the Argoverse Challenges at the CVPR 2023 WAD.
Evaluated multi-task learning on the Argoverse 2 sensor dataset across 26 object categories.
Unified approach outperforms separate task-specific models.
Abstract
This report presents Le3DE2E, our solution for unified sensor-based detection, tracking, and forecasting in the Argoverse Challenges at the CVPR 2023 Workshop on Autonomous Driving (WAD). We propose a unified network that handles three tasks: detection, tracking, and forecasting. The solution adopts a strong Bird's Eye View (BEV) encoder with spatial and temporal fusion and generates a unified representation shared by all three tasks. It was evaluated on the Argoverse 2 sensor dataset, which covers detection, tracking, and forecasting of 26 object categories. We achieved 1st place in Detection, Tracking, and Forecasting on the E2E Forecasting track of the Argoverse Challenges at the CVPR 2023 WAD.
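The idea of one shared BEV representation feeding three task heads can be illustrated with a toy sketch. This is not the Le3DE2E implementation: the grid of scalar features, the blending weight `alpha`, the threshold-based detector, the nearest-cell tracker, and the constant-velocity forecaster are all hypothetical stand-ins, chosen only to show how detection, tracking, and forecasting can read from the same temporally fused features.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Grid = List[List[float]]   # toy BEV map: one scalar feature per cell (assumption)
Cell = Tuple[int, int]     # (row, col) index into the BEV grid


def temporal_fuse(current: Grid, previous: Grid, alpha: float = 0.5) -> Grid:
    """Blend the current BEV grid with the previous one.

    A fixed weighted average stands in for learned temporal fusion.
    """
    return [
        [alpha * c + (1.0 - alpha) * p for c, p in zip(cur_row, prev_row)]
        for cur_row, prev_row in zip(current, previous)
    ]


@dataclass
class UnifiedOutput:
    detections: List[Tuple[int, int, float]]  # (row, col, score)
    track_ids: List[int]                      # one id per detection
    forecasts: List[List[Cell]]               # future cells per detection


def run_heads(bev: Grid, prev_tracks: Dict[int, Cell],
              threshold: float = 0.5, horizon: int = 3) -> UnifiedOutput:
    """Run three toy heads on the same fused BEV features."""
    # Detection head: every cell whose feature exceeds the threshold.
    detections = [(r, c, v) for r, row in enumerate(bev)
                  for c, v in enumerate(row) if v > threshold]

    track_ids: List[int] = []
    forecasts: List[List[Cell]] = []
    used = set()
    next_id = max(prev_tracks, default=-1) + 1
    for r, c, _ in detections:
        # Tracking head: reuse the id of the closest unused previous track
        # within one cell (Manhattan distance), otherwise start a new track.
        candidates = [(abs(r - pr) + abs(c - pc), tid)
                      for tid, (pr, pc) in prev_tracks.items()
                      if tid not in used and abs(r - pr) + abs(c - pc) <= 1]
        if candidates:
            _, tid = min(candidates)
            pr, pc = prev_tracks[tid]
            velocity = (r - pr, c - pc)
            used.add(tid)
        else:
            tid, velocity = next_id, (0, 0)
            next_id += 1
        track_ids.append(tid)
        # Forecasting head: constant-velocity extrapolation in grid cells.
        forecasts.append([(r + velocity[0] * k, c + velocity[1] * k)
                          for k in range(1, horizon + 1)])
    return UnifiedOutput(detections, track_ids, forecasts)


# Usage: one object moved from cell (0, 1) to cell (0, 2) between frames.
previous = [[0.0, 0.2, 0.7], [0.0, 0.0, 0.0]]
current = [[0.0, 0.1, 0.9], [0.0, 0.0, 0.0]]
fused = temporal_fuse(current, previous)
out = run_heads(fused, prev_tracks={7: (0, 1)})
print(out.track_ids, out.forecasts)
```

In this example the single detection inherits track id 7 and is extrapolated one cell per step along the row. A real system would replace each toy head with a learned module, but the data flow (fuse once, then branch into task heads) is the point of the sketch.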
Taxonomy
Topics: Air Quality Monitoring and Forecasting · Advanced Neural Network Applications · Video Surveillance and Tracking Methods
