On Offline Evaluation of Vision-based Driving Models
Felipe Codevilla, Antonio M. López, Vladlen Koltun, Alexey Dosovitskiy

TL;DR
This paper explores the limitations of offline evaluation metrics for vision-based driving models, highlighting that prediction error alone does not reliably indicate driving quality, and proposes improved evaluation strategies.
Contribution
The paper analyzes the correlation between offline metrics and actual driving performance, and shows how the choice of validation dataset and offline metric affects how well offline evaluation predicts driving quality.
Findings
Offline prediction error is not necessarily correlated with driving quality.
Two models with identical prediction error can differ dramatically in driving performance.
Selecting an appropriate validation dataset and suitable offline metrics significantly improves the correlation between offline evaluation and driving quality.
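To make the notion of correlation between an offline metric and driving quality concrete, here is a minimal sketch of how one might measure it across a set of trained models. It assumes mean absolute error as the offline metric and an episode success rate as the online metric; the function names and all numbers are hypothetical placeholders chosen for illustration, not values or code from the paper.

```python
import math


def mean_absolute_error(predicted, target):
    """Offline metric: average absolute gap between predicted and recorded commands."""
    if len(predicted) != len(target) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - t) for p, t in zip(predicted, target)) / len(predicted)


def pearson_correlation(xs, ys):
    """Sample Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one metric is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical placeholder values, one entry per trained model (NOT from the paper).
offline_error = [0.10, 0.12, 0.15, 0.20]        # e.g. validation MAE per model
online_success_rate = [0.40, 0.80, 0.30, 0.60]  # e.g. fraction of completed episodes

r = pearson_correlation(offline_error, online_success_rate)
print(f"Pearson r between offline error and online success rate: {r:.2f}")
```

A coefficient near -1 would mean that lower offline error reliably goes with better driving, while a value near 0 would mean the offline metric says little about driving quality. Swapping in a different offline metric or validation set and recomputing the coefficient is one simple way to compare evaluation setups.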
Abstract
Autonomous driving models should ideally be evaluated by deploying them on a fleet of physical vehicles in the real world. Unfortunately, this approach is not practical for the vast majority of researchers. An attractive alternative is to evaluate models offline, on a pre-collected validation dataset with ground truth annotation. In this paper, we investigate the relation between various online and offline metrics for evaluation of autonomous driving models. We find that offline prediction error is not necessarily correlated with driving quality, and two models with identical prediction error can differ dramatically in their driving performance. We show that the correlation of offline evaluation with driving quality can be significantly improved by selecting an appropriate validation dataset and suitable offline metrics. The supplementary video can be viewed at…
Taxonomy
Topics: Autonomous Vehicle Technology and Safety · Video Surveillance and Tracking Methods · Advanced Neural Network Applications
