Diverse Complexity Measures for Dataset Curation in Self-driving

Abbas Sadat; Sean Segal; Sergio Casas; James Tu; Bin Yang; Raquel Urtasun; Ersin Yumer

arXiv:2101.06554·cs.LG·January 19, 2021

TL;DR

This paper introduces a novel data curation method for self-driving datasets that uses diverse criteria to select traffic scenes, improving model generalization across multiple tasks.

Contribution

It proposes a new data selection approach based on diverse interestingness criteria, addressing limitations of fixed-model active learning in self-driving applications.

Findings

01. Improved model performance and generalization on multiple tasks.

02. Effective dataset curation leads to better autonomous driving models.

03. Versatile approach applicable across different models and tasks.

Abstract

Modern self-driving autonomy systems rely heavily on deep learning. As a consequence, their performance is influenced significantly by the quality and richness of the training data. Data collection platforms can generate many hours of raw data on a daily basis; however, it is not feasible to label everything. It is thus of key importance to have a mechanism to identify "what to label". Active learning approaches identify examples to label, but their notion of interestingness is tied to a fixed model performing a particular task. These assumptions are not valid in self-driving, where we have to solve a diverse set of tasks (i.e., perception and motion forecasting) and our models evolve frequently over time. In this paper we propose a new data selection method that exploits a diverse set of criteria that quantify the interestingness of traffic scenes. Our experiments on…
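As a rough illustration of model-agnostic selection from several interestingness criteria, the sketch below ranks scenes separately under each criterion and fills a labeling budget round-robin across those rankings. This is an assumption made for illustration only: the abstract above does not state which criteria the paper uses or how it combines them, and the names `select_scenes`, `example_scores`, `num_actors`, and `ego_speed_change` are hypothetical.

```python
from typing import Dict, List


def select_scenes(scores: Dict[str, Dict[str, float]], budget: int) -> List[str]:
    """Pick up to `budget` scene ids by cycling over per-criterion rankings.

    `scores` maps a (hypothetical) criterion name to {scene_id: score},
    where a higher score means a more "interesting" scene.
    """
    # Rank scenes per criterion, highest score first; ties broken by scene id.
    rankings = {
        name: sorted(per_scene, key=lambda s: (-per_scene[s], s))
        for name, per_scene in scores.items()
    }
    cursors = {name: 0 for name in rankings}
    selected: List[str] = []
    chosen = set()
    progressed = True
    while len(selected) < budget and progressed:
        progressed = False
        for name, ranking in rankings.items():
            if len(selected) >= budget:
                break
            i = cursors[name]
            # Skip scenes that another criterion has already selected.
            while i < len(ranking) and ranking[i] in chosen:
                i += 1
            if i < len(ranking):
                selected.append(ranking[i])
                chosen.add(ranking[i])
                progressed = True
                i += 1
            cursors[name] = i
    return selected


# Hypothetical scores for three scenes under two made-up criteria.
example_scores = {
    "num_actors": {"a": 12.0, "b": 3.0, "c": 7.0},
    "ego_speed_change": {"a": 0.1, "b": 2.5, "c": 0.4},
}
print(select_scenes(example_scores, budget=2))
```

Because the scores here depend only on scene properties rather than on a trained model's uncertainty, the selection does not change when the downstream model or task changes, which is the property the abstract contrasts with fixed-model active learning.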

Taxonomy

Topics: Data Stream Mining Techniques · Anomaly Detection Techniques and Applications · Machine Learning and Data Classification