Towards Robust Training Datasets for Machine Learning with Ontologies: A Case Study for Emergency Road Vehicle Detection
Lynn Vonderhaar, Timothy Elvira, Tyler Procko, Omar Ochoa

TL;DR
This paper introduces a method using ontologies to validate and improve the robustness and completeness of training datasets for machine learning in safety-critical domains, demonstrated through a case study on emergency road vehicle detection.
Contribution
It proposes a novel approach to enhance ML trustworthiness by leveraging domain and image quality ontologies to validate dataset robustness and completeness.
Findings
Ontologies effectively validate dataset domain completeness.
The method improves dataset robustness for safety-critical ML applications.
Proof of concept demonstrated in emergency vehicle detection domain.
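The completeness check behind these findings can be illustrated with a minimal sketch. The summary does not give the paper's actual ontologies, so everything below is an assumption made for illustration: the attribute names, their values, the annotation format, and the function coverage_gaps are all hypothetical. The idea shown is only that an ontology enumerates the concept combinations a dataset should cover, and the dataset's annotations are compared against that enumeration to find gaps.

```python
from itertools import product

# Hypothetical toy ontology (not from the paper): each attribute maps to
# the set of values the domain is assumed to contain.
DOMAIN_ONTOLOGY = {
    "vehicle_type": {"ambulance", "fire_truck", "police_car"},
    "lighting": {"day", "night"},
}

# Hypothetical image annotations, one dict per training image.
ANNOTATIONS = [
    {"vehicle_type": "ambulance", "lighting": "day"},
    {"vehicle_type": "ambulance", "lighting": "night"},
    {"vehicle_type": "fire_truck", "lighting": "day"},
    {"vehicle_type": "police_car", "lighting": "day"},
    {"vehicle_type": "police_car", "lighting": "night"},
]


def coverage_gaps(ontology, annotations):
    """Return the attribute-value combinations that no annotation covers."""
    attrs = sorted(ontology)
    # Every combination of values the ontology says should be represented.
    required = set(product(*(ontology[a] for a in attrs)))
    # Combinations that actually occur in the dataset.
    seen = {tuple(ann.get(a) for a in attrs) for ann in annotations}
    return [dict(zip(attrs, combo)) for combo in sorted(required - seen)]


if __name__ == "__main__":
    for gap in coverage_gaps(DOMAIN_ONTOLOGY, ANNOTATIONS):
        print("Missing from dataset:", gap)
```

In this toy data the only uncovered combination is a fire truck at night, which is the kind of gap such a check would flag for further data collection.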
Abstract
Countless domains rely on Machine Learning (ML) models, including safety-critical domains, such as autonomous driving, which this paper focuses on. While the black box nature of ML is simply a nuisance in some domains, in safety-critical domains, this makes ML models difficult to trust. To fully utilize ML models in safety-critical domains, it would be beneficial to have a method to improve trust in model robustness and accuracy without human experts checking each decision. This research proposes a method to increase trust in ML models used in safety-critical domains by ensuring the robustness and completeness of the model's training dataset. Because ML models embody what they are trained with, ensuring the completeness of training datasets can help to increase the trust in the training of ML models. To this end, this paper proposes the use of a domain ontology and an image quality…
Taxonomy
Topics: Machine Learning and Data Classification · Natural Language Processing Techniques · Anomaly Detection Techniques and Applications
Methods: Ontology
