Reducing Aleatoric and Epistemic Uncertainty through Multi-modal Data Acquisition

Arthur Hoarau; Benjamin Quost; Sébastien Destercke; Willem Waegeman

arXiv:2501.18268 · cs.LG · February 10, 2026

TL;DR

This paper presents a multi-modal data acquisition framework that disentangles aleatoric and epistemic uncertainty, so that data collection can target the dominant source by acquiring either more samples or additional modalities, improving prediction reliability in AI systems.

Contribution

It introduces a framework for uncertainty-aware data acquisition across multiple modalities, challenges the common assumption that aleatoric uncertainty is irreducible, and integrates active learning with uncertainty quantification.

Findings

01. Aleatoric uncertainty decreases as the number of modalities increases.

02. Epistemic uncertainty decreases as more observations are collected.

03. The framework is demonstrated on two multi-modal datasets.

Abstract

To generate accurate and reliable predictions, modern AI systems need to combine data from multiple modalities, such as text, images, audio, spreadsheets, and time series. Multi-modal data introduces new opportunities and challenges for disentangling uncertainty: it is commonly assumed in the machine learning community that epistemic uncertainty can be reduced by collecting more data, while aleatoric uncertainty is irreducible. However, this assumption is challenged in modern AI systems when information is obtained from different modalities. This paper introduces an innovative data acquisition framework where uncertainty disentanglement leads to actionable decisions, allowing sampling in two directions: sample size and data modality. The main hypothesis is that aleatoric uncertainty decreases as the number of modalities increases, while epistemic uncertainty decreases by collecting more…
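The two acquisition directions described in the abstract can be illustrated with a small sketch. It assumes the widely used entropy-based decomposition over an ensemble of predictors (total = aleatoric + epistemic), which is a stand-in chosen for illustration and not necessarily the uncertainty measure used in the paper. The function names, the `threshold` parameter, and the decision rule are all hypothetical.

```python
import numpy as np


def entropy(p, axis=-1):
    """Shannon entropy (in nats) of probability vectors along `axis`."""
    p = np.clip(p, 1e-12, 1.0)  # avoid log(0)
    return -np.sum(p * np.log(p), axis=axis)


def decompose_uncertainty(member_probs):
    """Entropy-based decomposition for one input (an illustrative assumption).

    member_probs: array of shape (n_members, n_classes), one predicted
    class distribution per ensemble member.
    Returns (total, aleatoric, epistemic).
    """
    member_probs = np.asarray(member_probs, dtype=float)
    mean_probs = member_probs.mean(axis=0)
    total = entropy(mean_probs)               # entropy of the averaged prediction
    aleatoric = entropy(member_probs).mean()  # average entropy of each member
    epistemic = max(total - aleatoric, 0.0)   # disagreement between members
    return total, aleatoric, epistemic


def suggest_acquisition(member_probs, threshold=0.1):
    """Hypothetical decision rule, not taken from the paper.

    High epistemic uncertainty suggests collecting more samples;
    high aleatoric uncertainty suggests acquiring another modality.
    """
    _, aleatoric, epistemic = decompose_uncertainty(member_probs)
    if max(aleatoric, epistemic) < threshold:
        return "none"
    return "more_samples" if epistemic >= aleatoric else "new_modality"


if __name__ == "__main__":
    # Members disagree confidently: mostly epistemic uncertainty.
    print(suggest_acquisition([[0.99, 0.01], [0.01, 0.99]]))
    # Members agree that the classes are indistinguishable: mostly aleatoric.
    print(suggest_acquisition([[0.5, 0.5], [0.5, 0.5]]))
```

Under these assumptions, confident disagreement between ensemble members points toward sampling more observations, while agreement on an ambiguous prediction points toward adding a modality that might separate the classes.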

Code & Models

Repositories

ArthurHoa/deep-eknn
PyTorch · Official

Taxonomy

Topics: Semantic Web and Ontologies