Human Uncertainty and Ranking Error -- The Secret of Successful Evaluation in Predictive Data Mining
Kevin Jasberg, Sergej Sizov

TL;DR
This paper investigates how human decision volatility impacts the evaluation and ranking of data mining models, revealing biases and error propagation, and proposes probabilistic solutions to improve assessment accuracy.
Contribution
It introduces a probabilistic framework to analyze human uncertainty effects on model evaluation and ranking, providing mathematical proofs and solution strategies.
Findings
Human uncertainty biases prediction metrics like RMSE.
Uncertainty propagates and induces errors in algorithm rankings.
Probabilistic methods can mitigate evaluation biases.
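The first two findings can be illustrated with a small simulation. This is a minimal sketch under assumptions that do not come from the paper: latent "true" ratings exist, human responses scatter around them with Gaussian noise of standard deviation `sigma`, and two hypothetical predictors A and B differ only slightly in quality. The function name `simulate` and all parameter values are illustrative. The sketch compares the RMSE computed against the latent ratings with the RMSE computed against repeated noisy responses, and counts how often the noisy ranking of A and B disagrees with the noise-free ranking.

```python
import numpy as np

def rmse(pred, obs):
    """Root mean squared error between predictions and observations."""
    return float(np.sqrt(np.mean((pred - obs) ** 2)))

def simulate(n_items=200, sigma=0.5, n_trials=500, seed=0):
    """Hypothetical experiment: rate the same items repeatedly with noise.

    All numbers here are assumptions chosen for illustration only.
    """
    rng = np.random.default_rng(seed)

    # Assumed latent ratings on a 1-5 scale (never observable in practice).
    true = rng.uniform(1.0, 5.0, size=n_items)

    # Two hypothetical predictors with slightly different error levels.
    pred_a = true + rng.normal(0.0, 0.30, size=n_items)
    pred_b = true + rng.normal(0.0, 0.32, size=n_items)

    # RMSE against the latent ratings (the noise-free reference).
    clean_a = rmse(pred_a, true)
    clean_b = rmse(pred_b, true)

    noisy_a = np.empty(n_trials)
    noisy_b = np.empty(n_trials)
    for t in range(n_trials):
        # One rating session: each response scatters around its latent value.
        obs = true + rng.normal(0.0, sigma, size=n_items)
        noisy_a[t] = rmse(pred_a, obs)
        noisy_b[t] = rmse(pred_b, obs)

    # Fraction of sessions whose ranking contradicts the noise-free ranking.
    flip_rate = float(np.mean((noisy_a < noisy_b) != (clean_a < clean_b)))

    return {
        "clean_rmse_a": clean_a,
        "clean_rmse_b": clean_b,
        "noisy_rmse_a_mean": float(noisy_a.mean()),
        "noisy_rmse_b_mean": float(noisy_b.mean()),
        "flip_rate": flip_rate,
    }

if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

In this setup the RMSE measured against noisy responses sits well above the noise-free value, because the response variance adds to the squared prediction error. The flip rate gives a rough sense of how often a single evaluation run could rank the two predictors in the wrong order; its size depends entirely on the assumed noise level, item count, and quality gap.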
Abstract
One of the most crucial issues in data mining is to model human behaviour in order to provide personalisation, adaptation and recommendation. This usually involves implicit or explicit knowledge, either by observing user interactions, or by asking users directly. But these sources of information are always subject to the volatility of human decisions, making utilised data uncertain to a particular extent. In this contribution, we elaborate on the impact of this human uncertainty when it comes to comparative assessments of different data mining approaches. In particular, we reveal two problems: (1) biasing effects on various metrics of model-based prediction and (2) the propagation of uncertainty and its thus induced error probabilities for algorithm rankings. For this purpose, we introduce a probabilistic view and prove the existence of those problems mathematically, as well as provide…
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Data Management and Algorithms · Data Stream Mining Techniques
