Truth Discovery Algorithms: An Experimental Evaluation
Dalia Attia Waguih, Laure Berti-Equille (Qatar Computing Research Institute)

TL;DR
This paper systematically compares 12 state-of-the-art truth discovery algorithms through extensive experiments on synthetic and real data, analyzing their efficiency, usability, and scalability to guide future research.
Contribution
It provides a comprehensive review, reference implementations, and an experimental framework for evaluating truth discovery methods under diverse scenarios.
Findings
Initialization and parameter settings significantly affect algorithm performance.
Scalability varies widely among the evaluated methods.
The experimental framework enables thorough comparison across multiple data scenarios.
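To make the role of initialization and convergence settings concrete, here is a minimal sketch of a generic iterative truth discovery loop. It is not one of the 12 algorithms evaluated in the paper. The function name `discover_truth` and the parameters `init_trust`, `max_iter`, and `tol` are hypothetical, chosen only for illustration. The loop alternates between a trust-weighted vote per data item and a re-estimate of each source's trust as the fraction of its claims that match the current winners.

```python
from collections import defaultdict

def discover_truth(claims, init_trust=0.8, max_iter=20, tol=1e-6):
    """Illustrative sketch only. claims is a list of (source, item, value)."""
    sources = {s for s, _, _ in claims}
    trust = {s: init_trust for s in sources}  # assumed uniform initialization
    truths = {}
    for _ in range(max_iter):
        # Step 1: each value's score is the summed trust of the sources claiming it.
        scores = defaultdict(lambda: defaultdict(float))
        for s, item, value in claims:
            scores[item][value] += trust[s]
        # Ties are broken by sorted order of the values, so results are deterministic.
        truths = {item: max(sorted(vals), key=vals.get)
                  for item, vals in scores.items()}
        # Step 2: a source's trust is the fraction of its claims matching the winners.
        hits, total = defaultdict(int), defaultdict(int)
        for s, item, value in claims:
            total[s] += 1
            hits[s] += int(truths[item] == value)
        new_trust = {s: hits[s] / total[s] for s in sources}
        delta = max(abs(new_trust[s] - trust[s]) for s in sources)
        trust = new_trust
        if delta < tol:  # stop once trust scores no longer change
            break
    return truths, trust

# Toy data: s3 disagrees with the other sources on every item.
claims = [
    ("s1", "x", "A"), ("s2", "x", "A"), ("s3", "x", "Z"),
    ("s1", "y", "B"), ("s2", "y", "B"), ("s3", "y", "Y"),
    ("s1", "z", "D"), ("s3", "z", "C"),
]
truths, trust = discover_truth(claims)
print(truths, trust)
```

In this toy data, item `z` starts as a tie between `s1` and `s3`. Once `s3` loses trust on `x` and `y`, the second iteration resolves `z` in favor of the value claimed by `s1`. Real methods differ in how they score values and update trust, which is one reason parameter choices can matter.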
Abstract
A fundamental problem in data fusion is to determine the veracity of multi-source data in order to resolve conflicts. While previous work in truth discovery has proved to be useful in practice for specific settings, sources' behavior or data set characteristics, there has been limited systematic comparison of the competing methods in terms of efficiency, usability, and repeatability. We remedy this deficit by providing a comprehensive review of 12 state-of-the-art algorithms for truth discovery. We provide reference implementations and an in-depth evaluation of the methods based on extensive experiments on synthetic and real-world data. We analyze aspects of the problem that have not been explicitly studied before, such as the impact of initialization and parameter setting, convergence, and scalability. We provide an experimental framework for extensively comparing the methods in a wide…
Taxonomy
Topics: Mobile Crowdsensing and Crowdsourcing · Privacy-Preserving Technologies in Data · Data Stream Mining Techniques
