Towards an Open Science Platform for the Evaluation of Data Fusion
Weinan Huang, Junyi Chen, Lei Meng, David Lillis

TL;DR
This paper advocates for a centralized open science platform to standardize evaluation and comparison of data fusion techniques in information retrieval, aiming to enhance reproducibility and progress in the field.
Contribution
It proposes the development of a unified software platform for evaluating data fusion algorithms, addressing current inconsistencies and reducing implementation burdens.
Findings
Identified key qualities for an evaluation platform.
Developed an early prototype system.
Argued that such a platform could make research results more consistent and comparable.
Abstract
Combining the results of different search engines in order to improve upon their performance has been the subject of many research papers. This has become known as the "Data Fusion" task, and has great promise in dealing with the vast quantity of unstructured textual data that is a feature of many Big Data scenarios. However, no universally-accepted evaluation methodology has emerged in the community. This makes it difficult to make meaningful comparisons between the various proposed techniques from reading the literature alone. Variations in the datasets, metrics, and baseline results have all contributed to this difficulty. This paper argues that a more unified approach is required, and that a centralised software platform should be developed to aid researchers in making comparisons between their algorithms and others. The desirable qualities of such a system have been identified…
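As a concrete illustration of the task the abstract describes, the sketch below fuses the scored result lists of two search engines into a single ranking. It uses CombSUM with min-max score normalisation, a classic fusion baseline chosen here purely as an example; it is not taken from the paper, and the function names, document identifiers, and scores are all hypothetical.

```python
from collections import defaultdict


def min_max_normalise(scores):
    """Rescale one engine's scores to [0, 1]; a constant list maps to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def comb_sum(result_lists):
    """Fuse several {doc_id: score} result lists by summing normalised scores."""
    fused = defaultdict(float)
    for scores in result_lists:
        for doc, s in min_max_normalise(scores).items():
            fused[doc] += s
    # Highest fused score first; ties broken by document id for determinism.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical outputs of two engines for the same query (made-up scores).
engine_a = {"d1": 12.0, "d2": 9.0, "d3": 3.0}
engine_b = {"d2": 0.9, "d4": 0.5, "d1": 0.1}

ranking = comb_sum([engine_a, engine_b])
for doc, score in ranking:
    print(f"{doc}\t{score:.3f}")
```

Even in this small case the outcome depends on the normalisation scheme and on how documents missing from one list are treated (here they simply contribute nothing), which is the kind of variation that makes results reported in different papers hard to compare.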
Taxonomy
Topics: Data Quality and Management · Data Management and Algorithms · Data Mining Algorithms and Applications
