Towards an Open Science Platform for the Evaluation of Data Fusion

Weinan Huang; Junyi Chen; Lei Meng; David Lillis

arXiv:1802.04068·cs.IR·February 13, 2018

TL;DR

This paper advocates for a centralized open science platform to standardize evaluation and comparison of data fusion techniques in information retrieval, aiming to enhance reproducibility and progress in the field.

Contribution

It proposes the development of a unified software platform for evaluating data fusion algorithms, addressing current inconsistencies and reducing implementation burdens.

Findings

01 Identified key qualities for an evaluation platform.

02 Developed an early prototype system.

03 Potential to facilitate more consistent and comparable research results.

Abstract

Combining the results of different search engines in order to improve upon their performance has been the subject of many research papers. This has become known as the "Data Fusion" task, and has great promise in dealing with the vast quantity of unstructured textual data that is a feature of many Big Data scenarios. However, no universally-accepted evaluation methodology has emerged in the community. This makes it difficult to make meaningful comparisons between the various proposed techniques from reading the literature alone. Variations in the datasets, metrics, and baseline results have all contributed to this difficulty. This paper argues that a more unified approach is required, and that a centralised software platform should be developed to aid researchers in making comparisons between their algorithms and others. The desirable qualities of such a system have been identified…
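
For readers unfamiliar with the task, the sketch below illustrates what "combining the results of different search engines" can look like in practice. It uses CombSUM with min-max score normalisation, a classic fusion method chosen here purely for illustration; the abstract does not name it, and it is not claimed to be part of the authors' platform. The function names, document identifiers, and scores are all hypothetical.

```python
from collections import defaultdict


def min_max_normalise(run):
    """Scale one engine's scores to [0, 1] so engines become comparable.

    `run` maps document id -> raw score (hypothetical input format).
    A run whose scores are all equal maps every document to 1.0.
    """
    if not run:
        return {}
    lo, hi = min(run.values()), max(run.values())
    if hi == lo:
        return {doc: 1.0 for doc in run}
    return {doc: (score - lo) / (hi - lo) for doc, score in run.items()}


def comb_sum(runs):
    """Fuse several runs by summing each document's normalised scores.

    Documents missing from a run contribute 0 for that run.
    Returns (doc, fused_score) pairs, best first; ties break on doc id.
    """
    fused = defaultdict(float)
    for run in runs:
        for doc, score in min_max_normalise(run).items():
            fused[doc] += score
    return sorted(fused.items(), key=lambda pair: (-pair[1], pair[0]))


# Illustrative data only: two engines scoring on different scales.
engine_a = {"d1": 12.0, "d2": 9.0, "d3": 3.0}
engine_b = {"d2": 0.8, "d4": 0.6, "d1": 0.2}

ranking = comb_sum([engine_a, engine_b])
for doc, score in ranking:
    print(f"{doc}\t{score:.3f}")
```

In this toy input, d2 ranks first because both engines score it highly. Choices such as the normalisation scheme, the handling of missing documents, and the tie-breaking rule all affect the fused ranking, which is one reason results reported in different papers are hard to compare without a shared evaluation setup.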

Taxonomy

Topics: Data Quality and Management · Data Management and Algorithms · Data Mining Algorithms and Applications