Reliable Querying of Very Large, Fast Moving and Noisy Predicted Interaction Data using Hierarchical Crowd Curation
Hasan M. Jamil, Fereidoon Sadri

TL;DR
This paper introduces a scalable crowd computing approach for rapid, tentative annotation of large, noisy biological data, enabling early data use with subsequent expert validation.
Contribution
It presents a novel hierarchical crowd curation method that manages trust and supports ad hoc queries for reliable biological data analysis.
Findings
Effective management of crowd trust improves annotation reliability.
The approach supports fast, tentative data curation for large biological datasets.
It enables early data utilization with minimal initial investment.
Abstract
The abundance of predicted and mined but uncertain biological data shows a huge need for massive, efficient and scalable curation efforts. The human expertise warranted by any successful curation enterprise is often economically prohibitive, especially for speculative end user queries that may not ultimately bear fruit. So the challenge remains in devising a low cost engine capable of delivering fast but tentative annotation and curation of a set of data items that can be authoritatively validated by experts later, demanding significantly less investment. The aim thus is to make a large volume of predicted data available for use as early as possible with an acceptable degree of confidence in their accuracy while the curation continues. In this paper, we present a novel approach to annotation and curation of biological database contents using crowd computing. The technical contribution is…
Taxonomy
Topics: Mobile Crowdsensing and Crowdsourcing · Data Stream Mining Techniques · Scientific Computing and Data Management
