On the Challenges of Collaborative Data Processing
Sylvie Noel, Daniel Lemire

TL;DR
This paper asks whether large-scale collaborative data analysis is now possible. A quantitative analysis of Web 2.0 data visualization sites shows that at least moderate open collaboration occurs, and the paper then explores the factors that limit collaboration over data.
Contribution
It provides a quantitative analysis of Web 2.0 data visualization sites and explores the limiting factors of large-scale collaborative data analysis.
Findings
Evidence of at least moderate open collaboration in Web 2.0 data visualization sites
Identification of key limiting factors in collaborative data processing
Insights into the potential for large-scale collaborative analysis
Abstract
The last 30 years have seen the creation of a variety of electronic collaboration tools for science and business. Some of the best-known collaboration tools support text editing (e.g., wikis). Wikipedia's success shows that large-scale collaboration can produce highly valuable content. Meanwhile, much structured data is being collected and made publicly available. We have never had access to more powerful databases and statistical packages. Is large-scale collaborative data analysis now possible? Using a quantitative analysis of Web 2.0 data visualization sites, we find evidence that at least moderate open collaboration occurs. We then explore some of the limiting factors of collaboration over data.
Taxonomy
Topics: Wikis in Education and Collaboration · Scientific Computing and Data Management
