Seven Principles for Effective Scientific Big-Data Systems

Niall H. Robinson; Joe Hamman; Ryan Abernathey

arXiv:1908.03356 · cs.DC · June 26, 2020 · 1 citation

Open Access

TL;DR

This paper outlines seven architectural principles crucial for developing effective, robust, and flexible scientific big-data systems to handle the exponential growth of data and complexity in scientific research.

Contribution

It introduces a set of seven fundamental principles for designing scientific big-data systems, emphasizing architecture to improve data handling and discovery.

Findings

01. The proposed architectural principles make scientific data processing more robust.

02. Following the principles improves system flexibility and scalability.

03. Platforms built on the principles better support scientific discovery.

Abstract

We should be in a golden age of scientific discovery, given that we have more data and more compute power available than ever before, plus a new generation of algorithms that can learn effectively from data. But paradoxically, in many data-driven fields, the eureka moments are becoming increasingly rare. Scientists are struggling to keep pace with the explosion in the volume and complexity of scientific data. We describe here a few simple architectural principles that we believe are essential in order to create effective, robust, and flexible platforms that make the best use of emerging technology to deal with the exponential growth of scientific data.

Taxonomy

Topics: Scientific Computing and Data Management · Computational Physics and Python Applications · Data Analysis with R