Genomics and Biological Big Data: Facing Current and Future Challenges around Data and Software Sharing and Reproducibility
Sandra Gesing, Thomas Richard Connor, Ian Taylor

TL;DR
The paper discusses the challenges and necessary features of data sharing, software, and reproducibility in genomics big data, emphasizing integrative solutions for large-scale, user-friendly research.
Contribution
It identifies key challenges in genomic big data management and proposes characteristics for advanced solutions to enhance sharing, reproducibility, and usability.
Findings
Current solutions address only part of the challenges.
Effective solutions must improve reusability and reproducibility.
Seamless data sharing requires integrative approaches.
Abstract
Novel technologies in genomics allow data to be created at exascale with relatively minor human and laboratory effort, and thus at lower monetary cost, compared to the capabilities of only a decade ago. While the availability of this data makes it possible to answer research questions that would not have been feasible before, and perhaps could not even have been asked, the amount of data creates new challenges that call for new software and data management systems. Such new solutions have to take integrative approaches that consider not only the effectiveness and efficiency of data processing but also improve reusability, reproducibility and usability, tailored especially to the target user communities of genomic big data. In our opinion, current solutions each have their strengths and tackle part of the challenges, but none provides a complete solution. We present in this paper…
Taxonomy
Topics: Scientific Computing and Data Management · Genetics, Bioinformatics, and Biomedical Research · Research Data Management Practices
