In-Situ Data Analysis of Protein Folding Trajectories
Travis Johnston, Boyu Zhang, Adam Liwo, Silvia Crivelli, Michela Taufer

TL;DR
This paper introduces a distributed in-situ data analysis method for large protein folding trajectories, enabling real-time analysis during simulations to reduce data movement and storage needs.
Contribution
It presents a novel distributed in-situ analysis approach that processes protein folding data locally, avoiding centralized data movement and enabling real-time insights.
Findings
Reduces storage space for trajectory data
Processes data in one pass during simulation
Builds global knowledge without data transfer
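The findings above can be illustrated with a minimal sketch. It is not the authors' method, which this summary does not describe. It assumes each node reduces its local trajectory to one scalar per frame (a hypothetical stand-in for whatever per-frame quantity is analyzed) and keeps only a small mergeable summary. The names `RunningStats` and `summarize_stream` are illustrative. The swapped-in technique is Welford's one-pass mean/variance update, combined across nodes with the standard pairwise merge formula, so only three numbers per node ever need to be exchanged.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class RunningStats:
    """Mergeable one-pass summary: count, mean, sum of squared deviations."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0

    def update(self, x: float) -> None:
        # Welford's update: fold one value into the summary; the value
        # itself does not need to be stored afterwards.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def merge(self, other: "RunningStats") -> "RunningStats":
        # Pairwise combination of two summaries; no raw frames are needed.
        n = self.n + other.n
        if n == 0:
            return RunningStats()
        delta = other.mean - self.mean
        mean = self.mean + delta * other.n / n
        m2 = self.m2 + other.m2 + delta * delta * self.n * other.n / n
        return RunningStats(n, mean, m2)

    @property
    def variance(self) -> float:
        # Population variance of everything seen so far.
        return self.m2 / self.n if self.n else 0.0


def summarize_stream(frames: Iterable[float]) -> RunningStats:
    """Consume a stream of per-frame values exactly once (hypothetical helper)."""
    stats = RunningStats()
    for value in frames:
        stats.update(value)
    return stats


# Two hypothetical nodes, each seeing only its own frames.
node_a = summarize_stream(iter([1.0, 2.0, 3.0]))
node_b = summarize_stream(iter([4.0, 5.0]))

# Global knowledge is built from the two small summaries alone.
global_stats = node_a.merge(node_b)
```

Because `merge` is associative, summaries could be combined in any order, for example in a tree across nodes. Whether the paper's analysis reduces to statistics of this kind is an assumption of this sketch.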
Abstract
The transition from petascale to exascale computers is characterized by substantial changes in the computer architectures and technologies. The research community relying on computational simulations is being forced to revisit the algorithms for data generation and analysis due to various concerns, such as higher degrees of concurrency, deeper memory hierarchies, substantial I/O and communication constraints. Simulations today typically save all data to analyze later. Simulations at the exascale will require us to analyze data as it is generated and save only what is really needed for analysis, which must be performed predominately in-situ, i.e., executed sufficiently fast locally, limiting memory and disk usage, and avoiding the need to move large data across nodes. In this paper, we present a distributed method that enables in-situ data analysis for large protein folding trajectory…
Taxonomy
Topics
Algorithms and Data Compression · Genomics and Phylogenetic Studies · Plant nutrient uptake and metabolism
