Leveraging User Access Patterns and Advanced Cyberinfrastructure to Accelerate Data Delivery from Shared-use Scientific Observatories
Yubo Qin, Ivan Rodero, Anthony Simonet, Charles Meertens, Daniel Reiner, James Riley, Manish Parashar

TL;DR
This paper introduces a push-based data delivery framework that uses in-network capabilities and data pre-fetching to improve data access speed and reduce network load for shared scientific observatories.
Contribution
It presents a novel framework combining in-network data staging with a hybrid pre-fetching model based on user access patterns for observatories.
Findings
Significant improvement in data delivery performance.
Reduction in network traffic at observatory facilities.
Effective data pre-fetching based on access pattern analysis.
Abstract
With the growing number and increasing availability of shared-use instruments and observatories, observational data is becoming an essential part of application workflows and a contributor to scientific discoveries in a range of disciplines. However, the corresponding growth in the number of users accessing these facilities, coupled with the expansion in the scale and variety of the data, is making it challenging for these facilities to ensure their data can be accessed, integrated, and analyzed in a timely manner, and is resulting in significant demands on their cyberinfrastructure (CI). In this paper, we present the design of a push-based data delivery framework that leverages emerging in-network capabilities, along with data pre-fetching techniques based on a hybrid data management model. Specifically, we analyze data access traces for two large-scale observatories, Ocean Observatories…
Taxonomy
Topics: Distributed and Parallel Computing Systems · Scientific Computing and Data Management · Advanced Data Storage Technologies
