Secure, Efficient Data Transport and Replica Management for High-Performance Data-Intensive Computing
Bill Allcock, Joe Bester, John Bresnahan, Ann L. Chervenak, Ian Foster, Carl Kesselman, Sam Meder, Veronika Nefedova, Darcy Quesnel, Steven Tuecke

TL;DR
This paper introduces secure, high-performance data transport and replica management services for Data Grids, enhancing reliability and efficiency in data-intensive scientific applications across distributed environments.
Contribution
It presents the design and implementation of GridFTP and a replica management service, extending FTP with features tailored for Data Grid needs, including security and performance enhancements.
Findings
GridFTP extends FTP with striping and partial access.
Preliminary performance results show improved data transfer efficiency.
Integration with Globus Toolkit enhances security and management.
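As a rough illustration of what striping and partial access mean for a transfer, the Python sketch below first clamps a requested byte range to the file (partial access) and then divides that range into fixed-size blocks assigned round-robin to several stripes (striping). This is not the GridFTP wire protocol or any Globus Toolkit API. The function names `partial_range` and `stripe_blocks`, the fixed block size, and the round-robin assignment are all assumptions made for this example.

```python
def partial_range(file_size, offset, length):
    """Clamp a requested (offset, length) to the file.

    Returns a half-open byte range (start, end). Hypothetical helper,
    not part of GridFTP or the Globus Toolkit.
    """
    if file_size < 0 or offset < 0 or length < 0:
        raise ValueError("file_size, offset, and length must be non-negative")
    start = min(offset, file_size)
    end = min(offset + length, file_size)
    return start, end


def stripe_blocks(start, end, block_size, num_stripes):
    """Split [start, end) into blocks and assign them round-robin to stripes.

    Returns one list of (block_start, block_end) ranges per stripe.
    Round-robin assignment is an assumption chosen for simplicity.
    """
    if block_size <= 0 or num_stripes <= 0:
        raise ValueError("block_size and num_stripes must be positive")
    plan = [[] for _ in range(num_stripes)]
    pos = start
    index = 0
    while pos < end:
        block_end = min(pos + block_size, end)
        plan[index % num_stripes].append((pos, block_end))
        pos = block_end
        index += 1
    return plan


if __name__ == "__main__":
    # Request 500 bytes starting at offset 100 of a 1000-byte file,
    # moved as 200-byte blocks over 2 stripes.
    start, end = partial_range(1000, 100, 500)
    for stripe_id, blocks in enumerate(stripe_blocks(start, end, 200, 2)):
        print(stripe_id, blocks)
```

Each stripe's list could then be handed to a separate data connection or host, with the receiver reassembling blocks by offset. A real implementation would also have to handle retries and out-of-order arrival, which this sketch leaves out.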
Abstract
An emerging class of data-intensive applications involves the geographically dispersed extraction of complex scientific information from very large collections of measured or computed data. Such applications arise, for example, in experimental physics, where the data in question is generated by accelerators, and in simulation science, where the data is generated by supercomputers. So-called Data Grids provide essential infrastructure for such applications, much as the Internet provides essential services for applications such as e-mail and the Web. We describe here two services that we believe are fundamental to any Data Grid: reliable, high-speed transport and replica management. Our high-speed transport service, GridFTP, extends the popular FTP protocol with new features required for Data Grid applications, such as striping and partial file access. Our replica management service…
Taxonomy
Topics: Distributed and Parallel Computing Systems · Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies
