Efficient Multi-site Data Movement Using Constraint Programming for Data Hungry Science
Michal Zerola, Jérôme Lauret, Roman Barták, Michal Šumbera

TL;DR
This paper presents a constraint programming approach for optimizing multi-site data transfers and placements in distributed scientific computing, improving efficiency and reducing waiting times for data-hungry experiments.
Contribution
It introduces a novel CP-based model for data transfer and placement planning, with enhancements for faster scheduling and practical implementation in scientific data management.
Findings
The CP approach reduces the time needed to schedule data transfers.
Enhanced solver techniques improve scalability and efficiency.
In the comparison, CP outperforms Peer-2-Peer models in real scenarios.
Abstract
For the past decade, HENP experiments have been moving towards a distributed computing model in an effort to concurrently process tasks over enormous data sets that keep growing over time. To make the best use of all available, geographically spread resources and to minimize processing time, the question of efficient data transfers and placements must also be addressed. A key question is whether the time penalty for moving the data to the computational resources is worth the presumed gain. As a step towards truly distributed task scheduling, we present a technique based on Constraint Programming (CP). The CP technique schedules data transfers from multiple resources, considering all available paths of diverse characteristics (capacity, sharing and storage), with minimal user waiting time as the objective. We introduce a model for planning data…
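To make the scheduling decision described in the abstract concrete, here is a toy sketch in Python. It is not the paper's CP model: it uses plain exhaustive search instead of a constraint solver, and the function name `plan_transfers`, the site names, file sizes, link speeds, and the assumption that transfers sharing a link run one after another are all hypothetical, chosen only for illustration. It picks one source site per file so that the last file reaches the destination as early as possible.

```python
from itertools import product


def plan_transfers(file_sizes, replicas, link_speed):
    """Choose one source site per file, minimizing the arrival time of the last file.

    file_sizes: {file: size in GB}
    replicas:   {file: [sites holding a copy]}
    link_speed: {site: GB/s on the link from that site to the destination}
    Assumption: transfers that share a link run sequentially on it.
    """
    files = sorted(file_sizes)
    best_plan, best_makespan = None, float("inf")
    # Enumerate every assignment of files to one of their replica sites.
    for choice in product(*(replicas[f] for f in files)):
        busy = {}  # total transfer time accumulated on each link
        for f, site in zip(files, choice):
            busy[site] = busy.get(site, 0.0) + file_sizes[f] / link_speed[site]
        makespan = max(busy.values())  # the user waits for the slowest link
        if makespan < best_makespan:
            best_plan, best_makespan = dict(zip(files, choice)), makespan
    return best_plan, best_makespan


# Hypothetical input: site X has the faster link, and file "c" exists only at X.
file_sizes = {"a": 10, "b": 10, "c": 20}
replicas = {"a": ["X", "Y"], "b": ["X", "Y"], "c": ["X"]}
link_speed = {"X": 2.0, "Y": 1.0}

plan, makespan = plan_transfers(file_sizes, replicas, link_speed)
print(plan, makespan)
```

Exhaustive enumeration grows exponentially with the number of files, which is one reason a real planner would rely on a constraint solver with propagation and search heuristics rather than a loop like this one.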
