Scalable Climate Data Analysis: Balancing Petascale Fidelity and Computational Cost
Aashish Panta, Amy Gooch, Giorgio Scorzelli, Michela Taufer, Valerio Pascucci

TL;DR
This paper presents a scalable climate data analysis system that reduces storage and computational costs by 99%, while maintaining high data fidelity, enabling cost-effective high-resolution climate research.
Contribution
It introduces a hierarchical multiresolution data management and ML-assisted reconstruction ecosystem that balances accuracy and efficiency for petascale climate data.
Findings
Reduced storage and computational costs by 99%
Maintained an RMS error of 1.46°C despite significant data reduction
Validated on petascale NASA climate datasets
Abstract
The growing resolution and volume of climate data from remote sensing and simulations pose significant storage, processing, and computational challenges. Traditional compression or subsampling methods often compromise data fidelity, limiting scientific insights. We introduce a scalable ecosystem that integrates hierarchical multiresolution data management, intelligent transmission, and ML-assisted reconstruction to balance accuracy and efficiency. Our approach reduces storage and computational costs by 99%, lowering expenses from $100,000 to $24 while maintaining a Root Mean Square (RMS) error of 1.46 degrees Celsius. Our experimental results confirm that even with significant data reduction, essential features required for accurate climate analysis are preserved. Validated on petascale NASA climate datasets, this solution enables cost-effective, high-fidelity climate analysis for…
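To make the two headline quantities concrete, the following is a minimal sketch of how an RMS error and a fractional reduction can be computed. It is not the authors' pipeline: the temperature field is synthetic, the stride-based `subsample` function only stands in for a coarse level of a multiresolution hierarchy, and `upsample_nearest` is a deliberately simple placeholder for the ML-assisted reconstruction. All function names are hypothetical. The only values taken from the abstract are the two cost figures ($100,000 and $24).

```python
import numpy as np


def rmse(reference, estimate):
    """Root mean square error between two arrays of equal shape."""
    diff = np.asarray(reference, dtype=float) - np.asarray(estimate, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def subsample(field, stride):
    """Keep every `stride`-th sample along both axes (assumed coarse level)."""
    return field[::stride, ::stride]


def upsample_nearest(coarse, stride, shape):
    """Nearest-neighbor reconstruction to `shape` (placeholder for an ML model)."""
    full = np.repeat(np.repeat(coarse, stride, axis=0), stride, axis=1)
    return full[: shape[0], : shape[1]]


def fractional_reduction(before, after):
    """Fraction of the original quantity that was eliminated."""
    return 1.0 - after / before


# Synthetic 1-degree "temperature" grid in degrees Celsius (illustrative only).
lat = np.linspace(-90.0, 90.0, 180)
lon = np.linspace(-180.0, 180.0, 360)
field = (
    15.0
    + 20.0 * np.cos(np.radians(lat))[:, None]
    + 2.0 * np.sin(np.radians(lon))[None, :]
)

stride = 10  # keeps 1 of every 100 samples
coarse = subsample(field, stride)
recon = upsample_nearest(coarse, stride, field.shape)

kept = coarse.size / field.size
error = rmse(field, recon)
cost_cut = fractional_reduction(100_000.0, 24.0)  # cost figures from the abstract

print(f"samples kept: {kept:.2%}")
print(f"RMS error of reconstruction: {error:.3f} degrees C")
print(f"cost reduction: {cost_cut:.3%}")
```

The RMS error printed here describes only the synthetic field and the nearest-neighbor placeholder, so it is not comparable to the 1.46 degrees Celsius reported in the abstract. The cost calculation does show that going from $100,000 to $24 corresponds to a reduction slightly above the stated 99%.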
