Distributed Cross-Channel Hierarchical Aggregation for Foundation Models
Aristeidis Tsaris, Isaac Lyngaas, John Lagregren, Mohamed Wahib, Larry York, Prasanna Balaprakash, Dan Lu, Feiyi Wang, Xiao Wang

TL;DR
This paper introduces D-CHAG, a scalable method for distributed aggregation in vision-based scientific models that reduces memory use by up to 75% and more than doubles throughput on large GPU clusters.
Contribution
The paper presents D-CHAG, a distributed hierarchical aggregation technique that is compatible with any model-parallel strategy and any type of vision transformer architecture, and that improves computational efficiency on image datasets with a large number of channels.
Findings
Achieved up to 75% memory reduction.
More than doubled throughput on 1024 GPUs.
Evaluated on hyperspectral imaging and weather forecasting tasks.
Abstract
Vision-based scientific foundation models hold significant promise for advancing scientific discovery and innovation. This potential stems from their ability to aggregate images from diverse sources, such as varying physical groundings or data acquisition systems, and to learn spatio-temporal correlations using transformer architectures. However, tokenizing and aggregating images can be compute-intensive, a challenge not fully addressed by current distributed methods. In this work, we introduce the Distributed Cross-Channel Hierarchical Aggregation (D-CHAG) approach, designed for datasets with a large number of channels across image modalities. Our method is compatible with any model-parallel strategy and any type of vision transformer architecture, significantly improving computational efficiency. We evaluated D-CHAG on hyperspectral imaging and weather forecasting tasks. When integrated…
Taxonomy
Topics: Infrastructure Maintenance and Monitoring
Methods: Dense Connections · Layer Normalization · Vision Transformer
