Challenges Towards Deploying Data Intensive Scientific Applications on Extreme Heterogeneity Supercomputers

Hang Liu; Yufei Ding; Da Zheng; Seung Woo Son; Da Yan

arXiv:1804.09738·cs.DC·April 27, 2018

TL;DR

This paper identifies key challenges in deploying data-intensive scientific applications on extreme heterogeneity supercomputers, focusing on data movement, hardware scheduling, and programming complexity to enable effective utilization.

Contribution

It systematically categorizes eight challenges across data movement, scheduling, and programming, providing a comprehensive framework for future research in heterogeneous supercomputing.

Findings

01 Highlighting the importance of fast data movement in heterogeneous systems

02 Emphasizing the need for intelligent hardware scheduling

03 Addressing programming complexity to promote adoption

Abstract

Transistor shrinking, which powered the advancement of computing over the past half century, has stalled due to the power wall; extreme heterogeneity now promises to be the next driving force to meet the needs of increasingly diverse scientific domains. To unlock the potential of such supercomputers, we identify eight potential challenges in three categories. First, one needs fast data movement, since extreme heterogeneity will inevitably complicate the communication circuits and thus hamper data movement. Second, we need to intelligently schedule suitable hardware for the corresponding applications and stages. Third, we have to lower the programming complexity in order to encourage the adoption of heterogeneous computing.

Taxonomy

Topics: Distributed and Parallel Computing Systems · Graph Theory and Algorithms · Scientific Computing and Data Management