Concurrent CPU-GPU Task Programming using Modern C++
Tsung-Wei Huang, Yibo Lin

TL;DR
Heteroflow is a modern C++ library that simplifies the development of parallel CPU-GPU programs by expressing them as task dependency graphs. The authors report a 7.7x runtime speed-up on a VLSI timing analysis workload with million-scale tasking.
Contribution
This paper introduces Heteroflow, a novel C++ library for efficient heterogeneous CPU-GPU programming with task dependency graphs, enhancing performance and productivity.
Findings
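Achieved a 7.7x runtime speed-up on a VLSI timing analysis example with million-scale tasking
Demonstrated performance scalability across different numbers of CPUs and GPUs and different problem sizes
Reported better cost-efficiency than existing libraries in performance scaling, programming productivity, and solution generality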
Abstract
In this paper, we introduce Heteroflow, a new C++ library that helps developers quickly write parallel CPU-GPU programs using task dependency graphs. Heteroflow leverages modern C++ and task-based approaches to enable efficient implementations of heterogeneous decomposition strategies. Our CPU-GPU programming model lets users express a problem in a way that supports effective separation of concerns and expertise encapsulation. Compared with existing libraries, Heteroflow is more cost-efficient in performance scaling, programming productivity, and solution generality. We have evaluated Heteroflow on two real applications in VLSI design automation and demonstrated performance scalability across different numbers of CPUs and GPUs and different problem sizes. In one example of VLSI timing analysis with million-scale tasking, Heteroflow achieved a 7.7x runtime speed-up (99 vs 13…
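To make the idea of a CPU-GPU task dependency graph concrete, here is a minimal sketch in standard C++. It is not the Heteroflow API: the names `TaskGraph`, `add`, `precede`, `run`, `DemoResult`, and `run_saxpy_demo` are hypothetical and exist only for illustration. There is no GPU involved; the "device" buffers and the "kernel" are plain CPU stand-ins, and tasks run sequentially in dependency order, whereas a real runtime would dispatch ready tasks concurrently to CPU threads and GPU streams.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal task graph (illustration only, NOT the Heteroflow API).
class TaskGraph {
public:
  // Registers a task and returns its id.
  std::size_t add(std::string name, std::function<void()> work) {
    nodes_.push_back(Node{std::move(name), std::move(work), {}, 0});
    return nodes_.size() - 1;
  }

  // Declares that task `before` must finish before task `after` starts.
  void precede(std::size_t before, std::size_t after) {
    nodes_[before].successors.push_back(after);
    ++nodes_[after].num_deps;
  }

  // Runs every task once in a valid dependency order (Kahn's algorithm)
  // and returns the task names in the order they ran. Sequential by design.
  std::vector<std::string> run() {
    std::vector<std::size_t> pending(nodes_.size());
    std::queue<std::size_t> ready;
    for (std::size_t i = 0; i < nodes_.size(); ++i) {
      pending[i] = nodes_[i].num_deps;
      if (pending[i] == 0) ready.push(i);
    }
    std::vector<std::string> order;
    while (!ready.empty()) {
      const std::size_t id = ready.front();
      ready.pop();
      nodes_[id].work();
      order.push_back(nodes_[id].name);
      for (std::size_t s : nodes_[id].successors) {
        if (--pending[s] == 0) ready.push(s);
      }
    }
    if (order.size() != nodes_.size()) {
      throw std::logic_error("task graph contains a cycle");
    }
    return order;
  }

private:
  struct Node {
    std::string name;
    std::function<void()> work;
    std::vector<std::size_t> successors;
    std::size_t num_deps;
  };
  std::vector<Node> nodes_;
};

struct DemoResult {
  std::vector<std::string> order;  // names of tasks in execution order
  std::vector<float> y;            // final host-side result
};

// Simulated host -> device -> kernel -> host pipeline computing y = 2*x + y.
DemoResult run_saxpy_demo() {
  std::vector<float> x, y;    // host buffers
  std::vector<float> dx, dy;  // stand-ins for device buffers (no real GPU)
  TaskGraph g;
  const auto init_x   = g.add("init_x",   [&] { x.assign(4, 1.0f); });
  const auto init_y   = g.add("init_y",   [&] { y.assign(4, 2.0f); });
  const auto copy_in  = g.add("copy_in",  [&] { dx = x; dy = y; });
  const auto kernel   = g.add("kernel",   [&] {
    for (std::size_t i = 0; i < dx.size(); ++i) dy[i] = 2.0f * dx[i] + dy[i];
  });
  const auto copy_out = g.add("copy_out", [&] { y = dy; });
  g.precede(init_x, copy_in);   // both host buffers must exist before the copy
  g.precede(init_y, copy_in);
  g.precede(copy_in, kernel);   // data must be on the "device" before the kernel
  g.precede(kernel, copy_out);  // results return to the host after the kernel
  std::vector<std::string> order = g.run();
  return DemoResult{std::move(order), y};
}
```

Calling `run_saxpy_demo()` returns the execution order together with the host-side result vector. The point of the sketch is the separation the abstract describes: each task holds only its own work, and the ordering constraints are stated separately as edges, which leaves a scheduler free to decide where and when each ready task runs.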
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Ferroelectric and Negative Capacitance Devices · Embedded Systems Design Techniques
