Cooling Channel Design Optimization for High Power Multi-chip Packages
Michael Acquah, Zheng Liu

TL;DR
This paper presents a physics-based computational framework for optimizing embedded cooling channel layouts in high-power multi-chip modules; the reported optimal design lowers peak chip temperature by 140.45°C and average chip temperature by 35.87°C.
Contribution
It introduces a systematic, surrogate-based optimization approach for interdigitated cooling architectures in heterogeneous multi-chip packages.
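To illustrate what a surrogate-based optimization loop looks like in general, here is a minimal sketch in Python. It is not the paper's implementation: the function `toy_peak_temperature`, its coefficients, the single design variable (channel width in mm), and the choice of a quadratic surrogate are all assumptions made for illustration. The idea is to evaluate an expensive thermal model at a few design points, fit a cheap approximation, and minimize the approximation instead of the model.

```python
def toy_peak_temperature(width_mm):
    """Hypothetical stand-in for an expensive thermal simulation.

    Returns a made-up peak chip temperature (deg C) for a channel width.
    The real framework would run the conduction/porous-media model here.
    """
    return 60.0 + 400.0 * (width_mm - 0.5) ** 2


def fit_quadratic(xs, ys):
    """Fit y = a*x^2 + b*x + c exactly through three sample points."""
    (x0, x1, x2), (y0, y1, y2) = xs, ys
    d01 = (y1 - y0) / (x1 - x0)          # first divided differences
    d12 = (y2 - y1) / (x2 - x1)
    a = (d12 - d01) / (x2 - x0)          # second divided difference
    b = d01 - a * (x0 + x1)
    c = y0 - d01 * x0 + a * x0 * x1
    return a, b, c


def minimize_surrogate(a, b, c, lower, upper):
    """Minimize the quadratic surrogate on the interval [lower, upper]."""
    def surrogate(x):
        return a * x * x + b * x + c

    if a > 0:
        # Convex surrogate: use the vertex, clamped to the design bounds.
        return min(max(-b / (2.0 * a), lower), upper)
    # Otherwise the minimum over the interval is at one of the bounds.
    return lower if surrogate(lower) <= surrogate(upper) else upper


# Sample the (assumed) expensive model at three candidate widths.
sample_widths = (0.2, 0.4, 0.8)
sample_temps = tuple(toy_peak_temperature(w) for w in sample_widths)

a, b, c = fit_quadratic(sample_widths, sample_temps)
best_width = minimize_surrogate(a, b, c, lower=0.2, upper=0.8)
print(f"surrogate-optimal width: {best_width:.3f} mm")
```

In practice a surrogate would cover several design variables (the abstract lists channel count, width, and expansion over chip regions) and would be refit as new simulation results arrive; the one-dimensional quadratic above is only the simplest case of the same pattern.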
Findings
Optimal design reduces peak chip temperature by 140.45°C.
Average chip temperature decreases by 35.87°C.
Framework applied to NVIDIA GB200 architecture.
Abstract
Thermal management is a major challenge in next-generation high-performance computing systems, particularly for heterogeneous multi-chip packages such as the NVIDIA GB200 Grace Blackwell Superchip. In this work, a physics-based computational framework is developed to optimize embedded cooling channel layouts for high-power multi-chip modules. The model couples steady-state heat conduction with a porous-media representation of coolant transport and a row-wise coolant energy balance to estimate chip temperature fields within microchannel networks. In contrast to conventional designs, an interdigitated cooling architecture is parameterized using geometric variables, including channel count, channel width, and expansion over chip regions, enabling systematic design exploration. To enable efficient optimization, a surrogate-based approach is employed to approximate the relationship between…
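A row-wise coolant energy balance of the kind mentioned in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' model: it assumes a single-phase coolant with constant specific heat, that all heat from each row is absorbed by the coolant, and that rows are traversed in series. The function name, parameter names, and the default water-like specific heat are hypothetical. Each row raises the coolant temperature by Q / (m_dot * c_p).

```python
def coolant_row_temperatures(row_heat_w, inlet_temp_c, mass_flow_kg_s,
                             cp_j_per_kg_k=4180.0):
    """March the coolant temperature downstream, one row at a time.

    row_heat_w      -- heat absorbed in each row, in watts (assumed known)
    inlet_temp_c    -- coolant inlet temperature, in deg C
    mass_flow_kg_s  -- coolant mass flow rate, in kg/s
    cp_j_per_kg_k   -- specific heat; default is an assumed water-like value

    Returns the coolant temperature at the exit of each row.
    """
    if mass_flow_kg_s <= 0:
        raise ValueError("mass flow rate must be positive")

    heat_capacity_rate = mass_flow_kg_s * cp_j_per_kg_k  # W/K
    temps = []
    temp = inlet_temp_c
    for q in row_heat_w:
        # Steady-state energy balance for one row: dT = Q / (m_dot * c_p)
        temp += q / heat_capacity_rate
        temps.append(temp)
    return temps


# Example with made-up numbers: three rows dissipating different powers.
exit_temps = coolant_row_temperatures([300.0, 500.0, 300.0],
                                      inlet_temp_c=25.0,
                                      mass_flow_kg_s=0.05)
for row, t in enumerate(exit_temps, start=1):
    print(f"row {row}: coolant exit temperature {t:.2f} deg C")
```

The local coolant temperature from such a balance would typically serve as the sink temperature for the conduction solve in each row, which is why downstream chips tend to run hotter than upstream ones at the same power.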
