Cooling Matters: Benchmarking Large Language Models and Vision-Language Models on Liquid-Cooled Versus Air-Cooled H100 GPU Systems
Imran Latif, Muhammad Ali Shafique, Hayat Ullah, Alex C. Newkirk, Xi Yu, and Arslan Munir

TL;DR
This paper benchmarks large language and vision-language models on liquid-cooled versus air-cooled H100 GPU systems, demonstrating that liquid cooling improves thermal stability, performance, and energy efficiency in data centers.
Contribution
It provides the first detailed comparison of LLM and VLM performance and efficiency on liquid-cooled versus air-cooled GPU systems, highlighting significant benefits of liquid cooling.
Findings
Liquid cooling holds GPU temperatures between 41 and 50°C under load, while air-cooled GPUs fluctuate between 54 and 72°C.
Liquid-cooled systems achieve 17% higher performance than air-cooled systems (54 vs. 46 TFLOPs per GPU).
Liquid cooling improves performance per watt and reduces energy overhead.
Abstract
The unprecedented growth in artificial intelligence (AI) workloads, recently dominated by large language models (LLMs) and vision-language models (VLMs), has intensified power and cooling demands in data centers. This study benchmarks LLMs and VLMs on two HGX nodes, each with 8x NVIDIA H100 graphics processing units (GPUs), using liquid and air cooling. Leveraging GPU Burn, Weights and Biases, and IPMItool, we collect detailed thermal, power, and computation data. Results show that the liquid-cooled systems maintain GPU temperatures between 41-50 degrees Celsius, while the air-cooled counterparts fluctuate between 54-72 degrees Celsius under load. This thermal stability of liquid-cooled systems yields 17 percent higher performance (54 TFLOPs per GPU vs. 46 TFLOPs per GPU), improved performance per watt, reduced energy overhead, and greater system efficiency than the air-cooled…
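The headline figures in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the 54 and 46 TFLOPs per GPU and the two temperature ranges quoted above, and the helper name `relative_gain` and the constant names are assumptions introduced here, not part of the paper.

```python
def relative_gain(new: float, baseline: float) -> float:
    """Return the fractional improvement of `new` over `baseline`."""
    return (new - baseline) / baseline


# Figures quoted in the abstract (per-GPU throughput and temperature ranges under load).
LIQUID_TFLOPS = 54.0
AIR_TFLOPS = 46.0
LIQUID_TEMP_C = (41, 50)
AIR_TEMP_C = (54, 72)

# Relative throughput advantage of the liquid-cooled node.
gain = relative_gain(LIQUID_TFLOPS, AIR_TFLOPS)

# Width of each reported temperature range, as a rough proxy for thermal stability.
liquid_span = LIQUID_TEMP_C[1] - LIQUID_TEMP_C[0]
air_span = AIR_TEMP_C[1] - AIR_TEMP_C[0]

print(f"Throughput gain: {gain:.1%}")
print(f"Temperature spread: liquid {liquid_span} °C, air {air_span} °C")
```

The computed gain is 8/46, which rounds to the 17 percent reported, and the air-cooled temperature range is twice as wide as the liquid-cooled one.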
Taxonomy
Topics: Machine Learning in Materials Science · Advanced Neural Network Applications
