Artificial intelligence (AI) and high-performance computing (HPC) workloads depend on high-performance chips (typically graphics processing units, or GPUs) to perform mathematical calculations. As the computing performance of these chips increases, more heat must be removed. Server vendors are switching from air cooling to liquid cooling to deal with this increased heat density.
Direct liquid cooling (DLC), also known as direct-to-chip cooling, has become the de facto method for cooling these chips. However, the data center industry is still learning how to implement this technology at scale and is in the early phases of developing standards around it. As a result, there are several key challenges related to DLC systems in data centers.
In this whitepaper, we focus on the challenges associated with DLC deployments of around 500 kW or more spanning 10 or more IT racks. These rough thresholds indicate where a specific type of DLC architecture becomes applicable, as we will explore in detail later.