Cooling just became the most strategic choice in AI infrastructure


For most of the last forty years, data center performance gains came from one place: smaller transistors. Moore's Law and Dennard scaling did the work.

Each new generation of silicon delivered more performance at the same or lower power, and thermal was a maintenance problem, not a performance limiter.

Cooling sat in the background. Operators measured it through PUE, optimized for it where convenient, and otherwise treated it as overhead.

That world is over.

Dennard scaling broke years ago, transistor efficiency gains are leveling off, and AI accelerator TDPs have climbed from 700 watts in the H100 generation to over 1,400 watts in current Blackwell deployments, with NVIDIA's upcoming Rubin platform expected to push further.

Thermal is no longer something that happens after the architectural decisions. It is now the binding constraint on how much performance a chip can sustain, and it is becoming one of the most strategic...

Copyright of this story solely belongs to techradar.com.
