Overclocking the Cloud? Immersion Cooling Could Enable Faster Servers

Oct. 27, 2021
Microsoft has been test-driving the use of overclocked processors running in immersion cooling tanks, and says the combination can boost server performance by 20 percent. The company says immersion cooling “unlocks new potential for data center rack design.”

Gamers have long used the combination of overclocking CPUs and water cooling to squeeze peak performance out of PCs. Can that same approach create more powerful cloud computing platforms?

New research from Microsoft suggests that it might. The company has been test-driving the use of overclocked processors running in immersion cooling tanks, and says the combination allows servers to perform at a higher level.

“Based on our tests, we’ve found that for some chipsets, the performance can increase by 20 percent through the use of liquid cooling,” said Christian Belady, distinguished engineer and vice president of Microsoft’s datacenter advanced development group. “This demonstrates how liquid cooling can be used not only to support our sustainability goals to reduce and eventually eliminate water used for cooling in datacenters, but also generate more performant chips operating at warmer coolant temperatures for advanced AI and ML workloads.”

Those types of performance improvements could be significant when applied at cloud scale, and enable new approaches to how data centers are built and operated.

“Because of the efficiencies in both power and cooling that liquid cooling affords us, it unlocks new potential for data center rack design,” said Belady.

The research was shared as part of Microsoft’s update on its sustainability initiatives, including a plan to slash its data center water usage by 95 percent by 2024. Increased use of liquid cooling is an important component of this effort, as it allows a waterless design. Microsoft will also begin running warmer data centers, raising the set point for server halls to reduce – and possibly eliminate – the company’s reliance on water-intensive evaporative cooling.

Overclocking: More Power, But Also More Heat

In overclocking, CPUs are run at a higher clock rate than designed – in effect, forcing the computer to run faster than it’s supposed to go. Overclocking boosts performance, but also causes components to generate more heat.


That’s where immersion cooling comes in. In two-phase immersion, servers are submerged in a coolant fluid that boils off as the chips generate heat, removing the heat as it changes from liquid to vapor. The vapor then condenses into liquid for reuse, all without a pump. Immersion cooling can deliver exceptional power efficiency because it uses sealed tanks that don’t require the raised floors or room-level air cooling found in most commercial data centers.

We’ve been tracking Microsoft’s adoption of liquid cooling, and the company is signaling that two-phase immersion cooling will play a much larger role in future data center designs.

“Liquid cooling paves the way for more densely-packed servers in smaller spaces, meaning increased capacity per square foot in a datacenter – or the ability to create smaller datacenters in more strategic locations in the future,” Belady said. “This adds to the benefits of waterless cooling design.”

In March, Microsoft revealed that it was test-driving cooling technology used in bitcoin mining facilities in which servers are dunked in tanks of cooling fluid to manage rising heat densities. In April it placed a single rack of 48 servers using two-phase immersion cooling into production in its data center in Quincy, Washington.

More Immersion Cooling on the Horizon?

Microsoft isn’t yet building entire data centers of immersion tanks, but says it is ready to expand their use.

“Our plan is to scale this deployment to multiple tanks to understand how to scale liquid immersion cooling and maintain services reliability,” the company said. “Depending on the outcome, we are going to develop a more optimized version of this technology across other datacenters.”

Microsoft isn’t the first company to connect immersion and performance gains, as this has been a driver in the adoption of immersion in cryptocurrency mining. Riot Blockchain says its research found that using immersion with ASIC chips can boost its hash rate by 25 percent, with the potential for larger gains. This week Riot began construction on 200 megawatts of immersion cooling capacity at a new hashing center in Rockdale, Texas.

“We anticipate observing an increase in the company’s hash rate and productivity through 2022, without having to rely solely on purchasing additional ASICs,” said Jason Lee, CEO of Riot.

Microsoft’s Azure cloud and Office apps require much higher reliability than bitcoin mining, which can be interrupted without knocking global businesses offline. A key motivation for Microsoft is beefing up its cloud for growing adoption of artificial intelligence (AI) and other high-density applications, which pose challenges for data center design and management.

Powerful new hardware for AI workloads is packing more computing power into each piece of equipment, boosting the power density – the amount of electricity used by servers and storage in a rack or cabinet – and the accompanying heat.

AI hardware also can create high “flux,” in which power use in a rack increases rapidly as hardware commences a new workload, which can be difficult to manage with traditional air cooling. A number of service providers have focused on air-cooled solutions optimized for high-density workloads, but as densities rise past 25 to 30 kW a rack, users increasingly turn to liquid cooling to manage these workloads.
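To make the density threshold concrete, here is a rough back-of-the-envelope sketch in Python. It computes rack power density as server count times per-server draw and compares it with the lower end of the 25 to 30 kW range mentioned above. The per-server wattages are hypothetical placeholders chosen for illustration, not figures from Microsoft, and the function names are invented for this example; the 48-server rack size echoes the Quincy deployment.

```python
# Illustrative only: the per-server wattages below are assumptions, not reported figures.
AIR_COOLING_LIMIT_KW = 25.0  # lower bound of the 25 to 30 kW per-rack range cited above


def rack_density_kw(servers: int, watts_per_server: float) -> float:
    """Return the total power draw of one rack in kilowatts."""
    return servers * watts_per_server / 1000.0


def exceeds_air_cooling_range(density_kw: float,
                              limit_kw: float = AIR_COOLING_LIMIT_KW) -> bool:
    """True when the rack draws more than the assumed air-cooling limit."""
    return density_kw > limit_kw


if __name__ == "__main__":
    # A 48-server rack at two hypothetical per-server power levels.
    for watts in (400.0, 700.0):
        density = rack_density_kw(48, watts)
        label = "liquid cooling territory" if exceeds_air_cooling_range(density) else "air cooling range"
        print(f"{watts:.0f} W per server: {density:.1f} kW per rack ({label})")
```

Under these assumed numbers, the same 48-server rack moves from about 19 kW to about 34 kW as per-server draw rises, which is the kind of jump that pushes operators past what air cooling comfortably handles.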

Belady says liquid cooling represents “a major step function for managing density.”

“It paves the way for higher density and more power-efficient data centers,” said Belady. “We’re only at the beginning of that density curve. We’re really bullish on the technology.”

About the Author

Rich Miller

I write about the places where the Internet lives, telling the story of data centers and the people who build them. I founded Data Center Knowledge, the data center industry's leading news site. Now I'm exploring the future of cloud computing at Data Center Frontier.
