Immersion Cooling (Part 1): Redefining Reliability Standards
What you’ll learn:
- How immersion cooling reframes system design challenges.
- Challenges of retrofitting for immersion cooling.
- The limits of air-based standards.
Data center workloads continue to surge with the rise of AI and high-performance computing (HPC), and traditional air-cooling methods are reaching their practical limits as a result. As thermal loads escalate and density requirements grow, data center operators are seeking new ways to manage heat. Immersion cooling has emerged as a promising path forward.
However, this shift exposes significant gaps in how the industry defines and tests for component reliability. Standards developed for air-cooled environments were never intended to predict how materials behave when fully submerged in dielectric fluids. Given the new demands in architectural design and performance, key factors such as aging models, failure modes, and even basic assumptions about component durability require rethinking.
This evolution is reshaping how data center operators assess component reliability. Standards that originated to support air-cooled systems have served their purpose, but they must evolve to address the new challenges presented by immersion environments.
While air-cooling standards have long guided system planning, immersion cooling introduces a different set of aging mechanisms and material challenges. To keep pace, engineers and industry groups like the Open Compute Project (OCP) are working together to build testing frameworks based on real-world immersion conditions. This shift creates distinct design and reliability challenges for air- versus immersion-cooled systems (see figure).
How Immersion Cooling Reframes System Design Challenges
Immersion cooling removes airflow constraints but demands a fundamental rethinking of infrastructure, material selection, and system design. Traditional air-cooled systems, reliant on fans and heatsinks, face growing challenges managing component thermal design power (TDP) ratings that now routinely exceed 300 W and, in many next-generation GPUs and AI accelerators, cross the critical 400-W threshold. Above this point, airflow is often insufficient to maintain safe operating temperatures.
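To put the airflow problem in rough numbers, the sketch below applies the standard sensible-heat relation Q = m·cp·ΔT to estimate the airflow a given TDP demands. All inputs (the air properties and the 15-K allowable temperature rise) are illustrative assumptions, not figures from this article:

```python
# Illustrative sketch: airflow needed to remove a given heat load in air.
# Uses the sensible-heat relation Q = m_dot * cp * dT; all inputs are
# assumed example values, not measurements from any specific system.

AIR_DENSITY = 1.2    # kg/m^3, sea-level air (assumed)
AIR_CP = 1005.0      # J/(kg*K), specific heat of air (assumed constant)

def required_airflow_cfm(tdp_watts: float, delta_t_kelvin: float) -> float:
    """Volumetric airflow (CFM) needed to absorb tdp_watts with a
    temperature rise of delta_t_kelvin across the component."""
    mass_flow = tdp_watts / (AIR_CP * delta_t_kelvin)   # kg/s
    vol_flow_m3s = mass_flow / AIR_DENSITY              # m^3/s
    return vol_flow_m3s * 2118.88                       # m^3/s -> CFM

for tdp in (300, 400, 700):
    # Assume a 15-K allowable air temperature rise through the heatsink
    print(f"{tdp} W -> ~{required_airflow_cfm(tdp, 15):.0f} CFM")
```

Even under these generous assumptions, a single 700-W accelerator demands roughly 80 CFM of dedicated airflow, a requirement that quickly becomes impractical at rack scale.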
To bridge the gap, many data center operators initially turned to cold-plate cooling, which improves thermal transfer by circulating liquid directly to the hottest components. While this approach handles higher chip densities than air cooling can, cold-plate solutions introduce extensive manifolding, complex rack-level heat-exchanger integration, and added mechanical failure points, including the risk of leaks from tubing and connections.
As compute loads continue to climb, full immersion, whether single- or two-phase, is emerging as the next step to overcome the structural and thermal limits of both air and cold-plate systems. By fully submerging servers in dielectric fluids, immersion cooling sidesteps airflow limitations altogether.
Potential power savings, often cited as reaching up to 30% compared to conventional air-cooled deployments, depend on several factors: the specific immersion technology used, the power usage effectiveness (PUE) of the baseline air-cooled system, climate conditions, and the nature of the IT load. Under optimal conditions, the result is a meaningful boost in energy efficiency. Still, realizing these gains requires more than retrofitting existing hardware.
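A quick sketch shows where a figure like 30% can come from. The PUE values below are assumptions chosen for illustration (a mediocre air-cooled baseline against an optimistic immersion deployment), not measurements:

```python
# Hypothetical comparison of facility energy for the same IT load under
# air cooling vs. immersion, using assumed PUE values (not vendor data).

IT_LOAD_KW = 1000.0      # assumed IT load
PUE_AIR = 1.5            # assumed baseline air-cooled PUE
PUE_IMMERSION = 1.05     # assumed immersion PUE (optimistic case)

def facility_power_kw(it_load_kw: float, pue: float) -> float:
    """Total facility power implied by a PUE: PUE = total / IT."""
    return it_load_kw * pue

air = facility_power_kw(IT_LOAD_KW, PUE_AIR)
imm = facility_power_kw(IT_LOAD_KW, PUE_IMMERSION)
savings_pct = 100 * (air - imm) / air
print(f"Air: {air:.0f} kW, Immersion: {imm:.0f} kW, savings ~{savings_pct:.0f}%")
```

With these assumed values, the savings work out to exactly 30%. A more efficient air-cooled baseline, say PUE 1.2, would shrink the advantage to roughly (1200 − 1050)/1200 ≈ 12.5%, which is why the baseline matters as much as the immersion technology itself.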
Challenges of Retrofitting for Immersion Cooling
Brownfield retrofits often face serious hurdles. Many legacy data centers use raised floors that aren't engineered to support the weight and density of immersion tanks. Upgrading these sites often demands costly structural reinforcement, along with the addition of systems needed for immersion, such as heat exchangers, fluid lines, and maintenance pathways.
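To see why floor loading becomes a gating issue, here's a back-of-the-envelope estimate of what a single immersion tank can weigh. Every value below (fluid density, tank volume, hardware mass, footprint, and the floor rating quoted in the comment) is an assumption for illustration only:

```python
# Back-of-the-envelope floor-load estimate for an immersion tank.
# Every value below is an illustrative assumption, not a spec.

FLUID_DENSITY = 850.0   # kg/m^3, typical of hydrocarbon dielectric oils (assumed)
TANK_VOLUME = 1.5       # m^3 of fluid per tank (assumed)
IT_MASS = 800.0         # kg of servers and tank structure (assumed)
FOOTPRINT = 2.0         # m^2 of floor area under the tank (assumed)

total_mass = FLUID_DENSITY * TANK_VOLUME + IT_MASS   # kg
floor_load = total_mass * 9.81 / FOOTPRINT           # N/m^2 (Pa)

print(f"Total mass: {total_mass:.0f} kg")
print(f"Floor load: {floor_load/1000:.1f} kPa "
      f"(assumed legacy raised-floor ratings often fall below this)")
```

Under these assumptions, a single tank approaches 10 kPa of distributed load, before accounting for concentrated loads at the tank feet, which is where many legacy raised floors fall short.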
Given these structural and infrastructure challenges, most new immersion buildouts are being deployed in purpose-built "AI factory" environments, where floor support, cooling infrastructure, and spatial layouts are engineered specifically for immersion architectures.
In greenfield buildouts, immersion cooling drives higher rack densities and better thermal control, but only when infrastructure is purpose-built for submerged systems.
The Limits of Air-Based Standards
Immersion cooling offers clear thermal advantages, but it also exposes limitations in traditional reliability frameworks. Most existing standards were built to model material aging in air, where oxidation, rather than fluid-material chemistry, was the primary failure driver.
Inside dielectric fluids, oxidation slows dramatically.
In its place, thermal-chemical degradation emerges as a dominant risk, including potential hydrolysis, material swelling, and additives gradually leaching into the fluid. Over time, these chemical shifts can weaken mechanical properties and compromise long-term reliability. Testing methods like mixed flowing gas (MFG) aging, originally designed to simulate airborne corrosion through exposure to reactive gases such as sulfur dioxide and nitrogen dioxide, no longer align with the real failure mechanisms at play in fluid environments.
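Aging models make the mismatch concrete. Thermal-aging standards commonly rely on an Arrhenius acceleration factor, AF = exp[(Ea/k)(1/T_use − 1/T_stress)], where the activation energy Ea is calibrated to the dominant failure mechanism. The sketch below (all temperatures and Ea values are assumptions) shows how strongly a lifetime projection depends on that calibration; an Ea measured for oxidation in air says little about hydrolysis or leaching in a dielectric fluid:

```python
import math

# Arrhenius acceleration-factor sketch, as commonly used in thermal-aging
# models. The activation energy Ea is material- and mechanism-specific;
# values calibrated for oxidation in air may not describe hydrolysis
# or additive leaching in dielectric fluid. All numbers are assumptions.

BOLTZMANN_EV = 8.617e-5   # Boltzmann constant, eV/K

def acceleration_factor(ea_ev: float, t_use_c: float, t_stress_c: float) -> float:
    """AF = exp[(Ea/k) * (1/T_use - 1/T_stress)], temperatures in kelvin."""
    t_use = t_use_c + 273.15
    t_stress = t_stress_c + 273.15
    return math.exp((ea_ev / BOLTZMANN_EV) * (1 / t_use - 1 / t_stress))

# Same stress test (85 C) against the same use condition (45 C), two
# assumed activation energies: the predicted acceleration, and thus the
# lifetime claim derived from it, shifts dramatically.
for ea in (0.7, 1.0):
    print(f"Ea = {ea} eV -> AF = {acceleration_factor(ea, 45, 85):.1f}x")
```

Here the projected acceleration swings from roughly 17x to 59x on the choice of Ea alone, which is why aging models calibrated in air can't simply be carried over to submerged operation.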
In Part 2, we'll take a look at rethinking thermal and mechanical behavior in immersion environments.