When a single GPU’s power consumption climbs past 1,000 W and a single rack draws 200 kW, can traditional air cooling still keep up? ASHRAE states that when rack power reaches approximately 50-120 kW or higher, traditional air cooling struggles to meet thermal management demands. Consequently, liquid cooling has evolved from an optional feature in data centers to a necessity in the era of high-density computing.
This article explores why liquid cooling has become the mainstream cooling technology in the AI era and how water quality affects the performance and service life of liquid cooling systems. Additionally, this article highlights several water quality monitoring instruments and solutions provided by Renkeer to ensure the long-term, stable operation of liquid cooling systems.
Why Liquid Cooling Matters More Than Ever in the Age of AI
With the explosive growth of AI computing power, the power consumption of high-power chips has surpassed 1,200 W. According to Vertiv data, however, air cooling is practical only for chips with a TDP below 1,000 W. Once the power consumption of an AI chip crosses that threshold, the marginal efficiency of air cooling drops significantly and may even trigger thermal throttling, in which a chip actively reduces its clock speed to prevent overheating and computing power falls as a result.
Consequently, as AI GPU power consumption moves from the 300 W range toward 700 W, 1,000 W, and beyond, liquid cooling is essential to ensure that computing power is fully realized. When the power consumption per rack in AI servers surges beyond 50 kW, the rapid increase in computing density creates an urgent need for liquid cooling. Furthermore, adopting liquid cooling is also the most effective way to reduce an IDC’s PUE from above 1.5 to 1.2. For these reasons, liquid cooling has become an essential standard for the sustained development of AI.
Liquid cooling technology uses a liquid (such as water or another coolant) instead of air as the heat transfer medium, exchanging heat with heat-generating components to dissipate it. Compared to traditional air cooling, liquid cooling can more effectively lower equipment temperatures, ensuring both performance and longevity. Taking water as an example, its thermal conductivity is approximately 20 to 25 times that of air; for the same volume, water’s heat-carrying capacity is about 3,000 times that of air. Therefore, in high-power, high-thermal-density AI servers, GPU clusters, and data centers, liquid cooling can more effectively suppress chip temperature rises and reduce thermal throttling caused by overheating.
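The two ratios quoted above can be checked with a few lines of arithmetic. The sketch below uses approximate room-temperature textbook property values, which are assumptions introduced here rather than figures from this article; the exact ratios shift with temperature and pressure.

```python
# Approximate fluid properties near 20-25 °C (assumed textbook values).
WATER = {"k": 0.60, "rho": 998.0, "cp": 4182.0}   # W/(m·K), kg/m³, J/(kg·K)
AIR = {"k": 0.026, "rho": 1.2, "cp": 1005.0}


def volumetric_heat_capacity(fluid):
    """Heat stored per cubic metre per kelvin, in J/(m³·K)."""
    return fluid["rho"] * fluid["cp"]


# Ratio of thermal conductivities (how readily heat conducts through the fluid).
conductivity_ratio = WATER["k"] / AIR["k"]

# Ratio of heat carried by equal volumes for the same temperature rise.
capacity_ratio = volumetric_heat_capacity(WATER) / volumetric_heat_capacity(AIR)

print(f"thermal conductivity ratio: {conductivity_ratio:.0f}x")
print(f"volumetric heat capacity ratio: {capacity_ratio:.0f}x")
```

With these inputs the conductivity ratio falls inside the 20 to 25 range, and the volumetric heat capacity ratio comes out in the low thousands, the same order of magnitude as the figure quoted above.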
At NVIDIA’s 2025 GPU Technology Conference (GTC), Jensen Huang announced that racks set to launch in 2027 (such as the Rubin Ultra NVL576) could consume up to 600 kW of power, five times the power consumption of current GB200 NVL72 cabinets (approximately 120 kW). This will place unprecedented strain on cooling systems, which is why Rubin Ultra adopted a fully liquid-cooled solution from the very beginning of its product architecture design. This signals that future high-performance AI data centers will inevitably transition fully to liquid cooling.
Liquid Cooling Systems: Types, Architecture, and Key Factors
Two main liquid cooling systems in AI data centers
| Types | Cold plate cooling | Immersion cooling |
|---|---|---|
| Principles | The cold plate is positioned close to the heat source (CPU/GPU), using the coolant within the cold plate to dissipate heat. | The server is fully submerged in coolant, which absorbs heat directly and carries it away by circulation (single-phase) or evaporation (two-phase). |
| Characteristics | Minimal hardware modifications and simple maintenance; high reliability requirements. | High cooling capacity, high power density, and quiet operation. |
| Ecology | Inconsistencies in IT infrastructure, refrigerants, piping, and power supply systems; servers are often deeply integrated with server racks. | Customizable; compatibility with optical modules to be verified. |
1. Direct-to-Chip (D2C) cold plate cooling
Direct-to-Chip (D2C) cold plate cooling is the mainstream technology in the current AI era. It involves attaching metal plates (also known as cold plates) with internal water channels to CPU and GPU chips; the coolant flows through the cold plates to carry away heat from the chips. With this method, servers remain housed in their original vertical racks, requiring very few structural modifications. Its advantages include low retrofitting costs and compatibility with existing server architectures. Currently, cold plate cooling technology holds a dominant position in the thermal management market, with a 70% market share.
2. Immersion cooling
Immersion cooling involves completely submerging the entire server motherboard in a special non-conductive liquid (typically a fluorinated fluid). Heat generated by the chips is transferred directly to the liquid and then carried away by the circulating coolant. Its advantages are extremely high cooling efficiency, with PUE reducible to 1.12 (according to a report by the Uptime Institute), and virtually silent operation. However, it is more expensive, requires high-quality fluids and strict sealing standards, and fluorinated fluids face concerns over toxicity and environmental pollution. For now, therefore, it is regarded as a technology for the future rather than the mainstream choice.
Liquid cooling system architecture
Whether using cold plate or immersion cooling, once heat is removed by the liquid, the subsequent processing path is the same. A liquid cooling system consists of two physically isolated circulation loops connected in series: the secondary loop (IT Loop), comprising the CDU, liquid-cooled cabinets (cold plates, UQDs, and manifolds), and piping; and the primary loop (Facility Loop), comprising the cooling source (cooling tower or dry cooler) and piping.
In a liquid-cooling architecture, the CDU serves as the flow hub and heat exchange center, connecting the two independent circulation loops. Heat is transferred between the two sides of the coolant via the metal walls of the heat exchanger inside the CDU. After absorbing heat, the coolant in the primary loop returns to the cooling tower to be recooled; after releasing heat, the coolant in the secondary loop cools down and is pumped back to the cabinet by the CDU circulation pump. Ultimately, the heat generated by the CPUs and GPUs is dissipated into the external environment via the cooling tower.
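The energy balance across the CDU can be sketched with the standard relation Q = ṁ · cp · ΔT, applied once to each loop. The snippet below is a rough illustration only: it assumes plain-water properties on both sides, hypothetical flow rates of 150 L/min and 200 L/min, and no pump heat or piping losses. The 120 kW load is the approximate GB200 NVL72 figure mentioned earlier.

```python
WATER_RHO = 998.0   # kg/m³, assumed water density
WATER_CP = 4182.0   # J/(kg·K), assumed water specific heat


def mass_flow_kg_s(flow_lpm, rho=WATER_RHO):
    """Convert a volumetric flow in litres per minute to kg/s."""
    return flow_lpm / 60_000.0 * rho


def delta_t_k(heat_w, flow_lpm, rho=WATER_RHO, cp=WATER_CP):
    """Temperature change of a loop that absorbs or rejects heat_w watts."""
    return heat_w / (mass_flow_kg_s(flow_lpm, rho) * cp)


heat_w = 120_000.0  # rack heat load in watts

# Secondary (IT) loop: coolant warms by this much across the cold plates,
# then gives the same heat back up inside the CDU heat exchanger.
secondary_rise = delta_t_k(heat_w, flow_lpm=150.0)   # hypothetical flow rate

# Primary (facility) loop: absorbs the same heat on the other side of the
# exchanger and carries it to the cooling tower.
primary_rise = delta_t_k(heat_w, flow_lpm=200.0)     # hypothetical flow rate

print(f"secondary loop temperature rise: {secondary_rise:.1f} K")
print(f"primary loop temperature rise:   {primary_rise:.1f} K")
```

Because both loops move the same heat, the loop with the higher flow rate shows the smaller temperature change. A glycol mixture would have a lower specific heat than water and would need proportionally more flow for the same ΔT.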
Introduction to liquid cooling technology, video from @Equinix
Water quality is a key factor affecting liquid cooling systems
The proper operation of a liquid-cooled system depends heavily on the quality of the coolant. As clearly stated in the guidelines of the American Society of Heating, Refrigerating and Air-Conditioning Engineers (ASHRAE), in liquid-cooling architectures, fluid quality is just as important as the mechanical design itself. Even slight deviations in pH, conductivity, or turbidity can lead to corrosion, scaling, and microbial contamination. This is because the internal channels of cold plates are extremely narrow, with many measuring only 0.2 mm or even as small as 100 μm. Any rust, scale, microorganisms, copper ions, or corrosion byproducts can cause blockages. Once blocked, the GPU temperature rises immediately, which may lead to GPU throttling, slow down training speeds, or even cause GPU damage resulting in a complete server shutdown.
Key Water Quality Parameters for AI Liquid Cooling Systems
In the past, a data center might have consumed only a few MW, but today an AI data center can consume 100 MW or more, and the scale of its cooling systems has grown accordingly. To maintain efficient heat exchange, the importance of facility-side circulating water management, including cooling towers, chilled water systems, and CDUs, has increased, resulting in greater demand for online monitoring and automated dosing control of water quality parameters such as conductivity and pH.
Leading standards bodies, primarily ASHRAE TC 9.9 and the Open Compute Project (OCP), have established mandatory water quality parameters for data center liquid cooling systems, with stricter requirements applying to the Technology Cooling System (TCS) loop that circulates coolant directly through cold plates and server hardware. Six parameters are universally recognized as critical, each requiring dedicated sensor technology:
| Parameter | Industry Reference Value | Source | Primary Risk | Monitoring Frequency | Dedicated sensor |
|---|---|---|---|---|---|
| pH | 6.8-8.5 (water-based coolant) | ASHRAE TC 9.9/OCP | Acid/alkaline corrosion | Continuous inline | RS-PH-N01-3-201T-EX |
| Conductivity | 0.20-20 µS/cm (deionized water systems) | ASHRAE TC 9.9 | Elevated ionic concentration; galvanic corrosion risk; electrical short on leakage | Continuous inline | RS-EC-N01-3-01-EX |
| Turbidity | <20 NTU | ASHRAE TC 9.9 | Suspended solids; microchannel blockage risk | Continuous inline | RS-ZD-N01-1S-50-EX |
| Total Hardness | <20 mg/L (as CaCO₃) | ASHRAE TC 9.9 | Scale formation on heat transfer surfaces | Weekly manual + trend | RS-LCa-N01-3-100-EX, RS-LMg-N01-3-100-EX |
| Bacteria | <100 CFU/mL | ASHRAE TC 9.9 | Biofilm formation; microbiologically influenced corrosion (MIC) | Monthly laboratory analysis | — |
| ORP | 200-400 mV | Cooling tower/ASHRAE 188 | Oxidizing biocide effectiveness in open-loop systems | Continuous inline (cooling tower only) | RS-ORP-N01-3-300T-EX |
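As a minimal sketch of how the reference values in the table could drive an automated check, the snippet below compares a set of readings against those ranges. The dictionary keys and the `out_of_range` helper are hypothetical names introduced for illustration; they are not part of any Renkeer or CDU vendor API, and real alarm limits should come from the coolant supplier and the applicable guideline.

```python
# Reference ranges copied from the table above as (low, high).
# Upper bounds written as "<" in the table are treated as inclusive here
# purely to keep the sketch short.
LIMITS = {
    "ph": (6.8, 8.5),
    "conductivity_us_cm": (0.20, 20.0),   # deionized water systems
    "turbidity_ntu": (0.0, 20.0),
    "hardness_mg_l": (0.0, 20.0),
    "bacteria_cfu_ml": (0.0, 100.0),
    "orp_mv": (200.0, 400.0),             # cooling tower loop only
}


def out_of_range(readings):
    """Return the sorted names of readings that fall outside their range.

    Only parameters present in `readings` are checked; an unknown
    parameter name raises KeyError.
    """
    flagged = []
    for name, value in readings.items():
        low, high = LIMITS[name]
        if not (low <= value <= high):
            flagged.append(name)
    return sorted(flagged)


sample = {"ph": 7.4, "turbidity_ntu": 35.0, "orp_mv": 150.0}
print(out_of_range(sample))
```

Here the pH reading passes while turbidity and ORP are flagged, so the function returns both names.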
The following key water quality parameters determine the behavior of the coolant in the circuit:
1. pH
In D2C cold plate cooling systems, untreated ethylene glycol solutions degrade over time, producing organic acids that continuously lower the pH and accelerate corrosion. The risk is particularly severe for aluminum components, as the protective oxide film on passivated aluminum is stable only within a pH range of 4.0-8.5. Once the pH falls below 4.0, the passivation layer dissolves, exposing the bare metal directly to the coolant.
Key Point: According to ASHRAE TC 9.9 and OCP guidelines, the target pH for water-based TCS circuits should be maintained between 6.8 and 8.5. If a deviation in the coolant’s pH is detected, corrective action should be taken as soon as possible.
2. Conductivity
Conductivity indicates both the purity of the coolant in a liquid cooling system and the electrical risk it poses if it leaks. Increased conductivity typically indicates the presence of more metal ions, chloride ions, or other impurities in the water, which can accelerate the electrochemical corrosion of materials such as copper, aluminum, and stainless steel, thereby shortening the service life of cold plates and piping.
Conductivity requirements vary significantly among different coolants: deionized water systems require strict control below 100 µS/cm, while propylene glycol systems typically have an acceptable upper limit of 2000-2500 µS/cm. Therefore, conductivity readings must be assessed in conjunction with the specific type of coolant.
Key Point: Conductivity sensors are low-cost and fast-responding; they are widely integrated into CDU systems and represent one of the most cost-effective and efficient methods for real-time monitoring of coolant contamination.
3. Turbidity
For D2C cold plate cooling systems, an increase in turbidity is often an early indication of corrosion, scaling, or biological contamination occurring within the system. The microchannels inside the cold plate are extremely narrow, and any accumulation of particles can cause localized blockages, leading to uneven heat dissipation from the chip and a sudden rise in temperature, which in turn can trigger GPU throttling or even hardware damage.
Key Point: According to ASHRAE TC 9.9, turbidity in TCS circuits should be maintained below 20 NTU. A sustained increase in turbidity typically indicates deeper water quality issues and requires a comprehensive assessment in conjunction with pH and conductivity data.
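A sustained rise can be spotted before the 20 NTU limit is reached by fitting a straight line to recent samples. The sketch below uses an ordinary least-squares slope; the function names and the 0.05 NTU-per-sample threshold are illustrative assumptions, not values from ASHRAE or any sensor vendor.

```python
def trend_slope(samples):
    """Least-squares slope of equally spaced samples (units per sample)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    x_mean = (n - 1) / 2.0
    y_mean = sum(samples) / n
    s_xy = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(samples))
    s_xx = sum((i - x_mean) ** 2 for i in range(n))
    return s_xy / s_xx


def needs_review(samples, min_slope=0.05):
    """Flag a window whose turbidity keeps climbing, even below the limit."""
    return trend_slope(samples) >= min_slope


rising = [2.0, 2.4, 2.9, 3.5, 4.2, 5.0]   # NTU, all well under 20
flat = [3.0, 3.0, 3.0, 3.0]

print(needs_review(rising), needs_review(flat))
```

Every value in `rising` is far below the limit, yet the fitted slope is positive and steady, which is the kind of pattern that would justify checking pH and conductivity as well.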
4. Calcium and magnesium ions
In the cold plate-based liquid cooling IT-side loops most commonly used in AI data centers, deionized water or specialized coolants are typically employed, which contain virtually no calcium or magnesium ions and therefore have a hardness level close to zero. However, in facility-side loops (such as cooling towers), water evaporation leads to the accumulation of calcium and magnesium ions and subsequent scaling, which impairs heat transfer efficiency. Therefore, monitoring the concentration of calcium and magnesium ions in facility-side loops is critical.
Key Point: A layer of calcium carbonate scale just 0.25 mm (0.01 inch) thick can reduce heat transfer efficiency by approximately 10%; consequently, scale prevention is a top priority in industrial circulating water systems.
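The effect of a thin scale layer can be approximated by adding its conduction resistance (thickness divided by thermal conductivity) in series with the clean surface. The sketch below is illustrative only: the clean overall heat transfer coefficient of 1,000 W/(m²·K) and the scale conductivity of 2.2 W/(m·K) are assumed values, and real deposits and exchangers vary widely.

```python
def fouled_u(u_clean, scale_thickness_m, k_scale):
    """Overall heat transfer coefficient after adding a scale layer."""
    r_scale = scale_thickness_m / k_scale          # m²·K/W
    return 1.0 / (1.0 / u_clean + r_scale)


def heat_transfer_loss(u_clean, scale_thickness_m, k_scale):
    """Fractional drop in heat transfer coefficient caused by the scale."""
    return 1.0 - fouled_u(u_clean, scale_thickness_m, k_scale) / u_clean


loss = heat_transfer_loss(
    u_clean=1000.0,            # W/(m²·K), assumed clean exchanger
    scale_thickness_m=0.25e-3, # 0.25 mm layer from the key point above
    k_scale=2.2,               # W/(m·K), assumed calcium carbonate scale
)
print(f"heat transfer reduction: {loss:.1%}")
```

With these assumed inputs the reduction comes out close to 10%. A more efficient clean exchanger or a less conductive deposit would make the same layer cost proportionally more.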
5. ORP
ORP reflects the redox balance of the cooling water. In open-loop cooling towers, ORP can be used to assess the effectiveness of oxidizing biocides (such as sodium hypochlorite). An ORP maintained at 200-400 mV typically indicates that the biocide is effectively inhibiting the growth of microorganisms such as Legionella. In closed-loop TCS systems, ORP primarily reflects the consumption rate of corrosion inhibitors.
Key Point: ORP cannot replace microbial testing, but it serves as an early warning indicator of the likelihood of water quality deterioration.
Renkeer's Liquid Cooling Water Quality Monitoring Sensors
Liquid cooling systems in AI data centers operate continuously throughout the year. Traditional manual sampling and testing introduce significant lag and cannot meet the stability requirements of cooling systems in high-density AI data centers. Therefore, integrating online sensors directly into the cooling loops to enable continuous, real-time monitoring of key water quality parameters has become the standard practice for modern liquid cooling operations and maintenance.
To address core water quality risks in liquid cooling systems, Renkeer offers online sensors that cover four key parameters: pH, conductivity, turbidity, and ORP. These water quality sensors can be deployed in the secondary circuit of the TCS, at the inlet and outlet of the CDU, and in the primary circuit of the cooling tower, thereby establishing a comprehensive water quality early-warning system.
1. pH sensors for liquid cooling systems
Renkeer’s integrated pH sensors (Model RS-PH-N01-3-*-EX Series) combine a composite electrode, a reference electrode, and a temperature sensor into a single unit. They can be installed directly into the cooling circuit piping to enable continuous online monitoring of pH values. With a measurement range of 0-14 pH, a resolution of 0.01 pH, and an accuracy of ±0.15 pH, these sensors fully meet the requirements for precise control within the target range of 6.8-8.5 in liquid cooling systems. The built-in automatic temperature compensation function ensures more reliable measurement data.
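One common way to account for sensor accuracy is to pull the alarm thresholds inward by the stated tolerance, so that a reading near the edge of the target range raises a warning before the true value can be outside it. The sketch below applies that guard-banding idea to the ±0.15 pH accuracy and the 6.8-8.5 target range; the function names and the three status labels are assumptions made for illustration.

```python
def guard_banded_limits(low, high, accuracy):
    """Shrink a target range by the sensor accuracy on both sides."""
    return low + accuracy, high - accuracy


def classify_ph(reading, low=6.8, high=8.5, accuracy=0.15):
    """Classify a pH reading as 'ok', 'warning', or 'out_of_range'."""
    inner_low, inner_high = guard_banded_limits(low, high, accuracy)
    if reading < low or reading > high:
        return "out_of_range"
    if reading < inner_low or reading > inner_high:
        # Inside the target range, but the true value could be outside it.
        return "warning"
    return "ok"


for value in (7.5, 8.4, 8.6):
    print(value, classify_ph(value))
```

With these settings the warning band covers roughly 6.8-6.95 and 8.35-8.5, leaving about 1.4 pH units in the middle that are unambiguously within range.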
2. Conductivity sensors for liquid cooling systems
Renkeer’s conductivity sensors (Model RS-EC-N01-3-EX Series) feature a platinum black electrode coating that does not react with common liquid cooling media, ensuring high accuracy and stable measurements. The K=1 electrode constant type covers 1-2000 µS/cm and is suitable for deionized water and TCS circuits with low ionic concentrations; the K=10 type covers 10-20,000 µS/cm and is suitable for high-concentration coolant systems such as ethylene glycol blends.
3. Turbidity sensors for liquid cooling systems
Renkeer turbidity sensors (Model RS-ZD-N01-*-EX Series) feature built-in temperature compensation and optical filtering algorithms, with an accuracy of ±5% FS. Since the secondary-side circuit of TCS systems has high coolant purity requirements, a 0-50 NTU range is recommended. For the primary-side cooling tower circuit, ranges of 0-200 NTU, 0-1,000 NTU, or 0-4,000 NTU may be selected. The stainless steel housing provides excellent corrosion resistance against water-glycol coolants containing inhibitors.
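Because the accuracy is specified as a percentage of full scale (FS), the absolute uncertainty grows with the selected range, which is one way to see why the narrowest range is preferred for the TCS loop. The sketch below works through that arithmetic for the four ranges listed above; the helper names are hypothetical.

```python
RANGES_NTU = [50, 200, 1000, 4000]   # selectable ranges listed above


def full_scale_error(range_ntu, accuracy_pct=5.0):
    """Absolute uncertainty in NTU for an accuracy given as % of full scale."""
    return range_ntu * accuracy_pct / 100.0


def smallest_range_for(limit_ntu, ranges=RANGES_NTU):
    """Pick the narrowest range that still covers the monitored limit."""
    for candidate in sorted(ranges):
        if candidate >= limit_ntu:
            return candidate
    raise ValueError("no available range covers the requested limit")


for r in RANGES_NTU:
    print(f"0-{r} NTU range: ±{full_scale_error(r)} NTU")

print("range for a 20 NTU limit:", smallest_range_for(20))
```

On the 0-50 NTU range the uncertainty is ±2.5 NTU, small relative to a 20 NTU limit, whereas on the 0-4,000 NTU range it is ±200 NTU, far larger than the limit itself.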
4. ORP sensors for liquid cooling systems
Renkeer ORP sensors (Model RS-ORP-N01-3-EX Series) feature a high-purity platinum composite electrode that is resistant to acid, alkali, and oxidative corrosion. As an inert precious metal, platinum does not react chemically with common cooling media used in liquid cooling systems, such as deionized water and propylene glycol mixtures, making it naturally suited for long-term online monitoring in liquid cooling circuits.
In addition to the four core sensors mentioned above, Renkeer also offers dissolved oxygen sensors (to monitor oxygen concentration in closed loops; higher dissolved oxygen levels increase the risk of corrosion), residual chlorine sensors (used to verify the disinfection effectiveness after chlorine addition in cooling towers), and total suspended solids (TSS) sensors (which complement turbidity sensors by directly quantifying the mass concentration of particulate matter). These can be flexibly selected based on the specific configuration and monitoring requirements of the liquid cooling system to build a more comprehensive water quality monitoring system.
How Water Quality Improves Cooling Efficiency and Reduces Operating Costs
Optimize heat transfer and reduce energy consumption
Clean, contamination-free coolant is the most effective way to remove and dissipate heat from servers. Cooling systems account for approximately 30% to 40% of a data center’s total electricity consumption, making them the single largest controllable source of energy consumption. A decline in heat transfer efficiency caused by deteriorating water quality directly drives up PUE and electricity costs.
Reduce scaling and corrosion
Through water quality monitoring combined with softening, filtration, and chemical treatment, scaling and corrosion can be effectively suppressed, preventing premature damage to cooling coils, heat exchangers, and piping. This maximizes cost savings on maintenance and extends the service life of equipment.
Extend system lifespan
Maintaining high water quality ensures that cooling systems operate smoothly for longer periods, reducing mechanical failures and unplanned downtime. Cooling failures account for 19% of major data center outages, and cooling failure is typically a gradual process of thermal runaway. Continuous water quality monitoring can detect problems early and prevent system downtime.
FAQs
1. As AI adoption accelerates, how urgent is the need for liquid-cooled servers?
According to News Express, the share of liquid cooling technology in overall data center cooling solutions is expected to grow from approximately 22% in 2024 to over 45% by 2030. The market also reflects this urgency: the global data center liquid cooling market was valued at $6.6 billion in 2025 and is projected to reach $29.5 billion by 2033, representing a compound annual growth rate (CAGR) of 20.1%.
2. What are the forms of liquid cooling being deployed?
- Direct-to-Chip Liquid Cooling (Cold Plate Cooling)
- Immersion Cooling
- Rear Door Heat Exchanger (RDHx)
3. What is a CDU, and what is its function?
CDU stands for Coolant Distribution Unit, which serves as the central control hub of a liquid cooling system. Essentially, it is a closed-loop pump station equipped with precision heat exchangers and precise pressure control. It connects to cooling towers and chillers upstream to draw in cooling water, and to server cold plates or immersion tanks downstream to supply temperature-controlled coolant to computing equipment. Its function is to continuously transfer heat 24 hours a day.
Is it possible to have 100% free cooling with liquid-cooled servers?
Not entirely. Free cooling cannot reach 100%, but under certain conditions it can reach 90% or more.
4. Will liquid cooling become mandatory in future data centers?
Yes, and it has already happened. The NVIDIA Rubin platform is the world’s first AI infrastructure to achieve 100% liquid cooling; every chip and every network component is cooled by liquid, and the system contains no fans. Because the Rubin platform incorporates a 100% liquid-cooled infrastructure, every cloud service provider and operator building data centers for it must make this transition.
5. What are the leak risks, and how are they managed?
High-Risk Leakage Points:
Locations most prone to leaks include beneath quick-connect manifolds, at the drain pan of the rear-door heat exchanger (RDHx), and beneath the cold plate return lines.
Common risk areas in immersion cooling tanks include welds, sight glass seals, pump housings, and fluid inlets and outlets. Leaks in rear-door heat exchangers most frequently occur in brazed coil sections, pipe fittings, and integrated control valves.
Real-Life Incident:
In November 2025, a liquid cooling pipe burst in a GPU cluster in Southeast Asia, affecting more than a dozen GPU server racks in a high-density data center. Employees were photographed using mops to clean up the standing water. Given that each GPU rack costs several million dollars, even damage to a single rack would result in a seven-figure loss.
6. What coolant should I use — water, glycol, or dielectric fluid?
- Direct liquid cooling using cold plates: Propylene glycol-water solution (PG25 industry standard); deionized water may be used for indoor closed-loop systems.
- Immersion cooling: Dielectric fluid (synthetic hydrocarbons or ester-based fluids for single-phase systems; fluorinated fluids for two-phase systems).
- Facility main pipelines/outdoor pipelines: Ethylene glycol-water solution (with priority given to freeze protection).
7. How do I maintain coolant quality over time?
Both ASHRAE TC 9.9 and the OCP explicitly require continuous monitoring of coolant water quality, with key parameters including pH (target range for propylene glycol circuits: 8.0-10.5) and corrosion inhibitor concentration. Once the corrosion inhibitor is depleted, the corrosion rate of aluminum and copper components increases sharply.
8. What rack density triggers the need for liquid cooling?
Air cooling begins to reach its physical limits when rack power consumption exceeds approximately 30-40 kW. The required airflow volume and velocity exceed the practical capacity of standard data center airflow systems; at this density and above, liquid cooling shifts from being an efficiency option to a structural necessity.
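The airflow problem can be made concrete by comparing the volume of air and of water needed to carry away the same heat. The sketch below is a rough estimate for a 40 kW rack; the fluid properties and the temperature rises of 15 K for air and 10 K for water are assumptions chosen for illustration, not design values.

```python
def volumetric_flow_m3_s(heat_w, delta_t_k, rho, cp):
    """Volume flow needed to remove heat_w with a given temperature rise."""
    return heat_w / (rho * cp * delta_t_k)


RACK_W = 40_000.0  # rack heat load in watts

# Air: density 1.2 kg/m³, specific heat 1005 J/(kg·K), assumed 15 K rise.
air_m3_s = volumetric_flow_m3_s(RACK_W, 15.0, rho=1.2, cp=1005.0)

# Water: density 998 kg/m³, specific heat 4182 J/(kg·K), assumed 10 K rise.
water_m3_s = volumetric_flow_m3_s(RACK_W, 10.0, rho=998.0, cp=4182.0)

air_cfm = air_m3_s * 2118.88        # cubic feet per minute
water_lpm = water_m3_s * 60_000.0   # litres per minute

print(f"air:   {air_m3_s:.2f} m³/s (about {air_cfm:.0f} CFM)")
print(f"water: {water_lpm:.1f} L/min")
```

Under these assumptions a single rack needs a little over 2 m³ of air every second, compared with under 60 litres of water per minute, a difference of more than three orders of magnitude in volume flow.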
9. How does liquid cooling affect PUE?
Liquid cooling significantly lowers the PUE (total energy consumption divided by IT equipment energy consumption; the closer to 1, the more energy-efficient) by reducing the energy consumption of the cooling system, lowering the power consumption of the IT equipment itself, and improving the utilization rate of waste heat. While traditional air cooling typically has a PUE of 1.4-1.5, liquid cooling can reduce it to below 1.1, and high-end solutions can even approach the theoretical limit of 1.01.
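The PUE definition above reduces to a single division. The sketch below applies it to two hypothetical facilities with the same 1,000 kW IT load; the cooling and overhead figures are invented purely to illustrate the calculation and are not measurements.

```python
def pue(it_kw, cooling_kw, other_kw):
    """Power Usage Effectiveness: total facility power over IT power."""
    total_kw = it_kw + cooling_kw + other_kw
    return total_kw / it_kw


# Hypothetical air-cooled facility.
air_cooled = pue(it_kw=1000.0, cooling_kw=400.0, other_kw=100.0)

# Hypothetical liquid-cooled facility with the same IT load.
liquid_cooled = pue(it_kw=1000.0, cooling_kw=60.0, other_kw=40.0)

print(f"air-cooled PUE:    {air_cooled:.2f}")
print(f"liquid-cooled PUE: {liquid_cooled:.2f}")
```

In this example the cooling power falls from 400 kW to 60 kW, which moves the PUE from 1.5 to 1.1 without any change in the IT load.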
10. What are the main problems in the current development of AI?
In the United States, data center electricity consumption accounted for approximately 4%-4.4% of the nation’s total electricity consumption around 2023 and has become one of the fastest-growing electricity loads. According to scenario projections by the Lawrence Berkeley National Laboratory (LBNL), driven by the continued expansion of AI computing power, U.S. data center electricity consumption could reach approximately 426 TWh by 2030—an increase of about 133% compared to 2024. These figures indicate that artificial intelligence is no longer merely a software issue, but rather a challenge for power infrastructure and thermal management.
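The 2024 baseline implied by the projection can be recovered from the two figures quoted above. This is simple arithmetic on the article’s numbers, not an independent estimate.

```python
projected_2030_twh = 426.0   # LBNL scenario figure quoted above
increase_fraction = 1.33     # "about 133%" increase relative to 2024

# A 133% increase means the 2030 value is 2.33 times the 2024 value.
implied_2024_twh = projected_2030_twh / (1.0 + increase_fraction)

print(f"implied 2024 consumption: {implied_2024_twh:.0f} TWh")
```

The result is roughly 183 TWh for 2024, so the projection corresponds to more than doubling consumption in six years.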

This article was authored by the Renkeer Technical Team, comprising engineers and product specialists with deep expertise in water quality sensing and industrial monitoring instrumentation. Renkeer designs and manufactures precision sensors for water treatment, environmental monitoring, and industrial process control. Our sensor solutions are engineered to meet international standards including ASHRAE and ISO, providing the accuracy and reliability that mission-critical liquid cooling infrastructure demands.









