Engineering · February 28, 2026

AI Data Center Cooling Requirements in 2026: What Actually Works

Every GPU generation doubles the thermal problem. Here's the real engineering breakdown — not the vendor pitch.

The AI infrastructure buildout is the largest data center expansion in history. Hyperscalers are deploying hundreds of megawatts of GPU compute. Enterprises are standing up inference clusters. And every single one of them is hitting the same wall: cooling.

The GPU roadmap from NVIDIA alone tells the thermal story:

GPU Thermal Requirements (Per Accelerator)

  • NVIDIA A100 (2020): 400W TDP — air cooled, manageable at ~15kW/rack
  • NVIDIA H100 SXM (2023): 700W TDP — pushing air cooling limits at ~40kW/rack
  • NVIDIA H200 (2024): 700W TDP — same thermal, more memory bandwidth
  • NVIDIA B200 (2025): 1,000W TDP — liquid cooling strongly recommended
  • NVIDIA GB200 NVL72 (2025): ~1,200W per GPU — liquid cooling required, ~120kW per rack
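
To see how per-GPU TDP turns into those rack figures, here's a rough sketch. The host overhead and servers-per-rack values are illustrative assumptions, not vendor specs (a DGX H100, for reference, is rated around 10.2kW total).

```python
# Rough rack-density estimate from per-GPU TDP. The host overhead (CPUs,
# memory, NICs, fans) and servers-per-rack values are illustrative
# assumptions; a DGX H100, for reference, is rated around 10.2 kW total.

def rack_kw(gpu_tdp_w: float, gpus_per_server: int = 8,
            host_overhead_kw: float = 4.5, servers_per_rack: int = 4) -> float:
    """Estimate rack power (kW) from per-GPU TDP and an assumed rack layout."""
    server_kw = gpus_per_server * gpu_tdp_w / 1000 + host_overhead_kw
    return servers_per_rack * server_kw

print(f"H100 (700 W): ~{rack_kw(700):.0f} kW/rack")    # ~40 kW
print(f"B200 (1000 W): ~{rack_kw(1000):.0f} kW/rack")  # ~50 kW
```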

The industry narrative is simple: "Everything needs liquid cooling now." But that narrative serves liquid cooling vendors, not data center operators. The reality is more nuanced, and the right answer depends on which AI workload you're running and what infrastructure you already have.

The Three Tiers of AI Cooling

Tier 1: Training Clusters (40–120kW/rack)

Large-scale training — think GPT-class models, protein folding, climate simulations — runs on the densest hardware available: NVIDIA DGX systems, HGX platforms, and the new GB200 NVL72 racks.

At 40kW+ per rack, you're beyond what passive raised-floor cooling can deliver with traditional tile-and-containment approaches. But that doesn't automatically mean liquid:

  • 40–50kW/rack (H100 clusters): Air cooling is viable with active airflow delivery — not passive floor pressure, but directed air capture at the rack inlet. This is the sweet spot where optimized air cooling avoids the $3–5M capital cost of liquid retrofit.
  • 50–80kW/rack (dense H100/B200): Hybrid approaches — air cooling for CPU/memory/networking components, rear-door heat exchangers for GPU exhaust. Practical for existing facilities.
  • 80–120kW/rack (GB200 NVL72): Direct liquid cooling to the chip. No air cooling solution handles this density. Budget $15–25M for a 10MW liquid-cooled deployment.
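
For the hybrid tiers, the practical planning question is how much of each rack's heat the liquid path removes and how much remains for the room's air systems. Here's a minimal sketch; the 75% capture fraction is an assumption typical of direct-to-chip designs, and should be replaced with the figure for your specific rear-door or cold-plate solution.

```python
# Heat split for a hybrid-cooled rack: how much goes to the liquid path
# versus what remains for room air handling. The 75% liquid-capture
# fraction is an assumption, not a measured figure for any product.

def hybrid_split(rack_kw: float, liquid_capture: float = 0.75) -> tuple[float, float]:
    """Return (kW removed by liquid, kW left for air) for one rack."""
    liquid_kw = rack_kw * liquid_capture
    return liquid_kw, rack_kw - liquid_kw

liquid, air = hybrid_split(65)  # a 65 kW rack in the 50-80 kW tier
print(f"To liquid: ~{liquid:.0f} kW, left for air: ~{air:.0f} kW")
# ~16 kW of residual air load per rack, back in ordinary containment territory.
```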

Tier 2: Inference and Fine-Tuning (15–40kW/rack)

This is where most enterprises actually live: running inference on H100 or L40S GPUs, or fine-tuning smaller models. Rack densities of 15–40kW — high by traditional standards, but well within the capability of optimized air cooling.

The mistake operators make: assuming that because their training cluster needs liquid cooling, their inference deployment does too. It doesn't. Inference workloads typically draw 40–60% of training-level power because the GPUs aren't pinned at sustained maximum TDP.
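
As a rough illustration of that derating, using the same illustrative server and rack assumptions as the earlier sketch (the 40–60% utilization band is from the text; real draw depends on batch size, model, and traffic patterns):

```python
# Illustrative inference derating, reusing the earlier server/rack
# assumptions. The 40-60% utilization band comes from the text; real
# draw depends on batch size, model, and how traffic is scheduled.

def inference_rack_kw(gpu_tdp_w: float, util: float, gpus_per_server: int = 8,
                      host_overhead_kw: float = 4.5,
                      servers_per_rack: int = 4) -> float:
    server_kw = gpus_per_server * gpu_tdp_w * util / 1000 + host_overhead_kw
    return servers_per_rack * server_kw

for util in (0.4, 0.6):
    print(f"H100 inference at {util:.0%}: ~{inference_rack_kw(700, util):.0f} kW/rack")
# Roughly 27-31 kW/rack, versus ~40 kW for the same hardware under
# sustained training load: squarely in air-cooling territory.
```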

The Expensive Mistake:

We've seen operators spend $2M on liquid cooling infrastructure for inference clusters that could have been air cooled for $50K in airflow optimization. That's a 40x cost difference for the same thermal outcome. Always right-size the cooling to the actual workload, not the GPU's maximum spec.

Tier 3: Edge AI and Mixed Workloads (8–15kW/rack)

On-premises inference, AI-augmented applications, mixed CPU/GPU workloads. Densities of 8–15kW per rack — challenging for legacy cooling but absolutely solvable with proper airflow management.

Standard hot aisle / cold aisle containment handles this tier well. Add rack-level airflow capture for hot spots, and you're covered without any liquid infrastructure.

Air Cooling vs. Liquid Cooling: The Real Decision Framework

Forget the marketing. Here's the engineering decision tree:

Decision Framework

  • Under 15kW/rack: Standard containment + optimized tile layout. Cost: $5K–$15K/rack.
  • 15–30kW/rack: Containment + rack-level airflow capture (e.g., RackVortex). Cost: $2K–$5K/rack. Zero downtime to deploy.
  • 30–50kW/rack: Active air delivery + rear-door heat exchangers. Cost: $8K–$15K/rack + plumbing for RDHx.
  • 50–80kW/rack: Hybrid: direct-to-chip liquid for GPUs, air for everything else. Cost: $15K–$25K/rack + CDU infrastructure.
  • 80kW+ per rack: Full liquid cooling (direct-to-chip or immersion). Cost: $20K–$40K/rack + facility-level changes.
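
Here's the same framework encoded as a sketch; the thresholds and labels mirror the list above, and the cost ranges are planning heuristics rather than quotes.

```python
# The decision framework above as a lookup. Thresholds mirror the list;
# treat the approaches as starting points, not a substitute for a survey
# of the specific facility.

def cooling_approach(rack_kw: float) -> str:
    if rack_kw < 15:
        return "Standard containment + optimized tile layout"
    if rack_kw < 30:
        return "Containment + rack-level airflow capture"
    if rack_kw < 50:
        return "Active air delivery + rear-door heat exchangers"
    if rack_kw < 80:
        return "Hybrid: direct-to-chip liquid for GPUs, air for everything else"
    return "Full liquid cooling (direct-to-chip or immersion)"

for density in (12, 25, 45, 65, 100):
    print(f"{density:>3} kW/rack -> {cooling_approach(density)}")
```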

The Raised Floor Question

Most existing enterprise data centers have raised floors. The AI buildout question is whether that floor can support GPU workloads or needs to be ripped out.

The answer, for Tier 2 and Tier 3 workloads, is almost always: keep the floor, fix the delivery.

Run the numbers: required airflow is roughly CFM ≈ 3.16 × watts ÷ delta-T (°F). At 15kW per rack with a 20°F delta-T, that's about 2,400 CFM. Standard perforated tiles over an 18-inch plenum deliver a few hundred CFM each; high-flow grate tiles can move 1,000–1,500+ CFM per tile. Two grate tiles per rack — still a standard layout — covers the load, provided you're actually capturing that air at the server inlets.

The problem isn't floor capacity. It's delivery efficiency. As we detail in our analysis of recirculation losses, up to 40% of the air that comes through perforated tiles never reaches a server inlet. Fix the delivery, and your existing floor handles twice the density it's currently supporting.
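
To run that arithmetic for your own floor, here's a back-of-the-envelope sketch. The per-tile airflow and capture-efficiency values are assumptions to adjust to your measured figures.

```python
# Back-of-the-envelope raised-floor check. CFM ~= 3.16 * watts / delta-T(F)
# follows from air's specific heat; the per-tile airflow and capture
# efficiency are assumptions -- adjust them to your measured values.

def required_cfm(rack_kw: float, delta_t_f: float = 20.0) -> float:
    return 3.16 * rack_kw * 1000 / delta_t_f

def tiles_needed(rack_kw: float, cfm_per_tile: float = 1500.0,
                 capture_efficiency: float = 0.6) -> float:
    # capture_efficiency models the share of delivered air that actually
    # reaches server inlets (the recirculation losses discussed above).
    return required_cfm(rack_kw) / (cfm_per_tile * capture_efficiency)

print(f"15 kW rack: ~{required_cfm(15):,.0f} CFM required")  # ~2,370 CFM
for eff in (0.6, 0.85):
    print(f"capture {eff:.0%}: ~{tiles_needed(15, capture_efficiency=eff):.1f} grate tiles")
# At 60% capture you need ~2.6 high-flow tiles; at 85% the standard
# two-tile layout covers the rack. Delivery, not capacity, is the limit.
```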

What's Coming Next: 2027 and Beyond

NVIDIA's roadmap shows no sign of thermal deceleration. The Rubin architecture (expected 2027) is expected to push per-GPU TDP past 1,500W. At rack scale, that's 150kW+.

For hyperscale training, liquid cooling is inevitable and already being deployed at scale. But the enterprise segment — which is 80% of the market — will continue operating at 15–40kW densities for the foreseeable future. Inference doesn't need 1,400W GPUs. Fine-tuning doesn't need 120kW racks.

The smart investment for enterprise operators is maximizing air cooling efficiency today while designing liquid-ready infrastructure for tomorrow. That means:

  • Optimize airflow delivery now (immediate ROI, zero stranded capital)
  • Ensure your floor and ceiling can accommodate future piping runs
  • Don't buy liquid cooling equipment until you have the workload that demands it

Key Takeaways

  • 80% of enterprise AI deployments run at 15–40kW/rack — well within optimized air cooling capability
  • Liquid cooling is required above 80kW/rack (GB200 NVL72 class), not at 40kW
  • Inference workloads run at 40–60% of training TDP — don't over-cool for the wrong workload
  • Fix delivery efficiency before adding cooling capacity — most raised floors are under-utilized, not under-sized

Running AI on Existing Infrastructure?

Check whether your raised floor can support GPU rack densities with optimized airflow delivery.

