WorkingAgents + Accelsius: Agent Governance Meets the Physics of AI Infrastructure

By James Aspinwall, co-written by Alfred Pennyworth (my trusted AI) — March 7, 2026


Accelsius cools the hardware that runs AI. WorkingAgents governs the AI that runs on the hardware. As data centers scale from megawatts to gigawatts to support agentic workloads, the cooling infrastructure itself becomes a mission-critical system that needs intelligent, governed management. This partnership sits at the intersection of physical infrastructure and AI governance — two layers of the same stack that are about to converge.

What Accelsius Brings

Accelsius is a liquid cooling company founded in 2022, commercializing two-phase direct-to-chip technology originally developed at Nokia Bell Labs. Its $65M Series B closed in January 2026, led by Johnson Controls with Legrand joining; both are strategic investors that rank among the world's largest building-technology and electrical-infrastructure companies. Total funding: approximately $89M.

NeuCool Platform — Two-Phase Direct-to-Chip Cooling:

NeuCool MR250 — Row-Based CDU (Generally Available Oct 2025):

NeuGuard — Enterprise Support Program:

Intelligent Monitoring:

Customers and Partners:

Target Market: Hyperscalers, OEMs, colocation providers, and enterprise data centers deploying AI/HPC workloads at rack densities of 50+ kW.

What WorkingAgents Brings

WorkingAgents is an AI governance platform with three gateways (AI, Agent, MCP) that governs 60+ MCP tools with capability-based access control, audit trails, and guardrails on every agent action. The platform provides multi-step workflow orchestration, cost attribution, and observability across all agent interactions.

What WorkingAgents doesn’t have: physical infrastructure management, cooling telemetry, hardware monitoring, or data center operational technology. What it does have: the governance layer for autonomous systems that make decisions affecting mission-critical infrastructure.

The Strategic Thesis

Data center cooling is becoming an autonomous, AI-managed system. The industry trend is clear:

Accelsius already has intelligent monitoring with SNMP/IPMI/Redfish integration and auto-failover. The next step is AI agents that autonomously manage cooling infrastructure — adjusting coolant flow rates based on GPU utilization forecasts, preemptively routing cooling capacity before workload spikes, coordinating with power management systems, and making decisions that affect hardware worth millions per rack.

Those agents need governance. An autonomous cooling agent that makes a bad decision doesn’t just waste energy — it can thermal-throttle a training run that costs $100K/hour, or worse, damage GPU hardware that takes months to replace.

This is where WorkingAgents enters: governing the AI agents that manage the physical infrastructure that runs AI.

The Gap Analysis

Accelsius Gap → WorkingAgents Solution
  Monitoring is reactive (alerts and dashboards, not autonomous optimization) → Agent Gateway orchestrates multi-step cooling workflows with checkpoints
  No permission model for who/what can adjust cooling parameters → Capability-based access control scopes each agent and operator's authority
  No audit trail connecting cooling decisions to workload events → Unified logging across infrastructure and workload layers
  No guardrails preventing dangerous cooling configuration changes → Pre-execution validation blocks unsafe parameter changes
  No multi-system coordination (cooling + power + workload) → Agent orchestration across infrastructure, power, and compute agents
  No cost attribution linking cooling energy to specific workloads → Token-level cost tracking extended to infrastructure cost per job
  Single-system monitoring (cooling only) → Cross-system observability across cooling, power, compute, and network

WorkingAgents Gap → Accelsius Solution
  No physical infrastructure management capability → NeuCool platform with real-time cooling telemetry
  No hardware-level monitoring or telemetry → SNMP, IPMI, Redfish, DCIM integration
  No data center operational technology expertise → Decades of thermal engineering from Nokia Bell Labs heritage
  No edge/remote deployment infrastructure → Auto-failover for edge facilities with remote management
  No hardware reliability or warranty infrastructure → NeuGuard with $100K/rack coverage and CNA warranty backing
  No relationship with hyperscalers/colo operators → Equinix, Johnson Controls, Legrand partnerships

Synergy Areas

1. Governed Autonomous Cooling Management

The primary opportunity. AI agents that manage cooling infrastructure in real time, with WorkingAgents providing the governance layer:

Cooling Optimization Agent
  ✓ NeuCool: read thermal telemetry (all racks)
  ✓ NeuCool: adjust coolant flow rate (within safe bounds)
  ✓ NeuCool: adjust facility water temperature setpoint (±2°C)
  ✓ Workload Manager: read GPU utilization forecasts
  ✓ Alerts: send thermal warnings to NOC
  × NeuCool: disable cooling on active racks
  × NeuCool: change refrigerant type
  × Power: modify power distribution
  × Hardware: shut down compute nodes

Capacity Planning Agent
  ✓ NeuCool: read historical thermal data
  ✓ NeuCool: read CDU capacity utilization
  ✓ Workload Manager: read job queue and scheduling
  ✓ Reports: generate capacity forecasts
  × NeuCool: modify any operational parameters
  × Power: modify any operational parameters

The cooling agent can optimize within defined safety bounds — it can adjust flow rates and temperature setpoints but cannot disable cooling on active racks or make changes outside its authorized range. Every adjustment is logged with the thermal data that triggered it, the workload context, and the energy impact.

Human-in-the-loop for high-risk operations: “Agent recommends increasing facility water temp from 38°C to 42°C to save 15% cooling energy during off-peak — approve or deny?”
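The permission model above can be sketched in code. This is a minimal illustration of capability-scoped validation with a human-in-the-loop escalation path; every capability name, bound, and type here is hypothetical, not the WorkingAgents or NeuCool API.

```python
from dataclasses import dataclass

# Capabilities granted to the cooling optimization agent (the checked items above).
GRANTED = {
    "neucool.read_thermal_telemetry",
    "neucool.adjust_flow_rate",
    "neucool.adjust_water_setpoint",
    "workload.read_gpu_forecasts",
    "alerts.send_thermal_warning",
}

# Safe operating bounds enforced before any action executes (illustrative values).
SETPOINT_BASELINE_C = 38.0      # current facility water setpoint
SETPOINT_DELTA_C = 2.0          # agent may move the setpoint at most +/-2 deg C
FLOW_RATE_RANGE = (20.0, 95.0)  # percent of max pump capacity

@dataclass
class CoolingAction:
    capability: str
    value: float

def validate(action: CoolingAction) -> str:
    """Return 'allow', 'deny', or 'escalate' (human-in-the-loop review)."""
    if action.capability not in GRANTED:
        return "deny"  # e.g. disabling cooling on active racks is never granted
    if action.capability == "neucool.adjust_water_setpoint":
        if abs(action.value - SETPOINT_BASELINE_C) > SETPOINT_DELTA_C:
            return "escalate"  # outside +/-2 deg C: ask a human operator
    if action.capability == "neucool.adjust_flow_rate":
        lo, hi = FLOW_RATE_RANGE
        if not lo <= action.value <= hi:
            return "deny"
    return "allow"

print(validate(CoolingAction("neucool.adjust_flow_rate", 85.0)))       # allow
print(validate(CoolingAction("neucool.adjust_water_setpoint", 42.0)))  # escalate
print(validate(CoolingAction("neucool.disable_rack_cooling", 0.0)))    # deny
```

Note the 38°C → 42°C setpoint request from the example above lands in "escalate" rather than "deny": it is plausible, just outside the agent's autonomous authority.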

2. Cross-Layer Infrastructure Observability

Modern AI data centers have four interdependent layers: compute, network, power, and cooling. A problem in one layer cascades to the others. GPU utilization spike → thermal spike → cooling response → power draw increase → potential circuit overload.

WorkingAgents provides unified observability across all four layers:

{
  "event": "training_job_scale_up",
  "cluster": "gpu-rack-14-16",
  "timeline": [
    {
      "layer": "compute",
      "action": "GPU utilization increased 40% → 95%",
      "agent": "workload_scheduler",
      "timestamp": "2026-03-07T14:00:00Z"
    },
    {
      "layer": "cooling",
      "action": "NeuCool flow rate increased 60% → 85%",
      "agent": "cooling_optimizer",
      "guardrails": { "thermal_safety": "passed", "flow_rate_limit": "within_bounds" },
      "timestamp": "2026-03-07T14:00:03Z"
    },
    {
      "layer": "power",
      "action": "Rack power draw increased 45kW → 72kW",
      "agent": "power_monitor",
      "guardrails": { "circuit_capacity": "passed", "budget_check": "passed" },
      "timestamp": "2026-03-07T14:00:03Z"
    },
    {
      "layer": "cooling",
      "action": "CDU MR250 capacity at 88% — pre-staged adjacent CDU",
      "agent": "capacity_planner",
      "timestamp": "2026-03-07T14:00:05Z"
    }
  ],
  "total_cooling_cost_delta": "+$12.40/hr",
  "attributed_to": "customer: acme-corp, job: llm-finetune-v3"
}

One log traces the entire cascade from workload to cooling to power to cost. When something goes wrong — and in a 250kW-per-rack environment, the failure modes are severe — operators see the full chain of causation, not fragments scattered across four different monitoring systems.
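Consuming such a log is straightforward precisely because it is one structure rather than four. A small sketch, using an abbreviated version of the event above (the dict shape mirrors the JSON example; nothing here is a real WorkingAgents API):

```python
# Abbreviated version of the unified event log shown above.
event = {
    "event": "training_job_scale_up",
    "timeline": [
        {"layer": "compute", "action": "GPU utilization increased 40% -> 95%"},
        {"layer": "cooling", "action": "NeuCool flow rate increased 60% -> 85%"},
        {"layer": "power",   "action": "Rack power draw increased 45kW -> 72kW"},
        {"layer": "cooling", "action": "CDU MR250 at 88% -- pre-staged adjacent CDU"},
    ],
}

def cascade(event: dict) -> str:
    """Render the layer-by-layer chain of causation as one line."""
    return " -> ".join(step["layer"] for step in event["timeline"])

print(cascade(event))  # compute -> cooling -> power -> cooling
```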

3. Workload-Aware Cooling Cost Attribution

AI infrastructure operators need to know what cooling costs per job, per customer, per GPU-hour. Today, cooling is a facility-level overhead allocated by square footage or rack count. With NeuCool’s per-rack telemetry integrated through WorkingAgents’ cost attribution:

This transforms cooling from an undifferentiated facility expense into a metered, attributed, optimizable cost center — critical for colocation providers and cloud operators who need to price AI compute accurately.
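Mechanically, the attribution is a roll-up from per-rack cooling energy to per-job dollars. A minimal sketch, assuming hypothetical telemetry samples and an assumed electricity price; the field names are illustrative, not a NeuCool or WorkingAgents schema:

```python
from collections import defaultdict

ENERGY_PRICE_USD_PER_KWH = 0.12  # assumed facility electricity price

# Each sample: (rack_id, cooling_power_kw, hours, job_id) -- invented values.
samples = [
    ("rack-14", 6.2, 1.0, "llm-finetune-v3"),
    ("rack-15", 5.8, 1.0, "llm-finetune-v3"),
    ("rack-16", 1.1, 1.0, "batch-inference"),
]

def cooling_cost_by_job(samples):
    """Roll per-rack cooling energy up to per-job dollar cost."""
    cost = defaultdict(float)
    for _rack, kw, hours, job in samples:
        cost[job] += kw * hours * ENERGY_PRICE_USD_PER_KWH
    return dict(cost)

print({job: round(c, 3) for job, c in cooling_cost_by_job(samples).items()})
# {'llm-finetune-v3': 1.44, 'batch-inference': 0.132}
```

The same roll-up keyed by customer or divided by GPU-hours gives the other two views operators need.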

4. Predictive Maintenance with Governed Response

NeuCool’s hot-swappable components (pumps, power supplies, control boards, sensors) combined with intelligent monitoring generate rich telemetry data. AI agents can predict component failure before it happens:

Predictive Maintenance Agent
  ✓ NeuCool: read pump vibration, flow rate, pressure telemetry
  ✓ NeuCool: read historical component performance baselines
  ✓ Maintenance: create service tickets
  ✓ Alerts: notify NOC and NeuGuard service partner
  ✓ Capacity: pre-stage backup cooling capacity
  × NeuCool: shut down components
  × NeuCool: modify cooling parameters
  × Maintenance: dispatch technicians (requires human approval)

The agent detects that Pump A in CDU-14 shows vibration patterns consistent with bearing wear — 72-hour predicted failure window. It creates a service ticket through NeuGuard’s Authorized Service Partner network, pre-stages cooling redundancy on adjacent CDUs, and alerts the NOC. It cannot shut down the pump itself or dispatch a technician without human approval.

Every prediction, alert, and response is logged. When the NeuGuard warranty team reviews the incident, the audit trail shows exactly what happened, when the degradation was detected, what response was taken, and whether the governed workflow prevented downtime.
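The "72-hour predicted failure window" can come from something as simple as a linear trend fit over vibration telemetry. A sketch under that assumption; the threshold and readings are invented for illustration, not real alarm levels:

```python
VIBRATION_FAILURE_MM_S = 10.0  # assumed alarm threshold (mm/s), illustrative only

def hours_to_threshold(readings, threshold=VIBRATION_FAILURE_MM_S):
    """readings: list of (hour, vibration_mm_s) samples.
    Fit a least-squares line and return predicted hours until the threshold
    is crossed, or None if vibration is flat or falling."""
    n = len(readings)
    mean_x = sum(h for h, _ in readings) / n
    mean_y = sum(v for _, v in readings) / n
    slope = sum((h - mean_x) * (v - mean_y) for h, v in readings) / \
            sum((h - mean_x) ** 2 for h, _ in readings)
    if slope <= 0:
        return None
    _last_h, last_v = readings[-1]
    return (threshold - last_v) / slope

# Pump A, CDU-14: vibration rising ~0.05 mm/s per hour over two days
readings = [(0, 4.0), (24, 5.2), (48, 6.4)]
eta = hours_to_threshold(readings)  # approx. 72 hours after the last sample
```

A real deployment would use a proper degradation model, but the governance point is unchanged: the prediction triggers ticket creation and pre-staging, never an autonomous shutdown.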

5. Multi-Site Infrastructure Governance

Accelsius deploys across hyperscalers, colos, and enterprise data centers — each with different operational policies, SLAs, and risk tolerances. WorkingAgents’ Virtual MCP Servers scope governance per site:

Hyperscaler Site (Equinix Ashburn)
  ✓ Autonomous cooling optimization (full agent authority)
  ✓ Predictive maintenance (auto-create tickets)
  ✓ Workload-aware thermal routing
  Guardrails: thermal limits per Equinix SLA, power caps per contract

Edge Deployment (Remote, Unmanned)
  ✓ Autonomous cooling with enhanced auto-failover
  ✓ Remote monitoring and telemetry
  ✓ Emergency shutdown authority (agent can shut down cooling if leak detected)
  Guardrails: conservative thermal bounds, immediate NOC notification

Enterprise On-Prem
  ✓ Monitoring and alerting only
  × No autonomous parameter changes (human-approval required for all changes)
  Guardrails: strictest bounds, full audit trail for compliance

Same NeuCool hardware. Same WorkingAgents governance platform. Different policies per site based on operational maturity, staffing model, and risk tolerance.
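The three profiles above amount to per-site policy data that the same enforcement code consumes. A sketch of how that could look; the keys, values, and site names are hypothetical, not WorkingAgents configuration syntax:

```python
SITE_POLICIES = {
    "equinix-ashburn": {
        "autonomy": "full",              # agents act within SLA guardrails
        "max_setpoint_delta_c": 2.0,
        "auto_create_tickets": True,
        "human_approval": ["dispatch_technician"],
    },
    "edge-remote": {
        "autonomy": "full",
        "max_setpoint_delta_c": 1.0,     # conservative thermal bounds
        "emergency_shutdown": True,      # agent may kill cooling on leak detection
        "notify_noc_immediately": True,
    },
    "enterprise-onprem": {
        "autonomy": "monitor_only",      # alerts only, no parameter changes
        "human_approval": ["*"],         # every change requires sign-off
        "audit_level": "full",
    },
}

def requires_approval(site: str, action: str) -> bool:
    """One enforcement path, three different policies."""
    policy = SITE_POLICIES[site]
    if policy.get("autonomy") == "monitor_only":
        return True
    approvals = policy.get("human_approval", [])
    return "*" in approvals or action in approvals
```

At the hyperscaler site, flow-rate changes proceed autonomously while technician dispatch still escalates; at the enterprise site, everything escalates.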

6. Sustainability Reporting and ESG Compliance

Accelsius claims a 50% energy reduction vs. air cooling and a PUE of 1.08. These claims need continuous measurement and verification to support ESG reporting and sustainability commitments. WorkingAgents provides:

When a hyperscaler reports “our AI workloads run at PUE 1.08 using Accelsius NeuCool,” the WorkingAgents audit trail proves it — rack by rack, hour by hour, job by job.
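For reference, PUE is total facility power divided by IT equipment power, so "PUE 1.08" means only 8% overhead for cooling and everything else. A minimal sketch with invented sample values showing how the continuous measurement reduces to arithmetic over metered power:

```python
def pue(it_power_kw: float, cooling_kw: float, other_overhead_kw: float) -> float:
    """PUE = total facility power / IT equipment power."""
    total = it_power_kw + cooling_kw + other_overhead_kw
    return total / it_power_kw

# A 250 kW rack: two-phase liquid cooling overhead vs. a heavier
# air-cooled overhead (both splits are illustrative, not measured data).
print(round(pue(250.0, 12.0, 8.0), 2))   # 1.08 with liquid cooling
print(round(pue(250.0, 70.0, 30.0), 2))  # 1.4 with air cooling
```

Computed hourly from per-rack meters and logged with the governing audit trail, this is what turns a marketing number into an auditable one.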

Partnership Model

Phase 1: Telemetry Integration (Weeks 1-6)

Phase 2: Joint Pilot (Weeks 7-14)

Phase 3: Autonomous Operations Product (Weeks 15-24)

Revenue Opportunity

The AI data center cooling market is growing explosively:

WorkingAgents monetizes the governance layer on top of Accelsius’s hardware. Accelsius sells NeuCool with governed autonomous management as a differentiator over competitors (CoolIT, GRC, Motivair) who sell hardware without an intelligence layer. The joint offering commands a premium because it delivers measurable outcomes: lower PUE, predictive maintenance, cost attribution, and audit-ready sustainability reporting.

Why This Partnership Works

AI infrastructure is entering a phase where the physical and digital layers can no longer be managed independently. A 250 kW rack running NVIDIA B200s generates enough heat to warm dozens of homes. The cooling system keeping those GPUs alive is as mission-critical as the GPUs themselves. And as cooling systems become intelligent and autonomous, they need the same governance that any autonomous system demands: permissions, guardrails, audit trails, and human oversight for high-risk decisions.

Accelsius has built the best thermal solution for AI-scale computing — 4,500W per socket, PUE 1.08, two-phase efficiency that single-phase can’t match. WorkingAgents has built the governance platform for autonomous AI systems — permissions, guardrails, and audit trails at every decision point.

Accelsius removes the thermal barrier to AI scale. WorkingAgents removes the trust barrier to autonomous infrastructure. Together, they deliver what the next generation of AI data centers needs: cooling that’s intelligent, autonomous, and governed — from the chip to the audit trail.


James Aspinwall is the founder of WorkingAgents, an AI governance platform specializing in agent access control, security, and integration services for enterprises deploying AI at scale.