By James Aspinwall — February 2026
The Problem Everyone Gets Wrong
When people hear “AI for compliance,” they picture an AI reading every bank transaction and deciding whether it’s suspicious. That sounds impressive. It’s also ruinously expensive and dangerously slow.
A mid-size bank processes a million transactions a day. Running each one through an AI model costs roughly €1,000–3,000 per day in API fees alone. The AI takes anywhere from half a second to two seconds per transaction, an eternity when you need to block a fraudulent wire transfer before the money disappears. And when the regulator asks “why did you flag this?”, the answer “the AI had a feeling” doesn’t survive an audit.
There’s a better way.
The Idea: AI Writes the Rules, Code Enforces Them
Instead of using AI at runtime — reading every transaction as it flows through — you use AI once, at design time, to generate the compliance rules as executable code. Then that code runs against the transaction stream at machine speed.
Think of it like this. You wouldn’t hire a master chef to taste every plate that comes out of your restaurant kitchen. You’d have the chef design the recipes, train the cooks, and set the quality standards. Then the kitchen runs on its own, thousands of plates a night, consistently, at a fraction of the cost of the chef standing at every station.
The AI is the chef. The code is the kitchen.
How It Works in Practice
A compliance officer writes a rule in plain language:
“Flag any account that makes three or more cash deposits totalling over €9,000 within 24 hours, where each individual deposit is below €10,000.”
That’s a classic structuring detection rule — someone splitting a large deposit into smaller ones to avoid the €10,000 reporting threshold.
An AI model reads that description and generates a small, self-contained program that watches the transaction stream, tracks running totals per account, and raises an alert when the pattern matches. A developer reviews the generated code, approves it, and deploys it — without restarting the system.
The rule then processes millions of transactions per day at microsecond speed, for zero ongoing AI cost.
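As a rough illustration, the generated program for the structuring rule might look something like the sketch below. This is not the output of any particular model or the WorkingAgents API: the class name, field names, and alert format are all hypothetical. It assumes deposits arrive in timestamp order per account, and it reads "each individual deposit is below €10,000" as meaning that deposits at or above the threshold are simply ignored by this rule.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from datetime import datetime, timedelta
from decimal import Decimal


@dataclass(frozen=True)
class Deposit:
    account_id: str
    amount: Decimal        # euros
    timestamp: datetime


class StructuringRule:
    """Three or more sub-threshold cash deposits totalling over EUR 9,000 in 24 hours."""

    RULE_ID = "SC-001"
    WINDOW = timedelta(hours=24)
    MIN_DEPOSITS = 3
    TOTAL_FLOOR = Decimal("9000")
    REPORTING_THRESHOLD = Decimal("10000")

    def __init__(self):
        # Per-account sliding window of recent sub-threshold deposits.
        self._recent = defaultdict(deque)

    def observe(self, dep):
        """Record one cash deposit; return an alert dict if the pattern matches, else None."""
        if dep.amount >= self.REPORTING_THRESHOLD:
            return None  # not below the threshold, so outside this rule's scope
        window = self._recent[dep.account_id]
        window.append(dep)
        # Evict deposits older than 24 hours relative to the newest one.
        cutoff = dep.timestamp - self.WINDOW
        while window and window[0].timestamp < cutoff:
            window.popleft()
        total = sum((d.amount for d in window), Decimal("0"))
        if len(window) >= self.MIN_DEPOSITS and total > self.TOTAL_FLOOR:
            return {
                "rule_id": self.RULE_ID,
                "account_id": dep.account_id,
                "deposit_count": len(window),
                "total": total,
                "window_start": window[0].timestamp,
            }
        return None


# Four deposits totalling EUR 9,400 over 18 hours, each under EUR 10,000.
rule = StructuringRule()
start = datetime(2026, 2, 1, 8, 0)
for hours, amount in [(0, "2400"), (6, "2300"), (12, "2500"), (18, "2200")]:
    alert = rule.observe(Deposit("ACC-1", Decimal(amount), start + timedelta(hours=hours)))
print(alert)
```

In this run the first three deposits total €7,200 and raise nothing; the fourth pushes the 24-hour total to €9,400 and returns an alert. Everything the rule knows is held in a small per-account window, which is what makes it cheap to run on every transaction.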
Why This Matters for Business
Cost
Running AI inference on every transaction: €30,000–90,000 per month for a mid-size bank.
Running generated code on every transaction: effectively zero marginal cost. You pay for the servers, which you already have.
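The monthly figure follows from the per-day API fee quoted earlier (1,000 to 3,000 for a million transactions). A small sketch of that arithmetic, assuming a 30-day month, which the article does not state explicitly:

```python
TRANSACTIONS_PER_DAY = 1_000_000
DAYS_PER_MONTH = 30  # assumption; the month length is not given in the text


def monthly_cost(daily_cost):
    """Monthly inference spend for a given daily spend."""
    return daily_cost * DAYS_PER_MONTH


def cost_per_transaction(daily_cost):
    """Implied inference price for a single transaction."""
    return daily_cost / TRANSACTIONS_PER_DAY


for daily in (1_000, 3_000):
    print(
        f"{daily:,}/day -> {monthly_cost(daily):,}/month, "
        f"{cost_per_transaction(daily):.3f} per transaction"
    )
# prints:
# 1,000/day -> 30,000/month, 0.001 per transaction
# 3,000/day -> 90,000/month, 0.003 per transaction
```

The per-transaction price looks tiny in isolation. It is the volume that turns a tenth of a cent into tens of thousands a month.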
Speed
AI inference per transaction: 500 milliseconds to 2 seconds. By the time the AI decides a wire transfer is suspicious, the money may already be in another country.
Generated code per transaction: microseconds. Fast enough to block the transfer before it clears.
Auditability
Regulators — BaFin, FinCEN, the FCA — require you to explain exactly why a transaction was flagged. A rule written as code is perfectly transparent: “This transaction was flagged because the account made four deposits totalling €9,400 within 18 hours, each under €10,000, matching structuring rule SC-001.” That’s an answer a regulator can verify.
An AI model’s reasoning is a black box. You can ask it to explain itself, but the explanation is generated after the fact — it’s a story, not a proof.
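To make the contrast concrete, here is a minimal sketch of how a coded rule can produce its explanation directly from the values it matched on, so the wording and the decision cannot drift apart. The `RuleMatch` record and `explain` function are hypothetical names introduced for illustration only.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class RuleMatch:
    rule_id: str
    deposit_count: int
    total_eur: Decimal
    span_hours: int
    per_deposit_limit_eur: Decimal


def explain(match):
    """Render the audit explanation from the exact values the rule evaluated."""
    return (
        f"This transaction was flagged because the account made "
        f"{match.deposit_count} deposits totalling €{match.total_eur:,} within "
        f"{match.span_hours} hours, each under €{match.per_deposit_limit_eur:,}, "
        f"matching structuring rule {match.rule_id}."
    )


print(explain(RuleMatch("SC-001", 4, Decimal("9400"), 18, Decimal("10000"))))
```

Every number in the sentence is a field the rule computed, so an auditor can recompute it from the raw transactions and check that it matches.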
The Three-Tier Architecture
The smartest approach isn’t to eliminate AI entirely. It’s to put it where it adds value and keep it away from where it burns money.
Tier 1: Rule Engine (fast, cheap, deterministic). Generated code handles the bulk of the work: threshold checks, velocity monitoring, sanctions list screening, geographic risk scoring. It resolves about 95% of transactions at negligible cost.
Tier 2: AI Review (targeted, valuable). The 5% of transactions that are ambiguous (flagged but unclear, or showing novel patterns the rules haven’t seen) get routed to an AI model. The AI adds context, identifies connections between accounts, and drafts the narrative section of Suspicious Activity Reports that analysts currently spend hours writing by hand.
Tier 3: Human Analyst. The final decision stays with a person. But instead of reviewing raw transaction data, the analyst gets a pre-written case summary with the rule that triggered, the AI’s assessment, and a draft SAR. Their job shifts from data sifting to judgement calls.
This tiered approach means you’re paying for AI inference on maybe 50,000 transactions a day instead of a million. That cuts inference volume, and the API bill with it, by about 95% while actually improving the quality of the output.
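The routing between tiers can be sketched in a few lines. This is a simplified illustration under stated assumptions, not a reference design: `tier1` and `ai_review` are hypothetical callables, the AI call is a stub, and the handling of clear-cut results (pass or block) is left out so the focus stays on which transactions incur inference cost.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier1Result:
    rule_id: Optional[str]   # rule that fired, if any
    ambiguous: bool          # True when the rule engine cannot decide on its own


@dataclass(frozen=True)
class Case:
    transaction: dict
    rule_id: Optional[str]
    ai_assessment: str       # a real system would attach the draft SAR here


def run_tiers(transactions, tier1, ai_review):
    """Return (cases for the analyst queue, number of AI calls made)."""
    analyst_queue = []
    ai_calls = 0
    for tx in transactions:
        result = tier1(tx)               # Tier 1: deterministic, runs on everything
        if not result.ambiguous:
            continue                     # resolved by generated code, no AI cost
        ai_calls += 1                    # Tier 2: AI sees only the ambiguous slice
        analyst_queue.append(Case(tx, result.rule_id, ai_review(tx, result)))
    return analyst_queue, ai_calls       # Tier 3: a person decides each queued case


def demo_tier1(tx):
    # Hypothetical stand-in: treat every 20th transaction as ambiguous (5%).
    return Tier1Result(rule_id="SC-001", ambiguous=tx["id"] % 20 == 0)


def demo_ai_review(tx, result):
    return f"stub assessment for transaction {tx['id']}"   # no real model call


queue, calls = run_tiers(({"id": i} for i in range(1, 1001)), demo_tier1, demo_ai_review)
print(calls, len(queue))   # prints: 50 50
```

Scaled up, the same 5% ratio applied to a million transactions gives the 50,000 daily AI calls mentioned above.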
Why the WorkingAgents Framework
The WorkingAgents architecture is built for exactly this kind of work. A few properties make it particularly well-suited:
- Millions of lightweight processes. Each compliance rule, or even each active account, can run as its own isolated process. Half a million concurrent processes is routine. Each one holds its own state: running totals, velocity counters, behavioral baselines.
- JustInTime deployment. New rules can be deployed into a running system without downtime. No restart, no dropped transactions, no maintenance window at 3 AM.
- Fault isolation. If one rule crashes on an edge case, the others keep running. The system automatically restarts the failed rule and it picks up where it left off. No single point of failure.
- Built-in stream processing. WorkingAgents has native tools for processing high-volume data streams with backpressure: the system automatically slows intake if processing falls behind, preventing data loss without manual intervention.
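For readers unfamiliar with backpressure, the idea can be shown with a bounded queue. The sketch below uses Python's standard `asyncio` library and is not the WorkingAgents API; `run_pipeline`, `producer`, `consumer`, and `max_pending` are illustrative names. When the queue is full, the producer is suspended until the consumer catches up, so nothing is dropped and nothing piles up without limit.

```python
import asyncio


async def producer(queue, items):
    for item in items:
        await queue.put(item)   # suspends while the queue is full: this is the backpressure
    await queue.put(None)       # sentinel: no more items


async def consumer(queue, handle):
    processed = 0
    while True:
        item = await queue.get()
        if item is None:
            return processed
        await handle(item)      # e.g. run the compliance rules on one transaction
        processed += 1


async def run_pipeline(items, handle, max_pending=100):
    """Process every item while keeping at most max_pending items in flight."""
    queue = asyncio.Queue(maxsize=max_pending)
    _, processed = await asyncio.gather(producer(queue, items), consumer(queue, handle))
    return processed


async def check_transaction(tx_id):
    await asyncio.sleep(0)      # placeholder for real rule evaluation


print(asyncio.run(run_pipeline(range(1000), check_transaction, max_pending=10)))
```

The bound on the queue is the design choice that matters: intake speed is tied to processing speed rather than to how fast the upstream source can send.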
The Consulting Opportunity
For a licensed European bank running cloud infrastructure under heavy compliance requirements, this approach solves three problems at once:
- Cost: They’re evaluating AI tools but can’t justify the per-transaction inference costs at scale.
- Speed: Real-time fraud detection has to reach a decision before a transfer clears, and AI inference at 500 milliseconds to 2 seconds per transaction is too slow for that.
- Compliance: Regulators demand explainable, auditable decision-making that black-box AI models can’t provide.
The pitch is straightforward: your compliance team describes rules in plain language, AI generates auditable code that runs at wire speed, and the AI also handles the edge cases and paperwork that your analysts spend their days on. You get the benefits of AI where it matters and deterministic reliability where you need it.
Not every problem needs an AI model running in real time. Sometimes the smartest use of AI is to write the code and get out of the way.