Building a Regulated Finance Demo: Five AI Agents for Banking Compliance

James Aspinwall — February 2026

The Solaris demo targets regulated finance — German banking and fintech specifically. Five AI agents running on WorkingAgents handle compliance monitoring, regulatory tracking, API anomaly detection, exposure limits, and KYC orchestration. Everything runs on synthetic data. No real customer data, no paid services required.

This article is both the concept and the build plan.

The Five Agents

1. Compliance Monitor

A rule engine scans a transaction stream for suspicious patterns: structuring (amounts split just below reporting thresholds), rapid cross-border flows, and sudden activity on dormant accounts. When a pattern matches, the LLM drafts a Suspicious Activity Report (SAR) narrative — not just a flag, but a written explanation of why the activity is suspicious and what regulatory obligation it triggers.

Data: Transaction dataset (~10K records with 50-100 seeded suspicious patterns).

2. Regulatory Change Tracker

An RSS scraper pulls bulletins from BaFin and EBA — real public feeds, not mocked. The LLM classifies each bulletin by relevance to the institution’s operations, maps it to internal policy controls, and flags gaps where new regulations don’t align with existing compliance procedures.

Data: Real BaFin/EBA RSS feeds + a policy mapping table (regulation ID to internal control to compliance status).

3. API Anomaly Detector

A statistical baseline tracks normal API behavior — latency distributions, error rates, authentication patterns per endpoint. When metrics breach thresholds, the LLM correlates multiple signals into a single incident narrative: “Auth failures on /transfers spiked 400% while latency on /accounts degraded — possible credential stuffing attack.”

Data: API telemetry logs with injected anomalies.
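A minimal sketch of the statistical baseline idea, assuming a simple z-score over latency samples. The function names, the sample values, and the cut-off of 3 standard deviations are illustrative assumptions, not part of the demo spec.

```python
from statistics import mean, stdev

Z_THRESHOLD = 3.0  # assumed cut-off: more than 3 standard deviations above normal

def build_baseline(samples):
    """Summarise normal behaviour for one metric on one endpoint."""
    return {"mean": mean(samples), "stdev": stdev(samples)}

def z_score(value, baseline):
    """How many standard deviations the observed value sits from the baseline mean."""
    # Note: a perfectly flat baseline (stdev of 0) would need special handling.
    return (value - baseline["mean"]) / baseline["stdev"]

# Hypothetical latency samples (ms) for /api/v1/accounts during normal traffic.
baseline = build_baseline([118, 121, 119, 122, 120, 117, 123, 120])

for observed in (124, 2400):
    z = z_score(observed, baseline)
    label = "ANOMALY" if z > Z_THRESHOLD else "ok"
    print(f"latency={observed}ms z={z:.1f} {label}")
```

Only the readings labelled as anomalies would be passed on to the LLM, together with any other signals that fired in the same time window.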

4. Exposure Monitor

Simple arithmetic with regulatory teeth. Sum all positions per counterparty and compare the total against the CRR large exposure limit (25% of own funds in this demo; since CRR2 the limit is formally measured against Tier 1 capital). The LLM generates an audit-trail explanation when limits are approached or breached, and recommends specific actions: reduce the position, request an exemption, or escalate to the risk committee.

Data: Portfolio dataset with mock loans and credit lines approaching CRR limits.

5. KYC Orchestrator

A workflow engine sequences the onboarding pipeline: ID verification, sanctions screening, PEP (Politically Exposed Person) check, risk scoring. The LLM makes the escalate-vs-auto-approve decision and writes a review summary for each applicant. Edge cases (partial PEP matches, sanctions near-misses) get escalated with reasoning.

Data: 20-30 mock onboarding records with varying risk profiles.

Agent Summary

Agent	Core Logic	LLM Does What	Data Source
Compliance Monitor	Rule engine flags transaction patterns	Drafts SAR narrative with regulatory context	Transaction dataset (10K records)
Regulatory Change Tracker	RSS scraper pulls BaFin/EBA bulletins	Classifies relevance, maps to policies, flags gaps	Real BaFin/EBA feeds + policy mapping table
API Anomaly Detector	Statistical baseline + threshold alerts	Correlates signals into incident narrative	API telemetry logs
Exposure Monitor	Sum positions vs CRR limit (25% own funds)	Generates audit explanation + recommended action	Portfolio dataset
KYC Orchestrator	Workflow: ID → sanctions → PEP → risk score	Decides escalate vs auto-approve, writes summary	Onboarding records

Synthetic Data Requirements

All data is generated. Nothing touches real customers.

Transaction Dataset (~10K records)

Fields: amount, currency, sender_id, receiver_id, timestamp, sender_country, receiver_country, account_type, account_dormancy_days.

Seed 50-100 suspicious patterns across three categories:

Structuring: 8-12 transactions from the same sender, each EUR 9,500-9,900 (just under the EUR 10,000 reporting threshold), within a 48-hour window.
Rapid cross-border flows: Funds moving through 3+ countries in under 24 hours, amounts above EUR 50,000.
Dormant account activity: Accounts inactive for 180+ days suddenly receiving or sending large transfers.

The remaining ~9,900 transactions are normal retail and commercial banking activity — salary payments, utility bills, standard transfers.
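As a rough illustration, one structuring pattern could be seeded like this, assuming the generator is written in Python. The function name `seed_structuring`, the account ID format, and the fixed country and account type values are hypothetical; the field names and amount ranges follow the spec above.

```python
import random
from datetime import datetime, timedelta

def seed_structuring(rng, sender_id, start):
    """Return 8-12 transfers of EUR 9,500-9,900 from one sender within 48 hours."""
    records = []
    for _ in range(rng.randint(8, 12)):
        # Spread the transfers at random minutes inside the 48-hour window.
        offset = timedelta(minutes=rng.randint(0, 48 * 60 - 1))
        records.append({
            "amount": round(rng.uniform(9500, 9900), 2),
            "currency": "EUR",
            "sender_id": sender_id,
            "receiver_id": f"ACC-{rng.randint(1000, 9999)}",  # hypothetical ID format
            "timestamp": (start + offset).isoformat(),
            "sender_country": "DE",
            "receiver_country": "DE",
            "account_type": "retail",
            "account_dormancy_days": 0,
        })
    return sorted(records, key=lambda r: r["timestamp"])

rng = random.Random(42)  # fixed seed so the dataset can be regenerated exactly
pattern = seed_structuring(rng, "ACC-0001", datetime(2026, 2, 1, 8, 0))
print(f"seeded {len(pattern)} structuring transactions")
```

Passing in a seeded `random.Random` instance keeps each run reproducible, which helps when the same scenario has to be replayed in front of a prospect.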

KYC Onboarding Records (20-30 applicants)

Mix of risk profiles:

15-20 clean passes (standard individuals, domestic, no flags)
3-5 PEP hits (government officials, family members of officials)
2-3 sanctions near-matches (name similarity to sanctioned entities)
1-2 clear sanctions matches (for demonstrating hard-stop workflow)

Each record: name, date_of_birth, nationality, document_type, document_id, address, occupation, source_of_funds.

Regulatory Feed (30-50 bulletins)

Pull real bulletins from BaFin’s public RSS feed (bafin.de/EN/Newsroom). Supplement with 10-15 mocked bulletins if needed.

Policy mapping table (15-20 rows):

Regulation ID	Regulation Title	Internal Control	Status
BaFin-2024-MaRisk	MaRisk update on IT risk	IT-CTRL-007	Compliant
EBA-GL-2024-AML	AML Guidelines Rev. 3	AML-CTRL-012	Gap identified

Include 2-3 gap scenarios where new regulations have no corresponding internal control.
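A small sketch of how the mapping table could drive gap detection before anything is handed to the LLM. The two mapped rows come from the table above; the third regulation ID and the function name `find_gaps` are hypothetical.

```python
# Policy mapping table keyed by regulation ID (rows taken from the example table).
policy_map = {
    "BaFin-2024-MaRisk": {"control": "IT-CTRL-007", "status": "Compliant"},
    "EBA-GL-2024-AML": {"control": "AML-CTRL-012", "status": "Gap identified"},
}

# Regulation IDs extracted from incoming bulletins; the last one is a made-up
# example of a regulation with no internal control at all.
incoming = ["BaFin-2024-MaRisk", "EBA-GL-2024-AML", "EBA-GL-2025-OUTSOURCING"]

def find_gaps(regulation_ids, mapping):
    """Return (regulation_id, reason) pairs for every unmapped or non-compliant regulation."""
    gaps = []
    for reg_id in regulation_ids:
        row = mapping.get(reg_id)
        if row is None:
            gaps.append((reg_id, "no internal control mapped"))
        elif row["status"] != "Compliant":
            gaps.append((reg_id, f"{row['control']}: {row['status']}"))
    return gaps

gaps = find_gaps(incoming, policy_map)
for reg_id, reason in gaps:
    print(reg_id, "->", reason)
```

In the demo the LLM would do the harder part, deciding which regulation a free-text bulletin refers to; the lookup itself can stay deterministic.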

API Telemetry (mock logs)

Fields: endpoint, method, latency_ms, status_code, auth_method, source_ip, timestamp, user_agent.

Generate 5,000-10,000 log entries with normal traffic, then inject:

Auth failure spike: 200+ 401 responses on /api/v1/transfers from 3-4 IPs within 10 minutes.
Unusual endpoint pattern: Repeated hits on /api/v1/admin/users from a service account that normally only calls /api/v1/accounts.
Latency degradation: P99 latency on /api/v1/accounts climbing from 120ms to 2,400ms over 30 minutes.

Build Plan

Ordered by priority. Each phase produces something demoable.

Phase 1: Data Generation Script

Build a reusable synthetic data generator — either an Elixir module or a Python script.

Outputs:

transactions.json — 10K transaction records with seeded anomalies
kyc_applicants.json — 20-30 onboarding records
portfolio.json — counterparty exposure positions
api_telemetry.json — 5-10K API log entries with injected anomalies

This script is reusable. When the next fintech prospect needs a demo with different parameters, regenerate the data with different seeds. The generator is a deliverable, not throwaway code.

Phase 2: Compliance Monitor (build first)

This is the highest-impact agent and the template for all others.

Build end-to-end:

Transaction stream ingestion (read from dataset, simulate real-time feed)
Rule engine: pattern matching for structuring, cross-border, dormant account triggers
LLM call: flagged transactions go to the model with context, model returns SAR narrative
Output: timestamped alert with SAR draft, risk score, recommended action

This is the most visual agent: a live feed of transactions scrolls by, a flag appears, and a SAR draft generates in real time. Build the full pipeline here, then reuse the pattern (ingest → detect → LLM narrative → output) for the remaining agents.
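The detect step for structuring might look like the sketch below: a sliding 48-hour window per sender over near-threshold amounts. The lower bound of EUR 9,500 and the minimum count of 8 are taken from the seeded pattern definition; the function names and the prompt wording are assumptions, and the LLM call itself is left out.

```python
from collections import defaultdict
from datetime import datetime, timedelta

THRESHOLD_EUR = 10_000        # reporting threshold from the dataset spec
NEAR_THRESHOLD_EUR = 9_500    # lower bound of "just under" (from the seeded pattern)
MIN_COUNT = 8                 # smallest seeded pattern size
WINDOW = timedelta(hours=48)

def detect_structuring(transactions):
    """Return {sender_id: [transactions]} for senders matching the structuring rule."""
    by_sender = defaultdict(list)
    for tx in transactions:
        if NEAR_THRESHOLD_EUR <= tx["amount"] < THRESHOLD_EUR:
            by_sender[tx["sender_id"]].append(tx)

    flagged = {}
    for sender, txs in by_sender.items():
        txs.sort(key=lambda t: t["timestamp"])
        start = 0
        for end in range(len(txs)):
            # Advance the window start until the span is at most 48 hours.
            while txs[end]["timestamp"] - txs[start]["timestamp"] > WINDOW:
                start += 1
            if end - start + 1 >= MIN_COUNT:
                flagged[sender] = txs[start:end + 1]
                break
    return flagged

def build_sar_prompt(sender, txs):
    """Assemble the context that would be sent to the LLM for a SAR draft."""
    total = sum(t["amount"] for t in txs)
    return (f"Sender {sender} made {len(txs)} transfers totalling EUR {total:,.2f}, "
            f"each just under EUR {THRESHOLD_EUR:,}, within 48 hours. "
            "Draft a SAR narrative explaining why this is suspicious.")

# Timestamps are datetime objects here; ISO strings would need parsing first.
base = datetime(2026, 2, 1, 9, 0)
sample = [{"sender_id": "ACC-0001", "amount": 9_700.0,
           "timestamp": base + timedelta(hours=4 * i)} for i in range(9)]
sample.append({"sender_id": "ACC-0002", "amount": 1_250.0, "timestamp": base})

flagged = detect_structuring(sample)
for sender, txs in flagged.items():
    print(build_sar_prompt(sender, txs))
```

Keeping detection in plain rules and reserving the LLM for the narrative means the flag itself stays deterministic and auditable.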

Phase 3: Exposure Monitor (quick win)

The simplest agent. Pure arithmetic plus an LLM explanation layer.

Load portfolio positions grouped by counterparty
Sum exposure per counterparty, compare to CRR limit (25% of own funds — use EUR 100M as mock own funds, so limit is EUR 25M)
Flag any counterparty above 80% of limit (approaching) or above 100% (breach)
LLM generates audit-trail text: what the exposure is, why it’s flagged, recommended action

Build time: half a day. Demonstrates that not every agent needs complex ML — sometimes the value is in the explanation and audit trail, not the detection.
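The arithmetic can be sketched in a few lines. The EUR 100M own funds figure, the 25% limit, and the 80% warning level come from the steps above; the counterparty names, field names, and position sizes are made up for illustration.

```python
from collections import defaultdict

OWN_FUNDS_EUR = 100_000_000           # mock own funds from the build plan
LIMIT_EUR = 0.25 * OWN_FUNDS_EUR      # large exposure limit: EUR 25M
APPROACHING_RATIO = 0.80              # flag at 80% of the limit

def classify_exposures(positions):
    """Sum exposure per counterparty and label it ok, approaching, or breach."""
    totals = defaultdict(float)
    for p in positions:
        totals[p["counterparty"]] += p["exposure_eur"]

    results = {}
    for counterparty, total in totals.items():
        utilisation = total / LIMIT_EUR
        if utilisation > 1.0:
            status = "breach"
        elif utilisation > APPROACHING_RATIO:
            status = "approaching"
        else:
            status = "ok"
        results[counterparty] = {"total_eur": total, "utilisation": utilisation, "status": status}
    return results

# Hypothetical portfolio: one breach, one approaching, one comfortably inside.
positions = [
    {"counterparty": "Alpha GmbH", "exposure_eur": 15_000_000},
    {"counterparty": "Alpha GmbH", "exposure_eur": 12_000_000},
    {"counterparty": "Beta AG", "exposure_eur": 21_000_000},
    {"counterparty": "Gamma KG", "exposure_eur": 5_000_000},
]

results = classify_exposures(positions)
for name, r in results.items():
    print(f"{name}: EUR {r['total_eur']:,.0f} ({r['utilisation']:.0%} of limit) {r['status']}")
```

Only the counterparties labelled approaching or breach would be sent to the LLM for the audit-trail text.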

Phase 4: Regulatory Change Tracker (real public data)

This agent is impressive because it uses real regulatory text, not synthetic data.

Pull RSS from BaFin (bafin.de/EN/Newsroom); the feeds are public and need no authentication
Parse bulletin title, date, category, summary text
LLM classifies: relevant to our operations? Which business lines?
LLM maps to internal controls using the policy mapping table
LLM flags gaps: “New EBA guideline on outsourcing risk has no corresponding internal control”

The gap detection is the centerpiece of the demo. A regulator publishes something new, and within seconds the agent tells you what you’re missing.
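The parse step could be sketched as follows, assuming the feed uses standard RSS 2.0 item elements (`title`, `pubDate`, `category`, `description`). The sample XML is a mock, not real BaFin content, and the actual feed layout has not been verified here. Fetching over HTTP is omitted so the example runs offline.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Mock feed for illustration only; not real regulator content.
SAMPLE_RSS = """<rss version="2.0"><channel>
  <title>Mock regulator feed</title>
  <item>
    <title>Mock guideline on outsourcing risk</title>
    <pubDate>Mon, 02 Feb 2026 09:00:00 +0100</pubDate>
    <category>Guidelines</category>
    <description>Placeholder summary text.</description>
  </item>
</channel></rss>"""

def parse_bulletins(xml_text):
    """Extract title, date, category, and summary from each RSS item."""
    root = ET.fromstring(xml_text)
    bulletins = []
    for item in root.iter("item"):
        bulletins.append({
            "title": item.findtext("title", default="").strip(),
            # Assumes every item carries an RFC 2822 pubDate, as RSS 2.0 specifies.
            "date": parsedate_to_datetime(item.findtext("pubDate")).date().isoformat(),
            "category": item.findtext("category", default="").strip(),
            "summary": item.findtext("description", default="").strip(),
        })
    return bulletins

bulletins = parse_bulletins(SAMPLE_RSS)
for b in bulletins:
    print(b["date"], b["category"], b["title"])
```

Each parsed bulletin would then be passed to the LLM along with the policy mapping table for classification and gap flagging.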

Phase 5: API Anomaly Detector + KYC Orchestrator

Build fully if time permits. If time is short, show these as simplified versions or mockups.

API Anomaly Detector (simplified):

Skip the statistical baseline and use hardcoded thresholds (>50 errors/minute, >2 s P99 latency)
Feed the threshold breaches to the LLM for incident narrative
Still demonstrates the concept
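A sketch of the simplified detector, using the two hardcoded thresholds above. Treating any status code of 400 or higher as an error, bucketing by endpoint and minute, and the nearest-rank P99 method are all assumptions made for this example.

```python
import math
from collections import defaultdict

MAX_ERRORS_PER_MINUTE = 50
MAX_P99_LATENCY_MS = 2_000

def p99(values):
    """99th percentile by the nearest-rank method."""
    ordered = sorted(values)
    rank = math.ceil(len(ordered) * 99 / 100)
    return ordered[rank - 1]

def find_breaches(logs):
    """Group log entries per endpoint and minute, then compare against the thresholds."""
    buckets = defaultdict(list)
    for entry in logs:
        minute = entry["timestamp"][:16]  # "YYYY-MM-DDTHH:MM" prefix of an ISO timestamp
        buckets[(entry["endpoint"], minute)].append(entry)

    breaches = []
    for (endpoint, minute), entries in sorted(buckets.items()):
        errors = sum(1 for e in entries if e["status_code"] >= 400)
        latency = p99([e["latency_ms"] for e in entries])
        if errors > MAX_ERRORS_PER_MINUTE:
            breaches.append({"endpoint": endpoint, "minute": minute,
                             "signal": "error_rate", "value": errors})
        if latency > MAX_P99_LATENCY_MS:
            breaches.append({"endpoint": endpoint, "minute": minute,
                             "signal": "p99_latency_ms", "value": latency})
    return breaches

# Hypothetical one-minute slice: an auth failure burst plus a slow tail on /accounts.
logs = (
    [{"endpoint": "/api/v1/transfers", "status_code": 401, "latency_ms": 90,
      "timestamp": "2026-02-01T10:00:30"}] * 60
    + [{"endpoint": "/api/v1/accounts", "status_code": 200, "latency_ms": 100,
        "timestamp": "2026-02-01T10:00:10"}] * 97
    + [{"endpoint": "/api/v1/accounts", "status_code": 200, "latency_ms": 2400,
        "timestamp": "2026-02-01T10:00:50"}] * 3
)

breaches = find_breaches(logs)
for b in breaches:
    print(b)
```

Both breaches land in the same minute, which is the kind of co-occurrence the LLM would be asked to correlate into a single incident narrative.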

KYC Orchestrator (simplified):

Run through 5 applicants live: 2 clean, 1 PEP hit, 1 sanctions near-match, 1 auto-approve edge case
Show the LLM’s escalation reasoning for the PEP and sanctions cases
The workflow visualization (pipeline steps with pass/fail indicators) is more important than the depth of each check
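The pipeline steps with their pass/fail indicators could be sketched as below. The sanctions and PEP entries are invented names, the 0.85 similarity cut-off is an assumption, and the final decision rule is a deterministic stand-in for the LLM call described above.

```python
from difflib import SequenceMatcher

SANCTIONS_LIST = ["Ivan Petrov"]     # hypothetical sanctioned name
PEP_NAMES = {"maria beispiel"}       # hypothetical PEP register entry
NEAR_MATCH_RATIO = 0.85              # assumed similarity cut-off for a near-match

def sanctions_check(name):
    """Return fail on an exact match, review on a near-match, pass otherwise."""
    normalized = name.strip().lower()
    best = 0.0
    for listed in SANCTIONS_LIST:
        if normalized == listed.lower():
            return "fail"
        best = max(best, SequenceMatcher(None, normalized, listed.lower()).ratio())
    return "review" if best >= NEAR_MATCH_RATIO else "pass"

def run_pipeline(applicant):
    """Run the onboarding steps in order and return (steps, decision)."""
    steps = [
        ("id_verification", "pass" if applicant.get("document_id") else "fail"),
        ("sanctions_screening", sanctions_check(applicant["name"])),
        ("pep_check", "review" if applicant["name"].strip().lower() in PEP_NAMES else "pass"),
    ]
    outcomes = [outcome for _, outcome in steps]
    # Toy risk score: a hard fail counts double, a review counts once.
    steps.append(("risk_score", 2 * outcomes.count("fail") + outcomes.count("review")))

    # Stand-in for the LLM decision: hard stop, escalate, or auto-approve.
    if "fail" in outcomes:
        decision = "reject"
    elif "review" in outcomes:
        decision = "escalate"
    else:
        decision = "auto_approve"
    return steps, decision

applicants = [
    {"name": "Anna Schmidt", "document_id": "D-1001"},
    {"name": "Ivan Petrow", "document_id": "D-1002"},     # near-match on spelling
    {"name": "Maria Beispiel", "document_id": "D-1003"},  # PEP hit
    {"name": "Ivan Petrov", "document_id": "D-1004"},     # exact sanctions match
]
for applicant in applicants:
    steps, decision = run_pipeline(applicant)
    print(applicant["name"], steps, "->", decision)
```

The list of step results maps directly onto the pipeline visualization, and the escalated cases are the ones where the LLM's written reasoning would be shown.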

Deployment Model

Each customer deployment is a dedicated instance — one business, one container, one VPC. No shared infrastructure, no multi-tenancy, no tenant isolation complexity.

Docker containerization — all Elixir code, MCP framework, agent configs, and data in a single image
VPC deployment — each business gets dedicated compute, storage, and database
Instance provisioning — infrastructure-as-code pipeline for one-click deployment of new customer instances
Instance management — dashboard for deploying, monitoring, updating, and backing up customer instances
Horizontal scaling — add a new instance for each new business rather than adding tenants to shared infrastructure

The data model is simple by design: no tenant IDs, no row-level security, no context switching. Single-business context assumed throughout. Audit trails are per-instance.

Stack and Tools

Required:

WorkingAgents platform (agent definitions, tool configs, system prompts)
Synthetic data generator (Phase 1 deliverable)
Simple dashboard UI — single-page app showing a live agent action feed (all five agents’ outputs in one timeline)
Instance provisioning pipeline — Docker build + VPC setup automation (Terraform/Pulumi)

Nice to have (can simulate if unavailable):

Mambu sandbox (free developer account at developer.mambu.com) — provides authentic banking API calls instead of reading from JSON files
BaFin RSS feed (real, public, free) — gives the Regulatory Tracker live data
Mock REST API serving transaction data — agent calls an API endpoint instead of reading a file, which looks more realistic in demos

Cost: zero. Every component is either open source, free-tier, or synthetic. No paid services, no real data, no vendor dependencies.

Deliverables

Data generation script — reusable across demos, parameterized for different scenarios
Five agent configurations — WorkingAgents configs with tool definitions, system prompts, and trigger conditions
Dashboard UI — single page showing all agent activity in a unified timeline
Screen recordings — 30-second capture of each agent in action, for the pitch deck
Live demo flow (~5 minutes): transaction stream starts → Compliance Monitor flags structuring → SAR draft generates → Exposure Monitor fires on a counterparty approaching limit → Regulatory Tracker picks up a new BaFin bulletin and flags a policy gap

The demo tells a story: “Your compliance team is asleep. These five agents aren’t.”

Beyond the Demo: Autonomous Operations

The demo shows agents that analyze, recommend, and draft. The product vision goes further: agents that execute.

Each recommendation in the dashboard gets a one-click “Execute Recommendation” button. The Compliance Monitor does not just draft a SAR; it pre-populates the goAML submission. The Exposure Monitor does not just recommend reducing a position; it initiates the workflow. The Regulatory Tracker does not just identify gaps; it creates remediation tickets with owners and deadlines.

Guardrails are built in: approval workflows (configurable per action type), rollback mechanisms (every action reversible), audit trails (who approved, when it executed, what changed). Low-risk actions auto-execute with post-hoc review. High-risk actions require explicit human approval.
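One way the approval gate could work, as a sketch only. The action names, the risk assignments, and the `ActionGate` class are hypothetical; rollback is not modelled here.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical risk classification per action type (configurable in the real product).
RISK_BY_ACTION = {
    "create_remediation_ticket": "low",
    "submit_sar": "high",
}

@dataclass
class ActionGate:
    """Decides whether an action runs immediately or waits for human approval."""
    audit_log: list = field(default_factory=list)

    def request(self, action, approved_by=None):
        # Unknown action types are treated as high risk by default.
        risk = RISK_BY_ACTION.get(action, "high")
        if risk == "high" and approved_by is None:
            outcome = "pending_approval"
        else:
            outcome = "executed"
        # Every request is recorded, whether or not it executed.
        self.audit_log.append({
            "action": action,
            "risk": risk,
            "outcome": outcome,
            "approved_by": approved_by,
            "at": datetime.now(timezone.utc).isoformat(),
        })
        return outcome

gate = ActionGate()
print(gate.request("create_remediation_ticket"))                 # low risk
print(gate.request("submit_sar"))                                # high risk, no approver
print(gate.request("submit_sar", approved_by="compliance.lead")) # high risk, approved
```

Defaulting unknown actions to high risk is a deliberate choice in this sketch: a newly added action type cannot auto-execute until someone classifies it.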

The consulting pitch: “AI that not only tells you what to do, but can do it for you, with appropriate oversight.” Industry-specific compliance and reporting is where generic AI falls flat. BaFin reporting, COREP filing, and goAML submissions require domain-specific templates, regulatory knowledge, and integration with industry-standard systems. The agents become adapters: “AI that speaks BaFin.” Each vertical becomes a specialized offering that is stickier, harder to replace, and able to justify premium pricing.

What We Still Need From Solaris

The architecture is designed. The demo is specced. But we are missing critical operational data to build a credible cost-reduction model:

Daily operations: How many transactions/day? How many KYC applications/month? How many API incidents?
Staffing: Headcount per compliance function — analysts, risk officers, KYC reviewers, IT security
Alert volume: Current AML alert volume, false positive rate, average investigation time
Cost structure: Operational headcount and salary bands for ROI calculation

We do not know if Solaris handles 10 incidents per day or 1,000. The demo proves the technology works. The business case requires their numbers.