James Aspinwall — February 2026
Banking APIs are the nervous system of modern finance. PSD2 opened them to third parties. DORA made their resilience a legal obligation. And every day, credential stuffing bots hammer authentication endpoints while nobody is watching the right dashboard.
This agent monitors API telemetry — latency distributions, error rates, authentication patterns — establishes statistical baselines, detects anomalies, correlates multiple signals into a single incident narrative, and tells a human exactly what is happening and why it matters.
The Threat Landscape
Banking APIs face specific threats that map directly to the OWASP API Security Top 10 (2023):
Credential stuffing (MITRE ATT&CK T1110.004): Attackers take stolen email/password pairs from data breaches and test them against login endpoints in distributed, stealthy patterns. Modern campaigns use residential IP botnets, mimic real user behavior, and rotate User-Agent strings. In API logs, this manifests as a spike in HTTP 401 responses from distributed IPs, mismatched User-Agent strings, login attempts against unrelated accounts, and geographically impossible travel patterns.
BOLA (Broken Object Level Authorization — API1:2023): The #1 API vulnerability. Attackers manipulate object IDs to access other customers’ data. Represents approximately 40% of all API attacks.
Unrestricted Resource Consumption (API4:2023): Brute-force attacks that exhaust rate limits, balance inquiry scraping, and automated abuse of account-opening flows.
Business logic abuse (API6:2023): Automated exploitation of fund transfer, loan application, or account opening workflows in ways the system technically allows but operationally should not.
The agent does not prevent these attacks — firewalls, WAFs, and rate limiters do that. The agent detects them, correlates the signals, and explains what is happening so a human can respond appropriately.
Regulatory Requirements
DORA (Regulation 2022/2554) — In Force Since January 17, 2025
DORA is now the primary regulatory framework for ICT incident management in EU financial services. The key articles:
Article 17 — ICT Incident Management Process: Financial entities must define, establish, and implement processes to detect, manage, and notify ICT-related incidents. They must record all ICT-related incidents and significant cyber threats. The process must include early warning indicators, classification and tracking procedures, roles and responsibilities for different scenarios, communication plans, and management body reporting.
Article 18 — Classification: An incident is “major” if it meets quantitative thresholds for at least two of six criteria: clients affected, transactions affected, duration, geographic spread, data integrity/availability loss, and criticality of services. Specific thresholds are defined in Commission Delegated Regulation (EU) 2024/1772.
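The two-of-six rule lends itself to a direct encoding. Below is a minimal sketch of how the agent might count criteria; the threshold values are illustrative placeholders, not the actual figures from Delegated Regulation (EU) 2024/1772, and the field names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class IncidentImpact:
    clients_affected_pct: float
    transactions_affected_pct: float
    duration_hours: float
    member_states_affected: int       # geographic spread
    data_losses: bool                 # data integrity/availability loss
    critical_services_affected: bool

def criteria_met(impact: IncidentImpact) -> int:
    """Count how many of the six classification criteria are triggered.
    Thresholds below are placeholders, NOT the regulatory figures."""
    checks = [
        impact.clients_affected_pct > 10.0,
        impact.transactions_affected_pct > 10.0,
        impact.duration_hours > 24.0,
        impact.member_states_affected >= 2,
        impact.data_losses,
        impact.critical_services_affected,
    ]
    return sum(checks)

def is_major(impact: IncidentImpact) -> bool:
    # "Major" when quantitative thresholds are met for at least two of six criteria.
    return criteria_met(impact) >= 2
```

The human-in-the-loop rule still applies: this check can flag a candidate "major" classification, but a person confirms it before any reporting clock is treated as started.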
Article 19 — Reporting Timeline:
| Report | Deadline | Content |
|---|---|---|
| Initial notification | Within 4 hours of classifying as major, within 24 hours of detection | Type, services affected, estimated impact, initial mitigation |
| Intermediate report | Within 72 hours of initial notification | Updated status, revised quantified impact, preliminary root cause |
| Final report | Within 1 month after intermediate | Confirmed root cause, control failures, definitive impact, lessons learned |
The final report contains 101 data points. This is not a memo — it is a structured submission.
BaFin BAIT (Circular 10/2017 BA)
BAIT provides Germany-specific requirements on top of DORA:
- Security-relevant information must be evaluated in an “appropriately timely, rule-based and centralised manner”
- Logs must be available for a “reasonable amount of time” for subsequent evaluation
- Zero Trust security model with least-privilege access
- IT emergency management procedures must be documented and tested
PSD2 (Directive 2015/2366)
PSD2 Article 96 previously required major operational/security incident reporting. As of January 2025, this is superseded by DORA’s harmonized framework. However, PSD2’s Strong Customer Authentication (SCA) requirements still define what the APIs must enforce — and what the anomaly detector must watch for when SCA mechanisms are being probed.
Input: API Telemetry
The agent ingests API log entries with these fields:
| Field | Purpose |
|---|---|
| `endpoint` | Which API path was called |
| `method` | HTTP method (GET, POST, etc.) |
| `latency_ms` | Response time |
| `status_code` | HTTP response code |
| `auth_method` | OAuth, API key, mTLS, session |
| `source_ip` | Client IP address |
| `timestamp` | When the request occurred |
| `user_agent` | Client identifier |
| `trace_id` | Request correlation ID |
For the demo: 5,000-10,000 synthetic log entries with normal traffic, plus three injected anomaly scenarios.
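As a sketch, the log entry above maps naturally onto a small record type. Field names follow the table; the types and the example values are assumptions for the demo, not a wire format.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class ApiLogEntry:
    endpoint: str        # API path, e.g. "/api/v1/accounts"
    method: str          # HTTP method
    latency_ms: float    # response time
    status_code: int     # HTTP response code
    auth_method: str     # "oauth" | "api_key" | "mtls" | "session"
    source_ip: str       # client IP address
    timestamp: datetime  # when the request occurred
    user_agent: str      # client identifier
    trace_id: str        # request correlation ID

# A single synthetic entry of the kind the demo generates in bulk.
entry = ApiLogEntry(
    endpoint="/api/v1/accounts",
    method="GET",
    latency_ms=72.4,
    status_code=200,
    auth_method="oauth",
    source_ip="203.0.113.7",
    timestamp=datetime.now(timezone.utc),
    user_agent="MobileBankingApp/4.2",
    trace_id="a1b2c3d4",
)
```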
Processing: Detection and Correlation
Stage 1 — Baseline Establishment
The system establishes what “normal” looks like using statistical methods:
Latency percentiles (per endpoint):
| Metric | Typical Banking API Baseline |
|---|---|
| P50 (median) | 50-100ms for account queries |
| P95 | 200-500ms |
| P99 | 500ms-2s |
Error rate baselines:
| Metric | Normal | Alert Threshold |
|---|---|---|
| HTTP 4xx rate | < 1-2% | > 5% sustained |
| HTTP 5xx rate | < 0.1% | > 0.5% sustained |
| HTTP 401 rate | < 0.5% | Spike > 3x baseline |
Seasonality handling: Banking API traffic has strong daily patterns (3x higher during business hours), weekly patterns (weekday vs weekend), and monthly spikes (payroll days, month-end). The baseline uses time-of-day and day-of-week aware windows — comparing current traffic against the same window from previous weeks, not a flat average.
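A minimal sketch of the seasonality-aware baseline: bucket history by (day-of-week, hour) so that Monday 09:00 traffic is only compared against previous Mondays at 09:00. The bucketing granularity and the 3-sigma rule are illustrative choices.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, stdev

def window_key(ts: datetime) -> tuple[int, int]:
    """Bucket by (day-of-week, hour) rather than a flat all-time average."""
    return (ts.weekday(), ts.hour)

def build_baseline(history: list[tuple[datetime, float]]) -> dict:
    """history: (timestamp, requests-per-minute) samples from prior weeks."""
    buckets: dict[tuple[int, int], list[float]] = defaultdict(list)
    for ts, value in history:
        buckets[window_key(ts)].append(value)
    # Per-bucket mean and standard deviation form the "normal" profile.
    return {k: (mean(v), stdev(v) if len(v) > 1 else 0.0)
            for k, v in buckets.items()}

def is_anomalous(baseline: dict, ts: datetime, value: float,
                 sigma: float = 3.0) -> bool:
    """Compare the current value against the matching weekly window."""
    mu, sd = baseline.get(window_key(ts), (value, 0.0))
    return sd > 0 and abs(value - mu) > sigma * sd
```

In production the same idea would run over rolling multi-week windows per endpoint; the sketch keeps a single metric for clarity.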
Stage 2 — Anomaly Detection
Statistical methods applied per metric:
Z-Score: z = (current − mean) / std_deviation. Values beyond ±3σ are flagged as anomalies. Best for normally distributed metrics such as request rates and error counts.
Percentile/IQR: Flag values below Q1 − 1.5×IQR or above Q3 + 1.5×IQR. Better for skewed distributions such as latency data.
Median Absolute Deviation (MAD): MAD = median(|Xi − median(X)|). Robust against outliers; preferred for heavy-tailed latency distributions.
For the demo, hardcoded thresholds work fine: >50 errors/minute, >2s P99 latency. The statistical baseline is the production approach; the demo proves the concept without the math.
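The three detectors above fit in a few standard-library lines. This is a sketch: the threshold constants are the conventional defaults, not tuned values, and 0.6745 is the usual factor that scales MAD to be comparable with a standard deviation.

```python
from statistics import mean, stdev, median

def zscore_anomalies(values: list[float], threshold: float = 3.0) -> list[float]:
    """Flag points beyond ±3σ; suits roughly normal metrics (rates, counts)."""
    mu, sd = mean(values), stdev(values)
    return [x for x in values if sd > 0 and abs(x - mu) / sd > threshold]

def iqr_anomalies(values: list[float], k: float = 1.5) -> list[float]:
    """Flag points outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]; suits skewed data."""
    xs = sorted(values)
    q1, q3 = xs[len(xs) // 4], xs[(3 * len(xs)) // 4]
    iqr = q3 - q1
    return [x for x in values if x < q1 - k * iqr or x > q3 + k * iqr]

def mad_anomalies(values: list[float], threshold: float = 3.5) -> list[float]:
    """Median Absolute Deviation; robust for heavy-tailed latency data."""
    med = median(values)
    mad = median(abs(x - med) for x in values)
    if mad == 0:
        return []
    return [x for x in values if 0.6745 * abs(x - med) / mad > threshold]
```

Note that a single extreme outlier inflates the standard deviation itself, which is exactly why MAD is preferred for latency: the z-score detector can miss an outlier that MAD catches.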
Stage 3 — Signal Correlation
Multiple anomalies occurring in the same time window are correlated into a single incident:
Temporal correlation: Anomalies within a 5-minute sliding window are grouped.
Causal correlation: Dependency graphs identify root causes. If the auth service degrades first and downstream errors follow, the auth service is the root cause.
Entity correlation: Same source IP range, same target endpoint, same geographic region.
Distinguishing operational vs security incidents:
| Signal | Operational (Degradation) | Security (Attack) |
|---|---|---|
| Error type | 5xx dominant | 4xx dominant (401, 403, 429) |
| Source distribution | Normal user mix | Concentrated from unusual IPs |
| Endpoint pattern | All endpoints proportionally | Heavy on auth/data endpoints |
| Request velocity | May slow down | Dramatically increased |
| User-Agent | Normal browser/app mix | Headless browsers, missing agents |
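A minimal sketch of temporal grouping plus the error-type row of the table above. The grouping here is gap-based (a new incident starts when the next anomaly is more than five minutes after the previous one), which is one simple way to realize a sliding window; the `ts` and `status_class` keys are hypothetical field names.

```python
from collections import Counter

def correlate(anomalies: list[dict], window_s: int = 300) -> list[list[dict]]:
    """Group anomalies separated by no more than a 5-minute gap."""
    incidents: list[list[dict]] = []
    current: list[dict] = []
    for a in sorted(anomalies, key=lambda a: a["ts"]):
        if current and a["ts"] - current[-1]["ts"] > window_s:
            incidents.append(current)
            current = []
        current.append(a)
    if current:
        incidents.append(current)
    return incidents

def classify(incident: list[dict]) -> str:
    """Crude heuristic from the table: 4xx-dominant suggests an attack,
    5xx-dominant suggests operational degradation."""
    codes = Counter(a.get("status_class") for a in incident)
    return "security" if codes["4xx"] > codes["5xx"] else "operational"
```

A production version would also apply the other table rows (source distribution, endpoint pattern, velocity, User-Agent mix) as weighted signals rather than a single dominant-error test.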
Stage 4 — LLM Narrative Generation
The LLM receives the correlated signals and generates an incident narrative:
“Auth failures on /api/v1/transfers spiked 400% above baseline (12/min baseline, 48/min when first flagged, currently 214/min) while latency on /api/v1/accounts degraded from P99 120ms to P99 2,400ms over 30 minutes. Source IPs: 847 unique addresses, predominantly from residential proxies in regions with no customer base. User-Agent distribution anomalous: 73% missing or non-standard. Pattern consistent with credential stuffing campaign (MITRE ATT&CK T1110.004) with possible secondary effect of service degradation from request volume. Severity: SEV-2. Recommended actions: (1) escalate to SOC, (2) enable enhanced rate limiting on /transfers, (3) trigger additional MFA challenges for affected accounts, (4) preserve logs for DORA incident assessment.”
Alerting and Escalation
Severity Matrix
| Level | Definition | Response Time | Notification |
|---|---|---|---|
| SEV-1 / Critical | Complete API outage, active data exfiltration, payment processing failure | Acknowledge < 5 min, respond < 15 min | SOC + CISO + CTO immediately |
| SEV-2 / High | Auth service degraded, confirmed attack campaign, single payment channel down | Acknowledge < 15 min, respond < 30 min | SOC + CISO < 1 hour |
| SEV-3 / Medium | Elevated latency on non-critical APIs, suspicious but unconfirmed activity | Acknowledge < 1 hour, respond < 4 hours | SOC analyst, daily CISO report |
| SEV-4 / Low | Intermittent slowness, isolated failed requests | Acknowledge < 4 hours | Queued, weekly report |
If an incident is classified as major under DORA Article 18, the 4-hour reporting clock starts. The agent can pre-populate the initial notification template, but human review and approval before submission is mandatory.
Human-in-the-Loop
Always require human approval:
- Classifying an incident as “major” (triggers DORA reporting obligations)
- Submitting regulatory notifications
- Customer-facing communications about incidents
- Blocking legitimate customer accounts (false positive risk)
- Changes to production security controls
Can be automated with guardrails:
- Initial detection and alert generation
- Alert enrichment and context gathering
- Rate limiting escalation
- IP blocking of clearly malicious sources — must be logged, reviewed within a defined window, and easily reversible
- Triggering additional MFA challenges
Every automated action must be logged with full context: what triggered it, when, and why. Low-confidence detections should alert rather than block, requiring human confirmation.
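The guardrail logic above can be sketched as a single decision function: below the confidence threshold the agent alerts instead of acting, and every decision lands in an append-only audit log with its trigger and rationale. The threshold value and log shape are illustrative assumptions.

```python
import json
from datetime import datetime, timezone

AUDIT_LOG: list[str] = []  # stand-in for an append-only audit store

def take_action(action: str, target: str, trigger: dict,
                confidence: float, threshold: float = 0.9) -> str:
    """Low-confidence detections alert rather than block;
    every decision is logged with full context."""
    decision = action if confidence >= threshold else "alert_only"
    AUDIT_LOG.append(json.dumps({
        "at": datetime.now(timezone.utc).isoformat(),
        "decision": decision,
        "target": target,
        "trigger": trigger,        # what fired the detection
        "confidence": confidence,  # why this decision was taken
        "reversible": True,        # auto-actions must be easy to undo
    }))
    return decision
```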
Monitoring Standards
Banking API SLAs
| Metric | Standard |
|---|---|
| Availability | 99.95% - 99.99% (4.4 hours to 52.6 minutes downtime/year) |
| P95 Latency | < 500ms for account queries |
| P99 Latency | < 2s for transaction processing |
| Error Rate | < 0.1% for 5xx |
| Transaction Success Rate | > 99.9% |
Log Retention
| Regulation | Retention |
|---|---|
| PCI DSS | 1 year minimum, 3 months immediately accessible |
| DORA | 5 years for incident records |
| BAIT | “Reasonable amount of time” — typically 3-5 years |
| Industry practice | Hot logs 90 days, warm 1 year, cold 5-7 years |
Testing and Validation
Injection Testing
Inject known anomalous patterns into the telemetry stream:
- Synthetic latency spikes (step functions, gradual ramps)
- Simulated credential stuffing (high 401 rates from distributed IPs)
- Error rate spikes (sudden 5xx increases)
- Volume anomalies (5x-10x baseline traffic)
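One of the injection patterns above, sketched as a generator: a per-minute 401-count series with normal noise in the first half and a ramp to attack volume in the second. The rates echo the demo scenario; the ramp shape is an arbitrary choice.

```python
import random

def inject_credential_stuffing(minutes: int, base_rate: int = 12,
                               attack_rate: int = 200, seed: int = 7) -> list[int]:
    """Emit per-minute 401 counts: baseline noise, then a stuffing ramp."""
    rng = random.Random(seed)  # seeded for reproducible test runs
    series = [base_rate + rng.randint(-3, 3) for _ in range(minutes)]
    start = minutes // 2
    for i in range(start, minutes):
        ramp = (i - start + 1) / (minutes - start)  # linear ramp 0 -> 1
        series[i] = int(base_rate + ramp * (attack_rate - base_rate))
    return series
```

Feeding this series into the detector should produce one correlated incident; a detector that fires on the baseline noise fails the precision target.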
Chaos Engineering for APIs
- Inject latency into specific endpoints
- Simulate partial service failures
- Network partitions between microservices
- Database connection pool exhaustion
- Downstream third-party API failures
- Certificate expiration simulation
Detection Accuracy Targets
| Metric | Target |
|---|---|
| Precision | > 80% (minimize alert fatigue) |
| Recall | > 95% for security-critical |
| F1 Score | > 85% |
| Mean Time to Detect (MTTD) | < 2 minutes for SEV-1 |
| False positive rate for auto-blocking | < 1% |
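Scoring against the targets above is straightforward when injection tests carry labels: compare the set of anomaly IDs the detector flagged against the set actually injected. A minimal sketch:

```python
def detection_metrics(predicted: set[str], actual: set[str]) -> dict:
    """Precision/recall/F1 over injected-anomaly IDs vs. detector output."""
    tp = len(predicted & actual)  # true positives
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(actual) if actual else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}
```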
Running Under the MCP Orchestrator
MCP Tools:
- `api_anomaly_check_health` — current API health snapshot (latency percentiles, error rates, auth stats)
- `api_anomaly_detect` — runs anomaly detection against the current telemetry window
- `api_anomaly_correlate` — correlates multiple anomalies into an incident narrative
- `api_anomaly_incident_report` — generates a DORA-formatted incident report draft
System Prompt Context: API endpoint catalog with expected behavior profiles, OWASP API Security Top 10 definitions, MITRE ATT&CK technique mappings, DORA classification criteria, institution-specific SLAs and escalation procedures.
Trigger Conditions:
- Continuous: streaming telemetry ingestion with sliding-window analysis
- Threshold-based: immediate alert on metric breach
- Scheduled: hourly baseline recalculation
Demo Flow
Scenario 1: Credential Stuffing
Normal API traffic scrolling. Auth failures on /api/v1/transfers start climbing — 12/min, 50/min, 200/min. The system detects the spike, identifies 847 unique source IPs, notes the headless browser User-Agent strings, correlates with latency degradation on /api/v1/accounts. LLM generates the incident narrative. Severity: SEV-2. Dashboard lights up.
Scenario 2: Unusual Endpoint Pattern
A service account that normally only calls /api/v1/accounts begins hitting /api/v1/admin/users repeatedly. The system detects the behavioral deviation, flags it as potential privilege escalation or compromised service credentials. MITRE ATT&CK mapping: T1078 (Valid Accounts).
Scenario 3: Latency Degradation
P99 latency on /api/v1/accounts climbs from 120ms to 2,400ms over 30 minutes. No error spike, no auth anomaly. The system identifies this as operational, not security. LLM narrative: “Gradual latency degradation consistent with infrastructure issue (possible database connection exhaustion, memory leak, or resource contention). No security indicators present. Severity: SEV-3.”
Three scenarios, three different root causes, three different narratives. The agent does not just flag numbers — it tells you what the numbers mean.
Beyond Detection: Execute the Response
Currently, the agent detects anomalies and generates incident narratives. The next step: a one-click “Execute” button that pushes mitigation actions directly to infrastructure — enhanced rate limiting on the targeted endpoint, additional MFA challenges for affected accounts, IP blocking of confirmed malicious sources — with auto-revert timers and rollback capability.
For DORA reporting: the agent does not just draft the incident report — it pre-populates the structured submission template (all 101 data points for major incidents) and stages it for CISO approval before transmission.
The consulting differentiator: This agent does not just detect anomalies — it maps them to MITRE ATT&CK techniques, classifies them against DORA Article 18 quantitative thresholds, and knows the difference between an operational degradation (SEV-3, 4-hour response) and a major ICT incident (4-hour BaFin notification clock starts). It integrates with the institution’s API gateway, WAF, and identity provider. Generic monitoring tools flag metrics. This agent speaks DORA.