By James Aspinwall, co-written by Alfred Pennyworth (my trusted AI) — March 7, 2026, 17:55
The Problem: You Don’t Know What You Don’t Know
Most AI observability tools answer questions you already thought to ask. “How many tool calls failed?” “What’s the average latency?” “Did the agent hallucinate?” These are important, but they assume you know where to look.
Distributional asks a different question: What behavioral patterns exist in your agent’s production data that you haven’t discovered yet?
WorkingAgents orchestrates 50+ MCP tools across CRM, task management, content, and communications. Every day, agents make thousands of decisions — which tool to call, what parameters to pass, how to synthesize results. Somewhere in that data are patterns that explain why some sessions succeed and others don’t, why certain users get better results than others, why performance drifts over time.
Distributional’s DBNL platform finds those patterns through unsupervised statistical analysis. It doesn’t require you to define what “good” looks like upfront. It discovers the behavioral fingerprint of your agents and surfaces deviations, clusters, and shifts you wouldn’t have thought to monitor.
This is a fundamentally different capability from what WorkingAgents has today — and from what most evaluation platforms offer.
What Distributional Brings
Distributional (DBNL) is an adaptive analytics platform for production AI agents. The company was founded in September 2023 by the team behind SigOpt (acquired by Intel in 2020) and is backed by $30M from Andreessen Horowitz, Two Sigma Ventures, and others. The platform is built on a core insight: AI behavior is probabilistic, not deterministic, and testing it requires statistical methods native to that reality.
The Distributional Fingerprint
Every AI application has what Distributional calls a “distributional fingerprint” — its unique baseline mixture of characteristic distributions across behavior dimensions. This fingerprint captures:
- How users interact with the system
- Which topics and intents appear, and in what proportions
- What tool sequences agents follow
- How quality, cost, and latency correlate with each other
- Where behavioral clusters form
When the fingerprint shifts — a new topic cluster emerges, a tool sequence that used to work starts failing, latency correlates with a specific user segment — DBNL surfaces it as an Insight.
The Adaptive Analytics Flywheel
DBNL operates through an eight-step cycle:
Ingest → Enrich → Analyze → Publish → Discover → Investigate → Track → Repeat
- Ingest: Production logs arrive via OpenTelemetry traces, SDK push, or SQL pull
- Enrich: Each log line is augmented with LLM-as-Judge evaluations, NLP metrics, topic classification, embeddings, and custom metrics — creating a rich behavioral vector per interaction
- Analyze: Unsupervised learning and statistical techniques discover patterns — temporal shifts, behavioral clusters, outliers
- Publish: Patterns appear as human-readable Insights and Dashboards
- Discover: Teams review automatically surfaced signals they didn’t know to look for
- Investigate: The Explorer tool enables population and temporal comparisons, drilling into the evidence behind each signal
- Track: Meaningful patterns become saved Segments and custom Metrics for ongoing monitoring
- Repeat: Tracked signals feed back into the enrichment and analysis, deepening future discovery
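The Enrich step is the one worth picturing concretely: each raw log line becomes a fixed set of behavioral features that the later analysis steps operate on. A minimal Python sketch of that idea, with hypothetical field names (DBNL's actual schema will differ):

```python
from dataclasses import dataclass

# Illustrative only: field names and the raw-log shape are assumptions,
# not DBNL's real schema.
@dataclass
class EnrichedInteraction:
    session_id: str
    tool_sequence: list[str]   # ordered tool calls in the interaction
    topic: str                 # e.g. output of a topic classifier
    judge_quality: float       # LLM-as-Judge score in [0, 1]
    latency_ms: float
    cost_usd: float

def enrich(raw: dict) -> EnrichedInteraction:
    """Turn one raw production log line into a behavioral vector."""
    return EnrichedInteraction(
        session_id=raw["session_id"],
        tool_sequence=[c["tool"] for c in raw.get("tool_calls", [])],
        topic=raw.get("topic", "unknown"),       # stand-in for classification
        judge_quality=raw.get("judge_quality", 0.0),
        latency_ms=raw["latency_ms"],
        cost_usd=raw.get("cost_usd", 0.0),
    )
```

Stack enough of these vectors and you have a distribution; the fingerprint is what that distribution looks like in aggregate.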
Three Types of Insights
- Temporal Insights — Behavior that shifts over time. “Tool X response quality dropped 15% this week compared to the 30-day baseline.”
- Segment Insights — Distinct behavioral clusters in the data. “Users who ask about CRM data in the morning get different tool sequences than afternoon users.”
- Outlier Insights — Significant deviations from the norm. “This specific tool chain produced an anomalous pattern that doesn’t match any known cluster.”
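Each insight type reduces to a statistical primitive. Outlier insights, for instance, are at heart a deviation test; a toy z-score version makes the mechanism concrete (DBNL's actual methods are richer and multi-dimensional):

```python
def outliers(values: list[float], z: float = 3.0) -> list[int]:
    """Indices of points more than z standard deviations from the mean.

    Toy single-dimension version for illustration only.
    """
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    std = var ** 0.5 or 1.0  # avoid division issues on constant data
    return [i for i, v in enumerate(values) if abs(v - mean) > z * std]
```

Temporal and segment insights are the same move applied across time windows and across clusters of interactions, respectively.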
Deployment Model
DBNL is free, open, and downloadable. It deploys in your Kubernetes cluster within your VPC. No data leaves your environment. Enterprise features include OpenID Connect SSO, role-based access, and workspace administration.
This matters. For an AI consulting firm deploying agents for clients, “your data stays in your infrastructure” eliminates the security objection before it’s raised.
What WorkingAgents Brings
WorkingAgents (“The Orchestrator”) is an Elixir OTP platform that gives AI agents real tools for business operations:
- 50+ MCP tools — CRM contacts/companies/pipeline, task management with 60+ query functions, content authoring, article summarization, alarm scheduling, system monitoring
- Multi-provider LLM — Claude, OpenRouter, Perplexity, switchable at runtime
- Permission-gated execution — capability-based access control on every tool call
- Google A2A protocol — agent-to-agent task delegation and skill discovery
- WhatsApp bridge — natural language tool invocation via messaging
- Per-user isolation — separate SQLite databases per domain, per user
WorkingAgents generates exactly the kind of rich, multi-dimensional production data that Distributional is designed to analyze — tool calls, user interactions, topic diversity, model switching, and real business outcomes.
Where the Synergy Lives
1. Unsupervised Discovery on Tool Usage Patterns
WorkingAgents has 50+ tools. Users interact with agents in natural language, and the agent decides which tools to call, in what order, with what parameters. Today, there’s no systematic way to understand these tool usage patterns at scale.
Distributional’s unsupervised analysis would discover patterns like:
- Tool sequence clusters — "80% of CRM-related sessions follow the pattern `nis_list_contacts` → `nis_get_contact` → `nis_log_interaction`. But 12% skip directly to `nis_log_interaction`, and those sessions have 40% lower user satisfaction."
- Unused tool discovery — "The `nis_pipeline` tool exists but is called in only 3% of sales-related sessions. Sessions that do use it have 2x higher tool completeness scores."
- Parameter pattern analysis — "When users ask about 'overdue tasks,' the agent calls `task_query` with `name: 'overdue'` 70% of the time but `task_dashboard` 30% of the time. The dashboard path produces higher-quality responses."
These are the patterns you wouldn’t think to monitor because you didn’t know they existed. Traditional observability counts tool calls. Distributional discovers the behavioral relationships between them.
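A first pass at this kind of sequence discovery is plain n-gram counting over session logs, before any clustering is involved. An illustrative sketch (the session data is invented):

```python
from collections import Counter

def sequence_patterns(sessions: list[list[str]], n: int = 2) -> Counter:
    """Count n-grams of consecutive tool calls across sessions."""
    counts = Counter()
    for seq in sessions:
        for i in range(len(seq) - n + 1):
            counts[tuple(seq[i:i + n])] += 1
    return counts

# Invented logs: two "full" CRM sessions and one "skip" session.
sessions = [
    ["nis_list_contacts", "nis_get_contact", "nis_log_interaction"],
    ["nis_list_contacts", "nis_get_contact", "nis_log_interaction"],
    ["nis_log_interaction"],
]
counts = sequence_patterns(sessions)
```

Join these counts against an outcome signal like a satisfaction score and the "12% skip the lookup step" pattern falls out of the data; the unsupervised part is finding which joins are worth making.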
2. User Behavior Segmentation
WorkingAgents serves different users with different needs. James manages CRM contacts. Jimmy asks about tasks and deadlines. Other consulting clients will have their own patterns. Distributional’s segment discovery would reveal:
- User behavioral profiles — Natural clusters of how different users interact with agents, without predefined user categories
- Intent distribution shifts — When a user’s query patterns change (maybe they started using CRM tools more and task tools less), DBNL surfaces the shift as a temporal insight
- Cross-user patterns — “Users who set alarms via WhatsApp have 30% more follow-through on tasks than users who create tasks via the web interface”
This segmentation feeds directly into product decisions. If WhatsApp-originated tasks have higher completion rates, that’s a signal to invest more in the WhatsApp bridge experience.
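To make the idea tangible, here is a deliberately crude stand-in: reduce each user to a distribution over tool domains and label them by the dominant one. DBNL's segment discovery finds such groups without a predefined taxonomy; the prefix-to-domain mapping below is an assumption for illustration:

```python
from collections import Counter

# Hypothetical prefix-to-domain mapping; WorkingAgents' real taxonomy may differ.
DOMAINS = {"nis_": "crm", "task_": "tasks", "alarm_": "scheduling"}

def domain_profile(calls: list[str]) -> dict[str, float]:
    """Fraction of a user's tool calls falling in each domain."""
    counts = Counter()
    for tool in calls:
        for prefix, domain in DOMAINS.items():
            if tool.startswith(prefix):
                counts[domain] += 1
                break
    total = sum(counts.values()) or 1
    return {d: c / total for d, c in counts.items()}

def dominant_segment(calls: list[str]) -> str:
    profile = domain_profile(calls)
    return max(profile, key=profile.get) if profile else "unclassified"
```

The interesting segments are precisely the ones a hand-written mapping like this would miss, which is the argument for letting unsupervised clustering draw the boundaries.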
3. Multi-Provider Model Comparison — Beyond Scores
Other evaluation platforms compare models with predefined metrics: accuracy, latency, cost. Distributional adds a dimension they can’t: behavioral fingerprint comparison.
WorkingAgents users can switch between Claude, OpenRouter models, and Perplexity at runtime. Distributional wouldn’t just score each provider — it would discover how the behavioral distribution changes:
- “Claude sessions produce 4 distinct topic clusters. GPT-4o sessions produce 6 — the extra two clusters correspond to edge-case queries where GPT-4o attempts more complex tool chains.”
- “Perplexity sessions show lower latency but a temporal drift in tool selection accuracy over multi-turn conversations — performance degrades after turn 5.”
- "OpenRouter Llama sessions cluster differently from proprietary models on CRM queries — they under-use `nis_search` and over-rely on `nis_list_contacts` with broad filters."
This is richer than “Model A scored 4.2, Model B scored 3.8.” It reveals how models behave differently, not just how well.
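Comparing fingerprints rather than scores can start with something as simple as a distance between each provider's tool-usage distributions. A sketch using total variation distance (the per-provider call logs are invented):

```python
from collections import Counter

def tool_distribution(calls: list[str]) -> dict[str, float]:
    """Normalize a call log into a tool-usage distribution."""
    counts = Counter(calls)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}

def total_variation(p: dict, q: dict) -> float:
    """Total variation distance: 0.0 = identical usage, 1.0 = disjoint."""
    tools = set(p) | set(q)
    return 0.5 * sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in tools)

# Invented per-provider call logs for illustration.
claude = tool_distribution(
    ["nis_search", "nis_search", "nis_list_contacts", "nis_get_contact"])
llama = tool_distribution(
    ["nis_list_contacts"] * 3 + ["nis_get_contact"])
```

A single number like this flags that providers behave differently; the fingerprint view then shows where: which tools, which clusters, which turns.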
4. The AI Data Flywheel for Consulting Clients
Distributional explicitly positions their platform around the “Analytics-Driven AI Data Flywheel” — using discovered signals and surfaced examples for post-training optimization. For WorkingAgents’ consulting business, this creates a concrete service offering:
Month 1: Deploy — Install WorkingAgents with custom tools for the client’s domain. Connect DBNL to ingest traces.
Month 2: Discover — DBNL surfaces behavioral patterns. “Your agents handle inventory queries well but struggle with multi-step procurement workflows. Here are the 47 example traces showing the failure pattern.”
Month 3: Optimize — Use the surfaced examples for prompt engineering, tool redesign, or model switching. DBNL’s tracked segments measure whether the changes worked.
Month 4+: Flywheel — Each optimization cycle surfaces new patterns in the changed behavior. The agent gets measurably better every month, with evidence.
This is recurring revenue built on data, not opinion. The consulting engagement doesn’t end after deployment — it becomes a continuous optimization service powered by Distributional’s discovery engine.
5. Permission and Access Pattern Analytics
WorkingAgents’ capability-based access control system creates a rich dataset: which users have which permissions, which tools they actually call, and how their usage patterns differ from their permission scope.
Distributional could surface insights like:
- "Users with `task_manager` permission but not `nis` permission ask CRM-related questions 15% of the time — hitting permission denials. Consider expanding their access or improving the agent's handling of out-of-scope requests."
- "Temporary access keys (TTL-based) show a different behavioral distribution than permanent keys — users with temp keys complete tasks 25% faster, possibly due to urgency."
- "A new behavioral segment emerged this week: users who chain `nis_create_contact` → `task_create` → `task_link`. This workflow isn't documented but appears intentional and productive."
6. Temporal Drift Detection on Agent Behavior
AI agents aren’t static. Model updates, prompt changes, data drift, and user behavior evolution all cause the behavioral fingerprint to shift. WorkingAgents currently has no way to detect these shifts.
Distributional’s temporal insights would catch:
- Model update drift — When Anthropic updates Claude, does the agent’s tool selection distribution change? Do certain tool chains break?
- Prompt engineering impact — After modifying system prompts, did the behavioral fingerprint change in the intended direction? Or did it also shift in unexpected dimensions?
- Seasonal patterns — Do end-of-month CRM queries spike? Do task creation patterns follow weekly cycles?
- User adaptation — As users learn the system, does their interaction pattern evolve? Do they discover more efficient tool chains over time?
These temporal signals are invisible to snapshot-based evaluation tools. They only emerge from continuous distributional analysis.
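One standard way to quantify this kind of shift is the Population Stability Index, computed over, say, the current week's tool-selection distribution versus a 30-day baseline. A minimal sketch (the thresholds in the docstring are the conventional rule of thumb, not DBNL's):

```python
import math

def psi(baseline: dict[str, float], current: dict[str, float],
        eps: float = 1e-6) -> float:
    """Population Stability Index between two categorical distributions.

    Rule of thumb: < 0.1 stable, 0.1-0.25 moderate shift, > 0.25 significant.
    """
    cats = set(baseline) | set(current)
    score = 0.0
    for c in cats:
        b = baseline.get(c, 0.0) + eps  # smooth so log stays defined
        a = current.get(c, 0.0) + eps
        score += (a - b) * math.log(a / b)
    return score
```

Run nightly against a rolling baseline, a check like this turns "the agent feels different since the model update" into a number with a date attached.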
Why Distributional Is Different from Arize or Deepchecks
Distributional occupies a distinct position in the AI analytics landscape:
| Dimension | Arize AI | Deepchecks | Distributional |
|---|---|---|---|
| Core approach | Trace observability | Evaluation scoring | Behavioral discovery |
| Primary question | “What happened?” | “Was it good?” | “What patterns exist?” |
| Method | OpenTelemetry spans | LLM-as-Judge swarm | Unsupervised statistical analysis |
| Requires predefined metrics | Partially | Yes | No — discovers metrics |
| Deployment | Cloud SaaS or self-hosted | Cloud SaaS or on-prem | Free, open, in your VPC |
| Best for | Debugging specific failures | Scoring agent quality | Finding unknown unknowns |
For WorkingAgents, these three platforms are complementary layers:
- Arize traces what happened (the execution path)
- Deepchecks evaluates whether it was done well (the quality score)
- Distributional discovers what you should be paying attention to (the behavioral signal)
Distributional fills the gap between “we monitor our agents” and “we understand our agents.”
The Gap Analysis
| WorkingAgents Gap | Distributional Solution |
|---|---|
| No behavioral pattern discovery | Unsupervised learning surfaces unknown clusters, shifts, and outliers |
| No tool usage correlation analysis | Distributional fingerprint captures tool-sequence-to-outcome correlations |
| No user segmentation analytics | Segment Insights automatically cluster user behavior profiles |
| No temporal drift detection | Temporal Insights surface behavioral shifts over time |
| No data flywheel for continuous improvement | Adaptive Analytics Flywheel with surfaced examples for optimization |
| No multi-dimensional model comparison | Behavioral fingerprint comparison across providers |
| Distributional Gap | WorkingAgents Solution |
|---|---|
| Need production agent data sources | 50+ MCP tool traces with rich business context |
| Need diverse tool-calling patterns | CRM + tasks + content + communication tool chains |
| Need multi-provider comparison scenarios | Runtime-switchable Claude/OpenRouter/Perplexity |
| Need consulting distribution channel | AI consulting firm deploying for medium-size companies |
| Need non-Python ecosystem references | Elixir OTP — unique agent orchestration stack |
| Need real business outcome data | CRM pipeline, task completion, follow-up tracking |
Partnership Models
Technology Integration
The natural starting point. WorkingAgents emits OpenTelemetry traces from its MCP dispatcher. DBNL ingests, enriches, and analyzes.
- DBNL deploys in WorkingAgents’ infrastructure — free, open, in the same VPC. Zero data leaves.
- Default metrics (answer relevancy, user frustration, topic classification) run automatically on every interaction.
- Custom metrics specific to WorkingAgents’ domain — CRM data accuracy, task completion rates, tool chain efficiency.
WorkingAgents gains: Behavioral discovery and continuous improvement analytics without building ML infrastructure. Distributional gains: A production MCP reference customer on Elixir/OTP with rich multi-tool, multi-provider agent data.
Consulting Partnership
Distributional’s team comes from SigOpt, Bloomberg, Google, Meta, Stripe, and Uber. They understand enterprise AI deployment. WorkingAgents’ consulting firm deploys agents for medium-size companies. The partnership creates a joint offering:
- WorkingAgents consulting: Deploys the agent orchestration layer
- DBNL: Provides the analytics layer that proves agents are working and improving
- Joint deliverable: Monthly behavioral intelligence reports showing discovered patterns, optimization recommendations, and measured improvements
For Distributional, this is channel distribution through consulting engagements. For WorkingAgents, this is a continuous-improvement service tier that generates recurring revenue.
Co-Marketing: “The Open AI Agent Analytics Stack”
Both companies share a deployment philosophy: open, self-hosted, data stays in your environment. A joint positioning as “the open stack for production AI agents” — orchestration (WorkingAgents) plus analytics (DBNL) — differentiates from cloud-locked alternatives.
Distributional’s $30M in funding from a16z and Two Sigma Ventures gives them marketing reach. A case study showing DBNL discovering behavioral patterns in a production MCP agent platform would be distinctive content for both companies.
Recommended Next Steps
- Deploy DBNL sandbox — Distributional offers a free sandbox at docs.dbnl.com. Connect WorkingAgents' MCP dispatcher traces. See what the platform discovers from even a week of production data.
- Instrument the MCP dispatcher — Add OpenTelemetry span emission to `MyMCPServer.Manager`. Include tool name, parameters, user ID, provider, and session ID as span attributes. This is the minimum data DBNL needs.
- Run the flywheel once — Ingest a month of traces. Let DBNL's unsupervised analysis run. Review the Insights. Pick one discovered pattern and optimize for it. Measure the result. This single cycle demonstrates the value proposition to consulting clients.
- Contact Distributional — They're a Series A company actively expanding. The SigOpt team built their reputation on optimization for enterprise AI. An MCP-native agent orchestration reference on a non-Python stack would be a differentiated story for their portfolio.
- Design the consulting package — "Managed AI Agent Operations with Behavioral Analytics" — deploy agents, connect DBNL, deliver monthly intelligence reports, continuously optimize. This is the recurring revenue model.
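Whatever language ends up emitting the spans (the real emitter would live in `MyMCPServer.Manager`, in Elixir), the attribute set is the same. Sketched here in Python; the key names are illustrative choices, not an established OpenTelemetry semantic convention:

```python
def span_attributes(tool: str, params: dict, user_id: str,
                    provider: str, session_id: str) -> dict:
    """Minimum per-tool-call span attribute set for DBNL ingestion.

    Key names are illustrative; pick one convention and keep it stable,
    because downstream enrichment keys on these names.
    """
    return {
        "mcp.tool.name": tool,
        "mcp.tool.params": str(params),  # OTel attribute values must be scalars
        "user.id": user_id,
        "llm.provider": provider,
        "session.id": session_id,
    }
```

Five attributes per call is enough for DBNL to reconstruct tool sequences, user segments, and provider comparisons; everything else can be layered on later.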
Conclusion
Distributional solves a problem that most AI teams don’t even know they have: the unknown unknowns in agent behavior. WorkingAgents builds the agents. Distributional discovers what those agents are actually doing in production — the behavioral patterns, correlations, clusters, and drifts that no amount of manual log reading or predefined metrics will surface.
The combination is particularly powerful for consulting. Walk into a client meeting and say: “We deploy AI agents, and we use statistical behavioral analysis to discover patterns in how they operate. Last month we found that your procurement agent was using an inefficient tool chain on 23% of requests. We optimized it. Here’s the before-and-after distributional fingerprint.”
That’s not a pitch. That’s evidence.
DBNL is free, open, and deploys in your infrastructure. The integration is OpenTelemetry — protocol-level, language-agnostic. The team is ex-SigOpt, Google, Meta, Bloomberg. The funding is a16z and Two Sigma. And they’re still early enough that a partnership conversation gets real attention.
The flywheel starts with one trace. Time to emit it.