The LLM Proxy article and its implementation plan focus on proxying WorkingAgents’ own chat feature – the ServerChat GenServer that talks to LLM providers. But most enterprises won’t use WorkingAgents’ built-in chat for their agentic work. They’ll use Claude Code, Gemini CLI, OpenAI Codex, or custom agents built on LangChain, CrewAI, or the Anthropic Agent SDK.
These agents are powerful, autonomous, and intelligent. They’re also opaque. When Claude Code calls a WorkingAgents MCP tool, what did it send? When it gets a result back, what does it do with that data? Could a tool result contain injected instructions that cause the agent to exfiltrate data through a subsequent tool call?
WorkingAgents already gates tool access through capability-based permissions. But permissions answer “can this agent call this tool?” They don’t answer “is what the agent is doing with the tool safe?” That’s the proxy’s job.
The Threat Model for External Agents
External agents like Claude Code control their own LLM calls. WorkingAgents has no visibility into:
- The agent’s system prompt – what instructions is it operating under?
- The agent’s conversation history – what context has accumulated?
- What the agent does with tool results – does it send sensitive data to the LLM provider?
- Whether tool arguments contain injection payloads – is the agent (or its user) trying to manipulate tool behavior?
The attack surface is the MCP boundary – the point where tool calls come in and tool results go out. This is where WorkingAgents can inspect, filter, audit, and gate.
Specific Attack Vectors
1. Tool argument injection. An agent sends arguments to a tool that contain SQL injection, path traversal, or command injection. The tool executes the payload. Example: knowledge_search(query: "'; DROP TABLE knowledge_docs; --").
2. Data exfiltration via tool chaining. An agent reads sensitive data through one tool (knowledge_get), then writes it somewhere external through another tool (agentmail_send_message or fetch_url). Each individual call is authorized. The sequence is the attack.
3. Tool result poisoning upstream. A tool returns data from an external source (a web page, an API). That data contains hidden instructions. The external agent’s LLM processes those instructions and takes actions the user didn’t intend. WorkingAgents can’t control what the agent does with the result, but it can scan and sanitize the result before returning it.
4. Prompt extraction via tool abuse. An agent uses knowledge_search or blog_search to probe for system prompts, API keys, or internal configuration that may have been inadvertently stored in the knowledge base or blog content.
5. Budget exhaustion. An agent stuck in a tool-call loop fires hundreds of expensive tool calls per minute. Each call is individually authorized. The aggregate cost is catastrophic.
The Current Architecture
External agents connect to WorkingAgents through two MCP transports:
Claude Code / Gemini CLI / OpenAI Codex
-> MCP Transport (SSE at /sse or Streamable HTTP at /mcp)
-> handle_jsonrpc/2
-> dispatch_method("tools/call", params, user)
-> MyMCPServer.Manager.call_tool(name, args, user_id)
-> MyMCPServer.handle_tool_call(name, args, state)
-> Permissions.*.function(permissions, ...) [guard check]
-> Business logic (Knowledge, BlogStore, Nis, etc.)
-> Tool result returned to agent
Permission checks happen at the Permissions.* layer via compiled guard clauses. The user’s capability keys gate whether the tool call proceeds. This is solid access control.
What’s missing: inspection of the arguments going in and the results coming out.
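For readers unfamiliar with the guard-clause pattern, the Permissions.* layer looks roughly like this. This is a hedged sketch: the module name, capability key, and helper are illustrative, not the actual WorkingAgents code.

```elixir
defmodule Permissions.Knowledge do
  # Hypothetical sketch of a compiled guard-clause permission check.
  # A call without the capability key falls through to the rejection clause.
  def search(%{can_knowledge_search: true} = _permissions, query) do
    {:ok, do_search(query)}
  end

  def search(_permissions, _query), do: {:error, :permission_denied}

  # Stand-in for the real business logic
  defp do_search(query), do: [%{doc: "…", matched: query}]
end
```

The point of the sketch: the guard answers only "does this user hold the key?" Nothing in it looks at `query` itself, which is exactly the gap the proxy fills.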
Where the Proxy Inserts
The proxy inserts at the MCP transport boundary – between dispatch_method and call_tool. This is the single point where every external agent tool call passes through, regardless of which transport (SSE or Streamable HTTP) or which agent (Claude Code, Codex, etc.) is calling.
The Proxied Flow
External Agent
-> MCP Transport
-> handle_jsonrpc/2
-> dispatch_method("tools/call", params, user)
-> MCPProxy.call_tool(name, args, user) # <-- NEW
-> PreFlight pipeline
-> ArgumentGuard.scan(name, args) # injection patterns
-> PiiGuard.scan_args(args) # PII in arguments
-> RateLimiter.check(user_id, name) # per-user rate limiting
-> SequenceDetector.check(user_id, name) # suspicious call patterns
-> MyMCPServer.Manager.call_tool(name, args, user_id)
-> PostFlight pipeline
-> ResultGuard.scan(name, result) # injection in results
-> PiiGuard.scan_result(result) # PII in results
-> ResultSandbox.wrap(name, result) # data boundary markers
-> MCPProxy.Audit.log(user_id, name, args, result)
-> Tool result returned to agent
Implementation
Step 1: MCPProxy Module
Create lib/mcp_proxy.ex:
defmodule MCPProxy do
  @moduledoc """
  Security proxy for MCP tool calls from external agents.
  Inspects arguments, gates execution, scans results, and logs everything.
  """

  require Logger

  def call_tool(name, args, user) do
    user_id = user.id

    with {:ok, args} <- MCPProxy.PreFlight.run(user_id, name, args),
         result <- MyMCPServer.Manager.call_tool(name, args, user_id),
         {:ok, result} <- MCPProxy.PostFlight.run(user_id, name, result) do
      MCPProxy.Audit.log_call(user_id, name, args, result, :ok)
      result
    else
      {:rejected, reason} ->
        Logger.warning("MCPProxy rejected #{name} for user #{user_id}: #{inspect(reason)}")
        MCPProxy.Audit.log_call(user_id, name, args, nil, {:rejected, reason})
        {:error, %{code: -32000, message: "Request rejected: #{inspect(reason)}"}}
    end
  end
end
Step 2: Wire Into the Router
The change is one line in the call_tool/3 helper in my_mcp_server_router.ex:
Before:
defp call_tool(name, args, user) do
  MyMCPServer.Manager.call_tool(name, args, user.id)
end
After:
defp call_tool(name, args, user) do
  MCPProxy.call_tool(name, args, user)
end
Every external agent tool call now goes through the proxy. SSE transport, Streamable HTTP transport, A2A protocol – all of them call call_tool/3 in the router. One change, all transports covered.
Step 3: PreFlight Pipeline
Create lib/mcp_proxy/pre_flight.ex:
defmodule MCPProxy.PreFlight do
  @checks [
    MCPProxy.ArgumentGuard,
    MCPProxy.PiiGuard,
    MCPProxy.RateLimiter,
    MCPProxy.SequenceDetector
  ]

  def run(user_id, tool_name, args) do
    Enum.reduce_while(@checks, {:ok, args}, fn check, {:ok, a} ->
      case check.scan(user_id, tool_name, a) do
        {:ok, a} -> {:cont, {:ok, a}}
        {:rejected, reason} -> {:halt, {:rejected, reason}}
      end
    end)
  end
end
Step 4: Argument Guard
Create lib/mcp_proxy/argument_guard.ex:
defmodule MCPProxy.ArgumentGuard do
  @moduledoc """
  Scans tool arguments for injection patterns, path traversal,
  and suspicious payloads.
  """

  @injection_patterns [
    ~r/;\s*(DROP|DELETE|INSERT|UPDATE|ALTER)\s/i,
    ~r/'\s*OR\s+'1'\s*=\s*'1/i,
    ~r/\.\.\//,
    ~r/<script/i,
    ~r/ignore\s+(all\s+)?previous\s+instructions/i,
    ~r/IMPORTANT:\s*(?:ignore|forget|disregard)/i
  ]

  # Tools that accept free text where injection-like strings are expected
  # (search queries, message content)
  @freetext_tools ~w(
    knowledge_search knowledge_search_text knowledge_add knowledge_import
    blog_search blog_search_text summary_request task_capture
    nis_search whatsapp_send agentmail_send_message
  )

  def scan(_user_id, tool_name, args) do
    # Skip injection scanning on tools that naturally accept free text
    if tool_name in @freetext_tools do
      {:ok, args}
    else
      texts = extract_strings(args)

      case Enum.find(texts, &has_injection?/1) do
        nil -> {:ok, args}
        text -> {:rejected, {:argument_injection, tool_name, String.slice(text, 0, 100)}}
      end
    end
  end

  defp extract_strings(args) when is_map(args) do
    Enum.flat_map(args, fn
      {_k, v} when is_binary(v) -> [v]
      {_k, v} when is_map(v) -> extract_strings(v)
      {_k, v} when is_list(v) -> Enum.filter(v, &is_binary/1)
      _ -> []
    end)
  end

  defp has_injection?(text) do
    Enum.any?(@injection_patterns, &Regex.match?(&1, text))
  end
end
Note the @freetext_tools list – tools like knowledge_search and whatsapp_send naturally accept free-text input that might match injection patterns. Scanning these would produce constant false positives. The guard only scans tools where structured arguments are expected.
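The PreFlight pipeline also lists MCPProxy.PiiGuard, which this plan doesn't show. A minimal sketch, assuming the same scan/3 contract as the other checks; the patterns and the reject-on-match policy are illustrative placeholders, not a production PII detector:

```elixir
defmodule MCPProxy.PiiGuard do
  @moduledoc """
  Sketch of the PII guard referenced in the PreFlight pipeline.
  A production version would use a dedicated PII detection library
  and per-tool policies (redact vs. reject).
  """

  # Illustrative patterns only: US SSN, possible card number, email
  @pii_patterns [
    {~r/\b\d{3}-\d{2}-\d{4}\b/, :ssn},
    {~r/\b\d{13,19}\b/, :possible_card_number},
    {~r/\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/, :email}
  ]

  def scan(_user_id, tool_name, args) do
    text = inspect(args, limit: :infinity)

    case Enum.find(@pii_patterns, fn {re, _label} -> Regex.match?(re, text) end) do
      nil -> {:ok, args}
      {_re, label} -> {:rejected, {:pii_detected, tool_name, label}}
    end
  end
end
```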
Step 5: Rate Limiter
Create lib/mcp_proxy/rate_limiter.ex:
defmodule MCPProxy.RateLimiter do
  @moduledoc """
  Per-user, per-tool rate limiting to prevent budget exhaustion
  from runaway agent loops.
  """

  @table :mcp_rate_limiter
  @window_ms 60_000
  # calls per minute per tool
  @default_limit 60

  @high_cost_tools %{
    "knowledge_search" => 20,
    "summary_request" => 5,
    "blog_import" => 10,
    "agentmail_send_message" => 10,
    "whatsapp_send" => 5
  }

  def setup do
    :ets.new(@table, [:named_table, :public, :set])
  end

  def scan(user_id, tool_name, args) do
    limit = Map.get(@high_cost_tools, tool_name, @default_limit)
    key = {user_id, tool_name}
    now = System.system_time(:millisecond)

    # Reuse the window if it is still open, otherwise start a fresh one
    {count, window_start} =
      case :ets.lookup(@table, key) do
        [{_, c, ws}] when now - ws < @window_ms -> {c, ws}
        _ -> {0, now}
      end

    if count >= limit do
      {:rejected, {:rate_limited, tool_name, limit, :per_minute}}
    else
      :ets.insert(@table, {key, count + 1, window_start})
      {:ok, args}
    end
  end
end
Expensive tools (embedding searches, email sending, WhatsApp) get tighter limits. An agent stuck in a loop calling knowledge_search 100 times per minute gets blocked after 20.
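Assuming the limiter above is loaded, a quick sketch of its behavior from a console session (user id and query are illustrative):

```elixir
# Create the ETS table once, e.g. at application start
MCPProxy.RateLimiter.setup()

# knowledge_search has a limit of 20 calls per minute, so the 21st
# call inside the window is rejected.
results =
  for _ <- 1..21 do
    MCPProxy.RateLimiter.scan("user-1", "knowledge_search", %{query: "elixir"})
  end

List.last(results)
# => {:rejected, {:rate_limited, "knowledge_search", 20, :per_minute}}
```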
Step 6: Sequence Detector
Create lib/mcp_proxy/sequence_detector.ex:
defmodule MCPProxy.SequenceDetector do
  @moduledoc """
  Detects suspicious tool call sequences that indicate
  data exfiltration or privilege escalation.

  Example: knowledge_get followed by agentmail_send_message
  suggests the agent is reading internal data and emailing it out.
  """

  require Logger

  @table :mcp_sequence_tracker
  # 5-minute window
  @window_ms 300_000

  # Suspicious sequences: {read_tool, write_tool}
  @suspicious_sequences [
    {"knowledge_get", "agentmail_send_message"},
    {"knowledge_get", "whatsapp_send"},
    {"knowledge_get", "fetch_url"},
    {"read_file", "agentmail_send_message"},
    {"read_file", "whatsapp_send"},
    {"read_file", "fetch_url"},
    {"nis_get_contact", "agentmail_send_message"},
    {"nis_get_contact", "whatsapp_send"},
    {"access_control_audit_log", "fetch_url"},
    {"access_control_user_permissions", "fetch_url"}
  ]

  def setup do
    :ets.new(@table, [:named_table, :public, :bag])
  end

  def scan(user_id, tool_name, args) do
    now = System.system_time(:millisecond)

    # Record this call
    :ets.insert(@table, {user_id, tool_name, now})

    # Keep only calls inside the window, preserving original timestamps
    recent =
      :ets.lookup(@table, user_id)
      |> Enum.filter(fn {_, _, ts} -> now - ts < @window_ms end)

    # Drop all entries for this user, then re-insert the in-window ones
    # with their original timestamps so the window doesn't slide forever
    :ets.delete(@table, user_id)
    :ets.insert(@table, recent)

    recent_tools = Enum.map(recent, fn {_, name, _} -> name end)

    # Check for suspicious sequences
    case find_suspicious(recent_tools, tool_name) do
      nil ->
        {:ok, args}

      {read_tool, write_tool} ->
        # Log a warning but don't block -- this is a heuristic
        Logger.warning(
          "MCPProxy: Suspicious sequence for user #{user_id}: " <>
            "#{read_tool} -> #{write_tool} within 5 minutes"
        )

        {:ok, args}
    end
  end

  defp find_suspicious(recent_tools, current_tool) do
    Enum.find(@suspicious_sequences, fn {read, write} ->
      write == current_tool and read in recent_tools
    end)
  end
end
This is deliberately non-blocking in v1. It logs warnings for suspicious read-then-exfiltrate sequences without rejecting the call. Once we have enough data to calibrate false positive rates, it can be made blocking for specific high-risk sequences.
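One way a v2 could escalate selectively: split the sequence list into blocking and warn-only tiers. A hedged sketch; the module name, the tier split, and the policy function are hypothetical, not part of the current plan:

```elixir
defmodule MCPProxy.SequencePolicy do
  @moduledoc """
  Hypothetical v2 policy: block a small calibrated set of high-risk
  sequences while the rest remain warn-only.
  """

  # Pairs chosen for illustration -- audit-log data flowing to fetch_url
  # has essentially no legitimate use case.
  @blocking_sequences [
    {"access_control_audit_log", "fetch_url"},
    {"access_control_user_permissions", "fetch_url"}
  ]

  # Returns :block for calibrated high-risk pairs, :warn for everything else
  def action(read_tool, write_tool) do
    if {read_tool, write_tool} in @blocking_sequences, do: :block, else: :warn
  end
end
```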
Step 7: PostFlight Pipeline
Create lib/mcp_proxy/post_flight.ex:
defmodule MCPProxy.PostFlight do
  @checks [
    MCPProxy.ResultGuard
  ]

  def run(user_id, tool_name, result) do
    Enum.reduce_while(@checks, {:ok, result}, fn check, {:ok, r} ->
      case check.process(user_id, tool_name, r) do
        {:ok, r} -> {:cont, {:ok, r}}
        {:rejected, reason} -> {:halt, {:rejected, reason}}
      end
    end)
  end
end
Step 8: Result Guard
Create lib/mcp_proxy/result_guard.ex:
defmodule MCPProxy.ResultGuard do
  @moduledoc """
  Scans tool results before returning them to the external agent.
  Detects injection patterns embedded in data from external sources.
  """

  require Logger

  @injection_patterns [
    ~r/ignore\s+(all\s+)?previous\s+instructions/i,
    ~r/IMPORTANT:\s*(?:ignore|forget|disregard)/i,
    ~r/you\s+are\s+now\s+(?:a|an)\s+/i,
    ~r/\[INST\]|\[\/INST\]|<<SYS>>|<\|im_start\|>/i
  ]

  # Tools that return external/untrusted data
  @scan_tools ~w(fetch_url knowledge_search knowledge_search_text blog_search blog_search_text)

  def process(_user_id, tool_name, {:ok, result}) when is_map(result) do
    if tool_name in @scan_tools do
      text = result |> inspect() |> String.slice(0, 50_000)

      if Enum.any?(@injection_patterns, &Regex.match?(&1, text)) do
        # Don't block -- annotate the result and warn
        Logger.warning("MCPProxy: Injection pattern detected in #{tool_name} result")
        {:ok, {:ok, Map.put(result, :_proxy_warning, "Potential injection pattern detected in result")}}
      else
        {:ok, {:ok, result}}
      end
    else
      {:ok, {:ok, result}}
    end
  end

  def process(_user_id, _tool_name, result), do: {:ok, result}
end
Step 9: Audit Module
Create lib/mcp_proxy/audit.ex:
defmodule MCPProxy.Audit do
  @moduledoc """
  Logs every MCP tool call from external agents with full arguments,
  results, and status for forensic analysis.
  """

  def log_call(user_id, tool_name, args, result, status) do
    Sqler.insert(:mcp_audit_db, "mcp_tool_audit", %{
      user_id: user_id,
      tool_name: tool_name,
      args_hash: hash(args),
      args_summary: summarize(args),
      result_summary: summarize(result),
      status: inspect(status),
      created_at: System.system_time(:millisecond)
    })
  end

  defp hash(data) do
    :crypto.hash(:sha256, :erlang.term_to_binary(data))
    |> Base.encode16(case: :lower)
  end

  defp summarize(data) do
    inspect(data, limit: 500, printable_limit: 2000)
    |> String.slice(0, 2000)
  end
end
This goes beyond the existing ToolAudit, which logs only tool name, user, and duration. The MCP proxy audit logs full argument and result summaries plus a SHA-256 hash for tamper detection. When something goes wrong, you can reconstruct exactly what the agent sent and what it received.
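The stored hash is what makes tamper detection possible: given a claimed set of arguments, recompute the hash and compare it with the audited args_hash column. A sketch; the module and function names are illustrative:

```elixir
defmodule MCPProxy.Audit.Verify do
  @moduledoc """
  Sketch of forensic hash verification against an audit row.
  """

  # Returns true if the claimed arguments produce the audited hash.
  # Uses the same hashing scheme as MCPProxy.Audit.log_call.
  def args_match?(stored_args_hash, claimed_args) do
    recomputed =
      :crypto.hash(:sha256, :erlang.term_to_binary(claimed_args))
      |> Base.encode16(case: :lower)

    recomputed == stored_args_hash
  end
end
```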
File Structure
lib/
  mcp_proxy.ex              # Main proxy module
  mcp_proxy/
    pre_flight.ex           # PreFlight pipeline
    post_flight.ex          # PostFlight pipeline
    argument_guard.ex       # Injection scanning on arguments
    pii_guard.ex            # PII detection in args and results
    rate_limiter.ex         # Per-user per-tool rate limiting
    sequence_detector.ex    # Suspicious call pattern detection
    result_guard.ex         # Injection scanning on results
    audit.ex                # Full argument/result audit logging
Integration Summary
| Change | Where | Lines Changed |
|---|---|---|
| Replace call_tool body | lib/my_mcp_server_router.ex | 1 line |
| Add ETS setup for rate limiter | lib/mcp/application.ex | 2 lines |
| Add ETS setup for sequence tracker | lib/mcp/application.ex | 2 lines |
| Add Sqler instance for mcp_audit | lib/mcp/application.ex | 2 lines |
| New modules | lib/mcp_proxy/*.ex | ~300 lines total |
The entire change touches one line of existing code. Everything else is additive.
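The application.ex additions the table refers to would look roughly like this. A sketch of typical OTP application wiring; the module name, child list, and Sqler child spec are assumptions:

```elixir
defmodule MCP.Application do
  use Application

  def start(_type, _args) do
    # Create the ETS tables before any tool call can arrive
    MCPProxy.RateLimiter.setup()
    MCPProxy.SequenceDetector.setup()

    children = [
      # ... existing children, plus the assumed audit DB instance,
      # whose child-spec shape depends on the internal Sqler library:
      # {Sqler, name: :mcp_audit_db, path: "data/mcp_audit.db"}
    ]

    Supervisor.start_link(children, strategy: :one_for_one, name: MCP.Supervisor)
  end
end
```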
What This Gives External Agents
After this is deployed, when Claude Code calls knowledge_search through the MCP connection:
- Arguments are scanned for injection patterns (SQL injection, path traversal, prompt injection)
- Rate limiting prevents a runaway agent from making 1,000 calls per minute
- Sequence detection flags if the agent reads sensitive data then immediately tries to send it externally
- The tool executes through the existing permission-gated Permissions.* guard
- Results are scanned for injection patterns embedded in external data
- Everything is logged – full arguments, full results, hashed for tamper detection
The external agent never knows the proxy exists. The MCP protocol is unchanged. Tool definitions are unchanged. The agent gets the same results it would have gotten without the proxy – unless those results or arguments contained something dangerous.
What This Doesn’t Do
It doesn’t proxy the agent’s LLM calls. Claude Code talks to Anthropic’s API directly. WorkingAgents has no visibility into that conversation. The proxy only covers the MCP boundary – what the agent sends to WorkingAgents’ tools and what WorkingAgents sends back.
It doesn’t prevent all attacks. A sophisticated agent could exfiltrate data by encoding it in innocuous-looking tool arguments over many calls. The sequence detector catches simple read-then-write patterns but not slow exfiltration. This is a defense-in-depth layer, not a silver bullet.
It doesn’t modify the agent’s behavior. The proxy can reject a tool call, scan a result, or log a warning. It cannot change what the agent does with the results it receives. For that, you need the LLM proxy described in the companion article – which covers WorkingAgents’ own chat feature but not external agents.
The Two Proxy Layers Together
The complete security architecture uses both proxies:
WorkingAgents Chat (ServerChat)
-> LLMProxy (proxies LLM calls)
-> PreFlight: injection scan, PII redaction, cost estimation
-> LLM Provider API
-> PostFlight: response scan, cost recording
-> MCP tool calls go through MCPProxy too
External Agents (Claude Code, Codex, Gemini CLI)
-> MCPProxy (proxies MCP tool calls)
-> PreFlight: argument guard, rate limit, sequence detection
-> Permissions.* guard check
-> Tool execution
-> PostFlight: result guard, audit logging
The LLM Proxy secures the conversation between WorkingAgents’ own agents and the LLM. The MCP Proxy secures the boundary between any external agent and WorkingAgents’ tools. Together they cover both attack surfaces.
Priority Order
- MCPProxy skeleton + router integration – the one-line change, empty pipelines (1 hour)
- Audit logging – start capturing full arguments and results immediately (2 hours)
- Rate limiter – prevents budget exhaustion from runaway agents (1 hour)
- Argument guard – highest security impact on the input side (2 hours)
- Result guard – highest security impact on the output side (1 hour)
- Sequence detector – highest exfiltration detection impact (2 hours)
- PII guard – compliance impact (2 hours)
Steps 1-3 can be done in an afternoon. Each subsequent step is independent.