Proxying External Agents: Securing Claude Code, Gemini CLI, and OpenAI Codex Through WorkingAgents

The LLM Proxy article and its implementation plan focus on proxying WorkingAgents’ own chat feature – the ServerChat GenServer that talks to LLM providers. But most enterprises won’t use WorkingAgents’ built-in chat for their agentic work. They’ll use Claude Code, Gemini CLI, OpenAI Codex, or custom agents built on LangChain, CrewAI, or the Anthropic Agent SDK.

These agents are powerful, autonomous, and intelligent. They’re also opaque. When Claude Code calls a WorkingAgents MCP tool, what did it send? When it gets a result back, what does it do with that data? Could a tool result contain injected instructions that cause the agent to exfiltrate data through a subsequent tool call?

WorkingAgents already gates tool access through capability-based permissions. But permissions answer “can this agent call this tool?” They don’t answer “is what the agent is doing with the tool safe?” That’s the proxy’s job.

The Threat Model for External Agents

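External agents like Claude Code control their own LLM calls. WorkingAgents has no visibility into those calls: not the prompts the agent sends to its model provider, not the responses it receives, and not what the agent decides to do with a tool result once it has it.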

The attack surface is the MCP boundary – the point where tool calls come in and tool results go out. This is where WorkingAgents can inspect, filter, audit, and gate.

Specific Attack Vectors

1. Tool argument injection. An agent sends arguments to a tool that contain SQL injection, path traversal, or command injection. The tool executes the payload. Example: knowledge_search(query: "'; DROP TABLE knowledge_docs; --").

2. Data exfiltration via tool chaining. An agent reads sensitive data through one tool (knowledge_get), then writes it somewhere external through another tool (agentmail_send_message or fetch_url). Each individual call is authorized. The sequence is the attack.

3. Tool result poisoning upstream. A tool returns data from an external source (a web page, an API). That data contains hidden instructions. The external agent’s LLM processes those instructions and takes actions the user didn’t intend. WorkingAgents can’t control what the agent does with the result, but it can scan and sanitize the result before returning it.

4. Prompt extraction via tool abuse. An agent uses knowledge_search or blog_search to probe for system prompts, API keys, or internal configuration that may have been inadvertently stored in the knowledge base or blog content.

5. Budget exhaustion. An agent stuck in a tool-call loop fires hundreds of expensive tool calls per minute. Each call is individually authorized. The aggregate cost is catastrophic.

The Current Architecture

External agents connect to WorkingAgents through two MCP transports:

Claude Code / Gemini CLI / OpenAI Codex
  -> MCP Transport (SSE at /sse or Streamable HTTP at /mcp)
    -> handle_jsonrpc/2
      -> dispatch_method("tools/call", params, user)
        -> MyMCPServer.Manager.call_tool(name, args, user_id)
          -> MyMCPServer.handle_tool_call(name, args, state)
            -> Permissions.*.function(permissions, ...) [guard check]
              -> Business logic (Knowledge, BlogStore, Nis, etc.)
  -> Tool result returned to agent

Permission checks happen at the Permissions.* layer via compiled guard clauses. The user’s capability keys gate whether the tool call proceeds. This is solid access control.

What’s missing: inspection of the arguments going in and the results coming out.

Where the Proxy Inserts

The proxy inserts at the MCP transport boundary – between dispatch_method and call_tool. This is the single point where every external agent tool call passes through, regardless of which transport (SSE or Streamable HTTP) or which agent (Claude Code, Codex, etc.) is calling.

The Proxied Flow

External Agent
  -> MCP Transport
    -> handle_jsonrpc/2
      -> dispatch_method("tools/call", params, user)
        -> MCPProxy.call_tool(name, args, user)    # <-- NEW
          -> PreFlight pipeline
            -> ArgumentGuard.scan(user_id, name, args)    # injection patterns
            -> PiiGuard.scan(user_id, name, args)         # PII in arguments
            -> RateLimiter.scan(user_id, name, args)      # per-user rate limiting
            -> SequenceDetector.scan(user_id, name, args) # suspicious call patterns
          -> MyMCPServer.Manager.call_tool(name, args, user_id)
          -> PostFlight pipeline
            -> ResultGuard.process(user_id, name, result) # injection in results
            -> PiiGuard.scan_result(result)           # PII in results
            -> ResultSandbox.wrap(name, result)       # data boundary markers
          -> MCPProxy.Audit.log_call(user_id, name, args, result, status)
        -> Tool result returned to agent

Implementation

Step 1: MCPProxy Module

Create lib/mcp_proxy.ex:

defmodule MCPProxy do
  @moduledoc """
  Security proxy for MCP tool calls from external agents.
  Inspects arguments, gates execution, scans results, and logs everything.
  """

  require Logger

  def call_tool(name, args, user) do
    user_id = user.id

    with {:ok, args} <- MCPProxy.PreFlight.run(user_id, name, args),
         result <- MyMCPServer.Manager.call_tool(name, args, user_id),
         {:ok, result} <- MCPProxy.PostFlight.run(user_id, name, result) do
      MCPProxy.Audit.log_call(user_id, name, args, result, :ok)
      result
    else
      {:rejected, reason} ->
        Logger.warning("MCPProxy rejected #{name} for user #{user_id}: #{inspect(reason)}")
        MCPProxy.Audit.log_call(user_id, name, args, nil, {:rejected, reason})
        {:error, %{code: -32000, message: "Request rejected: #{inspect(reason)}"}}
    end
  end
end
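
To make the control flow concrete, here is a minimal, self-contained sketch of the same with chain, using stand-in functions in place of the real pipelines. The module name ProxyFlowSketch and its function arguments are hypothetical; only the {:ok, _} / {:rejected, _} contract and the -32000 error shape are taken from the module above.

```elixir
defmodule ProxyFlowSketch do
  # Hypothetical stand-in for MCPProxy.call_tool/3. The pre-flight check,
  # the tool, and the post-flight check are passed in as functions so the
  # flow can be exercised without the rest of the system.
  def call_tool(args, pre_flight, tool, post_flight) do
    with {:ok, args} <- pre_flight.(args),
         result = tool.(args),
         {:ok, result} <- post_flight.(result) do
      result
    else
      {:rejected, reason} ->
        {:error, %{code: -32000, message: "Request rejected: #{inspect(reason)}"}}
    end
  end
end

allow = fn value -> {:ok, value} end
deny = fn _value -> {:rejected, :argument_injection} end
tool = fn args -> {:ok, %{echo: args}} end

accepted = ProxyFlowSketch.call_tool(%{"q" => "hi"}, allow, tool, allow)
rejected = ProxyFlowSketch.call_tool(%{"q" => "hi"}, deny, tool, allow)

IO.inspect(accepted, label: "accepted")
IO.inspect(rejected, label: "rejected")
```

A rejection in either pipeline short-circuits to the else branch, so the tool never runs when pre-flight rejects and the agent only ever sees a JSON-RPC style error map.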

Step 2: Wire Into the Router

The change is one line in my_mcp_server_router.ex:

Before:

defp call_tool(name, args, user) do
  MyMCPServer.Manager.call_tool(name, args, user.id)
end

After:

defp call_tool(name, args, user) do
  MCPProxy.call_tool(name, args, user)
end

Every external agent tool call now goes through the proxy. SSE transport, Streamable HTTP transport, A2A protocol – all of them call call_tool/3 in the router. One change, all transports covered.

Step 3: PreFlight Pipeline

Create lib/mcp_proxy/pre_flight.ex:

defmodule MCPProxy.PreFlight do
  @checks [
    MCPProxy.ArgumentGuard,
    MCPProxy.PiiGuard,
    MCPProxy.RateLimiter,
    MCPProxy.SequenceDetector
  ]

  def run(user_id, tool_name, args) do
    Enum.reduce_while(@checks, {:ok, args}, fn check, {:ok, a} ->
      case check.scan(user_id, tool_name, a) do
        {:ok, a} -> {:cont, {:ok, a}}
        {:rejected, reason} -> {:halt, {:rejected, reason}}
      end
    end)
  end
end
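
PreFlight lists MCPProxy.PiiGuard, but this article does not show that module. The following is a minimal sketch of what a check with the same scan/3 contract could look like. The module name PiiGuardSketch, the two patterns, and the decision to reject (rather than redact or only log) are all assumptions made for illustration.

```elixir
defmodule PiiGuardSketch do
  # Illustrative patterns only; real PII detection needs far more than two regexes.
  defp patterns do
    [
      email: ~r/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/,
      us_ssn: ~r/\b\d{3}-\d{2}-\d{4}\b/
    ]
  end

  # Same contract as the other pre-flight checks:
  # {:ok, args} to continue, {:rejected, reason} to halt the pipeline.
  def scan(_user_id, tool_name, args) do
    text = args |> strings() |> Enum.join("\n")

    kinds = for {kind, regex} <- patterns(), Regex.match?(regex, text), do: kind

    case kinds do
      [] -> {:ok, args}
      _ -> {:rejected, {:pii_detected, tool_name, kinds}}
    end
  end

  # Collect string values from nested maps and lists.
  defp strings(value) when is_binary(value), do: [value]
  defp strings(value) when is_map(value), do: value |> Map.values() |> Enum.flat_map(&strings/1)
  defp strings(value) when is_list(value), do: Enum.flat_map(value, &strings/1)
  defp strings(_other), do: []
end

clean = PiiGuardSketch.scan("u1", "task_capture", %{"title" => "Renew the TLS certificate"})

flagged =
  PiiGuardSketch.scan("u1", "task_capture", %{"title" => "Call Ann", "notes" => ["SSN 123-45-6789"]})

IO.inspect(clean, label: "clean")
IO.inspect(flagged, label: "flagged")
```

Rejecting outright would break tools that legitimately carry addresses, such as agentmail_send_message, so a real guard would likely need a per-tool policy.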

Step 4: Argument Guard

Create lib/mcp_proxy/argument_guard.ex:

defmodule MCPProxy.ArgumentGuard do
  @moduledoc """
  Scans tool arguments for injection patterns, path traversal,
  and suspicious payloads.
  """

  @injection_patterns [
    ~r/;\s*(DROP|DELETE|INSERT|UPDATE|ALTER)\s/i,
    ~r/'\s*OR\s+'1'\s*=\s*'1/i,
    ~r/\.\.\//,
    ~r/<script/i,
    ~r/ignore\s+(all\s+)?previous\s+instructions/i,
    ~r/IMPORTANT:\s*(?:ignore|forget|disregard)/i
  ]

  # Tools that take free text (search queries, message bodies), where pattern matches are expected
  @freetext_tools ~w(
    knowledge_search knowledge_search_text knowledge_add knowledge_import
    blog_search blog_search_text summary_request task_capture
    nis_search whatsapp_send agentmail_send_message
  )

  def scan(_user_id, tool_name, args) do
    # Skip injection scanning on tools that naturally accept free-text
    if tool_name in @freetext_tools do
      {:ok, args}
    else
      texts = extract_strings(args)

      case Enum.find(texts, &has_injection?/1) do
        nil -> {:ok, args}
        text -> {:rejected, {:argument_injection, tool_name, String.slice(text, 0, 100)}}
      end
    end
  end

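  # Recursively collect string values from nested maps and lists
  defp extract_strings(value) when is_binary(value), do: [value]
  defp extract_strings(value) when is_map(value), do: value |> Map.values() |> Enum.flat_map(&extract_strings/1)
  defp extract_strings(value) when is_list(value), do: Enum.flat_map(value, &extract_strings/1)
  defp extract_strings(_other), do: []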

  defp has_injection?(text) do
    Enum.any?(@injection_patterns, &Regex.match?(&1, text))
  end
end

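Note the @freetext_tools list: tools like knowledge_search and whatsapp_send naturally accept free-text input that might match injection patterns. Scanning these would produce constant false positives, so the guard only scans tools where structured arguments are expected. The trade-off is that payloads sent to allowlisted tools, including the knowledge_search example from the threat model, pass through unscanned. Those tools still have to handle their input safely themselves, for example with parameterized queries.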
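
A small standalone demo can show how the patterns and the free-text allowlist interact. It copies two of the patterns from ArgumentGuard; the module name GuardPatternDemo, the verdict/2 function, and the shortened allowlist are hypothetical and exist only for this illustration.

```elixir
defmodule GuardPatternDemo do
  # Two of the ArgumentGuard patterns, copied so the demo runs on its own.
  defp patterns do
    [
      ~r/;\s*(DROP|DELETE|INSERT|UPDATE|ALTER)\s/i,
      ~r/\.\.\//
    ]
  end

  # Hypothetical, shortened allowlist for the demo.
  defp freetext_tools, do: ["knowledge_search"]

  # Mirrors the guard's decision: allowlisted tools are skipped,
  # everything else is matched against the patterns.
  def verdict(tool_name, text) do
    cond do
      tool_name in freetext_tools() -> :skipped
      Enum.any?(patterns(), &Regex.match?(&1, text)) -> :rejected
      true -> :ok
    end
  end
end

sql = "'; DROP TABLE knowledge_docs; --"

results = [
  GuardPatternDemo.verdict("read_file", "../../etc/passwd"),
  GuardPatternDemo.verdict("read_file", "notes/today.md"),
  GuardPatternDemo.verdict("nis_get_contact", sql),
  GuardPatternDemo.verdict("knowledge_search", sql)
]

IO.inspect(results)
```

The last call shows the cost of the allowlist: the same SQL payload that is rejected for nis_get_contact is skipped for knowledge_search.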

Step 5: Rate Limiter

Create lib/mcp_proxy/rate_limiter.ex:

defmodule MCPProxy.RateLimiter do
  @moduledoc """
  Per-user, per-tool rate limiting to prevent budget exhaustion
  from runaway agent loops.
  """

  @table :mcp_rate_limiter
  @window_ms 60_000
  @default_limit 60  # calls per minute per tool

  @high_cost_tools %{
    "knowledge_search" => 20,
    "summary_request" => 5,
    "blog_import" => 10,
    "agentmail_send_message" => 10,
    "whatsapp_send" => 5
  }

  def setup do
    :ets.new(@table, [:named_table, :public, :set])
  end

  def scan(user_id, tool_name, args) do
    limit = Map.get(@high_cost_tools, tool_name, @default_limit)
    key = {user_id, tool_name}
    now = System.system_time(:millisecond)

    # Read the current window once; start a new window if none is active
    {count, window_start} =
      case :ets.lookup(@table, key) do
        [{_, c, ws}] when now - ws < @window_ms -> {c, ws}
        _ -> {0, now}
      end

    if count >= limit do
      {:rejected, {:rate_limited, tool_name, limit, :per_minute}}
    else
      :ets.insert(@table, {key, count + 1, window_start})
      {:ok, args}
    end
  end
end

Expensive tools (embedding searches, email sending, WhatsApp) get tighter limits. An agent stuck in a loop calling knowledge_search 100 times per minute gets blocked after 20.
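
The lookup-then-insert approach above is not atomic, so two concurrent calls from the same user can both read the same count. One possible alternative, sketched below under assumptions, uses :ets.update_counter/4 to increment and read in a single operation. The module name AtomicRateLimitSketch, the check/4 signature, the table name, and the window-index key are hypothetical and not part of the proxy as described.

```elixir
defmodule AtomicRateLimitSketch do
  # Hypothetical variant of the rate limiter built on :ets.update_counter/4,
  # which increments and returns the counter atomically.
  @table :rate_limit_sketch
  @window_ms 60_000

  def setup do
    :ets.new(@table, [:named_table, :public, :set, write_concurrency: true])
  end

  # `now_ms` is a parameter so the window arithmetic is easy to test.
  def check(user_id, tool_name, limit, now_ms \\ System.system_time(:millisecond)) do
    # All calls that fall in the same fixed one-minute window share one key.
    window = div(now_ms, @window_ms)
    key = {user_id, tool_name, window}

    # Inserts {key, 0} if the key is missing, then adds 1 and returns the new count.
    count = :ets.update_counter(@table, key, 1, {key, 0})

    if count > limit do
      {:rejected, {:rate_limited, tool_name, limit, :per_minute}}
    else
      :ok
    end
  end
end

AtomicRateLimitSketch.setup()
t0 = 1_700_000_000_000

first = AtomicRateLimitSketch.check("u1", "summary_request", 2, t0)
second = AtomicRateLimitSketch.check("u1", "summary_request", 2, t0 + 1)
third = AtomicRateLimitSketch.check("u1", "summary_request", 2, t0 + 2)
next_window = AtomicRateLimitSketch.check("u1", "summary_request", 2, t0 + 60_000)

IO.inspect([first, second, third, next_window])
```

Two differences from the module above are worth noting: rejected calls still increment the counter, and keys for past windows stay in the table until something deletes them, so a periodic cleanup would be needed.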

Step 6: Sequence Detector

Create lib/mcp_proxy/sequence_detector.ex:

defmodule MCPProxy.SequenceDetector do
  @moduledoc """
  Detects suspicious tool call sequences that indicate
  data exfiltration or privilege escalation.

  Example: knowledge_get followed by agentmail_send_message
  suggests the agent is reading internal data and emailing it out.
  """

  require Logger

  @table :mcp_sequence_tracker
  @window_ms 300_000  # 5-minute window

  # Suspicious sequences: {read_tool, write_tool}
  @suspicious_sequences [
    {"knowledge_get", "agentmail_send_message"},
    {"knowledge_get", "whatsapp_send"},
    {"knowledge_get", "fetch_url"},
    {"read_file", "agentmail_send_message"},
    {"read_file", "whatsapp_send"},
    {"read_file", "fetch_url"},
    {"nis_get_contact", "agentmail_send_message"},
    {"nis_get_contact", "whatsapp_send"},
    {"access_control_audit_log", "fetch_url"},
    {"access_control_user_permissions", "fetch_url"}
  ]

  def setup do
    :ets.new(@table, [:named_table, :public, :bag])
  end

  def scan(user_id, tool_name, args) do
    now = System.system_time(:millisecond)

    # Record this call
    :ets.insert(@table, {user_id, tool_name, now})

    # Split this user's entries into those inside the window and expired ones
    {fresh, stale} =
      :ets.lookup(@table, user_id)
      |> Enum.split_with(fn {_, _, ts} -> now - ts < @window_ms end)

    # Delete only the expired entries; fresh entries keep their original
    # timestamps so they still age out of the window
    Enum.each(stale, &:ets.delete_object(@table, &1))

    recent = Enum.map(fresh, fn {_, name, _} -> name end)

    # Check for suspicious sequences
    case find_suspicious(recent, tool_name) do
      nil ->
        {:ok, args}

      {read_tool, write_tool} ->
        # Log warning but don't block -- this is a heuristic
        Logger.warning(
          "MCPProxy: Suspicious sequence for user #{user_id}: " <>
          "#{read_tool} -> #{write_tool} within 5 minutes"
        )
        {:ok, args}
    end
  end

  defp find_suspicious(recent_tools, current_tool) do
    Enum.find(@suspicious_sequences, fn {read, write} ->
      write == current_tool and read in recent_tools
    end)
  end
end

This is deliberately non-blocking in v1. It logs warnings for suspicious read-then-exfiltrate sequences without rejecting the call. Once we have enough data to calibrate false positive rates, it can be made blocking for specific high-risk sequences.
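
One way the later move to blocking could look is a per-sequence policy, sketched below. The module name SequencePolicySketch, the decide/2 function, and the choice of which pair blocks are assumptions for illustration; the real choice should come from the observed false positive data.

```elixir
defmodule SequencePolicySketch do
  # Hypothetical policy table: each {read_tool, write_tool} pair maps to an action.
  defp policies do
    %{
      {"access_control_audit_log", "fetch_url"} => :block,
      {"knowledge_get", "fetch_url"} => :warn,
      {"knowledge_get", "agentmail_send_message"} => :warn
    }
  end

  # Given the tools this user called recently and the tool being called now,
  # returns :ok, {:warn, pair}, or {:rejected, reason}.
  def decide(recent_tools, current_tool) do
    matches =
      for {{read, write} = pair, action} <- policies(),
          write == current_tool,
          read in recent_tools,
          do: {pair, action}

    # A blocking match takes priority over a warning-only match.
    case Enum.find(matches, fn {_pair, action} -> action == :block end) || List.first(matches) do
      nil -> :ok
      {pair, :warn} -> {:warn, pair}
      {pair, :block} -> {:rejected, {:suspicious_sequence, pair}}
    end
  end
end

warned = SequencePolicySketch.decide(["knowledge_get"], "agentmail_send_message")
blocked = SequencePolicySketch.decide(["knowledge_get", "access_control_audit_log"], "fetch_url")
allowed = SequencePolicySketch.decide(["blog_search"], "fetch_url")

IO.inspect([warned, blocked, allowed])
```

Because the rejected tuple has the same shape the other pre-flight checks return, a detector built this way could halt the pipeline without changes to PreFlight.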

Step 7: PostFlight Pipeline

Create lib/mcp_proxy/post_flight.ex:

defmodule MCPProxy.PostFlight do
  @checks [
    MCPProxy.ResultGuard
  ]

  def run(user_id, tool_name, result) do
    Enum.reduce_while(@checks, {:ok, result}, fn check, {:ok, r} ->
      case check.process(user_id, tool_name, r) do
        {:ok, r} -> {:cont, {:ok, r}}
        {:rejected, reason} -> {:halt, {:rejected, reason}}
      end
    end)
  end
end

Step 8: Result Guard

Create lib/mcp_proxy/result_guard.ex:

defmodule MCPProxy.ResultGuard do
  @moduledoc """
  Scans tool results before returning them to the external agent.
  Detects injection patterns embedded in data from external sources.
  """

  require Logger

  @injection_patterns [
    ~r/ignore\s+(all\s+)?previous\s+instructions/i,
    ~r/IMPORTANT:\s*(?:ignore|forget|disregard)/i,
    ~r/you\s+are\s+now\s+(?:a|an)\s+/i,
    ~r/\[INST\]|\[\/INST\]|<<SYS>>|<\|im_start\|>/i
  ]

  # Tools that return external/untrusted data
  @scan_tools ~w(fetch_url knowledge_search knowledge_search_text blog_search blog_search_text)

  def process(_user_id, tool_name, {:ok, result}) when is_map(result) do
    if tool_name in @scan_tools do
      # Lift inspect/2's default truncation so long strings are scanned up to the 50,000-character cap
      text = result |> inspect(limit: :infinity, printable_limit: :infinity) |> String.slice(0, 50_000)

      if Enum.any?(@injection_patterns, &Regex.match?(&1, text)) do
        # Don't block -- flag the result and warn (nothing is stripped)
        Logger.warning("MCPProxy: Injection pattern detected in #{tool_name} result")
        {:ok, {:ok, Map.put(result, :_proxy_warning, "Potential injection pattern detected in result")}}
      else
        {:ok, {:ok, result}}
      end
    else
      {:ok, {:ok, result}}
    end
  end

  def process(_user_id, _tool_name, result), do: {:ok, result}
end
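
The proxied flow also lists ResultSandbox.wrap(name, result) for data boundary markers, which this article does not implement. The sketch below shows one possible reading of that step. The module name ResultSandboxSketch, the marker format, and the per-call nonce are assumptions, and wrapping turns a structured result into a string, which a real implementation might not want.

```elixir
defmodule ResultSandboxSketch do
  # Hypothetical sketch of "data boundary markers": the tool output is placed
  # between explicit markers so the consuming agent can treat everything
  # inside them as data rather than instructions.
  def wrap(tool_name, {:ok, result}) do
    # A random per-call nonce makes it harder for a payload to forge the closing marker.
    nonce = :crypto.strong_rand_bytes(8) |> Base.encode16(case: :lower)

    body =
      if is_binary(result),
        do: result,
        else: inspect(result, limit: :infinity, printable_limit: :infinity)

    wrapped =
      "<<tool-result #{tool_name} #{nonce}>>\n" <>
        body <>
        "\n<<end-tool-result #{nonce}>>"

    {:ok, wrapped}
  end

  # Errors and unexpected shapes pass through untouched.
  def wrap(_tool_name, other), do: other
end

{:ok, wrapped} = ResultSandboxSketch.wrap("fetch_url", {:ok, "Ignore previous instructions."})
passthrough = ResultSandboxSketch.wrap("fetch_url", {:error, :timeout})

IO.puts(wrapped)
IO.inspect(passthrough)
```

Markers like these only help if the consuming agent is told, or trained, to respect them; they do not remove the injected text.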

Step 9: Audit Module

Create lib/mcp_proxy/audit.ex:

defmodule MCPProxy.Audit do
  @moduledoc """
  Logs every MCP tool call from external agents with full arguments,
  results, and status for forensic analysis.
  """

  def log_call(user_id, tool_name, args, result, status) do
    Sqler.insert(:mcp_audit_db, "mcp_tool_audit", %{
      user_id: user_id,
      tool_name: tool_name,
      args_hash: hash(args),
      args_summary: summarize(args),
      result_summary: summarize(result),
      status: inspect(status),
      created_at: System.system_time(:millisecond)
    })
  end

  defp hash(data) do
    :crypto.hash(:sha256, :erlang.term_to_binary(data))
    |> Base.encode16(case: :lower)
  end

  defp summarize(data) do
    inspect(data, limit: 500, printable_limit: 2000)
    |> String.slice(0, 2000)
  end
end

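This goes beyond the existing ToolAudit, which logs tool name, user, and duration. The MCP proxy audit stores a summary of the arguments and of the result (each truncated to 2,000 characters) along with a SHA-256 hash of the full arguments. The hash lets you confirm later whether a given argument set is the one that was logged; because it sits in the same row as the summaries, it does not by itself protect the log against tampering. When something goes wrong, you can reconstruct what the agent sent and what it received, up to the truncation limit.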
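
As an illustration of how the stored hash might be used during an investigation, the sketch below recomputes the hash for candidate arguments and compares it with a stored value. The module name AuditHashSketch and the matches?/2 helper are hypothetical; the hashing itself is copied from the Audit module above.

```elixir
defmodule AuditHashSketch do
  # Same hashing approach as MCPProxy.Audit, copied so the sketch runs on its own.
  def hash(data) do
    :crypto.hash(:sha256, :erlang.term_to_binary(data))
    |> Base.encode16(case: :lower)
  end

  # Hypothetical helper: does a stored args_hash match a candidate argument map?
  def matches?(stored_hash, candidate_args), do: stored_hash == hash(candidate_args)
end

stored = AuditHashSketch.hash(%{"id" => 42, "query" => "quarterly report"})

# Key order in the literal does not matter; both literals build the same map.
same = AuditHashSketch.matches?(stored, %{"query" => "quarterly report", "id" => 42})
different = AuditHashSketch.matches?(stored, %{"id" => 43, "query" => "quarterly report"})

IO.inspect({String.length(stored), same, different})
```

One caveat: :erlang.term_to_binary/1 does not promise an identical encoding for equal terms in every case (large maps in particular). OTP 24.1 and later offer a :deterministic option on term_to_binary/2 that may be worth evaluating if hashes are compared across nodes or releases.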

File Structure

lib/
  mcp_proxy.ex                     # Main proxy module
  mcp_proxy/
    pre_flight.ex                  # PreFlight pipeline
    post_flight.ex                 # PostFlight pipeline
    argument_guard.ex              # Injection scanning on arguments
    pii_guard.ex                   # PII detection in args and results
    rate_limiter.ex                # Per-user per-tool rate limiting
    sequence_detector.ex           # Suspicious call pattern detection
    result_guard.ex                # Injection scanning on results
    audit.ex                       # Full argument/result audit logging

Integration Summary

Change                              Where                         Lines Changed
Replace call_tool body              lib/my_mcp_server_router.ex   1 line
Add ETS setup for rate limiter      lib/mcp/application.ex        2 lines
Add ETS setup for sequence tracker  lib/mcp/application.ex        2 lines
Add Sqler instance for mcp_audit    lib/mcp/application.ex        2 lines
New modules                         lib/mcp_proxy/*.ex            ~300 lines total

The entire change touches one line of existing code. Everything else is additive.
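
An ETS table is deleted when the process that created it exits, so where the setup calls run matters. As an alternative to calling them directly in application.ex, the sketch below creates the tables inside a small supervised process. The module name TableOwnerSketch is hypothetical, and the two :ets.new calls stand in for MCPProxy.RateLimiter.setup() and MCPProxy.SequenceDetector.setup().

```elixir
defmodule TableOwnerSketch do
  # Hypothetical long-lived process that owns the proxy's ETS tables.
  use GenServer

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  @impl true
  def init(_opts) do
    # In the real application these two lines would be the setup/0 calls
    # of the rate limiter and the sequence detector.
    :ets.new(:mcp_rate_limiter, [:named_table, :public, :set])
    :ets.new(:mcp_sequence_tracker, [:named_table, :public, :bag])
    {:ok, %{}}
  end
end

# Start the owner under a supervisor, as it would be in the application's child list.
{:ok, _supervisor} = Supervisor.start_link([TableOwnerSketch], strategy: :one_for_one)

table_types = Enum.map([:mcp_rate_limiter, :mcp_sequence_tracker], &:ets.info(&1, :type))
IO.inspect(table_types)
```

If the owner crashes and restarts, the tables come back empty, which for rate limits and sequence tracking means counters reset rather than calls failing.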

What This Gives External Agents

After this is deployed, when Claude Code calls knowledge_search through the MCP connection:

  1. Arguments are scanned for injection patterns (SQL injection, path traversal, prompt injection); tools on the free-text allowlist, which includes knowledge_search itself, skip this scan
  2. Rate limiting prevents a runaway agent from making 1,000 calls per minute
  3. Sequence detection flags if the agent reads sensitive data then immediately tries to send it externally
  4. The tool executes through the existing permission-gated Permissions.* guard
  5. Results are scanned for injection patterns embedded in external data
  6. Every call is logged with a truncated summary of its arguments and result, plus a SHA-256 hash of the full arguments

The external agent never knows the proxy exists. The MCP protocol is unchanged. Tool definitions are unchanged. The agent gets the same results it would have gotten without the proxy – unless those results or arguments contained something dangerous.

What This Doesn’t Do

It doesn’t proxy the agent’s LLM calls. Claude Code talks to Anthropic’s API directly. WorkingAgents has no visibility into that conversation. The proxy only covers the MCP boundary – what the agent sends to WorkingAgents’ tools and what WorkingAgents sends back.

It doesn’t prevent all attacks. A sophisticated agent could exfiltrate data by encoding it in innocuous-looking tool arguments over many calls. The sequence detector catches simple read-then-write patterns but not slow exfiltration. This is a defense-in-depth layer, not a silver bullet.

It doesn’t modify the agent’s behavior. The proxy can reject a tool call, scan a result, or log a warning. It cannot change what the agent does with the results it receives. For that, you need the LLM proxy described in the companion article – which covers WorkingAgents’ own chat feature but not external agents.

The Two Proxy Layers Together

The complete security architecture uses both proxies:

WorkingAgents Chat (ServerChat)
  -> LLMProxy (proxies LLM calls)
    -> PreFlight: injection scan, PII redaction, cost estimation
    -> LLM Provider API
    -> PostFlight: response scan, cost recording
    -> MCP tool calls go through MCPProxy too

External Agents (Claude Code, Codex, Gemini CLI)
  -> MCPProxy (proxies MCP tool calls)
    -> PreFlight: argument guard, rate limit, sequence detection
    -> Permissions.* guard check
    -> Tool execution
    -> PostFlight: result guard, audit logging

The LLM Proxy secures the conversation between WorkingAgents’ own agents and the LLM. The MCP Proxy secures the boundary between any external agent and WorkingAgents’ tools. Together they cover both attack surfaces.

Priority Order

  1. MCPProxy skeleton + router integration – the one-line change, empty pipelines (1 hour)
  2. Audit logging – start capturing full arguments and results immediately (2 hours)
  3. Rate limiter – prevents budget exhaustion from runaway agents (1 hour)
  4. Argument guard – highest security impact on the input side (2 hours)
  5. Result guard – highest security impact on the output side (1 hour)
  6. Sequence detector – highest exfiltration detection impact (2 hours)
  7. PII guard – compliance impact (2 hours)

Steps 1-3 can be done in an afternoon. Each subsequent step is independent.