Can You Proxy the Traffic Between an External Agent and Its LLM?

The MCP proxy article covers securing the boundary between external agents (Claude Code, Gemini CLI, OpenAI Codex) and WorkingAgents’ tools. But that proxy only sees tool calls. It doesn’t see the conversation between the agent and its LLM – the system prompt, the reasoning, the full context window, or what the agent does with tool results before and after it talks to the model.

That conversation is where the real security risk lives. Can you intercept it? Does HTTPS prevent it? What techniques actually work?

The Short Answer

Yes, you can proxy the traffic between an external agent and its LLM. HTTPS does not prevent it – if you control the machine the agent runs on. There are four practical techniques, each with different tradeoffs.

Why HTTPS Doesn’t Stop You (On Your Own Machine)

HTTPS encrypts traffic between two endpoints so that no one in the middle can read it. TLS certificates provide the authentication: the client verifies the server’s certificate against its trusted Certificate Authorities, and the encrypted channel prevents eavesdropping.

But that trust chain starts with Certificate Authorities (CAs) installed on the client machine. If you control the machine, you control which CAs it trusts. This is the foundation of enterprise TLS inspection: install a corporate CA certificate on the machine, and a proxy that presents certificates signed by that CA can terminate, inspect, and re-encrypt all HTTPS traffic.

The NSA’s own guidance document on TLS inspection (Managing Risk from TLS Inspection) confirms this is standard practice for enterprise network security. Google Cloud, Cloudflare, Cisco, and every major enterprise firewall vendor offer TLS inspection as a product feature.

So: HTTPS protects traffic from third parties on the network. It does not protect traffic from the machine’s administrator. If you own the machine, you can inspect everything.
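You can see this dependence on the local trust store directly in a client TLS library. In Python, for instance, the default context checks server certificates against the machine’s CA store, and an administrator-installed CA is loaded the same way (the corporate-CA path below is illustrative):

```python
# Client-side TLS trust is rooted in the local CA store: the default
# context verifies server certificates against whatever CAs the machine trusts.
import ssl

ctx = ssl.create_default_context()  # loads the system trust store
print(ctx.verify_mode == ssl.CERT_REQUIRED)  # True: server certs are checked

# An admin-installed corporate CA would be trusted the same way, e.g.:
# ctx.load_verify_locations("/usr/local/share/ca-certificates/corp-ca.crt")
```

Once that extra CA is in the store, any proxy presenting certificates signed by it passes the same verification.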

Technique 1: API URL Redirect (Simplest, Most Practical)

Most AI agents support environment variables that control where they send API calls. Instead of calling the LLM provider directly, you redirect the agent to call your proxy, which logs the traffic and forwards it to the real provider.

How It Works

Claude Code
  -> ANTHROPIC_BASE_URL=http://your-proxy:4000
    -> Your proxy receives the full request (system prompt, messages, tools)
    -> Logs everything
    -> Forwards to api.anthropic.com
    -> Receives response
    -> Logs everything
    -> Returns response to Claude Code

Environment Variables by Agent

Agent                   Variable              Example
Claude Code             ANTHROPIC_BASE_URL    http://proxy:4000
OpenAI Codex            OPENAI_BASE_URL       http://proxy:4000/v1
Gemini CLI              GOOGLE_API_ENDPOINT   http://proxy:4000
Any OpenAI-compatible   OPENAI_API_BASE       http://proxy:4000/v1

Claude Code explicitly supports this. Their LLM gateway documentation describes how to configure custom API endpoints, including enterprise network configuration for proxy servers and custom CA certificates.

Existing Tools

LiteLLM is the most mature option. It’s a Python proxy that speaks the Anthropic, OpenAI, Google, and other APIs. You point Claude Code at LiteLLM, and LiteLLM forwards to the real provider while logging everything.

pip install 'litellm[proxy]'
litellm --config config.yaml --port 4000

# Then configure Claude Code:
export ANTHROPIC_BASE_URL=http://localhost:4000
export ANTHROPIC_AUTH_TOKEN=your-litellm-key

LiteLLM logs every request and response, tracks token usage and cost, and provides a dashboard. It’s production-ready and used by thousands of teams. (LiteLLM Claude Code Quickstart)
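The config.yaml referenced above follows LiteLLM’s model_list schema. A minimal sketch might look like this – the model name and key are placeholders, so check LiteLLM’s docs for current model identifiers:

```yaml
model_list:
  - model_name: claude-sonnet          # alias agents request
    litellm_params:
      model: anthropic/claude-3-5-sonnet-20241022
      api_key: os.environ/ANTHROPIC_API_KEY

general_settings:
  master_key: sk-your-litellm-key      # the ANTHROPIC_AUTH_TOKEN agents present
```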

claude-code-proxy by seifghazi captures and visualizes in-flight Claude Code requests and conversations. Lighter than LiteLLM, purpose-built for Claude Code inspection. (GitHub)

Advantages

  - Full visibility into every request and response, including the system prompt
  - No certificates or TLS tricks – the agent sends traffic to the proxy voluntarily
  - Low effort: mature tools exist, or a small reverse proxy does the job

Disadvantages

  - Only covers agents that expose a configurable base URL
  - API keys flow through (or are managed by) the proxy
  - An agent can bypass it by unsetting the environment variable

Building This in Elixir

This is the most natural fit for WorkingAgents. A Plug-based reverse proxy that speaks the Anthropic/OpenAI APIs:

defmodule LLMGateway do
  @moduledoc """
  Reverse proxy that intercepts agent-to-LLM traffic.
  Agents point their ANTHROPIC_BASE_URL or OPENAI_BASE_URL here.
  """
  use Plug.Router

  plug Plug.Parsers, parsers: [:json], json_decoder: Jason
  plug :match
  plug :dispatch

  # Catch-all: forward any request to the real provider
  match _ do
    provider = detect_provider(conn)
    body = conn.body_params

    # PRE-FLIGHT: scan the request
    LLMGateway.PreFlight.scan(conn, body)

    # LOG: full request payload
    LLMGateway.Audit.log_request(conn, body, provider)

    # FORWARD: to real provider
    {:ok, response} = forward_to_provider(conn, body, provider)

    # POST-FLIGHT: scan the response
    LLMGateway.PostFlight.scan(conn, response)

    # LOG: full response
    LLMGateway.Audit.log_response(conn, response, provider)

    # RETURN: to agent
    conn
    |> put_resp_content_type("application/json")
    |> send_resp(response.status, response.body)
  end

  defp detect_provider(conn) do
    cond do
      String.contains?(conn.request_path, "/v1/messages") -> :anthropic
      String.contains?(conn.request_path, "/v1/chat") -> :openai
      String.contains?(conn.request_path, "/v1beta") -> :google
      true -> :unknown
    end
  end

  defp forward_to_provider(conn, body, provider) do
    target = provider_url(provider) <> conn.request_path

    # Drop hop-specific headers; Req sets host, content-length, and
    # content-type (via the :json option) itself
    headers =
      conn.req_headers
      |> Enum.reject(fn {k, _} -> k in ["host", "content-length", "content-type"] end)

    # decode_body: false keeps the upstream body as raw JSON, so it can be
    # returned to the agent verbatim in send_resp/3
    Req.post(target,
      json: body,
      headers: headers,
      receive_timeout: 120_000,
      decode_body: false
    )
  end

  defp provider_url(:anthropic), do: "https://api.anthropic.com"
  defp provider_url(:openai), do: "https://api.openai.com"
  defp provider_url(:google), do: "https://generativelanguage.googleapis.com"
end

This could be integrated into WorkingAgents as a new endpoint – /llm-gateway – giving enterprises a single deployment that governs both MCP tool access and LLM API traffic.

Technique 2: TLS Interception (MITM Proxy)

For agents that don’t support configurable API URLs, or for comprehensive traffic inspection across all applications on a machine, TLS interception captures everything.

How It Works

Agent (trusts corporate CA)
  -> Corporate proxy (e.g., mitmproxy)
    -> Terminates TLS using corporate CA
    -> Reads plaintext request
    -> Logs everything
    -> Opens new TLS connection to api.anthropic.com
    -> Forwards request
    -> Receives response
    -> Logs everything
    -> Re-encrypts with corporate CA cert
  -> Agent receives response (trusts corporate CA, sees valid TLS)

Setup

  1. Install mitmproxy – open-source, actively maintained, Python-based
  2. Generate a CA certificate – mitmproxy creates one automatically
  3. Install the CA cert on the machine – add to system trust store
  4. Route traffic through the proxy – set HTTPS_PROXY environment variable or configure at the OS level
  5. All HTTPS traffic is now visible – mitmproxy’s web interface shows every request and response in plaintext
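Once the proxy terminates TLS, inspection is scriptable. A minimal mitmproxy addon (run with mitmdump -s llm_logger.py) that filters for LLM API hosts might look like this – the host set and truncation length are illustrative:

```python
# llm_logger.py - mitmproxy addon sketch: log decrypted LLM API traffic.
LLM_HOSTS = {
    "api.anthropic.com",
    "api.openai.com",
    "generativelanguage.googleapis.com",
}

def is_llm_flow(host: str) -> bool:
    """True when the intercepted flow targets a known LLM provider."""
    return host in LLM_HOSTS

def request(flow):
    # mitmproxy calls this hook with the decrypted request
    if is_llm_flow(flow.request.host):
        print("LLM REQUEST:", flow.request.get_text()[:500])

def response(flow):
    # and this one with the decrypted response
    if is_llm_flow(flow.request.host):
        print("LLM RESPONSE:", flow.response.get_text()[:500])
```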

Existing Tools

mitmproxy is the gold standard. Free, open-source, handles HTTP/1.1, HTTP/2, WebSockets, and any TLS-protected protocol. Web interface for inspection. Python scripting for automation. (mitmproxy.org)

llm-interceptor is purpose-built for AI coding assistants. MITM proxy that specifically targets Claude Code, Cursor, and other agent-to-LLM traffic. Supports streaming (SSE), multi-provider detection, API key masking, and a web interface for analysis. (GitHub)

mitmproxy-mcp wraps mitmproxy as an MCP server, letting AI agents themselves inspect and modify traffic. Meta, but useful for automated security testing. (GitHub)

Advantages

  - Covers every agent and every application on the machine – no cooperation required
  - Captures traffic from agents that offer no URL configuration at all

Disadvantages

  - Requires installing a CA certificate in the machine’s trust store
  - Applications that pin certificates break under interception and need exceptions
  - More operational complexity than an API URL redirect

Erlang/Elixir Considerations

Building a MITM proxy in Erlang/Elixir is possible but not advisable for v1. The Erlang :ssl module can terminate and originate TLS connections. The sslproxy project demonstrates this. But the tooling is far less mature than mitmproxy in Python.

Recommended approach: Run mitmproxy as a sidecar process alongside WorkingAgents. Use mitmproxy’s Python scripting to forward inspection data to WorkingAgents via a webhook or API call. WorkingAgents logs it in its audit trail alongside MCP tool call data, creating a unified security view.

Technique 3: Agent SDK Instrumentation

Some agent frameworks provide hooks or callbacks that fire before and after LLM calls. If you’re building agents with the Anthropic Agent SDK, LangChain, or CrewAI, you can instrument the LLM call at the framework level.

How It Works

Custom Agent (your code)
  -> Framework callback: before_llm_call(messages)
    -> Your inspection code runs
    -> Log, scan, redact
  -> Framework makes LLM call
  -> Framework callback: after_llm_call(response)
    -> Your inspection code runs
    -> Log, scan, filter
  -> Agent continues

Example: Anthropic Agent SDK

from claude_agent_sdk import Agent

class GovernedAgent(Agent):
    # NOTE: the _call_model hook name and the audit/guard helpers are
    # illustrative – check the SDK's current API for the real extension point.
    async def _call_model(self, messages, tools):
        # PRE-FLIGHT
        self.audit.log_request(messages, tools)
        self.guard.scan_messages(messages)

        # CALL
        response = await super()._call_model(messages, tools)

        # POST-FLIGHT
        self.audit.log_response(response)
        self.guard.scan_response(response)

        return response

Advantages

  - Runs in-process, before any encryption – no proxy, certificate, or network changes
  - Low effort when you control the agent’s code

Disadvantages

  - Only covers custom agents built on instrumentable frameworks
  - Off-the-shelf agents like Claude Code and Codex can’t be instrumented this way

Technique 4: Network-Level Packet Capture

For forensic analysis or compliance auditing, you can capture all network traffic at the OS level and decrypt it later using TLS session keys.

How It Works

Most TLS implementations can export session keys via the SSLKEYLOGFILE environment variable. Set this variable, and the TLS library writes pre-master secrets to a file. Wireshark can then decrypt captured pcap files using these keys.

export SSLKEYLOGFILE=/tmp/tls-keys.log
# Start the agent
# Capture traffic with tcpdump
tcpdump -i any -w capture.pcap port 443
# Open in Wireshark with the key log file
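Python’s ssl module exposes the same mechanism programmatically: create_default_context() honors SSLKEYLOGFILE when set, and keylog_filename enables the key log explicitly (requires OpenSSL 1.1.1+; the path below is just a temp file for illustration):

```python
# TLS key logging from the client side: every handshake's secrets are
# written to the key log, which Wireshark can use to decrypt captures.
import os
import ssl
import tempfile

keyfile = os.path.join(tempfile.mkdtemp(), "tls-keys.log")
ctx = ssl.create_default_context()
ctx.keylog_filename = keyfile   # same effect as exporting SSLKEYLOGFILE
print(ctx.keylog_filename)      # the path you would point Wireshark at
```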

Advantages

  - Passive – nothing sits in the traffic path, so no added latency or failure modes
  - Complete capture, well suited to forensics and compliance audits

Disadvantages

  - Analysis happens after the fact, not in real time
  - The key log file is itself highly sensitive and must be protected
  - Not every TLS library honors SSLKEYLOGFILE

Which Technique for WorkingAgents?

Technique             Best For                                                   Effort                     Coverage
API URL redirect      Claude Code, Codex, any agent with configurable base URL   Low                        Agent-to-LLM traffic for cooperating agents
TLS interception      All agents, all traffic, no cooperation needed             Medium                     Everything on the machine
SDK instrumentation   Custom agents built by WorkingAgents or clients            Low                        Only custom agents
Packet capture        Forensics, compliance audit, incident response             Low setup, high analysis   Everything, after the fact

Recommended Strategy for WorkingAgents

Phase 1: API URL Redirect (build now). Add an LLM Gateway endpoint to WorkingAgents (/llm-gateway) that acts as a reverse proxy for Anthropic, OpenAI, and Google APIs. Enterprises configure their agents to point at it. WorkingAgents logs every prompt, every response, every token. This covers the 80% case – Claude Code, Codex, and most frameworks support configurable base URLs. Buildable in Elixir with Plug + Req. Estimated effort: 2-3 days.

Phase 2: mitmproxy sidecar (deploy when needed). For agents that don’t support URL configuration, deploy mitmproxy as a sidecar process. Use mitmproxy’s Python scripting to forward inspection events to WorkingAgents’ audit API. This covers the remaining 20%. No Elixir code needed – just deployment configuration and a webhook receiver.
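The Phase 2 wiring can be sketched as a mitmproxy addon that POSTs each intercepted exchange to a webhook (run with mitmdump -s forward_llm.py). The webhook URL and event shape below are assumptions, not a real WorkingAgents API:

```python
# forward_llm.py - mitmproxy addon sketch: forward intercepted LLM
# exchanges to an audit webhook. WEBHOOK and the event shape are
# hypothetical placeholders for WorkingAgents' audit API.
import json
from urllib.request import Request, urlopen

WEBHOOK = "http://localhost:4001/audit/llm-events"

def build_event(kind: str, body: dict) -> bytes:
    """Serialize an audit event for the webhook receiver."""
    return json.dumps({"kind": kind, "body": body}).encode("utf-8")

def response(flow):
    # fires once per completed request/response pair
    if flow.request.host != "api.anthropic.com":
        return
    event = build_event("llm_exchange", {
        "request": flow.request.get_text(),
        "response": flow.response.get_text(),
    })
    req = Request(WEBHOOK, data=event,
                  headers={"Content-Type": "application/json"})
    urlopen(req, timeout=5)
```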

Phase 3: SDK hooks for custom agents (provide as library). Publish lightweight instrumentation libraries for the Anthropic Agent SDK, LangChain, and CrewAI that send pre/post LLM call events to WorkingAgents’ audit API. Clients building custom agents get governance without building their own logging.
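The shape of such a library is a thin pre/post wrapper around the framework’s LLM call. A framework-agnostic sketch, where the in-memory list stands in for a POST to the audit API:

```python
# Generic Phase 3 instrumentation: decorate any framework's LLM-call
# function with pre/post audit events. AUDIT_LOG is a placeholder sink.
import functools

AUDIT_LOG = []

def audited(llm_call):
    @functools.wraps(llm_call)
    def wrapper(messages, **kwargs):
        AUDIT_LOG.append(("pre", messages))    # pre-flight event: full prompt
        response = llm_call(messages, **kwargs)
        AUDIT_LOG.append(("post", response))   # post-flight event: full reply
        return response
    return wrapper

@audited
def fake_llm_call(messages):
    # stand-in for the framework's real model call
    return "ok"
```

The same pattern maps onto LangChain callbacks or CrewAI hooks: the wrapper emits one event before the call and one after, without touching the framework’s own logic.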

The Complete Security Picture

With all three proxy layers deployed:

External Agent (Claude Code, Codex, etc.)
  -> LLM Gateway (Technique 1: API URL redirect)
    -> PreFlight: scan prompts, redact PII, estimate cost
    -> Forward to LLM provider
    -> PostFlight: scan response, log tokens, record cost
  -> Agent processes response, decides to call a tool
  -> MCP Proxy (from companion article)
    -> PreFlight: argument guard, rate limit, sequence detection
    -> Permission check (capability-based guard)
    -> Tool execution
    -> PostFlight: result guard, audit logging
  -> Agent receives tool result, sends it back to LLM
  -> LLM Gateway again (tool result now in the next prompt)
    -> PreFlight catches any injection in tool results
    -> Full cycle repeats

Three layers, three integration points, three levels of visibility:

  1. LLM Gateway – sees everything the agent says to the LLM and everything the LLM says back
  2. MCP Proxy – sees every tool call and every tool result
  3. Existing permission system – gates whether the tool call is allowed at all

No single layer is sufficient. Together they cover the full attack surface: the agent’s conversation with the model, the agent’s interaction with enterprise tools, and the authorization framework that determines what’s permitted.

The answer to “does HTTPS prevent logging?” is no – not if you control the infrastructure. The question is which technique gives you the right balance of coverage, effort, and operational complexity for your deployment.
