MCP and Claude SDK: How Compound Tools Eliminate Round Trips

The Round Trip Problem

Every time an AI agent needs to do something in the real world — read a database, call an API, check a file — it makes a tool call. The LLM generates a request, the client sends it to the server, the server executes it, the result comes back, the LLM reads the result, decides what to do next, and makes another tool call.

For a simple task like “get the weather in San Francisco,” that’s one round trip. Fine.

But real work isn’t one step. Consider “find all overdue invoices, email each customer a reminder, and log the outreach in the CRM.” That’s:

  1. Call list_invoices with a filter → wait for result
  2. LLM reads the list, decides to email customer #1
  3. Call send_email for customer #1 → wait for result
  4. Call log_crm_interaction for customer #1 → wait for result
  5. LLM decides to email customer #2
  6. Repeat steps 3-4 for each customer
  7. …20 customers later, you’ve made 41 tool calls

Each round trip costs time and tokens. The LLM re-reads the entire conversation history on every turn. For 20 customers, that’s 41 tool calls and 42 LLM inference calls (one to emit each tool call, plus a final turn to summarize), each one paying the full context window cost. A task that should take seconds takes minutes and burns through your API budget.

This is the fundamental inefficiency of the standard MCP tool-use loop.

How MCP Works Today

The Model Context Protocol (MCP) defines a clean interface between AI agents and external tools. The current spec (2025-11-25) works like this:

Discovery — The client asks the server tools/list and gets back a catalog of available tools, each with a name, description, and JSON Schema for inputs and outputs.

Invocation — The LLM decides to call a tool. The client sends tools/call with the tool name and arguments. The server executes it and returns the result.

Loop — The LLM reads the result, decides what to do next, and potentially calls another tool. This repeats until the task is done.

LLM → Client → Server: tools/call("list_invoices", {status: "overdue"})
Server → Client → LLM: [invoice_1, invoice_2, ..., invoice_20]
LLM → Client → Server: tools/call("send_email", {to: "customer_1@..."})
Server → Client → LLM: {sent: true}
LLM → Client → Server: tools/call("log_interaction", {contact: "customer_1"})
Server → Client → LLM: {logged: true}
... repeat 19 more times ...

Each arrow is a network hop. Each LLM → is a full inference call. The protocol is correct, but chatty.
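Concretely, each hop is a JSON-RPC 2.0 request/response pair. A sketch of the first one as Python dicts, following the tools/call message shape from the MCP spec (field values are illustrative):

```python
# One tools/call round trip, expressed as the JSON-RPC 2.0 messages
# the client and server exchange (values illustrative).
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "list_invoices",
        "arguments": {"status": "overdue"},
    },
}

response = {
    "jsonrpc": "2.0",
    "id": 1,  # matches the request id
    "result": {
        "content": [
            {"type": "text", "text": '["invoice_1", "invoice_2"]'}
        ],
        "isError": False,
    },
}
```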

The Compound Tool Pattern

The solution isn’t a protocol change — it’s a server-side design pattern. Instead of exposing granular tools that the LLM orchestrates one at a time, you expose compound tools that execute multi-step workflows server-side in a single call.

Before: Granular Tools

[
  {"name": "list_invoices", "description": "List invoices by status"},
  {"name": "send_email", "description": "Send an email to a contact"},
  {"name": "log_interaction", "description": "Log a CRM interaction"}
]

The LLM has to orchestrate all three, calling each one in sequence, for each customer.

After: Compound Tool

[
  {
    "name": "process_overdue_invoices",
    "description": "Find all overdue invoices, email each customer a reminder using the standard template, and log the outreach in the CRM. Returns a summary of actions taken.",
    "inputSchema": {
      "type": "object",
      "properties": {
        "template": {"type": "string", "description": "Email template name"},
        "dry_run": {"type": "boolean", "description": "If true, preview actions without executing"}
      }
    }
  }
]

One tool call. The server handles the loop internally — queries the database, iterates over results, sends emails, logs interactions. Returns a summary. The LLM makes one inference call instead of 41.

What the Server Does Internally

def process_overdue_invoices(template, dry_run) do
  invoices = Invoice.list(status: "overdue")

  results = Enum.map(invoices, fn invoice ->
    contact = Contact.get(invoice.contact_id)

    email_result = if dry_run do
      {:preview, Email.render(template, contact, invoice)}
    else
      Email.send(template, contact, invoice)
    end

    crm_result = unless dry_run do
      CRM.log_interaction(contact, "sent_reminder", invoice)
    end

    %{contact: contact.name, invoice: invoice.id,
      email: email_result, crm: crm_result}
  end)

  %{processed: length(results), details: results}
end

The logic that the LLM would have orchestrated across 41 round trips now executes in a single server-side function. The LLM gets back a structured summary and can reason about the results.

The Claude SDK Agent Loop

The Claude Agent SDK (and the Claude API’s tool use feature) implements the agentic loop that drives this interaction:

  1. Send a message to Claude with available tools
  2. Claude responds with a tool_use content block
  3. Your code executes the tool and returns the result
  4. Claude reads the result and either calls another tool or produces a final response
  5. Repeat until stop_reason is end_turn

A minimal version of this loop in Python:

import anthropic
import json

client = anthropic.Anthropic()

tools = [
    {
        "name": "process_overdue_invoices",
        "description": "Find overdue invoices, email reminders, log in CRM",
        "input_schema": {
            "type": "object",
            "properties": {
                "template": {"type": "string"},
                "dry_run": {"type": "boolean"}
            }
        }
    }
]

messages = [{"role": "user", "content": "Process all overdue invoice reminders"}]

# Agent loop
while True:
    response = client.messages.create(
        model="claude-opus-4-6",
        max_tokens=4096,
        tools=tools,
        messages=messages
    )

    if response.stop_reason == "end_turn":
        break

    # Record the assistant turn once, then execute every tool call
    # it contains and return all results in a single user message
    messages.append({"role": "assistant", "content": response.content})
    tool_results = []
    for block in response.content:
        if block.type == "tool_use":
            result = execute_mcp_tool(block.name, block.input)
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": json.dumps(result)
            })
    messages.append({"role": "user", "content": tool_results})

With compound tools, this loop typically runs 1-2 iterations instead of 40+. Claude calls process_overdue_invoices once, gets the summary, and produces its final response.
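The execute_mcp_tool helper in the loop above is left undefined. A minimal sketch, assuming tool handlers are registered as plain Python functions (the handler and its return value are illustrative stubs; a real client would forward tools/call to an MCP server):

```python
# Minimal tool dispatcher: maps tool names to local handler functions.

def process_overdue_invoices(template="standard", dry_run=False):
    # Stub standing in for the real server-side workflow.
    return {"processed": 20, "dry_run": dry_run, "template": template}

TOOL_HANDLERS = {
    "process_overdue_invoices": process_overdue_invoices,
}

def execute_mcp_tool(name, tool_input):
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        return {"error": f"unknown tool: {name}"}
    try:
        return handler(**tool_input)
    except TypeError as exc:  # bad or missing arguments
        return {"error": str(exc)}
```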

Parallel Tool Use

Claude can also call multiple tools simultaneously in a single response. If the LLM determines that two tool calls are independent, it emits both in the same turn:

{
  "content": [
    {"type": "tool_use", "name": "get_sales_report", "input": {"quarter": "Q1"}},
    {"type": "tool_use", "name": "get_support_tickets", "input": {"status": "open"}}
  ]
}

The client executes both in parallel and returns both results. Two tools, one round trip. Combined with compound tools, this further reduces the total number of LLM inference calls.
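On the client side, independent tool_use blocks from one response can be run concurrently. A sketch using a thread pool, with blocks simplified to dicts (the execute callable is whatever dispatcher your client uses):

```python
from concurrent.futures import ThreadPoolExecutor

def run_tool_blocks(blocks, execute):
    """Execute independent tool_use blocks in parallel and pair each
    result with its tool_use_id for the tool_result message."""
    with ThreadPoolExecutor() as pool:
        futures = [
            (block["id"], pool.submit(execute, block["name"], block["input"]))
            for block in blocks
        ]
        # result() blocks until each call finishes; order is preserved
        return [
            {"type": "tool_result", "tool_use_id": bid, "content": f.result()}
            for bid, f in futures
        ]
```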

Structured Outputs and Output Schemas

The latest MCP spec adds outputSchema to tool definitions. This tells the LLM exactly what shape the result will be, so it can plan its next action without guessing:

{
  "name": "process_overdue_invoices",
  "outputSchema": {
    "type": "object",
    "properties": {
      "processed": {"type": "integer"},
      "failed": {"type": "integer"},
      "details": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "contact": {"type": "string"},
            "invoice_id": {"type": "integer"},
            "email_sent": {"type": "boolean"},
            "crm_logged": {"type": "boolean"}
          }
        }
      }
    }
  }
}

With strict tool use (strict: true in the Claude API), the LLM’s tool call inputs are guaranteed to match the input schema exactly. No type mismatches, no missing fields. This eliminates retry round trips caused by malformed inputs.
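A client can also use outputSchema to sanity-check results before handing them back to the model. A minimal checker for the object/array/scalar subset used in the example above (a sketch, not a full JSON Schema validator):

```python
# Minimal JSON Schema subset checker: enough for the object, array,
# and scalar shapes in the outputSchema example above.

def conforms(value, schema):
    t = schema.get("type")
    if t == "object":
        return isinstance(value, dict) and all(
            key not in value or conforms(value[key], sub)
            for key, sub in schema.get("properties", {}).items()
        )
    if t == "array":
        item_schema = schema.get("items", {})
        return isinstance(value, list) and all(
            conforms(item, item_schema) for item in value
        )
    if t == "integer":
        # bool is a subclass of int in Python; exclude it explicitly
        return isinstance(value, int) and not isinstance(value, bool)
    if t == "boolean":
        return isinstance(value, bool)
    if t == "string":
        return isinstance(value, str)
    return True  # untyped or unsupported: accept
```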

Design Guidelines for Compound Tools

When to Compound

Compound a workflow when the steps are predictable and the LLM adds no judgment between them: a fixed sequence (query, iterate, act, log), a loop over many similar items, or any batch whose outcome can be reported back as a summary.

When NOT to Compound

Keep tools granular when the LLM needs to make a decision between steps, when a human should approve individual actions, or when the task is exploratory and the next step depends on reasoning about each intermediate result.

The Hybrid Approach

Expose both granular and compound tools. The LLM can use process_overdue_invoices for the common case and fall back to individual send_email + log_interaction calls when it needs fine-grained control.

[
  {"name": "process_overdue_invoices", "description": "Batch: find, email, and log all overdue invoices"},
  {"name": "list_invoices", "description": "List invoices by status filter"},
  {"name": "send_email", "description": "Send a single email"},
  {"name": "log_interaction", "description": "Log a single CRM interaction"}
]

Claude is smart enough to pick the compound tool when the task matches and the granular tools when it needs flexibility.

The Dry Run Pattern

Compound tools should support a dry_run parameter. This lets the LLM preview what will happen before committing:

  1. LLM calls process_overdue_invoices(dry_run: true) — one round trip
  2. Server returns a preview: “Would email 20 customers, log 20 interactions”
  3. LLM presents the preview to the user for approval
  4. User approves
  5. LLM calls process_overdue_invoices(dry_run: false) — one round trip

Two round trips total, with human approval in the middle. Without the compound tool pattern, the dry run alone would be 21 round trips.
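The flow above can be sketched as one compound handler with a dry_run switch. The invoice data and side effects are stubbed here; send_email and log_interaction are only indicated in comments:

```python
def process_overdue_invoices(invoices, dry_run=True):
    """Preview (dry_run=True) or execute (dry_run=False) the batch.
    `invoices` stands in for the database query."""
    actions = [
        {"invoice": inv["id"], "email_to": inv["contact"]}
        for inv in invoices
    ]
    if dry_run:
        return {
            "dry_run": True,
            "summary": (f"Would email {len(actions)} customers, "
                        f"log {len(actions)} interactions"),
        }
    processed = 0
    for action in actions:
        # send_email(action["email_to"], ...) and
        # log_interaction(...) would run here
        processed += 1
    return {"dry_run": False, "processed": processed}
```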

How The Orchestrator Fits

The Orchestrator is purpose-built for this pattern: MCP tools registered in The Orchestrator run their workflows in its server-side Elixir runtime. That runtime is a natural fit for compound tools, with concurrent processing via Task.async_stream, fault tolerance via supervisors, and the BEAM’s ability to handle thousands of concurrent operations without breaking a sweat.

The Numbers

For the overdue invoice example with 20 customers:

Approach             LLM Calls   Tool Calls   Tokens (est.)   Time (est.)
Granular tools       42          41           ~200K           ~3 min
Compound tool        2           1            ~5K             ~5 sec
Compound + dry run   3           2            ~8K             ~8 sec

The compound tool approach uses over 97% fewer tokens and completes roughly 36x faster. The savings scale linearly: 200 customers would mean 401 tool calls and 402 LLM inference calls with granular tools, versus 1 and 2 with the compound tool.
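The linear scaling follows directly from the loop structure. A quick sanity check, assuming one listing call plus two tool calls per customer, and one inference per tool result plus a final summary turn:

```python
def granular_calls(n_customers):
    """Tool and inference call counts for the granular design:
    1 list call + 2 calls per customer; one inference to emit each
    tool call, plus a final turn to summarize."""
    tool_calls = 1 + 2 * n_customers
    llm_calls = tool_calls + 1
    return tool_calls, llm_calls

# The compound design is constant regardless of batch size:
COMPOUND_TOOL_CALLS, COMPOUND_LLM_CALLS = 1, 2
```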

Summary

MCP gives AI agents a standard way to call tools. The Claude SDK gives agents a loop to orchestrate those calls. But the real efficiency gain comes from tool design, not protocol features.

Compound tools move the orchestration logic from the LLM (expensive, slow, token-hungry) to the server (fast, cheap, deterministic). The LLM focuses on what it’s good at — understanding intent, making judgment calls, communicating with users. The server handles what it’s good at — iterating over data, calling APIs, maintaining consistency.

Design your MCP tools for the workflow, not for the individual operation. One smart tool call beats forty dumb ones.