From Probabilistic Reasoning to Deterministic Execution: The Rise of JIT MCP Tools

By James Aspinwall – February 26, 2026, 12:30

The “Agentic Era” is currently defined by a struggle between two worlds: the probabilistic world of Large Language Models (LLMs) and the deterministic world of traditional software. When we ask an agent to perform a complex, multi-step task using granular tools, we are essentially asking it to maintain a perfect mental model of a state machine over thousands of tokens. This is where most agents fail.

The solution is not more reasoning, but better engineering. We are seeing the emergence of a new pattern: Just-in-Time (JIT) MCP Tool Generation.


The Core Concept: Logic Translation

In this paradigm, the agent acts as a compiler. Instead of executing logic step-by-step through a series of “chat-and-call” cycles, the agent translates its intent into high-level code (Elixir, Python, or Go), compiles it, and registers it as a new tool on an MCP (Model Context Protocol) server.
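To make the pattern concrete, here is a minimal Python sketch of the JIT registration loop. The `ToolRegistry` class is an illustrative stand-in for a real MCP server's tool table, not the MCP SDK's actual API; the point is only the shape of the flow: generate source once, compile and register it, then invoke it by name thereafter.

```python
# Minimal sketch of JIT tool registration. ToolRegistry is a stand-in
# for a real MCP server's tool table, not an actual MCP SDK class.

from typing import Callable, Dict


class ToolRegistry:
    """Holds compiled, agent-generated tools, keyed by name."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable] = {}

    def register_source(self, name: str, source: str) -> None:
        # Compile the agent-generated source and keep the resulting
        # callable so later requests can invoke it deterministically.
        namespace: dict = {}
        exec(compile(source, f"<jit:{name}>", "exec"), namespace)
        self._tools[name] = namespace[name]

    def call(self, name: str, *args, **kwargs):
        return self._tools[name](*args, **kwargs)


# The "agent" emits code once...
generated = """
def double_all(xs):
    return [x * 2 for x in xs]
"""

registry = ToolRegistry()
registry.register_source("double_all", generated)

# ...and from then on the server executes it like any other tool.
result = registry.call("double_all", [1, 2, 3])
```

The key design point is that after registration the model is out of the loop: every subsequent invocation is ordinary function dispatch on the server.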

The “Email Campaign” Paradigm Shift

Consider the manual approach to an email promotion:

  1. Agent: “List all clients.” (Returns 10,000 rows)
  2. Agent: (Processes rows in context… runs out of space or loses focus)
  3. Agent: “Send email to Client A.”
  4. Agent: “Log result for Client A.”
  5. … (Repeat 9,999 times)

This is not just expensive; it’s fragile. If the connection drops at call 450, the agent has to “remember” where it was.

The JIT Approach: The agent writes a single Elixir module, PromotionOrchestrator, exposing a function run_campaign/2 that fetches the client list, sends each email, logs every result, and retries transient failures, all in one server-side pass.

The agent registers this tool, calls it once, and receives the result. The logic has been “offloaded” from the model’s reasoning engine to the server’s execution engine.
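The article's orchestrator is an Elixir module; a Python sketch of the same shape is below. `send_email`, the client list, and the retry policy are all hypothetical stand-ins, but they show why a dropped connection mid-campaign is handled by code rather than by the model's memory.

```python
# Hypothetical JIT-generated orchestrator: the entire campaign runs as
# ONE tool call instead of thousands of chat-and-call cycles.
# send_email and the retry policy are illustrative, not a real API.

from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class CampaignResult:
    sent: List[str] = field(default_factory=list)
    failed: List[str] = field(default_factory=list)


def run_campaign(clients: List[str],
                 send_email: Callable[[str], None],
                 max_retries: int = 2) -> CampaignResult:
    """Send to every client, log outcomes, retry transient failures."""
    result = CampaignResult()
    for client in clients:
        for attempt in range(max_retries + 1):
            try:
                send_email(client)
                result.sent.append(client)
                break
            except ConnectionError:
                if attempt == max_retries:
                    # Retries exhausted: record and move on, so one bad
                    # address cannot stall the remaining clients.
                    result.failed.append(client)
    return result
```

A failure at "call 450" is no longer something the agent must remember; it is a branch in deterministic control flow, and the single returned `CampaignResult` is the only thing the model ever sees.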


Beyond Email: Advanced Examples

The utility of JIT tools extends far beyond batch processing: the same pattern applies wherever a multi-step workflow can be expressed as a single, reusable function.


Why Now? The Frontier Model “Phase Change”

This approach was impossible until recently. Previous generations of LLMs could write code, but they couldn’t write reliable systems. Frontier models like Claude 3.5 Sonnet and GPT-4o have reached a level of coding proficiency that allows for:

  1. Idiomatic Correctness: They understand the nuances of the language they are writing in (e.g., Elixir’s pattern matching or Python’s list comprehensions).
  2. Robust Error Handling: They can anticipate and write try/catch or case statements for common failure modes without being prompted.
  3. Modular Design: They can structure code that follows existing project conventions, making the generated tools easier for humans to audit.

Our research shows that these models can now generate “first-pass” code that compiles and runs correctly over 90% of the time for scoped tasks.
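A "first-pass" success rate above 90% still leaves failures, so a server accepting JIT tools would typically gate registration behind a compile check. The sketch below is an assumed flow, not prescribed MCP behavior: source that does not compile is rejected before it can ever be registered.

```python
# Hypothetical pre-registration gate: reject agent-generated source
# that does not compile, before it reaches the tool registry.

def validate_source(name: str, source: str) -> bool:
    """Return True if the generated source compiles cleanly."""
    try:
        compile(source, f"<jit:{name}>", "exec")
        return True
    except SyntaxError:
        return False
```

In practice a gate like this is the cheap first layer; type checks, sandboxed test runs, and human audit of the generated module can sit behind it.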


The Advantages: Speed, Safety, and Cost

The gains compound: one tool call replaces thousands of round-trips (speed), the control flow runs as deterministic code rather than being tracked probabilistically in the model's context (safety), and the tokens spent reasoning through each step are spent once, at generation time (cost).

Conclusion

We are moving away from agents that “try their best” to agents that “build the best.” By allowing agents to expand their own toolsets through dynamic code generation and MCP integration, we are creating a more robust, efficient, and deterministic future for autonomous software engineering.