From Probabilistic Reasoning to Deterministic Execution: The Rise of JIT MCP Tools

By James Aspinwall – February 26, 2026, 12:30

The “Agentic Era” is currently defined by a struggle between two worlds: the probabilistic world of Large Language Models (LLMs) and the deterministic world of traditional software. When we ask an agent to perform a complex, multi-step task using granular tools, we are essentially asking it to maintain a perfect mental model of a state machine over thousands of tokens. This is where most agents fail.

The solution is not more reasoning, but better engineering. We are seeing the emergence of a new pattern: Just-in-Time (JIT) MCP Tool Generation.


The Core Concept: Logic Translation

In this paradigm, the agent acts as a compiler. Instead of executing logic step-by-step through a series of “chat-and-call” cycles, the agent translates its intent into high-level code (Elixir, Python, or Go), compiles it, and registers it as a new tool on an MCP (Model Context Protocol) server.
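To make the pattern concrete, here is a minimal Python sketch of the JIT registration loop. The `ToolRegistry` class is an illustrative stand-in for a real MCP server's tool table, not the MCP SDK's actual API; the point is only the shape of the flow: generate source once, compile and register it, then invoke it by name thereafter.

```python
# Minimal sketch of JIT tool registration. ToolRegistry is a stand-in
# for a real MCP server's tool table, not an actual MCP SDK class.

from typing import Callable, Dict


class ToolRegistry:
    """Holds compiled, agent-generated tools, keyed by name."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable] = {}

    def register_source(self, name: str, source: str) -> None:
        # Compile the agent-generated source and keep the resulting
        # callable so later requests can invoke it deterministically.
        namespace: dict = {}
        exec(compile(source, f"<jit:{name}>", "exec"), namespace)
        self._tools[name] = namespace[name]

    def call(self, name: str, *args, **kwargs):
        return self._tools[name](*args, **kwargs)


# The "agent" emits code once...
generated = """
def double_all(xs):
    return [x * 2 for x in xs]
"""

registry = ToolRegistry()
registry.register_source("double_all", generated)

# ...and from then on the server executes it like any other tool.
result = registry.call("double_all", [1, 2, 3])
```

The key design point is that after registration the model is out of the loop: every subsequent invocation is ordinary function dispatch on the server.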

The “Email Campaign” Paradigm Shift

Consider the manual approach to an email promotion:

  1. Agent: “List all clients.” (Returns 10,000 rows)
  2. Agent: (Processes rows in context… runs out of space or loses focus)
  3. Agent: “Send email to Client A.”
  4. Agent: “Log result for Client A.”
  5. … (Repeat 9,999 times)

This is not just expensive; it’s fragile. If the connection drops at call 450, the agent has to “remember” where it was.

The JIT Approach: The agent writes a single Elixir module, PromotionOrchestrator, exposing a function run_campaign/2 that fetches the client list, sends each email, logs every result, and retries transient failures, all in one server-side pass.

The agent registers this tool, calls it once, and receives the result. The logic has been “offloaded” from the model’s reasoning engine to the server’s execution engine.
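The article's orchestrator is an Elixir module; a Python sketch of the same shape is below. `send_email`, the client list, and the retry policy are all hypothetical stand-ins, but they show why a dropped connection mid-campaign is handled by code rather than by the model's memory.

```python
# Hypothetical JIT-generated orchestrator: the entire campaign runs as
# ONE tool call instead of thousands of chat-and-call cycles.
# send_email and the retry policy are illustrative, not a real API.

from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class CampaignResult:
    sent: List[str] = field(default_factory=list)
    failed: List[str] = field(default_factory=list)


def run_campaign(clients: List[str],
                 send_email: Callable[[str], None],
                 max_retries: int = 2) -> CampaignResult:
    """Send to every client, log outcomes, retry transient failures."""
    result = CampaignResult()
    for client in clients:
        for attempt in range(max_retries + 1):
            try:
                send_email(client)
                result.sent.append(client)
                break
            except ConnectionError:
                if attempt == max_retries:
                    # Retries exhausted: record and move on, so one bad
                    # address cannot stall the remaining clients.
                    result.failed.append(client)
    return result
```

A failure at "call 450" is no longer something the agent must remember; it is a branch in deterministic control flow, and the single returned `CampaignResult` is the only thing the model ever sees.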


Beyond Email: Advanced Examples

The utility of JIT tools extends far beyond batch processing: the same pattern applies wherever a multi-step workflow can be expressed as a single, reusable function.


Why Now? The Frontier Model “Phase Change”

This approach was impossible until recently. Previous generations of LLMs could write code, but they couldn’t write reliable systems. Frontier models like Claude 3.5 Sonnet and GPT-4o have reached a level of coding proficiency that allows for:

  1. Idiomatic Correctness: They understand the nuances of the language they are writing in (e.g., Elixir’s pattern matching or Python’s list comprehensions).
  2. Robust Error Handling: They can anticipate and write try/catch or case statements for common failure modes without being prompted.
  3. Modular Design: They can structure code that follows existing project conventions, making the generated tools easier for humans to audit.

Our research shows that these models can now generate “first-pass” code that compiles and runs correctly over 90% of the time for scoped tasks.
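A "first-pass" success rate above 90% still leaves failures, so a server accepting JIT tools would typically gate registration behind a compile check. The sketch below is an assumed flow, not prescribed MCP behavior: source that does not compile is rejected before it can ever be registered.

```python
# Hypothetical pre-registration gate: reject agent-generated source
# that does not compile, before it reaches the tool registry.

def validate_source(name: str, source: str) -> bool:
    """Return True if the generated source compiles cleanly."""
    try:
        compile(source, f"<jit:{name}>", "exec")
        return True
    except SyntaxError:
        return False
```

In practice a gate like this is the cheap first layer; type checks, sandboxed test runs, and human audit of the generated module can sit behind it.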


The Advantages: Speed, Safety, and Cost

The gains compound: one tool call replaces thousands of round-trips (speed), the control flow runs as deterministic code rather than being tracked probabilistically in the model's context (safety), and the tokens spent reasoning through each step are spent once, at generation time (cost).

Conclusion

We are moving away from agents that “try their best” to agents that “build the best.” By allowing agents to expand their own toolsets through dynamic code generation and MCP integration, we are creating a more robust, efficient, and deterministic future for autonomous software engineering.