Building a Workflow Engine Into an MCP Server

Most AI agent platforms treat workflows as a product feature – something you configure in a GUI by dragging steps into place and clicking deploy. We’re doing it differently. The Orchestrator embeds a workflow engine directly into the MCP server, which means every workflow step is a permissioned tool call, every transition is logged to the audit trail, and run state survives process crashes because it’s persisted in SQLite.

Here’s how it’s designed.

What a Workflow Is

A workflow is a named graph of steps. Most steps are MCP tool calls with arguments; approval gates are the exception. Steps can transition in several ways:

Sequential – step A completes, step B starts
Branching – if the result of step A meets a condition, go to B, otherwise go to C
Parallel – fan out to steps B, C, and D simultaneously, then join when all complete
Approval gate – pause the workflow and wait for a human to approve before continuing
Retry – if a step fails, back off and try again up to N times

The step graph lives in a workflow_definitions table as a JSON blob. A definition is immutable after creation – changing a workflow creates a new version. This keeps audit history clean, because every run points at the exact version it executed.
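To make the step graph concrete, here is a minimal sketch of what a definition might look like once decoded from JSON into an Elixir map. Every field name (start, steps, branch, approval, retry, next) and every tool name is an assumption made for illustration, not the engine’s actual schema. The helper shows one check an immutable definition makes cheap to run once at creation time: that every transition points at a step that exists.

```elixir
# Hypothetical step graph; field and tool names are assumptions, not the real schema.
defmodule DefinitionSketch do
  def refund_workflow do
    %{
      "name" => "refund_order",
      "start" => "lookup_order",
      "steps" => %{
        "lookup_order" => %{
          "tool" => "crm_get_order",
          "args" => %{"order_id" => "{{input.order_id}}"},
          "branch" => %{
            "if" => "result != nil",
            "then" => "approve_refund",
            "else" => "notify_missing"
          }
        },
        "approve_refund" => %{
          "approval" => %{"notify_user" => "ops_lead", "timeout_s" => 3600},
          "next" => "issue_refund"
        },
        "issue_refund" => %{
          "tool" => "billing_refund",
          "args" => %{"order_id" => "{{input.order_id}}"},
          "retry" => %{"max_attempts" => 3},
          "next" => nil
        },
        "notify_missing" => %{
          "tool" => "send_message",
          "args" => %{"text" => "Order not found"},
          "next" => nil
        }
      }
    }
  end

  # Returns every transition target that does not name a defined step.
  def dangling_targets(%{"start" => start, "steps" => steps}) do
    targets =
      Enum.flat_map(steps, fn {_name, spec} ->
        branch = Map.get(spec, "branch", %{})
        [spec["next"], branch["then"], branch["else"]] ++ Map.get(spec, "parallel", [])
      end)

    [start | targets]
    |> Enum.reject(&is_nil/1)
    |> Enum.uniq()
    |> Enum.reject(&Map.has_key?(steps, &1))
  end
end
```

An empty list from dangling_targets/1 means the graph is closed; anything else names a step that a transition refers to but nobody defined.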

The Execution Model

Four SQLite tables back the engine:

workflow_definitions – the step graph, versioned and soft-deleted. Each step spec describes the tool to call, the args (which can include {{input.key}} templates resolved at runtime), the success transition, the error policy, and any approval or parallel config.

workflow_runs – one row per execution. Tracks status (running, paused, completed, failed, cancelled), the current step, the initial input, and the final output.

workflow_step_runs – one row per step execution, with an attempt counter for retries. Stores the exact input and output of each tool call as a JSON snapshot.

workflow_events – an append-only log of every state transition: step started, step completed, approval requested, approval granted, run failed. Drives real-time observability via SSE and WebSocket.
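The {{input.key}} templates mentioned for step args can be resolved with a small recursive function. The sketch below is one possible approach, not the engine’s resolver: the module name, the regex, and the choice to raise on a missing key are all assumptions.

```elixir
defmodule TemplateSketch do
  # Matches {{input.some_key}}, allowing spaces inside the braces.
  defp pattern, do: ~r/\{\{\s*input\.([A-Za-z0-9_]+)\s*\}\}/

  # Recursively resolve templates inside maps, lists and strings.
  def resolve(args, input) when is_map(args),
    do: Map.new(args, fn {k, v} -> {k, resolve(v, input)} end)

  def resolve(args, input) when is_list(args),
    do: Enum.map(args, &resolve(&1, input))

  def resolve(arg, input) when is_binary(arg) do
    case Regex.run(pattern(), arg) do
      # The whole string is one template: keep the value's original type.
      [^arg, key] ->
        Map.fetch!(input, key)

      # Otherwise interpolate each template as text (a no-op if none match).
      _ ->
        Regex.replace(pattern(), arg, fn _match, key ->
          to_string(Map.fetch!(input, key))
        end)
    end
  end

  # Numbers, booleans and nil pass through unchanged.
  def resolve(arg, _input), do: arg
end
```

Map.fetch!/2 raises on an unknown key, so in this sketch a typo in a template fails the step loudly rather than sending an empty argument to a tool.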

The Execution Loop

WorkflowServer is a registered GenServer that owns its own SQLite connection. The execution loop is driven by a single cast message: {:execute_step, run_id, step_name, attempt}.

When that message arrives:

Load the run and its definition from DB
Find the step spec
Record the step start in workflow_step_runs
If it’s an approval gate, pause the run and notify the approver (more on this below)
If it’s a parallel fan-out, send one :execute_step message per branch to self
Otherwise, call the tool via Workflow.Executor.run/3
On success: record the result, resolve the next step, send the next :execute_step
On error: check the retry policy, schedule a retry via Alarm if attempts remain, otherwise fail the run
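The decision points in the steps above can be sketched as pure functions, leaving out the database writes and the GenServer plumbing. The step-spec keys (approval, parallel, tool, next, retry) and the returned tuples are assumptions for illustration; only the {:execute_step, run_id, step_name, attempt} message shape comes from the design itself.

```elixir
defmodule LoopSketch do
  # Classify a step spec into the action taken before any tool call.
  def dispatch(%{"approval" => _config}), do: :pause_for_approval
  def dispatch(%{"parallel" => branches}) when is_list(branches), do: {:fan_out, branches}
  def dispatch(%{"tool" => tool}), do: {:call_tool, tool}

  # Decide what follows a tool call, given the spec, the attempt and the result.
  def after_call(run_id, spec, _attempt, {:ok, _result}) do
    case spec["next"] do
      nil -> {:complete_run, run_id}
      next -> {:execute_step, run_id, next, 1}
    end
  end

  def after_call(run_id, spec, attempt, {:error, reason}) do
    # Without a retry policy a step gets exactly one attempt.
    max_attempts = get_in(spec, ["retry", "max_attempts"]) || 1

    if attempt < max_attempts do
      {:schedule_retry, run_id, attempt + 1}
    else
      {:fail_run, run_id, reason}
    end
  end
end
```

Keeping these decisions free of side effects makes them easy to test on their own; the server process would then only translate each returned tuple into a database write, a message to itself, or a scheduled retry.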

Retries go through Alarm rather than Process.send_after for crash survival. A Process.send_after timer lives only in the memory of the running VM, so a restart mid-backoff would lose the retry entirely. The Alarm is persisted in SQLite and still fires once WorkflowServer comes back up.
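What makes a persisted retry restart-safe is that it stores an absolute fire time plus the message to deliver. The sketch below shows that idea only. It does not use the real Alarm API, which is not described here; the row shape, the function names, and the exponential backoff policy with a cap are all assumptions.

```elixir
defmodule RetrySketch do
  # Assumed policy: 1s, 2s, 4s, ... doubling per failed attempt, capped at max_ms.
  def backoff_ms(attempt, base_ms \\ 1_000, max_ms \\ 60_000) when attempt >= 1 do
    min(base_ms * Integer.pow(2, attempt - 1), max_ms)
  end

  # Build the row a persistent scheduler could store. The absolute fire time
  # is what survives a restart; an in-memory timer leaves nothing to reload.
  def alarm_row(run_id, step_name, failed_attempt, now_ms) do
    %{
      fire_at_ms: now_ms + backoff_ms(failed_attempt),
      message: {:execute_step, run_id, step_name, failed_attempt + 1}
    }
  end

  # After a restart, every row whose time has already passed is due at once.
  def due(rows, now_ms), do: Enum.filter(rows, &(&1.fire_at_ms <= now_ms))
end
```

If the process is down when fire_at_ms passes, the row is simply overdue at the next startup and gets delivered then, which is the behaviour a Process.send_after timer cannot offer.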

On startup, WorkflowServer.init/1 queries for all runs with status = 'running' and immediately sends itself an :execute_step message for each one’s current step. Crash recovery is automatic.
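The startup scan reduces to turning persisted run rows into messages. A minimal sketch, assuming rows arrive as maps with id, status, current_step and an optional attempt field (the row shape and the module name are assumptions):

```elixir
defmodule RecoverySketch do
  # Build the messages init/1 would send to itself. Rows that are not
  # "running" fail the pattern match and are skipped by the comprehension.
  def resume_messages(runs) do
    for %{status: "running", id: id, current_step: step} = run <- runs do
      {:execute_step, id, step, Map.get(run, :attempt, 1)}
    end
  end
end
```

One consequence of resuming this way is that a step interrupted mid-call may execute a second time after a restart, so tools invoked from workflows are safest when they tolerate being repeated.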

Approval Gates

An approval gate is a step with no tool – just a notify_user and an optional timeout_s. When the engine hits one:

Run status flips to paused
A Pushover push notification goes to the approver’s phone with run and step details
A WebSocket RPC fires to the approver’s connected browser
An Alarm is scheduled for the timeout (auto-deny if no response)

The approver calls workflow_approve(run_id, step_name) or workflow_deny(...) via MCP tool, REST, or the web UI. On approval, the run resumes from the next step. On denial, the run is cancelled and the creator is notified.

This is the mechanism behind “an agent can read from CRM but cannot write without human confirmation.” You define a workflow where the write step is preceded by an approval gate. The agent starts the workflow, execution pauses, you get a notification, you approve or reject, done. The agent never gets to bypass it – the gate is enforced at the server level, not in the agent’s prompt.
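The gate’s behaviour can be written as a small state transition. The sketch below assumes a run is a map with id, status, current_step, next_step and started_by fields, and that a timeout is handled like a denial; those names and the returned tuples are illustrative, not the engine’s actual shapes.

```elixir
defmodule ApprovalSketch do
  # Only a paused run that is waiting on this exact step can be resolved.
  # Using `step` twice in the head requires both values to be equal.
  def resolve(%{status: "paused", current_step: step} = run, step, decision) do
    case decision do
      :approve ->
        {:ok, %{run | status: "running"}, {:execute_step, run.id, run.next_step, 1}}

      denied when denied in [:deny, :timeout] ->
        {:ok, %{run | status: "cancelled"}, {:notify, run.started_by, denied}}
    end
  end

  # Any other state (wrong step, run not paused) is rejected.
  def resolve(_run, _step, _decision), do: {:error, :not_awaiting_approval}
end
```

Because the check lives in the function head, an approve call for a run that is not paused on that step is rejected on the server, whatever the caller claims.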

Permission Model

Workflow.Executor calls tools through MyMCPServer.Manager.call_tool_as(tool, args, user) – the same code path as a direct MCP call from a client. This means:

The workflow runs as the user who started it
Every step is gated by that user’s permissions
A workflow cannot call a tool that the user who started it doesn’t have access to
No privilege escalation is possible by wrapping a restricted action in a workflow

Workflows get their own permission key (workflow: 130_001) so access to define and run workflows can be granted independently of access to any specific tool.
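The reason wrapping a restricted tool in a workflow gains nothing is that both paths end in the same permission check. The sketch below is a hypothetical stand-in for that shared call path, not MyMCPServer.Manager itself: the registry map, the permissions set on the user, and the error atoms are assumptions.

```elixir
defmodule PermissionSketch do
  # Stand-in for the shared call path: look up the tool, check, then invoke.
  def call_tool_as(tool, args, user, registry) do
    with {:ok, %{permission: key, fun: fun}} <- Map.fetch(registry, tool),
         true <- MapSet.member?(user.permissions, key) do
      {:ok, fun.(args)}
    else
      :error -> {:error, :unknown_tool}
      false -> {:error, :forbidden}
    end
  end

  # A workflow step goes through the same function, so it inherits the check.
  def run_step(%{"tool" => tool, "args" => args}, user, registry),
    do: call_tool_as(tool, args, user, registry)
end
```

In this sketch a user who holds the workflow permission but not the write permission can start the run, yet the write step still returns {:error, :forbidden}.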

Parallel Fan-Out and Join

When a step has a parallel list, the engine fans out to all named steps simultaneously by sending one :execute_step message per branch to itself. A join tracker lives in WorkflowServer state:

parallel_joins: %{
  run_id => %{waiting: MapSet of step names, results: %{step_name => result}}
}

When each branch completes, it removes itself from waiting and adds its result to results. When waiting is empty, all results are merged into the run context and the join step fires.

If the server crashes during fan-out, the join tracker is rebuilt on restart by querying workflow_step_runs for the run and checking which parallel steps have already completed.
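A minimal sketch of the join tracker for a single run, using the waiting/results shape described above. The function names and the shape of the step-run rows used for the rebuild (step, status, output) are assumptions.

```elixir
defmodule JoinSketch do
  # A fresh tracker waits on every branch and has no results yet.
  def start(branches), do: %{waiting: MapSet.new(branches), results: %{}}

  # Record one finished branch; report :joined once nothing is left to wait for.
  def branch_done(%{waiting: waiting, results: results}, step, result) do
    tracker = %{
      waiting: MapSet.delete(waiting, step),
      results: Map.put(results, step, result)
    }

    if MapSet.size(tracker.waiting) == 0 do
      {:joined, tracker.results}
    else
      {:waiting, tracker}
    end
  end

  # Rebuild after a crash: replay completed branch rows into a fresh tracker.
  def rebuild(branches, step_rows) do
    Enum.reduce(step_rows, start(branches), fn
      %{step: step, status: "completed", output: output}, acc ->
        if step in branches do
          %{acc | waiting: MapSet.delete(acc.waiting, step),
                  results: Map.put(acc.results, step, output)}
        else
          acc
        end

      _unfinished_row, acc ->
        acc
    end)
  end
end
```

Because the tracker is derived entirely from completed step rows, holding it in process memory is safe: the durable record stays in SQLite and the map is only a cache of it.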

What Condition Evaluation Looks Like

Branching steps evaluate a condition string against the previous step’s result. The condition language is intentionally minimal:

"result == true"
"result != nil"
"result == \"active\""

These are matched against a known set of patterns via Elixir function heads – not Code.eval_string. Evaluating a string from a database row as code would let anyone who can define a workflow run arbitrary code on the server. If richer conditions are needed later, the pattern set can be extended safely.
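Matching conditions with function heads might look like the sketch below. The module name, the exact set of supported strings, and the {:ok, boolean} / {:error, reason} return shape are assumptions; the point is that an unrecognised condition falls through to a rejection rather than being executed.

```elixir
defmodule ConditionSketch do
  # Each supported condition is a literal function head.
  def eval("result == true", result), do: {:ok, result == true}
  def eval("result == false", result), do: {:ok, result == false}
  def eval("result != nil", result), do: {:ok, result != nil}
  def eval("result == nil", result), do: {:ok, result == nil}

  # String-literal comparison: result == "some text"
  def eval("result == \"" <> rest, result) do
    case String.split_at(rest, -1) do
      {literal, "\""} -> {:ok, result == literal}
      _ -> {:error, :unsupported_condition}
    end
  end

  # Anything else is rejected, never evaluated.
  def eval(_other, _result), do: {:error, :unsupported_condition}
end
```

Adding a new operator means adding a new head, so the set of things a stored condition can do is always visible in one module.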

MCP Tools

Nine tools expose the workflow engine:

Tool	Purpose
`workflow_define`	Create a new workflow definition from a step graph
`workflow_list_definitions`	List available workflows
`workflow_start`	Start a run with an input map
`workflow_status`	Full status of a run including step history
`workflow_list_runs`	List runs filtered by status
`workflow_approve`	Approve a paused approval gate
`workflow_deny`	Deny a paused approval gate
`workflow_cancel`	Cancel a running or paused workflow
`workflow_events`	Get the event log for a run

These follow the same permission wrapper pattern as every other module. An agent or human with the workflow permission can call them from any access surface – MCP, REST, or web UI.

What This Enables

The practical value is that multi-step agent tasks become durable, observable, and governable. Instead of an agent holding its plan in a context window (which disappears when the session ends), the workflow is persisted. Instead of guessing what an agent did, you have a step-by-step event log. Instead of hoping an agent won’t write to production without asking, you have an approval gate that stops it at the server level.

It’s not a replacement for agent intelligence. It’s the infrastructure that makes agent actions trustworthy enough to run unsupervised – except for the parts you’ve decided need a human in the loop.