# ServerChat

Per-user chat GenServer that routes messages to a pluggable LLM provider with MCP tool integration.

---

## Table of Contents

1. [Overview](#overview)
2. [Features](#features)
3. [Providers](#providers)
4. [Configuration](#configuration)
5. [Usage](#usage)
6. [API Reference](#api-reference)
7. [Internals / Flow](#internals--flow)
8. [Database Schema](#database-schema)
9. [Troubleshooting](#troubleshooting)
10. [Related Documentation](#related-documentation)

---

## Overview

`ServerChat` is a GenServer that manages a conversation session for a single user. It holds the conversation history, the active LLM provider, formatted tools, and the API key. Chat messages are dispatched to a background Task so the GenServer remains responsive to other calls.

Each instance self-terminates after 30 minutes of inactivity. On every chat turn, tools and the system message are refreshed from the user's current permissions, so permission grants and revocations take effect immediately without requiring `/clear` or a restart.

Conversation history is trimmed to 50 messages. Leading orphaned tool results are dropped to prevent API validation errors.

## Features

| Feature | Description |
|---------|-------------|
| Multi-provider | Five LLM providers behind a unified API |
| Runtime switching | Change provider or model per-session without restart |
| MCP tool integration | Tools fetched from `MCPServer.Manager`, formatted per-provider |
| Permission-aware tools | Tools refreshed every chat turn from user's current permissions |
| Background dispatch | LLM calls run in a background Task; GenServer stays responsive |
| Busy guard | Concurrent chat calls rejected with `{:error, :chat_in_progress}` |
| Auto-terminate | 30-minute idle timeout -- process self-terminates |
| History trimming | Max 50 messages, orphaned tool results dropped from head |
| Chat logging | All user/assistant messages persisted to `:chat_db` SQLite |
| Dynamic system message | System prompt rebuilt on each turn reflecting current tool set |

## Providers

Five providers implement the `ServerChat.Provider` behaviour:

| Key | Module | API | Notes |
|-----|--------|-----|-------|
| `:anthropic` | `ServerChat.Anthropic` | Anthropic Messages `/v1/messages` | Native MCP format (passthrough) |
| `:openrouter` | `ServerChat.OpenRouter` | OpenAI-compatible `/v1/chat/completions` | Default provider |
| `:perplexity` | `ServerChat.Perplexity` | Perplexity Responses `/v1/responses` | Citations, preset vs model routing |
| `:gemini` | `ServerChat.Gemini` | Google Gemini REST API | API key auth |
| `:gemini_cli` | `ServerChat.GeminiCli` | Gemini CLI (Google One AI Pro) | $0 cost, file-based auth per user |

### Provider Behaviour

Each provider implements three callbacks defined in `ServerChat.Provider`:

- **`format_tools/1`** -- Convert MCP tool definitions to the provider's wire format
- **`call_llm/4`** -- Make the HTTP request to the provider's API
- **`handle_response/5`** -- Parse the response, execute tool calls if needed, return final text

### Perplexity Provider Details

The Perplexity provider uses the `/v1/responses` endpoint (not `/chat/completions`):

- **Presets vs Models**: Bare names like `"sonar"` are sent as `preset:`, while `"provider/model"` format like `"openai/gpt-5.2"` is sent as `model:`
- **Function calling**: Uses `function_call` / `function_call_output` items in the `input` array with `call_id` matching
- **Citations**: Extracted from `annotations` on `output_text` blocks and appended as numbered source references
- **Empty content filtering**: Messages with empty content are stripped from input to avoid API validation errors

## Configuration

### Compile-time (config.exs)

```elixir
config :mcp,
  llm_provider: :openrouter,
  openrouter_model: "x-ai/grok-4.1-fast"
```

### Runtime (environment variables)

```bash
# Override provider
export LLM_PROVIDER=openrouter

# Provider-specific keys and models
export ANTHROPIC_API_KEY=sk-...
export CLAUDE_MODEL=claude-sonnet-4-5-20250929

export OPENROUTER_API_KEY=sk-...
export OPENROUTER_MODEL=anthropic/claude-sonnet-4.5

export PERPLEXITY_API_KEY=pplx-...
export PERPLEXITY_MODEL=sonar-pro

export GEMINI_API_KEY=...
export GEMINI_MODEL=gemini-3.1-pro-preview
```

### Default model resolution order

1. Environment variable (e.g. `OPENROUTER_MODEL`)
2. Application config (e.g. `:openrouter_model`)
3. Hardcoded fallback

| Provider | Env var | Config key | Default |
|----------|---------|------------|---------|
| Anthropic | `CLAUDE_MODEL` | `:claude_model` | `claude-sonnet-4-5-20250929` |
| OpenRouter | `OPENROUTER_MODEL` | `:openrouter_model` | `x-ai/grok-4.1-fast` |
| Perplexity | `PERPLEXITY_MODEL` | `:perplexity_model` | `perplexity/sonar` |
| Gemini | `GEMINI_MODEL` | `:gemini_model` | `gemini-3.1-pro-preview` |
| GeminiCli | `GEMINI_MODEL` | `:gemini_model` | `gemini-3-flash-preview` |

## Usage

### Basic chat session

```elixir
{:ok, pid} = ServerChat.start_link(user_id: 1)
{:ok, reply} = ServerChat.chat(pid, "What is 5 + 10?")
# => {:ok, "5 + 10 = 15"}
```

### Switch provider mid-session

```elixir
# Clears history (message formats differ across providers)
:ok = ServerChat.switch_provider(pid, :perplexity)
:ok = ServerChat.switch_provider(pid, :openrouter, "anthropic/claude-sonnet-4.5")
```

### Override model without clearing history

```elixir
:ok = ServerChat.set_model(pid, "sonar-pro")
:ok = ServerChat.set_model(pid, nil)  # revert to default
```

### Check current state

```elixir
ServerChat.model()      # global default from config/env
ServerChat.model(pid)   # effective model for this instance
ServerChat.provider(pid) # => "openrouter"
ServerChat.history(pid)  # => [%{role: "user", content: "..."}, ...]
```

### Clean up

```elixir
:ok = ServerChat.clear(pid)   # clear history, keep provider
ServerChat.stop(pid)           # terminate the process
```

## API Reference

### `start_link/1`

Start a new chat GenServer for the given user.

**Parameters:**
- `opts` (keyword) -- required `:user_id` (integer)

**Returns:** `{:ok, pid}`

```elixir
{:ok, pid} = ServerChat.start_link(user_id: 42)
```

### `chat/2`

Send a chat message and wait for the LLM reply. Blocks until the provider responds (uses `:infinity` timeout). Returns `{:error, :chat_in_progress}` if another message is already being processed.

**Parameters:**
- `pid` (pid) -- the ServerChat process
- `message` (string) -- user's message text

**Returns:** `{:ok, reply_text}` or `{:error, reason}`

```elixir
{:ok, reply} = ServerChat.chat(pid, "What tools do I have access to?")
{:error, :chat_in_progress} = ServerChat.chat(pid, "busy")
```

### `history/1`

Return the full conversation message history.

**Parameters:**
- `pid` (pid) -- the ServerChat process

**Returns:** list of `%{role: String.t(), content: String.t()}`

```elixir
messages = ServerChat.history(pid)
# => [%{role: "system", content: "..."}, %{role: "user", content: "..."}, ...]
```

### `clear/1`

Clear conversation history, keeping the same provider and tools. Resets to just the system message.

**Parameters:**
- `pid` (pid) -- the ServerChat process

**Returns:** `:ok`

```elixir
:ok = ServerChat.clear(pid)
```

### `refresh_tools/1`

Re-fetch and re-format the MCP tool list from `MCPServer.Manager`. Normally not needed -- tools are refreshed on every chat turn automatically.

**Parameters:**
- `pid` (pid) -- the ServerChat process

**Returns:** `:ok`

```elixir
:ok = ServerChat.refresh_tools(pid)
```

### `switch_provider/3`

Switch to a different LLM provider. Clears conversation history since message formats differ across providers. Optionally set a model.

**Parameters:**
- `pid` (pid) -- the ServerChat process
- `provider_key` (atom) -- `:anthropic`, `:openrouter`, `:perplexity`, `:gemini`, or `:gemini_cli`
- `model` (string | nil) -- optional model override

**Returns:** `:ok`, `{:error, :unknown_provider}`, `{:error, :missing_api_key}`, or `{:error, :chat_in_progress}`

```elixir
:ok = ServerChat.switch_provider(pid, :perplexity)
:ok = ServerChat.switch_provider(pid, :openrouter, "anthropic/claude-sonnet-4.5")
{:error, :missing_api_key} = ServerChat.switch_provider(pid, :anthropic)
```

### `set_model/2`

Override the model for this session without changing provider or history. Pass `nil` to revert to the default model for the current provider.

**Parameters:**
- `pid` (pid) -- the ServerChat process
- `model` (string | nil) -- model name or nil to revert

**Returns:** `:ok`

```elixir
:ok = ServerChat.set_model(pid, "sonar-pro")
:ok = ServerChat.set_model(pid, nil)
```

### `model/0`

Return the default model string for the globally configured provider. Does not reflect per-instance overrides.

**Returns:** string

```elixir
ServerChat.model()
# => "x-ai/grok-4.1-fast"
```

### `model/1`

Return the effective model for a running ServerChat instance, respecting any per-session override.

**Parameters:**
- `pid` (pid) -- the ServerChat process

**Returns:** string

```elixir
ServerChat.model(pid)
# => "sonar-pro"
```

### `provider/1`

Return the provider name (lowercase string) for a running instance.

**Parameters:**
- `pid` (pid) -- the ServerChat process

**Returns:** string

```elixir
ServerChat.provider(pid)
# => "openrouter"
```

### `call_mcp_tool/3`

Execute an MCP tool by name via `MCPServer.Manager`. Called by provider modules during tool-use loops, not typically called directly.

**Parameters:**
- `user_id` (integer) -- the user's ID (for permission scoping)
- `tool_name` (string) -- MCP tool name
- `args` (map) -- tool arguments

**Returns:** string (tool result text or error message)

```elixir
result = ServerChat.call_mcp_tool(42, "current_time", %{})
# => "2026-03-13T10:30:00+07:00"
```

### `setup_database/1`

Create the `chat_log` table in the given Sqler database. Called once at application startup.

**Parameters:**
- `db` (atom) -- database name (default `:chat_db`)

**Returns:** `:ok`

```elixir
ServerChat.setup_database(:chat_db)
```

### `stop/1`

Gracefully stop the GenServer.

**Parameters:**
- `pid` (pid) -- the ServerChat process

```elixir
ServerChat.stop(pid)
```

## Internals / Flow

### GenServer State

```elixir
%{
  user_id: 42,
  messages: [%{role: "system", content: "..."}, ...],
  api_key: "sk-...",
  tools: [%{name: "current_time", ...}, ...],
  provider: ServerChat.OpenRouter,
  busy: false,
  model_override: nil  # or "sonar-pro"
}
```

### Chat Flow

```
ServerChat.chat(pid, "Hello")
  |-- GenServer receives {:chat, message}
  |-- Check busy flag -> reject if true
  |-- Set busy = true
  |-- Refresh tools from MCPServer.Manager (permission-aware)
  |-- Update system message if tools changed
  |-- Spawn background Task:
  |   |-- provider.call_llm(api_key, tools, messages, model)
  |   |-- provider.handle_response(...) -- may loop on tool calls
  |   +-- send {:chat_result, from, user_id, message, result} back
  +-- GenServer receives {:chat_result, ...}
      |-- Log user + assistant messages to :chat_db
      |-- Trim history to 50 messages
      |-- Set busy = false
      +-- Reply to caller with {:ok, reply_text}
```

### Tool Refresh on Every Turn

When `chat/2` is called:

1. `fetch_tools(user_id)` calls `MCPServer.Manager.list_tools(user_id)` -- filtered by current permissions
2. Tools are formatted for the active provider via `provider.format_tools/1`
3. The system message is rebuilt with the current tool set
4. This means permission grants/revocations take effect on the next chat message

### Provider Resolution

```
1. Check LLM_PROVIDER env var -> atom -> lookup in @providers map
2. Check Application.get_env(:mcp, :llm_provider) -> lookup in @providers map
3. Default: ServerChat.OpenRouter
```

### Idle Timeout

Every GenServer callback returns `@idle_timeout` (30 minutes) as the timeout value. When the timeout fires, `handle_info(:timeout, state)` calls `{:stop, :normal, state}`, cleanly terminating the process.

## Database Schema

Chat messages are logged to the `:chat_db` SQLite database:

| Column | Type | Description |
|--------|------|-------------|
| `id` | INTEGER PK | Millisecond timestamp (Sqler convention) |
| `updated_at` | INTEGER | Timestamp |
| `user_id` | INTEGER NOT NULL | User who sent/received the message |
| `role` | TEXT NOT NULL | `"user"` or `"assistant"` |
| `content` | TEXT NOT NULL | Message content |

Index: `idx_chat_log_user_id` on `user_id`.

## Troubleshooting

**`{:error, :chat_in_progress}`**
- Only one chat call can be in progress at a time per instance. Wait for the current call to complete.

**`{:error, :missing_api_key}`**
- The environment variable for the requested provider is not set. Check `ANTHROPIC_API_KEY`, `OPENROUTER_API_KEY`, `PERPLEXITY_API_KEY`, or `GEMINI_API_KEY`.

**Tools not appearing in responses**
- Tools are fetched per-turn from the user's permissions. Check that the user has the correct permission keys via `AccessControl.get_permissions(user_id)`.
- The provider must support tool use. Verify with `ServerChat.provider(pid)`.

**Process terminated unexpectedly**
- ServerChat self-terminates after 30 minutes of inactivity. Start a new instance with `ServerChat.start_link(user_id: ...)`.

**Orphaned tool results in history**
- `trim_messages/1` drops leading non-user messages to prevent sending orphaned tool results to APIs. This is automatic.

**Perplexity: empty content errors**
- The Perplexity provider filters out messages with empty content before sending. If tool results return empty strings, they are stripped.

## Related Documentation

- [ServerChat.Provider](server_chat_provider.md) -- Provider behaviour definition
- [ServerChat.Anthropic](server_chat_anthropic.md) -- Anthropic provider
- [ServerChat.OpenRouter](server_chat_openrouter.md) -- OpenRouter provider
- [ServerChat.Perplexity](server_chat_perplexity.md) -- Perplexity provider
- [ServerChat Tracker](server_chat_tracker.md) -- Tracks active ServerChat instances
- [OpenRouter](openrouter.md) -- OpenRouter API integration details
- [Provider](provider.md) -- LLM provider abstraction

---

*Source: `lib/server_chat.ex` -- Last updated: 2026-03-13*