# MCP Bridge Chrome Extension

Persistent WebSocket bridge between Chrome and the MCP server. Tracks page visits, executes server-driven DOM commands, and streams extraction results back.

## Files

```
mcp-extension/
├── manifest.json    Manifest V3 config, permissions, entry points
├── background.js    Service worker — tab tracking, message routing, offscreen lifecycle
├── offscreen.html   Minimal HTML shell for the offscreen document
├── offscreen.js     Persistent WebSocket connection to wss://workingagents.ai/ws
├── content.js       DOM command executor, injected into every page
└── icons/           Toolbar icons (16/48/128px)
```

## Architecture

```
Server (WsHandler)
        ↕ WebSocket (wss://)
offscreen.js     Holds the WebSocket, relays JSON in both directions
        ↕ chrome.runtime.sendMessage
background.js    Routes messages, tracks tabs, manages offscreen lifecycle
        ↕ chrome.tabs.sendMessage(tabId)
content.js       Executes DOM commands on the actual page
```

Messages flow through Chrome's internal messaging API. The server never talks to content scripts directly — everything is relayed through background.js.

## Permissions (manifest.json)

| Permission | Why |
|---|---|
| `offscreen` | Create a hidden document to hold the WebSocket (service workers can't) |
| `tabs` | Monitor tab navigation and closures |
| `cookies` | Read the `user` auth cookie from workingagents.ai |
| `activeTab` + `scripting` | Interact with the active tab on toolbar icon click |
| `<all_urls>` host permission | Inject content.js into every page |

## offscreen.js — WebSocket Keeper

Chrome Manifest V3 kills service workers after ~30s of inactivity, which kills WebSocket connections with them. The offscreen document is an invisible HTML page Chrome keeps alive, providing a persistent home for the WebSocket.
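A minimal sketch of this keep-alive pattern, assuming the reconnect logic lives entirely in offscreen.js (function names like `connect` and `backoffDelay` are illustrative, not the actual source):

```javascript
// Sketch of the reconnect-with-exponential-backoff pattern offscreen.js uses.
// Names here are hypothetical; only the behavior mirrors the description above.
const BASE_DELAY_MS = 1000;
const MAX_DELAY_MS = 30000;

// Delay doubles on each failed attempt: 1s, 2s, 4s, ... capped at 30s.
function backoffDelay(attempt) {
  return Math.min(BASE_DELAY_MS * 2 ** attempt, MAX_DELAY_MS);
}

function connect(token, attempt = 0) {
  const ws = new WebSocket(`wss://workingagents.ai/ws?token=${token}`);
  ws.onopen = () => { attempt = 0; }; // reset backoff once connected
  ws.onmessage = (event) => {
    // Relay server JSON to background.js, tagged with its origin.
    chrome.runtime.sendMessage({ source: "offscreen", data: JSON.parse(event.data) });
  };
  ws.onclose = () => {
    setTimeout(() => connect(token, attempt + 1), backoffDelay(attempt));
  };
  return ws;
}
```

Resetting the attempt counter on `onopen` keeps the delay short for the common case of a brief network blip while still backing off during an extended outage.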
**Responsibilities:**

- Connects to `wss://workingagents.ai/ws?token=`
- Receives JSON from the server → forwards to background.js with `source: "offscreen"`
- Receives JSON from background.js (tagged `target: "offscreen"`) → sends to the server via `ws.send`
- Reconnects with exponential backoff: 1s → 2s → 4s → ... → 30s max

**Auth flow:** background.js reads the `user` cookie → sends it to offscreen.js → offscreen.js appends it as a query parameter on the WebSocket URL.

## background.js — The Router

Service worker with four jobs:

### 1. Offscreen Lifecycle

Creates the offscreen document on install/startup. If it disappears (killed by Chrome, extension updated), recreates it on the next outbound message. Sends the auth token after a 500ms delay so listeners can initialize.

### 2. Page Tracking

Maintains a `tabUrls` Map of `tabId → URL`.

- **Tab navigation:** If the tab had a different URL before, sends `page_departure` for the old URL, then `page_visit` for the new one (with URL, title, and tab_id)
- **Tab close:** Sends `page_departure`, cleans up the map
- **Chrome internal pages:** Filtered out (`chrome://`, `chrome-extension://`)

The `tab_id` is included in every event so the server can route DOM commands back to the correct tab.

### 3. Message Routing

| Direction | Source | Routing |
|---|---|---|
| Server → browser | offscreen.js (`source: "offscreen"`) | `dom_command` → `chrome.tabs.sendMessage(tab_id)` to the content script. Other types (`welcome`, `rpc`, `notification`, `summary_ack`) are logged. |
| Browser → server | content.js (`sender.tab`) | `dom_result` and `dom_mutation` → forwarded to offscreen.js for relay |

If a `dom_command` can't reach its tab (closed, navigated away), background.js sends an error `dom_result` back so the server doesn't hang waiting.

### 4. Summarize Page

A toolbar icon click sends `summarize_page` with the page URL and title to the server, which triggers `Summary.request_summary`.

## content.js — DOM Executor

Injected into every page at `document_idle`.
Completely passive — does nothing until it receives a `dom_command` from background.js.

### Supported Commands

| Command | Description |
|---|---|
| `dom_query` | `querySelector` — returns one element's text, HTML, or attribute |
| `dom_query_all` | `querySelectorAll` — returns an array of element data |
| `dom_click` | Clicks an element matching a CSS selector |
| `dom_wait` | MutationObserver waits for a selector to appear, with configurable timeout |
| `dom_watch` | Persistent MutationObserver, streams `dom_mutation` events back to the server |
| `dom_unwatch` | Disconnects a watcher by selector |
| `dom_extract` | Multi-field structured extraction (the main workhorse) |

### Extraction Types

| Type | Returns |
|---|---|
| `text` | `el.textContent.trim()` |
| `html` | `el.innerHTML` |
| `attribute` | `el.getAttribute(name)` |
| `table` | `{headers: [...], rows: [[...]]}` from a `<table>` element |
| `list` | Array of text content from `querySelectorAll` |

### Message Handling

content.js does NOT use `sendResponse`. Results are sent as separate `chrome.runtime.sendMessage` calls. This prevents Chrome from keeping the message channel open — if the async DOM operation takes too long, Chrome auto-closes the channel and rejects the `tabs.sendMessage` promise in background.js, causing a duplicate error response.

Watchers are cleaned up on `beforeunload`. Page departure detection is handled by background.js (more reliable than content script unload events).
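The per-field dispatch behind `dom_extract` can be sketched as a pure function. This is a hypothetical helper (`extractField` and the `attribute` key on the spec are assumed names), and the real executor also implements the `table` and `list` types over live DOM collections:

```javascript
// Illustrative dispatch on extraction type. `extractField` is a hypothetical
// name; the spec shape mirrors the Extraction Types table above.
function extractField(el, spec) {
  switch (spec.type) {
    case "text":
      return el.textContent.trim();
    case "html":
      return el.innerHTML;
    case "attribute":
      // The key carrying the attribute name is assumed to be `attribute`.
      return el.getAttribute(spec.attribute);
    default:
      return null; // unknown type: nothing to extract
  }
}
```

Keeping each type a side-effect-free read is what lets `dom_extract` run several field specs against one page and report them in a single `dom_result`.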
## Server-Side Integration

### Modules

| Module | File | Role |
|---|---|---|
| `WsHandler` | `lib/ws_handler.ex` | WebSocket protocol handler, message routing, rate limiting (30 msg/s) |
| `WsPage` | `lib/ws_page.ex` | Per-user session state, page visit history |
| `PageScraperServer` | `lib/page_scraper_server.ex` | Per-{user, pattern} GenServer, orchestrates the DOM extraction lifecycle |
| `PageScraper` | `lib/page_scraper.ex` | URL pattern registry and generic extraction storage |
| `PageScraper.Handler` | `lib/page_scraper/handler.ex` | Behaviour defining handler callbacks |

### WebSocket Endpoint

```
GET /ws → WebSockAdapter.upgrade → WsHandler.init(%{user: user})
```

Registers `{:handler, username}` and `{:page, username}` in `WsRegistry`.

### Message Types (Server Receives)

| Type | From | Action |
|---|---|---|
| `page_visit` | Extension | Match URL against patterns → start PageScraperServers |
| `page_departure` | Extension | Notify all scrapers for the departing URL |
| `dom_result` | Extension | Route to PageScraperServer by `pattern_id` |
| `dom_mutation` | Extension | Route MutationObserver events to the scraper |
| `extension_connected` | Extension | Log connection |
| `summarize_page` | Extension | Trigger `Summary.request_summary` |

### PageScraperServer Lifecycle

1. `WsHandler` receives `page_visit` → `PageScraper.match_url(url)` finds matching patterns
2. `PageScraperServer.get_or_start` starts a GenServer per {username, pattern_id}
3. Calls `handler.on_match(url, captures)` → returns an extraction plan
4. If the plan has `:wait`, sends a `dom_wait` command (waits for the selector to appear)
5. Sends all `:extractions` specs concurrently as separate `dom_extract` commands
6. Each `dom_result` arrives via cast, routed to `handler.on_extraction(step, data, state)`
7. The handler returns an action: `:store`, `:next_step`, `:noop`, or `:stop`
8. `:store` calls `handler.store_result(url, captures, data)` to persist

**Timers:**

- Idle timeout: 30 minutes
- Departure grace period: 5 seconds
- Request timeout: 15 seconds

### Extraction Plan Structure

```elixir
%{
  wait: %{selector: "...", timeout: 10_000},
  extractions: [%{name: "title", selector: ".title", type: :text}, ...],
  watches: [%{selector: "...", attributes: true, children: true, subtree: true}],
  poll: %{interval: 30_000, extractions: [...]},
  sequence: [{:click, selector}, {:wait, sel, timeout}, {:extract, name, spec}, {:delay, ms}],
  handler_state: %{}
}
```

### Handler Behaviour Callbacks

| Callback | Purpose |
|---|---|
| `on_match(url, captures)` | Define the extraction plan for a matched URL |
| `on_extraction(step, data, state)` | Process results, decide the next action |
| `on_departure(state)` | Cleanup on page leave (optional) |
| `on_mutation(selector, data, state)` | Handle watched DOM changes (optional) |
| `setup_database()` | Create the handler's table (optional) |
| `url_exists?(url)` | Deduplication check (optional) |
| `store_result(url, captures, data)` | Persist to the handler's table (optional) |

### Example Handlers

**GithubPR** (`lib/page_scraper/handlers/github_pr.ex`):

- Pattern: `github.com/:owner/:repo/pull/:number`
- Wait: `.markdown-title` (10s)
- Extracts: title, state, author, body, files count, commits count
- Stores in the `github_prs` table

**CrunchbaseFunding** (`lib/page_scraper/handlers/crunchbase_funding.ex`):

- Pattern: `crunchbase.com/organization/:slug`
- Wait: `#company_funding table` (15s)
- Extracts: 13 fields across funding, investors, and company info
- Stores in the `crunchbase_companies` table

## DOM Command Round-Trip Example

The server wants to extract a PR title from a GitHub tab:

```
1. Server sends dom_command
   {tab_id: 42, command: "dom_extract",
    payload: {extractions: [{name: "title", selector: ".markdown-title", type: "text"}]},
    pattern_id: "github_pr", request_id: "abc"}
        ↓ WebSocket
2. offscreen.js receives JSON
   → chrome.runtime.sendMessage({source: "offscreen", data: ...})
        ↓
3. background.js sees dom_command
   → chrome.tabs.sendMessage(42, ...)
        ↓
4. content.js runs document.querySelector(".markdown-title").textContent.trim()
   → chrome.runtime.sendMessage({type: "dom_result", request_id: "abc", result: {title: "Fix bug"}})
        ↓
5. background.js sees dom_result from the tab → sendToOffscreen(message)
        ↓
6. offscreen.js sends over the WebSocket → server
```

## Why This Architecture

**Why an offscreen document for the WebSocket?** Chrome kills service workers after ~30s of inactivity. The offscreen document is the only Manifest V3 mechanism for persistent connections.

**Why not a WebSocket in content scripts?** Content scripts run in the page's origin. A WebSocket from `github.com` to `workingagents.ai` would hit CORS. The offscreen document runs in the extension's origin.

**Why route through background.js?** Only the service worker has `chrome.tabs.sendMessage(tabId)`. The offscreen document can only broadcast via `chrome.runtime.sendMessage`.

**Why fire-and-forget in content.js?** Using `sendResponse` keeps the message channel open. If the async DOM operation takes too long, Chrome auto-closes it and rejects the promise in background.js, causing a duplicate error result. Separate messages avoid this race.
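The routing rules background.js applies can be condensed into a pure function. This is a hypothetical sketch (`route` and the boolean `fromOffscreen` flag are assumed names; the real service worker calls `chrome.tabs.sendMessage` and relays to offscreen.js directly instead of returning a plan):

```javascript
// Hypothetical pure helper mirroring the message-routing table above.
// The real background.js performs the sends; this only decides the target.
function route(message, fromOffscreen) {
  if (fromOffscreen) {
    // Server → browser: only DOM commands are forwarded to a tab's
    // content script; everything else is logged.
    return message.type === "dom_command"
      ? { to: "tab", tabId: message.tab_id }
      : { to: "log" };
  }
  // Browser → server: content-script results relay through offscreen.js.
  return message.type === "dom_result" || message.type === "dom_mutation"
    ? { to: "offscreen" }
    : { to: "log" };
}
```

Factoring the decision out like this makes the asymmetry explicit: server-bound traffic has exactly one exit (the offscreen relay), while browser-bound traffic must be addressed to a specific `tab_id`.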