NOTICE: This document was AI-assisted; when implementing a backend, always cross-check the details against the code.
In this project, MCP is used between the backend API (MCP client) and the ESP32 device (MCP server) to let the backend discover and invoke the device's capabilities (tools).
As implemented in main/protocols/protocol.cc and main/mcp_server.cc, each MCP message is wrapped in a message of the underlying transport (WebSocket or MQTT). The inner payload follows the JSON-RPC 2.0 specification.
Overall message layout:
{
"session_id": "...", // session id
"type": "mcp", // fixed value "mcp"
"payload": { // JSON-RPC 2.0 payload
"jsonrpc": "2.0",
"method": "...", // method name ("initialize", "tools/list", "tools/call", ...)
"params": { ... }, // arguments (for requests)
"id": ..., // request id (for requests and responses)
"result": { ... }, // success result (response)
"error": { ... } // error (response)
}
}
The payload follows standard JSON-RPC 2.0:
- jsonrpc: always "2.0".
- method: the method name (requests).
- params: structured parameters, usually an object (requests).
- id: request identifier; echoed back in responses.
- result: success value (responses).
- error: error information (responses).
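As a rough illustration of the envelope layout above, the following sketch wraps a JSON-RPC payload for sending and unwraps an incoming message on the backend side. The helper names (`make_request`, `wrap_mcp`, `unwrap_mcp`) are hypothetical and not part of the firmware or any backend library; only the field names come from the layout shown above.

```python
import json
from typing import Optional


def make_request(method: str, params: dict, request_id: int) -> dict:
    """Build a JSON-RPC 2.0 request payload."""
    return {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}


def wrap_mcp(session_id: str, payload: dict) -> str:
    """Wrap a JSON-RPC payload in the session_id / type / payload envelope."""
    return json.dumps({"session_id": session_id, "type": "mcp", "payload": payload})


def unwrap_mcp(raw: str) -> Optional[dict]:
    """Return the JSON-RPC payload of an MCP envelope, or None for anything else."""
    message = json.loads(raw)
    if not isinstance(message, dict) or message.get("type") != "mcp":
        return None
    payload = message.get("payload")
    if not isinstance(payload, dict) or payload.get("jsonrpc") != "2.0":
        return None
    return payload
```

Messages of other types (for example the transport hello) come back as `None`, so the caller can route them elsewhere.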
MCP interactions are driven by the client (backend) discovering and invoking tools on the device.
- Connection and capability announcement
  - When: after the device boots and connects to the backend.
  - Direction: device -> backend.
  - Message: the device sends the transport hello, advertising supported capabilities. MCP support is signaled via "mcp": true in the features map.
  - Example (transport hello, not an MCP payload):
    { "type": "hello", "version": 1, "features": { "mcp": true }, "transport": "websocket", "audio_params": { ... }, "session_id": "..." }
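A backend might gate all MCP traffic on the hello message. The check below is a minimal sketch under that assumption; `device_supports_mcp` is a hypothetical helper name, and it relies only on the `type` and `features.mcp` fields shown in the example.

```python
import json


def device_supports_mcp(raw_hello: str) -> bool:
    """Return True only if this is a hello message with features.mcp set to true."""
    message = json.loads(raw_hello)
    if not isinstance(message, dict) or message.get("type") != "hello":
        return False
    features = message.get("features")
    # A missing features map, or any value other than JSON true, counts as "no MCP".
    return isinstance(features, dict) and features.get("mcp") is True
```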
- Initialize the MCP session
  - When: after the backend sees that the device supports MCP. Usually the first MCP request.
  - Direction: backend -> device.
  - Method: initialize
  - Message (MCP payload):
    {
      "jsonrpc": "2.0",
      "method": "initialize",
      "params": {
        "capabilities": { // optional client capabilities
          "vision": {
            "url": "...", // camera image upload endpoint (must be an http URL, not a websocket URL)
            "token": "..." // token for the upload URL
          }
          // ... other client capabilities
        }
      },
      "id": 1
    }
  - Device response:
    {
      "jsonrpc": "2.0",
      "id": 1,
      "result": {
        "protocolVersion": "2024-11-05",
        "capabilities": { "tools": {} },
        "serverInfo": {
          "name": "...", // device name (BOARD_NAME)
          "version": "..." // firmware version
        }
      }
    }
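The initialize exchange could be assembled on the backend roughly as follows. This is a sketch, not the project's backend code: `build_initialize` and `read_initialize_result` are hypothetical names, the example URL in the usage note is a placeholder, and treating "http URL" as "http or https" is an assumption that should be checked against the firmware.

```python
from typing import Optional
from urllib.parse import urlparse


def build_initialize(request_id: int,
                     vision_url: Optional[str] = None,
                     vision_token: Optional[str] = None) -> dict:
    """Build the initialize request, optionally advertising the vision capability."""
    capabilities: dict = {}
    if vision_url is not None:
        # Assumption: both http and https are acceptable; ws/wss URLs are rejected.
        if urlparse(vision_url).scheme not in ("http", "https"):
            raise ValueError("vision url must be an http(s) URL, not a websocket URL")
        vision = {"url": vision_url}
        if vision_token is not None:
            vision["token"] = vision_token
        capabilities["vision"] = vision
    return {
        "jsonrpc": "2.0",
        "method": "initialize",
        "params": {"capabilities": capabilities},
        "id": request_id,
    }


def read_initialize_result(response: dict) -> dict:
    """Extract the protocol version and server info from the device response."""
    result = response["result"]
    info = result.get("serverInfo", {})
    return {
        "protocol_version": result.get("protocolVersion"),
        "name": info.get("name"),
        "version": info.get("version"),
    }
```

For example, `build_initialize(1, "http://example.com/upload", "tok")` produces a request whose `params.capabilities.vision` carries the URL and token, while `build_initialize(1)` sends an empty capabilities object.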
- Discover the tools
  - When: whenever the backend needs the list of callable tools and their signatures.
  - Direction: backend -> device.
  - Method: tools/list
  - Request parameters:
    - cursor (string, optional): pagination cursor. Empty on the first request.
    - withUserTools (boolean, optional, default false): if true, the device also includes "user-only" tools (see "User-only tools" below) in the listing. This is typically used by a companion app that lets the user trigger privileged actions directly.
  - Message (MCP payload):
    { "jsonrpc": "2.0", "method": "tools/list", "params": { "cursor": "", "withUserTools": false }, "id": 2 }
  - Device response:
    {
      "jsonrpc": "2.0",
      "id": 2,
      "result": {
        "tools": [
          { "name": "self.get_device_status", "description": "...", "inputSchema": { ... } },
          { "name": "self.audio_speaker.set_volume", "description": "...", "inputSchema": { ... } }
          // ... more tools
        ],
        "nextCursor": "..."
      }
    }
  - Pagination: when nextCursor is non-empty, the backend must send another tools/list request with that cursor to fetch the next page.
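The pagination rule can be sketched as a loop that keeps requesting pages until `nextCursor` comes back empty or missing. Here `send_request` is a hypothetical callable that delivers one JSON-RPC payload to the device and returns the matching response payload; how it does so (WebSocket, MQTT, waiting on the response id) is outside this sketch.

```python
from typing import Callable, List


def list_all_tools(send_request: Callable[[dict], dict],
                   with_user_tools: bool = False,
                   first_id: int = 2) -> List[dict]:
    """Collect every page of tools/list into a single list of tool descriptors."""
    tools: List[dict] = []
    cursor = ""  # empty cursor on the first request
    request_id = first_id
    while True:
        response = send_request({
            "jsonrpc": "2.0",
            "method": "tools/list",
            "params": {"cursor": cursor, "withUserTools": with_user_tools},
            "id": request_id,
        })
        if "error" in response:
            raise RuntimeError(response["error"].get("message", "tools/list failed"))
        result = response["result"]
        tools.extend(result.get("tools", []))
        cursor = result.get("nextCursor") or ""
        if not cursor:
            return tools
        request_id += 1  # each follow-up request gets a fresh id
```

Injecting `send_request` keeps the loop independent of the transport and makes it easy to exercise against a fake device in tests.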
- Call a tool
  - When: the backend wants to execute a specific device function.
  - Direction: backend -> device.
  - Method: tools/call
  - Message (MCP payload):
    { "jsonrpc": "2.0", "method": "tools/call", "params": { "name": "self.audio_speaker.set_volume", "arguments": { "volume": 50 } }, "id": 3 }
  - Successful response:
    { "jsonrpc": "2.0", "id": 3, "result": { "content": [ { "type": "text", "text": "true" } ], "isError": false } }
  - Error response:
    { "jsonrpc": "2.0", "id": 3, "error": { "code": -32601, "message": "Unknown tool: self.non_existent_tool" } }
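One way a backend could build a tools/call request and interpret both response shapes is sketched below. `ToolCallError`, `build_tool_call`, and `read_tool_result` are hypothetical names; treating `isError: true` as a failure and joining all text content items are assumptions about reasonable handling, not documented firmware behavior.

```python
from typing import Optional


class ToolCallError(Exception):
    """Raised when the device reports a JSON-RPC error or an isError result."""

    def __init__(self, code: Optional[int], message: str):
        super().__init__(message)
        self.code = code


def build_tool_call(name: str, arguments: dict, request_id: int) -> dict:
    """Build a tools/call request for the named tool."""
    return {
        "jsonrpc": "2.0",
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
        "id": request_id,
    }


def read_tool_result(response: dict) -> str:
    """Return the concatenated text content of a successful tools/call response."""
    if "error" in response:
        error = response["error"]
        raise ToolCallError(error.get("code"), error.get("message", ""))
    result = response["result"]
    text = "".join(
        item.get("text", "")
        for item in result.get("content", [])
        if item.get("type") == "text"
    )
    if result.get("isError"):
        raise ToolCallError(None, text)
    return text
```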
- Device-initiated notifications
  - When: the device wants to inform the backend of internal events (e.g. state transitions). Application::SendMcpMessage is the outbound entry point.
  - Direction: device -> backend.
  - Method: conventionally notifications/... or any custom method.
  - Message (MCP payload): JSON-RPC notifications have no id.
    { "jsonrpc": "2.0", "method": "notifications/state_changed", "params": { "newState": "idle", "oldState": "connecting" } }
  - Backend handling: process the notification without replying.
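Because notifications carry a method but no id, the backend can tell them apart from responses by inspecting which fields are present. The classifier below is a minimal sketch of that rule; `classify` is a hypothetical helper and the returned labels are illustrative.

```python
def classify(payload: dict) -> str:
    """Classify a JSON-RPC payload by the fields it carries."""
    has_id = "id" in payload
    if "method" in payload:
        # A method with an id expects a reply; without an id it is a notification.
        return "request" if has_id else "notification"
    if has_id and ("result" in payload or "error" in payload):
        return "response"
    return "invalid"
```

A payload classified as `"notification"` is processed and never answered, while a `"response"` is matched to the pending request with the same id.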
The MCP server on the device maintains two kinds of tools:
- Regular tools - registered via McpServer::AddTool. Exposed to the backend (and hence the AI model) by default.
- User-only tools - registered via McpServer::AddUserOnlyTool. These are hidden from standard tools/list results, because they are privileged or user-facing actions that should not be invoked autonomously by the AI. Examples include system reboot, firmware upgrade, and screen snapshot upload.
The backend opts in to user-only tools by sending tools/list with params.withUserTools = true. Typical usage: a companion app screen that exposes these actions to the end user.
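If a backend needs to know which tools are user-only (for instance, to keep them out of the list handed to the AI model), one possible approach is to compare a default listing with a `withUserTools = true` listing. This sketch assumes the listing itself does not flag user-only tools, which should be verified against the firmware; `user_only_tool_names` is a hypothetical helper, and the tool name `self.reboot` in the usage note is made up for illustration.

```python
from typing import Iterable, List


def user_only_tool_names(default_listing: Iterable[dict],
                         full_listing: Iterable[dict]) -> List[str]:
    """Return names present only when tools/list is sent with withUserTools = true."""
    default_names = {tool["name"] for tool in default_listing}
    # Preserve the device's ordering from the full listing.
    return [tool["name"] for tool in full_listing if tool["name"] not in default_names]
```

Given a default listing containing only `self.get_device_status` and a full listing that also contains a hypothetical `self.reboot`, the function returns `["self.reboot"]`.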
See MCP IoT control usage for how to register either kind of tool on the device side.
A simplified diagram of the main MCP message flow:
sequenceDiagram
participant Device as ESP32 Device
participant BackendAPI as Backend API (Client)
Note over Device, BackendAPI: Establish WebSocket / MQTT
Device->>BackendAPI: Hello (features.mcp = true)
BackendAPI->>Device: MCP Initialize request
Note over BackendAPI: method: initialize
Note over BackendAPI: params: { capabilities: ... }
Device->>BackendAPI: MCP Initialize response
Note over Device: result: { protocolVersion, serverInfo, ... }
BackendAPI->>Device: MCP tools/list request
Note over BackendAPI: params: { cursor: "", withUserTools: false }
Device->>BackendAPI: MCP tools/list response
Note over Device: result: { tools: [...], nextCursor: ... }
loop Optional pagination
BackendAPI->>Device: MCP tools/list request
Note over BackendAPI: params: { cursor: "..." }
Device->>BackendAPI: MCP tools/list response
Note over Device: result: { tools: [...], nextCursor: "" }
end
BackendAPI->>Device: MCP tools/call request
Note over BackendAPI: params: { name, arguments }
alt Call succeeds
Device->>BackendAPI: MCP tools/call success response
Note over Device: result: { content, isError: false }
else Call fails
Device->>BackendAPI: MCP tools/call error response
Note over Device: error: { code, message }
end
opt Device notification
Device->>BackendAPI: MCP notification
Note over Device: method: notifications/...
end
This document summarizes the MCP interaction flow in this project. For exact parameter shapes, behavior, and available tools, refer to McpServer::AddCommonTools / AddUserOnlyTools in main/mcp_server.cc and the per-board InitializeTools implementations.