What is MCP and why should I care?
The Model Context Protocol (MCP) is an open standard for piping external data and tools into large-language-model hosts. Think of it as USB-C for LLMs—a single plug that lets any compliant client talk to any compliant server and swap in new data sources or models without rewiring everything.
CustomGPT.ai now ships a Hosted MCP Server for every agent you create:
- Top-ranked RAG built-in. Retrieval layer independently benchmarked #1 for business-document accuracy and minimal hallucination.
- Fully managed. We run the container, watch the graphs, patch the OS, and keep TLS fresh.
- Zero extra cost for existing plans. Free-trial users can test it too.
- Standard endpoint. Works with any client that understands MCP (Claude Desktop, Cursor, Windsurf, n8n, Zapier, etc.).
Architecture at a glance
                    ┌──────────────────────────┐
            ───────▶│  CustomGPT Hosted MCP    │  SSE / HTTPS
                    │  (RAG + auth, scaled)    │──────────┐
                    └──────────────────────────┘          │
                                ▲                         ▼
                    Private knowledge base         MCP-aware clients
                    (PDF, Drive, Notion …)    (Claude, Cursor, Windsurf …)
Internally we expose an SSE (Server-Sent Events) stream at
https://mcp.customgpt.ai/projects/<PROJECT_ID>/sse?token=<TOKEN>
The stream carries JSON events defined by the MCP spec. You authenticate with an MCP token (see below).
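As a sketch of what consuming that stream involves, the snippet below builds the endpoint URL and parses generic SSE framing (`data:` lines terminated by a blank line) with only the standard library. The `PROJECT_ID`/`TOKEN` values are placeholders, and the parser follows the generic SSE wire format, not any CustomGPT-specific schema:

```python
import json
import urllib.request

def sse_url(project_id: str, token: str) -> str:
    """Build the hosted MCP SSE endpoint URL."""
    return f"https://mcp.customgpt.ai/projects/{project_id}/sse?token={token}"

def parse_sse(lines):
    """Yield parsed JSON events from an iterable of SSE text lines.

    Accumulates `data:` fields until a blank line ends the event,
    per the Server-Sent Events wire format.
    """
    data = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("data:"):
            data.append(line[5:].lstrip())
        elif line == "" and data:
            yield json.loads("\n".join(data))
            data = []

# Usage (live network call; needs a real project ID and token):
# with urllib.request.urlopen(sse_url("12345", "YOUR_TOKEN")) as resp:
#     for event in parse_sse(raw.decode() for raw in resp):
#         print(event)
```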
Prerequisites
| Item | Notes |
| --- | --- |
| CustomGPT.ai account | Free trial or paid |
| A project / agent with data ingested | PDFs, Google Drive, Notion, Confluence, Web, etc. |
| One of the supported clients | Claude Desktop, Cursor IDE, Windsurf, Trae, n8n, Zapier, or your own code |
| Desktop host only (for now) | MCP currently runs only on desktop hosts; browser-based hosts are in progress. ([Model Context Protocol][1]) |
| Node.js ≥ 18 | Required by several clients (Claude, Trae) to spawn the gateway |
Enable the Hosted MCP Server
- Log in to app.customgpt.ai.
- Create / open a project and ingest your data.
- Test it in the ASK tab to confirm answers look good.
- Click Deploy ➞ MCP Server (Beta).
- Hit Generate MCP Token. A toast will confirm creation; the token is embedded in the sample JSON.
- Copy the server configuration shown on-screen.
Security note: the token is a bearer secret. Keep it out of repos, and rotate it from the same panel if it leaks.
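One way to honor that security note is to load the token from the environment rather than hardcoding it. This is a sketch only; the `CUSTOMGPT_MCP_TOKEN` variable name is our own convention, not anything the platform requires:

```python
import os

def mcp_token() -> str:
    """Read the MCP bearer token from the environment so it never
    lands in source control. CUSTOMGPT_MCP_TOKEN is a name we chose,
    not an official variable."""
    token = os.environ.get("CUSTOMGPT_MCP_TOKEN")
    if not token:
        raise RuntimeError(
            "Set CUSTOMGPT_MCP_TOKEN (generate it under Deploy > MCP Server)"
        )
    return token

def server_url(project_id: str) -> str:
    """Assemble the SSE endpoint with the token appended as a query param."""
    return f"https://mcp.customgpt.ai/projects/{project_id}/sse?token={mcp_token()}"
```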
Supported Clients
Get more info here:
Deep Dive into Model Context Protocol (MCP)
| Key Concept | What it means in practice |
| --- | --- |
| Purpose | Standard I/O contract so any LLM client (IDE, chat, workflow engine) can call any back-end “context provider” the same way. |
| Transport | Server-Sent Events (SSE) over HTTPS. One long-lived stream; the server pushes JSON blobs that match the official schema. |
| Message types | `tool_list`, `tool_call`, `content_chunk`, `error`, `done`. Clients send JSON; servers answer in-stream. |
| Discovery | Each server advertises one or more tools. In our case you’ll always see `send_message` (RAG search) plus, for advanced plans, `upload_file`, `list_sources`, etc. |
| Security | Bearer-token header or query-string param. No cookies. Tokens are opaque 256-bit secrets that you rotate in the UI. |
| Why SSE, not WebSocket? | Simpler proxy support, works out of the box with `curl`, no extra heartbeat logic, and Anthropic picked it for the reference gateway (`supergateway`). |
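Based on the message types in the table above, a client request might be shaped roughly like this. Treat it as an illustration: the exact field names (`id`, `tool`, `arguments`) are our assumptions, so check the official MCP schema before relying on them:

```python
import json

def make_tool_call(tool: str, arguments: dict, call_id: str = "1") -> str:
    """Serialize a tool_call message (illustrative shape, not the
    normative MCP schema)."""
    return json.dumps({
        "type": "tool_call",
        "id": call_id,
        "tool": tool,
        "arguments": arguments,
    })

# e.g. invoke the always-present RAG search tool
msg = make_tool_call("send_message", {"message": "What is our refund policy?"})
```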
Where does CustomGPT fit?
Client (Cursor / Claude / n8n)
        │ JSON request
        ▼
┌─────────────────────────┐
│    npx supergateway     │ ← optional local sidecar; some clients spawn this
└─────────────────────────┘
        │ SSE
        ▼
┌──────────────────────────────────┐
│ Hosted MCP Server (CustomGPT.ai) │ ← you enable this
│  • Auth + rate-limit             │
│  • RAG retrieval layer           │
│  • Streaming responder           │
└──────────────────────────────────┘
        │
        ▼
Knowledge base (PDF, Drive, Notion …)
If a client can reach the SSE endpoint directly (Cursor, Windsurf, Zapier), you just feed it the URL. If it needs a local sidecar (Claude Desktop, Trae), install Node.js; the sidecar opens the SSE stream and the client talks to it over stdio pipes.
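For the sidecar case, the command a client ultimately runs can be assembled like this. The `--sse` flag follows supergateway's published usage at the time of writing; verify it against that project's README before shipping:

```python
import subprocess  # used in the commented-out spawn example below

def sidecar_command(project_id: str, token: str) -> list:
    """Command line for the supergateway sidecar that bridges the
    hosted SSE endpoint to stdio for clients like Claude Desktop."""
    url = f"https://mcp.customgpt.ai/projects/{project_id}/sse?token={token}"
    return ["npx", "-y", "supergateway", "--sse", url]

# Spawning it yourself (requires Node.js >= 18 on PATH):
# proc = subprocess.Popen(sidecar_command("12345", "YOUR_TOKEN"),
#                         stdin=subprocess.PIPE, stdout=subprocess.PIPE)
```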