# Documentation MCP Server (RAG-enabled)

An MCP server that provides AI assistants with semantic search access to any documentation. Uses RAG (Retrieval-Augmented Generation) with vector embeddings to find relevant content based on meaning, not just keywords.

## Features

- **Semantic Search** - Find relevant content by meaning, not just keyword matching
- **Recursive Chunking** - Documents are split by headers (H1 → H2 → H3) for precise retrieval
- **Low Token Usage** - Returns ~500-1000 tokens per query instead of 15k+ token full documents
- **Vector Embeddings** - Uses text-embedding-3-small via OpenRouter (proxied through Apify)
- **PostgreSQL + pgvector** - Supabase for scalable vector storage
## Architecture

```
Scrape → Parse HTML → Recursive Chunk → Generate Embeddings → Store in Supabase
                                                                      ↓
User Query → Embed Query → Vector Similarity Search → Return Relevant Chunks
```
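The query path can be sketched in TypeScript. Note that the real Actor ranks chunks inside Postgres via pgvector's `<=>` operator; this in-memory `rankChunks` (a hypothetical name) only illustrates the same cosine-similarity ordering:

```typescript
// Sketch of the query-time flow; interface and function names are
// illustrative, not the Actor's actual code.
interface DocChunk {
  docId: string;
  sectionPath: string[];  // e.g. ["Getting Started", "Installation"]
  heading: string;
  content: string;
  embedding: number[];    // 1536-dim vector from text-embedding-3-small
}

// Rank stored chunks by cosine similarity to the embedded query,
// highest similarity first.
function rankChunks(queryEmbedding: number[], chunks: DocChunk[], topK = 5): DocChunk[] {
  const dot = (a: number[], b: number[]) => a.reduce((s, x, i) => s + x * b[i], 0);
  const norm = (a: number[]) => Math.sqrt(dot(a, a));
  const cosine = (a: number[], b: number[]) => dot(a, b) / (norm(a) * norm(b));
  return [...chunks]
    .sort((a, b) => cosine(queryEmbedding, b.embedding) - cosine(queryEmbedding, a.embedding))
    .slice(0, topK);
}
```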
## Available Tools

| Tool | Description | Token Usage |
|---|---|---|
| `list_docs` | List all docs with metadata (id, title, summary, sections) | ~200 |
| `search` | Semantic search across all docs | ~500-1000 |
| `get_doc_overview` | Get document summary and section structure | ~200 |
| `get_section` | Get content of a specific section | ~500-1000 |
| `get_chunks` | Paginated access to all chunks in a doc | ~500-1000 |
| `search_in_doc` | Semantic search within a specific document | ~500-1000 |
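A minimal sketch of how these tools could dispatch to handlers; the handler bodies and return values below are placeholders, not the Actor's actual implementation:

```typescript
// Hypothetical tool-dispatch table mirroring the six tools above.
type ToolHandler = (args: Record<string, unknown>) => Promise<string> | string;

const tools: Record<string, ToolHandler> = {
  list_docs:        () => "…all docs with metadata…",
  search:           ({ query }) => `…chunks matching "${query}"…`,
  get_doc_overview: ({ docId }) => `…summary and sections of ${docId}…`,
  get_section:      ({ docId, section }) => `…"${section}" from ${docId}…`,
  get_chunks:       ({ docId, page }) => `…page ${page} of ${docId}…`,
  search_in_doc:    ({ docId, query }) => `…"${query}" within ${docId}…`,
};

// Route an MCP tool call by name; unknown names are rejected.
async function callTool(name: string, args: Record<string, unknown>): Promise<string> {
  const handler = tools[name];
  if (!handler) throw new Error(`Unknown tool: ${name}`);
  return handler(args);
}
```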
## Setup

### 1. Create Supabase Project

- Go to supabase.com and create a project
- Open the SQL Editor and run the schema below
- Copy your Project URL and anon key from Settings → API
### 2. Supabase SQL Schema

Run this in your Supabase SQL Editor:

```sql
-- Enable pgvector extension
create extension if not exists vector;

-- Document metadata table
create table doc_metadata (
  id text primary key,
  title text not null,
  url text not null,
  summary text,
  sections text[],
  total_chunks integer,
  type text,
  created_at timestamp default now()
);

-- Chunks table with vector embeddings
create table doc_chunks (
  id uuid primary key default gen_random_uuid(),
  doc_id text references doc_metadata(id) on delete cascade,
  doc_title text,
  doc_url text,
  section_path text[],
  heading text,
  content text not null,
  token_count integer,
  chunk_index integer,
  embedding vector(1536),
  created_at timestamp default now()
);

-- Function for similarity search
create or replace function search_chunks(
  query_embedding vector(1536),
  match_count int default 5,
  filter_doc_id text default null
)
returns table (
  id uuid,
  doc_id text,
  doc_title text,
  section_path text[],
  heading text,
  content text,
  similarity float
)
language plpgsql
as $$
begin
  return query
  select
    dc.id,
    dc.doc_id,
    dc.doc_title,
    dc.section_path,
    dc.heading,
    dc.content,
    1 - (dc.embedding <=> query_embedding) as similarity
  from doc_chunks dc
  where (filter_doc_id is null or dc.doc_id = filter_doc_id)
  order by dc.embedding <=> query_embedding
  limit match_count;
end;
$$;

-- Optional: add an index for large datasets (1000+ chunks)
-- create index doc_chunks_embedding_idx
--   on doc_chunks using ivfflat (embedding vector_cosine_ops)
--   with (lists = 100);
```
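From Node, `search_chunks` is invoked as a Supabase RPC. The sketch below only builds the parameter object (the helper name `searchChunksParams` is made up); the commented line shows where `@supabase/supabase-js`'s `rpc()` would consume it:

```typescript
// Hypothetical helper that assembles the arguments for the search_chunks
// function defined in the schema above. Parameter names must match the
// SQL function's argument names.
function searchChunksParams(
  queryEmbedding: number[],
  matchCount = 5,
  docId?: string
): { query_embedding: number[]; match_count: number; filter_doc_id: string | null } {
  return {
    query_embedding: queryEmbedding,
    match_count: matchCount,
    filter_doc_id: docId ?? null,
  };
}

// Usage with the Supabase client (not run here):
// const { data, error } = await supabase.rpc("search_chunks", searchChunksParams(embedding, 5));
```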
### 3. Configure Environment Variables

Set these in your Apify Actor, or use Apify secrets:

```shell
# Add secrets (recommended)
apify secrets add supabaseKey "your-supabase-anon-key"
```

In `.actor/actor.json`:

```json
{
  "environmentVariables": {
    "SUPABASE_URL": "https://your-project.supabase.co",
    "SUPABASE_KEY": "@supabaseKey"
  }
}
```
### 4. Configure Documentation Sources

| Input | Description | Default |
|---|---|---|
| Start URLs | Documentation pages to scrape | Apify SDK docs |
| Max Pages | Maximum pages to scrape (1-1000) | 100 |
| Force Refresh | Re-scrape even if data exists | false |

Example input:

```json
{
  "startUrls": [
    { "url": "https://docs.example.com/getting-started" },
    { "url": "https://docs.example.com/api-reference" }
  ],
  "maxPages": 200,
  "forceRefresh": false
}
```
### 5. Deploy and Connect

```shell
# Push to Apify
apify push

# Add to Claude Code (get the URL from the Actor output)
claude mcp add api-docs https://<YOUR_ACTOR_URL>/mcp \
  --transport http \
  --header "Authorization: Bearer <YOUR_APIFY_TOKEN>"
```
## Usage Examples

Once connected, your AI assistant can:

- "Search the docs for authentication best practices" → returns relevant chunks from multiple documents
- "Show me the overview of the API reference doc" → returns summary and section list
- "Get the 'Getting Started' section from doc-1" → returns specific section content
- "What documentation is available?" → returns list of all docs with summaries
## API Endpoints

| Endpoint | Method | Description |
|---|---|---|
| `/` | GET | Server status and setup instructions |
| `/mcp` | POST | MCP protocol endpoint |
| `/refresh-docs` | POST | Re-scrape and update all documentation |
| `/stats` | GET | Get document and chunk statistics |
## How Chunking Works

Documents are split recursively by headers:

```
Document (15k tokens)
├── H1: Introduction (chunk 1, ~600 tokens)
├── H2: Getting Started
│   ├── H3: Installation (chunk 2, ~400 tokens)
│   └── H3: Configuration (chunk 3, ~500 tokens)
├── H2: API Reference
│   ├── H3: Methods (chunk 4, ~700 tokens)
│   └── H3: Examples (chunk 5, ~600 tokens)
└── ...
```

- Target chunk size: 500-800 tokens
- Max chunk size: 1000 tokens
- Min chunk size: 100 tokens (smaller sections are merged with their parent)
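A simplified sketch of header-based recursive splitting, assuming markdown input (the Actor parses HTML, and the merging of undersized sections is omitted here):

```typescript
// Each chunk keeps its full header path (H1 → H2 → H3) for retrieval context.
interface Chunk {
  path: string[];   // e.g. ["Getting Started", "Installation"]
  content: string;
}

// Walk the document line by line; each H1-H3 header closes the previous
// chunk and updates the current header path.
function chunkByHeaders(markdown: string): Chunk[] {
  const chunks: Chunk[] = [];
  const path: string[] = [];
  let buffer: string[] = [];

  const flush = () => {
    const content = buffer.join("\n").trim();
    if (content) chunks.push({ path: [...path], content });
    buffer = [];
  };

  for (const line of markdown.split("\n")) {
    const m = line.match(/^(#{1,3})\s+(.*)$/);  // H1-H3 only
    if (m) {
      flush();
      const level = m[1].length;
      path.length = level - 1;  // pop headers deeper than the new level
      path.push(m[2]);
    } else {
      buffer.push(line);
    }
  }
  flush();
  return chunks;
}
```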
## Pricing

This Actor uses pay-per-event pricing through Apify. Costs include:

- **Scraping** - Initial crawl of the documentation
- **Embeddings** - Generated via OpenRouter (charged to your Apify account)
- **Tool calls** - Each MCP tool invocation
## Development

```shell
# Install dependencies
npm install

# Run locally
npm run start:dev

# Build
npm run build

# Push to Apify
apify push
```
## License

ISC
