# Documentation MCP Server (RAG-enabled)

An MCP server that provides AI assistants with semantic search access to any documentation. Uses RAG (Retrieval-Augmented Generation) with vector embeddings to find relevant content based on meaning, not just keywords.

## Features

- **Semantic Search** - Find relevant content by meaning, not just keyword matching
- **Recursive Chunking** - Documents are split by headers (H1 → H2 → H3) for precise retrieval
- **Low Token Usage** - Returns ~500-1000 tokens per query instead of 15k+ token full documents
- **Vector Embeddings** - Uses text-embedding-3-small via OpenRouter (proxied through Apify)
- **PostgreSQL + pgvector** - Supabase for scalable vector storage
## Architecture

```
Scrape → Parse HTML → Recursive Chunk → Generate Embeddings → Store in Supabase
                                                                      ↓
User Query → Embed Query → Vector Similarity Search → Return Relevant Chunks
```
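The query path can be sketched in TypeScript. Note that the real Actor ranks chunks inside Postgres via pgvector's `<=>` operator; this in-memory `rankChunks` (a hypothetical name) only illustrates the same cosine-similarity ordering:

```typescript
// Sketch of the query-time flow; interface and function names are
// illustrative, not the Actor's actual code.
interface DocChunk {
  docId: string;
  sectionPath: string[];  // e.g. ["Getting Started", "Installation"]
  heading: string;
  content: string;
  embedding: number[];    // 1536-dim vector from text-embedding-3-small
}

// Rank stored chunks by cosine similarity to the embedded query,
// highest similarity first.
function rankChunks(queryEmbedding: number[], chunks: DocChunk[], topK = 5): DocChunk[] {
  const dot = (a: number[], b: number[]) => a.reduce((s, x, i) => s + x * b[i], 0);
  const norm = (a: number[]) => Math.sqrt(dot(a, a));
  const cosine = (a: number[], b: number[]) => dot(a, b) / (norm(a) * norm(b));
  return [...chunks]
    .sort((a, b) => cosine(queryEmbedding, b.embedding) - cosine(queryEmbedding, a.embedding))
    .slice(0, topK);
}
```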
## Available Tools

| Tool | Description | Token Usage |
|---|---|---|
| `list_docs` | List all docs with metadata (id, title, summary, sections) | ~200 |
| `search` | Semantic search across all docs | ~500-1000 |
| `get_doc_overview` | Get document summary and section structure | ~200 |
| `get_section` | Get content of a specific section | ~500-1000 |
| `get_chunks` | Paginated access to all chunks in a doc | ~500-1000 |
| `search_in_doc` | Semantic search within a specific document | ~500-1000 |
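A minimal sketch of how these tools could dispatch to handlers; the handler bodies and return values below are placeholders, not the Actor's actual implementation:

```typescript
// Hypothetical tool-dispatch table mirroring the six tools above.
type ToolHandler = (args: Record<string, unknown>) => Promise<string> | string;

const tools: Record<string, ToolHandler> = {
  list_docs:        () => "…all docs with metadata…",
  search:           ({ query }) => `…chunks matching "${query}"…`,
  get_doc_overview: ({ docId }) => `…summary and sections of ${docId}…`,
  get_section:      ({ docId, section }) => `…"${section}" from ${docId}…`,
  get_chunks:       ({ docId, page }) => `…page ${page} of ${docId}…`,
  search_in_doc:    ({ docId, query }) => `…"${query}" within ${docId}…`,
};

// Route an MCP tool call by name; unknown names are rejected.
async function callTool(name: string, args: Record<string, unknown>): Promise<string> {
  const handler = tools[name];
  if (!handler) throw new Error(`Unknown tool: ${name}`);
  return handler(args);
}
```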
## Setup

### 1. Create Supabase Project

- Go to supabase.com and create a project
- Open the SQL Editor and run the schema below
- Copy your Project URL and anon key from Settings → API
### 2. Supabase SQL Schema

Run this in your Supabase SQL Editor:

```sql
-- Enable pgvector extension
create extension if not exists vector;

-- Document metadata table
create table doc_metadata (
  id text primary key,
  title text not null,
  url text not null,
  summary text,
  sections text[],
  total_chunks integer,
  type text,
  created_at timestamp default now()
);

-- Chunks table with vector embeddings
create table doc_chunks (
  id uuid primary key default gen_random_uuid(),
  doc_id text references doc_metadata(id) on delete cascade,
  doc_title text,
  doc_url text,
  section_path text[],
  heading text,
  content text not null,
  token_count integer,
  chunk_index integer,
  embedding vector(1536),
  created_at timestamp default now()
);

-- Function for similarity search
create or replace function search_chunks(
  query_embedding vector(1536),
  match_count int default 5,
  filter_doc_id text default null
)
returns table (
  id uuid,
  doc_id text,
  doc_title text,
  section_path text[],
  heading text,
  content text,
  similarity float
)
language plpgsql
as $$
begin
  return query
  select
    dc.id,
    dc.doc_id,
    dc.doc_title,
    dc.section_path,
    dc.heading,
    dc.content,
    1 - (dc.embedding <=> query_embedding) as similarity
  from doc_chunks dc
  where (filter_doc_id is null or dc.doc_id = filter_doc_id)
  order by dc.embedding <=> query_embedding
  limit match_count;
end;
$$;

-- Optional: add an index for large datasets (1000+ chunks)
-- create index doc_chunks_embedding_idx
--   on doc_chunks using ivfflat (embedding vector_cosine_ops)
--   with (lists = 100);
```
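From Node, `search_chunks` is invoked as a Supabase RPC. The sketch below only builds the parameter object (the helper name `searchChunksParams` is made up); the commented line shows where `@supabase/supabase-js`'s `rpc()` would consume it:

```typescript
// Hypothetical helper that assembles the arguments for the search_chunks
// function defined in the schema above. Parameter names must match the
// SQL function's argument names.
function searchChunksParams(
  queryEmbedding: number[],
  matchCount = 5,
  docId?: string
): { query_embedding: number[]; match_count: number; filter_doc_id: string | null } {
  return {
    query_embedding: queryEmbedding,
    match_count: matchCount,
    filter_doc_id: docId ?? null,
  };
}

// Usage with the Supabase client (not run here):
// const { data, error } = await supabase.rpc("search_chunks", searchChunksParams(embedding, 5));
```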
### 3. Configure Environment Variables

Set these in your Apify Actor, or use Apify secrets:

```shell
# Add secrets (recommended)
apify secrets add supabaseKey "your-supabase-anon-key"
```

In `.actor/actor.json`:

```json
{
  "environmentVariables": {
    "SUPABASE_URL": "https://your-project.supabase.co",
    "SUPABASE_KEY": "@supabaseKey"
  }
}
```
### 4. Configure Documentation Sources

| Input | Description | Default |
|---|---|---|
| Start URLs | Documentation pages to scrape | Apify SDK docs |
| Max Pages | Maximum pages to scrape (1-1000) | 100 |
| Force Refresh | Re-scrape even if data exists | false |

Example input:

```json
{
  "startUrls": [
    { "url": "https://docs.example.com/getting-started" },
    { "url": "https://docs.example.com/api-reference" }
  ],
  "maxPages": 200,
  "forceRefresh": false
}
```
### 5. Deploy and Connect

```shell
# Push to Apify
apify push

# Add to Claude Code (get the URL from the Actor output)
claude mcp add api-docs https://<YOUR_ACTOR_URL>/mcp \
  --transport http \
  --header "Authorization: Bearer <YOUR_APIFY_TOKEN>"
```
## Usage Examples

Once connected, your AI assistant can:

- "Search the docs for authentication best practices" → returns relevant chunks from multiple documents
- "Show me the overview of the API reference doc" → returns summary and section list
- "Get the 'Getting Started' section from doc-1" → returns specific section content
- "What documentation is available?" → returns list of all docs with summaries
## API Endpoints

| Endpoint | Method | Description |
|---|---|---|
| `/` | GET | Server status and setup instructions |
| `/mcp` | POST | MCP protocol endpoint |
| `/refresh-docs` | POST | Re-scrape and update all documentation |
| `/stats` | GET | Get document and chunk statistics |
## How Chunking Works

Documents are split recursively by headers:

```
Document (15k tokens)
├── H1: Introduction (chunk 1, ~600 tokens)
├── H2: Getting Started
│   ├── H3: Installation (chunk 2, ~400 tokens)
│   └── H3: Configuration (chunk 3, ~500 tokens)
├── H2: API Reference
│   ├── H3: Methods (chunk 4, ~700 tokens)
│   └── H3: Examples (chunk 5, ~600 tokens)
└── ...
```

- Target chunk size: 500-800 tokens
- Max chunk size: 1000 tokens
- Min chunk size: 100 tokens (smaller sections are merged with their parent)
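A simplified sketch of header-based recursive splitting, assuming markdown input (the Actor parses HTML, and the merging of undersized sections is omitted here):

```typescript
// Each chunk keeps its full header path (H1 → H2 → H3) for retrieval context.
interface Chunk {
  path: string[];   // e.g. ["Getting Started", "Installation"]
  content: string;
}

// Walk the document line by line; each H1-H3 header closes the previous
// chunk and updates the current header path.
function chunkByHeaders(markdown: string): Chunk[] {
  const chunks: Chunk[] = [];
  const path: string[] = [];
  let buffer: string[] = [];

  const flush = () => {
    const content = buffer.join("\n").trim();
    if (content) chunks.push({ path: [...path], content });
    buffer = [];
  };

  for (const line of markdown.split("\n")) {
    const m = line.match(/^(#{1,3})\s+(.*)$/);  // H1-H3 only
    if (m) {
      flush();
      const level = m[1].length;
      path.length = level - 1;  // pop headers deeper than the new level
      path.push(m[2]);
    } else {
      buffer.push(line);
    }
  }
  flush();
  return chunks;
}
```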
## Pricing

This Actor uses pay-per-event pricing through Apify. Costs include:

- **Scraping** - Initial crawl of the documentation
- **Embeddings** - Generated via OpenRouter (charged to your Apify account)
- **Tool calls** - Each MCP tool invocation
## Development

```shell
# Install dependencies
npm install

# Run locally
npm run start:dev

# Build
npm run build

# Push to Apify
apify push
```
## License

ISC
