YouTube Knowledge Base MCP
An MCP server that builds a searchable knowledge base from video content.
Why
We consume more content than we can remember. Videos watched, podcasts heard, lectures attended—the information fades. This project builds a searchable knowledge base from that content. Start with YouTube, expand to other sources.
The key: it's an MCP server. Plug it into any LLM (Claude, GPT, local models) and your AI assistant can search everything you've ever watched. Your memory, augmented.
Features
- Extract transcripts from YouTube videos
- Hybrid search (semantic + keyword)
- Timestamped links to exact video moments
- Organize with tags and notes
- Multiple embedding providers (Voyage, OpenAI, local)
Installation
Requirements
- Python 3.10+
- uv package manager
- One of: Voyage API key, OpenAI API key, or local Ollama
Setup
git clone https://github.com/yourusername/youtube-knowledge-base-mcp.git
cd youtube-knowledge-base-mcp
uv sync
Environment
cp .env.example .env
Add your API key (at least one required):
VOYAGE_API_KEY=your_key_here
# or
OPENAI_API_KEY=your_key_here
Usage
With Claude Desktop (recommended)
Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
"mcpServers": {
"youtube-kb": {
"command": "uv",
"args": ["--directory", "/path/to/youtube-knowledge-base-mcp", "run", "youtube-kb"]
}
}
}
Then ask Claude: "Add this video to my knowledge base: [URL]"
With Python
See demo.ipynb for interactive examples.
from youtube_knowledgebase_mcp import process_video, search
# Add a video
result = await process_video("https://youtube.com/watch?v=...")
# Search
results = await search("What is context engineering?")
for r in results.results:
print(r.timestamp_link) # Jump to exact moment
MCP Tools
4 workflow-based tools designed for LLM efficiency:
| Tool | Description |
|---|---|
process_video | Add a video to the knowledge base (with optional tags/summary) |
manage_source | Update tags and summary for a source |
explore_library | Browse sources, list tags, or get statistics |
search | Hybrid semantic + keyword search with reranking |
Developer CLI
Administrative commands for database management (not exposed to LLMs):
uv run kb db stats # Show database statistics
uv run kb db reset --confirm # Reset database (destructive)
uv run kb db migrate <path> # Move database to new location
uv run kb source list # List all sources
uv run kb source delete <id> # Delete a source
uv run kb health # System health check
uv run kb import-urls <file> # Bulk import from file
Run uv run kb --help for all commands.
Configuration
Data Location
By default, data is stored in your OS's standard application data directory:
- macOS:
~/Library/Application Support/youtube-kb/ - Linux:
~/.local/share/youtube-kb/ - Windows:
%APPDATA%/youtube-kb/
Note: If you have existing data in
./data/from a previous version, it will continue to be used automatically.
To use a custom location, set the YOUTUBE_KB_DATA_DIR environment variable:
export YOUTUBE_KB_DATA_DIR=/path/to/custom/location
Or in Claude Desktop config:
{
"mcpServers": {
"youtube-kb": {
"command": "uv",
"args": ["--directory", "/path/to/repo", "run", "youtube-kb"],
"env": {
"YOUTUBE_KB_DATA_DIR": "/custom/data/path"
}
}
}
}
Moving Your Database
To move your database to a new location (e.g., Dropbox):
uv run kb db migrate ~/Dropbox/youtube-kb --confirm
Then follow the printed instructions to set the environment variable.
Architecture
youtube_knowledgebase_mcp/
├── core/ # Config, models, database, embeddings
├── repositories/ # Data access layer (LanceDB)
├── services/ # Business logic (search, ingestion, organization)
├── mcp_tools.py # MCP tools (4 workflow-based tools)
└── cli.py # Developer CLI for admin operations
Tech Stack
- LanceDB - Vector database with hybrid search
- yt-dlp - YouTube transcript extraction
- Embeddings - Voyage (default), OpenAI, BGE, Ollama
- FastMCP - MCP server framework
License
MIT
