MCP Audio RAG Server
Transform your audio files into a searchable knowledge base using AI. Ask Claude questions about your meetings, podcasts, lectures, or any audio content.
What is this?
This is an MCP (Model Context Protocol) server that lets you:
- Transcribe audio files in any of the supported formats using Google's Gemini AI
- Store the transcriptions in a searchable database
- Search through all your audio content using natural language
Once set up, you can simply ask Claude things like:
- "What did they discuss about the budget in my meeting recording?"
- "Find mentions of machine learning in my podcast collection"
- "What were the key points from yesterday's lecture?"
How It Works
┌─────────────┐     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│ Audio File  │ ──▶ │   Gemini    │ ──▶ │  Chunking   │ ──▶ │  Supabase   │
│ (.mp3, etc) │     │ Transcribe  │     │ + Embedding │     │ (pgvector)  │
└─────────────┘     └─────────────┘     └─────────────┘     └─────────────┘
                                                                   │
┌─────────────┐     ┌─────────────┐     ┌─────────────┐            │
│   Claude    │ ◀── │   Results   │ ◀── │   Search    │ ◀──────────┘
│  Response   │     │ + Snippets  │     │   Query     │
└─────────────┘     └─────────────┘     └─────────────┘
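The chunking stage in the diagram can be pictured with a small sketch. This is not the project's actual implementation: the function name `chunkTranscript`, the word-based splitting, and the `chunkSize` and `overlap` defaults are all assumptions chosen for illustration. The idea is that a long transcript is cut into overlapping pieces so each piece can be embedded and stored separately.

```typescript
// Hypothetical helper: split a transcript into overlapping word-based chunks.
// chunkSize and overlap are illustrative values, not the project's settings.
function chunkTranscript(text: string, chunkSize = 200, overlap = 40): string[] {
  if (overlap >= chunkSize) {
    throw new Error("overlap must be smaller than chunkSize");
  }
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];
  const step = chunkSize - overlap; // how far the window advances each time
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + chunkSize).join(" "));
    // Stop once this chunk has reached the end of the transcript.
    if (start + chunkSize >= words.length) break;
  }
  return chunks;
}
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help retrieval.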
Quick Start
Prerequisites
- Node.js 18+ - Download from nodejs.org
- Gemini API Key - Get one free at aistudio.google.com/apikey
- Supabase Account - Sign up free at supabase.com
Step 1: Clone & Install
git clone https://github.com/matheusslg/mcp-audio-rag.git
cd mcp-audio-rag
npm install
Step 2: Set Up Supabase Database
- Create a new project at supabase.com
- Go to SQL Editor in your dashboard
- Paste and run the contents of supabase/schema.sql
Step 3: Get Your API Keys
Supabase (Settings → API):
- Copy Project URL → SUPABASE_URL
- Copy service_role key → SUPABASE_SERVICE_KEY
Google AI Studio:
- Create key at aistudio.google.com/apikey → GEMINI_API_KEY
Step 4: Configure
cp .env.example .env
Edit .env:
GEMINI_API_KEY=your-key-here
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_SERVICE_KEY=your-service-role-key
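To confirm that all three variables are set before starting the server, a check along these lines may help. It is a sketch only: the helper name `missingEnvVars` is hypothetical and is not part of this project. It takes the environment as a plain object so it can be tried without any setup.

```typescript
// The three variables this README asks you to configure.
const REQUIRED_VARS = ["GEMINI_API_KEY", "SUPABASE_URL", "SUPABASE_SERVICE_KEY"] as const;

type Env = Record<string, string | undefined>;

// Hypothetical helper: return the names of required variables that are
// unset or blank in the given environment object.
function missingEnvVars(env: Env): string[] {
  return REQUIRED_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Example with an incomplete environment.
const missing = missingEnvVars({ GEMINI_API_KEY: "your-key-here" });
console.log(missing.length === 0 ? "All variables set" : `Missing: ${missing.join(", ")}`);
```

In a Node.js script you would pass `process.env` instead of the literal object.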
Step 5: Add to Claude
For Claude Code CLI (~/.claude.json):
{
  "mcpServers": {
    "audio-rag": {
      "command": "npx",
      "args": ["tsx", "/full/path/to/mcp-audio-rag/src/server.ts"],
      "env": {
        "GEMINI_API_KEY": "your-key",
        "SUPABASE_URL": "https://your-project.supabase.co",
        "SUPABASE_SERVICE_KEY": "your-service-role-key"
      }
    }
  }
}
For Claude Desktop (~/Library/Application Support/Claude/claude_desktop_config.json on macOS):
Use the same configuration as above.
Usage
Transcribe Audio
Just tell Claude to transcribe a file:
Transcribe /path/to/meeting.mp3
Want to use a specific model? Just ask:
Transcribe /path/to/lecture.m4a using gemini-2.5-pro
Search Your Audio
Ask natural questions:
What did they say about the project timeline?
Search for mentions of "budget" in my recordings
Find discussions about AI in my podcasts
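Natural-language search of this kind typically works by comparing an embedding of your question with the stored chunk embeddings. The sketch below shows that idea in memory, assuming cosine similarity as the metric. In this project the comparison would happen inside Supabase via pgvector, and the actual metric, the `Chunk` shape, and the function names here are assumptions for illustration.

```typescript
// Hypothetical shape of a stored transcript chunk.
interface Chunk {
  text: string;
  embedding: number[];
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vector length mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // zero vectors have no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks whose embeddings are most similar to the query.
function topMatches(query: number[], chunks: Chunk[], k = 3): Chunk[] {
  return [...chunks]
    .sort((x, y) => cosineSimilarity(query, y.embedding) - cosineSimilarity(query, x.embedding))
    .slice(0, k);
}
```

Real embeddings have hundreds of dimensions, and a database index avoids scanning every chunk, but the ranking principle is the same.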
Manage Your Library
List all my transcribed audio files
Delete the recording from last week
Get the full transcript of meeting.mp3
Summarize the podcast episode
Available Models
| Model | Best For |
|---|---|
| gemini-2.5-flash | Default - Fast & accurate, great balance |
| gemini-2.5-flash-lite | Fastest, cheapest - good for bulk processing |
| gemini-2.5-pro | Best quality - complex audio, multiple speakers |
| gemini-3-pro-preview | Newest - cutting edge capabilities |
| gemini-2.0-flash | Reliable - previous generation |
| gemini-2.0-flash-lite | Fast - previous generation |
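One way to picture model selection is a lookup that falls back to the default when no model is requested. This is a sketch under assumptions: `resolveModel` is a hypothetical name, and the server's real handling of unknown model names may differ from the error thrown here.

```typescript
// Model names taken from the table above.
const SUPPORTED_MODELS = [
  "gemini-2.5-flash",
  "gemini-2.5-flash-lite",
  "gemini-2.5-pro",
  "gemini-3-pro-preview",
  "gemini-2.0-flash",
  "gemini-2.0-flash-lite",
] as const;

type GeminiModel = (typeof SUPPORTED_MODELS)[number];

const DEFAULT_MODEL: GeminiModel = "gemini-2.5-flash";

// Hypothetical helper: pick the requested model, or the default if none given.
function resolveModel(requested?: string): GeminiModel {
  if (requested === undefined) return DEFAULT_MODEL;
  const wanted = requested.trim().toLowerCase();
  const match = SUPPORTED_MODELS.find((m) => m === wanted);
  if (match === undefined) {
    throw new Error(`Unsupported model "${requested}". Choose one of: ${SUPPORTED_MODELS.join(", ")}`);
  }
  return match;
}
```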
Supported Audio Formats
.mp3 .mp4 .m4a .wav .webm .mpeg .mpga
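If you are batch-processing a folder, a quick extension check can filter out files the server will not accept. The helper below is a hypothetical sketch, not project code. It only looks at the file name, so it cannot tell whether the file contents are valid audio.

```typescript
// Extensions listed in this README.
const SUPPORTED_EXTENSIONS = [".mp3", ".mp4", ".m4a", ".wav", ".webm", ".mpeg", ".mpga"];

// Hypothetical helper: true if the path ends in a supported extension.
function isSupportedAudio(filePath: string): boolean {
  // Take the last path segment, accepting both / and \ separators.
  const fileName = filePath.split(/[\\/]/).pop() ?? "";
  const dot = fileName.lastIndexOf(".");
  if (dot <= 0) return false; // no extension, or a dotfile such as ".mp3"
  return SUPPORTED_EXTENSIONS.includes(fileName.slice(dot).toLowerCase());
}

const files = ["/path/to/meeting.mp3", "/path/to/notes.txt", "/path/to/lecture.M4A"];
console.log(files.filter(isSupportedAudio));
```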
Available Tools
| Tool | Description |
|---|---|
ingest_audio | Transcribe and store an audio file |
search_transcripts | Search through your audio using natural language |
list_transcripts | List all transcribed audio files |
get_full_transcript | Get the complete transcript of a file |
summarize_audio | Generate an AI summary of a transcript |
delete_transcript | Remove a transcribed file from the database |
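Claude normally calls these tools for you, but it can help to see what a call looks like on the wire. MCP uses JSON-RPC 2.0, and tools are invoked with the `tools/call` method. The sketch below builds such a request for `ingest_audio`. The argument names `file_path` and `model` are assumptions for illustration; the server's real input schema may use different names.

```typescript
// JSON-RPC 2.0 request shape used by MCP's tools/call method.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Build a tools/call request for the given tool name and arguments.
function buildToolCall(id: number, name: string, args: Record<string, unknown>): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// "file_path" and "model" are hypothetical argument names, not confirmed by the project.
const request = buildToolCall(1, "ingest_audio", {
  file_path: "/path/to/meeting.mp3",
  model: "gemini-2.5-flash",
});
console.log(JSON.stringify(request, null, 2));
```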
Troubleshooting
| Problem | Solution |
|---|---|
| "No relevant segments found" | Try rephrasing your search, or confirm that the audio was ingested |
| "Missing environment variable" | Check that your .env file or Claude config sets all three variables |
| Supabase errors | Make sure you're using the service_role key, not the anon key |
| Slow transcription | Use gemini-2.5-flash-lite for faster processing |
Support This Project
If this project saved you time or helped you out, consider buying me a coffee!
License
MIT - Use it however you want!
Made with Gemini + Supabase + Claude
