Crawl4AI MCP Wrapper
Custom Model Context Protocol (MCP) server that wraps the Crawl4AI Docker API with reliable stdio transport for Claude Code integration.
Overview
This MCP wrapper provides 7 tools for web scraping, crawling, and content extraction:
- scrape_markdown - Extract clean markdown from webpages
- extract_html - Get preprocessed HTML structures
- capture_screenshot - Take full-page PNG screenshots
- generate_pdf - Create PDF documents from webpages
- execute_javascript - Run JavaScript in browser context
- crawl_urls - Crawl multiple URLs with configuration options
- ask_crawl4ai - Query Crawl4AI documentation and examples
Prerequisites
- Python 3.8 or higher
- Crawl4AI Docker container running on port 11235
- Claude Code installed
Installation
1. Start Crawl4AI Docker Container
docker run -d -p 11235:11235 --name crawl4ai --shm-size=2g unclecode/crawl4ai:latest
Verify it's running:
curl http://localhost:11235/health
2. Set Up Virtual Environment
cd /Volumes/4TB/Users/josephmcmyne/myProjects/mcp/crawl4ai-wrapper
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
3. Test the Wrapper
python test_wrapper.py
You should see all tests pass:
✅ All tests passed! MCP wrapper is ready to use.
Configuration
For Claude Code (Project-Level)
Add to /Volumes/4TB/Users/josephmcmyne/general/.mcp.json:
{
"mcpServers": {
"iterm-mcp": {
"args": ["-y", "iterm-mcp"],
"command": "npx"
},
"crawl4ai-custom": {
"command": "/Volumes/4TB/Users/josephmcmyne/myProjects/mcp/crawl4ai-wrapper/venv/bin/python",
"args": ["/Volumes/4TB/Users/josephmcmyne/myProjects/mcp/crawl4ai-wrapper/crawl4ai_mcp.py"],
"env": {
"CRAWL4AI_BASE_URL": "http://localhost:11235"
}
}
}
}
For Claude Code (Global)
Add to ~/.claude/claude_desktop_config.json (if this file exists):
{
"mcpServers": {
"crawl4ai-custom": {
"command": "/Volumes/4TB/Users/josephmcmyne/myProjects/mcp/crawl4ai-wrapper/venv/bin/python",
"args": ["/Volumes/4TB/Users/josephmcmyne/myProjects/mcp/crawl4ai-wrapper/crawl4ai_mcp.py"],
"env": {
"CRAWL4AI_BASE_URL": "http://localhost:11235"
}
}
}
}
Usage
Restart Claude Code
After configuration, restart Claude Code to load the new MCP server.
Verify Connection
claude mcp list
You should see:
crawl4ai-custom: ... - ✓ Connected
Use in Claude Code
Example prompts:
Scrape the content from https://example.com and summarize it.
Take a screenshot of https://github.com
Crawl these URLs and extract their main content: https://example.com, https://example.org
Tool Reference
scrape_markdown
Extract clean markdown from a webpage.
Parameters:
url(required): URL to scrapefilter_type(optional): 'fit', 'raw', 'bm25', or 'llm' (default: 'fit')query(optional): Query string for BM25/LLM filterscache_bust(optional): Cache-bust counter (default: '0')
Example:
result = await scrape_markdown("https://example.com")
extract_html
Get preprocessed HTML from a webpage.
Parameters:
url(required): URL to extract HTML from
capture_screenshot
Capture a full-page PNG screenshot.
Parameters:
url(required): URL to screenshotscreenshot_wait_for(optional): Seconds to wait before capture (default: 2.0)output_path(optional): Path to save the screenshot file
generate_pdf
Generate a PDF document from a webpage.
Parameters:
url(required): URL to convert to PDFoutput_path(optional): Path to save the PDF file
execute_javascript
Run JavaScript code on a webpage.
Parameters:
url(required): URL to execute scripts onscripts(required): List of JavaScript snippets to execute
crawl_urls
Crawl multiple URLs with configuration.
Parameters:
urls(required): List of 1-100 URLs to crawlbrowser_config(optional): Browser configuration overridescrawler_config(optional): Crawler configuration overrideshooks(optional): Custom hook functions
ask_crawl4ai
Query Crawl4AI documentation.
Parameters:
query(required): Search querycontext_type(optional): 'code', 'doc', or 'all' (default: 'all')score_ratio(optional): Minimum score threshold (default: 0.5)max_results(optional): Maximum results (default: 20)
Troubleshooting
Container Not Running
If tests fail with connection errors:
docker ps | grep crawl4ai
If not running, start it:
docker start crawl4ai
MCP Server Not Connecting
Check Claude Code logs:
tail -f ~/Library/Logs/Claude/mcp*.log
Verify the Python path in your .mcp.json is correct:
which python
# Should match the venv path in your config
Port Conflicts
If port 11235 is in use:
- Stop the Crawl4AI container:
docker stop crawl4ai - Find the conflicting process:
lsof -i :11235 - Either stop that process or change the port in both Docker and this wrapper
Development
Adding New Tools
To add a new tool:
- Add an async function decorated with
@mcp.tool()incrawl4ai_mcp.py - Make an HTTP request to the appropriate Crawl4AI endpoint
- Add error handling
- Add a test in
test_wrapper.py - Update this README
Running in Debug Mode
# Enable debug logging
export CRAWL4AI_MCP_LOG=DEBUG
# Run the server
python crawl4ai_mcp.py
Architecture
┌─────────────────┐
│ Claude Code │
└────────┬────────┘
│ stdio (MCP protocol)
▼
┌─────────────────┐
│ FastMCP Server │ (this wrapper)
│ crawl4ai_mcp.py│
└────────┬────────┘
│ HTTP/REST API
▼
┌─────────────────┐
│ Crawl4AI Docker │
│ Container │
│ (port 11235) │
└─────────────────┘
Benefits Over Official SSE Server
- Reliable stdio transport: No SSE connection issues
- Full control: Easy to extend and customize
- Better error handling: Graceful degradation
- Simple debugging: Standard Python stack traces
- No protocol mismatches: Direct HTTP to Crawl4AI API
License
MIT License - feel free to modify and distribute.
Credits
- Built with FastMCP
- Wraps Crawl4AI
- For use with Claude Code
