Enterprise Code Search MCP Server
A powerful Model Context Protocol (MCP) server for semantic code search with shared vector database. Supports both OpenAI and Ollama for embeddings, and can index local projects or Git repositories.
🚀 Features
- Semantic code search using AI embeddings
- Dual provider support: OpenAI or Ollama (local, private)
- Flexible indexing: Local projects or Git repositories
- Shared vector database with ChromaDB
- Multi-project management: Handle multiple projects simultaneously
- Automatic project structure analysis
- Similar code search based on code snippets
- Enterprise-ready: Private, secure, self-hosted
📋 Requirements
- Node.js 18+
- Docker and Docker Compose
- Git (for repository indexing)
🛠️ Quick Start
1. Clone the repository
git clone https://github.com/your-username/semantic-context-mcp.git
cd semantic-context-mcp
2. Install dependencies
npm install
3. Configure environment
cp .env.example .env
# Edit .env with your configuration
4. Start services
# Start ChromaDB and Ollama
docker-compose up -d
# Wait for Ollama to download models
docker-compose logs -f ollama-setup
5. Build and run
npm run build
npm start
⚙️ Configuration
Using Ollama (Recommended for Enterprise)
# .env
EMBEDDING_PROVIDER=ollama
OLLAMA_HOST=http://localhost:11434
OLLAMA_MODEL=nomic-embed-text
CHROMA_HOST=localhost
CHROMA_PORT=8000
Using OpenAI
# .env
EMBEDDING_PROVIDER=openai
OPENAI_API_KEY=your-api-key
OPENAI_MODEL=text-embedding-3-small
🔧 Claude Desktop Integration
To use this MCP server with Claude Desktop, add to your claude_desktop_config.json:
{
"mcpServers": {
"enterprise-code-search": {
"command": "node",
"args": ["/path/to/semantic-context-mcp/dist/index.js"],
"env": {
"EMBEDDING_PROVIDER": "ollama",
"OLLAMA_HOST": "http://localhost:11434",
"OLLAMA_MODEL": "nomic-embed-text",
"CHROMA_HOST": "localhost",
"CHROMA_PORT": "8000",
"COMPANY_NAME": "YourCompany"
}
}
}
}
🎯 Usage Examples
1. Index a local project
Index my local project at /home/user/my-app with the name "frontend-app"
2. Search in code
Search for "main application function" in all indexed projects
3. Find similar code
Find code similar to:
```python
def authenticate_user(username, password):
return check_credentials(username, password)
4. Analyze project structure
Analyze the structure of project "frontend-app"
🛠️ Available Tools
| Tool | Description |
|---|---|
index_local_project | Index a local directory |
search_codebase | Semantic search in code |
list_indexed_projects | List all indexed projects |
get_embedding_provider_info | Get embedding provider information |
📊 Example Queries
Functional searches
- "Where is the authentication logic?"
- "Functions that handle database operations"
- "Environment variable configuration"
- "Unit tests for the API"
Code analysis
- "What design patterns are used?"
- "Most complex functions in the project"
- "Error handling in the code"
Technology-specific search
- "Code using React hooks"
- "PostgreSQL queries"
- "Docker configuration"
🔧 Advanced Configuration
Recommended Ollama Models
# For code embeddings
ollama pull nomic-embed-text # Best for code (384 dims)
ollama pull all-minilm # Lightweight alternative (384 dims)
ollama pull mxbai-embed-large # Higher precision (1024 dims)
File Patterns
The server supports extensive file type recognition including:
- Programming Languages: Python, JavaScript/TypeScript, Java, C/C++, Go, Rust, PHP, Ruby, Swift, Kotlin, Scala, and more
- Web Technologies: HTML, CSS, SCSS, Vue, Svelte
- Configuration: JSON, YAML, TOML, Docker, Terraform
- Documentation: Markdown, reStructuredText, AsciiDoc
- Database: SQL files
Performance Tuning
# Maximum chunk size (characters)
MAX_CHUNK_SIZE=1500
# Maximum file size (KB)
MAX_FILE_SIZE=500
# Batch size for indexing
BATCH_SIZE=100
🏢 Enterprise Deployment
Option 1: Dedicated Server
# On enterprise server
docker-compose up -d
Option 2: Network Deployment
# Configure for network access
CHROMA_HOST=192.168.1.100
OLLAMA_HOST=http://192.168.1.100:11434
🔒 Security Considerations
Key Benefits
- Private Data: Ollama keeps everything local
- No External APIs: When using Ollama, no data leaves your network
- Self-hosted: Full control over your code and embeddings
- Isolated Environment: Docker containers provide isolation
Security Best Practices
# Restrict ChromaDB access
CHROMA_SERVER_HOST=127.0.0.1 # Localhost only
# Use HTTPS for production
OLLAMA_HOST=https://ollama.company.com
📈 Monitoring & Troubleshooting
Useful Logs
# View indexing logs
docker-compose logs -f enterprise-mcp-server
# ChromaDB performance
docker-compose logs -f chromadb
# Monitor Ollama
curl http://localhost:11434/api/tags
Common Issues
Ollama not responding:
curl http://localhost:11434/api/tags
# If it fails: docker-compose restart ollama
ChromaDB slow:
# Check disk space
docker system df
# Clean if necessary
docker system prune
Poor embedding quality:
- Try different model:
all-minilmvsnomic-embed-text - Adjust chunk size
- Verify source file quality
🤝 Collaborative Workflow
Typical Enterprise Workflow
- DevOps indexes main projects
- Developers search code using Claude
- Automatic updates via CI/CD
- Code analysis for code reviews
Best Practices
- Index after important merges
- Use descriptive project names
- Maintain project-specific search filters
- Document naming conventions
🛠️ Development
Project Structure
src/
├── index.ts # Main MCP server
└── http-server.ts # HTTP server variant
scripts/ # Setup and utility scripts
docker-compose.yml # Service orchestration
package.json # Dependencies and scripts
Available Scripts
npm run build # Compile TypeScript
npm run dev # Development mode
npm run start # Production mode
npm run clean # Clean build directory
📚 API Reference
The MCP server implements the standard Model Context Protocol with these specific tools:
- index_local_project: Index local directories with configurable file patterns
- search_codebase: Semantic search with project filtering and similarity scoring
- list_indexed_projects: Enumerate all indexed projects with metadata
- get_embedding_provider_info: Get current provider status and configuration
Each tool includes detailed JSON schema with examples and validation.
🤖 Recommended AI Models
For embeddings (Ollama)
nomic-embed-text: Optimized for codeall-minilm: Balanced, fastmxbai-embed-large: High precision
For embeddings (OpenAI)
text-embedding-3-small: Cost-effectivetext-embedding-3-large: Higher precision
🐳 Docker Support
The project includes a complete Docker setup:
- ChromaDB: Vector database for embeddings
- Ollama: Local embedding generation
- PostgreSQL: Optional metadata storage
All services are orchestrated with Docker Compose for easy deployment.
☕ Support
If this project helps you with your development workflow, consider supporting it:
📄 License
MIT License - see LICENSE file for details.
🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the project
- Create your feature branch (
git checkout -b feature/AmazingFeature) - Commit your changes (
git commit -m 'Add some AmazingFeature') - Push to the branch (
git push origin feature/AmazingFeature) - Open a Pull Request
📞 Support & Issues
- 📧 Issues: GitHub Issues
- 💬 Discussions: GitHub Discussions
- ☕ Support: Buy Me a Coffee

