RAG MCP Application
This project demonstrates a powerful, modular AI application using the Model Context Protocol (MCP). The architecture follows a clean agent-tool-resource model: a central orchestrator LLM acts as the agent, consuming tools provided by a lean MCP server to access various resources.
The core components are:
- `client_ui.py`: A Gradio-based client that houses the single orchestrator LLM. This agent is responsible for all reasoning, including deciding when to use tools and generating final responses based on tool outputs.
- `rag_server.py`: A lightweight MCP server that provides tools to access resources. It does not contain any LLM. The available tools are:
  - `search_knowledge_base`: Accesses a ChromaDB vector database (the resource) to retrieve relevant information.
  - `get_weather`: Accesses an external weather API (the resource).
This separation of concerns makes the system highly modular and easy to extend.
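The real tool definitions live in `rag_server.py`; the sketch below only illustrates how such a lean, LLM-free tool server could look, assuming the official Python MCP SDK's `FastMCP` helper, a ChromaDB collection named `documents`, Google's `models/embedding-001` model for query embeddings, and a placeholder weather endpoint (all of these specifics are assumptions, not taken from the repo).

```python
# Illustrative sketch of a lean MCP tool server -- not the exact rag_server.py.
import os

import chromadb
import google.generativeai as genai
import httpx
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("rag-server")  # no LLM here: the server only exposes tools

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
chroma = chromadb.PersistentClient(path="chroma_db")
collection = chroma.get_or_create_collection("documents")  # collection name is an assumption


@mcp.tool()
def search_knowledge_base(query: str, n_results: int = 3) -> str:
    """Retrieve the most relevant document chunks from the ChromaDB resource."""
    q_emb = genai.embed_content(
        model="models/embedding-001", content=query, task_type="retrieval_query"
    )["embedding"]
    hits = collection.query(query_embeddings=[q_emb], n_results=n_results)
    return "\n\n".join(hits["documents"][0])


@mcp.tool()
def get_weather(city: str) -> str:
    """Fetch current weather for a city from an external API (endpoint is illustrative)."""
    resp = httpx.get(f"https://wttr.in/{city}", params={"format": "3"}, timeout=10)
    return resp.text


if __name__ == "__main__":
    mcp.run()  # stdio transport by default, so a client can spawn it as a subprocess
```

The architectural point is that the server only mediates access to resources; all reasoning stays with the orchestrator LLM in `client_ui.py`.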
Project Structure
rag-mcp-app/
- `data/`: Directory for your PDF documents to be indexed.
- `chroma_db/`: Directory where the ChromaDB vector store is persisted.
- `rag_server.py`: The MCP server that provides tools.
- `client_ui.py`: The client application with the orchestrator LLM and Gradio UI.
- `ingest.py`: Script to index PDF documents into the vector database.
- `.env`: Your local configuration file.
- `requirements.txt`: Project dependencies.
Getting Started
Prerequisites
- Python 3.13+: Ensure you have a compatible Python version installed.
- Ollama: Install Ollama from ollama.ai and ensure it's running.
- Ollama Model: Pull the model for the orchestrator LLM. The default is `qwen3:1.7b`:

  ```
  ollama pull qwen3:1.7b
  ```

- Google API Key: For document embeddings, set your `GOOGLE_API_KEY` in the `.env` file.
Installation
- Clone the repository and navigate into it.
- Create and Activate a Virtual Environment:

  ```
  python -m venv .venv
  # On Windows:
  .\.venv\Scripts\activate
  # On macOS/Linux:
  source .venv/bin/activate
  ```

- Install Dependencies:

  ```
  uv pip install -r requirements.txt
  ```

- Configure the Application:
  - Copy the example environment file:

    ```
    cp .env.example .env
    ```

  - Edit the `.env` file to set your `GOOGLE_API_KEY` and any other desired configurations (e.g., model, port); a minimal example is sketched after this list.
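Only `GOOGLE_API_KEY` is referenced in this README; the real variable names for the model and port come from `.env.example`, so the commented entries below are purely hypothetical placeholders.

```
# .env -- check .env.example for the actual variable names
GOOGLE_API_KEY=your-google-api-key-here

# Hypothetical examples of model/port settings (names are illustrative):
# ORCHESTRATOR_MODEL=qwen3:1.7b
# GRADIO_PORT=3000
```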
Data Preparation
- Populate the `data/` directory: Place your PDF documents into the `rag-mcp-app/data/` directory.
- Run the Ingestion Script: This must be run before starting the application for the first time, or whenever you update the documents (a sketch of the kind of pipeline it runs follows below).

  ```
  uv run python ingest.py
  ```
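The repository's `ingest.py` is the source of truth for how indexing works; the sketch below only illustrates the general shape of such a pipeline, assuming `pypdf` for text extraction, Google's `models/embedding-001` embedding model, and a ChromaDB collection named `documents` persisted under `chroma_db/` (the library choices, collection name, and page-level chunking are all assumptions).

```python
# Illustrative ingestion sketch -- see ingest.py for the project's actual logic.
import os
from pathlib import Path

import chromadb
import google.generativeai as genai
from pypdf import PdfReader

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

# The persisted ChromaDB store is the "resource" the MCP server later queries.
client = chromadb.PersistentClient(path="chroma_db")
collection = client.get_or_create_collection("documents")  # collection name is an assumption

for pdf_path in Path("data").glob("*.pdf"):
    for page_num, page in enumerate(PdfReader(str(pdf_path)).pages):
        text = (page.extract_text() or "").strip()
        if not text:
            continue
        # Embed each page as a single chunk with the Google embedding model.
        emb = genai.embed_content(
            model="models/embedding-001",
            content=text,
            task_type="retrieval_document",
        )["embedding"]
        collection.add(
            ids=[f"{pdf_path.stem}-{page_num}"],
            documents=[text],
            embeddings=[emb],
        )
```

Page-level chunks keep the sketch short; a real pipeline would usually split text into smaller, overlapping chunks before embedding.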
Running the Application
Activate your virtual environment. The client application starts the MCP server as a background process, so you only need to run one command:
```
cd rag-mcp-app; .\.venv\Scripts\activate; uv run python client_ui.py --mcp-server rag_server.py
```

or, with the virtual environment already activated:

```
uv run python client_ui.py --mcp-server rag_server.py
```
The client will start the server, connect to it, and launch the Gradio UI. Access it in your browser at the configured port (e.g., http://127.0.0.1:3000).
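How `client_ui.py` wires this up internally is defined in the repo; the snippet below is only a sketch of the underlying mechanism, assuming the official Python MCP SDK's stdio client (the query string is just an example).

```python
# Illustrative sketch of spawning the MCP server over stdio and calling a tool.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def main() -> None:
    # Launch rag_server.py as a subprocess speaking MCP over stdio.
    server = StdioServerParameters(command="python", args=["rag_server.py"])
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print([tool.name for tool in tools.tools])  # e.g. search_knowledge_base, get_weather
            result = await session.call_tool(
                "search_knowledge_base", {"query": "main topic of the documents"}
            )
            print(result.content)


asyncio.run(main())
```

In the actual application, the orchestrator LLM decides when to issue such `call_tool` requests and composes the final answer from the returned content.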
Example Usage
- Ask a question about your documents: "What is the main topic of the documents?"
- Ask about the weather: "What's the weather like in London?"
Project Status
The initial refactoring is complete. The architecture now correctly implements the agent-tool-resource model.
- Review Architecture, Code: The architecture has been reviewed and refactored for clarity and modularity.
- Remove RAG LLM: The redundant LLM has been removed from the server.
- Make Chroma Vector DB a resource: The vector DB is now treated as a resource, accessed via a dedicated tool.
- Make access to Vector DB a tool: The `search_knowledge_base` tool provides this functionality.
- Consider adding web search: The new architecture makes this easy. A new tool can be added to `rag_server.py` to enable web search capabilities (sketched below).
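For instance, a hypothetical `web_search` tool could be registered on the same `FastMCP` instance in `rag_server.py`; the DuckDuckGo Instant Answer endpoint below is only an illustration of the idea, not part of the project.

```python
# Hypothetical addition to rag_server.py: a simple web search tool.
import httpx


@mcp.tool()
def web_search(query: str) -> str:
    """Return a short answer or abstract for a query (DuckDuckGo Instant Answer API)."""
    resp = httpx.get(
        "https://api.duckduckgo.com/",
        params={"q": query, "format": "json", "no_html": 1},
        timeout=10,
    )
    data = resp.json()
    return data.get("AbstractText") or data.get("Answer") or "No instant answer found."
```

Since MCP clients discover tools at connection time, the orchestrator LLM would typically pick up such a new tool on the next run without client-side changes.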
