MCP Application - Context Memory Updater
This application showcases context memory using Model Context Protocol (MCP) tools in an LLM-based system. It allows different users to store, retrieve, and update their travel preferences and other general memories with an intelligent assistant through a chat interface.
On the analytics side, the application summarizes tool-usage statistics to support continuous improvement and cost control, answering questions such as: which tool-call sequences happen most frequently? Which tool incurs the most tokens?
Table of Contents
- System Design
- Setup, Development, and Usage
- Tests
- Deployment
- Future Work
- Final Thoughts
- Appendices: Prompt Testing
System Design

Key Design Rationale
- Containerized MCP Server
  - Chosen over serverless to maintain persistent DB connections, stable multi-turn tool state, and reproducible deployment for both MCP server and gateway.
  - Provides lightweight process and network isolation, allowing the MCP server to run with minimal privileges and internal-only access.
- Clear Server–Gateway Separation
  - MCP server handles tool execution and memory logic; the Python API Gateway manages browser-safe access, authentication, and LLM request shaping.
  - Prevents exposing secrets and keeps protocol logic clean.
- Database Design
  - SQLite for portability and local testing; modular data access enables drop-in replacement with PostgreSQL for production.
- Authentication & Isolation
  - Registration, token issuance, and strict per-user data isolation via user_id; the architecture leaves room for RBAC/IAM extensions.
- Unified Client Architecture
  - CLI and web clients share a single client_core, ensuring consistent MCP behavior without duplicating logic.
  - Web client only communicates through the gateway for security and simplicity.
  - Ease of Use: both clients expose a simple chat interface and user memory retrieval, making interaction symmetric across environments.
  - LLM Client Adapter: a pluggable LLM adapter abstracts API calls, allowing the system to switch between LLM providers without modifying client logic (see the sketch after this list).
- Tooling Structure
  - Memory, travel-preference, and external dummy tools model different MCP interaction patterns (CRUD state, retrieval, and simulated external actions), supporting robust LLM behavior testing.
- Testability & Analytics
  - Built-in tool-usage analytics for understanding LLM behavior and token cost.
  - Multi-layer testing (unit, LLM integration, Puppeteer E2E) ensures reliability and controlled prompt evaluation.
- Stack Choices
  - FastMCP for standardized MCP protocol and tool execution; LLM interoperability via Pydantic.
  - FastAPI for the simple, asynchronous-friendly API gateway.
  - Render for fast containerized-app hosting.
  - Pytest + Puppeteer for unit, integration, and end-to-end testing.
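To make the adapter idea concrete, below is a minimal sketch of a pluggable LLM adapter. The names (LLMAdapter, OpenAIAdapter, chat) are illustrative assumptions, not the actual client_core classes:
# Hypothetical sketch of the pluggable-adapter pattern; names are illustrative.
from abc import ABC, abstractmethod

class LLMAdapter(ABC):
    """Provider-agnostic interface that the shared client logic talks to."""
    @abstractmethod
    def chat(self, messages: list[dict], tools: list[dict]) -> dict:
        """Send one chat turn with MCP tool schemas; return the assistant message."""

class OpenAIAdapter(LLMAdapter):
    def __init__(self, model: str = "gpt-4o-mini"):
        from openai import OpenAI  # deferred import keeps other providers optional
        self.client = OpenAI()     # reads OPENAI_API_KEY from the environment
        self.model = model

    def chat(self, messages: list[dict], tools: list[dict]) -> dict:
        response = self.client.chat.completions.create(
            model=self.model, messages=messages, tools=tools
        )
        return response.choices[0].message.model_dump()

# Swapping providers means adding another LLMAdapter subclass;
# client logic keeps calling adapter.chat(...) unchanged.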
MCP Server
- Standardized, specific tools made available for the LLM to use on natural-language queries: CRUD operations for each user's general memory and travel preferences
- Unified data model and data storage for context memory
- Containerized: more persistent and better suited to LLM integration than a serverless deployment
Available Tools
The core engine that exposes standardized tools to the LLM.
- Context Tools: store_, retrieve_, update_, delete_ for both Travel Preferences and General Memory.
- External Service Simulation: dummy tools for lookup_flights, book_hotels, etc., to test complex tool-chaining capabilities.
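As a rough sketch of how such tools can be registered with FastMCP (the actual server.py signatures, validation, and user scoping will differ):
# Illustrative FastMCP registration; real tool bodies persist to the database.
from fastmcp import FastMCP

mcp = FastMCP("context-updater")

@mcp.tool()
def store_travel_preference(user_id: str, key: str, value: str) -> str:
    """Store one travel preference (e.g., key='seat', value='aisle') for a user."""
    # ... write to storage, scoped to user_id ...
    return f"Stored {key}={value} for user {user_id}"

@mcp.tool()
def lookup_flights(origin: str, destination: str, date: str) -> list[dict]:
    """Dummy external-service tool that returns canned flight results."""
    return [{"flight": "XX123", "origin": origin,
             "destination": destination, "date": date}]

if __name__ == "__main__":
    mcp.run()  # stdio by default; an HTTP transport is also available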
MCP Client
- Chat-based user interface, self-identification, and per-user context information
- Lightweight server analytics: how has the MCP server been utilized by clients? Macro statistics of tool usage (see the sketch below)
- Two versions of the client, sharing the same core functionality
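As a sketch of the kind of macro statistic involved, assuming a hypothetical tool_calls(tool_name, tokens_used) log table in SQLite (the real schema may differ):
# Sketch of macro tool-usage statistics over an assumed tool_calls table.
import sqlite3

def tool_usage_stats(db_path: str = "database/context.db") -> list[tuple]:
    conn = sqlite3.connect(db_path)
    try:
        # Calls and total tokens per tool, most expensive tools first.
        return conn.execute(
            "SELECT tool_name, COUNT(*) AS calls, SUM(tokens_used) AS total_tokens "
            "FROM tool_calls GROUP BY tool_name ORDER BY total_tokens DESC"
        ).fetchall()
    finally:
        conn.close()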

Command-Line Client
- Native Python CLI application with an interactive chat loop
- No deployment; connects only to the local MCP server and the LLM API

Web Client
- Lightweight frontend that interacts with the single gateway, holding no secret keys or user information
- Single page application using vanilla JS
- Python API gateway for MCP & LLM API
- Containerized and deployable
- Alternative considered: a heavier frontend JS framework as a one-stop solution for connecting to the LLM and MCP, which would also require its own deployment process and secret management; hence the fallback to the simple client-gateway solution
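A minimal sketch of the gateway pattern with FastAPI; the /chat route, token check, and helper below are illustrative stand-ins, not the actual web_gateway.py code:
# Illustrative FastAPI gateway: the browser sends only a message and a token;
# API keys and the MCP server URL stay behind the gateway.
from fastapi import FastAPI, Header, HTTPException
from pydantic import BaseModel

app = FastAPI()

class ChatRequest(BaseModel):
    message: str

def verify_token(token: str) -> str | None:
    """Stub: map a bearer token to a user_id; the real gateway checks the auth DB."""
    return "demo-user" if token == "Bearer demo" else None

async def run_llm_with_mcp_tools(user_id: str, message: str) -> str:
    """Stub: the real gateway shapes the LLM request and forwards tool calls to MCP."""
    return f"(echo for {user_id}) {message}"

@app.post("/chat")
async def chat(req: ChatRequest, authorization: str = Header(...)):
    user_id = verify_token(authorization)
    if user_id is None:
        raise HTTPException(status_code=401, detail="invalid token")
    return {"reply": await run_llm_with_mcp_tools(user_id, req.message)}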

Setup, Development, and Usage
Update the local environment secret key. Never hard-code one.
cp .env.example .env
vi .env # update OPENAI_API_KEY
Local dependencies. Tested on Python 3.13. Using pyenv to manage the local Python version is recommended.
# option 1: installing in virtual environment (Pipenv--https://pipenv.pypa.io)
pipenv install
pipenv shell
# option 2: installing from frozen requirements.txt
pip install -r requirements.txt
MCP Server & Clients
The MCP server and the available clients can be run in several ways.
Option 1) Starting the server without a container.
# start server
python src/context-updater/server.py
Option 2) Starting the servers in individual containers (mcp-server & web-client) with host data sync.
# build images
docker build --target mcp-server -t context-mcp-server:latest .
docker build --target web-client -t context-web-client:latest .
# start mcp-server container
docker run -d --rm -p 0.0.0.0:8000:8000 \
-v $(pwd)/database:/app/database \
context-mcp-server:latest
# start web-client container
docker run -d --rm -p 0.0.0.0:8001:8001 \
--env-file .env \
-e MCP_SERVER_URL=http://host.docker.internal:8000/mcp \
-e PYTHONUNBUFFERED=1 \
context-web-client:latest
Option 3) Building and starting all containers (mcp-server & web client) with docker-compose.
docker-compose build
docker-compose up -d
Non-Containerized Clients
Starting the non-containerized clients; the CLI client is only available for local use.
# register new user
python src/context-updater/cli_client.py --register
# login or enter chat using stored token
python src/context-updater/cli_client.py
# start web gateway & client
python src/context-updater/web-client/web_gateway.py
# web client runs on http://127.0.0.1:8001
Either option should yield the following endpoint URLs:
- MCP - http://127.0.0.1:8000
- anonymous memory overview - http://127.0.0.1:8000/memory_overview
- Web-Client - http://127.0.0.1:8001
Tests
Unit testing whether the MCP tools work correctly without the LLM.
# Install dev dependencies
pipenv install --dev
# MCP functional test
pytest test/test_server.py test/test_auth_db.py -v -s
Integration testing whether the LLM understands the data from, and can correctly interact with, the MCP tools.
# make sure to start server with test flag - do not contaminate real database
IS_MCP_CONTEXT_UPDATER_TEST=true python src/context-updater/server.py
# LLM hallucination test ~$0.20 per run - can still be flaky :/
RUN_LLM_TESTS=true pytest test/test_llm.py -v -s
# Single test case
RUN_LLM_TESTS=true pytest test/test_llm.py::TestLLMRealHallucination::test_llm_empty_database_no_hallucination -v -s
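For a flavor of what such a test asserts, here is a hypothetical sketch; the real test/test_llm.py fixtures differ, and run_chat_turn is an assumed helper:
# Hypothetical anti-hallucination check; helper and phrasing are assumptions.
import os
import pytest

@pytest.mark.skipif(os.getenv("RUN_LLM_TESTS") != "true", reason="LLM tests cost real tokens")
def test_empty_database_no_hallucination():
    from client_core import run_chat_turn  # assumed helper in the shared client core
    reply = run_chat_turn("What are my travel preferences?")
    # Against an empty test database, the assistant must admit it has nothing,
    # not invent preferences. String checks like this are why the test can be flaky.
    assert any(p in reply.lower() for p in ["no travel preferences", "don't have", "empty"])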
Testing an End-to-End Scenario
Run Puppeteer to test a chat scenario visually.
# check prerequisites: Node.js ≥ 18 & npm
node -v
npm -v
# install Puppeteer with its own bundled Chromium
npm install puppeteer
Start the testing scenario.
# use local env to run e2e test
pipenv shell
# spin up web-client & mcp-server -> run puppeteer
bash run-puppeteer.sh tony
Deployment
Deploying on Render (see the actions script).
- MCP - https://context-mcp-server.onrender.com/
- anonymous memory overview - https://context-mcp-server.onrender.com/memory_overview
- Web-Client - https://context-web-client.onrender.com/
Future Work
There are several pending TODO tasks.
- Scalable database: currently, I use SQLite to store the MCP data, which is definitely not ideal. I can replace the database functions with a more robust solution, e.g., PostgreSQL or a managed database service, by modifying only server_database.py (a sketch follows this list).
- Better user authorization: the current simple token-based authentication may neither scale nor support a role-based system. To meet these requirements, I can implement an IAM system that follows least-privilege principles, so each user or client only has access to the tools and actions they actually need.
- Model performance testing: I use gpt-4o-mini for its cost effectiveness. Whether switching to more advanced models is worth the cost for this type of task should be tested; I can create a semantically difficult dataset and questions to test this.
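As a sketch of why the swap stays contained in server_database.py, a single connection factory can branch on configuration. The DATABASE_URL switch and the memories table are assumptions for illustration:
# Illustrative connection factory: swapping SQLite for PostgreSQL touches one file.
import os
import sqlite3

def get_connection():
    if os.getenv("DATABASE_URL"):      # assumed production switch
        import psycopg2                # PostgreSQL driver
        return psycopg2.connect(os.environ["DATABASE_URL"])
    return sqlite3.connect("database/context.db")

def get_memories(user_id: str) -> list[tuple]:
    conn = get_connection()
    try:
        cur = conn.cursor()
        # Placeholder syntax differs across drivers ('?' vs '%s');
        # a real swap would abstract this as well.
        cur.execute("SELECT key, value FROM memories WHERE user_id = ?", (user_id,))
        return cur.fetchall()
    finally:
        conn.close()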
Final Thoughts
I chose the MCP option out of curiosity: I've recently implemented an LLM document assistant for my old blog as a toy project, which covered most of the required functionality for the RAG option, and partially for the data-driven assistant option. I felt it was a good opportunity to try building an MCP server from scratch because I've only worked with MCP tools implemented by other people. However, it was quite an open-ended requirement, leaving room for interpretation. I hope I haven't misunderstood it too badly :)
The most challenging problem I encountered was the rapidly changing API specifications of all the essential libraries and tools. This makes AI coding assistants very unreliable, and the documentation does not provide good support either. Trial and error, with very specific prompts about small, concise issues, is the best workaround in this case. Holistic prompting never works.
Appendices - Prompt Testing
The prompt plays an important role in an MCP application. To ensure prompt efficiency, I ran tests on different prompt versions, each focusing on a different aspect: minimalist, exploratory, analytical, risk-aware, and comprehensive. I created mock-up conversation scenarios, collected tool-usage statistics and response success rates, and chose the best prompt as the initial version for the application.

I combined variant 1 (comprehensive) and variant 2 (analytical) because they showed a well-balanced tool-call distribution, reasonable expected responses, and acceptable output tokens, which reflect the LLM cost.
pipenv install --dev
python test/prompt_test/test_prompts.py
python test/prompt_test/plot_result.py
