PM Data Pipeline System
A complete system for fetching, storing, and querying Performance Monitoring (PM) XML data from remote SFTP servers.
Architecture
SFTP Server → Job Server (scheduled fetcher) → PostgreSQL → MCP Server → REST API → Streamlit Chatbot
Components
- Job Server: Scheduled SFTP file fetcher and XML parser
- PostgreSQL Database: Stores all PM counter data
- MCP Server: Query engine with natural language processing
- REST API: FastAPI server exposing query endpoints
- Streamlit Frontend: Chatbot interface for querying data
Quick Start
Prerequisites
- Docker and Docker Compose
Setup
- Clone the repository
- Copy env.example to .env:
    cp env.example .env
- The .env file is already configured for the sample SFTP server included in docker-compose:
    SFTP_HOST=sftp_server
    SFTP_USERNAME=sftpuser
    SFTP_PASSWORD=password
    SFTP_PORT=22
    SFTP_REMOTE_PATH=/home/sftpuser/uploads
- Start all services (including the sample SFTP server):
    docker-compose up -d
- Upload sample XML files to the SFTP server:
    ./scripts/upload_sample_files.sh
- Access the frontend at http://localhost:8501
Using Your Own SFTP Server
If you have your own SFTP server, update the .env file with your credentials:
SFTP_HOST=your-sftp-host.com
SFTP_USERNAME=your-username
SFTP_PASSWORD=your-password
SFTP_PORT=22
SFTP_REMOTE_PATH=/path/to/xml/files/
Then start only the application services (excluding the sample SFTP server):
docker-compose up -d postgres job_server api_server frontend
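To illustrate how the SFTP_* settings above might be consumed, here is a minimal sketch that reads them into a typed object. The `SftpConfig` class and `load_sftp_config` function are hypothetical names, not taken from this repository, and the fallback to port 22 when SFTP_PORT is unset is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class SftpConfig:
    """Hypothetical container for the SFTP_* settings from .env."""
    host: str
    username: str
    password: str
    port: int
    remote_path: str


def load_sftp_config(env: Mapping[str, str] = os.environ) -> SftpConfig:
    """Build an SftpConfig from SFTP_* variables; raise if a required one is missing."""
    required = ["SFTP_HOST", "SFTP_USERNAME", "SFTP_PASSWORD", "SFTP_REMOTE_PATH"]
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError("missing environment variables: " + ", ".join(missing))
    return SftpConfig(
        host=env["SFTP_HOST"],
        username=env["SFTP_USERNAME"],
        password=env["SFTP_PASSWORD"],
        port=int(env.get("SFTP_PORT", "22")),  # assumption: default to 22 if unset
        remote_path=env["SFTP_REMOTE_PATH"],
    )
```

Failing fast on a missing variable surfaces a misconfigured .env at startup rather than at the first scheduled fetch.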
Usage
Streamlit Chatbot
Open the Streamlit frontend and ask questions in natural language:
- "What is the ifUtilizationIn value on 2024-01-16 at 2:10 pm?"
- "Show me all interfaces"
- "What counters are available?"
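As a rough illustration of what answering the first question involves, the sketch below pulls a counter name and a timestamp out of the text with regular expressions. This is not the MCP Server's actual query engine, whose internals are not described here; `parse_question` and both patterns are hypothetical and only handle phrasing like the first example.

```python
import re
from datetime import datetime
from typing import Optional, Tuple

# Matches e.g. "2024-01-16 at 2:10 pm" (date, clock time, am/pm marker).
_TIMESTAMP = re.compile(
    r"(\d{4}-\d{2}-\d{2})\s+at\s+(\d{1,2}:\d{2})\s*([ap]m)", re.IGNORECASE
)
# Matches e.g. "the ifUtilizationIn value" and captures the counter name.
_COUNTER = re.compile(r"\bthe\s+(\w+)\s+value\b", re.IGNORECASE)


def parse_question(question: str) -> Tuple[Optional[str], Optional[datetime]]:
    """Return (counter name, timestamp); either is None if not found."""
    counter_match = _COUNTER.search(question)
    time_match = _TIMESTAMP.search(question)
    counter = counter_match.group(1) if counter_match else None
    when = None
    if time_match:
        date_part, clock, meridiem = time_match.groups()
        # %I is the 12-hour clock and %p the am/pm marker.
        when = datetime.strptime(
            f"{date_part} {clock} {meridiem}", "%Y-%m-%d %I:%M %p"
        )
    return counter, when
```

A question with neither pattern, such as "Show me all interfaces", yields `(None, None)` and would need a different handling path.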
API Endpoints
- GET /api/query?q=<natural language query> - Query data
- GET /api/interfaces - List interfaces
- GET /api/counters - List counters
- GET /api/alerts - Get alerts
- GET /api/config/fetch-interval - Get fetch interval
- POST /api/config/fetch-interval - Update fetch interval
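The query endpoint takes free text in the q parameter, so the question must be URL-encoded. The sketch below builds such a URL and fetches it with the standard library only. The base URL `http://localhost:8000` is an assumption (adjust it to wherever the API server is exposed), the helper names are hypothetical, and the response is assumed to be JSON.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_BASE = "http://localhost:8000"  # assumption: not stated in this README


def query_url(question: str, base: str = API_BASE) -> str:
    """Return the GET /api/query URL with the question encoded into q."""
    return f"{base}/api/query?{urlencode({'q': question})}"


def fetch_json(url: str, timeout: float = 10.0):
    """GET a URL and decode the body as JSON (assumes a JSON response)."""
    with urlopen(Request(url, method="GET"), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the services running, `fetch_json(query_url("What counters are available?"))` would send the question, and `fetch_json(API_BASE + "/api/interfaces")` would list interfaces.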
Configuration
The fetch interval can be changed in two ways:
- Via the Streamlit UI sidebar
- Via the REST API (POST /api/config/fetch-interval)
The job server picks up the new interval automatically.
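One way a scheduler can pick up a changed interval without a restart is to re-read the setting on every cycle instead of caching it at startup. The sketch below shows that pattern only. How the job server actually stores and reads the interval is not described here, so `run_scheduler`, `fetch_once`, and `get_interval_seconds` are hypothetical, as is the choice of seconds as the unit.

```python
import time
from typing import Callable, Optional


def run_scheduler(
    fetch_once: Callable[[], None],
    get_interval_seconds: Callable[[], float],
    sleep: Callable[[float], None] = time.sleep,
    max_cycles: Optional[int] = None,
) -> None:
    """Run fetch_once repeatedly, re-reading the interval before each wait."""
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        fetch_once()
        # Look the interval up again each cycle so an update made through
        # the UI or the API applies to the very next wait.
        sleep(get_interval_seconds())
        cycles += 1
```

Passing `sleep` and `max_cycles` as parameters keeps the loop testable without real waiting; in production both would keep their defaults.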
Database Schema
The system stores:
- File metadata and checksums
- Network element information
- Measurement intervals
- Interface counters
- IP/TCP/System counters
- BGP peer data
- Threshold alerts
- Data quality indicators
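Storing a checksum alongside file metadata is commonly used to avoid ingesting the same file twice. The sketch below shows that idea under stated assumptions: SHA-256 is an assumed algorithm (the one used by this system is not specified), and `file_checksum` and `should_process` are hypothetical helpers with an in-memory set standing in for the database table.

```python
import hashlib
from typing import Set


def file_checksum(content: bytes) -> str:
    """Return the hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(content).hexdigest()


def should_process(content: bytes, seen_checksums: Set[str]) -> bool:
    """Return True and record the checksum if this content is new."""
    digest = file_checksum(content)
    if digest in seen_checksums:
        return False  # identical content was already ingested
    seen_checksums.add(digest)
    return True
```

Keying on content rather than file name means a re-uploaded or renamed copy of the same XML file is skipped.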
Development
Running Individual Services
# Job Server
cd job_server
python main.py
# API Server
cd api_server
uvicorn main:app --reload
# Frontend
cd frontend
streamlit run app.py
Environment Variables
See env.example for all available configuration options.
Testing with Sample SFTP Server
The docker-compose file includes a sample SFTP server for testing. It uses the atmoz/sftp image and is configured with:
- Username: sftpuser
- Password: password
- Port: 2222 (mapped to container port 22)
- Upload directory: /home/sftpuser/uploads
Manual SFTP Connection
You can connect to the sample SFTP server manually:
# From your host machine
sftp -P 2222 sftpuser@localhost
# Or from inside the SFTP container
docker exec -it pm_sftp_server sftp sftpuser@localhost
Uploading Files
- Using the upload script (recommended):
    ./scripts/upload_sample_files.sh
- Manual upload via Docker:
    docker cp example_1.xml pm_sftp_server:/home/sftpuser/uploads/
    docker cp example_2.xml pm_sftp_server:/home/sftpuser/uploads/
- Manual upload via SFTP client:
    sftp -P 2222 sftpuser@localhost
    # Then in SFTP prompt:
    put example_1.xml /home/sftpuser/uploads/
    put example_2.xml /home/sftpuser/uploads/
License
MIT
