MCP RSS

MCP RSS is a Model Context Protocol (MCP) server for RSS feed management. It provides keyword search with date, category, and status filters, optional semantic search over AI embeddings, and a four-status reading workflow.

Features

📰 RSS Feed Management - Parse OPML files and automatically fetch articles from RSS feeds
🔍 Advanced Search - Keyword search with date range, category, and status filtering
🤖 Semantic Search - AI-powered natural language search using OpenAI embeddings (optional)
📊 Smart Organization - Four-status workflow (unread/read/favorite/archived)
📅 Daily Digest - Get today's unread articles grouped by category
🚀 High Performance - PostgreSQL with pgvector for efficient vector similarity search
🔄 Auto-Deduplication - Prevents duplicate articles and wasted API calls
⚡ Token-Efficient - Browse titles/excerpts first, fetch full content only when needed
📑 Pagination Support - Handle large feed collections (500+) with efficient pagination

Installation

Prerequisites

Node.js (v18 or higher)
Docker & Docker Compose (for PostgreSQL with pgvector)
OpenAI API Key (optional, only for semantic search)

Quick Start with Docker Compose

Clone or install the package:

```
npm install -g mcp_rss
# OR for local development
git clone <repository-url>
cd mcp_rss
npm install
```

Start PostgreSQL with pgvector:
```
docker-compose up -d
```

Configure environment variables:

```
cp .env.example .env
# Edit .env with your settings
```

Build the project:
```
npm run build
```

Database Setup

The project uses PostgreSQL 17 with the pgvector extension for vector similarity search.

Using Docker Compose (Recommended):

docker-compose up -d             # Start PostgreSQL
docker-compose down              # Stop PostgreSQL
docker-compose down -v           # Stop and remove volumes (fresh start)
docker-compose logs -f postgres  # View PostgreSQL logs

Manual PostgreSQL Setup:

docker run -d \
  --name mcp-rss-postgres \
  -p 5433:5432 \
  -e POSTGRES_USER=mcp_user \
  -e POSTGRES_PASSWORD=123456 \
  -e POSTGRES_DB=mcp_rss \
  pgvector/pgvector:pg17

Configuration

Environment Variables

Create a .env file with the following configuration:

Variable	Description	Default	Required
Database Configuration
`DB_HOST`	PostgreSQL host	`localhost`	No
`DB_PORT`	PostgreSQL port	`5433`	No
`DB_USER` / `DB_USERNAME`	Database username	`mcp_user`	No
`DB_PASSWORD`	Database password	`123456`	No
`DB_NAME` / `DB_DATABASE`	Database name	`mcp_rss`	No
RSS Configuration
`OPML_FILE_PATH`	Path to OPML file with RSS feeds	`./feeds.opml`	Yes
`RSS_UPDATE_INTERVAL`	Feed update interval (minutes)	`1`	No
OpenAI Configuration
`OPENAI_API_KEY`	OpenAI API key for embeddings	-	No*

* Only required for semantic search feature. All other features work without it.
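
The defaults and aliases in the table can be resolved in one place. The sketch below is one possible way to do that, not the project's actual loader: `loadConfig`, `pick`, and the `RssConfig` shape are hypothetical names, and the choice that `DB_USER` wins over `DB_USERNAME` when both are set is an assumption.

```typescript
// Hypothetical config shape; field names are illustrative, not taken from the project.
interface RssConfig {
  dbHost: string;
  dbPort: number;
  dbUser: string;
  dbPassword: string;
  dbName: string;
  opmlFilePath: string;
  updateIntervalMinutes: number;
  openaiApiKey: string | undefined; // undefined means semantic search is unavailable
}

type Env = Record<string, string | undefined>;

// Return the first non-empty value among the given variable names, else the fallback.
function pick(env: Env, names: string[], fallback: string): string {
  for (const name of names) {
    const value = env[name];
    if (value !== undefined && value !== "") return value;
  }
  return fallback;
}

function loadConfig(env: Env): RssConfig {
  const dbPort = Number(pick(env, ["DB_PORT"], "5433"));
  const interval = Number(pick(env, ["RSS_UPDATE_INTERVAL"], "1"));
  if (!Number.isInteger(dbPort) || dbPort <= 0) {
    throw new Error("DB_PORT must be a positive integer");
  }
  if (!Number.isFinite(interval) || interval <= 0) {
    throw new Error("RSS_UPDATE_INTERVAL must be a positive number of minutes");
  }
  return {
    dbHost: pick(env, ["DB_HOST"], "localhost"),
    dbPort,
    // Assumption: the first alias listed takes precedence when both are set.
    dbUser: pick(env, ["DB_USER", "DB_USERNAME"], "mcp_user"),
    dbPassword: pick(env, ["DB_PASSWORD"], "123456"),
    dbName: pick(env, ["DB_NAME", "DB_DATABASE"], "mcp_rss"),
    opmlFilePath: pick(env, ["OPML_FILE_PATH"], "./feeds.opml"),
    updateIntervalMinutes: interval,
    openaiApiKey: env["OPENAI_API_KEY"] || undefined,
  };
}
```

In a Node.js process this would be called as `loadConfig(process.env)`; passing the environment in as an argument keeps the function easy to test.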

Claude Desktop Configuration

For local development, use the built dist folder:

{
  "mcpServers": {
    "rss": {
      "command": "node",
      "args": ["/absolute/path/to/mcp_rss/dist/index.js"],
      "env": {
        "OPML_FILE_PATH": "/path/to/your/feeds.opml",
        "DB_HOST": "localhost",
        "DB_PORT": "5433",
        "DB_USER": "mcp_user",
        "DB_PASSWORD": "123456",
        "DB_NAME": "mcp_rss",
        "RSS_UPDATE_INTERVAL": "60",
        "OPENAI_API_KEY": "sk-your-key-here"
      }
    }
  }
}

For global installation via npm:

{
  "mcpServers": {
    "rss": {
      "command": "npx",
      "args": ["mcp_rss"],
      "env": {
        "OPML_FILE_PATH": "/path/to/your/feeds.opml",
        "OPENAI_API_KEY": "sk-your-key-here"
      }
    }
  }
}

MCP Tools Reference

The server exposes 8 tools for RSS feed management. They are described one by one after the token efficiency guide below.

Token Efficiency Guide

List and search tools return compact results by default to keep token usage low:

Default behavior: Returns ONLY titles and metadata (no excerpts, no content)
Optional excerpts: Set includeExcerpt: true for content previews (moderate token usage)
Full content: Set includeContent: true for complete article text (high token usage)
On-demand content: Use get_article_full to fetch specific articles by ID (most efficient)

Recommended workflow (90%+ token savings):

Browse titles only with get_content or search_articles (default settings)
Identify interesting articles from titles alone
Optionally fetch excerpts for borderline cases with includeExcerpt: true
Fetch full content with get_article_full for selected articles only

Token Usage Comparison:

Titles only: ~50-100 tokens per article
Titles + excerpts: ~150-300 tokens per article
Titles + full content: ~1,000-5,000 tokens per article
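
The savings figure follows from simple arithmetic on the ranges above. The sketch below works through one case using midpoints; the specific numbers (75 and 3,000 tokens per article, 20 articles browsed, 1 read in full) are illustrative assumptions, and `estimateTokens` is a hypothetical helper, not part of the server.

```typescript
// Midpoints of the per-article ranges listed above (illustrative, not measured).
const TOKENS_TITLE_ONLY = 75; // midpoint of ~50-100
const TOKENS_FULL_CONTENT = 3000; // midpoint of ~1,000-5,000

// Cost of browsing `browsed` titles, then fetching `read` articles in full.
function estimateTokens(browsed: number, read: number): number {
  return browsed * TOKENS_TITLE_ONLY + read * TOKENS_FULL_CONTENT;
}

// Browse 20 titles, read 1 article: 20 * 75 + 1 * 3000 = 4500 tokens.
const browseThenRead = estimateTokens(20, 1);

// Fetch all 20 articles with full content: 20 * 3000 = 60000 tokens.
const everythingInFull = 20 * TOKENS_FULL_CONTENT;

// Fraction saved: 1 - 4500 / 60000, roughly 0.925 (92.5%).
const savings = 1 - browseThenRead / everythingInFull;
```

The saving shrinks as more of the browsed articles are read in full: with 2 of 20 read, the same arithmetic gives 7,500 tokens, or 87.5% saved.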

1. get_content

Get articles with basic filtering and pagination. Returns latest articles first (sorted by pubDate DESC).

Use this for:

Browsing recent articles
Checking unread articles
Simple filtering by status or source
Date range filtering for specific time periods

Token Efficiency:

By default, returns ONLY titles and metadata (most token-efficient)
Set includeExcerpt: true to add content previews
Set includeContent: true to get full article text
For best efficiency: browse titles only, then use get_article_full for specific articles

Parameters:

Parameter	Type	Description	Default
`statuses`	`string[]`	Filter by statuses: `"unread"`, `"read"`, `"favorite"`, `"archived"`	All statuses
`source`	`string`	Filter by feed source title	All sources
`limit`	`number`	Number of articles to return	`10`
`offset`	`number`	Offset for pagination	`0`
`favoriteBlogsOnly`	`boolean`	Only show articles from favorite blogs	`false`
`prioritizeFavoriteBlogs`	`boolean`	Show favorite blog articles first	`false`
`includeContent`	`boolean`	Include full article content (uses more tokens)	`false`
`includeExcerpt`	`boolean`	Include article excerpt/preview	`false`
`startDate`	`string`	Start date (ISO: YYYY-MM-DD or YYYY-MM-DDTHH:mm:ssZ)	-
`endDate`	`string`	End date (ISO format)	-

Example (titles only - most efficient):

{
  "statuses": ["unread"],
  "limit": 20
}

Example (with date range and excerpts):

{
  "startDate": "2025-10-01",
  "endDate": "2025-10-25",
  "includeExcerpt": true,
  "limit": 15
}

Example (favorite blogs with full content):

{
  "favoriteBlogsOnly": true,
  "limit": 5,
  "includeContent": true
}

Response (default - titles only, no excerpt/content):

{
  "articles": [
    {
      "id": 123,
      "title": "Article Title",
      "link": "https://example.com/article",
      "pubDate": "2024-01-15T10:30:00Z",
      "fetchDate": "2024-01-15T11:00:00Z",
      "status": "unread",
      "feedTitle": "Engineering Blog",
      "feedCategory": "Technology"
    }
  ],
  "total": 150,
  "success": true
}

2. search_articles

Advanced search with keyword matching, date ranges, categories, and status filters. Searches both title and content.

Use this for:

Finding articles on specific topics
Date-based filtering
Complex multi-criteria searches
Category-specific searches

Parameters:

Parameter	Type	Description	Default
`keyword`	`string`	Search term (case-insensitive, searches title + content)	-
`category`	`string`	Filter by feed category	-
`statuses`	`string[]`	Filter by article statuses	All
`startDate`	`string`	Start date (ISO format: `YYYY-MM-DD` or `YYYY-MM-DDTHH:mm:ssZ`)	-
`endDate`	`string`	End date (ISO format)	-
`limit`	`number`	Number of results	`20`
`offset`	`number`	Offset for pagination	`0`
`includeContent`	`boolean`	Include full article content (uses more tokens)	`false`

Example:

{
  "keyword": "kubernetes",
  "category": "Engineering",
  "startDate": "2024-01-01",
  "endDate": "2024-12-31",
  "statuses": ["unread"],
  "limit": 10
}
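
To make the filter semantics concrete, here is a minimal in-memory sketch of how the criteria in the example could combine. It assumes that all supplied filters must match (logical AND) and that date bounds are inclusive; the server performs the search in PostgreSQL, so `matches` and the `Article` and `SearchParams` types are hypothetical stand-ins rather than the actual implementation.

```typescript
type ArticleStatus = "unread" | "read" | "favorite" | "archived";

interface Article {
  id: number;
  title: string;
  content: string;
  pubDate: string; // ISO 8601 timestamp
  status: ArticleStatus;
  feedCategory: string;
}

interface SearchParams {
  keyword?: string;
  category?: string;
  statuses?: ArticleStatus[];
  startDate?: string;
  endDate?: string;
}

// Returns true only if the article satisfies every filter that was supplied.
function matches(article: Article, p: SearchParams): boolean {
  if (p.keyword) {
    // Case-insensitive match against title and content together.
    const needle = p.keyword.toLowerCase();
    const haystack = `${article.title}\n${article.content}`.toLowerCase();
    if (!haystack.includes(needle)) return false;
  }
  if (p.category && article.feedCategory !== p.category) return false;
  if (p.statuses && p.statuses.length > 0 && !p.statuses.includes(article.status)) {
    return false;
  }
  const published = Date.parse(article.pubDate);
  if (p.startDate && published < Date.parse(p.startDate)) return false;
  if (p.endDate && published > Date.parse(p.endDate)) return false;
  return true;
}
```

Note that a date-only value such as `2024-12-31` parses as midnight UTC, so in this sketch an article published later that day falls outside the range; pass a full timestamp as `endDate` if the whole day should be included.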

3. semantic_search

AI-powered semantic search using OpenAI embeddings. Finds conceptually similar articles even without exact keyword matches.

Use this for:

Natural language queries
Finding related concepts
Research and discovery
Topic exploration

Requirements:

OPENAI_API_KEY must be set
Only works for articles from 2020 onwards
Automatically disabled if API key is missing (fails gracefully)

Parameters:

Parameter	Type	Description	Default
`query`	`string`	Natural language search query (required)	-
`includeContent`	`boolean`	Include full article content (uses more tokens)	`false`
`limit`	`number`	Number of results	`10`
`statuses`	`string[]`	Filter by article statuses	All
`category`	`string`	Filter by feed category	-

Example:

{
  "query": "how to optimize database performance and reduce query latency",
  "limit": 5,
  "statuses": ["unread"]
}

How it works:

Converts your query into a 1536-dimensional vector using OpenAI
Compares against article embeddings using pgvector cosine similarity
Returns semantically similar articles ranked by relevance
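
The ranking step can be illustrated outside the database. pgvector's `<=>` operator returns cosine distance, which is 1 minus cosine similarity, so smaller values mean closer matches. The sketch below reproduces that ranking in memory with toy 3-dimensional vectors in place of 1536-dimensional embeddings; `cosineDistance` and `rankBySimilarity` are hypothetical helpers for illustration, since the server delegates this work to PostgreSQL.

```typescript
// Cosine distance, matching the definition used by pgvector's <=> operator:
// 1 - (a . b) / (|a| * |b|). Ranges from 0 (same direction) to 2 (opposite).
function cosineDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

interface EmbeddedArticle {
  id: number;
  embedding: number[];
}

// Return the ids of the `limit` articles closest to the query, nearest first.
function rankBySimilarity(
  query: number[],
  articles: EmbeddedArticle[],
  limit: number
): number[] {
  return articles
    .map((a) => ({ id: a.id, distance: cosineDistance(query, a.embedding) }))
    .sort((x, y) => x.distance - y.distance)
    .slice(0, limit)
    .map((r) => r.id);
}
```

A zero vector has no direction, so the distance is undefined (NaN) for it; real embeddings are non-zero, which is why the sketch does not special-case it.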

4. get_daily_digest

Get today's unread articles grouped by category. Perfect for daily reading workflows. Filters by publication date (pubDate), not fetch date.

Use this for:

Morning briefings
Daily catch-up
Category-organized reading
Articles published today (based on pubDate)

Parameters:

Parameter	Type	Description	Default
`limit`	`number`	Max articles per category	`5`
`includeContent`	`boolean`	Include full article content (uses more tokens)	`false`

Example:

{
  "limit": 5
}

Response: Unread articles published today (by pubDate), grouped by category, with up to limit articles per category (default 5).
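
The grouping behaviour can be sketched in a few lines. This is an illustration under stated assumptions, not the server's code: `buildDailyDigest` and `DigestArticle` are hypothetical names, "today" is taken as the current UTC calendar day (the server may use local time), every pubDate is assumed to be a valid ISO timestamp, and articles keep their input order within each category.

```typescript
interface DigestArticle {
  id: number;
  title: string;
  pubDate: string; // ISO 8601 timestamp
  status: string;
  feedCategory: string;
}

// Group today's unread articles by category, keeping at most `limit` per category.
function buildDailyDigest(
  articles: DigestArticle[],
  now: Date,
  limit = 5
): Map<string, DigestArticle[]> {
  const today = now.toISOString().slice(0, 10); // YYYY-MM-DD in UTC (assumption)
  const digest = new Map<string, DigestArticle[]>();
  for (const article of articles) {
    if (article.status !== "unread") continue;
    // Compare on publication date, not fetch date.
    const published = new Date(article.pubDate).toISOString().slice(0, 10);
    if (published !== today) continue;
    const bucket = digest.get(article.feedCategory) ?? [];
    if (bucket.length < limit) bucket.push(article);
    digest.set(article.feedCategory, bucket);
  }
  return digest;
}
```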

5. get_weekly_favorites

NEW: Get favorite articles from the last 7 days (titles only). Perfect for weekly review of bookmarked content.

Use this for:

Weekly reading lists
Reviewing saved articles from the past week
Tracking important bookmarked content
Quick overview of what you found valuable recently

Parameters: None

Example: No parameters needed - simply call the tool.

Response:

{
  "articles": [
    {
      "id": 789,
      "title": "Optimizing PostgreSQL for High Write Throughput",
      "link": "https://engineering.example.com/postgres-optimization",
      "pubDate": "2025-10-22T14:30:00Z",
      "fetchDate": "2025-10-22T15:00:00Z",
      "status": "favorite",
      "feedTitle": "Engineering at Example",
      "feedCategory": "Database"
    },
    {
      "id": 654,
      "title": "Building Resilient Microservices with Circuit Breakers",
      "link": "https://blog.example.com/circuit-breakers",
      "pubDate": "2025-10-20T09:15:00Z",
      "fetchDate": "2025-10-20T10:00:00Z",
      "status": "favorite",
      "feedTitle": "Tech Blog",
      "feedCategory": "Architecture"
    }
  ],
  "total": 2,
  "success": true
}

Features:

Returns articles marked as "favorite" published in last 7 days
Sorted by publication date (newest first)
Ultra token-efficient - titles and metadata only
No excerpts or content are returned (the tool takes no parameters)
Use get_article_full to read full content of any article

6. get_article_full

Get full article content by ID. Use this for token-efficient reading: browse titles first, then fetch complete content only for articles you want to read.

Use this for:

Reading full articles after browsing titles
Getting complete content for specific interesting articles
Token-efficient workflow (browse → select → read)

Parameters:

Parameter	Type	Description	Required
`articleId`	`number`	Article ID from get_content/search_articles	Yes

Example:

{
  "articleId": 123
}

Response:

{
  "articles": [
    {
      "id": 123,
      "title": "Complete Article Title",
      "content": "Full article content with all HTML and formatting...",
      "link": "https://example.com/article",
      "pubDate": "2024-01-15T10:30:00Z",
      "fetchDate": "2024-01-15T11:00:00Z",
      "status": "unread",
      "feedTitle": "Engineering Blog",
      "feedCategory": "Technology",
      "excerpt": "First 200 characters..."
    }
  ],
  "success": true
}

Token-Efficient Workflow:

1. get_content(limit=20) → Browse 20 titles (excerpts are off by default)
2. Find interesting article with id=456
3. get_article_full(articleId=456) → Read full content
4. set_tag(articleId=456, status="favorite") → Save for later

7. get_sources

Get RSS feed sources with pagination and filtering. With hundreds of feeds, pagination is essential to avoid token limits.

Use this for:

Discovering available sources
Finding valid source names for filtering
Exploring feed categories
Browsing favorite blogs

Parameters:

Parameter	Type	Description	Default
`limit`	`number`	Number of sources to return (max recommended: 100)	`50`
`offset`	`number`	Offset for pagination (e.g., 50 for page 2)	`0`
`favoritesOnly`	`boolean`	Only show favorite blogs	`false`
`category`	`string`	Filter by category (case-insensitive, partial match)	All categories

Example (first page):

{
  "limit": 50,
  "offset": 0
}

Example (favorites only):

{
  "favoritesOnly": true,
  "limit": 20
}

Example (filter by category):

{
  "category": "Engineering",
  "limit": 30
}

Response:

{
  "sources": [
    {
      "id": 1,
      "title": "Engineering at Meta",
      "category": "Engineering Blogs",
      "url": "https://engineering.fb.com/feed/",
      "isFavorite": true
    },
    {
      "id": 2,
      "title": "Netflix Tech Blog",
      "category": "Engineering Blogs",
      "url": "https://netflixtechblog.com/feed",
      "isFavorite": false
    }
  ],
  "total": 518,
  "success": true
}

Pagination Example:

Page 1: offset=0, limit=50   → Sources 1-50 of 518
Page 2: offset=50, limit=50  → Sources 51-100 of 518
Page 3: offset=100, limit=50 → Sources 101-150 of 518
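
The same offsets can be generated programmatically when walking the full source list. The sketch below assumes the `total` value from a first response is used to plan the remaining requests; `pageRequests` and `PageRequest` are hypothetical names.

```typescript
interface PageRequest {
  offset: number;
  limit: number;
}

// Enumerate the offset/limit pairs needed to cover `total` items.
function pageRequests(total: number, limit = 50): PageRequest[] {
  if (!Number.isInteger(limit) || limit <= 0) {
    throw new Error("limit must be a positive integer");
  }
  const pages: PageRequest[] = [];
  for (let offset = 0; offset < total; offset += limit) {
    pages.push({ offset, limit });
  }
  return pages;
}

// For 518 sources at 50 per page this yields 11 requests;
// the last one starts at offset 500 and returns the final 18 sources.
const pages = pageRequests(518);
```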

8. set_tag

Update article status to manage your reading workflow.

Use this for:

Marking articles as read
Saving favorites
Archiving old articles
Managing reading queue

Parameters:

Parameter	Type	Description	Required
`articleId`	`number`	Article ID to update	Yes
`status`	`string`	New status: `"unread"`, `"read"`, `"favorite"`, `"archived"`	Yes

Example:

{
  "articleId": 123,
  "status": "favorite"
}

Article Status Workflow

The server supports a comprehensive 4-status workflow:

┌─────────┐
│ unread  │ ← New articles start here
└────┬────┘
     │
     ├──→ read      (marked as read)
     ├──→ favorite  (important/bookmarked)
     └──→ archived  (old/irrelevant)
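
On the client side, the four statuses map naturally onto a TypeScript union type, which catches misspelled statuses before a set_tag call is sent. This is a minimal sketch: `ARTICLE_STATUSES`, `isArticleStatus`, and `parseSetTagArgs` are hypothetical names, and the server's own validation may differ.

```typescript
// The four statuses documented above, as a single source of truth.
const ARTICLE_STATUSES = ["unread", "read", "favorite", "archived"] as const;
type ArticleStatus = (typeof ARTICLE_STATUSES)[number];

// Type guard: narrows an unknown value to ArticleStatus.
function isArticleStatus(value: unknown): value is ArticleStatus {
  return (
    typeof value === "string" &&
    (ARTICLE_STATUSES as readonly string[]).includes(value)
  );
}

// Validate untyped set_tag arguments before sending them.
function parseSetTagArgs(args: {
  articleId?: unknown;
  status?: unknown;
}): { articleId: number; status: ArticleStatus } {
  const { articleId, status } = args;
  if (typeof articleId !== "number" || !Number.isInteger(articleId)) {
    throw new Error("articleId must be an integer");
  }
  if (!isArticleStatus(status)) {
    throw new Error(`status must be one of: ${ARTICLE_STATUSES.join(", ")}`);
  }
  return { articleId, status };
}
```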

Vector Search & Embeddings

How Embeddings Work

Automatic Generation: When fetching RSS articles, the server automatically generates embeddings for articles from 2020 onwards
OpenAI Integration: Uses text-embedding-3-small model (1536 dimensions)
Deduplication: Embeddings are only generated once per article (checked by URL)
Graceful Degradation: If OPENAI_API_KEY is missing or invalid, the server continues to work normally (embeddings skipped)

Storage

Embeddings stored as vector(1536) in PostgreSQL using pgvector extension
Enables fast cosine similarity search: ORDER BY embedding <=> query_vector
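
As a rough illustration of the query shape, the sketch below builds a parameterized similarity query. pgvector accepts vectors as text literals of the form `[0.1,0.2,...]`, which can be cast with `::vector`. The table name `article`, the selected columns, and the helper names are assumptions for illustration; the project goes through TypeORM, so its actual query is likely built differently.

```typescript
// Serialize an embedding into pgvector's text format, e.g. "[0.1,0.2,0.3]".
function toVectorLiteral(embedding: number[]): string {
  if (embedding.some((x) => !Number.isFinite(x))) {
    throw new Error("embedding contains non-finite values");
  }
  return `[${embedding.join(",")}]`;
}

// Build a parameterized nearest-neighbour query ordered by cosine distance.
// Table and column names here are hypothetical; adjust them to the real schema.
function buildSimilarityQuery(
  queryEmbedding: number[],
  limit: number
): { text: string; values: [string, number] } {
  const text = [
    "SELECT id, title, embedding <=> $1::vector AS distance",
    "FROM article",
    "WHERE embedding IS NOT NULL",
    "ORDER BY embedding <=> $1::vector",
    "LIMIT $2",
  ].join("\n");
  return { text, values: [toVectorLiteral(queryEmbedding), limit] };
}
```

Passing the vector as a bound parameter rather than interpolating it into the SQL string keeps the query safe from injection and lets the statement be reused across searches.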

Cost Optimization

Only articles from 2020+ get embeddings (configurable in RssService.shouldGenerateEmbedding())
Duplicate articles are skipped (no redundant API calls)
Embedding generation failures don't block article saving
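
The rules above amount to a small predicate. The sketch below shows one way such a check could look; it is not the body of RssService.shouldGenerateEmbedding(), whose real signature is not shown here. The parameters, the 2020-01-01 UTC cutoff, and the decision to skip articles with a missing or invalid date are all assumptions.

```typescript
// Assumed cutoff: articles published on or after 2020-01-01 (UTC) qualify.
const EMBEDDING_CUTOFF = Date.UTC(2020, 0, 1);

// Decide whether an embedding API call is worth making for this article.
function shouldGenerateEmbedding(
  pubDate: Date | undefined,
  alreadyEmbedded: boolean,
  apiKey: string | undefined
): boolean {
  if (!apiKey) return false; // no key: skip embeddings, articles are still saved
  if (alreadyEmbedded) return false; // one embedding per article, no repeat calls
  if (!pubDate || Number.isNaN(pubDate.getTime())) return false; // unknown date: skip (assumption)
  return pubDate.getTime() >= EMBEDDING_CUTOFF;
}
```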

Development

Project Structure

mcp_rss/
├── src/
│   ├── entities/           # TypeORM entities
│   │   ├── Article.ts      # Article entity with vector embeddings
│   │   └── Feed.ts         # RSS feed source entity
│   ├── services/
│   │   ├── OpmlService.ts  # OPML parsing
│   │   ├── RssService.ts   # RSS fetching + embedding generation
│   │   ├── McpService.ts   # MCP tool implementations
│   │   └── EmbeddingService.ts # OpenAI embedding wrapper
│   ├── config/
│   │   └── database.ts     # TypeORM + pgvector setup
│   └── index.ts            # MCP server entry point
├── docker-compose.yml      # PostgreSQL with pgvector
├── .env.example            # Environment template
└── package.json

Building

npm run build        # Compile TypeScript
npm run watch        # Watch mode for development

Testing

# Test database connection
docker-compose ps

# Test MCP server locally
node dist/index.js

# Debug with MCP inspector
npx @modelcontextprotocol/inspector node dist/index.js

Troubleshooting

Database Connection Issues

Error: connect ETIMEDOUT

Ensure PostgreSQL is running: docker-compose ps
Check port 5433 is available: lsof -i :5433
Verify environment variables match docker-compose settings

OpenAI API Errors

Error: 401 Incorrect API key

Verify your API key at https://platform.openai.com/api-keys
Ensure you have available credits
Check the key isn't expired

Embeddings not being generated:

Confirm OPENAI_API_KEY is set: without it the server still runs, but embeddings are skipped
Check article dates (only 2020+ articles get embeddings)
Look for errors in console logs

MCP Server Issues

Server not appearing in Claude Desktop:

Check Claude Desktop config path is correct
Verify dist/index.js exists (run npm run build)
Restart Claude Desktop after config changes
Check Claude Desktop logs for errors

Performance Tips

Adjust Update Interval: Set RSS_UPDATE_INTERVAL to 60+ minutes for production
Limit Embedding Generation: Embeddings are only for articles from 2020+
Use Pagination: Always use offset and limit for large result sets
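Database Indexing: pgvector does not index vector columns automatically; for large collections, add an HNSW or IVFFlat index (vector_cosine_ops) on the embedding column if the schema setup does not already create one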

License

MIT

Contributing

Contributions welcome! Please ensure:

TypeScript compiles without errors (npm run build)
Environment variables are documented
New features include appropriate error handling
