📖 Introduction
An intelligent server based on Model Context Protocol (MCP) that helps researchers and developers track the latest AI/LLM research progress in real-time.
🎯 Core Features
- 📚 Multi-source Integration - arXiv, GitHub, Hugging Face, Papers with Code
- 🔍 Smart Search - Search by keywords, domains, and time ranges
- 📊 Auto Summary - Automated daily/weekly research digest generation
- ⚡ Efficient Caching - Smart caching mechanism to reduce API calls
- 🌍 Comprehensive Coverage - 15+ AI research areas covered
✨ Features
📚 Multi-source Data Integration
- arXiv - Search latest AI/ML academic papers
- Papers with Code - Get popular papers with code implementations
- Hugging Face - Daily featured papers, trending models and datasets
- GitHub - Track high-star AI projects and trending repositories
🎯 Covered AI Research Areas
- Core AI/ML: Large Language Models (LLM), Transformer, Deep Learning
- Multimodal & Generation: CLIP, Stable Diffusion, Text-to-Image
- Robotics: Embodied AI, Robot Arm Control, Navigation
- Bioinformatics: Protein Folding, Drug Discovery, Genomics
- AI for Science: Scientific Computing, Physics Simulation
- Reinforcement Learning: Multi-agent, Policy Gradient, Offline RL
- Graph Neural Networks: Molecular Modeling, Knowledge Graphs
- Efficient AI: Model Compression, Quantization, LoRA
- AI Safety: Alignment, Interpretability, Fairness
- Emerging Directions: Federated Learning, Continual Learning, Neuromorphic Computing
🛠️ MCP Tools
- search_latest_papers - Search latest AI papers
- search_github_repos - Search trending AI GitHub repositories
- get_daily_papers - Get today's featured papers
- get_trending_repos - Get GitHub trending repositories
- get_trending_models - Get Hugging Face trending models
- search_by_area - Search by research area (LLM, Vision, Robotics, etc.)
- generate_daily_summary - Generate daily AI research digest
- generate_weekly_summary - Generate weekly AI research digest
📊 MCP Resources
- `ai-research://daily-summary` - Daily AI research digest (auto-cached)
- `ai-research://weekly-summary` - Weekly AI research digest (auto-cached)
🚀 Quick Start
Prerequisites
- Python 3.10+
- pip package manager
- Claude Desktop (recommended) or other MCP clients
Installation Steps
```bash
# 1. Clone the repository
git clone https://github.com/nanyang12138/AI-Research-MCP.git
cd AI-Research-MCP

# 2. Install dependencies
pip install -e .

# 3. (Optional) Configure GitHub Token
cp .env.example .env
# Edit the .env file and add your GitHub token
```
💡 Tip: See QUICKSTART.md for detailed installation guide
⚙️ Configuration
Environment Variables (Optional)
Create a .env file:
```bash
# GitHub Personal Access Token (highly recommended)
# Raises the API rate limit: 60 req/h → 5000 req/h
GITHUB_TOKEN=ghp_xxxxxxxxxxxxxxxxxxxx

# Cache directory (optional, defaults to .cache)
CACHE_DIR=.cache

# Cache expiry times (in seconds)
CACHE_EXPIRY_GITHUB=3600    # 1 hour
CACHE_EXPIRY_ARXIV=7200     # 2 hours
CACHE_EXPIRY_SUMMARY=86400  # 24 hours
```
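For illustration, here is a sketch of how the server might read these variables. The variable names and defaults match the `.env` keys above; the `load_config` helper itself is hypothetical, not the project's actual code.

```python
import os

def load_config(env=os.environ) -> dict:
    """Read the documented environment variables, falling back to their defaults."""
    return {
        "github_token": env.get("GITHUB_TOKEN"),  # None → unauthenticated (60 req/h)
        "cache_dir": env.get("CACHE_DIR", ".cache"),
        "expiry": {
            "github": int(env.get("CACHE_EXPIRY_GITHUB", 3600)),
            "arxiv": int(env.get("CACHE_EXPIRY_ARXIV", 7200)),
            "summary": int(env.get("CACHE_EXPIRY_SUMMARY", 86400)),
        },
    }
```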
🔑 Getting GitHub Token
A token is optional but highly recommended to avoid API rate limits.
1. Visit GitHub Token Settings
2. Click `Generate new token (classic)`
3. Select the `public_repo` permission
4. Copy the generated token
5. Add it to your `.env` file:
GITHUB_TOKEN=ghp_your_token_here
💬 Using with Claude Desktop
Configure Claude Desktop
Edit Claude Desktop configuration file:
| OS | Configuration File Path |
|---|---|
| macOS | ~/Library/Application Support/Claude/claude_desktop_config.json |
| Windows | %APPDATA%\Claude\claude_desktop_config.json |
| Linux | ~/.config/Claude/claude_desktop_config.json |
Method 1: Using Python Command (Recommended)
```json
{
  "mcpServers": {
    "ai-research": {
      "command": "python",
      "args": ["-m", "ai_research_mcp.server"],
      "env": {
        "GITHUB_TOKEN": "your_github_token_here"
      }
    }
  }
}
```
Method 2: Using Absolute Path
```json
{
  "mcpServers": {
    "ai-research": {
      "command": "C:\\Users\\YourName\\path\\to\\python.exe",
      "args": ["-m", "ai_research_mcp.server"],
      "env": {
        "GITHUB_TOKEN": "your_github_token_here"
      }
    }
  }
}
```
Restart Claude Desktop
After configuration, restart Claude Desktop to load the MCP server.
You should see a 🔌 icon in the bottom right corner of the chat window, indicating the MCP server is connected.
📖 Usage Examples
In Claude Desktop, you can ask questions like:
🔍 Search Latest Papers
Find me papers about large language models from the past week
Search for recent research on multimodal models from the last 3 days
Any new papers on Diffusion Models?
💻 Find GitHub Repositories
What are some new high-star LLM related repositories?
Find some GitHub projects about robot learning
What are the trending AI open source projects recently?
📊 Get Daily Digest
Generate today's AI research digest
Show me this week's AI research progress
Any important AI news today?
🎯 Search by Domain
Find me the latest AI research in bioinformatics
Search for latest papers and projects in reinforcement learning
What's new in computer vision?
🤖 Track Models
What are the trending new models on Hugging Face?
Any popular text generation models recently?
Any newly released open-source LLMs?
💡 See EXAMPLES.md for more usage examples
🏗️ Technical Architecture
Project Structure
```
ai-research-mcp/
├── src/
│   └── ai_research_mcp/
│       ├── __init__.py
│       ├── server.py                  # MCP server main file
│       ├── data_sources/              # Data source clients
│       │   ├── arxiv_client.py
│       │   ├── github_client.py
│       │   ├── huggingface_client.py
│       │   └── papers_with_code_client.py
│       └── utils/
│           └── cache.py               # Cache management
├── pyproject.toml
└── README.md
```
Caching Mechanism
To reduce API calls and improve response speed, the server implements file caching:
- GitHub API results cached for 1 hour
- arXiv search results cached for 2 hours
- Daily/weekly digests cached for 24 hours
Cache files are stored in the .cache directory (configurable via environment variables).
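The expiry behavior described above can be sketched with a small stdlib-only file cache. This `FileCache` class is an illustrative stand-in, assuming file modification time as the freshness marker; it is not the project's actual `utils/cache.py`.

```python
import hashlib
import json
import time
from pathlib import Path

class FileCache:
    """Minimal expiring file cache: entries older than `expiry` seconds are misses."""

    def __init__(self, cache_dir: str = ".cache", expiry: int = 3600):
        self.dir = Path(cache_dir)
        self.dir.mkdir(parents=True, exist_ok=True)
        self.expiry = expiry

    def _path(self, key: str) -> Path:
        # Hash the key so arbitrary query strings become safe filenames.
        return self.dir / (hashlib.sha256(key.encode()).hexdigest() + ".json")

    def get(self, key: str):
        path = self._path(key)
        if not path.exists() or time.time() - path.stat().st_mtime > self.expiry:
            return None  # missing or expired
        return json.loads(path.read_text())

    def set(self, key: str, value) -> None:
        self._path(key).write_text(json.dumps(value))
```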
🌐 API Data Sources
arXiv
- API: arXiv API
- Limits: Maximum 1 request per 3 seconds
- Coverage: cs.AI, cs.CL, cs.LG, cs.CV, cs.RO, q-bio.*, etc.
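The 1-request-per-3-seconds limit above can be respected with a simple throttle that computes how long to sleep before the next call. This `Throttle` class is an illustrative sketch, not the project's actual client code.

```python
class Throttle:
    """Enforce a minimum interval between requests (arXiv asks for ~3 s)."""

    def __init__(self, min_interval: float = 3.0):
        self.min_interval = min_interval
        self._last: float | None = None

    def wait_time(self, now: float) -> float:
        """Seconds to sleep before the next request is allowed."""
        if self._last is None:
            return 0.0
        return max(0.0, self.min_interval - (now - self._last))

    def record(self, now: float) -> None:
        """Mark that a request was just sent at time `now`."""
        self._last = now
```

In real use, `now` would be `time.monotonic()` and the caller would `time.sleep(throttle.wait_time(now))` before each request.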
GitHub
- API: GitHub REST API v3
- Limits:
- Without token: 60 requests/hour
- With token: 5000 requests/hour
- Recommendation: Configure GitHub Token
Hugging Face
- API: Hugging Face Hub API
- Limits: Relatively lenient, caching recommended
- Data: Daily papers, models, datasets
Papers with Code
- API: Papers with Code API
- Limits: Relatively lenient
- Features: Papers + code implementations
🔧 Troubleshooting
❓ Why are search results empty?
Possible reasons:
- Keywords too specific → Try more general terms
- Time range too short → Increase the `days` parameter
- API rate limit → Wait a few minutes and retry
- Network issues → Check network connection
⚠️ GitHub API Rate Limit Error
Solution: Configure GITHUB_TOKEN environment variable
Rate limit comparison:
- ❌ Without Token: 60 requests/hour
- ✅ With Token: 5000 requests/hour
🚫 Server Startup Failed
Checklist:
- Python version >= 3.10
- Dependencies installed: `pip install -e .`
- Configuration file path correct
- Environment variables set correctly
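The first checklist item can be verified programmatically; `check_python` is a hypothetical helper for illustration, not part of the project.

```python
import sys

def check_python() -> bool:
    """The server requires Python 3.10+; report whether this interpreter qualifies."""
    ok = sys.version_info >= (3, 10)
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {'OK' if ok else 'too old'}")
    return ok
```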
🔄 Cached Data Outdated
Delete cache directory to refresh:
```bash
# Linux/macOS
rm -rf .cache

# Windows
rmdir /s .cache
```
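A cross-platform alternative to the shell commands above; the `clear_cache` helper is illustrative, not part of the project.

```python
import shutil
from pathlib import Path

def clear_cache(cache_dir: str = ".cache") -> bool:
    """Delete the cache directory if it exists; return whether anything was removed."""
    path = Path(cache_dir)
    if path.is_dir():
        shutil.rmtree(path)
        return True
    return False
```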
🆘 More issues? Check TROUBLESHOOTING.md or Submit an Issue
👨‍💻 Development
Running Tests
```bash
# Install dev dependencies
pip install -e ".[dev]"

# Run all tests
pytest

# Run a specific test script
python test_clients.py
```
Code Formatting
```bash
# Format code
black src/

# Lint check
ruff check src/

# Type checking (optional)
mypy src/
```
🤝 Contributing
We welcome all forms of contributions!
How to Contribute
1. Fork this repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
Contribution Guidelines
- Follow existing code style
- Add appropriate tests
- Update relevant documentation
- Ensure all tests pass
📄 License
This project is licensed under the MIT License - see LICENSE file for details
🙏 Acknowledgments
Special thanks to the following projects and services:
- Anthropic MCP - Model Context Protocol
- arXiv API - Academic paper data
- GitHub API - Code repository data
- Hugging Face Hub - Models and datasets
- Papers with Code - Papers and code pairing
📝 Changelog
v0.1.0 (2025-10-28)
🎉 Initial Release
- ✅ Integrated 4 major data sources: arXiv, GitHub, Hugging Face, Papers with Code
- ✅ Implemented 8 MCP tools and 2 MCP resources
- ✅ Smart caching mechanism
- ✅ Coverage of 15+ AI research areas
- ✅ Complete documentation and examples
🗺️ Roadmap
v0.2.0 (Planned)
- Add OpenReview and SemanticScholar integration
- Support custom keyword subscriptions
- Improve caching strategy and performance optimization
- Add more unit tests
v0.3.0 (Future)
- Web interface
- Email notification feature
- Export to PDF/HTML
- Visualization charts
v1.0.0 (Long-term)
- Multi-language support (full Chinese & English)
- Smart recommendation algorithm
- Mobile support
💬 Community
- 💡 Submit Feature Requests
- 🐛 Report Bugs
- 💭 Join Discussions
- ⭐ If you find it useful, please give us a Star!
If you find this project helpful, please give it a ⭐ Star!
Made with ❤️ by the AI Research Community
