# SEO Audit MCP Server

A Model Context Protocol (MCP) server that provides comprehensive technical SEO auditing tools, optimized for job board websites.
## Features

- **Page Analysis** - Deep analysis of individual pages, including meta tags, headings, structured data, rendering behavior, and links
- **Site Crawling** - Multi-page crawling with automatic page type classification
- **Lighthouse Integration** - Core Web Vitals and performance auditing
- **Sitemap Analysis** - Robots.txt and XML sitemap parsing with job-specific insights
- **JobPosting Schema Validation** - Specialized validation against Google's requirements
## Installation

### Prerequisites

- Node.js 18+
- Chrome/Chromium (for Playwright)
- Lighthouse CLI (optional, for performance audits)
### Setup

```bash
# Clone or extract the project
cd seo-audit-mcp

# Install dependencies
npm install

# Install Playwright browsers
npx playwright install chromium

# Install Lighthouse globally (optional but recommended)
npm install -g lighthouse

# Build the project
npm run build
```
### Configure Claude Desktop

Add to your Claude Desktop configuration (`~/Library/Application Support/Claude/claude_desktop_config.json` on macOS):

```json
{
  "mcpServers": {
    "seo-audit": {
      "command": "node",
      "args": ["/path/to/seo-audit-mcp/dist/index.js"]
    }
  }
}
```
### Configure Claude CLI

Add to your Claude CLI config, or use it directly:

```bash
# Using npx (after npm link)
claude --mcp-server "node /path/to/seo-audit-mcp/dist/index.js"
```
## Available Tools

### analyze_page

Analyze a single web page for SEO factors.

**Input:**

- `url` (required): The URL to analyze
- `waitForSelector`: CSS selector to wait for (useful for JS-heavy pages)
- `timeout`: Timeout in milliseconds (default: 30000)
- `device`: `'desktop'` or `'mobile'` (default: `desktop`)

**Output:**

- Meta tags, headings, structured data
- JobPosting schema validation
- JavaScript rendering analysis
- Link and image analysis
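A typical call to `analyze_page`, using only the parameters documented above (the URL and the `.job-description` selector are illustrative):

```json
{
  "name": "analyze_page",
  "arguments": {
    "url": "https://example.com/jobs/software-engineer",
    "device": "mobile",
    "waitForSelector": ".job-description",
    "timeout": 60000
  }
}
```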
### crawl_site

Crawl multiple pages starting from a URL.

**Input:**

- `startUrl` (required): Starting URL
- `maxPages`: Maximum pages to crawl (default: 50)
- `maxDepth`: Maximum link depth (default: 5)
- `includePatterns`: Regex patterns for URLs to include
- `excludePatterns`: Regex patterns for URLs to exclude

**Output:**

- Aggregated statistics
- Page type classification (job detail, category, and location pages)
- Duplicate detection
- Critical issues and warnings
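For example, to crawl only job-related URLs while skipping login and paginated search pages (the patterns shown are illustrative, not required values):

```json
{
  "name": "crawl_site",
  "arguments": {
    "startUrl": "https://example.com",
    "maxPages": 100,
    "maxDepth": 3,
    "includePatterns": ["/jobs/"],
    "excludePatterns": ["/login", "\\?page="]
  }
}
```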
### run_lighthouse

Run a Lighthouse performance audit.

**Input:**

- `url` (required): URL to audit
- `device`: `'mobile'` or `'desktop'` (default: `mobile`)
- `categories`: Array of categories to audit
- `saveReport`: Save the HTML report (default: false)

**Output:**

- Performance, Accessibility, Best Practices, and SEO scores
- Core Web Vitals (LCP, CLS, TBT, FCP, TTFB)
- Optimization opportunities
- Diagnostics
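A sketch of a call that audits only two categories; the category names shown assume Lighthouse's standard category IDs (`performance`, `accessibility`, `best-practices`, `seo`):

```json
{
  "name": "run_lighthouse",
  "arguments": {
    "url": "https://example.com/jobs",
    "device": "mobile",
    "categories": ["performance", "seo"],
    "saveReport": true
  }
}
```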
### analyze_sitemap

Analyze robots.txt and XML sitemaps.

**Input:**

- `baseUrl` (required): Base URL of the site
- `includeSitemapUrls`: Include the full URL list (default: true)
- `maxUrls`: Maximum URLs per sitemap (default: 1000)

**Output:**

- robots.txt rules and issues
- Discovered sitemaps
- Job URL detection
- Recommendations
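An example call; lowering `maxUrls` keeps the response small on large sitemaps (the limit shown is illustrative):

```json
{
  "name": "analyze_sitemap",
  "arguments": {
    "baseUrl": "https://example.com",
    "includeSitemapUrls": true,
    "maxUrls": 500
  }
}
```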
### check_urls

Check HTTP status codes for multiple URLs.

**Input:**

- `urls` (required): Array of URLs to check
- `timeout`: Timeout per URL in milliseconds

**Output:**

- Status code, redirect destination, and response time per URL
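An example call, useful for spot-checking how expired job URLs respond (the URLs are illustrative):

```json
{
  "name": "check_urls",
  "arguments": {
    "urls": [
      "https://example.com/jobs/123",
      "https://example.com/jobs/expired-456"
    ],
    "timeout": 10000
  }
}
```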
## Usage Examples

### Quick Page Audit

"Analyze the SEO of https://example.com/jobs/software-engineer"

### Full Site Audit

"Crawl https://example.com and analyze their job board SEO. Focus on structured data and landing pages."

### Performance Check

"Run a Lighthouse audit on https://example.com/jobs for mobile devices"

### Job Board Discovery

"Analyze the sitemap for https://example.com and find their job posting pages"
## Job Board Specific Features

This tool is optimized for job boards with:

1. **JobPosting Schema Validation**
   - Validates all required fields (title, description, datePosted, etc.)
   - Checks recommended fields (validThrough, baseSalary, employmentType)
   - Remote job validation (applicantLocationRequirements)
   - Expiration date checking

2. **Page Type Classification**
   - Job detail pages
   - Job listing/search pages
   - Category landing pages (e.g., /marketing-jobs/)
   - Location landing pages (e.g., /jobs-in-new-york/)
   - Company profile pages

3. **Expired Job Handling Analysis**
   - Detects 404s, redirects, and soft 404s
   - Checks validThrough dates in the schema
   - Recommends proper handling strategies

4. **Recommendations**
   - Google Indexing API implementation
   - Job-specific sitemaps
   - Landing page architecture
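For reference, a minimal JSON-LD snippet of the kind the JobPosting validator checks. The field values are illustrative; Google's structured data documentation is the authoritative source for which fields are required versus recommended:

```json
{
  "@context": "https://schema.org/",
  "@type": "JobPosting",
  "title": "Software Engineer",
  "description": "<p>Build and maintain our job search platform.</p>",
  "datePosted": "2024-01-15",
  "validThrough": "2024-03-15T00:00:00+00:00",
  "employmentType": "FULL_TIME",
  "hiringOrganization": {
    "@type": "Organization",
    "name": "Example Corp"
  },
  "jobLocation": {
    "@type": "Place",
    "address": {
      "@type": "PostalAddress",
      "addressLocality": "New York",
      "addressRegion": "NY",
      "addressCountry": "US"
    }
  }
}
```

A missing `validThrough` is a common cause of stale listings surviving in search results, which is why the expired job handling analysis checks it.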
## Development

```bash
# Run in development mode
npm run dev

# Run tests
npm test

# Lint code
npm run lint
```
## Architecture

```
src/
├── index.ts             # Entry point
├── server.ts            # MCP server implementation
├── tools/
│   ├── index.ts         # Tool registry
│   ├── crawl-page.ts    # Single page analysis
│   ├── crawl-site.ts    # Multi-page crawler
│   ├── lighthouse.ts    # Performance audits
│   └── sitemap.ts       # Sitemap/robots analysis
├── types/
│   └── index.ts         # TypeScript definitions
└── utils/
    ├── browser.ts       # Playwright helpers
    └── http.ts          # HTTP utilities
```
## Troubleshooting

### Playwright Issues

```bash
# Reinstall browsers
npx playwright install chromium --force

# On Linux, you may need system dependencies
npx playwright install-deps
```

### Lighthouse Not Found

```bash
# Install globally
npm install -g lighthouse

# Or use npx (slower)
npx lighthouse --version
```
### Permission Errors

The server needs to write to `/tmp` for temporary files. Ensure proper permissions.
### Timeout Errors

For slow sites, increase timeouts:

- Page analysis: use the `timeout` parameter
- Crawling: reduce `maxPages` or increase delays
## License

MIT
