SunZhi-Will/website-to-markdown-mcp
The Website to Markdown MCP Server is a robust tool designed to fetch website content and convert it into Markdown format, enhancing AI's ability to process and understand web information.
- fetch_website: Fetch any website and convert to Markdown
- list_configured_websites: List all configured websites for easy access
Website to Markdown MCP Server
A powerful Model Context Protocol (MCP) server designed for fetching website content and converting it to Markdown format, making it easier for AI to understand and process website information.
Key Features
Enhanced Processing | OpenAPI Support | Smart Analysis | Advanced Extraction |
---|---|---|---|
AI-powered content cleanup | OpenAPI 3.x/Swagger 2.0 | Reading time calculation | Main content detection |
Auto ad removal | Professional validation | Word count statistics | Language detection |
Content summarization | Structured API parsing | Smart retry mechanism | Multi-format support |
What's New in v1.2.0
Major Enhancements
Feature | Status | Description |
---|---|---|
Enhanced Content Processor | ✅ | AI-powered content cleaning and extraction |
Smart Analytics | ✅ | Word count, reading time, content summary |
Language Detection | ✅ | Automatic language identification |
Intelligent Retry | ✅ | Smart retry mechanism with exponential backoff |
Stealth Browser | ✅ | Anti-detection browsing capabilities |
Rate Limiting | ✅ | Built-in rate limiting and concurrency control |
Content Cleanup | ✅ | Remove ads, navigation, and irrelevant content |
Enhanced Markdown | ✅ | Support for strikethrough, underline, highlights |
Quick Start
Method 1: NPX Installation (Recommended)
Easiest way: No local installation needed!
Step 1: Create Configuration File
Create a my-websites.json file:
{
"websites": [
{
"name": "your_website",
"url": "https://your-website.com",
"description": "Your Project Website"
},
{
"name": "api_docs",
"url": "https://api.example.com/openapi.json",
"description": "Your API Specification"
}
]
}
Step 2: Configure MCP Server
Add to .cursor/mcp.json:
{
"mcpServers": {
"website-to-markdown": {
"command": "npx",
"args": ["-y", "website-to-markdown-mcp"],
"disabled": false,
"env": {
"WEBSITES_CONFIG_PATH": "./my-websites.json"
}
}
}
}
Step 3: Restart and Test
- Restart Cursor
- Open Chat and use Agent mode
- Test command:
Please list all configured websites
Done! No installation required!
Method 2: Local Installation
Best Practice: Use this method for development or customization!
Step 1: Clone and Build
git clone https://github.com/your-username/website-to-markdown-mcp.git
cd website-to-markdown-mcp
npm install
npm run build
Step 2: Configure MCP Server
Add to .cursor/mcp.json:
{
"mcpServers": {
"website-to-markdown": {
"command": "cmd",
"args": ["/c", "node", "./website-to-markdown-mcp/dist/index.js"],
"disabled": false,
"env": {
"WEBSITES_CONFIG_PATH": "./my-websites.json"
}
}
}
}
Note: the cmd /c wrapper in this example is specific to Windows. On macOS or Linux, set "command" to "node" and pass only the script path in "args".
Enhanced Output Features
Rich Content Analysis
Every fetched page now includes:
- Content Summary: AI-generated summary of the main content
- Reading Time: Estimated reading time based on content length
- Word Count: Word count that handles both English and Chinese text
- Language Detection: Automatic language identification
- Content Quality Score: Assessment of content relevance
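The README does not say how the word count and reading time are computed. The following is a minimal sketch of one plausible approach, not the server's actual code: each CJK ideograph counts as one word, Latin words are counted by runs of letters and digits, and the reading speed defaults to 250 words per minute. The function names and the 250 wpm rate are assumptions; the rate was chosen only because it agrees with the sample outputs in this README (1,250 words shown as 5 minutes, 650 words as 3 minutes).

```typescript
// Hypothetical helpers; names and the default reading speed are assumptions.
interface ContentStats {
  wordCount: number;
  readingTimeMinutes: number;
}

const CJK_IDEOGRAPH = /[\u4e00-\u9fff]/g;

function countWords(text: string): number {
  // Count each CJK ideograph as one word.
  const cjkMatches = text.match(CJK_IDEOGRAPH) ?? [];
  // Blank out CJK characters, then count runs of letters/digits
  // (allowing inner apostrophes and hyphens, e.g. "don't", "real-time").
  const latinMatches =
    text
      .replace(CJK_IDEOGRAPH, " ")
      .match(/[A-Za-z0-9]+(?:['-][A-Za-z0-9]+)*/g) ?? [];
  return cjkMatches.length + latinMatches.length;
}

function analyzeContent(text: string, wordsPerMinute = 250): ContentStats {
  const wordCount = countWords(text);
  // Round up so that any non-empty text reports at least one minute.
  const readingTimeMinutes =
    wordCount === 0 ? 0 : Math.ceil(wordCount / wordsPerMinute);
  return { wordCount, readingTimeMinutes };
}
```

Counting ideographs individually is a simplification; a production implementation might use a proper word segmenter for Chinese text.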
Enhanced Markdown Output
# Example Website
**Source**: https://example.com
**Website**: example_site - Example Website
**Reading Time**: 5 minutes
**Word Count**: 1,250 words
**Language**: English
**Summary**: This article discusses the latest developments in web technology...
---
[Enhanced Markdown content with better formatting...]
Complete OpenAPI/Swagger Support
Professional API Documentation
Feature | OpenAPI 3.x | Swagger 2.0 | Description |
---|---|---|---|
Auto Detection | ✅ | ✅ | Support JSON/YAML formats |
Professional Validation | ✅ | ✅ | Using @readme/openapi-parser |
Structured Parsing | ✅ | ✅ | Endpoints, parameters, responses |
Reference Resolution | ✅ | ✅ | Auto handle $ref references |
Smart Summary | ✅ | ✅ | Generate API overview |
Formatted Output | ✅ | ✅ | Readable Markdown |
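To illustrate the auto-detection row above: an OpenAPI 3.x document carries a top-level "openapi" field (for example "3.0.3"), while a Swagger 2.0 document carries "swagger": "2.0". The sketch below uses that distinction for JSON input only. The function and type names are hypothetical, and this is not the server's implementation, which according to the table relies on @readme/openapi-parser and also accepts YAML.

```typescript
// Hypothetical names; a sketch of version detection for JSON specs only.
type SpecKind = "openapi3" | "swagger2" | "unknown";

function detectSpecKind(doc: unknown): SpecKind {
  if (typeof doc !== "object" || doc === null) return "unknown";
  const record = doc as Record<string, unknown>;
  // OpenAPI 3.x documents declare their version in the "openapi" field.
  if (typeof record.openapi === "string" && record.openapi.startsWith("3.")) {
    return "openapi3";
  }
  // Swagger 2.0 documents declare "swagger": "2.0".
  if (record.swagger === "2.0") return "swagger2";
  return "unknown";
}

function detectJsonSpec(text: string): SpecKind {
  try {
    return detectSpecKind(JSON.parse(text));
  } catch {
    // Not valid JSON (it may be YAML or an ordinary web page).
    return "unknown";
  }
}
```

A caller could use the result to decide whether to render an API overview or fall back to ordinary HTML-to-Markdown conversion.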
Pre-configured Example Websites
{
"websites": [
{
"name": "petstore_openapi",
"url": "https://petstore3.swagger.io/api/v3/openapi.json",
"description": "Swagger Petstore OpenAPI 3.0 Spec (Demo)"
},
{
"name": "petstore_swagger",
"url": "https://petstore.swagger.io/v2/swagger.json",
"description": "Swagger Petstore Swagger 2.0 Spec (Demo)"
},
{
"name": "github_api",
"url": "https://raw.githubusercontent.com/github/rest-api-description/main/descriptions/api.github.com/api.github.com.json",
"description": "GitHub REST API OpenAPI Spec"
}
]
}
Installation & Setup
System Requirements
- Node.js 20.18.1+ (Recommended: v22.15.0 LTS)
- npm 10.0.0+ or yarn
- Cursor Editor
Important: Some dependencies require Node.js v20.18.1 or higher. Please update your Node.js version if you encounter engine compatibility warnings.
NPM Package Installation
# Global installation
npm install -g website-to-markdown-mcp
# Or use directly with npx (recommended)
npx website-to-markdown-mcp
Development Setup
# 1. Clone repository
git clone https://github.com/your-username/website-to-markdown-mcp.git
cd website-to-markdown-mcp
# 2. Install dependencies
npm install
# 3. Build project
npm run build
Advanced Configuration Options
Configuration Priority Order
graph TD
A[Check Environment Variable<br/>WEBSITES_CONFIG_PATH] --> B{File exists?}
B -->|Yes| C[Load External Config File]
B -->|No| D[Check Environment Variable<br/>WEBSITES_CONFIG]
D --> E{Valid JSON?}
E -->|Yes| F[Load Embedded Config]
E -->|No| G[Check config.json]
G --> H{File exists?}
H -->|Yes| I[Load Local Config]
H -->|No| J[Use Default Config]
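The priority order in the flowchart can be sketched as a small resolver. This is an illustration under stated assumptions rather than the server's code: the type and function names are hypothetical, the default configuration is a placeholder (the real default is not shown in this README), and a file that exists but fails to parse falls through to the next source, which the flowchart does not specify. File access is injected as a function so the sketch has no dependencies.

```typescript
// Hypothetical types and names; a sketch of the lookup order only.
interface WebsiteEntry { name: string; url: string; description: string; }
interface WebsitesConfig { websites: WebsiteEntry[]; }
type ConfigSource = "external-file" | "embedded-env" | "local-file" | "default";
// Returns the file's text, or null when the file does not exist.
type FileReader = (path: string) => string | null;

// Placeholder default; the server's real default configuration is an unknown here.
const DEFAULT_CONFIG: WebsitesConfig = { websites: [] };

function parseConfig(text: string | null | undefined): WebsitesConfig | null {
  if (!text) return null;
  try {
    const parsed = JSON.parse(text);
    const valid =
      typeof parsed === "object" && parsed !== null && Array.isArray(parsed.websites);
    return valid ? (parsed as WebsitesConfig) : null;
  } catch {
    return null;
  }
}

function resolveConfig(
  env: Record<string, string | undefined>,
  readFile: FileReader,
): { source: ConfigSource; config: WebsitesConfig } {
  // 1. External file named by WEBSITES_CONFIG_PATH.
  const externalPath = env.WEBSITES_CONFIG_PATH;
  const external = externalPath ? parseConfig(readFile(externalPath)) : null;
  if (external) return { source: "external-file", config: external };

  // 2. JSON embedded in WEBSITES_CONFIG.
  const embedded = parseConfig(env.WEBSITES_CONFIG);
  if (embedded) return { source: "embedded-env", config: embedded };

  // 3. config.json in the project root.
  const local = parseConfig(readFile("config.json"));
  if (local) return { source: "local-file", config: local };

  // 4. Built-in default.
  return { source: "default", config: DEFAULT_CONFIG };
}
```

In Node.js, the reader could be built from fs.existsSync and fs.readFileSync, with process.env passed as the first argument.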
Configuration Method Details
Method 1: External Configuration File (Recommended)
Advantages: Easy to edit, syntax highlighting, version control friendly
Detailed Setup Steps
- Create Configuration File
# Can be placed anywhere
touch my-api-configs.json
- Edit Configuration Content
{
"websites": [
{
"name": "my_docs",
"url": "https://docs.example.com",
"description": "My Documentation Website"
}
]
}
- Set Environment Variable
{
"env": {
"WEBSITES_CONFIG_PATH": "./my-api-configs.json"
}
}
Method 2: Embedded JSON (Backward Compatible)
Configuration Example
{
"mcpServers": {
"website-to-markdown": {
"command": "cmd",
"args": ["/c", "node", "./website-to-markdown-mcp/dist/index.js"],
"disabled": false,
"env": {
"WEBSITES_CONFIG": "{\"websites\":[{\"name\":\"example\",\"url\":\"https://example.com\",\"description\":\"Example Website\"}]}"
}
}
}
}
Method 3: Local config.json
Local Configuration
Directly edit config.json in the project root directory:
{
"websites": [
{
"name": "local_site",
"url": "https://local.example.com",
"description": "Local Test Website"
}
]
}
Available Tools
General Tools
Tool Name | Function | Parameters | Example |
---|---|---|---|
fetch_website | Fetch any website | url: Website URL | Fetch OpenAPI spec files |
list_configured_websites | List configured websites | None | View all available websites |
Dedicated Tools
Each configured website automatically generates a corresponding dedicated tool:
- fetch_petstore_openapi: Fetch Petstore OpenAPI 3.0 spec
- fetch_petstore_swagger: Fetch Petstore Swagger 2.0 spec
- fetch_github_api: Fetch GitHub API spec
- fetch_tailwind_css: Fetch Tailwind CSS documentation
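The tool names listed above follow the pattern fetch_ plus the website's configured name. A minimal sketch of that mapping is shown below. The ToolDescriptor shape, the function name, and the description wording are assumptions for illustration, not the server's actual registration code.

```typescript
// Hypothetical types and names; only the fetch_<name> pattern comes from the README.
interface WebsiteEntry { name: string; url: string; description: string; }
interface ToolDescriptor { name: string; description: string; url: string; }

function toDedicatedTools(websites: WebsiteEntry[]): ToolDescriptor[] {
  return websites.map((site) => ({
    // Each configured website yields one tool named fetch_<name>.
    name: `fetch_${site.name}`,
    description: `Fetch ${site.description}`,
    url: site.url,
  }));
}

const exampleTools = toDedicatedTools([
  {
    name: "petstore_openapi",
    url: "https://petstore3.swagger.io/api/v3/openapi.json",
    description: "Swagger Petstore OpenAPI 3.0 Spec (Demo)",
  },
]);
console.log(exampleTools[0].name);
```

Because the tool name is derived from the "name" field, names in the configuration file should be unique and limited to characters that are valid in a tool identifier.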
Enhanced Output Format Examples
General Website Content with Analytics
# Website Title
**Source**: https://example.com
**Website**: example_site - Example Website
**Reading Time**: 3 minutes
**Word Count**: 650 words
**Language**: English
**Summary**: This article provides a comprehensive overview of modern web development practices, covering frontend frameworks, backend technologies, and deployment strategies.
---
[Enhanced cleaned Markdown content with ads removed and main content extracted...]
OpenAPI 3.x Specification File
# Example API (v2.1.0)
**Source**: https://api.example.com/openapi.json
**OpenAPI Version**: 3.0.3
**Validation Status**: ✅ Valid
**Processing Time**: 1.2 seconds
**Endpoints**: 25 endpoints
**Server Locations**: 3 servers
---
## API Basic Information
- **API Name**: Example API
- **Version**: 2.1.0
- **OpenAPI Version**: 3.0.3
- **Description**: A powerful example API for modern applications
## Servers
1. **https://api.example.com**
- Production server
2. **https://staging-api.example.com**
- Testing server
## API Endpoints
Total of **25** endpoints:
### `/users`
- **GET**: Get user list
- **POST**: Create new user
### `/users/{id}`
- **GET**: Get specific user
- **PUT**: Update user information
- **DELETE**: Delete user
## Components
- **Schemas**: 12 data models
- **Parameters**: 8 reusable parameters
- **Responses**: 15 reusable responses
- **Security Schemes**: 3 security mechanisms
Usage Examples
Basic Usage
Please fetch the content from https://docs.example.com and convert to markdown
OpenAPI Specification Fetching
Please use the fetch_petstore_openapi tool to fetch Petstore OpenAPI specification
Documentation Website Fetching
Please fetch React official documentation content
Troubleshooting
Complete Troubleshooting Guide: See the project's troubleshooting guide for detailed solutions to common issues.
Quick Solutions
Node.js Version Issues
Error: npm WARN EBADENGINE Unsupported engine
- Solution: Update Node.js to v20.18.1 or higher
- Download: Node.js Official Website
- Verify:
node --version
Module Not Found Issues
Error: Cannot find module './db.json'
- Solution 1: Clear npm cache:
npm cache clean --force
- Solution 2: Update Node.js version
- Solution 3: Use local installation instead of npx
Configuration Issues
Q: Configuration changes not taking effect?
- Confirm the JSON format is correct
- Restart Cursor
- Check the environment variable names
Q: JSON format errors?
- Use a JSON validator
- Confirm that strings and keys use double quotes
- Check for trailing commas
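The checks above can also be automated. The sketch below parses a configuration string and reports problems as a list of messages. It is a hypothetical helper, not part of the package, and treating "name", "url", and "description" as all required is an assumption based on the configuration examples in this README.

```typescript
// Hypothetical validator; the set of required fields is an assumption.
function validateWebsitesConfig(jsonText: string): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(jsonText);
  } catch (error) {
    // Single quotes and trailing commas both end up here.
    return [`Invalid JSON: ${(error as Error).message}`];
  }
  const root = parsed as { websites?: unknown } | null;
  if (typeof root !== "object" || root === null || !Array.isArray(root.websites)) {
    return ['Top-level "websites" array is missing'];
  }
  const problems: string[] = [];
  root.websites.forEach((entry: unknown, index: number) => {
    const site = (typeof entry === "object" && entry !== null ? entry : {}) as Record<string, unknown>;
    for (const key of ["name", "url", "description"]) {
      if (typeof site[key] !== "string" || site[key] === "") {
        problems.push(`websites[${index}].${key} must be a non-empty string`);
      }
    }
  });
  return problems;
}
```

An empty result means the text parsed and every entry has the three expected fields; it does not check that the URLs are reachable.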
Debug Mode
Detailed logs are output to stderr at startup:
# View debug messages
npm run dev 2> debug.log
Performance & Optimization
Performance Features
- Smart Retry: Intelligent retry with exponential backoff
- Rate Limiting: Built-in rate limiting to prevent overload
- Content Filtering: Remove irrelevant content for faster processing
- Ad Removal: Automatic ad and popup removal
- Stealth Mode: Anti-detection browsing capabilities
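As an illustration of the retry feature, here is a generic sketch of retry with exponential backoff. It shows the general technique, not the server's implementation: the option names, the doubling factor, and the delay cap are all assumptions, and the README does not state the actual retry counts or delays.

```typescript
// Hypothetical option names; a generic exponential-backoff sketch.
interface RetryOptions {
  maxAttempts: number;
  baseDelayMs: number;
  maxDelayMs: number;
}

function backoffDelay(attempt: number, options: RetryOptions): number {
  // attempt is 1-based: base, 2x base, 4x base, ... capped at maxDelayMs.
  return Math.min(options.maxDelayMs, options.baseDelayMs * 2 ** (attempt - 1));
}

async function withRetry<T>(
  operation: () => Promise<T>,
  options: RetryOptions,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= options.maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      // Wait before the next attempt, but not after the final failure.
      if (attempt < options.maxAttempts) {
        await sleep(backoffDelay(attempt, options));
      }
    }
  }
  throw lastError;
}
```

A fetch call could be wrapped as withRetry(() => fetchPage(url), { maxAttempts: 3, baseDelayMs: 500, maxDelayMs: 5000 }), where fetchPage is a placeholder for whatever function performs the request. Adding random jitter to the delay is a common refinement when many clients retry at once.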
Security Considerations
- Use HTTPS websites only (recommended)
- Malicious scripts are filtered automatically
- Output content length is limited
- Stealth browsing is used to avoid detection
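Two of the points above, preferring HTTPS and limiting output length, are easy to show in code. The helpers below are a hypothetical sketch: the function names, the truncation marker, and any specific length limit are assumptions, since the README does not state the values the server uses.

```typescript
// Hypothetical helpers; the truncation marker and limits are assumptions.
function isHttpsUrl(candidate: string): boolean {
  try {
    return new URL(candidate).protocol === "https:";
  } catch {
    // Not a parseable absolute URL.
    return false;
  }
}

function truncateOutput(markdown: string, maxChars: number): string {
  if (markdown.length <= maxChars) return markdown;
  return `${markdown.slice(0, maxChars)}\n\n[Content truncated]`;
}
```

A caller might warn rather than fail when isHttpsUrl returns false, since the README lists HTTPS as recommended rather than mandatory.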
Dependencies
Package | Version | Purpose |
---|---|---|
@modelcontextprotocol/sdk | ^1.0.0 | MCP Core Framework |
@readme/openapi-parser | ^4.1.0 | Professional OpenAPI Parsing |
axios | ^1.6.0 | HTTP Request Handling |
cheerio | ^1.0.0 | HTML Parsing Engine |
turndown | ^7.1.2 | HTML to Markdown |
yaml | ^2.8.0 | YAML Format Support |
zod | ^3.22.0 | Data Validation Framework |
playwright | ^1.40.0 | Browser automation |
Changelog
v1.2.0 (Latest)
Major Feature Updates
- Added: Enhanced content processing with AI-powered cleanup
- Added: Smart analytics (word count, reading time, content summary)
- Added: Language detection and multi-language support
- Added: Stealth browser capabilities for anti-detection
- Added: Built-in rate limiting and retry mechanisms
- Added: Advanced content filtering and ad removal
- Enhanced: Markdown processing with more HTML element support
- Improved: Output format with rich metadata
- Fixed: Various technical issues and dependencies
v1.1.0 (Previous)
Major Feature Updates
- Added: Full OpenAPI 3.x/Swagger 2.0 support
- Added: JSON/YAML format auto-detection
- Added: Professional-grade spec validation and reference resolution
- Added: Version auto-adaptation mechanism
- Added: Structured API documentation summary
- Pre-configured: Multiple OpenAPI/Swagger examples
- Added: NPM package distribution with npx support
- Enhanced: Installation methods for better user experience
v1.0.0 (Stable)
- Initial Release
- Basic Functions: Website content fetching
- Core Functions: Markdown conversion
- Configuration Support: Multi-website management
Contributing
How to Contribute
- Fork this project
- Create a feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Issue Reporting
Report issues on the Issues page. Please include:
- Issue Description
- Reproduction Steps
- Environment Information
- Screenshots or Logs
License
This project is licensed under the MIT License - see the LICENSE file for details.
If this project helps you, please give it a Star!
Have questions or suggestions? Feel free to open an Issue!
Made by Sun ❤️ for the Developer Community