NAYEMAHMED000/cleanweb-mcp
CleanWeb MCP: Efficiently Extract Core Web Content 🌐
Overview
CleanWeb MCP is a lightweight Model Context Protocol (MCP) server that extracts the core content of a web page. It automatically filters out ads and other irrelevant elements and converts what remains into clean Markdown, making web content easier to consume and share.
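To make the "filter, then convert" idea concrete, here is a deliberately tiny sketch of such a pipeline. It is not CleanWeb MCP's actual implementation: the function names (`stripNoise`, `toMarkdown`), the list of tags treated as noise, and the regex-based approach are all assumptions chosen for illustration. A real extractor would more likely use an HTML parser and a readability heuristic.

```typescript
// Hypothetical illustration only: not the project's actual extraction logic.

// Tags whose entire contents are treated as noise (an assumed list).
const NOISE_TAGS = ["script", "style", "nav", "aside", "footer", "iframe"];

// Remove noise elements together with everything inside them.
function stripNoise(html: string): string {
  let out = html;
  for (const tag of NOISE_TAGS) {
    const pattern = new RegExp(`<${tag}\\b[^>]*>[\\s\\S]*?</${tag}>`, "gi");
    out = out.replace(pattern, "");
  }
  return out;
}

// Convert a very small subset of HTML (h1-h3 and p) to Markdown.
function toMarkdown(html: string): string {
  return stripNoise(html)
    .replace(
      /<h([1-3])\b[^>]*>([\s\S]*?)<\/h\1>/gi,
      (_m: string, level: string, text: string) =>
        `\n\n${"#".repeat(Number(level))} ${text.trim()}\n\n`,
    )
    .replace(
      /<p\b[^>]*>([\s\S]*?)<\/p>/gi,
      (_m: string, text: string) => `\n\n${text.trim()}\n\n`,
    )
    .replace(/<[^>]+>/g, "") // drop any remaining tags
    .replace(/\n{3,}/g, "\n\n") // collapse runs of blank lines
    .trim();
}
```

With this sketch, `toMarkdown("<nav>Menu</nav><h1>Example Title</h1><p>Main content.</p>")` returns `"# Example Title\n\nMain content."`: the navigation block is dropped and the heading and paragraph become Markdown.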
Features
- Intelligent Content Extraction: Focuses on the main content of web pages.
- Ad Filtering: Automatically removes ads and distractions.
- Markdown Conversion: Outputs clean content in Markdown format.
- Lightweight: Efficient and fast, with minimal resource usage.
- Real-time Updates: Supports Server-Sent Events (SSE) for real-time data streaming.
- Node.js Support: Built on Node.js, ensuring compatibility with modern web technologies.
- TypeScript: Written in TypeScript for better maintainability and type safety.
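The SSE feature listed above streams data in the standard `text/event-stream` format. The README does not document which events the server emits, so the sketch below only shows how a client might parse that wire format in general. The function name `parseSse` and the event names in the usage note are hypothetical.

```typescript
// One parsed Server-Sent Event. Field names follow the text/event-stream format.
interface SseEvent {
  event: string; // defaults to "message" when no "event:" field is present
  data: string; // multiple "data:" lines are joined with "\n"
}

// Parse a complete chunk of text/event-stream text into events.
function parseSse(chunk: string): SseEvent[] {
  const events: SseEvent[] = [];
  // Events are separated by a blank line.
  for (const block of chunk.split(/\r?\n\r?\n/)) {
    let event = "message";
    const data: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      if (line === "" || line.startsWith(":")) continue; // blank or comment line
      const colon = line.indexOf(":");
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? "" : line.slice(colon + 1);
      if (value.startsWith(" ")) value = value.slice(1); // strip one leading space
      if (field === "event") event = value;
      else if (field === "data") data.push(value);
    }
    // Blocks without any data field (for example, comments) produce no event.
    if (data.length > 0) events.push({ event, data: data.join("\n") });
  }
  return events;
}
```

For example, `parseSse("event: progress\ndata: fetching\n\n")` yields a single event with `event` set to `"progress"` and `data` set to `"fetching"`. A streaming client would also need to buffer partial chunks until a blank line arrives, which this sketch leaves out.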
Installation
To get started with CleanWeb MCP, follow these steps:
- Clone the repository:
  git clone https://github.com/NAYEMAHMED000/cleanweb-mcp.git
- Navigate to the project directory:
  cd cleanweb-mcp
- Install the dependencies:
  npm install
- Start the server:
  npm start
- Access the server at http://localhost:3000.
Usage
To use CleanWeb MCP, send a POST request to the server with the URL of the web page you want to extract content from. Here’s a simple example using curl:
curl -X POST http://localhost:3000/extract -H "Content-Type: application/json" -d '{"url": "https://example.com"}'
The server will respond with the extracted content in Markdown format. You can then use this content in your applications or for documentation purposes.
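The same request can be made from TypeScript. The following is a minimal sketch, assuming the server is running at the address used in the curl example and returns the JSON shape described in this README. The function name `extractMarkdown` and the `FetchLike` type are made up for this example. The fetch function is passed in as a parameter so the sketch can be exercised without a running server.

```typescript
// Minimal shapes for the parts of the Fetch API this sketch uses, so the
// example type-checks without DOM or Node type definitions.
interface MinimalResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<MinimalResponse>;

// Assumed default; matches the address used in the curl example.
const DEFAULT_BASE_URL = "http://localhost:3000";

// POST a page URL to /extract and return the Markdown from the response.
async function extractMarkdown(
  pageUrl: string,
  fetchImpl: FetchLike,
  baseUrl: string = DEFAULT_BASE_URL,
): Promise<string> {
  const response = await fetchImpl(`${baseUrl}/extract`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ url: pageUrl }),
  });
  if (!response.ok) {
    throw new Error(`Extraction failed with HTTP status ${response.status}`);
  }
  // The shape is taken from this README and is not validated here.
  const payload = (await response.json()) as { markdown: string };
  return payload.markdown;
}
```

On Node.js 18 or later, the built-in `fetch` can be passed directly, for example `const markdown = await extractMarkdown("https://example.com", fetch);`.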
API Documentation
Endpoints
POST /extract
- Description: Extracts core content from the specified URL.
- Request Body:
  { "url": "string" }
- Response:
  { "markdown": "string" }
Example Response
{
  "markdown": "# Example Title\n\nThis is the main content extracted from the web page."
}
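For TypeScript callers, the request and response shapes above can be written as types with a small runtime check. This is a sketch based only on the fields documented in this section. The names `ExtractRequest`, `ExtractResponse`, `isExtractResponse`, `serializeExtractRequest`, and `parseExtractResponse` are hypothetical and are not exported by the project.

```typescript
// Request and response shapes as documented for POST /extract.
interface ExtractRequest {
  url: string;
}
interface ExtractResponse {
  markdown: string;
}

// Serialize a request body for POST /extract.
function serializeExtractRequest(request: ExtractRequest): string {
  return JSON.stringify(request);
}

// Runtime check that an unknown value matches the documented response shape.
function isExtractResponse(value: unknown): value is ExtractResponse {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as Record<string, unknown>).markdown === "string"
  );
}

// Parse a raw JSON body and return the Markdown, or throw if the shape is wrong.
function parseExtractResponse(body: string): string {
  const parsed: unknown = JSON.parse(body);
  if (!isExtractResponse(parsed)) {
    throw new Error("Unexpected response shape: expected { markdown: string }");
  }
  return parsed.markdown;
}
```

Validating at the boundary like this means a changed or malformed response fails with a clear error instead of surfacing later as an `undefined` value.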
Contributing
We welcome contributions to CleanWeb MCP! If you have ideas for improvements or new features, please follow these steps:
- Fork the repository.
- Create a new branch for your feature or bug fix.
- Make your changes and commit them.
- Push your branch to your forked repository.
- Create a pull request to the main repository.
Please ensure your code adheres to the project's coding standards and includes tests where applicable.
License
CleanWeb MCP is licensed under the MIT License. See the license file in the repository for details.
Acknowledgments
- Thanks to the open-source community for their contributions and support.
- Special thanks to the developers of Node.js and TypeScript for their incredible tools.
For the latest releases and updates, visit our Releases section.