4bd4ll4h/mcp-devtools-browser
This Model Context Protocol (MCP) server enhances LLMs with browser automation capabilities, enabling them to autonomously explore web pages, inspect network traffic, and generate robust web-scraping scripts.
DevTools Browser for developers
A powerful Model Context Protocol (MCP) server that provides LLMs with browser automation capabilities using Puppeteer.
🚀 Quick Start
```bash
npm install -g @4bd4ll4h/mcp-devtools-browser
```
Add to your MCP client configuration:
```json
{
  "mcpServers": {
    "devtools-browser": {
      "command": "npx",
      "args": ["@4bd4ll4h/mcp-devtools-browser"]
    }
  }
}
```
✨ Features
- Browser Automation: Open, navigate, and control browser pages
- DOM Inspection: Extract structured DOM data with accessibility focus
- Network Monitoring: Capture and analyze network requests
- Event Logging: Comprehensive session tracking and debugging
- User Interactions: Click, type, scroll, hover, and more
- Visual Capture: Screenshots and visual analysis
- Resource Management: Automatic cleanup and memory management
📖 Documentation
- Complete tool and resource documentation
- Usage examples and tutorials
- Architecture and contribution guidelines
- How to contribute to this project
🤝 Contributing
We welcome contributions! Please read the contribution and development guidelines listed under Documentation above.
Project Specification
Project Overview
This project aims to build a Model Context Protocol (MCP) server that assists an LLM in generating high-quality, reliable web-scraping scripts using TypeScript + Puppeteer.
The MCP will act as a managed gateway between:
- A real browser environment (Puppeteer)
- Page lifecycle events (requests, responses, selectors, DOM state)
- An LLM tasked with understanding the page and generating scraping actions
The goal:
Allow the LLM to autonomously explore pages, inspect network traffic, extract DOM node paths/selectors, and generate robust scraping scripts on demand.
Core Use-Cases
1. Data Extraction From Any Website
   - Offers, products, tables, PDFs, metadata, images
2. Network Intelligence
   - Detect backend API calls
   - Infer JSON data structures
   - Prioritize structured data over rendered HTML
3. Dynamic DOM Inspection
   - Choose stable selectors
   - Scroll and lazily load content
   - Handle shadow DOM, iframes, modals
Why an MCP?
MCP provides:
- Structured bidirectional workflow
- Model actions with schema
- Better orchestration
- Reproducibility
It enables constructing LLM-driven scraping agents.
Primary Technical Challenges
This system must solve:
✅ Exposing DevTools-like insights to an LLM
- Network requests/responses
- Headers, bodies, error codes
✅ Large DOM visibility
- Without overloading token limits
✅ Robust stateful browsing
- Tabs
- Navigation history
- Parallel extraction
High-Level Architecture
```
LLM <--> MCP Server <--> Puppeteer Controller
                              |
                              ├── Browser Pool (multi-page sessions)
                              ├── Network Listener
                              └── DOM Snapshot Manager
```
Key Components
1. Browser Manager
Responsibilities:
- Start/stop browser instances
- Create new tabs
- Close tabs
- Report session info
Recommendations:
- Maintain an internal registry keyed by `sessionId`
- One session per LLM conversation
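A minimal sketch of that registry, assuming one Puppeteer `Browser` per session (the `BrowserManager` class and method names are illustrative, not a final API):

```typescript
import puppeteer, { Browser } from 'puppeteer';

// Illustrative session registry: one browser instance per sessionId.
class BrowserManager {
  private sessions = new Map<string, Browser>();

  // Reuse the session's browser, launching it lazily on first use.
  async getBrowser(sessionId: string): Promise<Browser> {
    let browser = this.sessions.get(sessionId);
    if (!browser) {
      browser = await puppeteer.launch({ headless: true });
      this.sessions.set(sessionId, browser);
    }
    return browser;
  }

  // Tear down everything owned by the session.
  async closeSession(sessionId: string): Promise<void> {
    await this.sessions.get(sessionId)?.close();
    this.sessions.delete(sessionId);
  }
}
```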
2. Tab/Page Manager
For each page:
- Navigation
- Click, Type, Scroll, WaitForSelector
- Save screenshots
- Persist page state
Recommended:
- Limit max open tabs
- Auto-cleanup resources
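A sketch of the tab cap and registry, assuming the hard limit of five tabs from the Session Rules section below (names are illustrative):

```typescript
import { Browser, Page } from 'puppeteer';

const MAX_TABS = 5; // hard limit, per the session rules below

// Illustrative per-session tab manager: enforces the cap, tracks pages by id.
class PageManager {
  private tabs = new Map<string, Page>();

  constructor(private browser: Browser) {}

  async openTab(tabId: string, url: string): Promise<Page> {
    if (this.tabs.size >= MAX_TABS) {
      throw new Error(`Tab limit (${MAX_TABS}) reached; close a tab first`);
    }
    const page = await this.browser.newPage();
    await page.goto(url, { waitUntil: 'domcontentloaded' });
    this.tabs.set(tabId, page);
    return page;
  }

  async closeTab(tabId: string): Promise<void> {
    await this.tabs.get(tabId)?.close();
    this.tabs.delete(tabId);
  }
}
```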
3. Network Interceptor
Goals:
- Capture all XHR/fetch calls
- Inspect responses
- Identify API endpoints
- Detect potential structured data sources
MCP tool actions:
- `getNetworkRequests()`
- `filterRequestsBy(url|type|status)`
- `fetchResponseBody(requestId)`
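One way to back these actions is an in-memory list populated from `page.on('response')`; the record shape below is an assumption, not a fixed schema:

```typescript
import { Page, HTTPResponse } from 'puppeteer';

// Assumed record shape; bodies are fetched lazily (fetchResponseBody)
// to keep memory bounded.
interface NetworkRecord {
  requestId: string;
  url: string;
  method: string;
  resourceType: string;
  status: number;
}

// Record every response so the LLM can filter later.
function trackNetwork(page: Page, store: NetworkRecord[]): void {
  page.on('response', (response: HTTPResponse) => {
    const request = response.request();
    store.push({
      requestId: String(store.length), // simple monotonically increasing id
      url: response.url(),
      method: request.method(),
      resourceType: request.resourceType(),
      status: response.status(),
    });
  });
}

// Sketch of filterRequestsBy: match on type, status, or URL substring.
function filterRequestsBy(
  store: NetworkRecord[],
  filter: { type?: string; status?: number; contains?: string }
): NetworkRecord[] {
  return store.filter(r =>
    (filter.type === undefined || r.resourceType === filter.type) &&
    (filter.status === undefined || r.status === filter.status) &&
    (filter.contains === undefined || r.url.includes(filter.contains))
  );
}
```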
4. DOM Inspector
Challenge: Pages can be huge; LLM token limits apply.
Approaches:
A. DOM Chunking
Split the DOM into slices:
- By depth
- By visual viewport
- By selector path
B. Selector Spotlight
LLM requests:
- Highlight possible selectors for hovered element or query
C. CSS Path Generation
Automatically compute:
- CSS selectors
- XPath
- Robust heuristic selectors
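As a sketch, one common heuristic for computing a CSS path (anchor on ids, otherwise disambiguate with `nth-of-type`; not necessarily the exact rules this project will ship), written to run inside the page via `page.evaluate`:

```typescript
// Illustrative CSS-path generator; runs in the page context.
function cssPath(el: Element): string {
  const parts: string[] = [];
  let node: Element | null = el;
  while (node) {
    if (node.id) {
      parts.unshift(`#${node.id}`); // an id uniquely anchors the path
      break;
    }
    const current = node;
    const parent = current.parentElement;
    let selector = current.tagName.toLowerCase();
    if (parent) {
      // Disambiguate among same-tag siblings with nth-of-type.
      const sameTag = Array.from(parent.children)
        .filter(c => c.tagName === current.tagName);
      if (sameTag.length > 1) {
        selector += `:nth-of-type(${sameTag.indexOf(current) + 1})`;
      }
    }
    parts.unshift(selector);
    node = parent;
  }
  return parts.join(' > ');
}
```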
Recommended LLM-Facing MCP Actions
Navigation
- `navigate(url)`
- `goBack()`
- `goForward()`
DOM Interrogation
- `querySelector(selector)`
- `querySelectorAll(selector, limit)`
- `extractAttributes(selector, attrs[])`
- `getBoundingClientRect(selector)`
- `scroll(amount)`
- `scrollToBottom()`
Network
- `listNetworkRequests(type?)`
- `getRequestDetails(requestId)`
- `getResponseBody(requestId)`
Screenshots
- `captureScreenshot(mode=viewport|full)`
Debugging
- `printConsoleLogs()`
- `printNetworkErrors()`
Utility
- `generateSelectorsAtPoint(x, y)`
Browser/Tab Lifecycle Strategy
Session Rules
- Each MCP session creates one browser instance
- Tabs are registered and tracked
- Hard limit (e.g., 5) to avoid memory blowup
Garbage Collection
- Idle tabs > N minutes → auto close
- Close all tabs on session end
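A sketch of the idle-tab sweep, assuming five minutes for the N above (the `TrackedTab` shape and sweep interval are illustrative):

```typescript
import { Page } from 'puppeteer';

const IDLE_LIMIT_MS = 5 * 60 * 1000; // assumed value for "N minutes"

interface TrackedTab {
  page: Page;
  lastUsed: number; // updated by every tool action that touches the tab
}

// Close tabs that have been idle longer than the limit; check once a minute.
function startIdleSweep(tabs: Map<string, TrackedTab>): NodeJS.Timeout {
  return setInterval(async () => {
    const now = Date.now();
    for (const [tabId, tab] of tabs) {
      if (now - tab.lastUsed > IDLE_LIMIT_MS) {
        await tab.page.close();
        tabs.delete(tabId);
      }
    }
  }, 60_000);
}
```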
Tab Identification
Return structured tab state:
```json
{
  "tabId": "abc123",
  "url": "...",
  "title": "...",
  "loading": false
}
```
Exposing DevTools-Like Capabilities
Approach A — Chrome DevTools Protocol Events
Use:
- `page.on('request')`
- `page.on('response')`
- `page.on('console')`
Pros:
- Real-time
- Low overhead
Approach B — Intercept & Store Network History
Store:
- Method
- URL
- Status
- Request body
- Response size/body hints
Let the LLM filter later.
Approach C — Filter by Type
- XHR
- Fetch
- Media
- Stylesheet
- Script
Useful for target discovery.
Recommended: All three.
Making Large DOMs LLM-Friendly
Approach A — Contextual Chunking
Split the DOM by:
- visible sections
- semantic regions (`<section>`, `<article>`)
Approach B — Selector-Only Summaries
Instead of dumping HTML, provide a `selector -> value` summary. Example:

```
.OfferTitle -> "Summer Sale"
.OfferPrice -> "$19.99"
```
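A sketch of producing that summary with Puppeteer's `page.$eval` (the selectors are the hypothetical ones from the example above):

```typescript
import { Page } from 'puppeteer';

// Return `selector -> text` pairs instead of raw HTML.
async function summarizeSelectors(
  page: Page,
  selectors: string[]
): Promise<Record<string, string>> {
  const summary: Record<string, string> = {};
  for (const selector of selectors) {
    summary[selector] = await page
      .$eval(selector, el => el.textContent?.trim() ?? '')
      .catch(() => '(no match)'); // $eval rejects when nothing matches
  }
  return summary;
}

// Usage: summarizeSelectors(page, ['.OfferTitle', '.OfferPrice'])
// -> { '.OfferTitle': 'Summer Sale', '.OfferPrice': '$19.99' }
```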
Approach C — On-Demand Snapshot
LLM asks:
"Give me the DOM for the 'products' container"
You respond with localized HTML only.
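A sketch of such a localized snapshot, with an assumed character cap so a single response cannot blow the token budget:

```typescript
import { Page } from 'puppeteer';

// Return only the requested container's HTML, truncated past maxChars.
async function snapshotContainer(
  page: Page,
  selector: string,
  maxChars = 20_000 // assumed cap, tune to the model's context window
): Promise<string> {
  const html = await page.$eval(selector, el => el.outerHTML);
  return html.length > maxChars
    ? html.slice(0, maxChars) + '<!-- truncated -->'
    : html;
}
```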
Recommended: A + C combined.
Proposed Approach (Evaluation)
Strengths
- ✔ Monitoring requests catches hidden APIs
- ✔ DOM extraction enables visual scraping
- ✔ Intercepting the page lifecycle gives completeness
Weaknesses / Risks
- ⚠ Dumping the full DOM = token explosion
- ⚠ JSON responses can be massive
- ⚠ Too many network logs → noise
- ⚠ Repeated structures confuse LLMs
Potential Quality Issues
- Selector instability (dynamic classes)
- Infinite scroll complexity
- Event timing issues
- CSP blocking screenshots
We will mitigate these using heuristics and detection rules.
Recommended Selector Stability Heuristics
- Prefer:
  - `data-*` attributes
  - Semantic HTML
  - Parent chains
- Avoid:
  - Obfuscated class names
  - Auto-generated IDs
- Validate:
  - That the selector matches a consistent count across scroll events
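A sketch of the consistency check from the Validate rule, reading "consistent" as "the match count never shrinks while scrolling" (the one-second settle delay is an arbitrary choice):

```typescript
import { Page } from 'puppeteer';

// A selector is treated as stable if it keeps matching after a scroll
// that may trigger lazy loading.
async function isSelectorStable(page: Page, selector: string): Promise<boolean> {
  const countMatches = () => page.$$eval(selector, els => els.length);

  const before = await countMatches();
  if (before === 0) return false;

  // Scroll one viewport and let lazy-loaded content settle.
  await page.evaluate(() => window.scrollBy(0, window.innerHeight));
  await new Promise(resolve => setTimeout(resolve, 1000));

  const after = await countMatches();
  return after >= before; // matches should not disappear on scroll
}
```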
Script Generation Philosophy
Fallback order, best to worst:
1. Structured API JSON (best)
2. Semantic HTML
3. Computed DOM text
4. Visual scraping (worst)
The LLM should operate with this hierarchy.
MCP Tool Schema Examples
Example Action: List Network Requests
```json
{
  "name": "listNetworkRequests",
  "arguments": {
    "type": "xhr",
    "status": 200,
    "contains": "offers"
  }
}
```
Example Action: Query DOM
```json
{
  "name": "querySelectorAll",
  "arguments": {
    "selector": ".offer-card",
    "limit": 20,
    "attributes": ["href", "innerText"]
  }
}
```
LLM Workflow Example
1. Navigate to the target URL
2. Monitor the network for JSON endpoints
3. Request DOM snapshots of target regions
4. Choose stable selectors
5. Generate a reusable scraping script in TypeScript
6. Test selectors on multiple pages (if paginated)
7. Output structured results
Error Handling Strategy
- Expose structured errors
- Include stack traces
- Inform LLM of transient failures
Example:
```json
{
  "error": "SelectorNotFound",
  "selector": ".price",
  "attempts": 3
}
```
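A sketch of a matching structured-error shape in TypeScript (the `transient` and `stack` fields follow the "transient failures" and "stack traces" points above; all names are illustrative):

```typescript
// Structured error returned to the LLM instead of a bare exception.
interface ScrapeError {
  error: 'SelectorNotFound' | 'NavigationTimeout' | string;
  selector?: string;
  attempts?: number;
  transient?: boolean; // lets the LLM decide whether a retry makes sense
  stack?: string;      // included for debugging, per the strategy above
}

function selectorNotFound(selector: string, attempts: number): ScrapeError {
  return { error: 'SelectorNotFound', selector, attempts, transient: false };
}
```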
Future Extensions
- PDF downloading
- File metadata extraction
- Accessibility tree scraping
- Snapshot diff detection
- Session replay
Security Considerations
- Do not allow navigation to `localhost` ports
- Disable downloads by default
- Sanitize file output paths
- Strip sensitive request headers
Technology Stack
- Language: TypeScript
- Browser Automation: Puppeteer
- Protocol: MCP
- State Storage: In-memory map
- Parser Tools:
  - DOM traversal utilities
  - CSS/XPath generator libraries
Folder Structure (Proposed)
```
/src
  /mcp
    actions/
    schemas/
    router.ts
  /browser
    BrowserManager.ts
    PageManager.ts
    NetworkTracker.ts
    DomInspector.ts
  utils/
  index.ts
  types.ts
```
Success Criteria
✅ The LLM can:
- Inspect network calls
- Read DOM structure safely
- Navigate tabs
- Identify stable selectors
- Generate robust scripts
✅ The agent:
- Avoids full DOM dumps
- Uses API endpoints when possible
- Extracts structured results reliably
End Goal
A fully autonomous scraping assistant that can:
- Discover data sources
- Generate resilient extraction logic
- Produce TypeScript/Node scripts
- Handle dynamic web apps
Ready to Build
With this specification, Cursor AI has:
- Global context
- Architecture
- Best-practice heuristics
- Risks
- Workflows
- Expected APIs
This file should power smart, context-aware coding assistance.
Next steps:
- Code scaffolding
- MCP action definitions
- Puppeteer wrapper implementations
- Selector heuristics
- JSON schema contracts