mcp-selenium

SirBlobby/mcp-selenium

MCP Selenium

A comprehensive Model Context Protocol (MCP) server implementation for Selenium WebDriver, enabling advanced browser automation through standardized MCP clients like Claude Desktop and other MCP-compatible applications.

The server exposes more than 80 automation tools that let AI assistants control web browsers programmatically.

Installation

Install the package using npm:

npm install @sirblob/mcp-selenium

Alternatively, install it with pnpm:

pnpm add @sirblob/mcp-selenium

Usage

Add to your MCP client configuration:

{
  "mcpServers": {
    "selenium": {
      "command": "npx",
      "args": ["-y", "@sirblob/mcp-selenium"]
    }
  }
}
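If the client already has other servers configured, the entry above has to be merged in rather than pasted over the file. The following is a minimal sketch of that merge, assuming the configuration has the `mcpServers` shape shown above. The `files` entry and the `some-other-server` package are hypothetical placeholders, and the helper name `withSeleniumServer` is not part of this package.

```typescript
// Shape of the MCP client configuration shown above.
interface McpServerEntry {
  command: string;
  args: string[];
}

interface McpClientConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// Returns a copy of `config` with the Selenium server registered under `key`.
// Other entries are preserved; an existing entry with the same key is replaced.
function withSeleniumServer(config: McpClientConfig, key: string = "selenium"): McpClientConfig {
  return {
    ...config,
    mcpServers: {
      ...config.mcpServers,
      [key]: { command: "npx", args: ["-y", "@sirblob/mcp-selenium"] },
    },
  };
}

// Hypothetical existing configuration with one unrelated server.
const existingConfig: McpClientConfig = {
  mcpServers: { files: { command: "npx", args: ["-y", "some-other-server"] } },
};

const mergedConfig = withSeleniumServer(existingConfig);
console.log(JSON.stringify(mergedConfig, null, 2));
```

The helper returns a new object instead of mutating its input, so the original configuration can still be written back unchanged if the merge is abandoned.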

Supported Browsers

  • Chrome - Full feature support including headless mode
  • Firefox - Full feature support including headless mode
  • Edge - Full feature support including headless mode
  • Safari - Basic feature support (limited options)

Available Tools

Browser Management

  • start_browser - Launches a browser (Chrome, Firefox, Edge, or Safari) with optional configuration
  • navigate - Navigates to a specified URL
  • close_session - Closes the current browser session
  • get_browser_status - Gets the status of the current browser session
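MCP clients invoke tools by sending JSON-RPC 2.0 requests with the method `tools/call`. The sketch below builds the requests for one short session using the four tools in this group. The tool names come from the list above, but the argument names `browser`, `options`, and `url` are assumptions made for illustration; check the schemas the server reports before relying on them.

```typescript
// JSON-RPC 2.0 request for the MCP `tools/call` method.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextRequestId = 1;

// Wraps a tool name and its arguments in a `tools/call` request with a fresh id.
function toolCall(tool: string, args: Record<string, unknown> = {}): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextRequestId++,
    method: "tools/call",
    params: { name: tool, arguments: args },
  };
}

// ASSUMPTION: `browser`, `options`, and `url` are illustrative argument names,
// not confirmed by this README.
const sessionRequests: ToolCallRequest[] = [
  toolCall("start_browser", { browser: "chrome", options: { headless: true } }),
  toolCall("navigate", { url: "https://example.com" }),
  toolCall("get_browser_status"),
  toolCall("close_session"),
];

for (const request of sessionRequests) {
  console.log(JSON.stringify(request));
}
```

In practice an MCP client library handles this framing; the sketch only shows what travels over the wire.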

Element Finding and Interaction

  • find_element - Finds an element using various locator strategies
  • click_element - Clicks on an element
  • send_keys - Sends text input to an element (typing)
  • get_element_text - Gets the text content of an element
  • get_element_source - Gets the HTML source code of an element and all its child elements
  • upload_file - Uploads a file using a file input element
  • find_elements_by_xpath - Finds multiple elements using XPath
  • scroll_to_element - Scrolls to bring an element into view
  • highlight_element - Highlights an element with a colored border for debugging
  • find_parent_element - Finds the parent element of a given element
  • find_sibling_element - Finds sibling elements of a given element

Select Element Tools

  • select_option_by_text - Selects an option in a select element by its visible text
  • select_option_by_value - Selects an option in a select element by its value attribute
  • select_option_by_index - Selects an option in a select element by its index
  • get_select_options - Gets all available options from a select element
  • get_selected_option - Gets the currently selected option from a select element

Table Element Tools

  • get_table_data - Extracts all data from a table element
  • get_table_cell - Gets the content of a specific table cell by row and column
  • click_table_cell - Clicks on a specific table cell by row and column
  • get_table_row_count - Gets the number of rows in a table
  • get_table_column_count - Gets the number of columns in a table
  • find_table_row_by_text - Finds a table row that contains specific text

List Element Tools

  • get_list_items - Gets all items from a list element (ul or ol)
  • get_list_item - Gets a specific list item by index
  • click_list_item - Clicks on a specific list item by index
  • click_list_item_by_text - Clicks on a list item that contains specific text
  • find_list_item_by_text - Finds the index of a list item that contains specific text
  • get_list_item_count - Gets the number of items in a list
  • get_nested_lists - Gets information about nested lists within a list
  • filter_list_items - Filters list items based on text criteria

Element State and Properties

  • get_element_attribute - Gets an attribute value from an element
  • get_element_css_property - Gets a CSS property value from an element
  • is_element_displayed - Checks if an element is visible
  • is_element_enabled - Checks if an element is enabled/interactive
  • is_element_selected - Checks if an element is selected (for checkboxes, radio buttons)

Mouse and Keyboard Interactions

  • hover - Moves the mouse to hover over an element
  • hover_element - Alternative hover command for element interaction
  • drag_and_drop - Drags an element to another location
  • double_click - Double-clicks an element
  • right_click - Right-clicks an element (context menu)
  • send_key_combination - Sends keyboard combinations (Ctrl+C, Alt+Tab, etc.)
  • press_key - Simulates pressing a single keyboard key

Page Actions

  • take_screenshot - Captures a screenshot of the current page and saves it to the current directory with a timestamp
  • get_page_title - Gets the current page title
  • get_title - Alternative method to get the current page title
  • get_current_url - Gets the current page URL
  • get_page_source - Gets the complete HTML source of the page
  • page_source - Alternative method to get the complete HTML source of the page
  • refresh_page - Refreshes the current page
  • go_back - Navigates back in browser history
  • go_forward - Navigates forward in browser history

JavaScript Execution

  • execute_javascript - Executes JavaScript code in the browser
  • execute_async_javascript - Executes asynchronous JavaScript with callback support

Scrolling

  • scroll_to_element - Scrolls to bring an element into view
  • scroll_element_into_view - Alternative method to scroll an element into view
  • scroll_by_pixels - Scrolls by a specified number of pixels
  • scroll_to_coordinates - Scrolls to specific coordinates on the page
  • scroll_to_top - Scrolls to the top of the page
  • scroll_to_bottom - Scrolls to the bottom of the page

Window Management

  • get_window_size - Gets the current window dimensions
  • set_window_size - Sets the window size
  • maximize_window - Maximizes the browser window
  • get_window_handles - Gets all open window handles
  • switch_to_window - Switches to a specific window by handle
  • switch_to_window_by_title - Switches to a window/tab by its title
  • switch_to_window_by_url - Switches to a window/tab by its URL
  • close_window - Closes the current window
  • switch_to_frame - Switches to a frame by ID or name
  • switch_to_default_content - Switches back to the main document

Cookie Management

  • get_cookies - Gets all cookies for the current domain
  • add_cookie - Adds a new cookie
  • delete_cookie - Deletes a specific cookie
  • delete_all_cookies - Deletes all cookies

XPath Tools

  • evaluate_xpath - Evaluates an XPath expression and returns the result
  • count_elements_by_xpath - Counts elements matching an XPath expression
  • get_xpath_text_content - Gets text content from elements matching XPath
  • get_element_source_by_xpath - Gets HTML source of elements matching XPath
  • get_element_xpath - Gets the XPath of an element
  • get_elements_xpath - Gets XPath expressions for multiple elements
  • get_element_attribute_by_xpath - Gets an attribute value from an element using XPath
  • find_element_by_xpath_attribute - Finds elements by XPath and attribute values
  • find_element_by_xpath_index - Finds an element by XPath at a specific index
  • click_element_by_xpath_text - Clicks an element found by XPath containing specific text

Configuration Parameters

Browser Options

Configure browser behavior with optional parameters:

{
  "headless": false,
  "arguments": ["--window-size=1920,1080", "--disable-web-security", "--disable-dev-shm-usage"]
}

Common Browser Arguments:

  • --headless=new - Run in headless mode (Chrome/Edge)
  • --window-size=width,height - Set initial window size
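  • --disable-web-security - Disable the same-origin policy, which also lifts CORS restrictions (insecure; use only against trusted test pages)
  • --disable-dev-shm-usage - Write shared memory files to /tmp instead of /dev/shm, which avoids crashes when /dev/shm is small (common in Docker containers)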
  • --no-sandbox - Disable sandbox (useful in containerized environments)
  • --disable-gpu - Disable GPU hardware acceleration
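The flags above tend to be combined in the same few ways, so it can help to build the options object from higher-level choices. This is a minimal sketch: the `headless` and `arguments` fields mirror the example above, while `buildBrowserOptions`, `windowSize`, and `inContainer` are hypothetical names introduced here.

```typescript
// Options object in the shape shown above.
interface BrowserOptions {
  headless: boolean;
  arguments: string[];
}

// Hypothetical higher-level choices that are translated into flags.
interface OptionsInput {
  headless?: boolean;
  windowSize?: { width: number; height: number };
  inContainer?: boolean;
}

function buildBrowserOptions(input: OptionsInput = {}): BrowserOptions {
  const flags: string[] = [];
  if (input.windowSize) {
    flags.push(`--window-size=${input.windowSize.width},${input.windowSize.height}`);
  }
  if (input.inContainer) {
    // Flags from the list above that are commonly needed inside containers.
    flags.push("--no-sandbox", "--disable-dev-shm-usage");
  }
  return { headless: input.headless ?? false, arguments: flags };
}

const ciOptions = buildBrowserOptions({
  headless: true,
  windowSize: { width: 1920, height: 1080 },
  inContainer: true,
});
console.log(JSON.stringify(ciOptions));
```

The sketch deliberately leaves out `--disable-web-security`, since that flag weakens browser security and is better added explicitly when a test needs it.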

Locator Strategies

Multiple ways to find elements on the page:

  • id - Find by element ID (<div id="myElement">)
  • css - Find by CSS selector (div.class-name, #id-name)
  • xpath - Find by XPath expression (//div[@class='example'])
  • name - Find by name attribute (<input name="username">)
  • tag - Find by HTML tag name (div, span, input)
  • class - Find by class name (class-name)

Timeout Configuration

Most tools accept an optional timeout parameter in milliseconds (default: 10000, i.e. 10 seconds):

{
  "by": "id",
  "value": "submit-button",
  "timeout": 15000
}
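A client can apply the default and catch bad values before a request is sent. The following sketch builds the argument object from the example above; the helper name `findElementArgs` and the client-side validation are assumptions for illustration, not behavior of the server.

```typescript
// Default stated in the text above, in milliseconds.
const DEFAULT_TIMEOUT_MS = 10000;

interface FindElementArgs {
  by: string;
  value: string;
  timeout: number;
}

// Builds the argument object, falling back to the documented default and
// rejecting timeouts that are not positive, finite numbers.
function findElementArgs(by: string, value: string, timeoutMs: number = DEFAULT_TIMEOUT_MS): FindElementArgs {
  if (!isFinite(timeoutMs) || timeoutMs <= 0) {
    throw new RangeError(`timeout must be a positive number of milliseconds, got ${timeoutMs}`);
  }
  return { by, value, timeout: timeoutMs };
}

const slowPageArgs = findElementArgs("id", "submit-button", 15000);
const defaultArgs = findElementArgs("id", "submit-button");
console.log(JSON.stringify(slowPageArgs));
console.log(JSON.stringify(defaultArgs));
```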

Requirements

  • Node.js 22+
  • npm or pnpm
  • Browser drivers (automatically managed by Selenium)
  • TypeScript 5.0+ (dev dependency)

Troubleshooting

Common Issues

  • Driver not found: Selenium automatically downloads drivers, but ensure you have the target browser installed
  • Permission errors: On Linux, you may need to install browser packages (chromium-browser, firefox, etc.)
  • Timeout errors: Increase timeout values for slow-loading pages
  • Headless mode issues: Some features may not work in headless mode (file uploads, certain interactions)
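For timeout errors, one option is to retry with a progressively larger timeout instead of raising it everywhere. The sketch below only computes the timeout to use on each attempt; the function name, the doubling factor, and the 60000 ms cap are illustrative assumptions, not settings of this server.

```typescript
// Returns the timeout (in ms) to use on each attempt: the base value is
// multiplied by `factor` after every attempt and capped at `maxMs`.
function timeoutSchedule(baseMs: number, attempts: number, factor: number = 2, maxMs: number = 60000): number[] {
  const schedule: number[] = [];
  let current = baseMs;
  for (let i = 0; i < attempts; i++) {
    schedule.push(Math.min(current, maxMs));
    current *= factor;
  }
  return schedule;
}

// Start from the documented default of 10000 ms and allow four attempts.
const retryTimeouts = timeoutSchedule(10000, 4);
console.log(retryTimeouts.join(", ")); // 10000, 20000, 40000, 60000
```

Each value would be passed as the `timeout` argument of the tool call on the corresponding attempt.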

Platform-Specific Notes

  • macOS: Safari requires remote automation to be enabled, either through Develop > Allow Remote Automation in Safari or by running safaridriver --enable once
  • Linux: May require additional dependencies for GUI browsers
  • Windows: Should work out of the box with standard browser installations

License

MIT License. See the license file in the repository for details.

Acknowledgements

Inspired by @angiejones/mcp-selenium