hao-cyber/phone-mcp

3.6


Phone MCP Plugin lets users control their Android phones through ADB commands, with tools ranging from placing calls and sending text messages to UI automation.

Tools

Functions exposed to the LLM to take actions

call_number

Make a phone call to the specified number.

Initiates a call using Android's dialer app through ADB. The number will be dialed immediately without requiring user confirmation.

Args: phone_number (str): The phone number to call. Country code will be automatically added if not provided.

Returns: str: Success message with the number being called, or an error message if the call could not be initiated.

end_call

End the current phone call.

Terminates any active phone call by sending the end call keycode through ADB.

Returns: str: Success message if the call was ended, or an error message if the end call command failed.

check_device_connection

Check if an Android device is connected via ADB.

Verifies that an Android device is properly connected and recognized by ADB, which is required for all other functions to work.

Returns: str: Status message indicating whether a device is connected and ready, or an error message if no device is found.
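
As a rough illustration of what such a check involves, the sketch below parses the text printed by the adb devices command and keeps only the serials reported in the "device" state. The helper name connected_devices and the sample output are assumptions made for illustration; this page does not show how the plugin implements the check.

```python
def connected_devices(adb_devices_output: str) -> list:
    """Return the serials that `adb devices` reports as ready to use."""
    serials = []
    for line in adb_devices_output.splitlines():
        parts = line.split()
        # Device rows have two fields: "<serial> <state>". The header line
        # "List of devices attached" has four fields and is skipped.
        if len(parts) == 2 and parts[1] == "device":
            serials.append(parts[0])
    return serials


# Hypothetical sample of `adb devices` output (not captured from a real run).
sample = (
    "List of devices attached\n"
    "emulator-5554\tdevice\n"
    "0123456789ABCDEF\tunauthorized\n"
)
print(connected_devices(sample))
```

A device listed as "unauthorized" or "offline" is visible to ADB but cannot accept commands, which is why the sketch filters on the state column rather than counting rows.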

send_text_message

Send a text message to the specified number.

Uses the phone's messaging app with UI automation to send SMS. Process: Opens messaging app, fills recipient and content, automatically clicks send button, then auto-exits app.

Args:
    phone_number (str): Recipient's phone number. The country code is added automatically if not included. Example: "13812345678" or "+8613812345678"
    message (str): SMS content. Supports any text, including emojis. Example: "Hello, this is a test message"

Returns: str: String description of the operation result:
    - Success: "Text message sent to {phone_number}"
    - Failure: Message containing the error reason, like "Failed to open messaging app: {error}" or "Failed to navigate to send button: {error}"
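
The automatic country-code handling mentioned for phone_number could look roughly like the sketch below. The helper name with_country_code and the default of "+86" are assumptions, chosen only because the documented example numbers use that prefix; the plugin's real rule for picking a country code is not described here.

```python
def with_country_code(phone_number: str, default_code: str = "+86") -> str:
    """Prepend a default country code unless the number already has one."""
    # Drop common separators so "138-1234 5678" and "13812345678" match.
    number = phone_number.strip().replace(" ", "").replace("-", "")
    if number.startswith("+"):
        return number  # already in international format
    return default_code + number


print(with_country_code("13812345678"))
print(with_country_code("+8613812345678"))
```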

receive_text_messages

Get recent text messages from the phone.

Retrieves recent SMS messages from the device's SMS database using ADB and content provider queries to get structured message data.

Args: limit (int): Maximum number of messages to retrieve (default: 5). Example: 10 returns the 10 most recent messages.

Returns: str: JSON string containing messages, or an error message:
    - Success: Formatted JSON string with a list of messages, each with the fields:
        - address: Sender's number
        - body: Message content
        - date: Timestamp
        - formatted_date: Human-readable date and time (like "2023-07-25 14:30:22")
    - Failure: Text message describing the error, like "No recent text messages found..."
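
Because the same return value is either JSON or a plain-text error, a caller has to tell the two apart. The sketch below is one way to do that; it assumes the success payload is a top-level JSON list, and the helper name parse_messages and the sample values are made up for illustration.

```python
import json


def parse_messages(result: str) -> list:
    """Return the list of messages, or [] when the tool returned error text."""
    try:
        data = json.loads(result)
    except json.JSONDecodeError:
        return []  # failure results are plain text, not JSON
    # Assumption: a successful result is a top-level JSON list.
    return data if isinstance(data, list) else []


# Hypothetical payload using the documented field names.
sample = json.dumps([
    {
        "address": "10086",
        "body": "Hello",
        "date": "1690266622000",
        "formatted_date": "2023-07-25 14:30:22",
    }
])

for msg in parse_messages(sample):
    print(f'{msg["formatted_date"]}  {msg["address"]}: {msg["body"]}')
```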

get_sent_messages

Get recently sent text messages from the phone.

Retrieves sent SMS messages from the device's SMS database. This provides a complete list of messages that were successfully sent from this device.

Args: limit (int): Maximum number of sent messages to retrieve (default: 5)

Returns: str: JSON string containing sent messages, each with:
    - from: Sender phone number (device owner)
    - to: Recipient phone number
    - text: Message content
    - date: Timestamp
    - formatted_date: Human-readable date and time (like "2023-07-25 14:30:22")

start_screen_recording

Start recording the phone's screen.

Records the screen activity for the specified duration and saves the video to the phone's storage. Automatically creates directories if they don't exist.

Args: duration_seconds (int): Recording duration in seconds (default: 30, max: 180 seconds due to ADB limitations)

Returns: str: Success message with the path to the recording, or an error message if the recording could not be started.

play_media

Simulate media button press to play/pause media.

This function sends a keyevent that simulates pressing the media play/pause button, which can control music, videos, or podcasts that are currently playing.

Returns: str: Success message, or an error message if the command failed.
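
Key events of this kind are normally sent with the adb shell input keyevent command. The sketch below only builds the argument list and does not run it; the helper name keyevent_command and the action names are assumptions, while the numeric codes are the standard Android values for KEYCODE_ENDCALL (used by tools like end_call) and KEYCODE_MEDIA_PLAY_PAUSE. Whether the plugin issues exactly these commands is not confirmed by this page.

```python
# Standard Android key codes, keyed by hypothetical action names.
KEYCODES = {
    "end_call": 6,     # KEYCODE_ENDCALL
    "play_pause": 85,  # KEYCODE_MEDIA_PLAY_PAUSE
}


def keyevent_command(action: str) -> list:
    """Build the adb argument list that sends one key event."""
    if action not in KEYCODES:
        raise ValueError(f"unknown action: {action}")
    return ["adb", "shell", "input", "keyevent", str(KEYCODES[action])]


print(keyevent_command("play_pause"))
```

The resulting list can be handed to subprocess.run on a machine where adb is installed and a device is connected.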

set_alarm

Set an alarm on the phone.

Creates a new alarm with the specified time and label using the default clock application.

Args:
    hour (int): Hour in 24-hour format (0-23)
    minute (int): Minute (0-59)
    label (str): Optional label for the alarm (default: "Alarm")

Returns: str: Success message if the alarm was set, or an error message if the alarm could not be created.
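
One plausible way to create an alarm over ADB is Android's standard SET_ALARM intent. The sketch below validates the documented ranges and builds the command without running it. The helper name set_alarm_command is hypothetical, and the use of this particular intent is an assumption; the page does not say how the plugin talks to the clock app.

```python
import shlex


def set_alarm_command(hour: int, minute: int, label: str = "Alarm") -> list:
    """Build an adb command that asks the clock app to create an alarm."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in the range 0-23")
    if not 0 <= minute <= 59:
        raise ValueError("minute must be in the range 0-59")
    return [
        "adb", "shell", "am", "start",
        "-a", "android.intent.action.SET_ALARM",
        "--ei", "android.intent.extra.alarm.HOUR", str(hour),
        "--ei", "android.intent.extra.alarm.MINUTES", str(minute),
        # The device shell re-splits arguments, so quote labels with spaces.
        "--es", "android.intent.extra.alarm.MESSAGE", shlex.quote(label),
    ]


print(" ".join(set_alarm_command(7, 30, "Wake up")))
```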

receive_incoming_call

Handle an incoming phone call.

Checks whether there is an incoming call. If one is ringing, the function can then answer or reject it based on the action parameter.

Returns: str: Information about any incoming call including the caller number, or a message indicating no incoming calls.

get_contacts

Retrieve contacts from the phone.

Core function for accessing the contacts database on the device. Fetches contact information including names and phone numbers. Returns data in structured JSON format.

Args: limit (int): Number of contacts to retrieve, defaults to 20

Returns: str: JSON string with contact data or error message

create_contact

Create a new contact on the phone.

Opens the contact creation UI with pre-filled name and phone number, allowing the user to review and save the contact.

Args:
    name (str): The contact's full name
    phone_number (str): The contact's phone number (for testing, 10086 is recommended)
    email (str, optional): The contact's email address

Returns: str: Success message if the contact UI was launched, or an error message if the operation failed.

Note: When testing this feature, it's recommended to use 10086 as the test phone number. This is China Mobile's customer service number, which is suitable for testing environments and easy to recognize.

get_current_window

Get information about the current active window on the device.

Retrieves details about the currently focused window, active application, and foreground activities on the device using multiple methods for reliability.

Returns: str: JSON string with current window details or error message

get_app_shortcuts

Get application shortcuts for installed apps.

Retrieves shortcuts (quick actions) available for Android apps. If package_name is provided, returns shortcuts only for that app, otherwise lists all apps with shortcuts.

Args: package_name (str, optional): Specific app package to get shortcuts for

Returns: str: JSON string with app shortcuts information or error message

launch_app_activity

Launch an app using package name and optionally an activity name

This function uses adb to start an application on the device either by package name or by specifying both package and activity. It provides reliable app launching across different Android devices and versions.

Args:
    package_name (str): The package name of the app to launch (e.g., "com.android.contacts")
    activity_name (str): The specific activity to launch. If not provided, launches the app's main activity. Defaults to None.

Returns: str: JSON string with operation result:
    For successful operations:
        {
            "status": "success",
            "message": "Successfully launched <package_name>"
        }

    For failed operations:
        {
            "status": "error",
            "message": "Failed to launch app: <error details>"
        }

Examples:

# Launch an app using just the package name
result = await launch_app_activity("com.android.contacts")

# Launch a specific activity within an app
result = await launch_app_activity("com.android.dialer", "com.android.dialer.DialtactsActivity")

# Launch Android settings
result = await launch_app_activity("com.android.settings")
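
Since the result is a JSON string with a "status" field, a caller usually wants to turn an "error" status into an exception instead of inspecting the string by hand. The following is a minimal sketch of that pattern; ensure_success is a hypothetical helper, and the sample strings simply follow the documented result shape.

```python
import json


def ensure_success(result: str) -> str:
    """Return the success message, or raise if the tool reported an error."""
    payload = json.loads(result)
    if payload.get("status") != "success":
        raise RuntimeError(payload.get("message", "unknown error"))
    return payload.get("message", "")


# Sample results written by hand to match the documented shape.
ok = '{"status": "success", "message": "Successfully launched com.android.settings"}'
failed = '{"status": "error", "message": "Failed to launch app: device offline"}'

print(ensure_success(ok))
try:
    ensure_success(failed)
except RuntimeError as exc:
    print(f"launch failed: {exc}")
```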

list_installed_apps

List installed applications on the device with pagination support.

Args:
    only_system (bool): If True, only show system apps
    only_third_party (bool): If True, only show third-party apps
    page (int): Page number (starts from 1)
    page_size (int): Number of items per page
    basic (bool): If True, only return basic info (faster loading, default behavior)

Returns: str: JSON string containing:
    {
        "status": "success" or "error",
        "message": Error message if status is error,
        "total_count": Total number of apps,
        "total_pages": Total number of pages,
        "current_page": Current page number,
        "page_size": Number of items per page,
        "apps": [
            {
                "package_name": str,
                "app_name": str,
                "system_app": bool,
                "version_name": str (if not basic),
                "version_code": str (if not basic),
                "install_time": str (if not basic)
            },
            ...
        ]
    }
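
The relationship between total_count, total_pages, page, and page_size is ordinary 1-based pagination. The sketch below reproduces that arithmetic on an in-memory list so the fields can be checked by hand. The function name paginate and the page_size default of 20 are assumptions; the page documents no default.

```python
import math


def paginate(apps: list, page: int = 1, page_size: int = 20) -> dict:
    """Slice a list into one 1-based page and report the paging fields."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be at least 1")
    total_count = len(apps)
    start = (page - 1) * page_size
    return {
        "status": "success",
        "total_count": total_count,
        "total_pages": math.ceil(total_count / page_size),
        "current_page": page,
        "page_size": page_size,
        "apps": apps[start:start + page_size],
    }


# 45 made-up package names: pages 1 and 2 hold 20 items, page 3 holds 5.
packages = [f"com.example.app{i}" for i in range(45)]
last_page = paginate(packages, page=3, page_size=20)
print(last_page["total_pages"], len(last_page["apps"]))
```

A client that wants every app can keep requesting pages until current_page equals total_pages.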

terminate_app

Force stop an application on the device.

Args: package_name (str): Package name of the app to terminate

Returns: str: Success or error message

open_url

Open a URL in the device's default browser.

Args: url (str): URL to open

Returns: str: Success or error message

analyze_screen

Analyze the current screen and provide structured information about UI elements

This function captures the current screen state and returns a detailed analysis of the UI elements, their attributes, and suggests possible interactions.

Args:
    include_screenshot (bool, optional): Whether to include a base64-encoded screenshot in the result. Default is False to reduce response size.
    max_elements (int, optional): Maximum number of UI elements to process. Default is 50 to limit processing time and response size.

Returns: str: JSON string with the analysis result containing:
    {
        "status": "success" or "error",
        "message": "Success/error message",
        "screen_size": {
            "width": Width of the screen in pixels,
            "height": Height of the screen in pixels
        },
        "screen_analysis": {
            "text_elements": {
                "all": [List of all text elements with coordinates],
                "by_region": {
                    "top": [Text elements in the top of the screen],
                    "middle": [Text elements in the middle of the screen],
                    "bottom": [Text elements in the bottom of the screen]
                }
            },
            "notable_clickables": [List of important clickable elements],
            "ui_patterns": {
                "has_bottom_nav": Whether screen has bottom navigation,
                "has_top_bar": Whether screen has top app bar,
                "has_dialog": Whether screen has a dialog showing,
                "has_list_view": Whether screen has a scrollable list
            }
        },
        "suggested_actions": [
            {
                "action": Action type (e.g., "tap_element"),
                "element_text": Text of element to interact with,
                "element_id": ID of element to interact with,
                "coordinates": [x, y] coordinates for interaction,
                "confidence": Confidence score (0-100)
            }
        ]
    }

    If include_screenshot is True, the response will also include:
    {
        "screenshot": base64-encoded PNG image of the screen
    }

Examples:

# Basic screen analysis
result = await analyze_screen()

# Get screen analysis with screenshot included
result_with_screenshot = await analyze_screen(include_screenshot=True)

# Get detailed analysis including more elements
detailed_result = await analyze_screen(max_elements=100)
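
A typical follow-up to analyze_screen is to pick one of the suggested actions and act on its coordinates. The sketch below selects the highest-confidence suggestion from a result shaped like the documented JSON. The helper name best_action, the min_confidence threshold, and the sample payload are all assumptions for illustration.

```python
import json
from typing import Optional


def best_action(result: str, min_confidence: int = 50) -> Optional[dict]:
    """Return the highest-confidence suggested action, or None."""
    payload = json.loads(result)
    if payload.get("status") != "success":
        return None
    candidates = [
        action
        for action in payload.get("suggested_actions", [])
        if action.get("confidence", 0) >= min_confidence
    ]
    # max() with default=None handles an empty candidate list.
    return max(candidates, key=lambda a: a.get("confidence", 0), default=None)


# Hand-written sample that follows the documented result shape.
sample = json.dumps({
    "status": "success",
    "message": "ok",
    "suggested_actions": [
        {"action": "tap_element", "element_text": "Cancel",
         "coordinates": [200, 900], "confidence": 40},
        {"action": "tap_element", "element_text": "OK",
         "coordinates": [520, 900], "confidence": 85},
    ],
})

choice = best_action(sample)
if choice is not None:
    x, y = choice["coordinates"]
    print(f'tap "{choice["element_text"]}" at ({x}, {y})')
```

The chosen coordinates could then be passed to interact_with_screen as the x and y values of a "tap" action.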

interact_with_screen

Execute screen interaction actions

Unified interface for screen interactions including tapping, swiping, key pressing, text input, and element search.

Args: action (str): Action type, one of:
    - "tap": Tap screen at specified coordinates
    - "swipe": Swipe screen from one position to another
    - "key": Press a system key
    - "text": Input text
    - "find": Find UI element(s)
    - "wait": Wait for element to appear
    - "scroll": Scroll to find element

params (Dict[str, Any]): Parameters dictionary with action-specific values:
    For "tap" action:
        - x (int): X coordinate to tap
        - y (int): Y coordinate to tap
    
    For "swipe" action:
        - x1 (int): Start X coordinate
        - y1 (int): Start Y coordinate
        - x2 (int): End X coordinate
        - y2 (int): End Y coordinate
        - duration (int, optional): Swipe duration in ms, defaults to 300
    
    For "key" action:
        - keycode (str/int): Key to press (e.g., "back", "home", "enter", or keycode number)
    
    For "text" action:
        - content (str): Text to input. For Chinese characters, use pinyin instead
                      (e.g. "yu\ tian" for "雨天") with escaped spaces.
                      Direct Chinese character input may fail on some devices.
    
    For "find" action:
        - method (str): Search method, one of: "text", "id", "content_desc", "class", "clickable"
        - value (str): Text/value to search for (not required for method="clickable")
        - partial (bool, optional): Use partial matching, defaults to True (for text/content_desc)
    
    For "wait" action:
        - method (str): Search method, same options as "find"
        - value (str): Text/value to search for
        - timeout (int, optional): Maximum wait time in seconds, defaults to 30
        - interval (float, optional): Check interval in seconds, defaults to 1.0
    
    For "scroll" action:
        - method (str): Search method, same options as "find"
        - value (str): Text/value to search for
        - direction (str, optional): Scroll direction, one of: "up", "down", "left", "right", defaults to "down"
        - max_swipes (int, optional): Maximum swipe attempts, defaults to 5

Returns: str: JSON string with operation result containing:
    For successful operations:
        {
            "status": "success",
            "message": "Operation-specific success message",
            ... (optional action-specific data)
        }

    For failed operations:
        {
            "status": "error",
            "message": "Error description"
        }
    
    Special cases:
        - find: Returns elements list containing matching elements with their properties
        - wait: Returns success when element found or error if timeout
        - scroll: Returns success when element found or error if not found after max attempts

Examples:

# Tap by coordinates
result = await interact_with_screen("tap", {"x": 100, "y": 200})

# Swipe down
result = await interact_with_screen("swipe", 
                                   {"x1": 500, "y1": 300, 
                                    "x2": 500, "y2": 1200, 
                                    "duration": 300})

# Input text
result = await interact_with_screen("text", {"content": "Hello world"})

# Press back key
result = await interact_with_screen("key", {"keycode": "back"})

# Find element by text
result = await interact_with_screen("find", 
                                   {"method": "text", 
                                    "value": "Settings", 
                                    "partial": True})

# Wait for element to appear
result = await interact_with_screen("wait", 
                                   {"method": "text", 
                                    "value": "Success", 
                                    "timeout": 10,
                                    "interval": 0.5})
                                    
# Scroll to find element
result = await interact_with_screen("scroll", 
                                   {"method": "text", 
                                    "value": "Privacy Policy", 
                                    "direction": "down", 
                                    "max_swipes": 8})

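Because each action expects a different set of keys in params, it can help to check a request before sending it. The sketch below encodes the required keys listed in the Args section for interact_with_screen and reports which ones are missing. The table name REQUIRED_PARAMS and the helper missing_params are hypothetical; the plugin may validate its input differently.

```python
# Required keys per action, taken from the documented parameter list.
REQUIRED_PARAMS = {
    "tap": {"x", "y"},
    "swipe": {"x1", "y1", "x2", "y2"},
    "key": {"keycode"},
    "text": {"content"},
    "find": {"method", "value"},
    "wait": {"method", "value"},
    "scroll": {"method", "value"},
}


def missing_params(action: str, params: dict) -> set:
    """Return the required keys that are absent from params."""
    if action not in REQUIRED_PARAMS:
        raise ValueError(f"unknown action: {action}")
    required = set(REQUIRED_PARAMS[action])
    # "value" is documented as not required for find with method="clickable".
    if action == "find" and params.get("method") == "clickable":
        required.discard("value")
    return required - set(params)


print(missing_params("tap", {"x": 100}))
print(missing_params("find", {"method": "clickable"}))
```
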
mcp_monitor_ui_changes

Monitor the UI for changes with MCP compatible parameters.

This is a simplified version of monitor_ui_changes that doesn't use callback functions, making it compatible with MCP's JSON schema requirements.

Args:
    interval_seconds (float): Time between UI checks (seconds)
    max_duration_seconds (float): Maximum monitoring time (seconds)
    watch_for (str): What to watch for, one of: "any_change", "text_appears", "text_disappears", "id_appears", "id_disappears", "class_appears", "content_desc_appears"
    target_text (str): Text to watch for (when watch_for includes "text")
    target_id (str): ID to watch for (when watch_for includes "id")
    target_class (str): Class to watch for (when watch_for includes "class")
    target_content_desc (str): Content description to watch for (when watch_for includes "content_desc")

Returns: str: JSON string with monitoring results
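
The interval_seconds and max_duration_seconds parameters suggest a polling loop. The sketch below shows that pattern for the "any_change" case only, with the UI capture injected as a plain function so it runs without a device. The function name monitor_any_change, the snapshot callable, and the shape of the returned dictionary are assumptions; the actual monitoring result format is not documented on this page.

```python
import time
from typing import Callable


def monitor_any_change(
    snapshot: Callable[[], str],
    interval_seconds: float,
    max_duration_seconds: float,
) -> dict:
    """Poll snapshot() until it differs from the first value or time runs out."""
    deadline = time.monotonic() + max_duration_seconds
    baseline = snapshot()
    checks = 0
    while time.monotonic() < deadline:
        time.sleep(interval_seconds)
        checks += 1
        if snapshot() != baseline:
            return {"status": "success", "changed": True, "checks": checks}
    return {"status": "success", "changed": False, "checks": checks}


# Stand-in for a real UI dump: the third capture differs from the first.
frames = iter(["screen-a", "screen-a", "screen-b"])
result = monitor_any_change(lambda: next(frames), 0.01, 5.0)
print(result)
```

Polling without callbacks keeps the interface expressible as plain JSON parameters, which matches the stated reason for this simplified tool.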

Prompts

Interactive templates invoked by user choice

No prompts

Resources

Contextual data attached and managed by the client

No resources

Related MCP Servers


wcgw

4.5

by rusiaaman

wcgw is an MCP server with tightly integrated shell and code editing tools, designed to empower chat applications like Claude to code, build, and run on your local machine.

developer_tools

DesktopCommanderMCP

4.3

by wonderwhy-er

Desktop Commander MCP is a tool that allows users to search, update, manage files, and run terminal commands using AI, without incurring API token costs.

file_systems

mcp-server-siri-shortcuts

4.2

by dvcrn

The Siri Shortcuts MCP Server provides access to Siri shortcuts functionality via the Model Context Protocol (MCP), allowing listing, opening, and running shortcuts from the macOS Shortcuts app.

os_automation

tfmcp

4.1

by nwiizo

tfmcp is a command-line tool that facilitates interaction with Terraform using the Model Context Protocol (MCP), enabling LLMs to manage Terraform environments.

cloud_platforms

frida-mcp

4.1

by dnakov

A Model Context Protocol (MCP) implementation for Frida dynamic instrumentation toolkit.

os_automation

google-search

4.0

by web-agent-master

A Playwright-based Node.js tool that bypasses search engine anti-scraping mechanisms to execute Google searches and extract results.

browser_automation

FileScopeMCP

3.9

by admica

FileScopeMCP is a TypeScript-based tool designed to analyze codebases, rank files by importance, track dependencies, and provide summaries to enhance code understanding.

developer_tools

todoist-mcp

3.9

by Doist

Connect this Model Context Protocol server to your LLM to interact with Todoist.

developer_tools

mcp-server-kubernetes

3.8

by Flux159

MCP Server that can connect to a Kubernetes cluster and manage it.

os_automation

adb-mcp

3.8

by srmorete

An MCP server for interacting with Android devices through ADB, providing a bridge between AI models and Android device functionality.

os_automation

claude-code-mcp

3.8

by steipete

An MCP (Model Context Protocol) server that allows running Claude Code in one-shot mode with permissions bypassed automatically.

developer_tools

qgis_mcp

3.7

by jjsantos01

QGISMCP connects QGIS to Claude AI through the Model Context Protocol, enabling direct interaction and control over QGIS.

cloud_platforms

ros-mcp-server

3.7

by robotmcp

The ROS MCP Server bridges large language models (LLMs) with robot control, allowing users to issue natural language commands that are translated into ROS/ROS2 instructions.

cloud_platforms

mcp-server-docker

3.7

by ckreiling

An MCP server for managing Docker with natural language.

os_automation

fastapi_mcp

3.7

by tadata-org

FastAPI-MCP is a tool that allows you to expose your FastAPI endpoints as Model Context Protocol (MCP) tools with built-in authentication.

developer_tools

mcp-windbg

3.7

by svnscha

A Model Context Protocol server providing tools to analyze Windows crash dumps using WinDBG/CDB.

developer_tools

android-mcp-server

3.7

by minhalvp

An MCP server providing programmatic control over Android devices via ADB.

os_automation

hass-mcp

3.7

by voska

Hass-MCP is a Model Context Protocol server designed for integrating Home Assistant with AI assistants like Claude, enabling direct interaction with smart home devices.

os_automation

whistle-mcp

3.7

by 7gugu

Whistle MCP Server is a Whistle proxy management tool based on the Model Context Protocol (MCP), allowing AI assistants to directly operate and control local Whistle proxy servers.

os_automation

macos-automator-mcp

3.7

by steipete

macOS Automator MCP Server allows execution of AppleScript and JavaScript for Automation (JXA) scripts on macOS, supporting a knowledge base of pre-defined scripts.

os_automation

freecad-mcp

3.7

by neka-nat

FreeCAD MCP is a Model Context Protocol server that allows users to control FreeCAD from Claude Desktop, enabling automation and integration of design tasks.

os_automation

mcp-server-apple-shortcuts

3.7

by recursechat

A Model Context Protocol (MCP) server that allows AI assistants to control Apple Shortcuts automations on macOS.

os_automation

MCP-Kali-Server

3.7

by Wh0am123

Kali MCP Server is a lightweight API bridge that connects MCP Clients to the API server, allowing execution of commands on a Linux terminal.

security

FluxCD MCP Server

3.7

by controlplaneio-fluxcd

The Flux MCP Server connects AI assistants directly to Kubernetes clusters running Flux Operator, enabling GitOps analysis and troubleshooting through natural language.

cloud_platforms

strava-mcp

3.6

by r-huijts

The Strava MCP Server is a TypeScript-based Model Context Protocol server that interfaces with the Strava API, allowing LLMs to access and manipulate Strava data.

os_automation

moling

3.6

by gojue

MoLing is a dependency-free local office automation assistant that interacts with the system through operating system APIs, enabling file system operations and command execution.

os_automation

MCP-server-client-computer-use-ai-sdk

3.6

by mediar-ai

Computer Use AI SDK is an open-source MCP server that allows users to control their computer using AI.

os_automation