phone-mcp

hao-cyber/phone-mcp

3.6

Phone MCP Plugin is a powerful tool that allows users to control their Android phones using ADB commands, enabling a wide range of functionalities from making calls to UI automation.

Tools

Functions exposed to the LLM to take actions

call_number

Make a phone call to the specified number.

Initiates a call using Android's dialer app through ADB. The number will be dialed immediately without requiring user confirmation.

Args: phone_number (str): The phone number to call. Country code will be automatically added if not provided.

Returns: str: Success message with the number being called, or an error message if the call could not be initiated.
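As a rough illustration of what dialing "through ADB" can look like, the sketch below builds an adb shell command around Android's standard android.intent.action.CALL intent. The helper name build_call_command is hypothetical and not part of phone-mcp, and the plugin's actual implementation (including how it adds a country code) may differ.

```python
from typing import List


def build_call_command(phone_number: str) -> List[str]:
    """Build an adb argv that asks Android to dial a number immediately.

    Hypothetical helper for illustration; not part of phone-mcp.
    """
    # Keep digits and "+" characters only, dropping spaces and dashes.
    cleaned = "".join(ch for ch in phone_number if ch.isdigit() or ch == "+")
    if not any(ch.isdigit() for ch in cleaned):
        raise ValueError("phone_number contains no digits")
    # ACTION_CALL places the call directly instead of only opening the dialer.
    return [
        "adb", "shell", "am", "start",
        "-a", "android.intent.action.CALL",
        "-d", f"tel:{cleaned}",
    ]


print(" ".join(build_call_command("+86 138-1234-5678")))
```

The resulting list can be passed to subprocess.run on a machine where adb is installed and a device is attached.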

end_call

End the current phone call.

Terminates any active phone call by sending the end call keycode through ADB.

Returns: str: Success message if the call was ended, or an error message if the end call command failed.

check_device_connection

Check if an Android device is connected via ADB.

Verifies that an Android device is properly connected and recognized by ADB, which is required for all other functions to work.

Returns: str: Status message indicating whether a device is connected and ready, or an error message if no device is found.
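One way such a check could work is by parsing the output of adb devices, where each attached device appears as a serial number followed by a state, and only the state "device" means it is ready. The sketch below is an assumption about the approach, not phone-mcp's actual code; parse_adb_devices and ready_devices are hypothetical helper names, and the sample output is made up.

```python
from typing import List, Tuple


def parse_adb_devices(output: str) -> List[Tuple[str, str]]:
    """Turn `adb devices` output into (serial, state) pairs."""
    devices = []
    for line in output.splitlines():
        line = line.strip()
        # Skip blanks, the header line, and adb daemon notices ("* daemon ...").
        if not line or line.startswith("List of devices") or line.startswith("*"):
            continue
        parts = line.split()
        if len(parts) >= 2:
            devices.append((parts[0], parts[1]))
    return devices


def ready_devices(output: str) -> List[str]:
    """Return serials whose state is "device" (connected and authorized)."""
    return [serial for serial, state in parse_adb_devices(output) if state == "device"]


# Made-up sample output for illustration.
sample = (
    "List of devices attached\n"
    "emulator-5554\tdevice\n"
    "0123456789ABCDEF\tunauthorized\n"
)
print(ready_devices(sample))
```

A state of "unauthorized" usually means the USB debugging prompt on the phone has not been accepted yet.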

send_text_message

Send a text message to the specified number.

Uses the phone's messaging app with UI automation to send SMS. Process: Opens messaging app, fills recipient and content, automatically clicks send button, then auto-exits app.

Args:
    phone_number (str): Recipient's phone number. Country code will be automatically added if not included. Example: "13812345678" or "+8613812345678"
    message (str): SMS content. Supports any text, including emojis. Example: "Hello, this is a test message"

Returns: str: String description of the operation result:
    - Success: "Text message sent to {phone_number}"
    - Failure: Message containing error reason, like "Failed to open messaging app: {error}" or "Failed to navigate to send button: {error}"

receive_text_messages

Get recent text messages from the phone.

Retrieves recent SMS messages from the device's SMS database using ADB and content provider queries to get structured message data.

Args: limit (int): Maximum number of messages to retrieve (default: 5). Example: 10 will return the 10 most recent messages.

Returns: str: JSON string containing messages or an error message:
    - Success: Formatted JSON string with list of messages, each with fields:
        - address: Sender's number
        - body: Message content
        - date: Timestamp
        - formatted_date: Human-readable date time (like "2023-07-25 14:30:22")
    - Failure: Text message describing the error, like "No recent text messages found..."
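Because the return value is either JSON or a plain error string, callers need to tell the two apart. The sketch below shows one possible way to do that. It assumes the success payload is a top-level JSON list of message objects, which the description suggests but does not guarantee; parse_messages_result is a hypothetical helper and the sample data is made up.

```python
import json
from typing import Any, Dict, List


def parse_messages_result(result: str) -> List[Dict[str, Any]]:
    """Return the message list, or raise RuntimeError carrying the error text."""
    try:
        data = json.loads(result)
    except json.JSONDecodeError as exc:
        # Not JSON, so treat the whole string as the tool's error message.
        raise RuntimeError(result) from exc
    if not isinstance(data, list):
        # Assumption: a successful result is a JSON list of messages.
        raise RuntimeError(f"unexpected payload: {result}")
    return data


# Made-up sample shaped like the fields documented above.
sample = json.dumps([
    {
        "address": "10086",
        "body": "Hello",
        "date": 1690295422000,
        "formatted_date": "2023-07-25 14:30:22",
    }
])
for msg in parse_messages_result(sample):
    print(msg["formatted_date"], msg["address"], msg["body"])
```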

get_sent_messages

Get recently sent text messages from the phone.

Retrieves sent SMS messages from the device's SMS database. Returns the most recent messages that were successfully sent from this device, up to limit.

Args: limit (int): Maximum number of sent messages to retrieve (default: 5)

Returns: str: JSON string containing sent messages with:
    - from: Sender phone number (device owner)
    - to: Recipient phone number
    - text: Message content
    - date: Timestamp
    - formatted_date: Human-readable date time (like "2023-07-25 14:30:22")

start_screen_recording

Start recording the phone's screen.

Records the screen activity for the specified duration and saves the video to the phone's storage. Automatically creates directories if they don't exist.

Args: duration_seconds (int): Recording duration in seconds (default: 30, max: 180 seconds due to ADB limitations)

Returns: str: Success message with the path to the recording, or an error message if the recording could not be started.
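The 180-second ceiling matches the time limit of Android's screenrecord shell utility. A minimal sketch of how a caller might clamp the duration and build the command follows; build_screenrecord_command and the default remote path are assumptions made for illustration, not phone-mcp's actual names or storage location.

```python
from typing import List

MAX_RECORD_SECONDS = 180  # upper bound accepted by screenrecord's --time-limit


def build_screenrecord_command(
    duration_seconds: int = 30,
    remote_path: str = "/sdcard/Movies/recording.mp4",  # hypothetical path
) -> List[str]:
    """Build an adb argv that records the screen for a bounded duration."""
    if duration_seconds < 1:
        raise ValueError("duration_seconds must be at least 1")
    # Clamp instead of failing so long requests still produce a recording.
    limit = min(duration_seconds, MAX_RECORD_SECONDS)
    return [
        "adb", "shell", "screenrecord",
        "--time-limit", str(limit),
        remote_path,
    ]


print(build_screenrecord_command(600))
```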

play_media

Simulate media button press to play/pause media.

This function sends a keyevent that simulates pressing the media play/pause button, which can control music, videos, or podcasts that are currently playing.

Returns: str: Success message, or an error message if the command failed.
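Key-based tools such as play_media and end_call plausibly come down to adb shell input keyevent with one of Android's KeyEvent constants. The sketch below illustrates that idea; the KEYCODES mapping and build_keyevent_command are hypothetical names, and whether phone-mcp uses exactly these keycodes is an assumption.

```python
from typing import Dict, List

# Android KeyEvent constants (numeric values from the Android SDK).
KEYCODES: Dict[str, int] = {
    "end_call": 6,      # KEYCODE_ENDCALL
    "play_pause": 85,   # KEYCODE_MEDIA_PLAY_PAUSE
}


def build_keyevent_command(name: str) -> List[str]:
    """Build an adb argv that injects a single key press by logical name."""
    if name not in KEYCODES:
        raise ValueError(f"unknown key: {name!r}")
    return ["adb", "shell", "input", "keyevent", str(KEYCODES[name])]


print(build_keyevent_command("play_pause"))
```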

set_alarm

Set an alarm on the phone.

Creates a new alarm with the specified time and label using the default clock application.

Args:
    hour (int): Hour in 24-hour format (0-23)
    minute (int): Minute (0-59)
    label (str): Optional label for the alarm (default: "Alarm")

Returns: str: Success message if the alarm was set, or an error message if the alarm could not be created.
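Android exposes alarm creation through the standard android.intent.action.SET_ALARM intent and its AlarmClock extras, so a tool like this can be approximated as shown below. This is a sketch under that assumption: build_set_alarm_command is a hypothetical helper, and phone-mcp may construct the intent differently.

```python
import shlex
from typing import List


def build_set_alarm_command(hour: int, minute: int, label: str = "Alarm") -> List[str]:
    """Build an adb argv that asks the default clock app to create an alarm."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0-23")
    if not 0 <= minute <= 59:
        raise ValueError("minute must be in 0-59")
    return [
        "adb", "shell", "am", "start",
        "-a", "android.intent.action.SET_ALARM",
        "--ei", "android.intent.extra.alarm.HOUR", str(hour),
        "--ei", "android.intent.extra.alarm.MINUTES", str(minute),
        # adb joins the arguments into one string for the device shell,
        # so a label containing spaces must be quoted.
        "--es", "android.intent.extra.alarm.MESSAGE", shlex.quote(label),
    ]


print(" ".join(build_set_alarm_command(7, 30, "Wake up")))
```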

receive_incoming_call

Handle an incoming phone call.

Checks for any incoming calls and provides options to answer or reject the call. This function first checks if there's an incoming call, then can either answer it or reject it based on the action parameter.

Returns: str: Information about any incoming call including the caller number, or a message indicating no incoming calls.

get_contacts

Retrieve contacts from the phone.

Core function for accessing the contacts database on the device. Fetches contact information including names and phone numbers. Returns data in structured JSON format.

Args: limit (int): Number of contacts to retrieve, defaults to 20

Returns: str: JSON string with contact data or error message

create_contact

Create a new contact on the phone.

Opens the contact creation UI with pre-filled name and phone number, allowing the user to review and save the contact.

Args:
    name (str): The contact's full name
    phone_number (str): The contact's phone number (For testing, 10086 is recommended)
    email (str, optional): The contact's email address

Returns: str: Success message if the contact UI was launched, or an error message if the operation failed.

Note: When testing this feature, it's recommended to use 10086 as the test phone number. This is China Mobile's customer service number, which is suitable for testing environments and easy to recognize.

get_current_window

Get information about the current active window on the device.

Retrieves details about the currently focused window, active application, and foreground activities on the device using multiple methods for reliability.

Returns: str: JSON string with current window details or error message

get_app_shortcuts

Get application shortcuts for installed apps.

Retrieves shortcuts (quick actions) available for Android apps. If package_name is provided, returns shortcuts only for that app, otherwise lists all apps with shortcuts.

Args: package_name (str, optional): Specific app package to get shortcuts for

Returns: str: JSON string with app shortcuts information or error message

launch_app_activity

Launch an app using its package name and, optionally, an activity name.

This function uses adb to start an application on the device either by package name or by specifying both package and activity. It provides reliable app launching across different Android devices and versions.

Args:
    package_name (str): The package name of the app to launch (e.g., "com.android.contacts")
    activity_name (str): The specific activity to launch. If not provided, launches the app's main activity. Defaults to None.

Returns: str: JSON string with operation result:

    For successful operations:
        {
            "status": "success",
            "message": "Successfully launched <package_name>"
        }

    For failed operations:
        {
            "status": "error",
            "message": "Failed to launch app: <error details>"
        }

Examples:

# Launch an app using just the package name
result = await launch_app_activity("com.android.contacts")

# Launch a specific activity within an app
result = await launch_app_activity("com.android.dialer", "com.android.dialer.DialtactsActivity")

# Launch Android settings
result = await launch_app_activity("com.android.settings")
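For readers curious how the two launch modes could map onto adb, the sketch below uses am start -n for an explicit activity and the monkey tool with the LAUNCHER category when only a package is given. Both are common adb techniques, but treating them as phone-mcp's method is an assumption; build_launch_command is a hypothetical helper.

```python
from typing import List, Optional


def build_launch_command(package_name: str, activity_name: Optional[str] = None) -> List[str]:
    """Build an adb argv that starts an app, optionally at a given activity."""
    if not package_name:
        raise ValueError("package_name is required")
    if activity_name:
        # Explicit component: package/activity.
        return ["adb", "shell", "am", "start", "-n", f"{package_name}/{activity_name}"]
    # No activity given: send one launcher event so Android resolves the main activity.
    return [
        "adb", "shell", "monkey",
        "-p", package_name,
        "-c", "android.intent.category.LAUNCHER",
        "1",
    ]


print(build_launch_command("com.android.settings"))
print(build_launch_command("com.android.dialer", "com.android.dialer.DialtactsActivity"))
```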

list_installed_apps

List installed applications on the device with pagination support.

Args:
    only_system (bool): If True, only show system apps
    only_third_party (bool): If True, only show third-party apps
    page (int): Page number (starts from 1)
    page_size (int): Number of items per page
    basic (bool): If True, only return basic info (faster loading, default behavior)

Returns: str: JSON string containing:
{
    "status": "success" or "error",
    "message": Error message if status is error,
    "total_count": Total number of apps,
    "total_pages": Total number of pages,
    "current_page": Current page number,
    "page_size": Number of items per page,
    "apps": [
        {
            "package_name": str,
            "app_name": str,
            "system_app": bool,
            "version_name": str (if not basic),
            "version_code": str (if not basic),
            "install_time": str (if not basic)
        },
        ...
    ]
}
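The pagination fields relate to each other in the usual way: total_pages is total_count divided by page_size, rounded up, and page numbering starts at 1. The sketch below reproduces that arithmetic on a plain list. The paginate helper is hypothetical and only mirrors the documented field names; it is not phone-mcp's code.

```python
import math
from typing import Any, Dict, List


def paginate(items: List[Any], page: int = 1, page_size: int = 10) -> Dict[str, Any]:
    """Slice a list into one page and report the pagination metadata."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    total_count = len(items)
    total_pages = math.ceil(total_count / page_size)
    start = (page - 1) * page_size  # pages are 1-based
    return {
        "total_count": total_count,
        "total_pages": total_pages,
        "current_page": page,
        "page_size": page_size,
        "apps": items[start:start + page_size],
    }


# Made-up package names for illustration.
packages = [f"com.example.app{i}" for i in range(25)]
last = paginate(packages, page=3, page_size=10)
print(last["total_pages"], len(last["apps"]))
```

Requesting a page past the end yields an empty apps list rather than an error in this sketch; the real tool may behave differently.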

terminate_app

Force stop an application on the device.

Args: package_name (str): Package name of the app to terminate

Returns: str: Success or error message

open_url

Open a URL in the device's default browser.

Args: url (str): URL to open

Returns: str: Success or error message
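Opening a URL over adb is typically done with the android.intent.action.VIEW intent. The sketch below shows one cautious way to build that command: it only accepts http and https URLs and shell-quotes the URL, since characters such as & would otherwise be interpreted by the device shell. build_open_url_command is a hypothetical helper, and the scheme restriction is this sketch's choice, not documented phone-mcp behavior.

```python
import shlex
from typing import List
from urllib.parse import urlparse


def build_open_url_command(url: str) -> List[str]:
    """Build an adb argv that opens a web URL in the default browser."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not a web URL: {url!r}")
    return [
        "adb", "shell", "am", "start",
        "-a", "android.intent.action.VIEW",
        # Quote for the device shell; "&" and "?" are special there.
        "-d", shlex.quote(url),
    ]


print(" ".join(build_open_url_command("https://example.com/?a=1&b=2")))
```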

analyze_screen

Analyze the current screen and provide structured information about UI elements.

This function captures the current screen state and returns a detailed analysis of the UI elements, their attributes, and suggests possible interactions.

Args:
    include_screenshot (bool, optional): Whether to include base64-encoded screenshot in the result. Default is False to reduce response size.
    max_elements (int, optional): Maximum number of UI elements to process. Default is 50 to limit processing time and response size.

Returns: str: JSON string with the analysis result containing:
{
    "status": "success" or "error",
    "message": "Success/error message",
    "screen_size": {
        "width": Width of the screen in pixels,
        "height": Height of the screen in pixels
    },
    "screen_analysis": {
        "text_elements": {
            "all": [List of all text elements with coordinates],
            "by_region": {
                "top": [Text elements in the top of the screen],
                "middle": [Text elements in the middle of the screen],
                "bottom": [Text elements in the bottom of the screen]
            }
        },
        "notable_clickables": [List of important clickable elements],
        "ui_patterns": {
            "has_bottom_nav": Whether screen has bottom navigation,
            "has_top_bar": Whether screen has top app bar,
            "has_dialog": Whether screen has a dialog showing,
            "has_list_view": Whether screen has a scrollable list
        }
    },
    "suggested_actions": [
        {
            "action": Action type (e.g., "tap_element"),
            "element_text": Text of element to interact with,
            "element_id": ID of element to interact with,
            "coordinates": [x, y] coordinates for interaction,
            "confidence": Confidence score (0-100)
        }
    ]
}

If include_screenshot is True, the response will also include:
{
    "screenshot": base64-encoded PNG image of the screen
}

Examples:

# Basic screen analysis
result = await analyze_screen()

# Get screen analysis with screenshot included
result_with_screenshot = await analyze_screen(include_screenshot=True)

# Get detailed analysis including more elements
detailed_result = await analyze_screen(max_elements=100)
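A common follow-up is to act on the most confident entry in suggested_actions. The sketch below parses a result string shaped like the structure documented above and picks that entry. best_suggested_action and its min_confidence threshold are hypothetical, and the sample payload is made up rather than captured from a device.

```python
import json
from typing import Any, Dict, Optional


def best_suggested_action(result: str, min_confidence: int = 50) -> Optional[Dict[str, Any]]:
    """Return the highest-confidence suggested action, or None if none qualifies."""
    data = json.loads(result)
    if data.get("status") != "success":
        raise RuntimeError(data.get("message", "screen analysis failed"))
    candidates = [
        action for action in data.get("suggested_actions", [])
        if action.get("confidence", 0) >= min_confidence
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda action: action.get("confidence", 0))


# Made-up payload following the documented field names.
sample_result = json.dumps({
    "status": "success",
    "message": "ok",
    "suggested_actions": [
        {"action": "tap_element", "element_text": "Cancel", "coordinates": [200, 1800], "confidence": 40},
        {"action": "tap_element", "element_text": "Settings", "coordinates": [540, 960], "confidence": 90},
    ],
})
best = best_suggested_action(sample_result)
print(best["element_text"], best["coordinates"])
```

The chosen coordinates could then be passed to interact_with_screen with the "tap" action.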

interact_with_screen

Execute screen interaction actions.

Unified interface for screen interactions including tapping, swiping, key pressing, text input, and element search.

Args:

action (str): Action type, one of:
    - "tap": Tap screen at specified coordinates
    - "swipe": Swipe screen from one position to another
    - "key": Press a system key
    - "text": Input text
    - "find": Find UI element(s)
    - "wait": Wait for element to appear
    - "scroll": Scroll to find element

params (Dict[str, Any]): Parameters dictionary with action-specific values:
    For "tap" action:
        - x (int): X coordinate to tap
        - y (int): Y coordinate to tap
    
    For "swipe" action:
        - x1 (int): Start X coordinate
        - y1 (int): Start Y coordinate
        - x2 (int): End X coordinate
        - y2 (int): End Y coordinate
        - duration (int, optional): Swipe duration in ms, defaults to 300
    
    For "key" action:
        - keycode (str/int): Key to press (e.g., "back", "home", "enter", or keycode number)
    
    For "text" action:
        - content (str): Text to input. For Chinese characters, use pinyin instead
                      (e.g. "yu\ tian" for "雨天") with escaped spaces.
                      Direct Chinese character input may fail on some devices.
    
    For "find" action:
        - method (str): Search method, one of: "text", "id", "content_desc", "class", "clickable"
        - value (str): Text/value to search for (not required for method="clickable")
        - partial (bool, optional): Use partial matching, defaults to True (for text/content_desc)
    
    For "wait" action:
        - method (str): Search method, same options as "find"
        - value (str): Text/value to search for
        - timeout (int, optional): Maximum wait time in seconds, defaults to 30
        - interval (float, optional): Check interval in seconds, defaults to 1.0
    
    For "scroll" action:
        - method (str): Search method, same options as "find"
        - value (str): Text/value to search for
        - direction (str, optional): Scroll direction, one of: "up", "down", "left", "right", defaults to "down"
        - max_swipes (int, optional): Maximum swipe attempts, defaults to 5

Returns: str: JSON string with operation result containing:

    For successful operations:
        {
            "status": "success",
            "message": "Operation-specific success message",
            ... (optional action-specific data)
        }

    For failed operations:
        {
            "status": "error",
            "message": "Error description"
        }
    
    Special cases:
        - find: Returns elements list containing matching elements with their properties
        - wait: Returns success when element found or error if timeout
        - scroll: Returns success when element found or error if not found after max attempts
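Since each action expects a different set of keys in params, it can help to validate a request before sending it. The sketch below encodes the required (non-optional) keys from the parameter list above. REQUIRED_PARAMS and missing_params are hypothetical names for a client-side check, not something phone-mcp provides.

```python
from typing import Any, Dict, List

# Required keys per action, taken from the documented parameter list.
REQUIRED_PARAMS: Dict[str, List[str]] = {
    "tap": ["x", "y"],
    "swipe": ["x1", "y1", "x2", "y2"],
    "key": ["keycode"],
    "text": ["content"],
    "find": ["method", "value"],
    "wait": ["method", "value"],
    "scroll": ["method", "value"],
}


def missing_params(action: str, params: Dict[str, Any]) -> List[str]:
    """Return the required keys that are absent from params for this action."""
    if action not in REQUIRED_PARAMS:
        raise ValueError(f"unknown action: {action!r}")
    required = list(REQUIRED_PARAMS[action])
    # "find" with method="clickable" does not need a value.
    if action == "find" and params.get("method") == "clickable":
        required.remove("value")
    return [key for key in required if key not in params]


print(missing_params("swipe", {"x1": 500, "y1": 300}))
```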

Examples:

# Tap by coordinates
result = await interact_with_screen("tap", {"x": 100, "y": 200})

# Swipe down
result = await interact_with_screen("swipe", 
                                   {"x1": 500, "y1": 300, 
                                    "x2": 500, "y2": 1200, 
                                    "duration": 300})

# Input text
result = await interact_with_screen("text", {"content": "Hello world"})

# Press back key
result = await interact_with_screen("key", {"keycode": "back"})

# Find element by text
result = await interact_with_screen("find", 
                                   {"method": "text", 
                                    "value": "Settings", 
                                    "partial": True})

# Wait for element to appear
result = await interact_with_screen("wait", 
                                   {"method": "text", 
                                    "value": "Success", 
                                    "timeout": 10,
                                    "interval": 0.5})
                                    
# Scroll to find element
result = await interact_with_screen("scroll", 
                                   {"method": "text", 
                                    "value": "Privacy Policy", 
                                    "direction": "down", 
                                    "max_swipes": 8})

mcp_monitor_ui_changes

Monitor the UI for changes with MCP compatible parameters.

This is a simplified version of monitor_ui_changes that doesn't use callback functions, making it compatible with MCP's JSON schema requirements.

Args:
    interval_seconds (float): Time between UI checks (seconds)
    max_duration_seconds (float): Maximum monitoring time (seconds)
    watch_for (str): What to watch for - "any_change", "text_appears", "text_disappears", "id_appears", "id_disappears", "class_appears", "content_desc_appears"
    target_text (str): Text to watch for (when watch_for includes "text")
    target_id (str): ID to watch for (when watch_for includes "id")
    target_class (str): Class to watch for (when watch_for includes "class")
    target_content_desc (str): Content description to watch for (when watch_for includes "content_desc")

Returns: str: JSON string with monitoring results
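Each watch_for value other than "any_change" pairs with exactly one target_* argument. The sketch below assembles a consistent argument dictionary from a watch_for value and a single target string. build_monitor_args is a hypothetical client-side helper, and its default interval and duration are placeholders, since the listing does not state the tool's defaults.

```python
from typing import Any, Dict, Optional

WATCH_FOR_VALUES = {
    "any_change", "text_appears", "text_disappears", "id_appears",
    "id_disappears", "class_appears", "content_desc_appears",
}

# Which target_* argument each watch_for prefix needs.
TARGET_FOR_PREFIX = {
    "text": "target_text",
    "id": "target_id",
    "class": "target_class",
    "content_desc": "target_content_desc",
}


def build_monitor_args(
    watch_for: str,
    target: Optional[str] = None,
    interval_seconds: float = 1.0,       # placeholder default
    max_duration_seconds: float = 30.0,  # placeholder default
) -> Dict[str, Any]:
    """Assemble arguments for mcp_monitor_ui_changes with the matching target key."""
    if watch_for not in WATCH_FOR_VALUES:
        raise ValueError(f"unsupported watch_for: {watch_for!r}")
    args: Dict[str, Any] = {
        "interval_seconds": interval_seconds,
        "max_duration_seconds": max_duration_seconds,
        "watch_for": watch_for,
    }
    if watch_for == "any_change":
        return args
    if target is None:
        raise ValueError(f"{watch_for} requires a target value")
    # Strip the trailing "_appears"/"_disappears": "content_desc_appears" -> "content_desc".
    prefix = watch_for.rsplit("_", 1)[0]
    args[TARGET_FOR_PREFIX[prefix]] = target
    return args


print(build_monitor_args("text_appears", "Success"))
```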

Prompts

Interactive templates invoked by user choice

No prompts

Resources

Contextual data attached and managed by the client

No resources