hao-cyber/phone-mcp
Phone MCP Plugin lets users control an Android phone through ADB commands, with functions ranging from making calls to UI automation.
Tools
Functions exposed to the LLM to take actions
call_number
Make a phone call to the specified number.
Initiates a call using Android's dialer app through ADB. The number will be dialed immediately without requiring user confirmation.
Args: phone_number (str): The phone number to call. Country code will be automatically added if not provided.
Returns: str: Success message with the number being called, or an error message if the call could not be initiated.
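As a rough illustration of what dialing through ADB might look like, the sketch below builds an `am start` command around Android's standard CALL intent. The helper names and the `+86` default country code are assumptions made for illustration; the plugin's actual default and internals are not documented here.

```python
from typing import List

# Assumption: "+86" is only an illustrative default, not the plugin's documented value.
DEFAULT_COUNTRY_CODE = "+86"


def normalize_number(phone_number: str, country_code: str = DEFAULT_COUNTRY_CODE) -> str:
    """Strip spaces and dashes, then prefix the country code unless one is present."""
    digits = phone_number.replace(" ", "").replace("-", "")
    return digits if digits.startswith("+") else country_code + digits


def build_call_command(phone_number: str) -> List[str]:
    """Return an adb argv that fires Android's CALL intent for the given number."""
    return [
        "adb", "shell", "am", "start",
        "-a", "android.intent.action.CALL",
        "-d", f"tel:{normalize_number(phone_number)}",
    ]
```

The resulting list could be handed to `subprocess.run` on a machine where `adb` is installed and a device is attached.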
end_call
End the current phone call.
Terminates any active phone call by sending the end call keycode through ADB.
Returns: str: Success message if the call was ended, or an error message if the end call command failed.
check_device_connection
Check if an Android device is connected via ADB.
Verifies that an Android device is properly connected and recognized by ADB, which is required for all other functions to work.
Returns: str: Status message indicating whether a device is connected and ready, or an error message if no device is found.
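A connection check of this kind usually comes down to reading the output of `adb devices`. The following is a minimal sketch of that parsing step, assuming the plain (non `-l`) output format; the function name is hypothetical and not taken from the plugin.

```python
from typing import List


def parse_adb_devices(output: str) -> List[str]:
    """Return the serials of devices that `adb devices` reports in the 'device' state."""
    serials = []
    for line in output.splitlines():
        parts = line.split()
        # Ready devices appear as "<serial>\tdevice"; other states include
        # "unauthorized" and "offline", and header or daemon lines never match.
        if len(parts) == 2 and parts[1] == "device":
            serials.append(parts[0])
    return serials
```

An empty result means no device is ready, which corresponds to the error case described above.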
send_text_message
Send a text message to the specified number.
Uses the phone's messaging app with UI automation to send SMS. Process: Opens messaging app, fills recipient and content, automatically clicks send button, then auto-exits app.
Args:
phone_number (str): Recipient's phone number. Country code will be automatically added if not included. Example: "13812345678" or "+8613812345678"
message (str): SMS content. Supports any text, including emojis. Example: "Hello, this is a test message"
Returns: str: String description of the operation result:
- Success: "Text message sent to {phone_number}"
- Failure: Message containing error reason, like "Failed to open messaging app: {error}" or "Failed to navigate to send button: {error}"
receive_text_messages
Get recent text messages from the phone.
Retrieves recent SMS messages from the device's SMS database using ADB and content provider queries to get structured message data.
Args: limit (int): Maximum number of messages to retrieve (default: 5). Example: 10 will return the 10 most recent messages
Returns: str: JSON string containing messages or an error message:
- Success: Formatted JSON string with list of messages, each with fields:
  - address: Sender's number
  - body: Message content
  - date: Timestamp
  - formatted_date: Human-readable date time (like "2023-07-25 14:30:22")
- Failure: Text message describing the error, like "No recent text messages found..."
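Because the return value is either JSON or a plain error string, a caller has to tell the two apart. The sketch below shows one way to do that, assuming the success payload is a JSON array of message objects with the fields listed above; the helper name is hypothetical.

```python
import json
from typing import Dict, List


def parse_messages(result: str) -> List[Dict[str, str]]:
    """Return the message list, or an empty list when the tool returned plain error text."""
    try:
        data = json.loads(result)
    except json.JSONDecodeError:
        return []  # failure case: the tool returned a descriptive text message
    # Assumption: a successful result is a JSON array of message objects.
    return data if isinstance(data, list) else []
```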
get_sent_messages
Get recently sent text messages from the phone.
Retrieves sent SMS messages from the device's SMS database. This provides a complete list of messages that were successfully sent from this device.
Args: limit (int): Maximum number of sent messages to retrieve (default: 5)
Returns: str: JSON string containing sent messages with:
- from: Sender phone number (device owner)
- to: Recipient phone number
- text: Message content
- date: Timestamp
- formatted_date: Human-readable date time (like "2023-07-25 14:30:22")
start_screen_recording
Start recording the phone's screen.
Records the screen activity for the specified duration and saves the video to the phone's storage. Automatically creates directories if they don't exist.
Args: duration_seconds (int): Recording duration in seconds (default: 30, max: 180 seconds due to ADB limitations)
Returns: str: Success message with the path to the recording, or an error message if the recording could not be started.
play_media
Simulate media button press to play/pause media.
This function sends a keyevent that simulates pressing the media play/pause button, which can control music, videos, or podcasts that are currently playing.
Returns: str: Success message, or an error message if the command failed.
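Key-based tools such as play_media and end_call can be pictured as thin wrappers over `adb shell input keyevent`. The sketch below uses the standard Android key code constants; the dictionary keys and function name are hypothetical.

```python
from typing import List

# Standard Android KeyEvent constants.
KEYCODES = {
    "media_play_pause": 85,  # KEYCODE_MEDIA_PLAY_PAUSE
    "end_call": 6,           # KEYCODE_ENDCALL
}


def build_keyevent_command(key: str) -> List[str]:
    """Return the adb argv that sends the named key event to the device."""
    return ["adb", "shell", "input", "keyevent", str(KEYCODES[key])]
```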
set_alarm
Set an alarm on the phone.
Creates a new alarm with the specified time and label using the default clock application.
Args:
hour (int): Hour in 24-hour format (0-23)
minute (int): Minute (0-59)
label (str): Optional label for the alarm (default: "Alarm")
Returns: str: Success message if the alarm was set, or an error message if the alarm could not be created.
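One plausible way to create an alarm over ADB is Android's standard SET_ALARM intent with its hour, minutes, and message extras. The sketch below validates the documented ranges and builds such a command; whether the plugin uses exactly this intent is an assumption, and the function name is hypothetical.

```python
import shlex
from typing import List


def build_set_alarm_command(hour: int, minute: int, label: str = "Alarm") -> List[str]:
    """Validate the time and return an adb argv that fires the SET_ALARM intent."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0-23")
    if not 0 <= minute <= 59:
        raise ValueError("minute must be in 0-59")
    return [
        "adb", "shell", "am", "start",
        "-a", "android.intent.action.SET_ALARM",
        "--ei", "android.intent.extra.alarm.HOUR", str(hour),
        "--ei", "android.intent.extra.alarm.MINUTES", str(minute),
        # adb joins its arguments into one device-side shell line, so quote the label.
        "--es", "android.intent.extra.alarm.MESSAGE", shlex.quote(label),
    ]
```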
receive_incoming_call
Handle an incoming phone call.
Checks for any incoming calls and provides options to answer or reject the call. This function first checks if there's an incoming call, then can either answer it or reject it based on the action parameter.
Returns: str: Information about any incoming call including the caller number, or a message indicating no incoming calls.
get_contacts
Retrieve contacts from the phone.
Core function for accessing the contacts database on the device. Fetches contact information including names and phone numbers. Returns data in structured JSON format.
Args: limit (int): Number of contacts to retrieve, defaults to 20
Returns: str: JSON string with contact data or error message
create_contact
Create a new contact on the phone.
Opens the contact creation UI with pre-filled name and phone number, allowing the user to review and save the contact.
Args:
name (str): The contact's full name
phone_number (str): The contact's phone number (For testing, 10086 is recommended)
email (str, optional): The contact's email address
Returns: str: Success message if the contact UI was launched, or an error message if the operation failed.
Note: When testing this feature, it's recommended to use 10086 as the test phone number. This is China Mobile's customer service number, which is suitable for testing environments and easy to recognize.
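Opening a pre-filled contact editor can be done with Android's INSERT intent and the contacts MIME type. The sketch below builds such a command under that assumption; the function name is hypothetical, and the plugin may launch the UI differently.

```python
import shlex
from typing import List, Optional


def build_create_contact_command(
    name: str, phone_number: str, email: Optional[str] = None
) -> List[str]:
    """Return an adb argv that opens the contact editor with pre-filled fields."""
    cmd = [
        "adb", "shell", "am", "start",
        "-a", "android.intent.action.INSERT",
        "-t", "vnd.android.cursor.dir/contact",
        # Values are quoted because adb passes one joined line to the device shell.
        "-e", "name", shlex.quote(name),
        "-e", "phone", shlex.quote(phone_number),
    ]
    if email:
        cmd += ["-e", "email", shlex.quote(email)]
    return cmd
```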
get_current_window
Get information about the current active window on the device.
Retrieves details about the currently focused window, active application, and foreground activities on the device using multiple methods for reliability.
Returns: str: JSON string with current window details or error message
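One common way to find the focused window is to look for the `mCurrentFocus` line in `adb shell dumpsys window` output. The sketch below parses that line; it is only one of several possible methods, and the regular expression and function name are assumptions rather than the plugin's implementation.

```python
import re
from typing import Dict, Optional

# Matches lines such as:
#   mCurrentFocus=Window{1a2b3c4 u0 com.android.settings/com.android.settings.Settings}
FOCUS_RE = re.compile(
    r"mCurrentFocus=Window\{\S+ \S+ (?P<package>[\w.]+)/(?P<activity>[\w.$]+)\}"
)


def parse_current_focus(dumpsys_output: str) -> Optional[Dict[str, str]]:
    """Return the focused package and activity, or None if no focus line is found."""
    match = FOCUS_RE.search(dumpsys_output)
    return match.groupdict() if match else None
```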
get_app_shortcuts
Get application shortcuts for installed apps.
Retrieves shortcuts (quick actions) available for Android apps. If package_name is provided, returns shortcuts only for that app, otherwise lists all apps with shortcuts.
Args: package_name (str, optional): Specific app package to get shortcuts for
Returns: str: JSON string with app shortcuts information or error message
launch_app_activity
Launch an app using its package name and, optionally, an activity name.
This function uses adb to start an application on the device either by package name or by specifying both package and activity. It provides reliable app launching across different Android devices and versions.
Args:
package_name (str): The package name of the app to launch (e.g., "com.android.contacts")
activity_name (str): The specific activity to launch. If not provided, launches the app's main activity. Defaults to None.
Returns: str: JSON string with operation result:
For successful operations:
{
    "status": "success",
    "message": "Successfully launched <package_name>"
}
For failed operations:
{
    "status": "error",
    "message": "Failed to launch app: <error details>"
}
Examples:
# Launch an app using just the package name
result = await launch_app_activity("com.android.contacts")
# Launch a specific activity within an app
result = await launch_app_activity("com.android.dialer", "com.android.dialer.DialtactsActivity")
# Launch Android settings
result = await launch_app_activity("com.android.settings")
list_installed_apps
List installed applications on the device with pagination support.
Args:
only_system (bool): If True, only show system apps
only_third_party (bool): If True, only show third-party apps
page (int): Page number (starts from 1)
page_size (int): Number of items per page
basic (bool): If True, only return basic info (faster loading, default behavior)
Returns: str: JSON string containing:
{
    "status": "success" or "error",
    "message": Error message if status is error,
    "total_count": Total number of apps,
    "total_pages": Total number of pages,
    "current_page": Current page number,
    "page_size": Number of items per page,
    "apps": [
        {
            "package_name": str,
            "app_name": str,
            "system_app": bool,
            "version_name": str (if not basic),
            "version_code": str (if not basic),
            "install_time": str (if not basic)
        },
        ...
    ]
}
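The pagination fields make it possible to walk every page until `total_pages` is reached. The sketch below shows such a loop; `fake_list_installed_apps` is a hypothetical stand-in that mimics the documented response shape so the example runs without a device.

```python
import asyncio
import json
from typing import Awaitable, Callable, List


async def collect_all_packages(
    list_page: Callable[..., Awaitable[str]], page_size: int = 20
) -> List[str]:
    """Request pages one by one and gather every package name."""
    packages: List[str] = []
    page = 1
    while True:
        data = json.loads(await list_page(page=page, page_size=page_size))
        if data["status"] != "success":
            raise RuntimeError(data.get("message", "unknown error"))
        packages.extend(app["package_name"] for app in data["apps"])
        if page >= data["total_pages"]:
            return packages
        page += 1


async def fake_list_installed_apps(page: int = 1, page_size: int = 20) -> str:
    """Hypothetical stand-in for the real tool, returning five made-up apps."""
    names = [f"com.example.app{i}" for i in range(5)]
    start = (page - 1) * page_size
    return json.dumps({
        "status": "success",
        "total_count": len(names),
        "total_pages": -(-len(names) // page_size),  # ceiling division
        "current_page": page,
        "page_size": page_size,
        "apps": [
            {"package_name": n, "app_name": n, "system_app": False}
            for n in names[start:start + page_size]
        ],
    })
```

With a real client, the stand-in would be replaced by a call to list_installed_apps with the same `page` and `page_size` arguments.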
terminate_app
Force stop an application on the device.
Args: package_name (str): Package name of the app to terminate
Returns: str: Success or error message
open_url
Open a URL in the device's default browser.
Args: url (str): URL to open
Returns: str: Success or error message
analyze_screen
Analyze the current screen and provide structured information about UI elements.
This function captures the current screen state and returns a detailed analysis of the UI elements, their attributes, and suggests possible interactions.
Args:
include_screenshot (bool, optional): Whether to include base64-encoded screenshot in the result. Default is False to reduce response size.
max_elements (int, optional): Maximum number of UI elements to process. Default is 50 to limit processing time and response size.
Returns: str: JSON string with the analysis result containing:
{
    "status": "success" or "error",
    "message": "Success/error message",
    "screen_size": {
        "width": Width of the screen in pixels,
        "height": Height of the screen in pixels
    },
    "screen_analysis": {
        "text_elements": {
            "all": [List of all text elements with coordinates],
            "by_region": {
                "top": [Text elements in the top of the screen],
                "middle": [Text elements in the middle of the screen],
                "bottom": [Text elements in the bottom of the screen]
            }
        },
        "notable_clickables": [List of important clickable elements],
        "ui_patterns": {
            "has_bottom_nav": Whether screen has bottom navigation,
            "has_top_bar": Whether screen has top app bar,
            "has_dialog": Whether screen has a dialog showing,
            "has_list_view": Whether screen has a scrollable list
        }
    },
    "suggested_actions": [
        {
            "action": Action type (e.g., "tap_element"),
            "element_text": Text of element to interact with,
            "element_id": ID of element to interact with,
            "coordinates": [x, y] coordinates for interaction,
            "confidence": Confidence score (0-100)
        }
    ]
}
If include_screenshot is True, the response will also include:
{
    "screenshot": base64-encoded PNG image of the screen
}
Examples:
# Basic screen analysis
result = await analyze_screen()
# Get screen analysis with screenshot included
result_with_screenshot = await analyze_screen(include_screenshot=True)
# Get detailed analysis including more elements
detailed_result = await analyze_screen(max_elements=100)
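The `suggested_actions` list can feed directly into a follow-up tap. The sketch below picks the most confident suggestion and converts it into the `{"x", "y"}` parameters that a "tap" action expects; the helper name and the confidence threshold of 50 are assumptions for illustration.

```python
import json
from typing import Dict, Optional


def best_tap_params(analysis_json: str, min_confidence: int = 50) -> Optional[Dict[str, int]]:
    """Return {"x": ..., "y": ...} for the most confident suggestion, or None."""
    data = json.loads(analysis_json)
    if data.get("status") != "success":
        return None
    candidates = [
        action for action in data.get("suggested_actions", [])
        if action.get("coordinates") and action.get("confidence", 0) >= min_confidence
    ]
    if not candidates:
        return None
    x, y = max(candidates, key=lambda action: action["confidence"])["coordinates"]
    return {"x": int(x), "y": int(y)}
```

The returned dictionary could then be passed as the `params` argument of `interact_with_screen("tap", ...)`.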
interact_with_screen
Execute screen interaction actions.
Unified interface for screen interactions including tapping, swiping, key pressing, text input, and element search.
Args:
action (str): Action type, one of:
- "tap": Tap screen at specified coordinates
- "swipe": Swipe screen from one position to another
- "key": Press a system key
- "text": Input text
- "find": Find UI element(s)
- "wait": Wait for element to appear
- "scroll": Scroll to find element
params (Dict[str, Any]): Parameters dictionary with action-specific values:
For "tap" action:
- x (int): X coordinate to tap
- y (int): Y coordinate to tap
For "swipe" action:
- x1 (int): Start X coordinate
- y1 (int): Start Y coordinate
- x2 (int): End X coordinate
- y2 (int): End Y coordinate
- duration (int, optional): Swipe duration in ms, defaults to 300
For "key" action:
- keycode (str/int): Key to press (e.g., "back", "home", "enter", or keycode number)
For "text" action:
- content (str): Text to input. For Chinese characters, use pinyin instead (e.g. "yu\ tian" for "雨天") with escaped spaces. Direct Chinese character input may fail on some devices.
For "find" action:
- method (str): Search method, one of: "text", "id", "content_desc", "class", "clickable"
- value (str): Text/value to search for (not required for method="clickable")
- partial (bool, optional): Use partial matching, defaults to True (for text/content_desc)
For "wait" action:
- method (str): Search method, same options as "find"
- value (str): Text/value to search for
- timeout (int, optional): Maximum wait time in seconds, defaults to 30
- interval (float, optional): Check interval in seconds, defaults to 1.0
For "scroll" action:
- method (str): Search method, same options as "find"
- value (str): Text/value to search for
- direction (str, optional): Scroll direction, one of: "up", "down", "left", "right", defaults to "down"
- max_swipes (int, optional): Maximum swipe attempts, defaults to 5
Returns: str: JSON string with operation result containing:
For successful operations:
{
    "status": "success",
    "message": "Operation-specific success message",
    ... (optional action-specific data)
}
For failed operations:
{
    "status": "error",
    "message": "Error description"
}
Special cases:
- find: Returns elements list containing matching elements with their properties
- wait: Returns success when element found or error if timeout
- scroll: Returns success when element found or error if not found after max attempts
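The per-action parameters listed above can be checked on the client side before a request is sent. The sketch below encodes the required keys from that list; the table and function names are hypothetical, and the tool itself may validate differently.

```python
from typing import Any, Dict, List

# Required keys per action, taken from the parameter list above.
REQUIRED_PARAMS = {
    "tap": ("x", "y"),
    "swipe": ("x1", "y1", "x2", "y2"),
    "key": ("keycode",),
    "text": ("content",),
    "find": ("method", "value"),
    "wait": ("method", "value"),
    "scroll": ("method", "value"),
}


def missing_params(action: str, params: Dict[str, Any]) -> List[str]:
    """Return the required keys that are absent from params for the given action."""
    if action not in REQUIRED_PARAMS:
        raise ValueError(f"unknown action: {action}")
    required = list(REQUIRED_PARAMS[action])
    # "value" is documented as not required when finding by method="clickable".
    if action == "find" and params.get("method") == "clickable":
        required.remove("value")
    return [key for key in required if key not in params]
```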
Examples:
# Tap by coordinates
result = await interact_with_screen("tap", {"x": 100, "y": 200})
# Swipe down
result = await interact_with_screen("swipe",
                                    {"x1": 500, "y1": 300,
                                     "x2": 500, "y2": 1200,
                                     "duration": 300})
# Input text
result = await interact_with_screen("text", {"content": "Hello world"})
# Press back key
result = await interact_with_screen("key", {"keycode": "back"})
# Find element by text
result = await interact_with_screen("find",
                                    {"method": "text",
                                     "value": "Settings",
                                     "partial": True})
# Wait for element to appear
result = await interact_with_screen("wait",
                                    {"method": "text",
                                     "value": "Success",
                                     "timeout": 10,
                                     "interval": 0.5})
# Scroll to find element
result = await interact_with_screen("scroll",
                                    {"method": "text",
                                     "value": "Privacy Policy",
                                     "direction": "down",
                                     "max_swipes": 8})
mcp_monitor_ui_changes
Monitor the UI for changes with MCP compatible parameters.
This is a simplified version of monitor_ui_changes that doesn't use callback functions, making it compatible with MCP's JSON schema requirements.
Args:
interval_seconds (float): Time between UI checks (seconds)
max_duration_seconds (float): Maximum monitoring time (seconds)
watch_for (str): What to watch for, one of: "any_change", "text_appears", "text_disappears", "id_appears", "id_disappears", "class_appears", "content_desc_appears"
target_text (str): Text to watch for (when watch_for includes "text")
target_id (str): ID to watch for (when watch_for includes "id")
target_class (str): Class to watch for (when watch_for includes "class")
target_content_desc (str): Content description to watch for (when watch_for includes "content_desc")
Returns: str: JSON string with monitoring results
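Since each `watch_for` mode pairs with one specific target argument, it can help to assemble the arguments in a single place. The sketch below maps each mode to its target field; the function name and the default interval and duration values are assumptions, as the listing does not state defaults.

```python
from typing import Any, Dict

# Which target argument each watch_for mode needs (None means no target).
WATCH_TARGETS = {
    "any_change": None,
    "text_appears": "target_text",
    "text_disappears": "target_text",
    "id_appears": "target_id",
    "id_disappears": "target_id",
    "class_appears": "target_class",
    "content_desc_appears": "target_content_desc",
}


def build_monitor_args(
    watch_for: str,
    target: str = "",
    interval_seconds: float = 1.0,       # assumed default
    max_duration_seconds: float = 30.0,  # assumed default
) -> Dict[str, Any]:
    """Return keyword arguments for mcp_monitor_ui_changes with the right target key."""
    if watch_for not in WATCH_TARGETS:
        raise ValueError(f"unsupported watch_for: {watch_for}")
    args: Dict[str, Any] = {
        "interval_seconds": interval_seconds,
        "max_duration_seconds": max_duration_seconds,
        "watch_for": watch_for,
    }
    target_key = WATCH_TARGETS[watch_for]
    if target_key is not None:
        if not target:
            raise ValueError(f"{watch_for} requires {target_key}")
        args[target_key] = target
    return args
```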
Prompts
Interactive templates invoked by user choice
No prompts
Resources
Contextual data attached and managed by the client