haarshitgarg/Nudge-Server
Nudge Navigation Server is a Swift-based Model Context Protocol (MCP) server designed for macOS UI automation and application interaction through accessibility APIs.
Nudge Navigation Server & Library
A Swift-based toolkit for macOS UI automation that provides both:
- MCP Server - Model Context Protocol server for AI agents
- Swift Library - Direct library for embedding in your applications
It lets clients interact with macOS applications through the accessibility APIs: each UI element is assigned an ID, and a per-application UI tree tracks those elements.
Requirements
- Swift 6.1 or later
- macOS 14.0 or later
- Accessibility permissions - The server requires accessibility permissions to interact with applications
Tools Provided
This MCP server exposes the following tools:
1. get_ui_elements
Retrieves UI elements for an application in a tree structure with limited depth (2-3 levels). Automatically opens the application if not running, brings it to focus, and provides a comprehensive overview of the application state.
Parameters:
- bundle_identifier (string, required): Bundle identifier of the application (e.g., "com.apple.Safari" for Safari, "com.apple.TextEdit" for TextEdit)
Returns: JSON tree structure with UI elements containing:
- element_id: Unique identifier for the element
- description: Human-readable description of the element
- children: Array of child elements
Use case: Get an overview of the application state. If you need more details about specific elements, use update_ui_element_tree.
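As a rough illustration of consuming that tree from Swift, the sketch below decodes a payload with the three documented fields. The `ElementNode` type, the array-at-the-root shape, and the sample IDs are assumptions made for the example; the server's actual JSON may be laid out differently.

```swift
import Foundation

// Hypothetical mirror of the documented fields (element_id, description, children).
// This is not the library's own type.
struct ElementNode: Codable {
    let element_id: String
    let description: String
    let children: [ElementNode]
}

// Illustrative payload only; it was not captured from a real server run.
let sampleJSON = """
[
  {
    "element_id": "element_1",
    "description": "Window: Untitled",
    "children": [
      { "element_id": "element_2", "description": "Button: Close", "children": [] }
    ]
  }
]
"""

// Decode the tree and read the ID of the first nested element.
let roots = try JSONDecoder().decode([ElementNode].self, from: Data(sampleJSON.utf8))
print(roots[0].children[0].element_id)
```

If leaf elements omit the `children` key, that property would need to be optional for decoding to succeed.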
2. click_element_by_id
Clicks a UI element by its ID using direct AXUIElement reference for maximum performance and reliability.
Parameters:
- bundle_identifier (string, required): Bundle identifier of the application
- element_id (string, required): Element ID obtained from get_ui_elements
Returns: Confirmation message when the element is successfully clicked.
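For clients that talk to the server over raw MCP rather than through an SDK, a click is a JSON-RPC `tools/call` request carrying the two parameters above. The following is a minimal sketch; `makeToolCall` is a helper invented for this example, and the `element_2` ID is a placeholder for whatever `get_ui_elements` actually returned.

```swift
import Foundation

// Builds a JSON-RPC 2.0 "tools/call" request as one line of JSON.
// makeToolCall is a hypothetical helper, not part of Nudge or any MCP SDK.
func makeToolCall(id: Int, tool: String, arguments: [String: String]) throws -> String {
    let message: [String: Any] = [
        "jsonrpc": "2.0",
        "id": id,
        "method": "tools/call",
        "params": ["name": tool, "arguments": arguments] as [String: Any]
    ]
    // No pretty-printing, so the output contains no newlines.
    let data = try JSONSerialization.data(withJSONObject: message, options: [.sortedKeys])
    return String(decoding: data, as: UTF8.self)
}

// "element_2" is a placeholder ID for illustration.
let clickRequest = try makeToolCall(
    id: 1,
    tool: "click_element_by_id",
    arguments: ["bundle_identifier": "com.apple.TextEdit", "element_id": "element_2"]
)
print(clickRequest)
```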
3. update_ui_element_tree
Updates and returns the UI element tree for a specific element by its ID. Call this function when you need more information about the children of a particular UI element.
Parameters:
- bundle_identifier (string, required): Bundle identifier of the application
- element_id (string, required): Element ID to update and return the tree from (obtained from get_ui_elements)
Returns: JSON tree structure with updated UI elements and their children.
Use case: When you need to explore deeper into the UI hierarchy of a specific element.
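A client that caches the shallow tree may want to splice the refreshed subtree back into its copy. The sketch below shows one way to do that on the client side; `CachedNode` and `replacingSubtree` are names invented here, and nothing in the source says the server or library works this way internally.

```swift
// Hypothetical client-side cache node; not a Nudge type.
struct CachedNode: Equatable {
    var id: String
    var description: String
    var children: [CachedNode]
}

// Returns a copy of `node` in which the element whose ID matches `updated`
// is replaced by `updated`, including its newly discovered children.
func replacingSubtree(in node: CachedNode, with updated: CachedNode) -> CachedNode {
    if node.id == updated.id { return updated }
    var copy = node
    copy.children = node.children.map { replacingSubtree(in: $0, with: updated) }
    return copy
}

// Shallow tree from the first call: the toolbar's children are not yet known.
let cached = CachedNode(id: "element_1", description: "Window", children: [
    CachedNode(id: "element_2", description: "Toolbar", children: [])
])

// Deeper result for element_2, as a later update call might return it.
let refreshed = CachedNode(id: "element_2", description: "Toolbar", children: [
    CachedNode(id: "element_7", description: "Button: Save", children: [])
])

let merged = replacingSubtree(in: cached, with: refreshed)
print(merged.children[0].children.map(\.description))
```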
Setup and Installation
As MCP Server
1. Build the project
swift build
2. Run the server
swift run NudgeServer
3. Run tests
swift test
As Swift Library
1. Add to Package.swift
// Inside your Package(...) declaration:
dependencies: [
    .package(url: "https://github.com/haarshitgarg/Nudge-Server.git", from: "1.0.0")
],
targets: [
    .target(
        name: "YourTarget",
        dependencies: [
            .product(name: "NudgeLibrary", package: "Nudge-Server")
        ]
    )
]
2. Add to Xcode Project
- File → Add Package Dependencies...
- Enter: https://github.com/haarshitgarg/Nudge-Server.git
- Select NudgeLibrary product
- Add to your target
Accessibility Permissions
⚠️ Important: This server requires accessibility permissions to function properly.
- Go to System Settings → Privacy & Security → Accessibility (on older macOS releases: System Preferences → Security & Privacy → Privacy → Accessibility)
- Add your terminal application or the built executable to the list of allowed applications
- Ensure the entry for that application is enabled
Without these permissions, the server throws accessibilityPermissionDenied errors.
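When embedding the library, it can help to check the permission up front instead of waiting for the error. The sketch below uses `AXIsProcessTrusted()` from ApplicationServices, which reports whether the current process is a trusted accessibility client. The `accessibilityStatusMessage` helper is invented for this example, and whether Nudge performs the same check internally is an assumption.

```swift
#if canImport(ApplicationServices)
import ApplicationServices
#endif

// Hypothetical helper that turns the trust flag into a user-facing hint.
func accessibilityStatusMessage(trusted: Bool) -> String {
    trusted
        ? "Accessibility access granted."
        : "Accessibility access missing: allow this process in the Accessibility privacy settings."
}

#if canImport(ApplicationServices)
// AXIsProcessTrusted() returns true once the running process has been allowed.
print(accessibilityStatusMessage(trusted: AXIsProcessTrusted()))
#endif
```

Note that the permission applies to the process that runs the code, so a binary launched from a terminal inherits the terminal's entry.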
Project Structure
Nudge-Server/
├── Package.swift                                  # Swift package manifest
├── Sources/
│   ├── main_server.swift                          # Main server entry point
│   ├── servers/
│   │   ├── NavServer.swift                        # Main MCP server implementation
│   │   └── TestServer.swift                       # Test server (development)
│   ├── managers/
│   │   └── StateManager.swift                     # UI state management and element registry
│   ├── utility/
│   │   ├── utility.swift                          # Utility functions
│   │   └── StateManagerStructs.swift              # Data structures for UI elements
│   └── error/
│       └── NudgeError.swift                       # Custom error types
├── Tests/
│   └── NudgeServerTests/
│       ├── WorkflowIntegrationTests.swift         # Complete workflow tests
│       ├── EnhancedStateManagerTests.swift        # Enhanced state manager tests
│       ├── ComprehensiveErrorHandlingTests.swift  # Comprehensive error handling tests
│       └── ComprehensiveStateManagerTests.swift   # Comprehensive state manager tests
└── Documentation/
    ├── ENHANCED_SERVER_GUIDE.md                   # Enhanced server guide
    ├── GEMINI.md                                  # Gemini integration guide
    └── REFACTORING_SUMMARY.md                     # Refactoring summary
Key Features
Advanced UI Element Management
- Element Registry: Maintains a registry of UI elements with unique IDs for reliable interaction
- Tree-based Discovery: Provides hierarchical UI structure for comprehensive application understanding
- Direct AXUIElement References: Uses direct accessibility API references for maximum performance
- Multi-Application Support: Handles multiple applications simultaneously with proper state management
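To make the registry idea concrete, here is a minimal sketch of an ID-to-element map. It is generic so it runs without accessibility APIs; in the real server the stored value would presumably be an `AXUIElement`. The `ElementRegistry` type and the `element_N` ID scheme are assumptions for illustration, not the project's actual StateManager.

```swift
// Hypothetical registry sketch: hands out string IDs and resolves them later.
struct ElementRegistry<Element> {
    private var elements: [String: Element] = [:]
    private var nextIndex = 1

    // Stores an element and returns the ID under which it can be looked up.
    mutating func register(_ element: Element) -> String {
        let id = "element_\(nextIndex)"
        nextIndex += 1
        elements[id] = element
        return id
    }

    // Resolves an ID back to its element, or nil if the ID is unknown or stale.
    func lookup(_ id: String) -> Element? {
        elements[id]
    }

    // Drops all entries, e.g. when an application's tree is rebuilt.
    mutating func removeAll() {
        elements.removeAll()
        nextIndex = 1
    }
}

var registry = ElementRegistry<String>()
let closeID = registry.register("close button")
let saveID = registry.register("save button")
print(closeID, registry.lookup(saveID) ?? "missing")
```

Supporting several applications at once could then amount to keeping one such registry per bundle identifier.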
Smart Application Handling
- Auto-opening: Automatically opens applications if not running
- Focus Management: Brings applications to focus before interaction
- Window Detection: Focuses on frontmost windows and menu bars
- State Consistency: Maintains consistent UI state across operations
Comprehensive Testing
- Workflow Integration Tests: Tests complete workflows across multiple applications
- Error Handling Tests: Comprehensive error scenarios and recovery testing
- Performance Tests: Ensures operations complete within reasonable time limits
- Multi-Application Tests: Tests interaction with multiple applications simultaneously
Usage Examples
Library Usage (Direct Swift Integration)
Basic Example
import NudgeLibrary

class MyAutomator {
    private let nudge = NudgeLibrary.shared

    func automateApplication() async throws {
        // Get UI elements for Safari
        let elements = try await nudge.getUIElements(for: "com.apple.Safari")
        print("Found \(elements.count) elements")

        // Find and click a button
        if let searchButton = elements.first(where: { $0.description.contains("Search") }) {
            try await nudge.clickElement(
                bundleIdentifier: "com.apple.Safari",
                elementId: searchButton.element_id
            )
            print("Clicked search button")
        }

        // Update tree after interaction
        if let firstElement = elements.first {
            let updatedTree = try await nudge.updateUIElementTree(
                bundleIdentifier: "com.apple.Safari",
                elementId: firstElement.element_id
            )
            print("Updated tree has \(updatedTree.count) elements")
        }
    }
}
SwiftUI Example
import SwiftUI
import NudgeLibrary

struct ContentView: View {
    @State private var elements: [UIElementInfo] = []
    @State private var isLoading = false

    var body: some View {
        VStack {
            Button("Get Safari Elements") {
                Task { await loadElements() }
            }
            .disabled(isLoading)

            List(elements, id: \.element_id) { element in
                Button(element.description) {
                    Task { await clickElement(element) }
                }
            }
        }
    }

    func loadElements() async {
        isLoading = true
        defer { isLoading = false }
        do {
            elements = try await NudgeLibrary.shared.getUIElements(for: "com.apple.Safari")
        } catch {
            print("Error: \(error)")
        }
    }

    func clickElement(_ element: UIElementInfo) async {
        do {
            try await NudgeLibrary.shared.clickElement(
                bundleIdentifier: "com.apple.Safari",
                elementId: element.element_id
            )
        } catch {
            print("Click error: \(error)")
        }
    }
}
Getting Available Tools
import NudgeLibrary

func listTools() async throws {
    // The second tuple element is a pagination cursor; it is not needed here.
    let (tools, _) = try await NudgeLibrary.shared.getNavTools()
    for tool in tools {
        print("Tool: \(tool.name)")
        print("Description: \(tool.description)")
        print("---")
    }
}
MCP Server Usage
This server runs in stdio mode and can be integrated with MCP-compatible clients. The server will:
- Accept MCP protocol messages via stdin
- Process tool calls for the 3 available tools
- Return results via stdout
- Handle errors gracefully with appropriate error messages
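The MCP stdio transport delimits messages with newlines, so a client reading the server's stdout has to reassemble complete lines from arbitrary chunks. The following is a minimal sketch of that buffering; `LineBuffer` is a name invented for this example, and a real client would normally rely on an MCP SDK's transport instead.

```swift
import Foundation

// Hypothetical helper that accumulates stdout chunks and yields complete lines.
struct LineBuffer {
    private var pending = Data()

    // Appends a chunk and returns every complete newline-terminated message in it.
    mutating func append(_ chunk: Data) -> [String] {
        pending.append(chunk)
        var lines: [String] = []
        while let newline = pending.firstIndex(of: 0x0A) {
            let lineData = pending[pending.startIndex..<newline]
            lines.append(String(decoding: lineData, as: UTF8.self))
            // Drop the consumed line together with its newline byte.
            pending.removeSubrange(pending.startIndex...newline)
        }
        return lines
    }
}

// Two chunks that split the second message in the middle.
var stdoutBuffer = LineBuffer()
let firstBatch = stdoutBuffer.append(Data("{\"id\":1}\n{\"id\"".utf8))
let secondBatch = stdoutBuffer.append(Data(":2}\n".utf8))
print(firstBatch, secondBatch)
```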
Basic MCP Workflow
- Use get_ui_elements to discover UI elements and get their IDs
- Use click_element_by_id to interact with specific elements
- Use update_ui_element_tree to explore deeper into specific UI areas
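Between discovering elements and clicking one, a client has to pick an ID out of the tree. Because the tree is nested, a top-level search can miss the target; the sketch below does a depth-first search instead. `UINode` and `firstElementID` are hypothetical names, and the descriptions are made up for the example.

```swift
// Hypothetical tree node carrying the three documented fields.
struct UINode {
    let element_id: String
    let description: String
    let children: [UINode]
}

// Depth-first search: returns the ID of the first node satisfying `matches`.
func firstElementID(in nodes: [UINode], where matches: (UINode) -> Bool) -> String? {
    for node in nodes {
        if matches(node) { return node.element_id }
        if let found = firstElementID(in: node.children, where: matches) {
            return found
        }
    }
    return nil
}

// Illustrative tree; the descriptions are not taken from a real application.
let tree = [
    UINode(element_id: "element_1", description: "Window: Calculator", children: [
        UINode(element_id: "element_2", description: "Group", children: [
            UINode(element_id: "element_3", description: "Button: 7", children: [])
        ])
    ])
]

let sevenID = firstElementID(in: tree, where: { $0.description == "Button: 7" })
print(sevenID ?? "not found")
```

If the target is not in the shallow tree, the next step would be to call update_ui_element_tree on a likely parent and search the returned subtree.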
Advanced UI Automation
The server enables AI agents to:
- Navigate complex application interfaces using element IDs
- Understand application state through hierarchical UI trees
- Perform reliable interactions with persistent element references
- Handle multi-step workflows across different applications
- Recover from errors and maintain state consistency
Supported Applications
Tested with:
- TextEdit: Text editing and document manipulation
- Calculator: Mathematical operations and button interactions
- Safari: Web browsing and navigation
- And many more macOS applications
Library vs MCP Server
| Feature | Library | MCP Server |
|---|---|---|
| Performance | Direct function calls | JSON serialization overhead |
| Integration | Swift Package Manager / Xcode | MCP protocol clients |
| Type Safety | Native Swift types | String-based JSON |
| Use Case | Embed in Swift apps | AI agent integration |
| Dependencies | Minimal (just MCP for tools) | Full MCP + Service Lifecycle |
When to Use Which Approach
Use the Library (NudgeLibrary) when:
- Building native Swift/macOS applications
- Need maximum performance for UI automation
- Want type safety and direct debugging
- Integrating into existing Swift projects
- Building custom automation tools
Use the MCP Server when:
- Working with AI agents (Claude, GPT, etc.)
- Need language-agnostic integration
- Want to expose tools to external systems
- Building distributed automation systems
- Using MCP-compatible clients
Error Handling
The server provides comprehensive error handling for:
- Missing accessibility permissions
- Application not found or not running
- Invalid UI elements or element IDs
- Element registry inconsistencies
- Network and protocol errors
- Invalid arguments and requests
- Multi-application state conflicts
All errors are returned as structured MCP error responses with descriptive messages and proper error recovery mechanisms.
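In MCP, a failure can reach the client in two ways: as a JSON-RPC `error` object, or as a normal tool result whose `isError` flag is set. Which path this server uses for each error above is not stated here, so the sketch below handles both. The `ToolResponse`, `Outcome`, and `classify` names and the sample messages are assumptions for illustration.

```swift
import Foundation

// Minimal view of a tools/call response; only the fields used below are modeled.
struct ToolResponse: Decodable {
    struct Content: Decodable { let type: String; let text: String? }
    struct Result: Decodable { let content: [Content]; let isError: Bool? }
    struct RPCError: Decodable { let code: Int; let message: String }
    let result: Result?
    let error: RPCError?
}

// Hypothetical classification of what a client received.
enum Outcome: Equatable {
    case success(String)
    case toolError(String)
    case protocolError(code: Int, message: String)
    case malformed
}

func classify(_ line: String) -> Outcome {
    guard let response = try? JSONDecoder().decode(ToolResponse.self, from: Data(line.utf8)) else {
        return .malformed
    }
    // A JSON-RPC error means the request itself was rejected.
    if let error = response.error {
        return .protocolError(code: error.code, message: error.message)
    }
    guard let result = response.result else { return .malformed }
    // Otherwise the tool ran; isError tells whether it reported a failure.
    let text = result.content.compactMap(\.text).joined(separator: "\n")
    return result.isError == true ? .toolError(text) : .success(text)
}

// Illustrative responses; the message texts are invented.
let failed = classify(#"{"jsonrpc":"2.0","id":1,"result":{"content":[{"type":"text","text":"Application not found"}],"isError":true}}"#)
let rejected = classify(#"{"jsonrpc":"2.0","id":2,"error":{"code":-32602,"message":"Invalid params"}}"#)
print(failed, rejected)
```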
Testing
The project includes comprehensive tests covering:
- Workflow Integration: Complete end-to-end workflows
- State Management: UI element registry and state consistency
- Error Handling: All error scenarios and recovery paths
- Performance: Timing and efficiency of operations
- Multi-Application: Concurrent application handling
Run all tests with:
swift test
Server Capabilities
When started, the server provides:
- Auto-opening applications
- Tree-based UI structure discovery
- Direct AXUIElement performance
- Element ID-based interactions
- UI tree updates and exploration
- Comprehensive error handling
- Multi-application support
Ready for advanced macOS UI automation tasks!