ProcmonMCP
ProcmonMCP is a Model Context Protocol server that lets LLMs autonomously analyze Procmon XML log files. It exposes a range of query, analysis, and export tools to MCP clients.
Overview
This project provides a Model Context Protocol (MCP) server that parses and analyzes Process Monitor (Procmon) XML log files (.xml, .xml.gz, .xml.bz2, .xml.xz). It allows Large Language Models (LLMs) connected via MCP clients (like Cline) to investigate system activity captured in these logs.
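For illustration, all of the supported compressed variants can be decompressed transparently with Python's standard library; a minimal sketch of the general approach (not the server's exact code):

```python
import bz2
import gzip
import lzma

def open_log(path: str):
    """Open a Procmon XML export, decompressing based on the file extension."""
    if path.endswith(".gz"):
        return gzip.open(path, "rb")
    if path.endswith(".bz2"):
        return bz2.open(path, "rb")
    if path.endswith(".xz"):
        return lzma.open(path, "rb")
    return open(path, "rb")  # plain .xml
```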
At startup the server pre-loads the Procmon XML file specified via the `--input-file` argument and optimizes the data for in-memory analysis using string interning and other techniques. It then exposes various tools enabling the LLM to query events, inspect process details, view metadata, export results, and perform basic analysis on the loaded log data.
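String interning pays off here because Procmon logs repeat the same process names, operations, and results millions of times. A minimal sketch of the idea (illustrative only; the server's actual implementation may differ):

```python
class StringInterner:
    """Store each unique string once; events hold small integer IDs instead."""

    def __init__(self) -> None:
        self._ids: dict[str, int] = {}
        self._strings: list[str] = []

    def intern(self, s: str) -> int:
        """Return a stable ID for s, adding it on first sight."""
        ident = self._ids.get(s)
        if ident is None:
            ident = len(self._strings)
            self._ids[s] = ident
            self._strings.append(s)
        return ident

    def lookup(self, ident: int) -> str:
        return self._strings[ident]
```

Equality filters on interned fields then reduce to integer comparisons, which is why the interned filters noted under Limitations are the fastest.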
This project was inspired by the approach taken in the GhidraMCP project.
⚠ VERY IMPORTANT SECURITY WARNING ⚠
- Process Monitor logs can contain extremely sensitive system information (keystrokes, passwords in command lines, file contents, network traffic details, etc.).
- This script loads any file path provided via the `--input-file` argument that the user running the script has read permissions for. There is NO directory sandboxing.
- Exposing Procmon data via an API (like the MCP server) carries significant security risks. Malicious actors could potentially request sensitive information from the loaded log file.
- Only run this server in highly trusted environments.
- NEVER run this server with Procmon logs captured from systems containing sensitive production or personal data unless you fully understand and accept the risks.
- Carefully review the logs you intend to load for sensitive information BEFORE using this tool.
Features
- Loads a specific Procmon XML file (`.xml`, or compressed `.xml.gz`/`.bz2`/`.xz`) using the `--input-file` path at startup.
- Optimizes loaded data using in-memory string interning for a reduced memory footprint and faster querying on repetitive data.
- Provides progress reporting during the potentially long loading phase.
- Provides MCP tools for LLMs to:
  - Query event summaries with filtering capabilities (process name/contains, operation, result, path contains/regex, detail regex, timestamp, stack module path).
  - Retrieve detailed information for specific events by index.
  - Get stack traces (module path, location, address) for specific events (if loaded).
  - List unique processes found in the log's process list section.
  - Get detailed information for specific processes by PID from the process list.
  - Retrieve basic metadata about the loaded file.
  - Perform basic analysis (count events by process, summarize operations by process, calculate timing statistics, find network connections, find file access).
  - Export filtered event results to CSV or JSON files.
- Uses `lxml` for faster XML parsing if available, with fallback to the standard library `xml.etree.ElementTree` (see the sketch after this list).
- Supports `stdio` and `sse` MCP transport protocols.
- Optional flags to skip loading stack traces (`--no-stack-traces`) or extra unknown event fields (`--no-extra-data`) to save memory.
- Debug logging option (`--debug`).
- Memory usage reporting if `psutil` is installed.
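The `lxml` fallback mentioned above is the usual guarded-import pattern; a minimal sketch of the idea (assumed, not copied from the script):

```python
# Prefer lxml's C-based parser when installed; otherwise use the
# standard-library ElementTree, which is slower but always available.
try:
    from lxml import etree as ET
    USING_LXML = True
except ImportError:
    import xml.etree.ElementTree as ET
    USING_LXML = False
```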
Installation
- Prerequisites:
  - Python 3.x (developed with 3.10+ in mind).
  - `pip` (Python package installer).
- Clone the Repository (Optional):

  ```
  git clone https://github.com/JameZUK/ProcmonMCP
  cd ProcmonMCP
  ```

  (Or just download the Python script.)
- Install Dependencies:

  ```
  # modelcontextprotocol is required
  # lxml is highly recommended for performance
  # psutil is optional for memory reporting
  pip install "mcp[cli]" lxml psutil
  ```

  (If you choose not to install `lxml`, the script will use the slower built-in XML parser. If you don't install `psutil`, memory usage won't be reported after loading.)
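To confirm which optional dependencies are present, a quick check (a hypothetical helper, not part of the repository) can be run before starting the server:

```python
# check_deps.py: report which of ProcmonMCP's dependencies are importable.
for mod in ("mcp", "lxml", "psutil"):
    try:
        __import__(mod)
        print(f"{mod}: OK")
    except ImportError:
        print(f"{mod}: not installed")
```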
Usage
The server requires specifying the path to the Procmon XML file to pre-load for analysis.
Command-Line Arguments:
- `--input-file <path>`: (Required) The full path to the Procmon XML file (`.xml`, `.gz`, `.bz2`, `.xz`) to load and analyze. The script must have read permissions for this file.
- `--transport <stdio|sse>`: (Optional) Transport protocol for MCP. Default: `stdio`.
- `--mcp-host <ip>`: (Optional) Host address for the MCP server (only used for `sse` transport). Default: `127.0.0.1`.
- `--mcp-port <port>`: (Optional) Port for the MCP server (only used for `sse` transport). Default: `8081`.
- `--debug`: (Optional) Enable verbose debug logging.
- `--log-file <path>`: (Optional) Path to a file to write logs to instead of the console.
- `--no-stack-traces`: (Optional) Do not parse or store stack traces (saves memory).
- `--no-extra-data`: (Optional) Do not store unknown fields found within `<event>` tags (saves memory).
Examples:
- Run with STDIO, loading a compressed XML file:

  ```
  python procmon-mcp.py --input-file /path/to/logs/my_capture.xml.gz
  ```

- Run with SSE on port 8082, loading an uncompressed XML file, with debug logging, and skipping stacks:

  ```
  python procmon-mcp.py --input-file C:\procmon_files\trace_log.xml --transport sse --mcp-port 8082 --debug --no-stack-traces
  ```
MCP Clients
In principle, any MCP client should work with ProcmonMCP. An example using Cline is given below.
Example 1: Cline
To use ProcmonMCP with Cline, you must also run the MCP server manually. First, run the following command:

```
python procmon-mcp.py --input-file C:\procmon_files\trace_log.xml --transport sse --mcp-port 8082
```

Specify the path to your Procmon XML file; any arguments you omit fall back to the defaults listed under Usage. Once the MCP server is running, open Cline and select MCP Servers at the top.
Then select Remote Servers and add the following, ensuring that the URL matches the MCP host and port (8082 in the command above):

Server Name: ProcmonMCP
Server URL: http://127.0.0.1:8082/sse
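Clients that connect programmatically rather than through a UI can use the official `mcp` Python SDK. A minimal sketch, assuming the server is running with SSE on the default host and port (adjust the URL to match your `--mcp-host`/`--mcp-port`):

```python
import asyncio

from mcp import ClientSession
from mcp.client.sse import sse_client

async def main() -> None:
    # Connect to the running ProcmonMCP server over SSE.
    async with sse_client("http://127.0.0.1:8081/sse") as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print("Available tools:", [t.name for t in tools.tools])

asyncio.run(main())
```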
Available MCP Tools
Once the server is running with a loaded file and connected to an MCP client, the following tools are available:
- `get_loaded_file_summary()`: Returns a basic summary (filename, type, compression, counts, interner stats, selective loading flags) of the loaded file.
- `query_events(...)`: Queries events with various filters (see the docstring/code for all filters, such as `filter_process`, `filter_path_contains`, `filter_start_time`, `filter_path_regex`, `filter_stack_module_path`, etc.) and returns a list of event summaries including their index. Use the `limit` parameter (default 50).
- `get_event_details(event_index)`: Gets detailed properties for a specific event by its index (returned by `query_events`).
- `get_event_stack_trace(event_index)`: Gets the stack trace (list of frames with address, path, location) for a specific event by index (only works if `--no-stack-traces` was not used).
- `list_processes()`: Lists summaries (PID, Name, ImagePath, ParentPID) of unique processes found in the file's process list section.
- `get_process_details(pid)`: Gets detailed properties for a specific process by PID from the file's process list section.
- `get_metadata()`: Retrieves basic metadata about the loaded file (filename, type, counts).
- `count_events_by_process()`: Counts events per process name across all loaded events.
- `summarize_operations_by_process(process_name_filter)`: Counts operations for a specific process name (case-sensitive match).
- `get_timing_statistics(group_by)`: Calculates event duration statistics, grouped by 'process' (default) or 'operation'.
- `get_process_lifetime(pid)`: Finds the 'Process Create' and 'Process Exit' event timestamps (Unix float) for a given PID by scanning events.
- `find_file_access(path_contains, limit=100)`: Finds file system events where the path contains the given substring (case-insensitive).
- `find_network_connections(process_name)`: Finds unique remote network endpoints (IP:port) accessed by a specific process name (case-sensitive match).
- `export_query_results(...)`: Queries events using the same filters as `query_events` and exports the full details of all matching events to a specified file (CSV or JSON). Useful for offline analysis.
(Refer to the tool docstrings within the script or use the client's `tools/list` command for detailed argument descriptions.)
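As a concrete example of invoking one of these tools from the SDK client sketched earlier (the filter values are hypothetical, and the exact result formatting depends on the server):

```python
import asyncio

from mcp import ClientSession
from mcp.client.sse import sse_client

async def main() -> None:
    async with sse_client("http://127.0.0.1:8081/sse") as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Ask for up to 10 WriteFile events from a (hypothetical) process.
            result = await session.call_tool(
                "query_events",
                {
                    "filter_process": "suspicious.exe",
                    "filter_operation": "WriteFile",
                    "limit": 10,
                },
            )
            for item in result.content:
                # Tool output arrives as content items; text items carry the data.
                print(getattr(item, "text", item))

asyncio.run(main())
```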
Example LLM Prompts for Malware Analysis
(Assuming a relevant Procmon XML file is loaded)
- Initial Triage:
  - "Get the summary of the loaded file."
  - "List the unique processes found in the log."
  - "Count the events per process." (Identify high-activity processes.)
  - "Calculate timing statistics grouped by process." (Identify processes with long-duration events.)
- Investigating a Suspicious Process (e.g., `suspicious.exe` with PID 4568):
  - "Get details for process PID 4568." (Check command line, parent PID, image path.)
  - "Summarize operations for process `suspicious.exe`." (See what it mainly does: file access, registry, network?)
  - "Query events where filter_process is `suspicious.exe` and filter_operation is `RegSetValue`, limit 10." (Check registry writes.)
  - "Query events where filter_process is `suspicious.exe` and filter_operation is `WriteFile`, limit 20." (Check file writes.)
  - "Find network connections for process `suspicious.exe`."
  - "Query events where filter_process_contains is `suspicious` and filter_detail_regex is `some_pattern_in_details`, limit 5." (Use regex on the Detail column.)
  - "Find file access containing `temp\\suspicious_data`, limit 50."
- Looking for Persistence:
  - "Query events where filter_operation is `RegSetValue` and filter_path_contains is `CurrentVersion\\Run`, limit 20."
  - "Query events where filter_operation is `RegCreateKey` and filter_path_contains is `Services`, limit 20."
  - "Query events where filter_operation is `CreateFile` and filter_path_contains is `StartUp`, limit 10." (Check common persistence locations.)
- Troubleshooting Errors / Evasion:
  - "Query events where filter_result is `ACCESS DENIED`, limit 10."
  - "Query events where filter_result is `NAME NOT FOUND`, limit 10."
  - "Query events where filter_result is `PATH NOT FOUND`, limit 10."
  - "Query events where filter_result is `0xc0000022`, limit 5." (Use hex codes for results if needed.)
  - (After finding an interesting error event at index 987:) "Get details for event 987."
  - (If details suggest a code issue and stacks were loaded:) "Get stack trace for event 987."
- Exporting Data:
  - "Export query results to `suspicious_reg_writes.csv` where filter_process is `suspicious.exe` and filter_operation contains `RegSet`."
  - "Export query results to `network_activity.json` in json format where filter_operation contains `TCP` or filter_operation contains `UDP`."
Limitations
- Single File: The tool loads and analyzes only one file specified via `--input-file` at startup. Analyzing a different file requires restarting the server.
- Memory Usage: While optimized with interning, loading extremely large XML files (millions of events, especially with highly unique string data or if stack traces are loaded) can still consume significant RAM. Use `--no-stack-traces` and `--no-extra-data` for very large files.
- Loading Time: Parsing and optimizing large XML files, especially compressed ones, can take considerable time during startup (though faster than previously). Progress is reported to the console.
- Filter Performance: Querying is generally fast for filters using interned IDs (process, operation, result). Filters requiring string comparisons (`_contains`), regular expressions (`_regex`), or stack inspection (`filter_stack_module_path`) are slower, as they require more processing per event. The stack filter is particularly intensive. Indexing helps significantly for process name and operation filters.
- XML Structure: Relies on the standard Procmon XML export structure. Malformed or non-standard XML files will likely cause parsing errors.
- Stack Traces: Stack trace information (module paths, locations) depends entirely on what Procmon resolved and included in the XML export, and requires running Procmon with symbols configured correctly. Stacks are only loaded if `--no-stack-traces` is not used.
Contributing
Contributions are welcome! Please feel free to submit pull requests or open issues.