Hunyo MCP Server
Zero-configuration DataFrame tracking and runtime debugging for multiple notebook environments via MCP
A single-command orchestrator that provides automatic notebook instrumentation, real-time event capture, DuckDB ingestion, and LLM-accessible query tools via the Model Context Protocol (MCP). Supports Marimo notebooks with extensible architecture for Jupyter and other environments.
Quick Start
# Install and run in one command
pipx run hunyo-mcp-server --notebook analysis.py
# Or install globally
pipx install hunyo-mcp-server
hunyo-mcp-server --notebook /path/to/your/notebook.py
That's it! No notebook modifications required. Your LLM assistant can now analyze DataFrame operations, performance metrics, and data lineage in real-time.
What It Does
Hunyo MCP Server automatically:
- Instruments your notebook - Zero-touch capture layer injection
- Captures execution events - DataFrame operations, runtime metrics, errors
- Stores in DuckDB - Real-time ingestion with OpenLineage compliance
- Exposes MCP tools - Rich querying interface for LLM analysis
Example Workflow
# Start the MCP server
hunyo-mcp-server --notebook my_analysis.py
# Your LLM can now ask questions like:
# "What DataFrames were created in the last run?"
# "Show me the performance metrics for the merge operations"
# "Trace the lineage of the final results DataFrame"
# "Which operations took the most memory?"
Architecture
Environment-Agnostic Dual-Package System
Hunyo MCP Server uses a dual-package architecture with environment-agnostic design for optimal deployment flexibility across multiple notebook environments:
Package 1: hunyo-mcp-server (pipx installable)
- Purpose: Global CLI tool for orchestration and data analysis
- Installation: pipx install hunyo-mcp-server
- Dependencies: Full MCP stack (DuckDB, OpenLineage, WebSockets)
- Features: Database management, MCP tools, file watching, graceful fallback
Package 2: hunyo-capture (pip installable)
- Purpose: Lightweight DataFrame instrumentation layer with multi-environment support
- Installation: pip install hunyo-capture (in the notebook environment)
- Dependencies: Minimal (pandas only)
- Features: DataFrame tracking, event generation, environment-agnostic architecture
- Marimo support: Full integration with Marimo runtime hooks
- Jupyter support: Extensible design for future Jupyter integration
- Auto-detection: Automatically detects and adapts to the notebook environment
- Unified API: Same tracking functions work across all supported environments
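To make the "lightweight, pandas-only instrumentation" idea concrete, here is a minimal sketch of how such a capture layer can wrap DataFrame methods and emit events. This is an illustrative assumption, not the actual hunyo-capture internals; the event fields and the `instrument` helper are hypothetical.

```python
# Hypothetical sketch of a pandas-only capture layer; NOT the real
# hunyo-capture implementation (event fields are illustrative).
import functools

import pandas as pd

captured_events = []


def instrument(method_name):
    """Wrap a pd.DataFrame method so each call appends an event dict."""
    original = getattr(pd.DataFrame, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        result = original(self, *args, **kwargs)
        captured_events.append({
            "operation": method_name,
            "input_shape": self.shape,
            "output_shape": getattr(result, "shape", None),
        })
        return result

    setattr(pd.DataFrame, method_name, wrapper)


instrument("merge")

left = pd.DataFrame({"id": [1, 2], "x": [10, 20]})
right = pd.DataFrame({"id": [1, 2], "y": [30, 40]})
merged = left.merge(right, on="id")
print(captured_events[0]["operation"])  # merge
```

The real package layers environment detection and JSONL persistence on top of this kind of interception, but the core idea is the same: transparent wrapping, no notebook code changes.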
Data Flow
Notebook Environment (Marimo/Jupyter/etc.)
    → hunyo-capture (environment-aware, auto-detects environment; pip install)
    → JSONL Events
    → File Watcher
    → DuckDB Database
    → MCP Query Tools (pipx install)
    → LLM Analysis
Environment Isolation Benefits
Production Setup (Recommended):
# Global MCP server installation
pipx install hunyo-mcp-server
# Capture layer in notebook environment
pip install hunyo-capture
Benefits:
- Clean separation: MCP server isolated from notebook dependencies
- Minimal notebook overhead: Only the lightweight capture layer installed
- Graceful fallback: MCP server works without the capture layer
- Easy management: Global server, environment-specific capture
- Environment flexibility: Same capture layer works across Marimo, Jupyter, and future environments
- Auto-detection: Automatically adapts to the detected notebook environment without configuration
Graceful Fallback System
When the capture layer is not available, the MCP server provides helpful guidance:
hunyo-mcp-server --notebook analysis.py
# Output:
# [INFO] To enable notebook tracking, add this to your notebook:
# [INFO] # Install capture layer: pip install hunyo-capture
# [INFO] from hunyo_capture import enable_unified_tracking
# [INFO] enable_unified_tracking() # Auto-detects environment
Environment-Agnostic Architecture
The capture layer automatically detects and adapts to different notebook environments:
# Same API works across all supported environments
from hunyo_capture import enable_unified_tracking
# Auto-detects environment (Marimo, Jupyter, etc.)
enable_unified_tracking()
# Or specify environment explicitly
enable_unified_tracking(environment='marimo')
Architecture Components:
- Environment Detection: Auto-identifies notebook type (Marimo, Jupyter, etc.)
- Hook Abstractions: Environment-specific hook management (MarimoHooks, JupyterHooks)
- Context Adapters: Normalize cell execution context across environments
- Component Factory: Creates appropriate components for detected environment
- Unified API: Same tracking functions work across all environments
Storage Structure
# Production: ~/.hunyo/
# Development: {repo}/.hunyo/
├── events/
│   ├── runtime/            # Cell execution metrics, timing, memory
│   ├── lineage/            # DataFrame operations, OpenLineage events
│   └── dataframe_lineage/  # Column-level lineage and transformations
├── database/
│   └── lineage.duckdb      # Queryable database with rich schema
└── config/
    └── settings.yaml       # Configuration and preferences
MCP Tools Available to LLMs
Your LLM assistant gets access to these powerful analysis tools:
Query Tool - Direct SQL Analysis
-- Your LLM can run queries like:
SELECT operation, AVG(duration_ms) as avg_time
FROM runtime_events
WHERE success = true
GROUP BY operation;
Schema Tool - Database Inspection
- Table structures and column definitions
- Data type information and constraints
- Example queries and usage patterns
- Statistics and metadata
Lineage Tool - DataFrame Tracking
- Complete DataFrame transformation chains
- Performance metrics per operation
- Memory usage and optimization insights
- Data flow visualization
Features
Zero Configuration
- No notebook modifications - Automatic instrumentation
- Smart environment detection - Dev vs production modes
- Automatic directory management - Creates the .hunyo/ structure
- One-command startup - pipx run hunyo-mcp-server --notebook file.py
Comprehensive Tracking
- DataFrame operations - Create, transform, merge, filter, groupby
- Runtime metrics - Execution time, memory usage, success/failure
- OpenLineage compliance - Standard lineage format for interoperability
- Smart output handling - Large objects described, not stored
Real-Time Analysis
- Background ingestion - File watching with <100ms latency
- Live querying - Database updates in real-time
- Performance monitoring - Track operations as they happen
- Error context - Link DataFrame issues to execution environment
LLM-Friendly Interface
- Rich MCP tools - Structured data access for AI assistants
- Natural language queries - Ask questions about your data pipeline
- Contextual analysis - Link performance to specific operations
- Historical tracking - Analyze patterns across multiple runs
Installation & Usage
Prerequisites
- Python 3.10+ (3.11+ recommended)
- Notebook environments - Supports multiple notebook types:
  - Marimo notebooks - Full support for .py marimo notebook files
  - Jupyter notebooks - Extensible architecture for future integration
  - Auto-detection - Automatically detects and adapts to the environment
- Cross-platform - Fully compatible with Windows, macOS, and Linux
Installation Options
# Option 1: MCP Server Only (Recommended)
# Install MCP server globally via pipx
pipx install hunyo-mcp-server
# Install capture layer in your notebook environment
pip install hunyo-capture
# Then in your notebook, add one line:
# from hunyo_capture import enable_unified_tracking
# enable_unified_tracking() # Auto-detects environment (Marimo/Jupyter/etc.)
# Option 2: Quick Start (Run without installing)
pipx run hunyo-mcp-server --notebook analysis.py
# Note: Capture layer must be installed separately in notebook environment
# Option 3: All-in-One Installation (Same environment)
pip install hunyo-mcp-server hunyo-capture
# Option 4: Development installation
git clone https://github.com/hunyo-dev/hunyo-notebook-memories-mcp
cd hunyo-notebook-memories-mcp
hatch run install-packages
hunyo-mcp-server --notebook examples/demo.py
Installation Scenarios
Production Setup (Recommended)
# Install MCP server in isolated environment
pipx install hunyo-mcp-server
# In your notebook environment (conda, venv, etc.)
pip install hunyo-capture
# Start MCP server
hunyo-mcp-server --notebook your_analysis.py
Development/Testing Setup
# Install both packages in same environment
pip install hunyo-mcp-server hunyo-capture
hunyo-mcp-server --notebook your_analysis.py
Graceful Fallback (MCP server only)
# If capture layer not available, MCP server provides helpful instructions
pipx install hunyo-mcp-server
hunyo-mcp-server --notebook your_analysis.py
# Shows: "To enable tracking, install: pip install hunyo-capture"
Command-Line Options
hunyo-mcp-server --help
Options:
--notebook PATH Path to marimo notebook file [required]
--dev-mode Force development mode (.hunyo in repo root)
--verbose, -v Enable verbose logging
--standalone Run standalone (for testing/development)
--help Show this message and exit
Usage Examples
# Basic usage
hunyo-mcp-server --notebook data_analysis.py
# Development mode with verbose logging
hunyo-mcp-server --notebook ml_pipeline.py --dev-mode --verbose
# Standalone mode (for testing)
hunyo-mcp-server --notebook test.py --standalone
Example LLM Interactions
Once running, your LLM assistant can analyze your notebook with natural language:
Performance Analysis
"Which DataFrame operations in my notebook are the slowest?"
SELECT operation, AVG(duration_ms) as avg_time, COUNT(*) as count
FROM runtime_events
WHERE event_type = 'dataframe_operation'
GROUP BY operation
ORDER BY avg_time DESC;
Memory Usage Tracking
"Show me memory usage patterns for large DataFrames"
SELECT operation, input_shape, output_shape, memory_delta_mb
FROM lineage_events
WHERE memory_delta_mb > 10
ORDER BY memory_delta_mb DESC;
Data Lineage Analysis
"Trace the transformation chain for my final results DataFrame"
The lineage tool provides complete DataFrame ancestry and transformation history with visual representation.
Use Cases
Data Science Workflows
- Track DataFrame transformations across complex analysis pipelines
- Monitor memory usage and performance bottlenecks
- Debug data quality issues with execution context
- Analyze patterns in iterative model development
Performance Optimization
- Identify slow operations and memory-intensive transformations
- Compare execution metrics across different implementations
- Track improvements after optimization changes
- Monitor resource usage in production notebooks
Debugging & Troubleshooting
- Link DataFrame errors to specific execution context
- Trace data flow through complex transformation chains
- Identify where data quality issues are introduced
- Understand the impact of individual operations
Documentation & Knowledge Sharing
- Automatic documentation of data transformation logic
- Share lineage analysis with team members
- Understand inherited notebooks and data pipelines
- Maintain data governance and compliance records
Development
Local Development Setup
# Clone the repository
git clone https://github.com/hunyo-dev/hunyo-notebook-memories-mcp
cd hunyo-notebook-memories-mcp
# Set up development environment (installs both packages)
hatch run install-packages
# Run tests (both commands work - they use the test environment)
hatch run test # Shorter command (delegates to test environment)
hatch run test:pytest # Explicit command (direct test environment)
# Check code quality
hatch run style
hatch run typing
# Run with development notebook
hunyo-mcp-server --notebook test/fixtures/openlineage_demo_notebook.py --dev-mode
Monorepo Package Structure
hunyo-notebook-memories-mcp/
├── packages/
│   ├── hunyo-mcp-server/                  # MCP server package (pipx installable)
│   │   ├── pyproject.toml
│   │   ├── src/hunyo_mcp_server/
│   │   │   ├── server.py                  # CLI entry point
│   │   │   ├── orchestrator.py            # Component coordination
│   │   │   ├── config.py                  # Environment detection and paths
│   │   │   ├── ingestion/                 # Data pipeline components
│   │   │   │   ├── duckdb_manager.py      # Database operations
│   │   │   │   ├── event_processor.py     # Event validation and transformation
│   │   │   │   └── file_watcher.py        # Real-time file monitoring
│   │   │   └── tools/                     # MCP tools for LLM access
│   │   │       ├── query_tool.py          # Direct SQL querying
│   │   │       ├── schema_tool.py         # Database inspection
│   │   │       └── lineage_tool.py        # DataFrame lineage analysis
│   │   └── tests/                         # MCP server-specific tests
│   └── hunyo-capture/                     # Capture layer package (pip installable)
│       ├── pyproject.toml
│       ├── src/hunyo_capture/             # Instrumentation layer
│       │   ├── __init__.py
│       │   ├── logger.py                  # Logging utilities
│       │   └── unified_marimo_interceptor.py  # DataFrame capture
│       └── tests/                         # Capture layer-specific tests
├── tests/integration/                     # Cross-package integration tests
├── schemas/                               # Shared database schemas
├── .github/workflows/
│   ├── test.yml                           # Per-package testing
│   └── test-integration.yml               # Package separation testing
└── pyproject.toml                         # Workspace configuration
Package Independence
MCP Server (hunyo-mcp-server):
- Zero dependencies on capture package
- Graceful fallback when capture not available
- Standalone CLI tool for data analysis
- Installable via pipx for global access
Capture Layer (hunyo-capture):
- Lightweight DataFrame instrumentation
- No dependencies on MCP server
- Installable in any notebook environment
- Works with existing marimo workflows
Contributing
We welcome contributions! See the contributing guide in the repository for detailed development setup and guidelines.
Quick Contribution Setup
git clone https://github.com/hunyo-dev/hunyo-notebook-memories-mcp
cd hunyo-notebook-memories-mcp
hatch shell
hatch run test # Run test suite (or: hatch run test:pytest)
Requirements
Core Dependencies
MCP Server (hunyo-mcp-server):
# Runtime requirements
mcp >= 1.0.0 # MCP protocol implementation
click >= 8.0.0 # CLI framework
duckdb >= 0.9.0 # Database engine
pandas >= 2.0.0 # Data processing
pydantic >= 2.0.0 # Data validation
watchdog >= 3.0.0 # File monitoring
websockets >= 11.0.0 # WebSocket support
openlineage-python >= 0.28.0 # Data lineage specification
Capture Layer (hunyo-capture):
# Lightweight requirements
pandas >= 2.0.0 # DataFrame operations (only dependency)
Optional Dependencies
Development & Testing:
pytest >= 7.0.0 # Testing framework
pytest-cov >= 4.0.0 # Coverage reporting
pytest-timeout >= 2.1.0 # Test timeout management
marimo >= 0.8.0 # Marimo notebook support
Development Tools:
black >= 23.0.0 # Code formatting
ruff >= 0.1.0 # Fast linting
mypy >= 1.0.0 # Type checking
Installation Dependencies
Production Setup:
- pipx for global MCP server installation
- pip for the capture layer in notebook environments
- Python 3.10+ with virtual environment support
Development Setup:
- hatch for workspace management
- git for version control
- IDE with Python support (VS Code, PyCharm, etc.)
Links
- Documentation: GitHub README
- Issues: GitHub Issues
- Source Code: GitHub Repository
- Model Context Protocol: MCP Specification
- OpenLineage: OpenLineage.io
License
MIT License - see the LICENSE file for details.
Ready to supercharge your notebook analysis?
pipx run hunyo-mcp-server --notebook your_notebook.py
Your LLM assistant is now equipped with powerful DataFrame lineage and performance analysis capabilities!