Hunyo MCP Server

Zero-configuration DataFrame tracking and runtime debugging for multiple notebook environments via MCP

A single-command orchestrator that provides automatic notebook instrumentation, real-time event capture, DuckDB ingestion, and LLM-accessible query tools via the Model Context Protocol (MCP). Supports Marimo notebooks with extensible architecture for Jupyter and other environments.

🚀 Quick Start

# Install and run in one command
pipx run hunyo-mcp-server --notebook analysis.py

# Or install globally  
pipx install hunyo-mcp-server
hunyo-mcp-server --notebook /path/to/your/notebook.py

That's it! No notebook modifications required. Your LLM assistant can now analyze DataFrame operations, performance metrics, and data lineage in real-time.

🎯 What It Does

Hunyo MCP Server automatically:

  1. 📝 Instruments your notebook - Zero-touch capture layer injection
  2. 🔍 Captures execution events - DataFrame operations, runtime metrics, errors
  3. 💾 Stores in DuckDB - Real-time ingestion with OpenLineage compliance
  4. 🤖 Exposes MCP tools - Rich querying interface for LLM analysis

Example Workflow

# Start the MCP server
hunyo-mcp-server --notebook my_analysis.py

# Your LLM can now ask questions like:
# "What DataFrames were created in the last run?"
# "Show me the performance metrics for the merge operations"
# "Trace the lineage of the final results DataFrame"
# "Which operations took the most memory?"

🏗️ Architecture

Environment-Agnostic Dual-Package System

Hunyo MCP Server uses a dual-package architecture with an environment-agnostic design, giving deployment flexibility across multiple notebook environments:

Package 1: hunyo-mcp-server (pipx installable)
  • Purpose: Global CLI tool for orchestration and data analysis
  • Installation: pipx install hunyo-mcp-server
  • Dependencies: Full MCP stack (DuckDB, OpenLineage, WebSockets)
  • Features: Database management, MCP tools, file watching, graceful fallback
Package 2: hunyo-capture (pip installable)
  • Purpose: Lightweight DataFrame instrumentation layer with multi-environment support
  • Installation: pip install hunyo-capture (in notebook environment)
  • Dependencies: Minimal (pandas only)
  • Features: DataFrame tracking, event generation, environment-agnostic architecture
    • 🔧 Marimo support: Full integration with Marimo runtime hooks
    • 📝 Jupyter support: Extensible design for future Jupyter integration
    • 🤖 Auto-detection: Automatically detects and adapts to notebook environment
    • 🔄 Unified API: Same tracking functions work across all supported environments

Data Flow

Notebook Environment (Marimo/Jupyter/etc.)
        → hunyo-capture (pip install): auto-detects the environment, unified tracking interface
        → JSONL Events (environment-aware)
        → File Watcher (part of hunyo-mcp-server, pipx install)
        → DuckDB Database
        → MCP Query Tools
        → LLM Analysis
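
To make the watcher step concrete, here is a minimal sketch of the pattern using watchdog (one of the server's listed runtime dependencies). It is illustrative only: the real file_watcher.py may be structured differently, and the handler class, the .jsonl extension check, and the print placeholder are assumptions.

# Illustrative sketch of the JSONL-to-ingestion watcher step (not the actual implementation)
from pathlib import Path

from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

# Default production events directory (see Storage Structure below)
EVENTS_DIR = Path.home() / ".hunyo" / "events"
EVENTS_DIR.mkdir(parents=True, exist_ok=True)

class JsonlEventHandler(FileSystemEventHandler):
    def on_modified(self, event):
        # The real pipeline parses new JSONL lines, validates them, and
        # inserts them into DuckDB; this sketch only reports the change.
        if not event.is_directory and event.src_path.endswith(".jsonl"):
            print(f"New events in {event.src_path}")

observer = Observer()
observer.schedule(JsonlEventHandler(), str(EVENTS_DIR), recursive=True)
observer.start()
# ... keep the process alive, then observer.stop(); observer.join()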

Environment Isolation Benefits

Production Setup (Recommended):

# Global MCP server installation
pipx install hunyo-mcp-server

# Capture layer in notebook environment  
pip install hunyo-capture

Benefits:

  • ✅ Clean separation: MCP server isolated from notebook dependencies
  • ✅ Minimal notebook overhead: Only lightweight capture layer installed
  • ✅ Graceful fallback: MCP server works without capture layer
  • ✅ Easy management: Global server, environment-specific capture
  • ✅ Environment flexibility: Same capture layer works across Marimo, Jupyter, and future environments
  • ✅ Auto-detection: Automatically adapts to detected notebook environment without configuration

Graceful Fallback System

When the capture layer is not available, the MCP server provides helpful guidance:

hunyo-mcp-server --notebook analysis.py
# Output: 
# [INFO] To enable notebook tracking, add this to your notebook:
# [INFO]   # Install capture layer: pip install hunyo-capture
# [INFO]   from hunyo_capture import enable_unified_tracking
# [INFO]   enable_unified_tracking()  # Auto-detects environment

Environment-Agnostic Architecture

The capture layer automatically detects and adapts to different notebook environments:

# Same API works across all supported environments
from hunyo_capture import enable_unified_tracking

# Auto-detects environment (Marimo, Jupyter, etc.)
enable_unified_tracking()

# Or specify environment explicitly  
enable_unified_tracking(environment='marimo')

Architecture Components (see the sketch after this list):

  • Environment Detection: Auto-identifies notebook type (Marimo, Jupyter, etc.)
  • Hook Abstractions: Environment-specific hook management (MarimoHooks, JupyterHooks)
  • Context Adapters: Normalize cell execution context across environments
  • Component Factory: Creates appropriate components for detected environment
  • Unified API: Same tracking functions work across all environments
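
As a rough illustration of how environment detection and a component factory can fit together, here is a hedged sketch. The names (detect_environment, MarimoHooks, JupyterHooks, create_hooks) are hypothetical and do not mirror the actual hunyo_capture internals; real detection would inspect the running notebook session rather than just which packages are importable.

# Hypothetical sketch of detection + factory; not the hunyo_capture source
import importlib.util

def detect_environment() -> str:
    # Simplified: only checks which notebook packages are importable
    if importlib.util.find_spec("marimo") is not None:
        return "marimo"
    if importlib.util.find_spec("IPython") is not None:
        return "jupyter"
    return "unknown"

class MarimoHooks:
    def install(self) -> None:
        print("registering marimo runtime hooks")

class JupyterHooks:
    def install(self) -> None:
        print("registering IPython/Jupyter event callbacks")

def create_hooks(environment: str | None = None):
    # Component factory: picks the hook implementation for the environment
    env = environment or detect_environment()
    registry = {"marimo": MarimoHooks, "jupyter": JupyterHooks}
    return registry[env]()

create_hooks("marimo").install()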

Storage Structure

# Production: ~/.hunyo/
# Development: {repo}/.hunyo/
├── events/
│   ├── runtime/           # Cell execution metrics, timing, memory
│   ├── lineage/           # DataFrame operations, OpenLineage events
│   └── dataframe_lineage/ # Column-level lineage and transformations
├── database/
│   └── lineage.duckdb    # Queryable database with rich schema
└── config/
    └── settings.yaml     # Configuration and preferences
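
Because events land in a regular DuckDB file, the database can also be opened directly with the duckdb Python package, outside the MCP tools. A minimal sketch, assuming the default production path and the runtime_events table used in the query examples later in this README (the file exists only after the server has ingested some events):

from pathlib import Path

import duckdb

# Default production location; development mode uses {repo}/.hunyo instead
db_path = Path.home() / ".hunyo" / "database" / "lineage.duckdb"

con = duckdb.connect(str(db_path), read_only=True)
print(con.execute("SHOW TABLES").fetchall())
print(con.execute("SELECT * FROM runtime_events LIMIT 5").fetchdf())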

🛠️ MCP Tools Available to LLMs

Your LLM assistant gets access to these powerful analysis tools:

📊 Query Tool - Direct SQL Analysis

-- Your LLM can run queries like:
SELECT operation, AVG(duration_ms) as avg_time 
FROM runtime_events 
WHERE success = true 
GROUP BY operation;

🔍 Schema Tool - Database Inspection

  • Table structures and column definitions
  • Data type information and constraints
  • Example queries and usage patterns
  • Statistics and metadata

🔗 Lineage Tool - DataFrame Tracking

  • Complete DataFrame transformation chains
  • Performance metrics per operation
  • Memory usage and optimization insights
  • Data flow visualization

📋 Features

✅ Zero Configuration

  • No notebook modifications - Automatic instrumentation
  • Smart environment detection - Dev vs production modes (see the sketch after this list)
  • Automatic directory management - Creates .hunyo/ structure
  • One-command startup - pipx run hunyo-mcp-server --notebook file.py
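
The dev/production split and automatic directory creation can be pictured with a short sketch. This is not the actual config.py logic; resolve_hunyo_root and the cwd-based development root are assumptions, shown only to make the two modes concrete.

from pathlib import Path

def resolve_hunyo_root(dev_mode: bool = False) -> Path:
    # --dev-mode (or a detected development checkout): keep data in the repo
    if dev_mode:
        return Path.cwd() / ".hunyo"
    # Production: global per-user directory
    return Path.home() / ".hunyo"

root = resolve_hunyo_root()
for sub in ("events/runtime", "events/lineage", "events/dataframe_lineage",
            "database", "config"):
    (root / sub).mkdir(parents=True, exist_ok=True)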

✅ Comprehensive Tracking

  • DataFrame operations - Create, transform, merge, filter, groupby
  • Runtime metrics - Execution time, memory usage, success/failure
  • OpenLineage compliance - Standard lineage format for interoperability
  • Smart output handling - Large objects described, not stored

✅ Real-Time Analysis

  • Background ingestion - File watching with <100ms latency
  • Live querying - Database updates in real-time
  • Performance monitoring - Track operations as they happen
  • Error context - Link DataFrame issues to execution environment

✅ LLM-Friendly Interface

  • Rich MCP tools - Structured data access for AI assistants
  • Natural language queries - Ask questions about your data pipeline
  • Contextual analysis - Link performance to specific operations
  • Historical tracking - Analyze patterns across multiple runs

🔧 Installation & Usage

Prerequisites

  • Python 3.10+ (3.11+ recommended)
  • Notebook environments - Supports multiple notebook types:
    • Marimo notebooks - Full support for .py marimo notebook files
    • Jupyter notebooks - Extensible architecture for future integration
    • Auto-detection - Automatically detects and adapts to environment
  • Cross-platform - Fully compatible with Windows, macOS, and Linux

Installation Options

# Option 1: MCP Server Only (Recommended)
# Install MCP server globally via pipx
pipx install hunyo-mcp-server

# Install capture layer in your notebook environment
pip install hunyo-capture

# Then in your notebook, add one line:
# from hunyo_capture import enable_unified_tracking
# enable_unified_tracking()  # Auto-detects environment (Marimo/Jupyter/etc.)

# Option 2: Quick Start (Run without installing)
pipx run hunyo-mcp-server --notebook analysis.py
# Note: Capture layer must be installed separately in notebook environment

# Option 3: All-in-One Installation (Same environment)
pip install hunyo-mcp-server hunyo-capture

# Option 4: Development installation
git clone https://github.com/hunyo-dev/hunyo-notebook-memories-mcp
cd hunyo-notebook-memories-mcp
hatch run install-packages
hunyo-mcp-server --notebook examples/demo.py

Installation Scenarios

🏭 Production Setup (Recommended)

# Install MCP server in isolated environment
pipx install hunyo-mcp-server

# In your notebook environment (conda, venv, etc.)
pip install hunyo-capture

# Start MCP server
hunyo-mcp-server --notebook your_analysis.py

🔬 Development/Testing Setup

# Install both packages in same environment
pip install hunyo-mcp-server hunyo-capture
hunyo-mcp-server --notebook your_analysis.py

⚡ Graceful Fallback (MCP server only)

# If capture layer not available, MCP server provides helpful instructions
pipx install hunyo-mcp-server
hunyo-mcp-server --notebook your_analysis.py
# Shows: "To enable tracking, install: pip install hunyo-capture"

Command-Line Options

hunyo-mcp-server --help

Options:
  --notebook PATH     Path to marimo notebook file [required]
  --dev-mode         Force development mode (.hunyo in repo root)
  --verbose, -v      Enable verbose logging
  --standalone       Run standalone (for testing/development)
  --help             Show this message and exit

Usage Examples

# Basic usage
hunyo-mcp-server --notebook data_analysis.py

# Development mode with verbose logging
hunyo-mcp-server --notebook ml_pipeline.py --dev-mode --verbose

# Standalone mode (for testing)
hunyo-mcp-server --notebook test.py --standalone

📊 Example LLM Interactions

Once running, your LLM assistant can analyze your notebook with natural language:

Performance Analysis

"Which DataFrame operations in my notebook are the slowest?"

SELECT operation, AVG(duration_ms) as avg_time, COUNT(*) as count
FROM runtime_events 
WHERE event_type = 'dataframe_operation'
GROUP BY operation 
ORDER BY avg_time DESC;

Memory Usage Tracking

"Show me memory usage patterns for large DataFrames"

SELECT operation, input_shape, output_shape, memory_delta_mb
FROM lineage_events 
WHERE memory_delta_mb > 10
ORDER BY memory_delta_mb DESC;

Data Lineage Analysis

"Trace the transformation chain for my final results DataFrame"

The lineage tool provides complete DataFrame ancestry and transformation history with visual representation.

🎯 Use Cases

🔬 Data Science Workflows

  • Track DataFrame transformations across complex analysis pipelines
  • Monitor memory usage and performance bottlenecks
  • Debug data quality issues with execution context
  • Analyze patterns in iterative model development

📈 Performance Optimization

  • Identify slow operations and memory-intensive transformations
  • Compare execution metrics across different implementations
  • Track improvements after optimization changes
  • Monitor resource usage in production notebooks

🐛 Debugging & Troubleshooting

  • Link DataFrame errors to specific execution context
  • Trace data flow through complex transformation chains
  • Identify where data quality issues are introduced
  • Understand the impact of individual operations

📚 Documentation & Knowledge Sharing

  • Automatic documentation of data transformation logic
  • Share lineage analysis with team members
  • Understand inherited notebooks and data pipelines
  • Maintain data governance and compliance records

🔧 Development

Local Development Setup

# Clone the repository
git clone https://github.com/hunyo-dev/hunyo-notebook-memories-mcp
cd hunyo-notebook-memories-mcp

# Set up development environment (installs both packages)
hatch run install-packages

# Run tests (both commands work - they use the test environment)
hatch run test          # Shorter command (delegates to test environment)
hatch run test:pytest   # Explicit command (direct test environment)

# Check code quality
hatch run style
hatch run typing

# Run with development notebook
hunyo-mcp-server --notebook test/fixtures/openlineage_demo_notebook.py --dev-mode

Monorepo Package Structure

hunyo-notebook-memories-mcp/
├── packages/
│   ├── hunyo-mcp-server/              # MCP server package (pipx installable)
│   │   ├── pyproject.toml
│   │   ├── src/hunyo_mcp_server/
│   │   │   ├── server.py              # CLI entry point
│   │   │   ├── orchestrator.py        # Component coordination
│   │   │   ├── config.py              # Environment detection and paths
│   │   │   ├── ingestion/             # Data pipeline components
│   │   │   │   ├── duckdb_manager.py  # Database operations
│   │   │   │   ├── event_processor.py # Event validation and transformation
│   │   │   │   └── file_watcher.py    # Real-time file monitoring
│   │   │   └── tools/                 # MCP tools for LLM access
│   │   │       ├── query_tool.py      # Direct SQL querying
│   │   │       ├── schema_tool.py     # Database inspection
│   │   │       └── lineage_tool.py    # DataFrame lineage analysis
│   │   └── tests/                     # MCP server-specific tests
│   └── hunyo-capture/                 # Capture layer package (pip installable)
│       ├── pyproject.toml
│       ├── src/hunyo_capture/         # Instrumentation layer
│       │   ├── __init__.py
│       │   ├── logger.py              # Logging utilities
│       │   └── unified_marimo_interceptor.py  # DataFrame capture
│       └── tests/                     # Capture layer-specific tests
├── tests/integration/                 # Cross-package integration tests
├── schemas/                           # Shared database schemas
├── .github/workflows/
│   ├── test.yml                       # Per-package testing
│   └── test-integration.yml           # Package separation testing
└── pyproject.toml                     # Workspace configuration

Package Independence

MCP Server (hunyo-mcp-server):

  • Zero dependencies on capture package
  • Graceful fallback when capture not available (see the sketch after these lists)
  • Standalone CLI tool for data analysis
  • Installable via pipx for global access

Capture Layer (hunyo-capture):

  • Lightweight DataFrame instrumentation
  • No dependencies on MCP server
  • Installable in any notebook environment
  • Works with existing marimo workflows
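
This separation lets notebook-side code degrade gracefully when the capture layer is missing. A hedged sketch of the optional-import pattern (enable_unified_tracking is the documented entry point; the try/except wrapper and fallback stub are illustrative, not code from either package):

try:
    from hunyo_capture import enable_unified_tracking
except ImportError:
    # Capture layer not installed in this environment; fall back to a hint
    def enable_unified_tracking(*args, **kwargs):
        print("hunyo-capture not installed; run: pip install hunyo-capture")

enable_unified_tracking()  # auto-detects the environment when available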

🤝 Contributing

We welcome contributions! See the contributing guide for detailed development setup and guidelines.

Quick Contribution Setup

git clone https://github.com/hunyo-dev/hunyo-notebook-memories-mcp
cd hunyo-notebook-memories-mcp
hatch shell
hatch run test  # Run test suite (or: hatch run test:pytest)

📝 Requirements

Core Dependencies

MCP Server (hunyo-mcp-server):

# Runtime requirements
mcp >= 1.0.0              # MCP protocol implementation
click >= 8.0.0             # CLI framework
duckdb >= 0.9.0            # Database engine
pandas >= 2.0.0            # Data processing
pydantic >= 2.0.0          # Data validation
watchdog >= 3.0.0          # File monitoring
websockets >= 11.0.0       # WebSocket support
openlineage-python >= 0.28.0  # Data lineage specification

Capture Layer (hunyo-capture):

# Lightweight requirements
pandas >= 2.0.0            # DataFrame operations (only dependency)

Optional Dependencies

Development & Testing:

pytest >= 7.0.0           # Testing framework
pytest-cov >= 4.0.0       # Coverage reporting
pytest-timeout >= 2.1.0   # Test timeout management
marimo >= 0.8.0            # Marimo notebook support

Development Tools:

black >= 23.0.0            # Code formatting
ruff >= 0.1.0              # Fast linting
mypy >= 1.0.0              # Type checking

Installation Dependencies

Production Setup:

  • pipx for global MCP server installation
  • pip for capture layer in notebook environments
  • Python 3.10+ with virtual environment support

Development Setup:

  • hatch for workspace management
  • git for version control
  • IDE with Python support (VS Code, PyCharm, etc.)

📄 License

MIT License - see the license file for details.


Ready to supercharge your notebook analysis?

pipx run hunyo-mcp-server --notebook your_notebook.py

Your LLM assistant is now equipped with powerful DataFrame lineage and performance analysis capabilities! 🚀