spreadsheet-mcp

PSU3D0/spreadsheet-mcp

3.4

If you are the rightful owner of spreadsheet-mcp and would like to certify it and/or have it hosted online, please leave a comment on the right or send an email to dayong@mcphub.com.

Spreadsheet MCP is a server designed for efficient spreadsheet analysis and editing, optimized for LLM agents.

Tools
5
Resources
0
Prompts
0

Spreadsheet MCP

Crates.io Documentation License

Spreadsheet MCP

MCP server for spreadsheet analysis and editing. Slim, token-efficient tool surface designed for LLM agents.

Why?

Dumping a 50,000-row spreadsheet into an LLM context is expensive and usually unnecessary. Most spreadsheet tasks need surgical access: find a region, profile its structure, read a filtered slice. This server exposes tools that let agents discover → profile → extract without burning tokens on cells they don't need.

  • Full support: .xlsx, .xlsm (via umya-spreadsheet)
  • VBA source inspection (optional): .xlsm via SPREADSHEET_MCP_VBA_ENABLED=true / --vba-enabled (parses embedded xl/vbaProject.bin via ovba)
  • Discovery only: .xls, .xlsb (enumerated, not parsed)

Architecture

Architecture Overview

  • LRU cache keeps recently-accessed workbooks in memory (configurable capacity)
  • Lazy sheet metrics computed once per sheet, reused across tools
  • Region detection runs once and caches bounds for sheet_overview, find_value, read_table, table_profile

Tool Surface

ToolPurpose
list_workbooks, describe_workbook, list_sheetsDiscover workbooks/sheets and metadata
workbook_summary, sheet_overviewOrientation + region detection
read_table, table_profileStructured reads and lightweight profiling
range_values, sheet_pageTargeted spot checks / raw paging fallback
find_value, find_formulaSearch values/labels or formulas
sheet_statisticsQuick sheet stats (density, nulls, duplicates hints)
sheet_formula_map, formula_trace, scan_volatilesFormula analysis and tracing
sheet_styles, workbook_style_summaryStyle inspection (sheet-scoped + workbook-wide)
named_rangesList defined names + tables
vba_project_summary, vba_module_sourceRead VBA project metadata + module source (disabled by default; .xlsm)
get_manifest_stubGenerate manifest scaffold
close_workbookEvict workbook from cache

VBA Support (Read-Only)

VBA tools are disabled by default. When enabled, the server can extract and parse the embedded VBA project from .xlsm files and return module source code.

Enable via:

  • CLI: --vba-enabled
  • Env: SPREADSHEET_MCP_VBA_ENABLED=true

Tools:

  • vba_project_summary: Lists modules + basic project metadata
  • vba_module_source: Returns paged source for a single module

Notes:

  • This does not execute macros; it only reads and returns text.
  • Responses are size-limited; page through module source.

Write & Recalc Support

Write tools allow "what-if" analysis: fork a workbook, edit cells, recalculate formulas via LibreOffice, and diff the results. For safety, you can create checkpoints for high‑fidelity rollback and apply previewed (staged) changes explicitly.

Enabling Write Tools

Always use the :full Docker image for write/recalc features:

docker run -v /path/to/workbooks:/data -p 8079:8079 ghcr.io/psu3d0/spreadsheet-mcp:full

The Docker image includes LibreOffice with pre-configured macros required for reliable recalculation. Running outside Docker requires manual LibreOffice setup (macro trust, headless config) and is not recommended.

Write Tools

ToolPurpose
create_forkCreate a temporary editable copy for "what-if" analysis
checkpoint_fork, restore_checkpointHigh-fidelity snapshot + rollback
edit_batchApply values or formulas to cells in a fork
transform_batchRange-first clear/fill/replace (prefer for bulk edits)
style_batchBatch style edits (range/region/cells)
apply_formula_patternAutofill-like formula fill over a target range
structure_batchBatch structural edits (rows/cols/sheets + copy/move ranges)
recalculateTrigger LibreOffice to update formula results
get_changesetDiff the fork against the original (cells, tables, named ranges)
screenshot_sheetRender a sheet range to a cropped PNG screenshot
save_forkSave fork to a new path (or overwrite original with --allow-overwrite)
list_staged_changes, apply_staged_change, discard_staged_changeManage previewed/staged changes
get_edits, list_forks, discard_forkInspect / list / discard forks

Token-Efficient Write Workflows

find_formula paging

{
  "tool": "find_formula",
  "arguments": {
    "workbook_or_fork_id": "wb-23456789ab",
    "sheet_name": "Calc",
    "query": "SUM(",
    "include_context": false,
    "limit": 20,
    "offset": 0
  }
}

get_changeset summary + filters

{
  "tool": "get_changeset",
  "arguments": {
    "fork_id": "fork-23456789abcd",
    "summary_only": true,
    "exclude_subtypes": ["recalc_result"],
    "limit": 200,
    "offset": 0
  }
}

Docker Paths (Exports + Screenshots)

When running in Docker with --workspace-root /data and a host mount like -v /path/to/workbooks:/data:

  • Fork working files live under /tmp/mcp-forks inside the container (not visible on host).
  • save_fork.target_path is resolved under workspace_root (Docker default: /data). Use a relative path like out.xlsx (or exports/out.xlsx) to write back into the mounted folder on the host.
  • screenshot_sheet writes PNGs under screenshots/ in workspace_root (Docker default: /data/screenshots/).

Screenshot Tool

screenshot_sheet captures a visual PNG of a rectangular range, rendered headless via LibreOffice in the :full image. The PNG is auto‑cropped to remove page whitespace and saved under screenshots/ in the workspace. Note: the tool returns a file:// URI on the server filesystem; when running via Docker, treat it as a container path and look for the PNG under your mounted workspace folder (e.g. screenshots/<name>.png).

Arguments:

  • workbook_or_fork_id (required; accepts a workbook_id or fork_id)
  • sheet_name (required)
  • range (optional, default A1:M40)

Limits and behavior:

  • Max range per screenshot: 100 rows × 30 columns. If exceeded, the tool fails with suggested tiled sub‑ranges to request instead.
  • After export/crop, a pixel guard rejects images that are too large for reliable agent use (default max 4096px on a side or 12MP area). On rejection, the tool returns smaller range suggestions.
  • Override pixel guard via env vars: SPREADSHEET_MCP_MAX_PNG_DIM_PX, SPREADSHEET_MCP_MAX_PNG_AREA_PX.

See for architecture details.

Example

Request: Profile a detected region

{
  "tool": "table_profile",
  "arguments": {
    "workbook_id": "wb-23456789ab",
    "sheet_name": "Q1 Actuals",
    "region_id": 1,
    "sample_size": 10,
    "sample_mode": "distributed"
  }
}

Response:

{
  "sheet_name": "Q1 Actuals",
  "headers": ["Date", "Category", "Amount", "Notes"],
  "column_types": [
    {"name": "Date", "inferred_type": "date", "nulls": 0, "distinct": 87},
    {"name": "Category", "inferred_type": "text", "nulls": 2, "distinct": 12, "top_values": ["Payroll", "Marketing", "Infrastructure"]},
    {"name": "Amount", "inferred_type": "number", "nulls": 0, "min": 150.0, "max": 84500.0, "mean": 12847.32},
    {"name": "Notes", "inferred_type": "text", "nulls": 45, "distinct": 38}
  ],
  "row_count": 1247,
  "samples": [...]
}

The agent now knows column types, cardinality, and value distributions—without reading 1,247 rows.

Recommended Agent Workflow

Token Efficiency Workflow

  1. list_workbookslist_sheetsworkbook_summary for orientation
  2. sheet_overview to get detected_regions (ids/bounds/kind/confidence)
  3. table_profileread_table with region_id, small limit, and sample_mode (distributed preferred)
  4. Use find_value (label mode) or range_values for targeted pulls
  5. Reserve sheet_page for unknown layouts or calculator inspection; prefer compact/values_only
  6. Keep payloads small; page/filter rather than full-sheet reads

Region Detection

Region Detection Visualization

Spreadsheets often contain multiple logical tables, parameter blocks, and output areas on a single sheet. The server detects these automatically:

  1. Gutter detection — Scans for empty rows/columns that separate content blocks
  2. Recursive splitting — Subdivides large areas along detected gutters
  3. Border trimming — Removes sparse edges to tighten bounds
  4. Header detection — Identifies header rows (including multi-row merged headers)
  5. Classification — Labels each region: data, parameters, outputs, calculator, metadata
  6. Confidence scoring — Higher scores for well-structured regions with clear headers

Regions are cached per sheet. Tools like read_table accept a region_id to scope reads without manually specifying ranges.

Quick Start

Docker (Recommended)

Two image variants are published:

ImageSizeWrite/Recalc
ghcr.io/psu3d0/spreadsheet-mcp:latest~15MBNo
ghcr.io/psu3d0/spreadsheet-mcp:latest-full~800MBYes (includes LibreOffice)
# Read-only (slim image)
docker run -v /path/to/workbooks:/data -p 8079:8079 ghcr.io/psu3d0/spreadsheet-mcp:latest

# Read-only + VBA tools enabled
docker run -v /path/to/workbooks:/data -p 8079:8079 -e SPREADSHEET_MCP_VBA_ENABLED=true ghcr.io/psu3d0/spreadsheet-mcp:latest

# With write/recalc support (full image)
docker run -v /path/to/workbooks:/data -p 8079:8079 ghcr.io/psu3d0/spreadsheet-mcp:full

Cargo Install

# Read-only
cargo install spreadsheet-mcp
spreadsheet-mcp --workspace-root /path/to/workbooks

# Enable VBA tools
SPREADSHEET_MCP_VBA_ENABLED=true spreadsheet-mcp --workspace-root /path/to/workbooks

Note: For write/recalc features, use the :full Docker image instead of cargo install. The Docker image includes LibreOffice with required macro configuration.

Build from Source

cargo run --release -- --workspace-root /path/to/workbooks

Default transport: HTTP streaming at 127.0.0.1:8079. Endpoint: POST /mcp.

Use --transport stdio for CLI pipelines.

MCP Client Configuration

Claude Code / Claude Desktop

Add to ~/.claude.json or project .mcp.json:

Read-only (slim image):

{
  "mcpServers": {
    "spreadsheet": {
      "command": "docker",
      "args": ["run", "-i", "--rm", "-v", "/path/to/workbooks:/data", "ghcr.io/psu3d0/spreadsheet-mcp:latest", "--transport", "stdio"]
    }
  }
}

Read-only + VBA tools enabled:

{
  "mcpServers": {
    "spreadsheet": {
      "command": "docker",
      "args": ["run", "-i", "--rm", "-v", "/path/to/workbooks:/data", "ghcr.io/psu3d0/spreadsheet-mcp:latest", "--transport", "stdio", "--vba-enabled"]
    }
  }
}

With write/recalc (full image):

{
  "mcpServers": {
    "spreadsheet": {
      "command": "docker",
      "args": ["run", "-i", "--rm", "-v", "/path/to/workbooks:/data", "ghcr.io/psu3d0/spreadsheet-mcp:latest-full", "--transport", "stdio", "--recalc-enabled"]
    }
  }
}

Binary (no Docker):

{
  "mcpServers": {
    "spreadsheet": {
      "command": "spreadsheet-mcp",
      "args": ["--workspace-root", "/path/to/workbooks", "--transport", "stdio"]
    }
  }
}

Cursor / VS Code

Read-only (slim image):

{
  "mcp.servers": {
    "spreadsheet": {
      "command": "docker",
      "args": ["run", "-i", "--rm", "-v", "${workspaceFolder}:/data", "ghcr.io/psu3d0/spreadsheet-mcp:latest", "--transport", "stdio"]
    }
  }
}

With write/recalc (full image):

{
  "mcp.servers": {
    "spreadsheet": {
      "command": "docker",
      "args": ["run", "-i", "--rm", "-v", "${workspaceFolder}:/data", "ghcr.io/psu3d0/spreadsheet-mcp:latest-full", "--transport", "stdio", "--recalc-enabled"]
    }
  }
}

Binary (no Docker):

{
  "mcp.servers": {
    "spreadsheet": {
      "command": "spreadsheet-mcp",
      "args": ["--workspace-root", "${workspaceFolder}", "--transport", "stdio"]
    }
  }
}

HTTP Mode

docker run -v /path/to/workbooks:/data -p 8079:8079 ghcr.io/psu3d0/spreadsheet-mcp:latest

Connect via POST http://localhost:8079/mcp.

Local Development

To test local changes without rebuilding Docker:

cargo build --release

Then point your MCP client to the binary:

{
  "mcpServers": {
    "spreadsheet": {
      "command": "/path/to/spreadsheet-mcp/target/release/spreadsheet-mcp",
      "args": ["--workspace-root", "/path/to/workbooks", "--transport", "stdio"]
    }
  }
}

Configuration

FlagEnvDescription
--workspace-root <DIR>SPREADSHEET_MCP_WORKSPACEWorkspace root to scan (default: cwd)
--cache-capacity <N>SPREADSHEET_MCP_CACHE_CAPACITYWorkbook cache size (default: 5)
--extensions <list>SPREADSHEET_MCP_EXTENSIONSAllowed extensions (default: xlsx,xls,xlsb)
--workbook <FILE>SPREADSHEET_MCP_WORKBOOKSingle-workbook mode
--enabled-tools <list>SPREADSHEET_MCP_ENABLED_TOOLSWhitelist exposed tools
--transport <http|stdio>SPREADSHEET_MCP_TRANSPORTTransport selection (default: http)
--http-bind <ADDR>SPREADSHEET_MCP_HTTP_BINDBind address (default: 127.0.0.1:8079)
--recalc-enabledSPREADSHEET_MCP_RECALC_ENABLEDEnable write/recalc tools (default: false)
--max-concurrent-recalcs <N>SPREADSHEET_MCP_MAX_CONCURRENT_RECALCSParallel recalc limit (default: 2)
--allow-overwriteSPREADSHEET_MCP_ALLOW_OVERWRITEAllow save_fork to overwrite original files (default: false)

Performance

  • LRU workbook cache — Recently opened workbooks stay in memory; oldest evicted when capacity exceeded
  • Lazy metrics — Sheet metrics computed on first access, cached for subsequent calls
  • Region caching — Detection runs once per sheet; region_id lookups are O(1)
  • Sampling modesdistributed sampling reads evenly across rows without loading everything
  • Compact formatsvalues_only and compact output modes reduce response size

Testing

cargo test

Covers: region detection, region-scoped tools, read_table edge cases (merged headers, filters, large sheets), workbook summary.

Local MCP Testing

To test local changes with an MCP client (Claude Code, Cursor, etc.), use the helper script that rebuilds the Docker image on each invocation:

{
  "mcpServers": {
    "spreadsheet": {
      "command": "./scripts/local-docker-mcp.sh"
    }
  }
}

Set WORKSPACE_ROOT to override the default test directory:

WORKSPACE_ROOT=/path/to/workbooks ./scripts/local-docker-mcp.sh

This ensures you're always testing against your latest code changes without manual image rebuilds.

Behavior & Limits

  • Read-only by default; write/recalc features require --recalc-enabled or the :full image
  • XLSX supported for write; .xls/.xlsb are read-only
  • Bounded in-memory cache honors cache_capacity
  • Prefer region-scoped reads and sampling for token/latency efficiency
  • screenshot_sheet requires write/recalc support and is capped to 100×30 cells per image (with split suggestions).