spreadsheet-read-mcp

PSU3D0/spreadsheet-read-mcp

Spreadsheet Read MCP

spreadsheet-read-mcp is a Model Context Protocol (MCP) server that lets LLM agents explore spreadsheet workbooks safely and deterministically. It focuses on high-signal, read-only insights (structure, formulas, styles, statistics) without mutating the source files. The server is optimized for XLSX first and can discover .xls/.xlsb files, while keeping the backend pluggable for future formats.

What It Does

  • Enumerates workbooks inside a workspace, exposing stable short IDs that are easy for an LLM to reference.
  • Streams sheet pages, highlights formulas, and surfaces cached values so models can inspect data slices without loading entire files.
  • Maps formula clusters, traces precedents/dependents with pagination-friendly summaries, and tags volatile functions.
  • Reports workbook metadata, sheet classifications, style usage, and named ranges to give agents a comprehensive mental model.
  • Provides manifest stubs and bookkeeping helpers so downstream harnesses can integrate the results quickly.
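To make the formula-cluster and volatile-function ideas concrete, here is a minimal sketch in Python. It is not the server's implementation: the helper names (`function_names`, `formula_map`), the output shape, and the volatile-function list (Excel's commonly cited volatile functions) are assumptions for illustration, and grouping is by exact formula text only.

```python
import re
from collections import defaultdict

# Assumption: a commonly cited set of Excel volatile functions,
# not necessarily the list the server uses.
VOLATILE = {"NOW", "TODAY", "RAND", "RANDBETWEEN", "OFFSET", "INDIRECT", "CELL", "INFO"}


def function_names(formula):
    """Return the upper-cased function names that appear before a '(' in a formula."""
    return {name.upper() for name in re.findall(r"([A-Za-z_][A-Za-z0-9_.]*)\s*\(", formula)}


def formula_map(cells):
    """Group cell addresses by identical formula text and tag volatile functions.

    `cells` maps an address such as "C2" to its formula string.
    """
    groups = defaultdict(list)
    for address, formula in cells.items():
        groups[formula].append(address)
    return [
        {
            "formula": formula,
            "cells": addresses,
            "volatile": sorted(function_names(formula) & VOLATILE),
        }
        for formula, addresses in groups.items()
    ]


# Hypothetical sample data.
cells = {
    "C2": "=SUM($A$1:$A$9)",
    "C3": "=SUM($A$1:$A$9)",
    "D1": "=NOW()",
    "D2": "=IF(RAND()>0.5,1,0)",
}
clusters = formula_map(cells)
```

Here `clusters` holds one entry per distinct formula, so an agent reading it sees three patterns instead of four cells.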

What It Is Not

  • No spreadsheet writing, mutation, or recalculation; everything is read-only.
  • No XLS macro execution, VBA inspection, or automation beyond surface metadata.
  • No on-the-fly format conversion; ODS and other backends will require future CAPS-enabled adapters.
  • Not a generic file browser — it focuses strictly on spreadsheet-aware inspection.

Quick Start

# Run directly from the repository (SSE transport at http://127.0.0.1:8079/mcp/sse)
cargo run --release -- \
  --workspace-root /path/to/workbooks \
  --cache-capacity 10

To install the binary locally:

cargo install --path spreadsheet-read-mcp
spreadsheet-read-mcp --workspace-root /path/to/workbooks

By default the server speaks MCP over Server-Sent Events (SSE). Connect an MCP-aware client to GET http://127.0.0.1:8079/mcp/sse, then use the initial event payload to POST JSON requests to http://127.0.0.1:8079/mcp/message. Pass --http-bind <ADDR> to choose a different address, --transport http to use the streamable HTTP transport, or --transport stdio to retain the classic stdio mode.
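The SSE handshake described above can be sketched on the client side as follows. This is an illustrative parser run against a canned stream, not the server's code: the event name `endpoint` follows the MCP SSE transport convention, while the `sessionId` query parameter and its value are hypothetical and may differ from what the server actually emits.

```python
from urllib.parse import urljoin


def parse_sse(text):
    """Split an SSE stream into (event_type, data) pairs.

    A blank line dispatches the pending event; an unterminated trailing
    event is dropped.
    """
    events = []
    event_type, data_lines = "message", []
    for line in text.splitlines():
        if line == "":
            if data_lines:
                events.append((event_type, "\n".join(data_lines)))
            event_type, data_lines = "message", []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        elif line.startswith("event:"):
            event_type = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # Strip a single leading space after the colon, if present.
            data_lines.append(value[1:] if value.startswith(" ") else value)
    return events


SSE_URL = "http://127.0.0.1:8079/mcp/sse"

# Hypothetical first event; the real payload comes from the server.
SAMPLE_STREAM = "event: endpoint\ndata: /mcp/message?sessionId=abc123\n\n"

event_type, data = parse_sse(SAMPLE_STREAM)[0]
# Resolve the announced path against the SSE URL to get the POST target.
post_url = urljoin(SSE_URL, data)
```

A real client would read the stream incrementally from the open GET connection and send each JSON request to `post_url`.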

Configuration Options

You can configure the server through CLI flags, environment variables, or a YAML/JSON config file.

CLI / Environment

| Flag | Env Var | Description |
| --- | --- | --- |
| --workspace-root <DIR> | SPREADSHEET_MCP_WORKSPACE | Root directory to scan for workbooks (defaults to current directory). |
| --cache-capacity <N> | SPREADSHEET_MCP_CACHE_CAPACITY | Maximum in-memory workbook cache size (minimum 1, default 5). |
| --extensions ext1,ext2 | SPREADSHEET_MCP_EXTENSIONS | Allowed file extensions (defaults to xlsx,xls,xlsb). |
| --workbook <FILE> | SPREADSHEET_MCP_WORKBOOK | Lock the server to a single workbook without scanning the workspace. |
| --enabled-tools tool1,tool2 | SPREADSHEET_MCP_ENABLED_TOOLS | Restrict execution to the named tools; others return an MCP invalid_request. |
| --transport <sse\|http\|stdio> | | Transport to use: sse (default), http, or stdio. |
| --http-bind <ADDR> | SPREADSHEET_MCP_HTTP_BIND | Bind address for network transports (SSE/HTTP); defaults to 127.0.0.1:8079. |
| --config <FILE> | | Load settings from a YAML or JSON file. CLI/env values override file entries. |

Config File Example (config.yaml)

workspace_root: /data/spreadsheets
cache_capacity: 8
extensions: ["xlsx", "xlsb"]

Start the server with spreadsheet-read-mcp --config config.yaml.
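The override rule (CLI/env values win over file entries) can be pictured with a small sketch. It is not the server's code: the function names are hypothetical, the CLI-over-environment ordering is an assumption (the README only says both override the file), and rejecting a cache capacity below 1 is one possible reading of "minimum 1".

```python
# Defaults and environment variable names taken from the options table.
DEFAULTS = {
    "workspace_root": ".",
    "cache_capacity": 5,
    "extensions": ["xlsx", "xls", "xlsb"],
}
ENV_VARS = {
    "workspace_root": "SPREADSHEET_MCP_WORKSPACE",
    "cache_capacity": "SPREADSHEET_MCP_CACHE_CAPACITY",
    "extensions": "SPREADSHEET_MCP_EXTENSIONS",
}


def parse_value(key, raw):
    """Convert a raw string from the CLI or environment into a typed value."""
    if key == "cache_capacity":
        return int(raw)
    if key == "extensions":
        return [ext.strip() for ext in raw.split(",") if ext.strip()]
    return raw


def resolve_settings(file_cfg, env, cli):
    """Layer settings: defaults < config file < environment < CLI (assumed order)."""
    settings = dict(DEFAULTS)
    settings.update(file_cfg)
    for key, var in ENV_VARS.items():
        if var in env:
            settings[key] = parse_value(key, env[var])
    for key, raw in cli.items():
        settings[key] = parse_value(key, raw)
    if settings["cache_capacity"] < 1:
        # Assumption: values below the documented minimum are rejected.
        raise ValueError("cache_capacity must be at least 1")
    return settings


# Values from the config.yaml example, already parsed into a dict.
file_cfg = {
    "workspace_root": "/data/spreadsheets",
    "cache_capacity": 8,
    "extensions": ["xlsx", "xlsb"],
}
resolved = resolve_settings(
    file_cfg,
    env={"SPREADSHEET_MCP_CACHE_CAPACITY": "3"},
    cli={"workspace_root": "/tmp/books"},  # hypothetical path
)
```

In this example the file supplies the extensions, the environment overrides the cache capacity, and the CLI overrides the workspace root.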

Transports

  • sse (default): Serves the MCP SSE interface with GET /mcp/sse for events and POST /mcp/message for JSON payloads. Ideal for local development where clients expect the classic SSE workflow.
  • http: Enables the streamable HTTP transport at /mcp, which combines JSON payloads and event streams over a single upgraded connection.
  • stdio: Keeps compatibility with stdio-based clients. Use --transport stdio to enable it.
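As a quick reference, the endpoints each transport exposes can be summarized in a small client-side helper. The function name and return shape are hypothetical; the paths and the default bind address come from the descriptions above.

```python
def client_endpoints(transport, bind="127.0.0.1:8079"):
    """Return the URLs a client would use for a given --transport value."""
    base = f"http://{bind}"
    if transport == "sse":
        # GET for the event stream, POST for JSON payloads.
        return {"events": f"{base}/mcp/sse", "messages": f"{base}/mcp/message"}
    if transport == "http":
        # Streamable HTTP uses a single endpoint.
        return {"endpoint": f"{base}/mcp"}
    if transport == "stdio":
        # No network endpoints: the client talks over stdin/stdout.
        return {}
    raise ValueError(f"unknown transport: {transport!r}")


sse = client_endpoints("sse")
http = client_endpoints("http", bind="0.0.0.0:9000")  # hypothetical bind address
```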

Tool Surface

| Tool | Why It Matters |
| --- | --- |
| list_workbooks | Lists discoverable workbooks with slug + short ID so agents can choose targets without remembering long hashes. |
| describe_workbook | Returns workbook-level metadata (size, sheet count, CAPS) to gauge complexity before drilling in. |
| list_sheets | Presents sheet summaries, visibility, metrics, and tags to help prioritize inspection order. |
| sheet_overview | Offers classification, headline stats, and highlights (tables, named items) for a single sheet. |
| sheet_page | Pages through tabular data with optional formula/style payloads, enabling high-signal slices for LLM review. |
| sheet_formula_map | Groups identical formulas and aggregates ranges so the agent can spot patterns without sifting cell-by-cell. |
| formula_trace | Walks precedents/dependents recursively (with safe pagination) to explain how values propagate across sheets. |
| named_ranges | Surfaces named items, their scope, and target ranges to anchor reasoning in business terminology. |
| sheet_statistics | Captures distribution metrics, data density, and heuristics that describe how "busy" a sheet is. |
| find_formula | Searches formulas by text/regex, ideal for locating specific functions or references quickly. |
| scan_volatiles | Flags volatile functions and high-churn ranges so models can reason about recalculation risk. |
| sheet_styles | Summarizes style reuse and annotations, revealing which cells carry semantic emphasis or commentary. |
| get_manifest_stub | Emits a structured stub that downstream pipelines can drop into corpus manifests. |
| close_workbook | Explicitly evicts a workbook from the cache to free memory between exploratory sessions. |
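Each tool is invoked through the standard MCP `tools/call` JSON-RPC method. The sketch below builds such a request for list_workbooks; the `tool_call` helper is hypothetical, the filter names (`slug_prefix`, `path_glob`) are the ones this README mentions for list_workbooks, and the filter values and exact argument schema are assumptions.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC request IDs must be unique per session


def tool_call(name, arguments=None):
    """Build an MCP tools/call request envelope for the named tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


# Hypothetical filter values.
request = tool_call("list_workbooks", {"slug_prefix": "budget", "path_glob": "**/*.xlsx"})
body = json.dumps(request)
```

With the SSE transport, `body` would be POSTed to the message endpoint; the tool result arrives as a JSON-RPC response carrying the same `id`.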

Workspace Semantics

  • Workbooks are discovered relative to the configured workspace root. Subdirectories are preserved; use list_workbooks filters (slug_prefix, folder, path_glob) to focus the scan.
  • Single-workbook mode (--workbook) skips directory traversal and indexes only the specified file, while still providing the usual short ID aliases.
  • XLSX files are fully parsed through umya-spreadsheet. XLS/XLSB are enumerated and validated before load; unsupported structures are reported as MCP errors instead of crashing the server.
  • A bounded LRU cache keeps recently accessed workbooks warm while respecting memory limits.
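The bounded LRU behaviour, together with explicit eviction as done by close_workbook, can be sketched in a few lines. This is an illustration of the technique under assumed names (`WorkbookCache`, `get`, `close`), not the server's Rust implementation.

```python
from collections import OrderedDict


class WorkbookCache:
    """Bounded least-recently-used cache keyed by workbook ID."""

    def __init__(self, capacity=5):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._entries = OrderedDict()  # oldest entry first

    def get(self, workbook_id, load):
        """Return the cached workbook, loading it with `load(workbook_id)` on a miss."""
        if workbook_id in self._entries:
            self._entries.move_to_end(workbook_id)  # mark as most recently used
            return self._entries[workbook_id]
        workbook = load(workbook_id)
        self._entries[workbook_id] = workbook
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict the least recently used
        return workbook

    def close(self, workbook_id):
        """Explicitly evict a workbook; returns True if it was cached."""
        if workbook_id in self._entries:
            del self._entries[workbook_id]
            return True
        return False

    def cached_ids(self):
        return list(self._entries)


# Hypothetical loader standing in for real workbook parsing.
loads = []
def fake_load(workbook_id):
    loads.append(workbook_id)
    return {"id": workbook_id}

cache = WorkbookCache(capacity=2)
cache.get("a", fake_load)
cache.get("b", fake_load)
cache.get("a", fake_load)  # hit: "a" becomes most recent
cache.get("c", fake_load)  # miss: evicts "b"
```

After this sequence the cache holds "a" and "c", and "b" would be reloaded on its next access.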

Development

  • Format / lint using standard Rust tooling (cargo fmt, cargo clippy).
  • Run the full test suite with cargo test from the project root; integration tests synthesize workbooks on the fly via umya-spreadsheet fixtures.

When you open a pull request, GitHub Actions automatically runs the cross-platform test and build workflow defined in .github/workflows/ci.yml and publishes release binaries as artifacts.

Related Documentation

Design notes and deeper architectural context live under docs/:

  • mcp-server-design.md — server architecture and module responsibilities.
  • mcp-server-plan.md — roadmap, tool contracts, and CAPS approach.
  • mcp-rust-umya-analysis.md — backend decision record and XLSX-first rationale.
  • formualizer-parse-integration.md — formula parser integration details.