replicate-mcp-server

rawveg/replicate-mcp-server


A modern, pagination-aware Model Context Protocol server for the Replicate HTTP API, optimized for agent workflows and large language models.

Replicate MCP Server (STDIO)


A Model Context Protocol (MCP) server for the Replicate HTTP API. This server exposes Replicate endpoints as MCP tools so agentic LLM clients can call them safely and efficiently.

Originally generated from the Replicate OpenAPI spec using the excellent openapi-mcp-generator project, this server was then refined to:

  • Remove unused transports and run STDIO-only (ideal for Claude, Windsurf, Cursor, etc.)
  • Enforce universal pagination controls to avoid overrunning LLM context windows
  • Trim and cap outputs with clear pagination notes for automatic follow-up calls
  • Standardize authentication via REPLICATE_API_TOKEN

Link to the generator used initially: https://github.com/harsha-iiiv/openapi-mcp-generator

Features ✨

  • STDIO-only MCP transport
  • Coverage of Replicate endpoints defined in the OpenAPI spec
  • Universal pagination parameters on every tool:
    • max_results (local trimming; default configurable)
    • cursor (pass-through to API if supported)
  • Output safeguards:
    • Local trimming for arrays and { results: [...] } payloads
    • Clear pagination note including next/previous cursors (if present)
    • Global output character cap with truncation notice
  • Authentication with a single env var: REPLICATE_API_TOKEN
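As a rough illustration of the global output cap listed above, the following sketch truncates a text payload and appends a notice. The function name `capOutput` and the wording of the notice are assumptions made for this example, not taken from the server's source.

```typescript
// Sketch only: capOutput and the notice text are hypothetical.
// Returns the text unchanged when it fits, otherwise the first maxChars
// characters followed by a one-line truncation notice.
function capOutput(text: string, maxChars: number): string {
  if (text.length <= maxChars) {
    return text;
  }
  const omitted = text.length - maxChars;
  return `${text.slice(0, maxChars)}\n[Output truncated: ${omitted} characters omitted]`;
}
```

Appending the notice after the cut, rather than silently dropping text, lets the calling model know that a narrower follow-up request may be needed.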

Requirements 📦

Installation ⚙️

# From project root
npm install
npm run build

Quick Start 🚀

git clone <this-repo>
cd replicate-mcp-server
npm install
npm run build

# Run (typically launched by an MCP client, but you can start it directly)
node build/index.js

Configuration 🔧

Create a .env file or set environment variables in your MCP client configuration. Supported variables:

  • REPLICATE_API_TOKEN (required): your Replicate API token
  • MCP_MAX_RESULTS (optional, default 25): max items returned per tool call after local trimming
  • MCP_MAX_RESULTS_HARD_LIMIT (optional, default 100): absolute ceiling for max_results
  • MCP_MAX_OUTPUT_CHARS (optional, default 12000): cap on the final text payload returned to the client

Example .env:

REPLICATE_API_TOKEN=your_api_token_here
MCP_MAX_RESULTS=25
MCP_MAX_RESULTS_HARD_LIMIT=100
MCP_MAX_OUTPUT_CHARS=12000

Environment variables at a glance:

| Name | Required | Default | Purpose |
| --- | --- | --- | --- |
| REPLICATE_API_TOKEN | Yes | (none) | Bearer token for Replicate API requests. |
| MCP_MAX_RESULTS | No | 25 | Local trim size for arrays/results to control LLM context usage. |
| MCP_MAX_RESULTS_HARD_LIMIT | No | 100 | Absolute cap; max_results cannot exceed this. |
| MCP_MAX_OUTPUT_CHARS | No | 12000 | Caps final text payload to prevent overruns; appends a truncation note if exceeded. |
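To make the interplay between the defaults and the hard limit concrete, here is a minimal sketch of how these variables might be resolved. The names `ServerConfig`, `loadConfig`, and `resolveMaxResults` are hypothetical, and the fallback behaviour for invalid values is an assumption; the actual logic in src/index.ts may differ.

```typescript
// Sketch only: ServerConfig, loadConfig and resolveMaxResults are hypothetical names.
interface ServerConfig {
  apiToken: string;
  maxResults: number;
  maxResultsHardLimit: number;
  maxOutputChars: number;
}

type Env = Record<string, string | undefined>;

// Parses a positive integer, falling back to the default for missing or invalid values.
function parsePositiveInt(value: string | undefined, fallback: number): number {
  const parsed = parseInt(value ?? "", 10);
  return !isNaN(parsed) && parsed > 0 ? parsed : fallback;
}

function loadConfig(env: Env): ServerConfig {
  const apiToken = env["REPLICATE_API_TOKEN"];
  if (!apiToken) {
    throw new Error("REPLICATE_API_TOKEN is required");
  }
  const maxResultsHardLimit = parsePositiveInt(env["MCP_MAX_RESULTS_HARD_LIMIT"], 100);
  return {
    apiToken,
    maxResultsHardLimit,
    // Assumption: the default page size is itself clamped to the hard limit.
    maxResults: Math.min(parsePositiveInt(env["MCP_MAX_RESULTS"], 25), maxResultsHardLimit),
    maxOutputChars: parsePositiveInt(env["MCP_MAX_OUTPUT_CHARS"], 12000),
  };
}

// Picks the effective max_results for one tool call: the requested value if
// given, otherwise the configured default, never more than the hard limit.
function resolveMaxResults(requested: number | undefined, config: ServerConfig): number {
  const wanted = requested !== undefined && requested > 0 ? Math.floor(requested) : config.maxResults;
  return Math.min(wanted, config.maxResultsHardLimit);
}
```

In a Node.js process this would typically be called as `loadConfig(process.env)`; taking the environment as a parameter keeps the sketch easy to test.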

Pagination by design

Every tool accepts two optional arguments:

  • max_results: Limits how many items are returned in the current MCP response. Useful when list endpoints return many items.
  • cursor: Cursor from a previous API response. If the Replicate endpoint supports cursor-based pagination, this is passed through as a query parameter.

The server also:

  • Locally trims arrays or results arrays to max_results to reduce token usage
  • Adds a "Pagination:" note before the JSON that:
    • Indicates local trimming occurred (if applicable)
    • Surfaces next_cursor and/or previous_cursor parsed from API URLs when present
  • Caps the total characters returned using MCP_MAX_OUTPUT_CHARS
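The trimming and note-building steps above can be sketched roughly as follows. This assumes list responses shaped like `{ results, next, previous }`, where `next` and `previous` are URLs carrying a `cursor` query parameter; the function names and the exact wording of the note are illustrative, not the server's actual output.

```typescript
// Sketch only: names, note wording and the assumed payload shape are illustrative.
interface TrimmedResponse {
  payload: unknown;
  note: string | null;
}

// Pulls the `cursor` query parameter out of a pagination URL, if there is one.
function extractCursor(url: unknown): string | null {
  if (typeof url !== "string" || url.length === 0) {
    return null;
  }
  try {
    return new URL(url).searchParams.get("cursor");
  } catch {
    return null; // not a parseable absolute URL
  }
}

function trimResponse(data: unknown, maxResults: number): TrimmedResponse {
  const notes: string[] = [];
  let payload: unknown = data;

  if (Array.isArray(data)) {
    // Bare array payload: keep only the first maxResults items.
    if (data.length > maxResults) {
      notes.push(`showing ${maxResults} of ${data.length} items (trimmed locally)`);
      payload = data.slice(0, maxResults);
    }
  } else if (typeof data === "object" && data !== null) {
    const record = data as Record<string, unknown>;
    const results = record["results"];
    // Object payload: trim its `results` array and leave other fields intact.
    if (Array.isArray(results) && results.length > maxResults) {
      notes.push(`showing ${maxResults} of ${results.length} items (trimmed locally)`);
      payload = { ...record, results: results.slice(0, maxResults) };
    }
    const next = extractCursor(record["next"]);
    const previous = extractCursor(record["previous"]);
    if (next !== null) notes.push(`next_cursor=${next}`);
    if (previous !== null) notes.push(`previous_cursor=${previous}`);
  }

  return { payload, note: notes.length > 0 ? `Pagination: ${notes.join("; ")}` : null };
}
```

Returning the note separately from the payload leaves the caller free to place it before the JSON, as the server does, and to apply the character cap to the combined text afterwards.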

Example tool call (conceptual):

{
  "name": "models_list",
  "arguments": { "max_results": 10 }
}

Then continue with the next page using the cursor shown in the pagination note:

{
  "name": "models_list",
  "arguments": { "cursor": "<next_cursor_value>", "max_results": 10 }
}
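For clients that want to walk several pages automatically, the two calls above generalize to a loop. The sketch below is hypothetical: `ToolCaller` stands in for whatever function your MCP client uses to invoke a tool and extract the items and next cursor from the response, and `collectPages` is not part of this project.

```typescript
// Sketch only: Page, ToolCaller and collectPages are hypothetical names.
interface Page<T> {
  items: T[];
  nextCursor: string | null;
}

type ToolArgs = { max_results: number; cursor?: string };
type ToolCaller<T> = (args: ToolArgs) => Promise<Page<T>>;

// Follows next cursors until none is returned or maxPages calls have been made.
async function collectPages<T>(
  callTool: ToolCaller<T>,
  pageSize: number,
  maxPages: number
): Promise<T[]> {
  const all: T[] = [];
  let cursor: string | undefined = undefined;
  for (let page = 0; page < maxPages; page++) {
    const args: ToolArgs = { max_results: pageSize };
    if (cursor !== undefined) {
      args.cursor = cursor; // continue from where the previous page ended
    }
    const result: Page<T> = await callTool(args);
    all.push(...result.items);
    if (result.nextCursor === null) {
      break; // last page reached
    }
    cursor = result.nextCursor;
  }
  return all;
}
```

The `maxPages` bound is a deliberate guard: it keeps an agent from paging through a very large listing and filling its context window.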

Running ▶️

This is an MCP server designed to be launched by an MCP-enabled client over STDIO. You can also run it manually:

node build/index.js

…but typical usage is via a client configuration (examples below).

MCP client configuration examples 🧩

The following is a generic MCP configuration block you can place in your client's config (adjust the path to build/index.js and set your token):

{
  "mcpServers": {
    "replicate": {
      "type": "stdio",
      "command": "node",
      "args": [
        "/path/to/build/index.js"
      ],
      "env": {
        "REPLICATE_API_TOKEN": "replicate_api_key_here"
      }
    }
  }
}

Claude Desktop (macOS)

Put/merge this JSON into ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "replicate": {
      "type": "stdio",
      "command": "node",
      "args": [
        "/absolute/path/to/build/index.js"
      ],
      "env": {
        "REPLICATE_API_TOKEN": "replicate_api_key_here"
      }
    }
  }
}

Restart Claude Desktop after saving.
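If the config file already lists other MCP servers, the block has to be merged rather than pasted over the file. One possible way to do that is sketched here; `mergeReplicateServer` is a hypothetical helper, and reading and writing the JSON file is left to the caller.

```typescript
// Sketch only: mergeReplicateServer is a hypothetical helper, not part of this project.
interface StdioServerEntry {
  type: "stdio";
  command: string;
  args: string[];
  env: Record<string, string>;
}

interface ClientConfig {
  mcpServers?: Record<string, unknown>;
  [key: string]: unknown;
}

// Returns a copy of the existing config with a "replicate" server entry added
// (or replaced), leaving other servers and unrelated settings untouched.
function mergeReplicateServer(
  existing: ClientConfig,
  indexJsPath: string,
  apiToken: string
): ClientConfig {
  const entry: StdioServerEntry = {
    type: "stdio",
    command: "node",
    args: [indexJsPath],
    env: { REPLICATE_API_TOKEN: apiToken },
  };
  return {
    ...existing,
    mcpServers: { ...(existing.mcpServers ?? {}), replicate: entry },
  };
}
```

The result can be serialized with `JSON.stringify(config, null, 2)` and written back to the config file.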

Windsurf

Add/merge the same MCP block into your Windsurf MCP settings. The structure is identical to the generic JSON shown above. Ensure the args path matches your workspace build output and that REPLICATE_API_TOKEN is set.

Cursor

Cursor supports MCP configuration via its settings. Provide the same JSON structure under your MCP configuration section. Use an absolute path to build/index.js and set REPLICATE_API_TOKEN in the env map.

Note: Exact UI locations for MCP configuration vary by version of each client. The JSON structure above is compatible across MCP clients that use STDIO.

Security

  • Do not commit your API token to source control.
  • Prefer using per-user environment variables in your MCP client configuration.
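In the same spirit, code that handles the token can fail fast when it is missing and avoid printing it in full. The helpers below are a hypothetical sketch: `buildAuthHeaders` reflects the Bearer scheme mentioned in the configuration table, and `redactToken` is an assumed convenience for log output, not something this server is known to provide.

```typescript
// Sketch only: buildAuthHeaders and redactToken are hypothetical helpers.

// Builds the Authorization header, rejecting a missing or blank token early.
function buildAuthHeaders(token: string | undefined): Record<string, string> {
  if (token === undefined || token.trim().length === 0) {
    throw new Error("REPLICATE_API_TOKEN is not set");
  }
  return { Authorization: `Bearer ${token.trim()}` };
}

// Masks a token for log output, keeping only the last four characters.
function redactToken(token: string): string {
  return token.length <= 4 ? "****" : `****${token.slice(-4)}`;
}
```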

Development 🛠️

  • Build: npm run build
  • Typecheck: npm run typecheck

Key files:

  • src/index.ts: MCP server implementation
    • STDIO transport only
    • Pagination and output-capping logic
    • Authentication via REPLICATE_API_TOKEN
  • .env.example: sample environment variables

Acknowledgements 🙏

License 📄

This project is licensed under the MIT License.

See the LICENSE file in the repository root for the full text.