🎨 Image Gen MCP Server

"Fine. I'll do it myself." — Thanos (and also me, after trying five different MCP servers that couldn't mix-and-match image models)
I wanted a single, simple MCP server that lets agents generate and edit images across OpenAI, Google (Gemini/Imagen), Azure, Vertex, and OpenRouter—without yak‑shaving. So… here it is.


A multi‑provider Model Context Protocol (MCP) server for image generation and editing with a unified, type‑safe API. It returns MCP ImageContent blocks plus compact structured JSON so your client can route, log, or inspect results cleanly.

[!IMPORTANT] This README.md is the canonical reference for API, capabilities, and usage. Some /docs files may lag behind.


🧠 Why this exists

Because I couldn’t find an MCP server that spoke multiple image providers with one sane schema. Some only generated, some only edited, some required summoning three different CLIs at midnight.
This one prioritizes:

  • One schema across providers (AR & diffusion)
  • Minimal setup (uvx or pip, drop an mcp.json, done)
  • Type‑safe I/O with clear error shapes
  • Discoverability: ask the server what models are live via get_model_capabilities

✨ Features

  • Unified tools: generate_image, edit_image, get_model_capabilities
  • Providers: OpenAI, Azure OpenAI, Google Gemini, Vertex AI (Imagen & Gemini), OpenRouter
  • Output: MCP ImageContent blocks + small JSON metadata
  • Quality/size/orientation normalization
  • Masking support where engines allow it
  • Fail‑soft errors with stable shape: { code, message, details? }
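
That error shape is small enough to show inline. A hypothetical failure body (the field values here are illustrative, not captured from a real run):

{
  "code": "provider_error",
  "message": "Provider rejected the request",
  "details": { "status": 400, "provider": "openai" }
}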

🚀 Quick start (users)

Install and use as a published package.

# With uv (recommended)
uv add image-gen-mcp

# Or with pip
pip install image-gen-mcp

Then configure your MCP client.

Configure mcp.json

Use uvx to run in an isolated env with correct deps:

{
  "mcpServers": {
    "image-gen-mcp": {
      "command": "uvx",
      "args": ["--from", "image-gen-mcp", "image-gen-mcp"],
      "env": {
        "OPENAI_API_KEY": "your-key-here"
      }
    }
  }
}

First call

{
  "tool": "generate_image",
  "params": {
    "prompt": "A vibrant painting of a fox in a sunflower field",
    "provider": "openai",
    "model": "gpt-image-1"
  }
}

🧑‍💻 Quick start (developers)

Run from source for local development or contributions.

Prereqs

  • Python 3.12+
  • uv (recommended)

Install deps

uv sync --all-extras --dev

Environment

cp .env.example .env
# Add your keys

Run the server

# stdio (direct)
python -m image_gen_mcp.main

# via FastMCP CLI
fastmcp run image_gen_mcp/main.py:app

Local VS Code mcp.json for testing

If you use a VS Code extension or local tooling that reads .vscode/mcp.json, here's a safe example to run the local server (do NOT commit secrets):

{
  "servers": {
    "image-gen-mcp": {
      "command": "python",
      "args": ["-m", "image_gen_mcp.main"],
      "env": {
        "# NOTE": "Replace with your local keys for testing; do not commit.",
        "OPENROUTER_API_KEY": "__REPLACE_WITH_YOUR_KEY__"
      }
    }
  },
  "inputs": []
}

Use this to run the server from your workspace instead of installing the package from PyPI. For CI or shared repos, store secrets in the environment or a secret manager and avoid checking them into git.

Dev tasks

uv run pytest -v
uv run ruff check .
uv run black --check .
uv run pyright

🧰 Tools API

All tools take named parameters. Outputs include structured JSON (for metadata/errors) and MCP ImageContent blocks (for actual images).

generate_image

Create one or more images from a text prompt.

Example

{
  "prompt": "A vibrant painting of a fox in a sunflower field",
  "provider": "openai",
  "model": "gpt-image-1",
  "n": 2,
  "size": "M",
  "orientation": "landscape"
}

Parameters

| Field | Type | Description |
|---|---|---|
| prompt | str | Required. Text description. |
| provider | enum | Required. openai \| openrouter \| azure \| vertex \| gemini. |
| model | enum | Required. Model id (see matrix). |
| n | int | Optional. Default 1; provider limits apply. |
| size | enum | Optional. S \| M \| L. |
| orientation | enum | Optional. square \| portrait \| landscape. |
| quality | enum | Optional. draft \| standard \| high. |
| background | enum | Optional. transparent \| opaque (when supported). |
| negative_prompt | str | Optional. Used when the provider supports it. |
| directory | str | Optional. Filesystem directory where the server should save generated images. If omitted, a unique temp directory is used. |
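
As a fuller illustration, here is a request that exercises the optional fields (the prompt and directory values are made up; whether negative_prompt is honored depends on the provider):

{
  "prompt": "A minimalist poster of a lighthouse at dusk",
  "provider": "vertex",
  "model": "imagen-4.0-generate-001",
  "n": 2,
  "size": "L",
  "orientation": "portrait",
  "quality": "high",
  "negative_prompt": "text, watermarks",
  "directory": "/tmp/image-gen"
}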

edit_image

Edit an image with a prompt and optional mask.

Example

{
  "prompt": "Remove the background and make the subject wear a red scarf",
  "provider": "openai",
  "model": "gpt-image-1",
  "images": ["data:image/png;base64,..."],
  "mask": null
}

Parameters

| Field | Type | Description |
|---|---|---|
| prompt | str | Required. Edit instruction. |
| images | list<str> | Required. One or more source images (base64, data URL, or https URL). Most models use only the first image. |
| mask | str | Optional. Mask as base64/data URL/https URL. |
| provider | enum | Required. See above. |
| model | enum | Required. Model id (see matrix). |
| n | int | Optional. Default 1; provider limits apply. |
| size | enum | Optional. S \| M \| L. |
| orientation | enum | Optional. square \| portrait \| landscape. |
| quality | enum | Optional. draft \| standard \| high. |
| background | enum | Optional. transparent \| opaque. |
| negative_prompt | str | Optional. Negative prompt. |
| directory | str | Optional. Filesystem directory where the server should save edited images. If omitted, a unique temp directory is used. |
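
For a masked edit with a model that supports masks (see the Model Matrix below), a request might look like this (prompt illustrative, image payloads elided):

{
  "prompt": "Replace the masked region with a calm blue sky",
  "provider": "vertex",
  "model": "imagen-3.0-capability-001",
  "images": ["data:image/png;base64,..."],
  "mask": "data:image/png;base64,..."
}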

get_model_capabilities

Discover which providers/models are actually enabled based on your environment.

Example

{ "provider": "openai" }

Call with no params to list all enabled providers/models.

Output: a CapabilitiesResponse with providers, models, and features.
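
The authoritative field list is CapabilitiesResponse in image_gen_mcp/schema.py; as a rough sketch of the general shape (field names and values here are assumptions, not captured output):

{
  "providers": ["openai"],
  "models": [
    { "model": "gpt-image-1", "provider": "openai", "features": ["generate", "edit", "mask"] }
  ]
}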


🧭 Providers & Models

Routing is handled by a ModelFactory that maps model → engine. A compact, curated list keeps things understandable.
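Conceptually, the factory is a model-to-engine lookup. A minimal sketch of the idea (illustrative only; the real ModelFactory in image_gen_mcp/engines/ differs in names and detail):

# Illustrative sketch, not the actual ModelFactory implementation.
MODEL_TO_ENGINE = {
    "gpt-image-1": "openai",
    "dall-e-3": "openai",
    "gemini-2.5-flash-image-preview": "gemini",
    "imagen-4.0-generate-001": "imagen",
}


def resolve_engine(model: str) -> str:
    """Map a model id to the engine adapter that serves it."""
    try:
        return MODEL_TO_ENGINE[model]
    except KeyError:
        raise ValueError(f"Unknown model: {model}") from None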

Model Matrix

| Model | Family | Providers | Generate | Edit | Mask |
|---|---|---|---|---|---|
| gpt-image-1 | AR | openai, azure | ✅ | ✅ | ✅ (OpenAI/Azure) |
| dall-e-3 | Diffusion | openai, azure | ✅ | | |
| gemini-2.5-flash-image-preview | AR | gemini, vertex | ✅ | ✅ (maskless) | |
| imagen-4.0-generate-001 | Diffusion | vertex | ✅ | | |
| imagen-3.0-generate-002 | Diffusion | vertex | ✅ | | |
| imagen-4.0-fast-generate-001 | Diffusion | vertex | ✅ | | |
| imagen-4.0-ultra-generate-001 | Diffusion | vertex | ✅ | | |
| imagen-3.0-capability-001 | Diffusion | vertex | | ✅ | ✅ (mask via mask config) |
| google/gemini-2.5-flash-image-preview | AR | openrouter | ✅ | ✅ (maskless) | |

Provider Model Support

| Provider | Supported Models |
|---|---|
| openai | gpt-image-1, dall-e-3 |
| azure | gpt-image-1, dall-e-3 |
| gemini | gemini-2.5-flash-image-preview |
| vertex | imagen-4.0-generate-001, imagen-3.0-generate-002, gemini-2.5-flash-image-preview |
| openrouter | google/gemini-2.5-flash-image-preview |

🐍 Python client example

import asyncio
from fastmcp import Client


async def main():
    # FastMCP infers a stdio transport from the .py path and launches
    # the server itself for the duration of this context manager.
    async with Client("image_gen_mcp/main.py") as client:
        # 1) Capabilities
        caps = await client.call_tool("get_model_capabilities")
        print("Capabilities:", caps.structured_content or caps.content)

        # 2) Generate
        gen_result = await client.call_tool(
            "generate_image",
            {
                "prompt": "a watercolor fox in a forest, soft light",
                "provider": "openai",
                "model": "gpt-image-1",
            },
        )
        print("Generate Result:", gen_result.structured_content)
        print("Image blocks:", len(gen_result.content))


asyncio.run(main())
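
An edit_image call follows the same pattern. This fragment belongs inside the same async with Client(...) block above (image payload elided):

        # 3) Edit (add inside the async with block above)
        edit_result = await client.call_tool(
            "edit_image",
            {
                "prompt": "Add a red scarf to the fox",
                "provider": "openai",
                "model": "gpt-image-1",
                "images": ["data:image/png;base64,..."],
            },
        )
        print("Edit Result:", edit_result.structured_content)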

🔐 Environment variables

Set only what you need:

| Variable | Required for | Description |
|---|---|---|
| OPENAI_API_KEY | OpenAI | API key for OpenAI. |
| AZURE_OPENAI_API_KEY | Azure OpenAI | Azure OpenAI key. |
| AZURE_OPENAI_ENDPOINT | Azure OpenAI | Azure endpoint URL. |
| AZURE_OPENAI_API_VERSION | Azure OpenAI | Optional; default 2024-02-15-preview. |
| GEMINI_API_KEY | Gemini | Gemini Developer API key. |
| OPENROUTER_API_KEY | OpenRouter | OpenRouter API key. |
| VERTEX_PROJECT | Vertex AI | GCP project id. |
| VERTEX_LOCATION | Vertex AI | GCP region (e.g. us-central1). |
| VERTEX_CREDENTIALS_PATH | Vertex AI | Optional path to GCP JSON credentials; ADC supported. |
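
A minimal .env for an OpenRouter-only setup (placeholder value; set only the providers you actually use):

# .env (never commit real keys)
OPENROUTER_API_KEY=__REPLACE_WITH_YOUR_KEY__
# OPENAI_API_KEY=...
# VERTEX_PROJECT=...
# VERTEX_LOCATION=us-central1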

🏃 Running via FastMCP CLI

Supports multiple transports:

  • stdio: fastmcp run image_gen_mcp/main.py:app
  • SSE (HTTP): fastmcp run image_gen_mcp/main.py:app --transport sse --host 127.0.0.1 --port 8000
  • HTTP: fastmcp run image_gen_mcp/main.py:app --transport http --host 127.0.0.1 --port 8000 --path /mcp
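
With the HTTP transport, a FastMCP client can connect by URL. A minimal sketch, assuming the server was started with the http command above:

import asyncio
from fastmcp import Client


async def main():
    # Streamable HTTP transport is inferred from the URL.
    async with Client("http://127.0.0.1:8000/mcp") as client:
        caps = await client.call_tool("get_model_capabilities")
        print(caps.structured_content)


asyncio.run(main())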

Design notes

  • Schema: public contract in image_gen_mcp/schema.py (Pydantic).
  • Engines: modular adapters in image_gen_mcp/engines/, selected by ModelFactory.
  • Capabilities: discovered dynamically via image_gen_mcp/settings.py.
  • Errors: stable JSON error { code, message, details? }.
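
For flavor, the documented error shape maps naturally onto a small Pydantic model (a sketch of the contract style, not the actual contents of schema.py):

from pydantic import BaseModel


class ToolError(BaseModel):
    """Sketch of the stable error shape: { code, message, details? }."""

    code: str
    message: str
    details: dict | None = None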

⚠️ Testing remarks

I tested this project locally using the OpenRouter-backed model only. I could not access Gemini or OpenAI from my location (Hong Kong) due to regional restrictions — thanks, US government — so I couldn't fully exercise those providers.

Because of that limitation, the gemini/vertex and openai (including Azure) adapters may contain bugs or untested edge cases. If you use those providers and find issues, please open an issue or, even better, submit a pull request with a fix — contributions are welcome.

Suggested info to include when filing an issue:

  • Your provider and model (e.g., openai:gpt-image-1, vertex:imagen-4.0-generate-001)
  • Full stderr/server logs showing the error
  • Minimal reproduction steps or a short test script

Thanks — and PRs welcome!


🤝 Contributing & Releases

PRs welcome! Please run tests and linters locally.

Release process (GitHub Actions)

  1. Automated (recommended)

    • Actions → Manual Release
    • Pick version bump: patch / minor / major
    • The workflow tags, builds the changelog, and publishes to PyPI
  2. Manual

    • git tag vX.Y.Z
    • git push origin vX.Y.Z
    • Create a GitHub Release from the tag

📄 License

Apache-2.0 — see LICENSE.