gemini-image-mcp by ikamman - MCP Server

Gemini Image MCP Server

A powerful MCP server for image analysis, generation, and editing using Google's Gemini API


✨ Features

🖼️ Image Analysis - Analyze images from URLs or local files using Gemini 2.5 Flash
🎨 Image Generation - Generate high-quality images from text prompts
✏️ Image Editing - Edit existing images with natural language instructions
🔍 Custom Prompts - Use system and user prompts for specific analysis needs
🚀 High Performance - Built with Rust for speed and reliability
🛡️ Robust Error Handling - Comprehensive error handling and validation
📡 MCP Protocol - Seamless integration with MCP-compatible clients
🌐 Multi-Format Support - JPEG, PNG, GIF, WebP, and more

🚀 Quick Start

Using npx (Recommended)

npx @ikamman/gemini-image-mcp --gemini-api-key "your-api-key"

Global Installation

npm install -g @ikamman/gemini-image-mcp
gemini-image-mcp --help

📦 Installation

Prerequisites

Gemini API Key: required (keys are issued through Google AI Studio)
Node.js: 14+ (for npm installation)
Rust: 1.70+ (for building from source)

Option 1: Install via npm (Automated with cargo-dist)

npm install -g @ikamman/gemini-image-mcp

Option 2: Build from Source

git clone https://github.com/ikamman/gemini-image-mcp.git
cd gemini-image-mcp
cargo build --release

🔧 Configuration

Set your Gemini API key using one of these methods:

Environment Variable

export GEMINI_API_KEY="your-api-key-here"

Command Line Argument

gemini-image-mcp --gemini-api-key "your-api-key"

Using .env File

echo "GEMINI_API_KEY=your-api-key-here" > .env
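
Before launching the server it can help to confirm that a key is actually visible. The sketch below is a hypothetical pre-flight check, not part of the server: the helper names `parse_dotenv` and `find_api_key` are made up for illustration, and checking the environment before the .env file is an assumption, since the precedence between the three methods is not stated here.

```python
import os
from pathlib import Path


def parse_dotenv(text):
    """Parse simple KEY=value lines; blank lines and # comments are skipped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def find_api_key(environ=None, dotenv_path=".env"):
    """Return (key, source). Environment is checked first (assumed order)."""
    environ = os.environ if environ is None else environ
    if environ.get("GEMINI_API_KEY"):
        return environ["GEMINI_API_KEY"], "environment"
    path = Path(dotenv_path)
    if path.is_file():
        key = parse_dotenv(path.read_text()).get("GEMINI_API_KEY")
        if key:
            return key, ".env"
    return None, None
```

Calling `find_api_key()` with no arguments inspects the current process environment and a .env file in the working directory; a result of `(None, None)` means neither source provides a key.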

📖 Usage

As MCP Server

The server communicates via JSON-RPC over stdio:

gemini-image-mcp

Integration with Claude Desktop

Using npx (No Installation Required)

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "gemini-image-mcp": {
      "command": "npx",
      "args": ["@ikamman/gemini-image-mcp"],
      "env": {
        "GEMINI_API_KEY": "your-api-key-here"
      }
    }
  }
}

Using Global Installation

First install globally:

npm install -g @ikamman/gemini-image-mcp

Then add to your claude_desktop_config.json:

{
  "mcpServers": {
    "gemini-image-mcp": {
      "command": "gemini-image-mcp",
      "env": {
        "GEMINI_API_KEY": "your-api-key-here"
      }
    }
  }
}

Alternative: Using Full Path

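If the client cannot find npx on its PATH, use the full path to npx instead. The path below is an example; run `which npx` to find the one on your system: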

{
  "mcpServers": {
    "gemini-image-mcp": {
      "command": "/usr/local/bin/npx",
      "args": ["@ikamman/gemini-image-mcp"],
      "env": {
        "GEMINI_API_KEY": "your-api-key-here"
      }
    }
  }
}
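
The three configurations above differ only in the `command` and `args` fields. As a rough sketch, the entry could be generated and merged into an existing config without disturbing other servers. The function names `server_entry` and `merge_into_config` are hypothetical, and the code only builds the dictionary; locating and writing claude_desktop_config.json is left out because its path depends on the platform.

```python
import json


def server_entry(api_key, use_npx=True, command=None):
    """Build one mcpServers entry in the same shape as the configs above."""
    if use_npx:
        # `command` may be overridden with a full path such as /usr/local/bin/npx.
        entry = {"command": command or "npx", "args": ["@ikamman/gemini-image-mcp"]}
    else:
        entry = {"command": command or "gemini-image-mcp"}
    entry["env"] = {"GEMINI_API_KEY": api_key}
    return entry


def merge_into_config(config, entry, name="gemini-image-mcp"):
    """Return a copy of `config` with the entry added; other servers are kept."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


def to_json(config):
    """Serialize with two-space indentation, matching the examples above."""
    return json.dumps(config, indent=2)
```

For example, `to_json(merge_into_config({}, server_entry("your-api-key-here")))` yields the same structure as the npx configuration shown earlier.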

Manual Testing

# Test image analysis
echo '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"analyze_image","arguments":{"image_source":"https://example.com/image.jpg","user_prompt":"What do you see?"}}}' | gemini-image-mcp

# Test image generation
echo '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"generate_image","arguments":{"user_prompt":"A sunset over mountains","output_path":"./sunset.png"}}}' | gemini-image-mcp
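
Hand-writing these one-line JSON payloads is error-prone, so a small helper can build them instead. This is a minimal sketch: `tool_call_request` is a hypothetical name, and the function only produces the request text in the same shape as the echo commands above. It does not start or talk to the server.

```python
import json


def tool_call_request(request_id, tool_name, arguments):
    """Serialize a JSON-RPC 2.0 tools/call request as one compact line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    # Compact separators keep the request on a single line, as in the echo examples.
    return json.dumps(message, separators=(",", ":"))
```

The returned string can be piped to the server's stdin in place of the echo payload. Note that the MCP specification defines an `initialize` handshake that clients normally perform before `tools/call`; whether this server requires it for one-off tests is not stated here.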

🔗 API Reference

🔍 `analyze_image`

Analyzes images using Google's Gemini API.

Parameters:

image_source (required) - Image URL or local file path
system_prompt (optional) - System instructions for analysis
user_prompt (optional) - Analysis question (default: "Caption this image.")

Example:

{
  "image_source": "https://example.com/photo.jpg",
  "system_prompt": "You are a professional photographer.",
  "user_prompt": "Analyze the composition and lighting of this image."
}

🎨 `generate_image`

Generates images from text descriptions.

Parameters:

user_prompt (required) - Description of the image to generate
output_path (required) - Path where the image should be saved
system_prompt (optional) - Additional generation guidelines

Example:

{
  "user_prompt": "A cyberpunk cityscape at night with neon lights",
  "output_path": "./generated_city.png",
  "system_prompt": "Create a high-quality, detailed image."
}

✏️ `edit_image`

Edits existing images using natural language instructions.

Parameters:

image_source (required) - Source image URL or file path
user_prompt (required) - Editing instructions
output_path (required) - Path for the edited image
system_prompt (optional) - Additional editing guidelines

Example:

{
  "image_source": "./my_photo.jpg",
  "user_prompt": "Add a vintage filter and increase the warmth",
  "output_path": "./edited_photo.jpg"
}
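
The parameter lists for the three tools can be encoded as data and checked on the client side before a request is sent. The sketch below takes the required and optional names from the reference above; the `check_arguments` helper and its messages are hypothetical, and the server's own validation may differ.

```python
# Required and optional parameters, copied from the API reference above.
TOOL_PARAMS = {
    "analyze_image": {
        "required": ["image_source"],
        "optional": ["system_prompt", "user_prompt"],
    },
    "generate_image": {
        "required": ["user_prompt", "output_path"],
        "optional": ["system_prompt"],
    },
    "edit_image": {
        "required": ["image_source", "user_prompt", "output_path"],
        "optional": ["system_prompt"],
    },
}


def check_arguments(tool_name, arguments):
    """Return a list of problems; an empty list means the arguments look valid."""
    spec = TOOL_PARAMS.get(tool_name)
    if spec is None:
        return [f"unknown tool: {tool_name}"]
    problems = [
        f"missing required parameter: {name}"
        for name in spec["required"]
        if name not in arguments
    ]
    allowed = set(spec["required"]) | set(spec["optional"])
    problems += [
        f"unexpected parameter: {name}"
        for name in sorted(arguments)
        if name not in allowed
    ]
    return problems
```

For instance, calling `check_arguments("edit_image", {"image_source": "./my_photo.jpg"})` reports that `user_prompt` and `output_path` are missing.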

💡 Examples

Image Analysis Examples

# Analyze a webpage screenshot
gemini-image-mcp analyze "https://example.com/screenshot.png" "What UI elements do you see?"

# Analyze a local photo
gemini-image-mcp analyze "./vacation.jpg" "Describe the location and activities"

# Technical analysis
gemini-image-mcp analyze "./chart.png" "Extract the key data points and trends"

Image Generation Examples

# Generate artwork
gemini-image-mcp generate "Abstract watercolor painting of a forest" "./forest.png"

# Generate technical diagrams
gemini-image-mcp generate "Network architecture diagram showing microservices" "./diagram.png"

# Generate marketing assets
gemini-image-mcp generate "Modern logo for a tech startup, minimalist design" "./logo.png"

Image Editing Examples

# Basic editing
gemini-image-mcp edit "./portrait.jpg" "Remove the background" "./portrait_nobg.png"

# Style changes
gemini-image-mcp edit "./photo.jpg" "Convert to black and white with high contrast" "./photo_bw.jpg"

# Object manipulation
gemini-image-mcp edit "./room.jpg" "Add a plant in the corner" "./room_with_plant.jpg"

🏗️ Supported Image Formats

Format	Extensions	Analysis	Generation	Editing
JPEG	`.jpg`, `.jpeg`	✅	✅	✅
PNG	`.png`	✅	✅	✅
GIF	`.gif`	✅	❌	✅
WebP	`.webp`	✅	❌	✅
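
The table above can be turned into a small lookup so a client can reject unsupported combinations early. This is a sketch under the assumption that the format is judged by file extension alone; `is_supported` is a hypothetical helper, and the operation names "analysis", "generation", and "editing" simply mirror the table's column headers.

```python
from pathlib import Path

ALL_OPERATIONS = {"analysis", "generation", "editing"}

# One entry per extension in the table; GIF and WebP are not generation targets.
FORMAT_SUPPORT = {
    ".jpg": ALL_OPERATIONS,
    ".jpeg": ALL_OPERATIONS,
    ".png": ALL_OPERATIONS,
    ".gif": {"analysis", "editing"},
    ".webp": {"analysis", "editing"},
}


def is_supported(path, operation):
    """Check the file extension of `path` against the table for `operation`."""
    extension = Path(path).suffix.lower()
    return operation in FORMAT_SUPPORT.get(extension, set())
```

The check is case-insensitive, so `photo.JPG` is treated like `photo.jpg`. URLs with query strings would need the path component extracted first.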

⚡ Performance & Limits

Image Size: Up to 20MB per image
Concurrent Requests: Handled via async Rust runtime
Rate Limits: Follows Gemini API rate limits
Response Time: Typically 2-10 seconds depending on image size and complexity
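
A client can check the size limit locally before uploading a large file. The sketch below assumes "20MB" means 20 × 1024 × 1024 bytes; the exact threshold the server or the Gemini API enforces is not specified here, and both helper names are hypothetical.

```python
import os

# Assumption: the 20MB limit is read as 20 MiB.
MAX_IMAGE_BYTES = 20 * 1024 * 1024


def within_size_limit(size_bytes, limit=MAX_IMAGE_BYTES):
    """True for a non-empty payload no larger than the limit."""
    return 0 < size_bytes <= limit


def file_within_limit(path, limit=MAX_IMAGE_BYTES):
    """Check a local file's size on disk against the limit."""
    return within_size_limit(os.path.getsize(path), limit)
```

Checking before the call avoids spending a request, and part of the rate limit, on an image that would be rejected anyway.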

🛠️ Development

Building from Source

git clone https://github.com/ikamman/gemini-image-mcp.git
cd gemini-image-mcp
cargo build --release

Running Tests

cargo test

Testing with Sample Images

cargo test -- --nocapture

Project Structure

gemini-image-mcp/
├── src/
│   ├── main.rs              # Application entry point
│   ├── jsonrpc.rs          # JSON-RPC handler
│   ├── gemini_client.rs    # Gemini API client
│   ├── image_service.rs    # Image processing service
│   ├── validation.rs       # Input validation
│   └── error.rs            # Error handling
├── test/                   # Sample images for testing
├── Cargo.toml              # Rust dependencies & cargo-dist config
└── .github/workflows/      # Automated CI/CD with cargo-dist

🤝 Contributing

We welcome contributions! Please see our contributing guide for details.

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🆘 Troubleshooting

Common Issues

❌ "Missing GEMINI_API_KEY"

export GEMINI_API_KEY="your-api-key-here"

❌ "Image not found" for URLs

Ensure the URL is publicly accessible
Check your internet connection
Verify the image format is supported

❌ "Binary not found" after npm install

Try reinstalling: npm uninstall -g @ikamman/gemini-image-mcp && npm install -g @ikamman/gemini-image-mcp
The binary is automatically managed by cargo-dist

❌ Rate limit errors

Wait a moment before retrying
Consider implementing exponential backoff in your client
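
The exponential backoff suggested above could look roughly like this on the client side. It is a generic sketch, not this server's behavior: `call` and `is_rate_limited` are hypothetical callables the caller supplies, and the defaults (5 retries, 1 second base, 30 second cap, random jitter) are illustrative choices.

```python
import random
import time


def backoff_delays(max_retries=5, base=1.0, cap=30.0):
    """Yield delays in seconds that double each attempt, up to `cap`."""
    for attempt in range(max_retries):
        yield min(cap, base * (2 ** attempt))


def call_with_backoff(call, is_rate_limited, max_retries=5, base=1.0,
                      cap=30.0, sleep=time.sleep):
    """Repeat `call()` while its result is rate limited, waiting longer each time."""
    result = call()
    for delay in backoff_delays(max_retries, base, cap):
        if not is_rate_limited(result):
            return result
        # Jitter: sleep a random fraction of the delay so that several
        # clients do not all retry at the same instant.
        sleep(random.uniform(0, delay))
        result = call()
    # Retries exhausted; return the last result for the caller to inspect.
    return result
```

The `sleep` parameter exists so the waiting can be replaced in tests. In normal use the default `time.sleep` applies.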

Getting Help

🙏 Acknowledgments

Model Context Protocol - For the excellent MCP standard
Google Gemini - For the powerful AI capabilities
Rust MCP SDK - For the Rust implementation

Made with ❤️ using Rust and Google Gemini

