antoniolg/gemini-image-mcp-server
The Gemini Image MCP Server is a Model Context Protocol server designed for image generation and editing using Google Gemini AI, optimized for social media image creation.
Gemini Image MCP Server
A Model Context Protocol (MCP) server for image generation and editing using Google Gemini AI. Supports optional context images to guide results and now includes a dedicated edit workflow. Optimized for creating eye‑catching social media images with square (1:1) format by default.
Features
- ✨ Image generation with Google Gemini AI
- 🎨 Multiple aspect ratios (1:1, 16:9, 9:16, 4:3, 3:4)
- 📱 Optimized for social media with 1:1 format by default
- 🎯 Custom style support
- 🧩 Context images to guide generation
- ✏️ Dedicated edit tool for modifying existing assets without juggling extra options
- 🏷️ Watermark support - Overlay watermark images on generated results
- 💾 Automatic saving of images to local files
- 📁 Flexible output path configuration
- 🛡️ Customizable safety settings
Installation
- Clone this repository
- Install dependencies:
npm install
- Build the project:
npm run build
Configuration
Environment Variables
You need to configure your Google AI API key:
export GOOGLE_API_KEY="your-api-key-here"
Getting Google AI API Key
- Go to Google AI Studio
- Create a new API key
- Copy the key and set it as an environment variable
Client Configuration
{
  "servers": {
    "gemini-image": {
      "command": "node",
      "args": ["/full/path/to/project/dist/index.js"],
      "env": {
        "GOOGLE_API_KEY": "your-api-key-here"
      }
    }
  }
}
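If you generate the client configuration from a script, the entry above can be built programmatically. This is only a sketch: `buildClientConfig` and the two interfaces are hypothetical names that do not come from this project, and the top-level `servers` key simply mirrors the example above, so check which key your MCP client expects.

```typescript
// Hypothetical helper types for illustration; not part of this project.
interface McpServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

interface ClientConfig {
  servers: Record<string, McpServerEntry>;
}

// Builds the configuration shown above for a given checkout location.
function buildClientConfig(projectRoot: string, apiKey: string): ClientConfig {
  // Strip trailing slashes so the joined path has exactly one separator.
  const root = projectRoot.replace(/\/+$/, "");
  return {
    servers: {
      "gemini-image": {
        command: "node",
        args: [`${root}/dist/index.js`],
        env: { GOOGLE_API_KEY: apiKey },
      },
    },
  };
}
```

Serializing the result with `JSON.stringify(config, null, 2)` gives a file in the same shape as the example.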
Command Line Interface
In addition to the MCP server, the project now ships with a CLI for quick terminal-friendly workflows.
- Build the project once:
npm run build
- Make sure GOOGLE_API_KEY is set in your environment.
- Explore the CLI:
node dist/cli.js --help
# or, after publishing/packing:
gemini-image --help
Commands
- gemini-image generate: Create new imagery from a text prompt.
gemini-image generate --prompt "A banana astronaut on Mars" --output ./images/
- gemini-image edit: Apply instructions to an existing image.
gemini-image edit --prompt "Add neon lights to the skyline" --input ./images/city.png
Both commands support --help for detailed, friendly option descriptions. CLI option names are intentionally concise (for example --prompt, --context, --input) so they are easier to memorize than the MCP tool identifiers.
Available Tools
generate_image
Creates a brand-new image from a text description, optionally using one or more images as visual context. Use this tool when you want to generate fresh content.
Parameters:
- description (string, required): Detailed description of the desired image.
- images (string[], optional): Array of image paths used as context (absolute or relative). Use this to “edit” or guide style/content.
- aspectRatio (string, optional): Orientation preset (square, landscape, portrait). Default: square.
- style (string, optional): Additional style (e.g., "minimalist", "colorful", "professional", "artistic").
- outputPath (string, optional): Where to save the image. If omitted, saves in current directory.
- watermarkPath (string, optional): Path to watermark image to overlay.
- watermarkPosition (string, optional): One of top-left, top-right, bottom-left, bottom-right. Default: bottom-right.
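The parameter list above can be written down as a TypeScript type, which makes the required field and the documented defaults explicit. This is a sketch, not the project's actual type definitions: the names `GenerateImageArgs` and `withDefaults` and the error message are assumptions, while the field names, allowed values, and defaults are taken from the list.

```typescript
// Field names and defaults follow the parameter list; type names are hypothetical.
type AspectRatio = "square" | "landscape" | "portrait";
type WatermarkPosition = "top-left" | "top-right" | "bottom-left" | "bottom-right";

interface GenerateImageArgs {
  description: string;
  images?: string[];
  aspectRatio?: AspectRatio;
  style?: string;
  outputPath?: string;
  watermarkPath?: string;
  watermarkPosition?: WatermarkPosition;
}

// Rejects an empty description and fills in the documented defaults.
function withDefaults(args: GenerateImageArgs) {
  if (args.description.trim() === "") {
    // Hypothetical message; the server's real wording is not documented here.
    throw new Error("description is required");
  }
  return {
    ...args,
    images: args.images ?? [],
    aspectRatio: args.aspectRatio ?? "square",
    watermarkPosition: args.watermarkPosition ?? "bottom-right",
  };
}
```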
Usage Examples:
# Basic - saves to current directory
Generate an image of a mountain landscape at sunset with warm, minimalist style
# With context image to guide composition
Generate an image: "Create a futuristic city skyline inspired by this photo", images: ["./reference-skyline.jpg"], aspectRatio: "landscape"
# Multiple context images
Generate an image combining style of a logo and a photo, images: ["./photo.jpg", "./logo.png"], style: "professional"
When you request a specific orientation (square, landscape, or portrait), the server automatically appends an invisible helper image (assets/square.png, assets/landscape.png, or assets/portrait.png) so Gemini respects the target dimensions.
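The helper-image behaviour described above amounts to a lookup from orientation to asset path, followed by an append. A minimal sketch follows, assuming the helper goes last in the list of context images. `withOrientationHelper` is a hypothetical name, and the README does not show how the server wires this up internally.

```typescript
type Orientation = "square" | "landscape" | "portrait";

// Asset paths as named in the paragraph above.
const HELPER_IMAGES: Record<Orientation, string> = {
  square: "assets/square.png",
  landscape: "assets/landscape.png",
  portrait: "assets/portrait.png",
};

// Returns the caller's context images followed by the helper image
// for the requested orientation (square when none is given).
function withOrientationHelper(images: string[], orientation: Orientation = "square"): string[] {
  return [...images, HELPER_IMAGES[orientation]];
}
```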
edit_image
Modifies an existing image using a focused text instruction. This tool keeps the original framing unless you explicitly ask for structural changes.
Parameters:
- description (string, required): Instructions describing the edits to apply to the provided image.
- image (string, required): Path to the image file you want to edit (absolute or relative).
- outputPath (string, optional): Where to save the edited result. If omitted, the server uses the working directory and an auto-generated filename.
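On the wire, an MCP client sends these parameters as the `arguments` of a JSON-RPC `tools/call` request. Clients normally build this message for you, so the sketch below is only meant to show the shape. The `EditImageArgs` and `ToolCallRequest` types and the `editImageRequest` helper are hypothetical names; the tool name and argument fields come from the list above.

```typescript
interface EditImageArgs {
  description: string;
  image: string;
  outputPath?: string;
}

// Shape of an MCP tools/call request (JSON-RPC 2.0).
interface ToolCallRequest<T> {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: T };
}

// Wraps edit_image arguments in a tools/call request.
function editImageRequest(id: number, args: EditImageArgs): ToolCallRequest<EditImageArgs> {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "edit_image", arguments: args },
  };
}

const request = editImageRequest(1, {
  description: "Soften skin tones and remove flyaway hairs",
  image: "./headshot.png",
});
console.log(JSON.stringify(request, null, 2));
```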
Usage Examples:
# Simple edit
Edit image: "Soften skin tones and remove flyaway hairs", image: "./headshot.png"
# Heavier retouch
Edit image: "Turn the product label red and add subtle sparkle highlights", image: "./product-shot.jpg"
# Custom path and watermark (top-left), using generate_image
Generate an image of a space cat, outputPath: "./images/epic_pizza.png", watermarkPath: "./my_logo.png", watermarkPosition: "top-left"
Watermark Functionality
The generate_image tool supports adding watermarks to your images:
Features:
- 🏷️ Add image watermarks to any generated output
- 📍 Position in any corner (watermarkPosition)
- 📏 Smart sizing (25% of image width, maintaining aspect ratio)
- 🎯 Consistent spacing (3% padding from edges)
- 🖼️ Supports PNG, JPG, WebP watermark files
- ⚡ Only applied when the watermarkPath parameter is provided
Usage:
# For image generation
watermarkPath: "./my-brand-logo.png"
# With context images
watermarkPath: "./watermark.jpg"
Watermark Specifications:
- Position: Configurable corner via watermarkPosition
- Size: 25% of image width (maintains watermark aspect ratio)
- Padding: 3% of image width from the selected edges
- Blend mode: Over (watermark appears on top of image)
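The sizing and padding rules above reduce to a small amount of arithmetic. The sketch below computes where a watermark would land under those rules. `placeWatermark` and its types are hypothetical names, and rounding to whole pixels is an assumption, since the README does not say how fractional values are handled.

```typescript
type Corner = "top-left" | "top-right" | "bottom-left" | "bottom-right";

interface Size { width: number; height: number; }
interface Placement extends Size { left: number; top: number; }

// Scales the watermark to 25% of the image width (keeping its aspect ratio)
// and offsets it from the chosen corner by 3% of the image width.
function placeWatermark(image: Size, watermark: Size, corner: Corner = "bottom-right"): Placement {
  const width = Math.round(image.width * 0.25);
  const height = Math.round(width * (watermark.height / watermark.width));
  const padding = Math.round(image.width * 0.03); // rounding is an assumption
  const isLeft = corner === "top-left" || corner === "bottom-left";
  const isTop = corner === "top-left" || corner === "top-right";
  return {
    width,
    height,
    left: isLeft ? padding : image.width - width - padding,
    top: isTop ? padding : image.height - height - padding,
  };
}
```

For a 1024×1024 image and a 400×200 logo, this gives a 256×128 watermark with 31 px of padding, so the default bottom-right placement starts at left 737, top 865.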
Save Functionality:
- Default: Images are saved in the directory from which the MCP client is launched
- Automatic naming: Generated based on description, date and time
- Supported formats: PNG, JPG, WebP (depending on what Gemini returns)
- Automatic creation: Creates necessary folders if they don't exist
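The automatic naming rule can be pictured as a slug of the description plus a timestamp. The exact format the server uses is not documented here, so everything in this sketch is an assumption: the function name `autoFileName`, the 40-character slug limit, the UTC timestamp layout, and the `png` default.

```typescript
// Builds a file name such as "a-banana-astronaut-on-mars-2024-01-15T09-30-05.png".
// The format is illustrative only; the server's real naming scheme may differ.
function autoFileName(description: string, now: Date, extension: string = "png"): string {
  const slug =
    description
      .toLowerCase()
      .replace(/[^a-z0-9]+/g, "-") // collapse anything non-alphanumeric into "-"
      .slice(0, 40)                // keep names short
      .replace(/^-+|-+$/g, "")     // trim leading and trailing dashes
    || "image";                    // fallback when nothing usable remains
  // ISO timestamp in UTC with ":" replaced, so it is safe in file names.
  const stamp = now.toISOString().slice(0, 19).replace(/:/g, "-");
  return `${slug}-${stamp}.${extension}`;
}
```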
Development
Available Scripts
- npm run build: Compiles TypeScript to JavaScript
- npm run dev: Development mode with automatic reload
- npm start: Runs the compiled server
- npm run cli: Runs the CLI entry directly (node dist/cli.js)
Project Structure
gemini-image-mcp-server/
├── src/
│   ├── index.ts              # Main server entry point
│   ├── cli.ts                # CLI entry point (generate/edit commands)
│   ├── services/
│   │   ├── gemini.ts         # Gemini AI calls
│   │   ├── imageService.ts   # File system + watermark handling
│   │   └── serviceFactory.ts # Shared initialization helpers
│   ├── tools/
│   │   ├── index.ts          # Tools exports
│   │   ├── generateImage.ts  # Tool for creating new images
│   │   └── editImage.ts      # Tool for editing existing images
│   └── types/
│       └── index.ts          # Type definitions
├── dist/ # Compiled files
├── package.json
├── tsconfig.json
└── README.md
Troubleshooting
Error: "GOOGLE_API_KEY environment variable is required"
Make sure you have configured the GOOGLE_API_KEY environment variable with your Google AI API key.
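A startup check that produces this error typically looks like the sketch below. It is not the server's actual code: `requireApiKey` is a hypothetical name, and the environment is passed in as a plain object so the function stays easy to test.

```typescript
// Returns the API key or throws the error quoted above when it is missing or blank.
function requireApiKey(env: Record<string, string | undefined>): string {
  const key = env["GOOGLE_API_KEY"];
  if (key === undefined || key.trim() === "") {
    throw new Error("GOOGLE_API_KEY environment variable is required");
  }
  return key;
}

// In Node.js this would be called as: requireApiKey(process.env)
```

If the key is set in your shell but the error persists, check that it is also present in the `env` block of the client configuration, since the client launches the server as a separate process.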
Error: "Could not generate image"
- Verify that your API key is valid and has permissions for the gemini-2.5-flash-image-preview model
- Ensure the description doesn't contain content that might be blocked by safety filters
File saving error
- Verify you have write permissions in the specified path
- Make sure the path is valid and accessible
- If specifying a folder, end it with /
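The trailing-slash advice makes sense if the server tells folders from files by the last character of the path. The sketch below shows one way such a rule could work. It is an assumption about the behaviour, not the project's implementation, and `resolveOutputPath` is a hypothetical name.

```typescript
// Decides the final file path from an optional outputPath and a generated file name.
// Assumed rule: a path ending in "/" is a folder, anything else is a full file path.
function resolveOutputPath(outputPath: string | undefined, fileName: string): string {
  if (outputPath === undefined || outputPath === "") {
    return `./${fileName}`; // no path given: save in the current directory
  }
  if (outputPath.charAt(outputPath.length - 1) === "/") {
    return outputPath + fileName; // folder: append the generated name
  }
  return outputPath; // explicit file path: use as-is
}
```

Under this rule, `./images` without the slash would be treated as a file named `images`, which is the kind of surprise the tip above helps avoid.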
Server not responding
- Verify the server is running correctly
- Check logs in stderr for error messages
- Make sure the MCP client is configured correctly
License
MIT
Contributing
Contributions are welcome. Please open an issue before making significant changes.