holocode-ai/gemini-mcp
Gemini MCP Server
A Model Context Protocol (MCP) server for Google Gemini AI services. It provides multimodal generation tools for image generation, image editing, multi-image composition, and video creation, backed by Google's Gemini, Imagen, and Veo models.
Features
Multimodal AI Services
- Image Generation: High-quality image creation using Gemini 2.5 Flash Image Preview and Imagen 4.0 models
- Image Editing: Advanced image modification and enhancement using Gemini AI models
- Multi-Image Composition: Seamless blending and combining of multiple images
- Video Generation: Cinematic video creation using Google's Veo 3.0 models (text-to-video and image-to-video)
Advanced Model Support
- Gemini Models: gemini-2.5-flash-image-preview, gemini-2.0-flash-preview
- Imagen Models: imagen-4.0-generate-001 (latest), imagen-4.0-ultra-generate-001, imagen-4.0-fast-generate-001
- Veo Models: veo-3.0-generate-001, veo-3.0-fast-generate-001, veo-2.0-generate-001
MCP Protocol Features
- Stdio Transport: Direct integration with MCP clients
- Comprehensive Tool Descriptions: Detailed parameter documentation and usage examples
- File Output Management: Configurable output directories with metadata
- Error Handling: Robust error handling with informative responses
Prerequisites
- Go 1.23+ (required for building)
- Google API Key with Gemini API access (required)
- Optional: Google Cloud Project ID for advanced features
Installation
Quick Setup
- Clone and build:
git clone <repository-url>
cd gemini-mcp
go build -o gemini-mcp main.go
- Set up API key:
export GOOGLE_API_KEY="your_google_api_key_here"
- Test the installation:
./gemini-mcp -version
Using Makefile
- Install dependencies:
make deps
- Build application:
make build
- Set up environment:
cp .env.example .env
# Edit .env with your API key
Usage
Command Line Interface
./gemini-mcp [options]
Options:
  -transport string   Transport type: stdio (default)
  -version            Show version information
Stdio Mode (MCP Integration)
Run the server for direct MCP client integration:
./gemini-mcp
Testing MCP Protocol
# Test basic connectivity
./test_mcp.sh
# Manual testing
echo '{"jsonrpc":"2.0","id":"1","method":"tools/list","params":{}}' | ./gemini-mcp
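The manual test above pipes a single JSON-RPC message to the server over stdio. The same idea extends to invoking a tool. The following is a minimal sketch in Go that builds a `tools/call` message, assuming the standard MCP request shape (`name` plus `arguments`). The helper name `buildToolCall` and the example arguments are illustrative and not part of this project. Whether the server accepts a call without a prior `initialize` handshake is not stated here, so treat that as an open assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcRequest mirrors the JSON-RPC 2.0 envelope used in the manual test.
type rpcRequest struct {
	JSONRPC string `json:"jsonrpc"`
	ID      string `json:"id"`
	Method  string `json:"method"`
	Params  any    `json:"params"`
}

// toolCallParams follows the MCP tools/call parameter shape.
type toolCallParams struct {
	Name      string         `json:"name"`
	Arguments map[string]any `json:"arguments"`
}

// buildToolCall (hypothetical helper) returns one newline-terminated
// JSON-RPC message that can be written to the server's stdin.
func buildToolCall(id, tool string, args map[string]any) (string, error) {
	req := rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  toolCallParams{Name: tool, Arguments: args},
	}
	b, err := json.Marshal(req)
	if err != nil {
		return "", err
	}
	return string(b) + "\n", nil
}

func main() {
	// Tool and argument names are taken from the tool list in this README.
	msg, err := buildToolCall("2", "imagen_t2i", map[string]any{
		"prompt":     "a lighthouse at dusk",
		"num_images": 1,
	})
	if err != nil {
		panic(err)
	}
	fmt.Print(msg)
}
```

The printed line can be piped into `./gemini-mcp` in the same way as the `tools/list` example.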
Available Tools
1. gemini_image_generation
Generate high-quality images using Google's latest Gemini image generation models with advanced style control and quality settings.
Key Features:
- Advanced style control and artistic options
- Multi-language prompt support
- Customizable aspect ratios and quality settings
- Content safety levels and text rendering options
Parameters:
- prompt (required): Detailed description of desired image
- model: Gemini model variant (default: gemini-2.5-flash-image-preview)
- output_directory: Local save path
2. gemini_image_edit
Edit existing images using Google's Gemini AI models with targeted modifications.
Key Features:
- Targeted image modifications and style transfers
- Object addition/removal capabilities
- Background changes while preserving original characteristics
- Precise control over edit types
Parameters:
- prompt (required): Description of desired edits
- image_path: Path to the image to edit
- edit_type: Type of edit operation
- output_directory: Local save path
3. gemini_multi_image
Combine and blend multiple images using Google's Gemini AI models.
Key Features:
- Merge 2-3 images into cohesive compositions
- Create collages, overlays, and seamless blends
- Character consistency across scenes
- Style unification for creative compositions
Parameters:
- prompt (required): Description of desired composition
- image_paths: Array of image paths to combine
- blend_mode: How to combine the images
- output_directory: Local save path
4. imagen_t2i
Generate high-quality images using Google's state-of-the-art Imagen models.
Key Features:
- Photorealistic and artistic image creation
- Multiple model variants for different use cases
- Support for various aspect ratios
- Batch generation (1-4 images)
Parameters:
- prompt (required): Detailed image description
- model: Imagen variant (default: imagen-4.0-generate-001)
- num_images: Number of images (1-4, default: 1)
- aspect_ratio: Image ratio (1:1, 16:9, 9:16, 4:3, 3:4)
- output_directory: Local save path
Supported Models:
- imagen-4.0-generate-001: Latest standard model
- imagen-4.0-ultra-generate-001: Highest quality
- imagen-4.0-fast-generate-001: Fastest generation
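A client may want to check arguments against the documented limits before sending an `imagen_t2i` call. The sketch below applies the defaults and ranges listed above. The `ImagenArgs` type and its `normalize` method are hypothetical client-side helpers, not the server's own validation code, and an empty aspect ratio is passed through because no default is documented.

```go
package main

import (
	"errors"
	"fmt"
)

// ImagenArgs is a hypothetical client-side mirror of the imagen_t2i parameters.
type ImagenArgs struct {
	Prompt      string
	Model       string
	NumImages   int
	AspectRatio string
}

// Values copied from the parameter and model lists in this README.
var imagenModels = map[string]bool{
	"imagen-4.0-generate-001":       true,
	"imagen-4.0-ultra-generate-001": true,
	"imagen-4.0-fast-generate-001":  true,
}

var imagenAspectRatios = map[string]bool{
	"1:1": true, "16:9": true, "9:16": true, "4:3": true, "3:4": true,
}

// normalize fills in the documented defaults and rejects values
// outside the documented ranges.
func (a *ImagenArgs) normalize() error {
	if a.Prompt == "" {
		return errors.New("prompt is required")
	}
	if a.Model == "" {
		a.Model = "imagen-4.0-generate-001"
	}
	if !imagenModels[a.Model] {
		return fmt.Errorf("unsupported model %q", a.Model)
	}
	if a.NumImages == 0 {
		a.NumImages = 1
	}
	if a.NumImages < 1 || a.NumImages > 4 {
		return fmt.Errorf("num_images must be between 1 and 4, got %d", a.NumImages)
	}
	// No default aspect ratio is documented, so an empty value is left as-is.
	if a.AspectRatio != "" && !imagenAspectRatios[a.AspectRatio] {
		return fmt.Errorf("unsupported aspect_ratio %q", a.AspectRatio)
	}
	return nil
}

func main() {
	args := ImagenArgs{Prompt: "a lighthouse at dusk", AspectRatio: "16:9"}
	if err := args.normalize(); err != nil {
		fmt.Println("invalid arguments:", err)
		return
	}
	fmt.Printf("%+v\n", args)
}
```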
5. veo_text_to_video
Generate 8-second videos from text prompts using Google's Veo 3.0 models.
Key Features:
- Detailed scene descriptions with camera movements
- Realistic physics and natural motion
- Support for 16:9/9:16 aspect ratios
- 720p/1080p resolution options
- Negative prompts for content exclusion
- SynthID watermarking
Parameters:
- prompt (required): Detailed video scene description
- negative_prompt: Content to avoid in the video
- aspect_ratio: Video ratio (16:9, 9:16)
- resolution: Video quality (720p, 1080p)
- model: Veo variant (default: veo-3.0-generate-001)
- seed: Optional seed for reproducibility
- output_directory: Local save path
6. veo_image_to_video
Animate static images into 8-second videos using Google's Veo 3.0 models.
Key Features:
- Transform photos into dynamic scenes
- Natural motion and camera movements
- Input image becomes the starting frame
- Realistic physics simulation
Parameters:
- prompt (required): Description of desired animation
- image_path: Path to input image
- negative_prompt: Content to avoid
- aspect_ratio: Video ratio (16:9, 9:16)
- resolution: Video quality (720p, 1080p)
- model: Veo variant (default: veo-3.0-generate-001)
- output_directory: Local save path
7. veo_generate_video (Legacy)
General video generation tool supporting both text-to-video and image-to-video creation.
Key Features:
- Backward compatibility with existing workflows
- Supports both text and image inputs
- Advanced scene composition
- Automatic operation polling
Parameters:
- prompt (required): Video description
- image_path: Optional input image for image-to-video
- aspect_ratio: Video ratio
- resolution: Video quality
- negative_prompt: Content exclusion
- output_directory: Local save path
Environment Configuration
| Variable | Description | Default | Required |
|---|---|---|---|
| GOOGLE_API_KEY | Gemini API authentication key | - | Yes |
| GOOGLE_PROJECT_ID | Google Cloud Project ID | - | No (optional) |
| GOOGLE_LOCATION | Google Cloud region | us-central1 | No (optional) |
| OUTPUT_DIR | File output directory | ./output | No (optional) |
| TRANSPORT | MCP transport protocol | stdio | No (optional) |
MCP Client Integration
Claude Desktop Configuration
{
  "mcpServers": {
    "gemini": {
      "command": "/path/to/gemini-mcp",
      "env": {
        "GOOGLE_API_KEY": "your_api_key_here"
      }
    }
  }
}
Cline VSCode Extension
{
  "cline.mcp.servers": [
    {
      "name": "gemini",
      "command": "/path/to/gemini-mcp",
      "env": {
        "GOOGLE_API_KEY": "your_api_key_here"
      }
    }
  ]
}
Development
Building from Source
go mod tidy
go build -o gemini-mcp main.go
Testing
make test
./test_mcp.sh
Code Quality
make fmt # Format code
make clean # Clean artifacts
Implementation Notes
- Gemini Integration: Uses google.golang.org/genai with Gemini API backend
- Protocol Compliance: Implements MCP 2024-11-05 specification
- Image Generation: Full implementation with Gemini 2.5 Flash Image Preview and Imagen 4.0 models
- Video Generation: Complete Veo 3.0 integration with operation polling and proper file downloads
- File Management: Generated content saved with metadata and timestamps
- Error Handling: Comprehensive error responses with helpful messages
- Multi-modal Support: Supports text-to-image, image-to-image, text-to-video, and image-to-video workflows
Contributing
This project is designed to be a comprehensive MCP server for Google's AI services. Contributions are welcome for:
- Additional model support
- Transport protocol enhancements
- Full implementation of placeholder services
- Documentation improvements
License
MIT License - see LICENSE file for details.