holocode-ai/gemini-mcp
Gemini MCP Server
A comprehensive Model Context Protocol (MCP) server for Google Gemini AI services, providing advanced multimodal generation capabilities including image generation, image editing, and video creation through Google's state-of-the-art AI models.
🚀 Features
Multimodal AI Services
- 🖼️ Image Generation: High-quality image creation using Gemini 3.0 Pro models
- ✏️ Image Editing: Advanced image modification and enhancement using Gemini AI models
- 🔀 Multi-Image Composition: Seamless blending and combining of multiple images
- 🎬 Video Generation: Cinematic video creation using Google's Veo 3.1 models with native audio (text-to-video and image-to-video)
Advanced Model Support
- Gemini Models: gemini-3-pro-image-preview (default, Gemini 3 Pro with native image generation), gemini-2.5-flash-image
- Veo Models: veo-3.1-generate-preview (default, latest with native audio), veo-3.1-fast-generate-preview, veo-3.0-generate-preview, veo-3.0-fast-generate-001
MCP Protocol Features
- Dual Transport Support: Stdio (default) and HTTP/SSE transports
- Bearer Token Authentication: Secure HTTP access with configurable service tokens
- Comprehensive Tool Descriptions: Detailed parameter documentation and usage examples
- File Output Management: Configurable output directories with metadata
- Error Handling: Robust error handling with informative responses
📋 Prerequisites
For Pre-built Binaries
- Google API Key with Gemini API access (required)
- Optional: Google Cloud Project ID for advanced features
For Building from Source
- Go 1.23+ (required for building)
- Google API Key with Gemini API access (required)
- Optional: Google Cloud Project ID for advanced features
🛠️ Installation
Option 1: Download Pre-built Binary (Recommended)
Download the latest release for your platform from GitHub Releases.
Each release includes two binaries:
- gemini-mcp: The main MCP server
- upload_media: CLI tool for uploading local files to the server (used with HTTP mode)
Linux (x86_64)
# Download and extract
wget https://github.com/holocode-ai/gemini-mcp/releases/latest/download/gemini-mcp-VERSION-linux-amd64.tar.gz
tar -xzf gemini-mcp-VERSION-linux-amd64.tar.gz
# Make executable and move to PATH
chmod +x gemini-mcp-VERSION-linux-amd64
sudo mv gemini-mcp-VERSION-linux-amd64 /usr/local/bin/gemini-mcp
Linux (ARM64)
wget https://github.com/holocode-ai/gemini-mcp/releases/latest/download/gemini-mcp-VERSION-linux-arm64.tar.gz
tar -xzf gemini-mcp-VERSION-linux-arm64.tar.gz
chmod +x gemini-mcp-VERSION-linux-arm64
sudo mv gemini-mcp-VERSION-linux-arm64 /usr/local/bin/gemini-mcp
macOS (Intel)
# Download and extract
curl -LO https://github.com/holocode-ai/gemini-mcp/releases/latest/download/gemini-mcp-VERSION-darwin-amd64.tar.gz
tar -xzf gemini-mcp-VERSION-darwin-amd64.tar.gz
# Make executable and move to PATH
chmod +x gemini-mcp-VERSION-darwin-amd64
sudo mv gemini-mcp-VERSION-darwin-amd64 /usr/local/bin/gemini-mcp
macOS (Apple Silicon)
curl -LO https://github.com/holocode-ai/gemini-mcp/releases/latest/download/gemini-mcp-VERSION-darwin-arm64.tar.gz
tar -xzf gemini-mcp-VERSION-darwin-arm64.tar.gz
chmod +x gemini-mcp-VERSION-darwin-arm64
sudo mv gemini-mcp-VERSION-darwin-arm64 /usr/local/bin/gemini-mcp
Windows
# Download the zip file from releases page
# Extract gemini-mcp-VERSION-windows-amd64.zip
# Add the extracted .exe to your PATH
Verify installation:
gemini-mcp -version
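The archive names in the steps above follow one pattern: gemini-mcp-VERSION-OS-ARCH, with a .tar.gz or .zip extension. As a rough sketch, a helper like the one below could pick the right file name for the current machine. The function name `release_asset` and the mapping from Python's platform strings to the OS/ARCH suffixes are assumptions made for illustration; only the file name pattern itself comes from the instructions above.

```python
import platform

# Assumed mapping from Python's platform names to the OS/arch suffixes that
# appear in the release file names listed above.
OS_NAMES = {"Linux": "linux", "Darwin": "darwin", "Windows": "windows"}
ARCH_NAMES = {"x86_64": "amd64", "AMD64": "amd64", "arm64": "arm64", "aarch64": "arm64"}

# Only the platform/architecture pairs named in the installation steps.
LISTED = {
    ("linux", "amd64"), ("linux", "arm64"),
    ("darwin", "amd64"), ("darwin", "arm64"),
    ("windows", "amd64"),
}


def release_asset(version, system=None, machine=None):
    """Return the archive file name for a platform (defaults to this machine)."""
    system = system or platform.system()
    machine = machine or platform.machine()
    pair = (OS_NAMES.get(system), ARCH_NAMES.get(machine))
    if pair not in LISTED:
        raise ValueError(f"no pre-built binary listed for {system}/{machine}")
    extension = "zip" if pair[0] == "windows" else "tar.gz"
    return f"gemini-mcp-{version}-{pair[0]}-{pair[1]}.{extension}"


# VERSION is left as the same placeholder used in the download commands.
print(release_asset("VERSION", "Linux", "x86_64"))
```

Passing `system` and `machine` explicitly makes the helper easy to check without depending on the host it runs on.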
Option 2: Build from Source
- Clone and build:
git clone <repository-url>
cd gemini-mcp
go build -o gemini-mcp main.go
- Set up API key:
export GOOGLE_API_KEY="your_google_api_key_here"
- Test the installation:
./gemini-mcp -version
Option 3: Using Makefile
- Install dependencies:
make deps
- Build application:
make build
- Set up environment:
cp .env.example .env
# Edit .env with your API key
🎯 Usage
Command Line Interface
./gemini-mcp [options]
Options:
-transport string Transport type: stdio (default), http, or sse
-version Show version information
Stdio Mode (Default)
Run the server for direct MCP client integration:
./gemini-mcp
HTTP Mode
Run the server as an HTTP service with optional authentication:
# Basic HTTP mode (no authentication - development only)
TRANSPORT=http PORT=8080 ./gemini-mcp
# HTTP mode with Bearer token authentication (recommended for production)
TRANSPORT=http PORT=8080 SERVICE_TOKENS=token1,token2 ./gemini-mcp
# Using Makefile
make run-http
HTTP Authentication:
When SERVICE_TOKENS is configured, all requests must include an Authorization header:
curl -X POST http://localhost:8080 \
-H "Content-Type: application/json" \
-H "Authorization: Bearer token1" \
-d '{"jsonrpc":"2.0","method":"tools/list","id":"1"}'
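The same authenticated request can be prepared from a script. The sketch below mirrors the curl call using only the Python standard library; `build_rpc_request` is a hypothetical helper name, and the URL and token are the placeholder values from the example above. It builds the request without sending it, so it can be inspected first.

```python
import json
import urllib.request


def build_rpc_request(url, method, token=None, request_id="1", params=None):
    """Build (but do not send) a JSON-RPC 2.0 POST request for the HTTP transport."""
    payload = {"jsonrpc": "2.0", "method": method, "id": request_id}
    if params is not None:
        payload["params"] = params
    headers = {"Content-Type": "application/json"}
    if token:
        # Only needed when the server was started with SERVICE_TOKENS set.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


req = build_rpc_request("http://localhost:8080", "tools/list", token="token1")
# Sending is left to the caller, for example:
#   with urllib.request.urlopen(req, timeout=30) as resp:
#       print(json.load(resp))
```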
Testing MCP Protocol
# Test basic connectivity (stdio mode)
./test_mcp.sh
# Manual testing (stdio mode)
echo '{"jsonrpc":"2.0","id":"1","method":"tools/list","params":{}}' | ./gemini-mcp
# Test HTTP mode
curl -X POST http://localhost:8080 \
-H "Content-Type: application/json" \
-H "Authorization: Bearer your_token" \
-d '{"jsonrpc":"2.0","method":"tools/list","id":"1"}'
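For scripted checks in stdio mode, the manual echo test above can be reproduced with a small client. This is a minimal sketch, assuming the server accepts one JSON-RPC message per line on stdin and writes one response per line on stdout, as the echo example suggests. The helper names are hypothetical, and `list_tools` needs the gemini-mcp binary and a configured GOOGLE_API_KEY to actually run.

```python
import json
import subprocess


def encode_message(method, request_id, params=None):
    """Serialize one JSON-RPC request as a single newline-terminated line."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params or {}}
    return json.dumps(message) + "\n"


def decode_message(line):
    """Parse one response line and return its result, raising on a JSON-RPC error."""
    reply = json.loads(line)
    if "error" in reply:
        error = reply["error"]
        raise RuntimeError(f"JSON-RPC error {error.get('code')}: {error.get('message')}")
    return reply.get("result")


def list_tools(binary="./gemini-mcp"):
    """Pipe a tools/list request into the server, like the echo test."""
    proc = subprocess.run(
        [binary],
        input=encode_message("tools/list", "1"),
        capture_output=True,
        text=True,
        timeout=30,
    )
    lines = proc.stdout.splitlines()
    if not lines:
        raise RuntimeError(f"no response from server: {proc.stderr.strip()}")
    return decode_message(lines[0])
```

Keeping encoding and decoding separate from the subprocess call lets the message handling be tested without a running server.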
🛠️ Available Tools
1. gemini_image_generation
Generate high-quality images using Google's latest Gemini image generation models with advanced style control and quality settings.
Key Features:
- Advanced style control and artistic options
- Multi-language prompt support
- Customizable aspect ratios and quality settings
- Content safety levels and text rendering options
Parameters:
- prompt (required): Detailed description of the desired image
- model: Gemini model variant (default: gemini-3-pro-image-preview)
- output_directory: Local save path
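MCP clients invoke a tool by sending a tools/call request whose params carry the tool name and an arguments object. As an illustration, the sketch below builds such a payload for gemini_image_generation from the parameters listed above. `image_generation_call` is a hypothetical helper, and the example prompt and output directory are made up.

```python
import json


def image_generation_call(prompt, model=None, output_directory=None, request_id="1"):
    """Build a JSON-RPC tools/call payload for the gemini_image_generation tool."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")
    arguments = {"prompt": prompt}
    # Optional parameters are omitted so the server applies its own defaults.
    if model:
        arguments["model"] = model
    if output_directory:
        arguments["output_directory"] = output_directory
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "gemini_image_generation", "arguments": arguments},
    }


payload = image_generation_call(
    "A lighthouse at dusk, watercolor style",
    output_directory="./output",
)
print(json.dumps(payload, indent=2))
```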
2. gemini_image_edit
Edit existing images using Google's Gemini AI models with targeted modifications.
Key Features:
- Targeted image modifications and style transfers
- Object addition/removal capabilities
- Background changes while preserving original characteristics
- Precise control over edit types
Parameters:
- prompt (required): Description of desired edits
- image_path: Path to the image to edit
- edit_type: Type of edit operation
- output_directory: Local save path
3. gemini_multi_image
Combine and blend multiple images using Google's Gemini AI models.
Key Features:
- Merge 2-3 images into cohesive compositions
- Create collages, overlays, and seamless blends
- Character consistency across scenes
- Style unification for creative compositions
Parameters:
- prompt (required): Description of desired composition
- image_paths: Array of image paths to combine
- blend_mode: How to combine the images
- output_directory: Local save path
4. veo_text_to_video
Generate 4-8 second videos from text prompts using Google's Veo 3.1 models with native audio.
Key Features:
- Detailed scene descriptions with camera movements
- Realistic physics and natural motion
- Native audio generation (dialogue, sounds, music)
- Support for 16:9/9:16 aspect ratios
- 720p/1080p resolution options
- Flexible duration: 4, 6, or 8 seconds
- SynthID watermarking
Parameters:
- prompt (required): Detailed video scene description
- negative_prompt: Content to avoid in the video
- aspect_ratio: Video ratio (16:9, 9:16)
- resolution: Video quality (720p, 1080p)
- model: Veo variant (default: veo-3.1-generate-preview)
- seed: Optional seed for reproducibility
- output_directory: Local save path
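Because aspect_ratio and resolution accept only the values listed above, it can help to validate arguments on the client before starting a video generation. The following is a minimal sketch of such a check. `veo_text_to_video_args` is a hypothetical helper, and treating seed as an integer is an assumption, since the parameter list does not state its type.

```python
ASPECT_RATIOS = ("16:9", "9:16")
RESOLUTIONS = ("720p", "1080p")


def veo_text_to_video_args(prompt, negative_prompt=None, aspect_ratio=None,
                           resolution=None, model=None, seed=None,
                           output_directory=None):
    """Validate and assemble the arguments object for veo_text_to_video."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")
    if aspect_ratio is not None and aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {ASPECT_RATIOS}")
    if resolution is not None and resolution not in RESOLUTIONS:
        raise ValueError(f"resolution must be one of {RESOLUTIONS}")
    if seed is not None and not isinstance(seed, int):
        raise ValueError("seed must be an integer")  # assumed type
    optional = {
        "negative_prompt": negative_prompt,
        "aspect_ratio": aspect_ratio,
        "resolution": resolution,
        "model": model,
        "seed": seed,
        "output_directory": output_directory,
    }
    arguments = {"prompt": prompt}
    # Leave out anything the caller did not set so server defaults apply.
    arguments.update({k: v for k, v in optional.items() if v is not None})
    return arguments
```

The returned dictionary is meant to be used as the arguments object of a tools/call request.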
6. veo_image_to_video
Animate static images into 4-8 second videos using Google's Veo 3.1 models with native audio.
Key Features:
- Transform photos into dynamic scenes
- Natural motion and camera movements
- Input image becomes the starting frame
- Realistic physics simulation
Parameters:
- prompt (required): Description of desired animation
- image_path: Path to input image
- negative_prompt: Content to avoid
- aspect_ratio: Video ratio (16:9, 9:16)
- resolution: Video quality (720p, 1080p)
- model: Veo variant (default: veo-3.1-generate-preview)
- output_directory: Local save path
7. veo_generate_video (Legacy)
General video generation tool supporting both text-to-video and image-to-video creation.
Key Features:
- Backward compatibility with existing workflows
- Supports both text and image inputs
- Advanced scene composition
- Automatic operation polling
Parameters:
- prompt (required): Video description
- image_path: Optional input image for image-to-video
- aspect_ratio: Video ratio
- resolution: Video quality
- negative_prompt: Content exclusion
- output_directory: Local save path
8. upload_media
Get instructions for uploading local files to S3 storage using the upload_media CLI tool. This is required when using HTTP mode with image editing or video generation tools.
Key Features:
- Upload local files to S3/MinIO storage
- Returns object_key for use with other tools
- One-time authentication tokens for security
- Supports PNG, JPEG, WebP, and video formats
Workflow:
- Call the upload_media MCP tool to get CLI instructions and a token
- Run the upload_media CLI with the provided token to upload your file
- Use the returned object_key with gemini_image_edit, gemini_multi_image, or veo_image_to_video
CLI Usage:
upload_media --server "http://localhost:8080/upload" --token "<one-time-token>" /path/to/file.png
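When the upload step is driven from a script, the CLI invocation above can be assembled as an argument list rather than a shell string, which avoids quoting problems with file names that contain spaces. This is a sketch only: `upload_command` is a hypothetical helper, and it uses just the --server and --token flags shown in the CLI usage line.

```python
import shlex


def upload_command(file_path, token, server="http://localhost:8080/upload",
                   binary="upload_media"):
    """Return the argv list for one upload_media invocation."""
    if not token:
        raise ValueError("a one-time token from the upload_media MCP tool is required")
    return [binary, "--server", server, "--token", token, str(file_path)]


argv = upload_command("/path/to/my file.png", "<one-time-token>")
# shlex.join renders the list as a copy-pasteable shell command.
print(shlex.join(argv))
# To run it: subprocess.run(argv, check=True)
```

Since the token is single-use, a failed upload would need a fresh token from the MCP tool before retrying.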
🔧 Environment Configuration
| Variable | Description | Default | Required |
|---|---|---|---|
| GOOGLE_API_KEY | Gemini API authentication key | - | ✅ Yes |
| GOOGLE_PROJECT_ID | Google Cloud Project ID | - | ❌ Optional |
| GOOGLE_LOCATION | Google Cloud region | us-central1 | ❌ Optional |
| OUTPUT_DIR | File output directory | ./output | ❌ Optional |
| TRANSPORT | MCP transport protocol (stdio, http, sse) | stdio | ❌ Optional |
| PORT | HTTP server port (when TRANSPORT=http) | 8080 | ❌ Optional |
| SERVICE_TOKENS | Comma-separated Bearer tokens for HTTP auth | - | ❌ Optional |
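To make the defaults in the table concrete, the sketch below resolves a settings object from an environment mapping. It illustrates how the variables relate to each other and is not the server's own configuration code; the `Settings` class and `load_settings` function are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Settings:
    google_api_key: str
    google_project_id: str = ""
    google_location: str = "us-central1"
    output_dir: str = "./output"
    transport: str = "stdio"
    port: int = 8080
    service_tokens: list = field(default_factory=list)


def load_settings(env):
    """Resolve settings from a mapping such as os.environ, applying the table's defaults."""
    api_key = env.get("GOOGLE_API_KEY", "").strip()
    if not api_key:
        raise ValueError("GOOGLE_API_KEY is required")
    transport = env.get("TRANSPORT", "stdio")
    if transport not in ("stdio", "http", "sse"):
        raise ValueError(f"unsupported TRANSPORT: {transport}")
    # SERVICE_TOKENS is comma-separated; blank entries are ignored.
    tokens = [t.strip() for t in env.get("SERVICE_TOKENS", "").split(",") if t.strip()]
    return Settings(
        google_api_key=api_key,
        google_project_id=env.get("GOOGLE_PROJECT_ID", ""),
        google_location=env.get("GOOGLE_LOCATION", "us-central1"),
        output_dir=env.get("OUTPUT_DIR", "./output"),
        transport=transport,
        port=int(env.get("PORT", "8080")),
        service_tokens=tokens,
    )
```

Calling `load_settings(os.environ)` after `import os` would read the real environment; passing a plain dictionary is convenient for experiments.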
🔌 MCP Client Integration
Claude Desktop Configuration (Stdio Mode)
{
"mcpServers": {
"gemini": {
"command": "gemini-mcp",
"env": {
"GOOGLE_API_KEY": "your_api_key_here"
}
}
}
}
Note: If you installed the binary to a custom location, use the full path:
{
"mcpServers": {
"gemini": {
"command": "/path/to/gemini-mcp",
"env": {
"GOOGLE_API_KEY": "your_api_key_here"
}
}
}
}
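If the Claude Desktop configuration file already lists other servers, the gemini entry should be merged in rather than replacing the file. As a sketch, the helper below adds a stdio entry in the shape shown above to an existing configuration dictionary; `add_stdio_server` is a hypothetical name, and reading or writing the actual file is left to the caller.

```python
import copy
import json


def add_stdio_server(config, name, command, api_key):
    """Return a copy of config with a stdio MCP server entry added under mcpServers."""
    updated = copy.deepcopy(config)
    updated.setdefault("mcpServers", {})[name] = {
        "command": command,
        "env": {"GOOGLE_API_KEY": api_key},
    }
    return updated


existing = {"mcpServers": {"other": {"command": "other-server"}}}
merged = add_stdio_server(existing, "gemini", "/path/to/gemini-mcp", "your_api_key_here")
print(json.dumps(merged, indent=2))
```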
Claude Desktop Configuration (HTTP Mode)
First, start the server in HTTP mode:
GOOGLE_API_KEY=your_api_key TRANSPORT=http PORT=8080 SERVICE_TOKENS=mytoken ./gemini-mcp
Then configure Claude Desktop to connect via HTTP:
{
"mcpServers": {
"gemini": {
"type": "http",
"url": "http://localhost:8080/mcp",
"headers": {
"Authorization": "Bearer mytoken",
"X-Upload-Media-Path": "/path/to/upload_media"
}
}
}
}
Header Configuration:
- Authorization: Bearer token for authentication (must match one of SERVICE_TOKENS)
- X-Upload-Media-Path: Absolute path to the upload_media CLI binary. Required for using gemini_image_edit, gemini_multi_image, or veo_image_to_video with local files in HTTP mode.
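The rule that the Authorization value must match one of SERVICE_TOKENS can be pictured with a short check. This is an illustrative sketch of how such a comparison is commonly written, not the server's actual implementation; `is_authorized` is a hypothetical function, and allowing all requests when no tokens are configured reflects the unauthenticated development mode described earlier.

```python
import hmac


def is_authorized(authorization_header, service_tokens):
    """Return True if the header carries a Bearer token found in service_tokens."""
    if not service_tokens:
        return True  # no tokens configured: authentication is off (development only)
    scheme, _, presented = (authorization_header or "").partition(" ")
    presented = presented.strip()
    if scheme.lower() != "bearer" or not presented:
        return False
    matched = False
    for token in service_tokens:
        # compare_digest avoids leaking match position through timing.
        if hmac.compare_digest(presented.encode("utf-8"), token.encode("utf-8")):
            matched = True
    return matched
```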
Cline VSCode Extension
{
"cline.mcp.servers": [
{
"name": "gemini",
"command": "gemini-mcp",
"env": {
"GOOGLE_API_KEY": "your_api_key_here"
}
}
]
}
🧪 Development
Building from Source
go mod tidy
go build -o gemini-mcp main.go
Multi-Platform Builds
# Build for current platform
make build
# Build for specific platforms
make build-darwin-arm64 # macOS Apple Silicon
make build-darwin-amd64 # macOS Intel
make build-linux-amd64 # Linux x86_64
make build-linux-arm64 # Linux ARM64
# Build all platforms
make build-all
# Build release versions (with version suffix)
make release
Testing
make test
./test_mcp.sh
Running
make run # Run in stdio mode
make run-http # Run in HTTP mode (port 8080)
Code Quality
make fmt # Format code
make clean # Clean artifacts
📝 Implementation Notes
- Gemini Integration: Uses google.golang.org/genai with Gemini API backend
- Protocol Compliance: Implements MCP 2024-11-05 specification with Streamable HTTP transport (2025-03-26)
- Transport Support: Stdio (default) and HTTP/SSE with Bearer token authentication
- Image Generation: Full implementation with Gemini 3.0 Pro models, returns ImageContent for MCP clients
- Video Generation: Complete Veo 3.1 integration with native audio, operation polling, and proper file downloads
- File Management: Generated content saved with metadata and timestamps
- Error Handling: Comprehensive error responses with helpful messages
- Multi-modal Support: Supports text-to-image, image-to-image, text-to-video, and image-to-video workflows
- Authentication: Configurable Bearer token authentication for HTTP transport with multiple token support
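The operation polling mentioned for video generation follows a common pattern: check a long-running operation at intervals until it reports completion or a deadline passes. The sketch below shows that pattern in generic form. It assumes an operation dictionary with `done`, `error`, and `response` keys in the style of Google long-running operations, and it does not reproduce the server's Go code; `poll_until_done` and its parameters are hypothetical.

```python
import time


def poll_until_done(fetch, interval=10.0, timeout=600.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch() repeatedly until the operation is done, then return its response."""
    deadline = clock() + timeout
    while True:
        operation = fetch()
        if operation.get("done"):
            if "error" in operation:
                raise RuntimeError(f"operation failed: {operation['error']}")
            return operation.get("response")
        if clock() >= deadline:
            raise TimeoutError("operation did not finish before the timeout")
        sleep(interval)


# Simulated operation that completes on the third check.
states = iter([
    {"done": False},
    {"done": False},
    {"done": True, "response": {"video": "example.mp4"}},
])
result = poll_until_done(lambda: next(states), interval=0.0)
print(result)
```

Injecting `sleep` and `clock` keeps the loop testable without real waiting.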
🤝 Contributing
This project is designed to be a comprehensive MCP server for Google's AI services. Contributions are welcome for:
- Additional model support
- Transport protocol enhancements
- Full implementation of placeholder services
- Documentation improvements
📄 License
MIT License - see LICENSE file for details.