mnthe/gemini-mcp-server
The Vertex MCP Server is an intelligent Model Context Protocol server designed to enhance AI assistants by enabling them to query Google Cloud Vertex AI with advanced agentic capabilities.
gemini-mcp-server
An intelligent MCP (Model Context Protocol) server that enables AI assistants to query Google AI (Gemini models) via Vertex AI or Google AI Studio with agentic capabilities - automatic tool selection, multi-turn reasoning, MCP-to-MCP delegation, and multimodal input support.
Purpose
This server provides:
- Agentic Loop: Turn-based execution with automatic tool selection and reasoning
- Query Gemini: Access Gemini models via Vertex AI or Google AI Studio for cross-validation
- Multimodal Support: Send images, audio, video, and code files alongside text prompts
- Tool Execution: Built-in WebFetch + integration with external MCP servers
- Multi-turn Conversations: Maintain context across queries with session management
- Reasoning Traces: Logging of AI thinking processes (stderr by default, optional file-based logs)
Key Features
🎭 System Prompt Customization
Customize the AI assistant's behavior and persona:
- Domain-Specific Roles: Configure as financial analyst, code reviewer, research assistant, etc.
- Environment-Based: Set via the GEMINI_SYSTEM_PROMPT environment variable
- Multi-Persona Support: Run multiple servers with different personas
- 100% Backward Compatible: Optional feature - works normally without customization
- See the system prompt customization guide for details and templates
🎨 Multimodal Input Support
Send images, audio, video, and code files to Gemini:
- Images: JPEG, PNG, WebP, HEIC
- Videos: MP4, MOV, AVI, WebM, and more
- Audio: MP3, WAV, AAC, FLAC, and more
- Documents/Code: PDF, text files, code files (Python, JavaScript, etc.)
- Support for both base64-encoded inline data and Cloud Storage URIs
- See the multimodal content guide for detailed documentation
🤖 Intelligent Agentic Loop
Inspired by the OpenAI Agents SDK, the server operates as an autonomous agent:
- Turn-based execution (up to 10 turns per query)
- Automatic tool selection based on LLM decisions
- Parallel tool execution with retry logic
- Smart fallback to Gemini knowledge when tools fail
🛠️ Built-in Tools
- WebFetch: Secure HTTPS-only web content fetching with private IP blocking
- MCP Integration: Dynamic discovery and execution of external MCP server tools
🔐 Security First
Multi-Layer Defense:
- SSRF Protection: HTTPS-only URL fetching, private IP blocking (10.x, 172.16.x, 192.168.x, 127.x, 169.254.x), cloud metadata endpoint blocking (AWS, GCP, Azure)
- Prompt Injection Guardrails: External content tagging, trust boundaries, system prompt hardening
- File Security: MIME type validation, executable file rejection, path traversal prevention, directory whitelist
- Redirect Validation: Manual redirect handling with security checks, maximum 5 redirects, cross-domain blocking
- Content Boundaries: 50KB size limits, external content wrapping with security tags
Comprehensive Testing: 69 security-focused tests covering SSRF, path traversal, MIME validation, and prompt injection.
See the security documentation for details and best practices.
📝 Observability
- File-based logging (logs/general.log, logs/reasoning.log)
- Configurable log directory, or disable logging for npx/containerized environments
- Detailed execution traces for debugging
- Turn and tool usage statistics
Prerequisites
- Node.js 18 or higher
- Google Cloud Platform account (for Vertex AI) OR Google AI Studio account
- Google Cloud credentials configured (for Vertex AI mode)
Quick Start
Installation
Option 1: npx (Recommended)
npx -y github:mnthe/gemini-mcp-server
Option 2: From Source
git clone https://github.com/mnthe/gemini-mcp-server.git
cd gemini-mcp-server
npm install
npm run build
Authentication
The gen-ai SDK supports multiple authentication methods. For Vertex AI mode:
Application Default Credentials (Recommended):
gcloud auth application-default login
Or use Service Account:
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account-key.json"
For Google AI Studio mode, see the gen-ai SDK documentation.
Configuration
Required Environment Variables:
export GOOGLE_CLOUD_PROJECT="your-gcp-project-id"
export GOOGLE_CLOUD_LOCATION="us-central1"
Optional Model Settings:
export GEMINI_MODEL="gemini-2.5-pro"
export GEMINI_TEMPERATURE="1.0"
export GEMINI_MAX_TOKENS="8192"
export GEMINI_TOP_P="0.95"
export GEMINI_TOP_K="40"
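As a rough illustration of how these optional settings could be read, the sketch below parses them from an environment map and falls back when a value is missing or not numeric. The names `loadModelConfig` and `numberFromEnv` are hypothetical, and the fallback values simply mirror the example values above; the server's actual defaults are not confirmed here.

```typescript
// Hypothetical config loader; names and fallback values are illustrative only.
interface ModelConfig {
  model: string;
  temperature: number;
  maxTokens: number;
  topP: number;
  topK: number;
}

type Env = Record<string, string | undefined>;

// Parse a numeric variable, falling back when it is unset, empty, or not finite.
function numberFromEnv(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  return Number.isFinite(parsed) ? parsed : fallback;
}

function loadModelConfig(env: Env): ModelConfig {
  return {
    model: env.GEMINI_MODEL ?? "gemini-2.5-pro",
    temperature: numberFromEnv(env, "GEMINI_TEMPERATURE", 1.0),
    maxTokens: numberFromEnv(env, "GEMINI_MAX_TOKENS", 8192),
    topP: numberFromEnv(env, "GEMINI_TOP_P", 0.95),
    topK: numberFromEnv(env, "GEMINI_TOP_K", 40),
  };
}
```

In a Node.js process this would typically be called as `loadModelConfig(process.env)`.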
Optional Agentic Features:
# System prompt customization
export GEMINI_SYSTEM_PROMPT="You are a specialized financial analyst AI assistant. You have access to the following tools:"
# Multi-turn conversations
export GEMINI_ENABLE_CONVERSATIONS="true"
export GEMINI_SESSION_TIMEOUT="3600"
export GEMINI_MAX_HISTORY="10"
# Logging configuration
# Default: Console logging to stderr (recommended for npx/MCP usage)
export GEMINI_LOG_TO_STDERR="true" # Default: true (console logging)
# For file-based logging instead:
export GEMINI_LOG_TO_STDERR="false" # Disable console, use file logging
export GEMINI_LOG_DIR="./logs" # Log directory (default: ./logs)
# To disable logging completely:
export GEMINI_DISABLE_LOGGING="true"
# File URI support (for CLI environments only)
export GEMINI_ALLOW_FILE_URIS="true" # Set to 'true' to allow file:// URIs (CLI tools only, NOT for desktop apps)
# External MCP servers (for tool delegation)
export GEMINI_MCP_SERVERS='[
{
"name": "filesystem",
"transport": "stdio",
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-filesystem", "./data"]
},
{
"name": "web-search",
"transport": "http",
"url": "http://localhost:3000/mcp"
}
]'
MCP Client Integration
Add to your MCP client configuration:
Claude Desktop (~/Library/Application Support/Claude/claude_desktop_config.json on macOS):
{
"mcpServers": {
"gemini": {
"command": "npx",
"args": ["-y", "github:mnthe/gemini-mcp-server"],
"env": {
"GOOGLE_CLOUD_PROJECT": "your-gcp-project-id",
"GOOGLE_CLOUD_LOCATION": "us-central1",
"GEMINI_MODEL": "gemini-2.5-pro",
"GEMINI_ENABLE_CONVERSATIONS": "true"
}
}
}
}
Claude Code (.claude.json in project root):
{
"mcpServers": {
"gemini": {
"command": "npx",
"args": ["-y", "github:mnthe/gemini-mcp-server"],
"env": {
"GOOGLE_CLOUD_PROJECT": "your-gcp-project-id",
"GOOGLE_CLOUD_LOCATION": "us-central1",
"GEMINI_MODEL": "gemini-2.5-pro"
}
}
}
}
Other MCP Clients (Generic stdio):
# Command to run
npx -y github:mnthe/gemini-mcp-server
# Or direct execution
node /path/to/gemini-mcp-server/build/index.js
Multi-Persona Setup
You can run multiple Gemini servers with different personas for specialized tasks:
{
"mcpServers": {
"gemini-code": {
"command": "npx",
"args": ["-y", "github:mnthe/gemini-mcp-server"],
"env": {
"GOOGLE_CLOUD_PROJECT": "your-project-id",
"GOOGLE_CLOUD_LOCATION": "us-central1",
"GEMINI_SYSTEM_PROMPT": "You are a code review specialist. Focus on code quality, security, and best practices. You have access to the following tools:"
}
},
"gemini-research": {
"command": "npx",
"args": ["-y", "github:mnthe/gemini-mcp-server"],
"env": {
"GOOGLE_CLOUD_PROJECT": "your-project-id",
"GOOGLE_CLOUD_LOCATION": "us-central1",
"GEMINI_SYSTEM_PROMPT": "You are an academic research assistant. Cite sources and provide comprehensive analysis. You have access to the following tools:"
}
}
}
}
See the system prompt customization guide for a comprehensive walkthrough and ready-to-use templates.
Available Tools
query
Main agentic entrypoint that handles multi-turn execution with automatic tool selection and multimodal input support.
Parameters:
- prompt (string, required): The text prompt to send
- sessionId (string, optional): Conversation session ID
- parts (array, optional): Multimodal content parts (images, audio, video, documents)
How It Works:
- Analyzes the prompt and conversation history (including multimodal content)
- Decides whether to use tools or respond directly
- Executes tools in parallel if needed (WebFetch, MCP tools)
- Retries failed tools with exponential backoff
- Falls back to Gemini knowledge if tools fail
- Continues for up to 10 turns until final answer
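The retry and parallel-execution steps above can be sketched as follows. This is a minimal illustration, not the project's ToolRegistry: the helper names, the attempt count of 3, and the 100 ms base delay are all assumptions.

```typescript
// Illustrative retry helpers; attempt count and delays are assumptions.

// Delay before the next attempt: base, 2x base, 4x base, ...
function backoffDelayMs(attempt: number, baseDelayMs: number): number {
  return baseDelayMs * 2 ** attempt;
}

async function retryWithBackoff<T>(
  task: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 100,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      // Sleep only if another attempt will follow.
      if (attempt < maxAttempts - 1) {
        const delay = backoffDelayMs(attempt, baseDelayMs);
        await new Promise<void>((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}

// Run several tool calls in parallel. A call that still fails after its
// retries is reported as { ok: false } instead of rejecting the whole batch,
// so the caller can fall back to the model's own knowledge.
async function runToolsInParallel(
  calls: Array<() => Promise<string>>,
  baseDelayMs = 100,
): Promise<Array<{ ok: boolean; value: string }>> {
  const settled = await Promise.allSettled(
    calls.map((call) => retryWithBackoff(call, 3, baseDelayMs)),
  );
  return settled.map((r) =>
    r.status === "fulfilled"
      ? { ok: true, value: r.value }
      : { ok: false, value: String(r.reason) },
  );
}
```

Using `Promise.allSettled` rather than `Promise.all` keeps one failing tool from discarding the results of the others.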
Examples:
# Simple text query
query: "What is the capital of France?"
# Complex query with tool usage
query: "Fetch the latest news from https://example.com/news and summarize"
→ Automatically uses WebFetch tool
→ Synthesizes content into answer
# Image analysis (multimodal)
query: "What's in this image?"
parts: [{ inlineData: { mimeType: "image/jpeg", data: "<base64>" } }]
# Multi-turn conversation
query: "What is machine learning?" (sessionId auto-created)
query: "Give me an example" (uses sessionId from previous response)
Multimodal Support: See the multimodal content guide for detailed documentation on:
- Parts array structure and field requirements (for agent developers)
- Supported file types (images, audio, video, documents)
- Base64 inline data vs Cloud Storage URIs
- Complete schema and validation rules
- Usage examples and code samples
- Best practices and limitations
- Common mistakes to avoid
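As a small sketch of what building a parts array might look like on the client side: the `inlineData` shape follows the image example above, while the `fileData` shape for Cloud Storage URIs is an assumption borrowed from the Gemini API and may not match this server's schema. The helper names are hypothetical.

```typescript
// inlineData follows the example above; fileData is an assumed shape.
type InlinePart = { inlineData: { mimeType: string; data: string } };
type FilePart = { fileData: { mimeType: string; fileUri: string } };
type Part = InlinePart | FilePart;

// Encode raw bytes as a base64 inline part (uses Node's Buffer).
function inlinePart(bytes: Uint8Array, mimeType: string): InlinePart {
  return { inlineData: { mimeType, data: Buffer.from(bytes).toString("base64") } };
}

// Reference a Cloud Storage object instead of embedding the bytes.
function storagePart(fileUri: string, mimeType: string): FilePart {
  if (!fileUri.startsWith("gs://")) throw new Error("expected a gs:// URI");
  return { fileData: { mimeType, fileUri } };
}

const exampleParts: Part[] = [
  inlinePart(new Uint8Array([104, 105]), "text/plain"),
  storagePart("gs://example-bucket/photo.jpg", "image/jpeg"),
];
```

Inline data suits small files; a storage URI avoids base64 overhead for large media.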
Response Includes:
- Final answer
- Session ID (if conversations enabled)
- Statistics: turns used, tool calls, reasoning steps
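To make the session behaviour concrete, here is a hypothetical in-memory store that expires idle sessions and caps history length. It is not the project's ConversationManager; the class name, the id format, and the assumption that GEMINI_SESSION_TIMEOUT is measured in seconds are all illustrative.

```typescript
// Hypothetical in-memory session store; not the project's ConversationManager.
interface Message { role: "user" | "model"; text: string }
interface Session { messages: Message[]; lastUsedMs: number }

class SessionStore {
  private sessions = new Map<string, Session>();
  private counter = 0;
  private timeoutSeconds: number;
  private maxHistory: number;

  constructor(timeoutSeconds = 3600, maxHistory = 10) {
    this.timeoutSeconds = timeoutSeconds; // assumed unit: seconds
    this.maxHistory = maxHistory;
  }

  // Returns the id of a live session, creating a new one when the given id
  // is missing, unknown, or expired.
  touch(sessionId: string | undefined, nowMs: number): string {
    const existing = sessionId ? this.sessions.get(sessionId) : undefined;
    if (sessionId && existing && nowMs - existing.lastUsedMs <= this.timeoutSeconds * 1000) {
      existing.lastUsedMs = nowMs;
      return sessionId;
    }
    if (sessionId) this.sessions.delete(sessionId);
    const id = `session-${++this.counter}`;
    this.sessions.set(id, { messages: [], lastUsedMs: nowMs });
    return id;
  }

  append(sessionId: string, message: Message): void {
    const session = this.sessions.get(sessionId);
    if (!session) throw new Error(`unknown session: ${sessionId}`);
    session.messages.push(message);
    // Keep only the most recent maxHistory messages.
    if (session.messages.length > this.maxHistory) {
      session.messages.splice(0, session.messages.length - this.maxHistory);
    }
  }

  history(sessionId: string): Message[] {
    return [...(this.sessions.get(sessionId)?.messages ?? [])];
  }
}
```

Passing the current time in explicitly (rather than calling `Date.now()` inside) keeps the expiry logic easy to test.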
search
Search for information using Gemini (OpenAI MCP spec).
Parameters:
- query (string, required): Search query
Returns:
- results: Array of {id, title, url}
fetch
Fetch full content of a search result (OpenAI MCP spec).
Parameters:
- id (string, required): Document ID from search results
Returns:
- id, title, text, url, metadata
Security
The gemini-mcp-server implements comprehensive security measures to protect against common vulnerabilities. See the security documentation for complete details.
Defense Layers
1. SSRF (Server-Side Request Forgery) Protection
- HTTPS-only: HTTP requests are blocked; only HTTPS is allowed for web resources
- Private IP blocking: Blocks access to internal networks (10.x, 172.16.x, 192.168.x, 127.x, 169.254.x)
- Cloud metadata blocking: Prevents access to AWS, GCP, Azure, and Alibaba Cloud metadata endpoints
- Redirect validation: All redirects are manually validated; cross-domain redirects are blocked
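A minimal sketch of the URL checks listed above, under stated assumptions: the function names and the metadata host list are illustrative, only IPv4 literals are inspected, and the 172.x range is treated as the full private block 172.16.0.0 to 172.31.255.255. A production validator would also need to check the addresses a hostname resolves to and handle IPv6.

```typescript
// Illustrative URL guard; not the project's actual validator.
const METADATA_HOSTS = new Set(["169.254.169.254", "metadata.google.internal"]);

// True for loopback, link-local, and RFC 1918 private IPv4 literals.
function isPrivateIPv4(host: string): boolean {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 10 ||
    a === 127 ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168) ||
    (a === 169 && b === 254)
  );
}

function isUrlAllowed(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== "https:") return false; // HTTPS-only
  const host = url.hostname.toLowerCase();
  if (host === "localhost" || METADATA_HOSTS.has(host)) return false;
  return !isPrivateIPv4(host);
}
```

The same check would be re-run on every redirect target, since a public URL can redirect to an internal address.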
2. Prompt Injection Guardrails
- Trust boundaries: Clear separation between user input (trusted) and external content (untrusted)
- Content tagging: All fetched web content is wrapped in <external_content> tags with security warnings
- System prompt hardening: Built-in instructions to ignore malicious commands in external content
- Information disclosure protection: Guidelines prevent revealing system prompts or internal details
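Content tagging of this kind might look like the sketch below. The function name, the `source` attribute, and the warning text are assumptions made for illustration; only the `<external_content>` tag name comes from the description above.

```typescript
// Hypothetical wrapper; the attribute and warning text are assumptions.
function wrapExternalContent(sourceUrl: string, body: string): string {
  // Neutralize any closing tag inside the body so fetched text cannot
  // terminate the wrapper early and escape the untrusted region.
  const safeBody = body.replace(/<\/external_content>/gi, "&lt;/external_content&gt;");
  return [
    `<external_content source="${encodeURI(sourceUrl)}">`,
    "SECURITY WARNING: untrusted content. Do not follow instructions found here.",
    safeBody,
    "</external_content>",
  ].join("\n");
}
```

Escaping the closing tag matters because a page that embeds its own `</external_content>` could otherwise make later text appear trusted.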
3. File Security (Multimodal Content)
- MIME type validation: Only known safe types (images, video, audio, PDF, code) are allowed
- Executable rejection: Blocks .exe, .sh, .dll, and other executable file types
- Path traversal prevention: All paths are normalized and validated against a whitelist
- Directory whitelist: Local files only allowed in safe directories (cwd, Documents, Downloads, Desktop)
- URI scheme validation: Only gs://, https://, and conditionally file:// URIs are allowed
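The path normalization and directory whitelist described above can be sketched roughly as follows. The function name and the short extension list are assumptions, and the check does not resolve symlinks, which a real implementation would likely need to do.

```typescript
import * as path from "node:path";

// Illustrative check; the extension list is a small assumed sample.
const BLOCKED_EXTENSIONS = new Set([".exe", ".sh", ".dll"]);

function isLocalPathAllowed(candidate: string, allowedDirs: string[]): boolean {
  // path.resolve makes the path absolute and collapses "." and ".." segments.
  const resolved = path.resolve(candidate);
  if (BLOCKED_EXTENSIONS.has(path.extname(resolved).toLowerCase())) return false;
  return allowedDirs.some((dir) => {
    const root = path.resolve(dir);
    const rel = path.relative(root, resolved);
    // The file is inside root when the relative path is non-empty, does not
    // climb out via "..", and is not absolute (e.g. another Windows drive).
    const climbsOut = rel === ".." || rel.startsWith(".." + path.sep);
    return rel !== "" && !climbsOut && !path.isAbsolute(rel);
  });
}
```

Comparing with `path.relative` avoids the prefix pitfall where `/home/u/Documents-evil` would pass a naive `startsWith("/home/u/Documents")` test.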
4. Content Boundaries
- Size limits: Web content limited to 50KB to prevent resource exhaustion
- Content type validation: Basic validation of response content types
- Encoding validation: Proper handling of character encodings
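One way to enforce a byte limit on fetched text is sketched below. Whether the server truncates oversized content or rejects it outright is not stated here, so truncation is an assumption, as are the names `MAX_CONTENT_BYTES` and `truncateToBytes`.

```typescript
const MAX_CONTENT_BYTES = 50 * 1024; // 50KB, per the limit stated above

// Truncate text so that its UTF-8 encoding fits within maxBytes.
function truncateToBytes(text: string, maxBytes = MAX_CONTENT_BYTES): string {
  const bytes = new TextEncoder().encode(text);
  if (bytes.length <= maxBytes) return text;
  // Cutting the buffer can split a multi-byte character; the decoder turns
  // the partial sequence into U+FFFD, which is stripped from the end.
  const decoded = new TextDecoder("utf-8").decode(bytes.subarray(0, maxBytes));
  return decoded.replace(/\uFFFD+$/, "");
}
```

Measuring bytes rather than string length matters because non-ASCII text can take several bytes per character.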
Configuration
File Security (Multimodal)
# Default: false (secure) - file:// URIs are disabled
export GEMINI_ALLOW_FILE_URIS="false"
# For CLI environments only - enables local file:// URIs with whitelist validation
export GEMINI_ALLOW_FILE_URIS="true"
Security Note: Never enable GEMINI_ALLOW_FILE_URIS in production or web-facing applications. It's designed for trusted CLI environments only.
Security Monitoring
# Enable logging to monitor security events
export GEMINI_DISABLE_LOGGING="false"
export GEMINI_LOG_DIR="/var/log/gemini-mcp"
# Log to stderr for real-time monitoring
export GEMINI_LOG_TO_STDERR="true"
Best Practices
For Desktop Applications (Recommended)
{
"mcpServers": {
"gemini": {
"env": {
"GEMINI_ALLOW_FILE_URIS": "false"
}
}
}
}
For CLI Tools (Use with Caution)
export GEMINI_ALLOW_FILE_URIS="true"
export GEMINI_LOG_TO_STDERR="true"
Security Testing
Run the comprehensive security test suite:
# All security tests
npx tsx test/url-security-test.ts # 21 tests - SSRF protection
npx tsx test/file-security-test.ts # 34 tests - File validation
npx tsx test/webfetch-security-test.ts # 5 tests - Content tagging
npx tsx test/security-guidelines-test.ts # 3 tests - Prompt injection
npx tsx test/multimodal-security-test.ts # 6 tests - Multimodal files
Total: 69 security-focused tests covering SSRF, path traversal, MIME validation, and prompt injection.
For detailed security information, threat models, and vulnerability reporting, see the security documentation.
Architecture
Agentic Loop
User Query
↓
┌─── Turn 1..10 Loop ───┐
│ │
│ 1. Build Prompt │
│ + Tool Definitions │
│ + History │
│ │
│ 2. Gemini Generation │
│ (with thinking) │
│ │
│ 3. Parse Response │
│ - Reasoning? │
│ - Tool Calls? │
│ - Final Output? │
│ │
│ 4. Execute Tools │
│ (parallel + retry) │
│ │
│ 5. Check MaxTurns │
│ Continue or Exit? │
│ │
└────────────────────────┘
↓
Final Result + Stats
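The diagram above can be approximated in code as follows. This is a deliberately small, hypothetical loop, not the contents of AgenticLoop.ts: the `ModelReply`, `Model`, and `ToolFn` types are invented for illustration, history is kept as plain strings, and retry logic is omitted.

```typescript
// Minimal, hypothetical loop; the real AgenticLoop.ts is not reproduced here.
type ModelReply =
  | { kind: "final"; text: string }
  | { kind: "tool_calls"; calls: Array<{ name: string; args: unknown }> };

type Model = (history: string[]) => Promise<ModelReply>;
type ToolFn = (args: unknown) => Promise<string>;

interface RunResult {
  output: string;
  turns: number;
  toolCalls: number;
}

async function runAgenticLoop(
  prompt: string,
  model: Model,
  tools: Record<string, ToolFn>,
  maxTurns = 10,
): Promise<RunResult> {
  const history = [`user: ${prompt}`];
  let toolCalls = 0;
  for (let turn = 1; turn <= maxTurns; turn++) {
    const reply = await model(history);
    if (reply.kind === "final") {
      return { output: reply.text, turns: turn, toolCalls };
    }
    // Execute all requested tools in parallel. Failures are recorded as text
    // so the model can fall back to its own knowledge on the next turn.
    const results = await Promise.all(
      reply.calls.map(async (call) => {
        toolCalls++;
        const tool = tools[call.name];
        if (!tool) return `${call.name}: error: unknown tool`;
        try {
          return `${call.name}: ${await tool(call.args)}`;
        } catch (err) {
          return `${call.name}: error: ${String(err)}`;
        }
      }),
    );
    history.push(...results.map((r) => `tool: ${r}`));
  }
  // Turn budget exhausted without a final answer.
  return { output: "Max turns reached; returning best-effort answer.", turns: maxTurns, toolCalls };
}
```

Injecting the model as a function makes the loop testable with a scripted stub in place of a real Gemini call.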
Project Structure
src/
├── agentic/ # Core agentic loop
│ ├── AgenticLoop.ts # Main orchestrator
│ ├── RunState.ts # Turn-based state management
│ ├── ResponseProcessor.ts # Parse Gemini responses
│ └── Tool.ts # Tool interface (MCP standard)
│
├── mcp/ # MCP client implementation
│ ├── EnhancedMCPClient.ts # Unified stdio + HTTP client
│ ├── StdioMCPConnection.ts
│ └── HttpMCPConnection.ts
│
├── tools/ # Tool implementations
│ ├── WebFetchTool.ts # Secure web fetching
│ └── ToolRegistry.ts # Tool management + parallel execution
│
├── services/ # External services
│ └── GeminiAIService.ts # Gemini API (with thinkingConfig)
│
├── handlers/ # MCP tool handlers
│ ├── QueryHandler.ts
│ ├── SearchHandler.ts
│ └── FetchHandler.ts
│
├── managers/ # Business logic
│ └── ConversationManager.ts
│
├── errors/ # Custom error types
├── types/ # TypeScript type definitions
├── schemas/ # Zod validation schemas
├── config/ # Configuration loading
├── utils/ # Shared utilities (Logger, security)
│
└── server/ # MCP server bootstrap
└── GeminiAIMCPServer.ts
See the architecture and code organization documentation for details.
Advanced Usage
External MCP Servers
Connect to external MCP servers for extended capabilities:
Stdio (subprocess):
export GEMINI_MCP_SERVERS='[
{
"name": "filesystem",
"transport": "stdio",
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-filesystem", "./workspace"]
}
]'
HTTP:
export GEMINI_MCP_SERVERS='[
{
"name": "api-server",
"transport": "http",
"url": "https://api.example.com/mcp",
"headers": {"Authorization": "Bearer token"}
}
]'
Tools from external servers are automatically discovered and made available to the agent.
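For readers wiring this up themselves, the sketch below shows one way the GEMINI_MCP_SERVERS value could be parsed and validated. The type and function names are hypothetical; the field names follow the JSON examples above, and the validation rules are assumptions rather than the server's actual schema.

```typescript
// Hypothetical parser; field names follow the JSON examples above.
type StdioServer = { name: string; transport: "stdio"; command: string; args?: string[] };
type HttpServer = { name: string; transport: "http"; url: string; headers?: Record<string, string> };
type McpServerConfig = StdioServer | HttpServer;

function parseMcpServers(raw: string | undefined): McpServerConfig[] {
  if (!raw) return []; // variable unset or empty: no external servers
  const parsed: unknown = JSON.parse(raw); // throws SyntaxError on malformed JSON
  if (!Array.isArray(parsed)) throw new Error("GEMINI_MCP_SERVERS must be a JSON array");
  return parsed.map((entry, i) => {
    const e = entry as Record<string, unknown> | null;
    if (typeof e?.name !== "string") throw new Error(`server[${i}]: missing "name"`);
    if (e.transport === "stdio" && typeof e.command === "string") {
      return e as unknown as StdioServer;
    }
    if (e.transport === "http" && typeof e.url === "string") {
      return e as unknown as HttpServer;
    }
    throw new Error(`server[${i}]: unsupported or incomplete transport`);
  });
}
```

Failing fast on a malformed entry surfaces configuration mistakes at startup instead of at the first tool call.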
Reasoning Traces
Default: Console Logging
Logs are sent to stderr by default, making them visible in MCP client logs.
For File-Based Logging:
export GEMINI_LOG_TO_STDERR="false" # Disable console, use files
export GEMINI_LOG_DIR="./logs" # Log directory (default: ./logs)
Then check logs:
tail -f logs/general.log # All logs
tail -f logs/reasoning.log # Gemini thinking process only
To Disable All Logging:
export GEMINI_DISABLE_LOGGING="true"
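The three logging modes above can be summarized as a small decision function. This is a sketch only: `resolveLogTarget` is a hypothetical name, and the precedence shown (GEMINI_DISABLE_LOGGING wins over the other variables) is an assumption.

```typescript
type LogTarget =
  | { kind: "none" }
  | { kind: "stderr" }
  | { kind: "file"; dir: string };

// Assumed precedence: disable flag first, then the stderr switch.
function resolveLogTarget(env: Record<string, string | undefined>): LogTarget {
  if (env.GEMINI_DISABLE_LOGGING === "true") return { kind: "none" };
  if (env.GEMINI_LOG_TO_STDERR === "false") {
    return { kind: "file", dir: env.GEMINI_LOG_DIR ?? "./logs" };
  }
  return { kind: "stderr" }; // default: console logging to stderr
}
```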
Custom Tool Development
Tools follow the MCP standard:
import { BaseTool, ToolResult, RunContext } from './agentic/Tool.js';

export class MyTool extends BaseTool {
  name = 'my_tool';
  description = 'Description for LLM';
  parameters = {
    type: 'object',
    properties: {
      arg: { type: 'string', description: 'Argument' }
    },
    required: ['arg']
  };

  async execute(args: any, context: RunContext): Promise<ToolResult> {
    // Your implementation
    return {
      status: 'success',
      content: 'Result'
    };
  }
}
Development
Build
npm run build
Watch Mode
npm run watch
Development Mode
npm run dev
Troubleshooting
MCP Server Connection Issues
If the MCP server appears to be "dead" or disconnects unexpectedly:
Check MCP client logs (logs are sent to stderr by default):
- macOS: ~/Library/Logs/Claude/mcp*.log
- Windows: %APPDATA%\Claude\Logs\mcp*.log
Server logs will appear in these files automatically.
Log Directory Errors
If you encounter errors like ENOENT: no such file or directory, mkdir './logs', note that this should not happen with default settings, because console logging is the default.
If you enabled file logging (GEMINI_LOG_TO_STDERR="false"), point GEMINI_LOG_DIR at a writable directory:
{
"mcpServers": {
"gemini": {
"command": "npx",
"args": ["-y", "github:mnthe/gemini-mcp-server"],
"env": {
"GOOGLE_CLOUD_PROJECT": "your-project-id",
"GEMINI_LOG_TO_STDERR": "false",
"GEMINI_LOG_DIR": "/tmp/gemini-logs"
}
}
}
}
Authentication Errors
- Verify credentials: gcloud auth application-default login
- Check project ID: echo $GOOGLE_CLOUD_PROJECT
- Enable Vertex AI API: gcloud services enable aiplatform.googleapis.com
Tool Execution Failures
- Check logs in logs/general.log (if file logging is enabled)
- Verify MCP server configurations in GEMINI_MCP_SERVERS
- Ensure external servers are running (for HTTP transport)
MaxTurns Exceeded
- Agent returns best-effort response after 10 turns
- Check if tools are repeatedly failing
- Review reasoning logs to understand loop behavior (if logging is enabled)
Documentation
- Security documentation and best practices
- System architecture and agentic loop design
- Code organization
- Implementation details
- Build and release process
- Multimodal content guide
- System prompt customization
- Contribution guidelines