7etsuo/image-description-mcp_server
If you are the rightful owner of image-description-mcp_server and would like to certify it and/or have it hosted online, please leave a comment on the right or send an email to henry@mcphub.com.
The Image Description MCP Server is a Model Context Protocol server that leverages xAI's Grok API to provide AI-powered image analysis, offering detailed descriptions, metadata extraction, and OCR capabilities.
Image Description MCP Server
A Model Context Protocol (MCP) server that provides AI-powered image analysis using xAI's Grok API.
Purpose
This MCP server provides a secure interface for AI assistants to analyze images using Grok's advanced vision capabilities. It supports both web-hosted images and local files, offering detailed descriptions, technical metadata extraction, and optical character recognition (OCR).
Features
Current Implementation
describe_image_url
- Analyzes images from web URLs and provides AI-generated descriptionsdescribe_image_file
- Analyzes local image files and provides AI-generated descriptionsextract_text_from_image
- Performs OCR to extract readable text from images
Prerequisites
- Docker Desktop with MCP Toolkit enabled
- Docker MCP CLI plugin (
docker mcp
command) - Grok API key from https://console.x.ai/
Installation
See the step-by-step instructions provided with the files.
Usage Examples
In Grok4 Code Fast, you can ask:
- "Describe this image: https://example.com/image.jpg"
- "What does this local image show: /path/to/image.png"
- "Extract any text from this image: https://example.com/document.jpg"
- "Give me a detailed analysis of this photo: https://example.com/photo.jpg"
- "What text can you read in this screenshot: https://example.com/screenshot.png"
Local Testing
# Set environment variables for testing
export GROK_API_KEY="your-grok-api-key"
# Run directly
python image-description-mcp_server.py
# Test MCP protocol
echo '{"jsonrpc":"2.0","method":"tools/list","id":1}' | python image-description-mcp_server.py
Adding New Tools
- Add the function to
image-description-mcp_server.py
- Decorate with
@mcp.tool()
- Update the catalog entry with the new tool name
- Rebuild the Docker image
Troubleshooting
Authentication Errors
- Verify secrets with
docker mcp secret list
- Ensure GROK_API_KEY is set correctly
- Check API key validity at https://console.x.ai/
Image Processing Errors
- Ensure image URLs are accessible and valid
- Check local file paths exist and are readable
- Verify image formats are supported (JPEG, PNG, WebP, etc.)
Security Considerations
- All secrets stored in Docker Desktop secrets
- Never hardcode API keys
- Running as non-root user in Docker
- Images processed temporarily in memory
- No permanent storage of image data
- Sensitive data never logged
API Documentation
This service integrates with xAI's Grok API:
- Grok API Reference: https://docs.x.ai/docs/api-reference
- MCP SDK Documentation: https://github.com/modelcontextprotocol/sdk
Data Sources
External Image URLs
- Source: Web-hosted images accessible via HTTP/HTTPS
- Access Method: HTTP GET requests using httpx
- Purpose: Download images for analysis from any public URL
- Limitations: Only accessible URLs; no authentication-protected images
Local Image Files
- Source: Filesystem access to local image files
- Access Method: Python file I/O
- Purpose: Analyze images stored locally on the user's system
- Supported Paths: Absolute and relative file paths
- Supported Formats: JPEG, PNG, WebP, TIFF, GIF, BMP
Grok API
- Source: xAI's Grok model with vision capabilities
- Access Method: REST API calls via httpx
- Purpose: AI-powered image analysis and description generation
- Data Flow: Images converted to base64, sent to Grok, receive structured analysis
Image Processing
- Source: PIL (Pillow) and OpenCV libraries
- Access Method: Local processing
- Purpose: Extract technical metadata and perform OCR
- No External Calls: Pure local processing
License
MIT License