Perception-MCP
A lightweight Model Context Protocol (MCP) server that lets you ask any question about an image, audio, or video file and returns an answer powered by state-of-the-art multimodal models served through fal.ai.
Prerequisites
- Python 3.11+
- uv
- A fal.ai account & API key
- A Perplexity account & API key
Installation
git clone --recurse-submodules https://github.com/lintyourcode/perception-mcp.git
cd perception-mcp
cp mcp_agent.secrets_template.yaml mcp_agent.secrets.yaml
$EDITOR mcp_agent.secrets.yaml
Usage
To use Perception-MCP with Claude Desktop (v0.3.7+), add the following to your claude_desktop_config.json file:
{
  "mcpServers": {
    "perception-mcp": {
      "command": "fastmcp",
      "args": ["run", "perception-mcp", "serve"]
    }
  }
}
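If claude_desktop_config.json already lists other servers, the entry above has to be merged into the existing "mcpServers" object rather than pasted over it. The following is a minimal sketch of that merge, not part of Perception-MCP itself: the helper name add_perception_server is hypothetical, and the config path is passed in by the caller because its location depends on your platform.

```python
import json
from pathlib import Path

# Entry copied from the configuration shown above.
PERCEPTION_ENTRY = {
    "command": "fastmcp",
    "args": ["run", "perception-mcp", "serve"],
}


def add_perception_server(config_path: Path) -> dict:
    """Merge the perception-mcp entry into a Claude Desktop config file.

    Hypothetical helper: existing servers and other top-level keys are kept,
    and the file is created if it does not exist yet.
    """
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        # Treat an empty file the same as a missing one.
        config = json.loads(text) if text.strip() else {}

    # Create "mcpServers" if absent, then add or overwrite our entry only.
    servers = config.setdefault("mcpServers", {})
    servers["perception-mcp"] = dict(PERCEPTION_ENTRY)

    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Call it with the path to your own config file, for example add_perception_server(Path("claude_desktop_config.json")), and restart Claude Desktop afterwards so the new server is picked up.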
Tools
Perception-MCP provides the following tools:
- query_image: Answer a question about an image's contents
- query_audio: Answer a question about an audio file's contents
- query_video: Answer a question about a video's contents
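An MCP client invokes these tools by sending a JSON-RPC 2.0 "tools/call" request that names the tool and carries its arguments. The sketch below builds such a request for illustration. The argument names "path" and "question" are assumptions, since the tools' actual parameters are not documented here; check the schema the server reports via "tools/list" before relying on them. The helper build_tool_call is likewise hypothetical.

```python
import json
from itertools import count

# Tool names as listed above.
TOOLS = ("query_image", "query_audio", "query_video")

# JSON-RPC requests need an id so responses can be matched to them.
_request_ids = count(1)


def build_tool_call(tool: str, arguments: dict) -> str:
    """Return a JSON-RPC 2.0 "tools/call" request as a JSON string."""
    if tool not in TOOLS:
        raise ValueError(f"unknown tool: {tool!r}")
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(request)


# "path" and "question" are assumed argument names, not confirmed by the README.
message = build_tool_call(
    "query_image",
    {"path": "/tmp/photo.jpg", "question": "What is in this picture?"},
)
```

In practice Claude Desktop or another MCP client library constructs and sends this message for you; the sketch only shows the shape of what travels over the wire.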
Development
Running tests
uv run pytest -q