perception-mcp

lintyourcode/perception-mcp

Perception-MCP is a lightweight Model Context Protocol server designed to answer questions about multimedia files using advanced multimodal models.

Perception-MCP

A lightweight Model Context Protocol (MCP) server that lets you ask any question about an image, audio, or video file and returns an answer powered by state-of-the-art multimodal models served through fal.ai.

Prerequisites

Installation

git clone --recurse-submodules https://github.com/lintyourcode/perception-mcp.git
cd perception-mcp
cp mcp_agent.secrets_template.yaml mcp_agent.secrets.yaml
$EDITOR mcp_agent.secrets.yaml

Usage

To use Perception-MCP with Claude Desktop (v0.3.7+), add the following entry to your claude_desktop_config.json file:

{
  "mcpServers": {
    "perception-mcp": {
      "command": "fastmcp",
      "args": ["run", "perception-mcp", "serve"]
    }
  }
}
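
If claude_desktop_config.json already lists other servers, the entry above has to be merged into the existing "mcpServers" object rather than pasted over it. The following is a minimal sketch of that merge, not part of Perception-MCP itself: the helper name add_mcp_server is hypothetical, and the location of the config file depends on your platform, so it is passed in as a parameter.

```python
import json
from pathlib import Path


def add_mcp_server(config_path, name, command, args):
    """Merge one server entry into a Claude Desktop config file.

    Hypothetical helper for illustration only. Existing servers and any
    other top-level settings in the file are preserved.
    """
    path = Path(config_path)
    # Start from the existing config if the file is present, else from an empty one.
    if path.exists():
        config = json.loads(path.read_text(encoding="utf-8"))
    else:
        config = {}
    # Create the "mcpServers" object if it is missing, then add or replace the entry.
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": list(args)}
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling add_mcp_server(path, "perception-mcp", "fastmcp", ["run", "perception-mcp", "serve"]) with the path to your own config file writes the same entry shown above. Restart Claude Desktop afterwards so it rereads the file.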

Tools

Perception-MCP provides the following tools:

  • query_image: Answer a question about an image's contents
  • query_audio: Answer a question about an audio file's contents
  • query_video: Answer a question about a video's contents
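
An MCP client invokes these tools with a JSON-RPC 2.0 "tools/call" request. The sketch below only builds such a request to show its shape. The helper name build_tool_call is hypothetical, and the argument names "path" and "question" are assumptions: this README does not document the tools' parameters, so check the schema the server reports via "tools/list" before relying on them.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP "tools/call" request as a plain dict (illustrative helper)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Argument names below are assumed for illustration, not taken from the server.
request = build_tool_call(
    1,
    "query_image",
    {"path": "photo.jpg", "question": "What is in this picture?"},
)
payload = json.dumps(request)  # the JSON text a client would send to the server
```

In practice Claude Desktop or another MCP client constructs and sends this message for you; the sketch is only meant to make the request structure concrete.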

Development

Running tests

uv run pytest -q