llm-vulnerability-mcp

arjun-krishna1/llm-vulnerability-mcp

The LLM Vulnerability Scanner is an open-source MCP server that scans LLM endpoints for vulnerabilities such as hallucination, prompt injection, data leaks, and toxicity.

Tools
  1. scan_model

    Scans an LLM for vulnerabilities using garak and returns a JSON summary.

LLM Vulnerability Scanner

An open-source MCP server that lets agents or humans trigger garak scans against any LLM endpoint and receive a concise vulnerability report (hallucination, prompt-injection, data-leak, toxicity, etc.).

TODO

MVP

  • Create repo
  • Run MCP hello world
  • Run garak hello world
  • Add dependencies mcp[cli], garak
  • Wrap garak CLI in async function
  • Parse garak output to a JSON summary
  • Do end to end demo on Cursor
  • Prepare demo presentation with two slides
  • Installation guide (uvx --from git+https://github.com/arjun-krishna1/llm-vulnerability-mcp) and JSON
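
The "wrap garak CLI in async function" item could be sketched roughly as below. This is a minimal illustration, not the repository's actual code: the helper names build_garak_command, run_command, and run_garak are hypothetical, and the --model_type / --model_name flags are assumed from garak's CLI and should be checked against `python -m garak --help` for the installed version.

```python
from __future__ import annotations

import asyncio
import sys


def build_garak_command(model_type: str, model_name: str) -> list[str]:
    """Build the argv for a garak run with its default probe set (hypothetical helper)."""
    # Assumption: garak is installed for this interpreter and accepts
    # --model_type / --model_name; verify with `python -m garak --help`.
    return [
        sys.executable, "-m", "garak",
        "--model_type", model_type,
        "--model_name", model_name,
    ]


async def run_command(
    argv: list[str], env: dict[str, str] | None = None
) -> tuple[int | None, str]:
    """Run a command without blocking the event loop.

    Returns the exit code and the combined stdout/stderr text.
    """
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,  # merge stderr into stdout
        env=env,  # None means inherit the parent environment
    )
    stdout, _ = await proc.communicate()
    return proc.returncode, stdout.decode(errors="replace")


async def run_garak(model_type: str, model_name: str) -> tuple[int | None, str]:
    """Run a garak scan asynchronously (hypothetical wrapper)."""
    return await run_command(build_garak_command(model_type, model_name))
```

Keeping the command builder separate from the subprocess call makes the argv easy to unit-test without garak installed.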

Nice to haves

  • Custom probe selection
  • Streaming progress updates
  • Batch comparison

MVP

One MCP tool called scan_model that takes model_type, model_name, and (optionally) api_key, runs garak with its default probe set, and returns a JSON summary plus the original garak.log as an MCP resource.
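
As a rough illustration of the JSON summary step, the sketch below aggregates per-probe pass/fail counts from a JSON Lines report. It assumes, without confirmation from this README, that each evaluation record carries entry_type, probe, passed, and total fields; the function name summarize_report and the output shape are hypothetical, so the field names should be checked against a real garak report before relying on this.

```python
import json
from collections import defaultdict


def summarize_report(jsonl_text: str) -> dict:
    """Aggregate per-probe pass/fail counts from a JSON Lines report.

    Assumed record shape (not confirmed by the source):
    {"entry_type": "eval", "probe": "...", "passed": 8, "total": 10}
    """
    counts = defaultdict(lambda: {"passed": 0, "total": 0})
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if not isinstance(entry, dict) or entry.get("entry_type") != "eval":
            continue  # only evaluation records carry pass/fail counts
        stats = counts[entry.get("probe", "unknown")]
        stats["passed"] += int(entry.get("passed", 0))
        stats["total"] += int(entry.get("total", 0))

    probes = {}
    for name, stats in sorted(counts.items()):
        failed = stats["total"] - stats["passed"]
        rate = round(failed / stats["total"], 3) if stats["total"] else 0.0
        probes[name] = {**stats, "failed": failed, "fail_rate": rate}

    return {
        "probes": probes,
        # Probes with at least one failing attempt
        "vulnerable_probes": [n for n, s in probes.items() if s["failed"] > 0],
    }
```

Records that are not evaluation entries, or lines that fail to parse, are skipped rather than raising, so a partially written report still yields a summary.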

Usage

The MCP server exposes a scan_model tool for scanning LLMs for vulnerabilities.

Example usage in Claude:

Use the scan_model tool to test the model at https://openrouter.ai/models/mistralai/mistral-7b-instruct

The tool will:

  1. Parse OpenRouter URLs automatically
  2. Use the OPENROUTER_API_KEY from environment if not provided
  3. Return a JSON summary of vulnerabilities found
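
The first two steps above might look something like the following sketch. The function names parse_model_name and resolve_api_key are hypothetical, and the assumption that OpenRouter model pages live under /models/<vendor>/<model> is taken only from the example URL in this section.

```python
import os
from urllib.parse import urlparse


def parse_model_name(model_name: str) -> str:
    """Turn an OpenRouter model page URL into a model ID.

    Anything that is not an OpenRouter URL is returned unchanged.
    """
    parsed = urlparse(model_name)
    is_openrouter = parsed.scheme in ("http", "https") and parsed.hostname in (
        "openrouter.ai",
        "www.openrouter.ai",
    )
    if not is_openrouter:
        return model_name
    path = parsed.path.strip("/")
    # Assumption: model pages live under /models/<vendor>/<model>
    if path.startswith("models/"):
        path = path[len("models/"):]
    return path


def resolve_api_key(api_key=None, env=None):
    """Prefer an explicit key, else fall back to OPENROUTER_API_KEY."""
    env = os.environ if env is None else env
    return api_key or env.get("OPENROUTER_API_KEY")
```

With the example URL from this section, parse_model_name returns "mistralai/mistral-7b-instruct", while a plain name such as "gpt-4o-mini" passes through untouched.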

Parameters:

  • model_type: Type of model (e.g., 'openai', 'openrouter', 'huggingface')
  • model_name: Model name or OpenRouter URL
  • api_key: Optional API key (uses OPENROUTER_API_KEY env var if not provided)

Nice-to-haves

  • choose probes interactively
  • live progress streaming
  • batch compare models
  • downloadable HTML dashboard
  • Slack / Discord webhook alerts
  • GitHub Action wrapper
  • OWASP-style severity score
  • persistent scan history (SQLite)