FairMind MCP
FairMind MCP is a Model Context Protocol server that enables AI agents to mathematically verify and self-correct bias in real-time using Green AI principles.
Features
- Token-Optimized: Uses TOON (Token-Oriented Object Notation) format for efficient LLM context usage
- High-Performance Inference: Supports LiteRT for fast on-device inference, with robust fallback to heuristic-based counterfactual generation if models are unavailable
- Industry-Standard Metrics: Wraps Fairlearn and AIF360 for rigorous fairness auditing. Clearly distinguishes between:
  - Statistical Audits: Rigorous analysis of classification datasets with ground truth
  - Heuristic Proxies: Estimated bias metrics for generative content where ground truth is simulated via keyword patterns
- True Code Analysis: Detects bias in source code using AST parsing (via `esprima` for JS/TS and `ast` for Python). Detects structural bias, control flow divergence, and variable naming issues with high precision (see the short example after this list)
- Configurable Detection: Load custom stereotypes and patterns via `bias_config.json` or environment variables
- LLM Testing Tools: Systematic bias testing for custom LLMs and fine-tuned models - batch evaluation, prompt suite testing, and real-time monitoring
- Agent-Native: Built for the Model Context Protocol, works with Claude Desktop, Cline, and Cursor
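For intuition, the snippet below shows the kind of pattern the code analysis is described as flagging. It is purely illustrative; the actual rules live in `bias_config.json` and the AST analyzers, and the identifiers here are made up for the example.

```ts
// Illustrative only: patterns a code audit like this might flag (assumed, not
// the project's actual rule set).
const whitelist = ["admin"]; // non-inclusive terminology ("whitelist" vs. "allowlist")

// "The user probably forgot his password" -> gendered assumption in a comment
function resetPassword(user: { name: string }): boolean {
  // Structural-bias angle: does one persona's generated code get extra gates like this?
  if (!whitelist.includes(user.name)) {
    return false;
  }
  return true;
}
```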
Architecture
The system follows a Broker Pattern:
- MCP Server (TypeScript/Bun): Handles MCP protocol communication and manages the Python kernel with auto-restart, timeout protection, and automatic warm-up to eliminate first-request penalty
- Python Analysis Kernel: Runs fairness auditing using Fairlearn/AIF360 and LiteRT-powered inference
- TOON Codec: Optimized serialization format for token-optimized communication, with enhanced support for large payloads and nested structures
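For intuition, TOON packs uniform records into a CSV-like table with an explicit length and field header, which usually costs far fewer tokens than the equivalent JSON. The snippet below sketches the general TOON style only; the actual payloads FairMind serializes may differ.

```
results[2]{attribute,metric,score}:
  gender,demographic_parity,0.12
  age,demographic_parity,0.08
```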
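To make the broker pattern concrete, here is a minimal stdio bridge sketch, not the project's actual `python_bridge.ts`. It assumes a line-delimited JSON protocol between the server and the kernel, and the `ping` warm-up request is an invented placeholder.

```ts
// Minimal broker-pattern sketch (illustrative; not the real python_bridge.ts).
// Assumption: the Python kernel reads one JSON request per stdin line and
// writes one JSON response per stdout line.
import { spawn, type ChildProcessWithoutNullStreams } from "node:child_process";
import { createInterface } from "node:readline";

class PythonKernel {
  private proc!: ChildProcessWithoutNullStreams;
  private pending = new Map<number, (result: unknown) => void>();
  private nextId = 0;

  constructor(private cmd = "uv", private args = ["run", "python", "main.py"]) {
    this.start();
  }

  private start() {
    this.proc = spawn(this.cmd, this.args, { cwd: "py_engine" });
    // One JSON document per stdout line -> resolve the matching pending call.
    createInterface({ input: this.proc.stdout }).on("line", (line) => {
      const msg = JSON.parse(line);
      this.pending.get(msg.id)?.(msg.result);
      this.pending.delete(msg.id);
    });
    // Auto-restart: if the kernel exits, bring it back up.
    this.proc.on("exit", () => this.start());
  }

  // Timeout protection: fail fast if the kernel hangs on a request.
  call(tool: string, params: unknown, timeoutMs = 30_000): Promise<unknown> {
    const id = this.nextId++;
    this.proc.stdin.write(JSON.stringify({ id, tool, params }) + "\n");
    return new Promise((resolve, reject) => {
      const timer = setTimeout(() => {
        this.pending.delete(id);
        reject(new Error(`kernel timeout after ${timeoutMs}ms`));
      }, timeoutMs);
      this.pending.set(id, (result) => {
        clearTimeout(timer);
        resolve(result);
      });
    });
  }
}

// Warm-up: issue a cheap request at startup so the first real call is fast
// ("ping" is a placeholder, not a documented kernel command).
const kernel = new PythonKernel();
void kernel.call("ping", {});
```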
Installation
Prerequisites
- Bun (latest)
- Python 3.11+ (managed via `uv`)
- `uv` package manager
Setup
```bash
# Install Bun dependencies
bun install

# Setup Python environment
cd py_engine
uv sync
```
Configuration
You can customize the bias detection patterns (stereotypes, roles, occupations) by editing py_engine/bias_config.json.
To use a custom configuration file without modifying the source, set the FAIRMIND_BIAS_CONFIG environment variable:
```bash
export FAIRMIND_BIAS_CONFIG=/path/to/your/custom_bias_config.json
```
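As a rough illustration, a custom config could look like the following. The key names (`stereotypes`, `occupations`, `roles`) mirror the categories mentioned above but are assumptions; check the shipped `py_engine/bias_config.json` for the real schema.

```json
{
  "stereotypes": {
    "gender": {
      "male": ["assertive", "logical"],
      "female": ["emotional", "nurturing"]
    }
  },
  "occupations": ["engineer", "nurse", "teacher"],
  "roles": ["leader", "assistant"]
}
```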
Usage
Start the MCP server:
```bash
bun run src/index.ts
```
The server communicates via stdio and exposes seven tools (example calls are sketched after the list):
- `evaluate_bias`: Evaluates text, code, or data for bias against protected attributes
  - Use `content_type="text"` for natural language content
  - Use `content_type="code"` for source code analysis (comments, variable names, algorithmic bias, inclusive terminology)
  - Supports single or multiple attributes (via the `protected_attributes` array)
- `evaluate_bias_advanced`: Advanced evaluation with full Fairlearn MetricFrame and AIF360 support
  - For generative text, this uses Heuristic Proxies to estimate metrics like "Demographic Parity" based on text patterns
  - For classification datasets, it provides rigorous statistical analysis
- `compare_code_bias`: Compares two code snippets generated for different personas to detect structural bias
  - Uses differential AST analysis to find complexity disparities
  - Alerts if one persona receives significantly more complex code (>1.5x ratio)
  - Detects control flow divergence (extra validation steps)
- `generate_counterfactuals`: Generates alternative text suggestions to reduce bias
  - Uses LiteRT models when available, falls back to heuristics
- `evaluate_model_outputs`: Batch evaluation tool for testing multiple LLM/fine-tuned model outputs with aggregated reporting
  - Designed for comprehensive pre-deployment testing
  - Returns overall pass rates, failure patterns, and per-attribute metrics
- `evaluate_prompt_suite`: Systematic prompt suite testing with tracking over time
  - Ideal for fine-tuning validation and continuous monitoring
  - Compares results across training epochs or model versions
- `evaluate_model_response`: Real-time single-output testing for quick bias checks during inference
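For example, a raw MCP `tools/call` request to `evaluate_bias` might look like the sketch below. The `content` field name and the exact argument schema are assumptions; inspect the tool schemas the server advertises for the real contract.

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "evaluate_bias",
    "arguments": {
      "content": "The engineer finished his work early.",
      "content_type": "text",
      "protected_attributes": ["gender"]
    }
  }
}
```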
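From an agent or test harness, the same tools can be driven programmatically with the MCP TypeScript SDK. The snippet below is a sketch under assumptions: the `code_a`/`code_b`/`personas` argument names are invented for illustration, so call `client.listTools()` to see the actual schemas.

```ts
// Sketch: calling FairMind MCP tools programmatically (argument names assumed).
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

const client = new Client({ name: "fairmind-smoke-test", version: "0.1.0" });
await client.connect(
  new StdioClientTransport({ command: "bun", args: ["run", "src/index.ts"] })
);

const result = await client.callTool({
  name: "compare_code_bias",
  arguments: {
    code_a: "function approveLoan(user) { return approve(user); }",
    code_b: "function approveLoan(user) { if (!verify(user)) throw new Error('denied'); return approve(user); }",
    personas: ["persona_a", "persona_b"],
  },
});
console.log(result.content);
```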
Development
Project Structure
```
fairmind-mcp/
├── src/
│   ├── index.ts           # MCP Server entry point
│   ├── python_bridge.ts   # Robust Python process manager
│   └── types.ts           # TypeScript type definitions
├── py_engine/
│   ├── main.py            # Python entry point with Pydantic validation
│   ├── models.py          # Pydantic data models
│   ├── core/              # Shared utilities
│   │   ├── auditor.py
│   │   ├── code_auditor.py
│   │   ├── ast_analyzer.py
│   │   └── ...
│   └── tools/             # Tool handlers
│       ├── registry.py
│       └── ...
└── website/
    └── website.html       # Landing page
```
Documentation
- All detected patterns and metrics
- For code generation tools like Cursor
- Test custom LLMs and fine-tuned models for bias
- Full Fairlearn MetricFrame, AIF360, multi-attribute detection
- Compare code for different personas
- Start here for integration
- Setup for code generation
- Real-world use cases
- Development roadmap
Quick Setup
Option 1: Web UI
```bash
bun run ui
```
Then open http://localhost:3000 in your browser to test tools interactively.
Option 2: Claude Desktop
```bash
./scripts/setup-claude-desktop.sh
```
Option 3: Cursor
```bash
./scripts/setup-cursor.sh
```
Manual Setup
See the setup guides in the docs/ directory.
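For a manual Claude Desktop setup, the usual approach is to register the server in `claude_desktop_config.json`, roughly as below. The `fairmind` key and the path are placeholders; the setup script above automates the equivalent change.

```json
{
  "mcpServers": {
    "fairmind": {
      "command": "bun",
      "args": ["run", "/absolute/path/to/fairmind-mcp/src/index.ts"]
    }
  }
}
```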
License
MIT