qter21/ca-codes-mcp-server
CA Legal Codes MCP Server
A Model Context Protocol (MCP) server that provides Claude with access to California legal codes through MongoDB Atlas and Voyage AI embeddings, with built-in anti-hallucination features.
Features
🎯 Core Capabilities
- Semantic Search: Natural language search across 51,802 CA legal code sections using voyage-law-2 embeddings
- Exact Retrieval: Direct access to specific code sections by citation
- Related Statutes: Discover similar and related legal code sections
- Citation Validation: Mandatory validation to prevent hallucination
- Document Drafting: AI-assisted legal document creation with verified citations
🛡️ Anti-Hallucination System
- Retrieval-First Workflow: Forces retrieval before citation
- Citation Tracking: Monitors all retrievals vs citations
- Validation Enforcement: Validates every legal claim
- Structured Responses: Separates retrieved facts from analysis
- System Prompts: Explicit instructions to prevent hallucination
Architecture
ca-codes-mcp-server/
├── server.py               # Main MCP server with stdio protocol
├── config.py               # Configuration management
├── tools/
│   ├── retrieval.py        # Semantic search, get_section, find_similar
│   ├── validation.py       # Citation validation
│   ├── drafting.py         # Document drafting (declarations, MPAs, strategy)
│   └── workflows.py        # Orchestrated multi-step workflows
├── anti_hallucination/
│   ├── tracker.py          # Citation tracking
│   ├── validator.py        # Validation logic
│   └── prompts.py          # System prompts for Claude
├── context/
│   ├── session.py          # Session management
│   └── cache.py            # Result caching
├── db/
│   ├── mongodb.py          # MongoDB Atlas operations
│   └── vector_ops.py       # Vector search
├── utils/
│   ├── voyage.py           # Voyage AI client
│   └── formatters.py       # Response formatting
└── models/
    ├── responses.py        # Structured response models
    └── documents.py        # Document templates
Installation
Prerequisites
- Python 3.9+
- MongoDB Atlas cluster with CA legal codes data
- Voyage AI API key
Setup
- Clone and navigate to project:
cd /path/to/ca-codes-mcp-server
- Create virtual environment:
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
- Configure environment:
cp .env.example .env
# Edit .env with your credentials
Required environment variables:
# MongoDB Atlas
MONGODB_URI=mongodb+srv://user:pass@cluster.mongodb.net/
DATABASE_NAME=ca_codes_db
COLLECTION_NAME=section_contents
VECTOR_INDEX_NAME=legal_codes_vector_index
# Voyage AI
VOYAGE_API_KEY=pa-your-api-key
VOYAGE_MODEL=voyage-law-2
# Anti-Hallucination Settings
MANDATORY_VALIDATION=true
CITATION_TRACKING=true
MIN_RETRIEVAL_SCORE=0.7
# Agent Configuration
CONTEXT_TIMEOUT=3600
CACHE_TTL=1800
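As a rough illustration of how these variables might be read at startup, here is a minimal sketch using only the standard library. The `Settings` class and `load_settings` function are hypothetical names, not taken from the project's `config.py`, and the defaults simply mirror the example values above.

```python
import os
from dataclasses import dataclass


def _as_bool(value: str) -> bool:
    """Interpret common truthy strings such as 'true', '1', or 'yes'."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    # Hypothetical container; the real config.py may be organized differently.
    mongodb_uri: str
    voyage_api_key: str
    database_name: str
    collection_name: str
    vector_index_name: str
    voyage_model: str
    mandatory_validation: bool
    citation_tracking: bool
    min_retrieval_score: float
    context_timeout: int
    cache_ttl: int


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    # Treat the two credentials as mandatory; everything else gets a default.
    missing = [k for k in ("MONGODB_URI", "VOYAGE_API_KEY") if not env.get(k)]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {missing}")
    return Settings(
        mongodb_uri=env["MONGODB_URI"],
        voyage_api_key=env["VOYAGE_API_KEY"],
        database_name=env.get("DATABASE_NAME", "ca_codes_db"),
        collection_name=env.get("COLLECTION_NAME", "section_contents"),
        vector_index_name=env.get("VECTOR_INDEX_NAME", "legal_codes_vector_index"),
        voyage_model=env.get("VOYAGE_MODEL", "voyage-law-2"),
        mandatory_validation=_as_bool(env.get("MANDATORY_VALIDATION", "true")),
        citation_tracking=_as_bool(env.get("CITATION_TRACKING", "true")),
        min_retrieval_score=float(env.get("MIN_RETRIEVAL_SCORE", "0.7")),
        context_timeout=int(env.get("CONTEXT_TIMEOUT", "3600")),
        cache_ttl=int(env.get("CACHE_TTL", "1800")),
    )
```

Failing fast on missing credentials tends to surface configuration mistakes before the first tool call rather than in the middle of a session.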
Usage
Running the Server
python server.py
The server communicates via stdio using the MCP protocol.
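For readers unfamiliar with the transport, the sketch below shows what a single tool invocation looks like on the wire: MCP uses JSON-RPC 2.0, and over stdio each message is one line of JSON terminated by a newline. The helper name `build_tool_call` is hypothetical. A real client also performs the `initialize` handshake before calling tools, which this sketch omits.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 `tools/call` request as one newline-terminated line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    # json.dumps escapes newlines inside strings, so the frame stays on one line.
    return json.dumps(message) + "\n"


if __name__ == "__main__":
    frame = build_tool_call(
        1,
        "semantic_search",
        {"query": "meal breaks for employees", "code_filter": "LAB", "limit": 5},
    )
    print(frame, end="")
```

In practice an MCP client library or Claude Desktop builds these frames, so hand-written framing is mainly useful for debugging.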
Available Tools
1. semantic_search
Search legal codes using natural language:
{
"query": "What are the rules about meal breaks for employees?",
"code_filter": "LAB",
"limit": 5
}
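One plausible way such a search could be expressed against MongoDB Atlas is an aggregation pipeline that starts with a `$vectorSearch` stage. The sketch below only builds the pipeline as plain dictionaries. The field names `embedding`, `code`, `section`, and `content` are assumptions about the collection schema, and the function name is hypothetical. The query vector would come from embedding the query text with the configured Voyage model.

```python
def build_vector_search_pipeline(
    query_vector,
    index_name="legal_codes_vector_index",
    code_filter=None,
    limit=5,
    min_score=0.7,
    path="embedding",  # assumed name of the field holding the stored vectors
):
    """Return an aggregation pipeline for an Atlas Vector Search query."""
    stage = {
        "index": index_name,
        "path": path,
        "queryVector": list(query_vector),
        # Consider more candidates than we return to improve recall.
        "numCandidates": max(limit * 20, 100),
        "limit": limit,
    }
    if code_filter:
        # Pre-filtering requires `code` to be declared as a filter field
        # in the vector index definition (assumed here).
        stage["filter"] = {"code": {"$eq": code_filter}}
    return [
        {"$vectorSearch": stage},
        {
            "$project": {
                "_id": 0,
                "code": 1,
                "section": 1,
                "content": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
        # Drop weak matches, mirroring MIN_RETRIEVAL_SCORE.
        {"$match": {"score": {"$gte": min_score}}},
    ]
```

The resulting list can be passed to a collection's `aggregate` method. Whether the project applies the score threshold in the pipeline or in application code is not stated in this README.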
2. get_section
Retrieve specific code section:
{
"code": "LAB",
"section": "2922"
}
3. find_similar
Find related statutes:
{
"code": "LAB",
"section": "2922",
"limit": 5
}
4. validate_citation
Validate a legal citation:
{
"code": "LAB",
"section": "2922",
"claimed_content": "Employment may be terminated at will..."
}
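The README does not say how `claimed_content` is compared with the stored section text. One simple possibility, sketched below purely as an assumption, is a token-overlap check: the share of words in the claim that also appear in the retrieved text. The function name, the threshold, and the result shape are all hypothetical.

```python
import re


def _tokens(text: str) -> list:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def check_claim(claimed_content, retrieved_text, threshold=0.8):
    """Score how much of the claim is supported by the retrieved section text."""
    if retrieved_text is None:
        return {"valid": False, "support": 0.0, "reason": "section not found"}
    claimed = _tokens(claimed_content)
    if not claimed:
        return {"valid": False, "support": 0.0, "reason": "empty claim"}
    source = set(_tokens(retrieved_text))
    # Fraction of claim tokens that occur anywhere in the source text.
    support = sum(1 for tok in claimed if tok in source) / len(claimed)
    valid = support >= threshold
    reason = "claim supported by text" if valid else "claim not supported by text"
    return {"valid": valid, "support": round(support, 2), "reason": reason}
```

Token overlap catches invented wording but not subtle changes in meaning, such as a dropped "not", so a production validator would likely need a stricter comparison.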
5. research_workflow
Comprehensive research with validation:
{
"question": "What are the statutory requirements for wrongful termination claims?",
"depth": "thorough",
"validate_all": true
}
6. draft_declaration
Draft legal declaration:
{
"declarant_name": "John Doe",
"facts": [
"I was employed by XYZ Corp from 2020 to 2023",
"I was terminated without cause on March 1, 2023"
],
"purpose": "wrongful termination",
"party_type": "plaintiff"
}
7. draft_mpa
Draft Memorandum of Points and Authorities:
{
"issue": "Whether plaintiff's termination violated Labor Code § 2922",
"position": "Plaintiff's termination was wrongful",
"facts": "Plaintiff was employed for 3 years..."
}
8. plan_strategy
Develop legal strategy:
{
"situation": "Client was terminated after reporting safety violations",
"goals": [
"Obtain compensation for lost wages",
"Reinstatement to position"
]
}
9. get_validation_report
Get citation accuracy report:
{
"session_id": "optional-session-id"
}
Anti-Hallucination Strategy
How It Works
1. Retrieval-First Pattern
When the user mentions a specific citation:
User: "What does LAB § 2922 say?"
↓
1. Call get_section("LAB", "2922")
2. Retrieve actual text from database
3. Use retrieved text as context
4. Generate response from actual content
✓ No hallucination possible
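The flow above can be sketched in a few lines: find citations in the user's message, fetch each one, and hand only the fetched text to the model. The regular expression, the function names, and the `get_section` callable (any function that returns the section text or `None`) are illustrative assumptions, not the project's actual code.

```python
import re

# Matches citations written like "LAB § 2922" or "LAB § 226.7".
CITATION_RE = re.compile(r"\b([A-Z]{2,5})\s*§\s*(\d+(?:\.\d+)?)")


def extract_citations(text: str) -> list:
    """Return (code, section) pairs found in the text, in order of appearance."""
    return [(m.group(1), m.group(2)) for m in CITATION_RE.finditer(text)]


def retrieve_cited_sections(question: str, get_section) -> dict:
    """Fetch every section cited in the question before any answer is generated."""
    retrieved, not_found = [], []
    for code, section in extract_citations(question):
        text = get_section(code, section)
        citation = f"{code} § {section}"
        if text is None:
            # Surface missing sections instead of letting the model guess.
            not_found.append(citation)
        else:
            retrieved.append({"citation": citation, "text": text})
    return {"retrieved": retrieved, "not_found": not_found}
```

For example, `extract_citations("What does LAB § 2922 say?")` returns `[("LAB", "2922")]`. Reporting `not_found` explicitly lets the response say that a section could not be located.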
2. Generation-With-Tools Pattern
For open-ended questions:
User: "What are the meal break rules?"
↓
1. Call semantic_search("meal breaks employment")
2. Retrieve top sections: LAB § 512, LAB § 226.7
3. Get full text for each
4. Generate response using retrieved content
5. Validate all citations
✓ All claims backed by database
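A minimal sketch of this pattern follows, assuming `search` and `get_section` are callables supplied by the caller and that each search hit is a dict with `code`, `section`, and `score` keys. Those shapes and the function names are assumptions made for illustration.

```python
def gather_sources(query, search, get_section, min_score=0.7, limit=5):
    """Search, drop weak hits, then fetch full text for each remaining section."""
    hits = [h for h in search(query, limit) if h["score"] >= min_score]
    sources = []
    for hit in hits:
        text = get_section(hit["code"], hit["section"])
        if text is not None:
            sources.append(
                {
                    "citation": f"{hit['code']} § {hit['section']}",
                    "score": hit["score"],
                    "text": text,
                }
            )
    return sources


def unsupported_citations(cited, sources):
    """Return citations used in a draft answer that were never retrieved."""
    retrieved = {s["citation"] for s in sources}
    return sorted(set(cited) - retrieved)
```

After the model drafts an answer, any citation reported by `unsupported_citations` would be removed or flagged before the response is returned.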
3. Hybrid-Draft Pattern
For document creation:
User: "Draft a declaration for wrongful termination"
↓
1. Search for applicable laws
2. Retrieve all relevant sections
3. Draft using ONLY retrieved content
4. Validate every citation
5. Return with validation report
✓ Every claim has database backing
System Prompts
The server provides explicit prompts to Claude:
Key Rules:
- NEVER cite without retrieving first
- MANDATORY validation before asserting claims
- Separate retrieved facts from analysis
- Track all citations
- Flag uncertain claims
Citation Tracking
Every tool call logs:
- Retrieved sections
- Citations used
- Validation status
- Accuracy rate
Request a report at any time with get_validation_report:
{
"total_retrievals": 15,
"total_citations": 15,
"unvalidated_claims": [],
"accuracy_rate": 1.0,
"is_clean": true
}
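To make the report fields concrete, here is a minimal tracker sketch that produces a dictionary of the same shape. The class and method names are hypothetical, and the definition of `accuracy_rate` used here (the share of citations that were retrieved beforehand) is an assumption, since the README does not define it.

```python
class CitationTracker:
    """Record retrievals and citations, then summarize how they line up."""

    def __init__(self):
        self._retrieved = set()
        self._cited = []
        self.total_retrievals = 0

    def record_retrieval(self, code: str, section: str) -> None:
        self.total_retrievals += 1
        self._retrieved.add((code, section))

    def record_citation(self, code: str, section: str) -> None:
        self._cited.append((code, section))

    def report(self) -> dict:
        # A citation counts as unvalidated if its section was never retrieved.
        unvalidated = [
            f"{code} § {section}"
            for code, section in self._cited
            if (code, section) not in self._retrieved
        ]
        total = len(self._cited)
        accuracy = 1.0 if total == 0 else (total - len(unvalidated)) / total
        return {
            "total_retrievals": self.total_retrievals,
            "total_citations": total,
            "unvalidated_claims": unvalidated,
            "accuracy_rate": accuracy,
            "is_clean": not unvalidated,
        }
```

With this definition, a session that cites two sections after retrieving only one of them reports an `accuracy_rate` of 0.5 and `is_clean` as false.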
Integration with Claude
Claude Desktop Configuration
Add to claude_desktop_config.json:
{
  "mcpServers": {
    "ca-legal-codes": {
      "command": "python",
      "args": ["/path/to/ca-codes-mcp-server/server.py"],
      "env": {
        "MONGODB_URI": "your-mongodb-uri",
        "VOYAGE_API_KEY": "your-voyage-key"
      }
    }
  }
}
Example Conversation
User: "I need to understand California's at-will employment doctrine."
Claude (using MCP server):
- Calls semantic_search("california at-will employment doctrine")
- Retrieves LAB § 2922
- Calls get_section("LAB", "2922") to get full text
- Responds with actual code content
Result: No hallucination - response based on actual database content.
Development
Running Tests
pytest tests/ -v
Logging
Structured logging with structlog:
logger.info("tool_called", tool="semantic_search", query="meal breaks")
Adding New Tools
- Create tool handler in appropriate module (tools/)
- Register tool in server.py TOOLS list
- Add route in call_tool() function
- Update documentation
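The registration steps might look roughly like the sketch below, written with plain dictionaries so it runs without the MCP SDK. The `count_words` tool and the `HANDLERS` mapping are hypothetical, and the actual `TOOLS` list and `call_tool()` in `server.py` may be structured differently. The `name`, `description`, and `inputSchema` keys follow the MCP tool definition format.

```python
import asyncio


async def count_words(arguments: dict) -> dict:
    """Hypothetical example handler; a real one would live under tools/."""
    return {"word_count": len(arguments["text"].split())}


# Step 2: describe the tool so clients can discover it.
TOOLS = [
    {
        "name": "count_words",
        "description": "Count the words in a piece of text.",
        "inputSchema": {
            "type": "object",
            "properties": {"text": {"type": "string"}},
            "required": ["text"],
        },
    },
]

# Step 3: map tool names to their handlers.
HANDLERS = {"count_words": count_words}


async def call_tool(name: str, arguments: dict) -> dict:
    """Dispatch a tool call after checking the name and required arguments."""
    handler = HANDLERS.get(name)
    if handler is None:
        raise ValueError(f"Unknown tool: {name}")
    schema = next((t["inputSchema"] for t in TOOLS if t["name"] == name), {})
    missing = [key for key in schema.get("required", []) if key not in arguments]
    if missing:
        raise ValueError(f"Missing required arguments for {name}: {missing}")
    return await handler(arguments)
```

Calling `asyncio.run(call_tool("count_words", {"text": "at will employment"}))` returns `{"word_count": 3}`. Keeping the schema and the handler registry side by side makes it harder to register one without the other.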
Deployment
Docker
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "server.py"]
Docker Compose
ca-codes-mcp-server:
  build:
    context: ./ca-codes-mcp-server
  environment:
    - MONGODB_URI=${MONGODB_URI}
    - VOYAGE_API_KEY=${VOYAGE_API_KEY}
  networks:
    - ca-codes-network
Performance
- Vector Search: ~100-200ms per query
- Direct Retrieval: ~50-100ms per section
- Validation: ~50ms per citation
- Caching: 30min TTL for frequent queries
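The 30-minute caching behaviour can be illustrated with a small in-memory TTL cache. This is a sketch under the assumption that results are cached per process and keyed by query. The `TTLCache` class is hypothetical and may not resemble `context/cache.py`.

```python
import time


class TTLCache:
    """In-memory cache whose entries expire after a fixed number of seconds."""

    def __init__(self, ttl_seconds: float = 1800, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store = {}

    def set(self, key, value) -> None:
        self._store[key] = (self._clock() + self._ttl, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Evict lazily on read once the entry has expired.
            del self._store[key]
            return None
        return value
```

Using `time.monotonic` rather than wall-clock time keeps expiry correct even if the system clock is adjusted while the server is running.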
Security
- Environment variables for sensitive data
- Read-only database access recommended
- Input validation on all tools
- Rate limiting recommended for production
Troubleshooting
MongoDB Connection Issues
# Test connection
python -c "from db import get_db_client; import asyncio; asyncio.run(get_db_client())"
Voyage AI API Issues
# Test API key
python -c "from utils import get_voyage_client; client = get_voyage_client(); print('OK')"
MCP Protocol Issues
- Ensure stdio communication
- Check Claude Desktop logs
- Verify tool schemas match MCP spec
License
MIT License - See LICENSE file
Support
- Issues: GitHub Issues
- Documentation: This README
- Contact: [Your contact info]
Changelog
v1.0.0 (2025-01-XX)
- Initial release
- 10 core tools
- Anti-hallucination system
- MongoDB Atlas + Voyage AI integration
- Session management
- Citation tracking