mcp_mfai_tools

modflowai/mcp_mfai_tools

The MCP MFAI Tools server is a production-ready Model Context Protocol server that provides secure access to MODFLOW documentation search capabilities, utilizing OAuth authentication and deployed on Cloudflare Workers.

MCP MFAI Tools - Advanced MODFLOW AI Search Engine

A production-ready MCP (Model Context Protocol) server with OAuth authentication, deployed on Cloudflare Workers. Provides intelligent, user-controlled search capabilities for MODFLOW/PEST documentation, code modules, and workflows with advanced features and comprehensive metadata display.

🚀 Live Deployment

Production URL: https://mcp-mfai-tools.little-grass-273a.workers.dev

✨ Key Features

  • 🔐 OAuth Authentication - GitHub and Google sign-in with beautiful login page
  • 🌐 HTTP Transport - Cloudflare Workers Edge deployment for global performance
  • 👥 User Access Control - Allowlist-based access for GitHub usernames and Google emails
  • 🎯 Specialized Search Tools - Content-focused tools for tutorials, code, and documentation
  • 📊 Rich Metadata Display - User-controlled output with arrays, snippets, and GitHub links
  • 🔍 Advanced Search Strategies - 5 search types with user-controlled field inclusion
  • ⚡ Boolean Parameter Parsing - Proper handling of MCP string-to-boolean conversion
  • 🎨 Beautiful Login UI - Glass-morphism design with provider selection
  • 📝 Comprehensive Debugging - Multi-level logging for troubleshooting

🏗️ Project Architecture

🔄 Critical Architecture Flow (NEVER FORGET!)

graph LR
    A[src/tools/] -->|Single Source| B[HTTP MCP Server<br/>Cloudflare Workers]
    A -->|Compiled to| C[STDIO MCP Server<br/>stdio/.tools-compiled/]
    C -->|MCP Protocol| D[Mastra Agent<br/>mfai-mcp-agent/]
    D -->|Agent Import| E[CopilotKit UI<br/>copilotkit-app/]
    
    style A fill:#f9f,stroke:#333,stroke-width:4px
    style B fill:#bbf,stroke:#333,stroke-width:2px
    style C fill:#bbf,stroke:#333,stroke-width:2px
    style D fill:#bfb,stroke:#333,stroke-width:2px
    style E fill:#fbf,stroke:#333,stroke-width:2px

The Flow Explained:

  1. Original Tools (src/tools/) - Single source of truth for all tool implementations
  2. HTTP Transport - Production MCP server on Cloudflare Workers with OAuth
  3. STDIO Transport (stdio/) - Local MCP server that imports and compiles the same tools
  4. Mastra Agent (mfai-mcp-agent/) - Loads tools via MCP client from STDIO server
  5. CopilotKit UI (copilotkit-app/) - Imports Mastra agent and displays tool invocations

📁 Directory Structure

mcp_mfai_tools/
├── src/                        # Source code (ORIGINAL TOOLS HERE!)
│   ├── index.ts               # Main entry point with OAuth providers
│   ├── mcp-agent.ts           # MCP agent with authentication logic
│   ├── handlers/              # OAuth and request handlers
│   │   ├── github-handler.ts         # GitHub OAuth flow
│   │   ├── google-handler.ts         # Google OAuth flow
│   │   └── multi-provider-handler.ts # Provider selection UI
│   ├── tools/                 # 🔥 SINGLE SOURCE OF TRUTH FOR ALL TOOLS
│   │   ├── search-code.ts            # ⭐ Advanced API/module search
│   │   ├── search-docs.ts            # Full-text documentation search
│   │   ├── search-tutorials.ts       # Tutorial and workflow search
│   │   ├── semantic-search-docs.ts   # Vector-based semantic search
│   │   ├── semantic-search-tutorials.ts # Semantic tutorial search
│   │   ├── get-file-content.ts       # Direct file content retrieval
│   │   ├── get-modflow-ai-info.ts    # MODFLOW AI overview
│   │   └── acronym-mappings.json     # Centralized acronym expansions
│   └── utils/                 # Utility functions
│       ├── utils.ts                   # OAuth utility functions
│       └── workers-oauth-utils.ts    # UI rendering utilities
├── stdio/                     # 📡 STDIO MCP Server (imports from src/tools/)
│   ├── src/
│   │   └── index.ts          # STDIO server that imports parent tools
│   ├── .tools-compiled/      # Compiled tools from parent src/tools/
│   ├── build.sh              # Compiles parent tools to .tools-compiled/
│   └── test/
│       └── test-all.ts       # Comprehensive test suite
├── mfai-mcp-agent/            # 🤖 Mastra Agent Integration
│   ├── src/mastra/           # Mastra agent implementation
│   │   ├── agents/           # Agent definitions
│   │   │   └── modflow-build-time.ts # MODFLOW agent loading MCP tools
│   │   └── index.ts          # Mastra configuration
│   ├── tests/                # Comprehensive test suite
│   │   ├── integration/      # Integration tests
│   │   ├── unit/            # Unit tests
│   │   ├── e2e/             # Playwright E2E tests
│   │   └── manual/          # Manual test scripts
│   └── README.md            # Agent-specific documentation
├── copilotkit-app/           # 🎨 CopilotKit UI Application
│   ├── app/                  # Next.js app directory
│   │   └── page.tsx         # Main page with useCopilotAction
│   ├── components/
│   │   └── ToolCard.tsx    # Tool visualization component
│   └── package.json         # CopilotKit dependencies
├── config/                    # Configuration files
│   ├── wrangler.toml         # Production Cloudflare configuration
│   └── wrangler.dev.toml     # Development configuration
├── scripts/                   # Automation scripts
│   ├── deploy.sh             # Automated deployment pipeline
│   └── update-secrets.sh     # Secret management automation
├── examples/                  # Testing and examples
│   └── simple-mcp-client.js  # Development test client
├── docs/                      # Technical documentation
│   ├── SCHEMA_CODE_SEARCH.md # Implementation roadmap
│   └── *.md                  # Additional technical docs
├── tests/                     # Test files
├── .env                       # Environment variables (not in git)
├── CLAUDE.md                 # Development guidance for Claude Code
└── README.md                 # This comprehensive guide

🔗 Architecture Deep Dive

The Complete Tool Flow

Tools are defined once in src/tools/ and then exposed through two transports (HTTP and STDIO) and two framework layers (Mastra and CopilotKit):

1️⃣ Tool Source (src/tools/)
  • Single source of truth for all tool implementations
  • Written in TypeScript with full type safety
  • Database queries via Neon PostgreSQL
  • Semantic search via OpenAI embeddings
2️⃣ HTTP Transport (Production)
  • Deployed on Cloudflare Workers Edge
  • OAuth authentication (GitHub/Google)
  • User allowlist access control
  • Global CDN distribution
3️⃣ STDIO Transport (Development)
  • Local MCP server for development
  • Imports and compiles tools from src/tools/
  • No authentication required
  • Direct database access
4️⃣ Mastra Agent (mfai-mcp-agent/)
  • Loads tools via MCP client from STDIO server
  • Critical: Tools must be loaded at build time with await mcp.getTools()
  • Provides conversational AI interface
  • Handles tool selection and execution
5️⃣ CopilotKit UI (copilotkit-app/)
  • Next.js application with CopilotKit integration
  • Imports Mastra agent for tool execution
  • Custom ToolCard component for visualization
  • Real-time tool status updates (pending → executing → complete)

Why This Architecture?

  1. Single Source of Truth: All tools defined once in src/tools/
  2. Multiple Transports: Same tools work via HTTP (production) and STDIO (development)
  3. Framework Integration: Seamlessly integrates with Mastra and CopilotKit
  4. Type Safety: Full TypeScript throughout the stack
  5. Scalability: Edge deployment with global distribution
  6. Developer Experience: Local development without authentication complexity

🛠️ Available Tools

Tools Overview

This MCP server provides 7 specialized tools designed for different use cases in the MODFLOW/PEST ecosystem:

| Tool | Purpose | Best For | Status |
| --- | --- | --- | --- |
| 🎓 search_tutorials | Tutorial/workflow search | Learning materials, step-by-step guides, workflows | WORKING |
| 🧠 search_code | API/module search | Function signatures, class definitions, implementation details | WORKING |
| 📖 search_docs | Documentation search | Mathematical theory, conceptual explanations, reference material | WORKING |
| 🤖 semantic_search_tutorials | Semantic tutorial search | Concept-based tutorial discovery using embeddings | WORKING |
| 🔍 semantic_search_docs | Semantic documentation search | Concept-based theory discovery using embeddings | WORKING |
| 📁 get_file_content | Direct file access | Complete file retrieval by exact path with pagination | WORKING |
| ℹ️ get_modflow_ai_info | MODFLOW AI overview | Comprehensive information about MODFLOW AI capabilities | WORKING |

Architecture: Specialized Tools

Content-Focused Search:

  • search_tutorials: Tutorials and workflows ONLY (flopy_workflows, pyemu_workflows tables)
  • search_code: API and modules ONLY (flopy_modules, pyemu_modules tables)
  • search_docs: Theory and references ONLY (repository_files table)
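The tool-to-table routing above can be sketched as a small lookup. The table names come from the list; the `TOOL_TABLES` constant and `tablesFor` function are hypothetical names used only for illustration, not the project's actual code.

```typescript
type SearchTool = "search_tutorials" | "search_code" | "search_docs";

// Tables each content-focused tool is allowed to query, keyed by repository.
// "*" marks a single table shared by every repository.
const TOOL_TABLES: Record<SearchTool, Record<string, string[]>> = {
  search_tutorials: { flopy: ["flopy_workflows"], pyemu: ["pyemu_workflows"] },
  search_code: { flopy: ["flopy_modules"], pyemu: ["pyemu_modules"] },
  search_docs: { "*": ["repository_files"] },
};

// Returns the tables a tool should query, optionally narrowed to one repository.
function tablesFor(tool: SearchTool, repository?: string): string[] {
  const byRepo = TOOL_TABLES[tool];
  const has = (key: string) => Object.prototype.hasOwnProperty.call(byRepo, key);
  // search_docs keeps all repositories in one table
  if (has("*")) return byRepo["*"];
  if (repository !== undefined) return has(repository) ? byRepo[repository] : [];
  // No repository filter: query every table the tool owns
  let all: string[] = [];
  for (const key of Object.keys(byRepo)) all = all.concat(byRepo[key]);
  return all;
}

console.log(tablesFor("search_code", "flopy")); // [ 'flopy_modules' ]
```

Keeping the routing in one table makes the "ONLY" guarantee easy to audit: a tool cannot reach a table that is not listed under its key.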

Semantic Search Tools:

  • semantic_search_docs: Cross-repository semantic search with OpenAI embeddings
  • semantic_search_tutorials: Semantic similarity for tutorials using vector search

Utility Tools:

  • get_file_content: Direct file retrieval with automatic pagination for large files
  • get_modflow_ai_info: Comprehensive overview of MODFLOW AI capabilities and resources

Detailed Tool Documentation

1. 🧠 search_code - Advanced Multi-Strategy Search

The flagship intelligent search tool with comprehensive user controls.

Purpose: Search for API details, function signatures, class definitions, and troubleshooting information with advanced user-controlled strategies.

Key Features:

  • 5 search strategies (general, package, error, usage, concept)
  • Rich array display (scenarios, concepts, errors, PEST integration)
  • Boolean parameter parsing for MCP compatibility
  • Field-specific search (docstrings, purpose, arrays, source code)
  • Advanced filters (package code, model family, category)
  • Acronym expansion (WEL → Well Package)
  • Wildcard support (* → :*)
  • Highlighted snippets with ts_headline
  • GitHub URL integration
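As a rough illustration of the acronym expansion and wildcard features listed above, the sketch below expands `WEL` and rewrites a trailing `*` into PostgreSQL's `:*` prefix operator. The function names, the one-entry `ACRONYMS` table, and the choice to AND the terms together are assumptions; the real mappings live in acronym-mappings.json and the real query builder may differ.

```typescript
// Hypothetical stand-in for the mappings kept in acronym-mappings.json.
const ACRONYMS: Record<string, string> = { WEL: "Well Package" };

// Appends the expansion after each known acronym: "WEL" -> "WEL Well Package".
function expandAcronyms(query: string): string {
  return query
    .split(/\s+/)
    .filter((word) => word.length > 0)
    .map((word) => {
      const key = word.toUpperCase();
      return Object.prototype.hasOwnProperty.call(ACRONYMS, key)
        ? `${word} ${ACRONYMS[key]}`
        : word;
    })
    .join(" ");
}

// Builds a to_tsquery-style string: a trailing "*" becomes the prefix
// operator ":*", other punctuation is dropped, and terms are joined with "&".
function toPrefixTsQuery(query: string): string {
  const terms: string[] = [];
  for (const word of query.split(/\s+/)) {
    const isPrefix = /\*$/.test(word);
    const core = word.replace(/[^\w]/g, ""); // keeps ASCII letters, digits, "_"
    if (core.length === 0) continue;
    terms.push(isPrefix ? `${core}:*` : core);
  }
  return terms.join(" & ");
}

console.log(expandAcronyms("WEL constructor")); // WEL Well Package constructor
console.log(toPrefixTsQuery("well* package")); // well:* & package
```

Stripping punctuation before building the query keeps user input from injecting tsquery operators, at the cost of dropping non-ASCII characters in this simplified version.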

Complete Parameters:

{
  query: string,                    // Required: search terms
  repository?: 'flopy' | 'pyemu',  // Optional: specific repository
  limit?: number,                   // 1-50, default: 10
  
  // Search strategy control
  search_type?: 'general' | 'package' | 'error' | 'usage' | 'concept',
  
  // Display options - control rich metadata output
  include_scenarios?: boolean,      // Show user scenarios/use cases
  include_concepts?: boolean,       // Show related concepts/statistical concepts
  include_errors?: boolean,         // Show typical errors/common pitfalls
  include_pest?: boolean,          // Show PEST integration details
  include_source?: boolean,        // Show source code snippets
  include_github?: boolean,        // Show GitHub URLs (default: true)
  include_snippet?: boolean,       // Show highlighted content snippets
  
  // Advanced filters
  package_code?: string,           // Filter by package (WEL, SMS, etc.)
  model_family?: string,           // Filter by model (mf6, mfusg, etc.)
  category?: string,              // Filter PyEMU category (core, utils, etc.)
  
  // Field-specific search control
  search_docstring?: boolean,     // Include docstrings in search
  search_purpose?: boolean,       // Include semantic_purpose in search
  search_arrays?: boolean,        // Include array fields in search
  search_source?: boolean,        // Include source code in search
  
  // Output formatting
  max_array_items?: number,       // 1-10, default: 3
  snippet_length?: number,        // 50-300, default: 150
  compact_format?: boolean        // Compact vs full format
}
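The numeric ranges in the parameter list (limit 1-50, max_array_items 1-10, snippet_length 50-300) suggest some form of bounds handling. A minimal sketch, assuming out-of-range values are clamped rather than rejected; `clampOption` and `normalizeSearchCodeOptions` are hypothetical names and the actual tool may behave differently:

```typescript
// Clamps a possibly missing or string-typed number into [min, max].
// Falls back to the documented default when the value is not numeric.
function clampOption(value: unknown, min: number, max: number, fallback: number): number {
  if (typeof value === "string" && value.trim() === "") return fallback;
  const n = typeof value === "string" ? Number(value) : value;
  if (typeof n !== "number" || !isFinite(n)) return fallback;
  return Math.min(max, Math.max(min, Math.round(n)));
}

// Applies the ranges and defaults documented for search_code.
function normalizeSearchCodeOptions(args: Record<string, unknown>) {
  return {
    limit: clampOption(args["limit"], 1, 50, 10),
    max_array_items: clampOption(args["max_array_items"], 1, 10, 3),
    snippet_length: clampOption(args["snippet_length"], 50, 300, 150),
  };
}

console.log(normalizeSearchCodeOptions({ limit: "500", snippet_length: 20 }));
// { limit: 50, max_array_items: 3, snippet_length: 50 }
```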

Search Strategy Matrix:

| Strategy | Primary Focus | Best For | Example Query |
| --- | --- | --- | --- |
| general | search_vector | Broad searches | "hydraulic conductivity" |
| package | package_code matches | Specific packages | "WEL package methods" |
| error | typical_errors arrays | Troubleshooting | "convergence failed" |
| usage | user_scenarios arrays | Examples/tutorials | "pumping well example" |
| concept | related_concepts arrays | Theory/background | "FOSM uncertainty" |

Real Example (search_docs call and response):

// User query: "control data section"
mcp__mfaitools__search_docs({
  query: "control data section"
})

// Actual response preview:
{
  "results": [
    {
      "filepath": "pestman1/The_PEST_Control_File_part05.md",
      "title": "PEST Control File: Parameter Groups and Data Specifications", 
      "relevance": 1.000,
      "repository": "pest",
      "snippet": "**[data]**\" **[section]** of the PEST **[control]** file..."
    }
  ],
  "total_results": 9,
  "search_metadata": {
    "method_used": "text",
    "average_relevance": 0.565
  }
}

Advanced Code Search:

mcp__mfaitools__search_code({
  query: "WEL package constructor",
  repository: "flopy",
  include_scenarios: true,
  include_snippet: true
})

2. 🎓 search_tutorials - Tutorial & Workflow Search

Find tutorials, workflows, and practical implementations with advanced filtering.

Purpose: Search for step-by-step guides, working examples, and best practices.

Key Features:

  • Advanced filtering by model type, packages, complexity level
  • Array search within use cases, prerequisites, and implementation tips
  • Complete working examples with code and explanations
  • Complexity indicators (beginner/intermediate/advanced)
  • Package usage lists showing required MODFLOW packages
  • Enhanced snippet highlighting with configurable display options

Parameters:

{
  query: string,                    // Required: search terms
  repository?: 'flopy' | 'pyemu',  // Optional: specific repository
  limit?: number,                  // 1-50, default: 10
  complexity?: 'beginner' | 'intermediate' | 'advanced',
  packages?: string[],             // Filter by packages used
  workflow_type?: string,          // Filter by workflow type (PyEMU)
  include_tips?: boolean,          // Show implementation tips
  include_use_cases?: boolean      // Show use case examples
}

3. 📖 search_docs - Documentation Search with Ultra-Flexible Repository Parsing

Find theoretical foundations, mathematical formulations, and reference material with bulletproof parameter parsing.

Purpose: Search comprehensive documentation for concepts, theory, and reference guides with the most flexible repository parameter support available.

🛡️ Ultra-Flexible Repository Parameter Parsing:

  • ANY delimiter combination: "pest pestpp", "pest,pestpp", "pest|pestpp", "pest;pestpp"
  • Array formats: ["pest","pestpp"], '["pest","pestpp"]', "[pest,pestpp]"
  • Mixed formats: "pest,pestpp|pest_hp;mfusg"
  • VSCode-agent proof: Handles ANY format AI agents can generate
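One way to accept all of the formats above is to treat every delimiter, bracket, and quote as a separator. The sketch below is an assumption about how such parsing could work, not the project's parser; `parseRepositories` is a hypothetical name.

```typescript
// Normalizes any of the listed repository formats into a de-duplicated
// array of lowercase repository names.
function parseRepositories(input: unknown): string[] {
  if (input === undefined || input === null) return [];
  // Arrays are flattened into one delimited string so a single code path handles everything
  const raw = Array.isArray(input) ? input.join(",") : String(input);
  const names = raw
    .split(/[\s,|;\[\]"']+/) // whitespace, commas, pipes, semicolons, brackets, quotes
    .map((part) => part.trim().toLowerCase())
    .filter((part) => part.length > 0);
  // Keep the first occurrence of each name
  return names.filter((repoName, index) => names.indexOf(repoName) === index);
}

console.log(parseRepositories("pest,pestpp|pest_hp;mfusg"));
// [ 'pest', 'pestpp', 'pest_hp', 'mfusg' ]
console.log(parseRepositories('["pest","pestpp"]'));
// [ 'pest', 'pestpp' ]
```

Because nothing here throws on unexpected input, a malformed value degrades to an empty list, which a caller could treat as "search all repositories".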

Key Features:

  • Bulletproof parsing - Never fails on repository parameter format
  • Multi-repository search - Search across multiple repos simultaneously
  • Automatic acronym expansion for better coverage
  • Key concept extraction from documentation
  • Cross-repository search across all documentation
  • Focused results with relevance ranking

Parameters:

{
  query: string,                    // Required: search terms
  repository?: string | string[],   // Ultra-flexible: ANY format accepted!
                                   // Examples: "pest", "pest pestpp", "pest,pestpp", 
                                   // "pest|pestpp", "pest;pestpp", ["pest","pestpp"],
                                   // "[pest,pestpp]", "pest,pestpp|pest_hp;mfusg"
  limit?: number,                   // 1-50, default: 15
  file_type?: string,              // Filter by file extension
  include_content?: boolean        // Include content snippets (default: true)
}

4. 🔍 semantic_search_docs - Semantic Documentation Search

AI-powered conceptual search using OpenAI embeddings.

Purpose: Find conceptually related documentation even when exact terms don't match.

Key Features:

  • Vector similarity search using OpenAI embeddings
  • Conceptual matching beyond keyword matching
  • Cross-repository discovery of related concepts
  • Semantic understanding of groundwater modeling terminology

Parameters:

{
  query: string,                    // Required: natural language query
  repository?: string,              // Optional: specific repository  
  limit?: number                    // 1-20, default: 10
}

5. 🤖 semantic_search_tutorials - Semantic Tutorial Search

Find tutorials using concept-based similarity search.

Purpose: Discover tutorials by meaning and conceptual similarity rather than keywords.

Key Features:

  • Embedding-based search for conceptual matching
  • Tutorial-specific optimization for workflow discovery
  • Similarity scoring for relevance assessment
  • Cross-workflow discovery of related techniques

Parameters:

{
  query: string,                    // Required: natural language description
  limit?: number,                   // 1-20, default: 5
  similarity_threshold?: number    // 0-1, default: 0.7
}
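To make the `similarity_threshold` and `limit` parameters concrete, here is a minimal in-memory sketch of threshold-based ranking with cosine similarity. It is illustrative only: the real tool presumably compares OpenAI embeddings inside the database, and `EmbeddedTutorial`, `cosineSimilarity`, and `rankBySimilarity` are hypothetical names.

```typescript
interface EmbeddedTutorial {
  title: string;
  embedding: number[];
}

// Cosine similarity of two equal-length vectors; returns 0 for zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Keeps tutorials at or above the threshold, best match first, capped at `limit`.
function rankBySimilarity(
  queryEmbedding: number[],
  tutorials: EmbeddedTutorial[],
  similarityThreshold: number = 0.7,
  limit: number = 5
): { title: string; similarity: number }[] {
  return tutorials
    .map((t) => ({ title: t.title, similarity: cosineSimilarity(queryEmbedding, t.embedding) }))
    .filter((r) => r.similarity >= similarityThreshold)
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, limit);
}

const sampleTutorials: EmbeddedTutorial[] = [
  { title: "Well package basics", embedding: [1, 0] },
  { title: "Unrelated topic", embedding: [0, 1] },
  { title: "Wells and rivers", embedding: [1, 1] },
];
console.log(rankBySimilarity([1, 0], sampleTutorials).map((r) => r.title));
// [ 'Well package basics', 'Wells and rivers' ]
```

Raising the threshold trades recall for precision: at 0.8 the second match in this toy data (similarity about 0.707) would be dropped.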

6. 📁 get_file_content - Direct File Access with Structured Output

Retrieve complete file content by exact path with automatic pagination and structured JSON output.

Purpose: Get the full content of a specific file when you know its exact location, with rich metadata in structured format.

Key Features:

  • Structured JSON output via outputSchema for better UI integration
  • Automatic pagination for large files (30KB+ split into pages)
  • Complete file content without truncation
  • Multi-table routing (automatically finds file in correct table)
  • Rich metadata (title, summary, key concepts, statistics)
  • GitHub URL integration for source code files
  • Handles all file types (documentation, code, workflows)
  • Optimized response size with minimal content field for MCP compatibility
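The pagination behaviour described above could look roughly like the sketch below. The field names match the `pagination` object in the output schema; everything else is an assumption, including the `paginate` name, splitting by character count with a 30000-character page, clamping out-of-range page numbers, and reporting the full content length as `actualContentSize`.

```typescript
interface Pagination {
  needsPagination: boolean;
  currentPage: number;
  totalPages: number;
  actualContentSize: number;
}

// Returns one page of `content` plus pagination metadata.
// pageSize is an assumed per-page character budget.
function paginate(
  content: string,
  page: number = 1,
  pageSize: number = 30000
): { content: string; pagination: Pagination } {
  if (pageSize < 1) throw new Error("pageSize must be at least 1");
  const totalPages = Math.max(1, Math.ceil(content.length / pageSize));
  // Out-of-range or non-numeric page requests are clamped to a valid page
  const requested = isFinite(page) ? Math.floor(page) : 1;
  const currentPage = Math.min(Math.max(1, requested), totalPages);
  const start = (currentPage - 1) * pageSize;
  return {
    content: content.slice(start, start + pageSize),
    pagination: {
      needsPagination: totalPages > 1,
      currentPage,
      totalPages,
      actualContentSize: content.length,
    },
  };
}

const lastPage = paginate(new Array(70001).join("x"), 3); // 70000 characters
console.log(lastPage.pagination);
// { needsPagination: true, currentPage: 3, totalPages: 3, actualContentSize: 70000 }
```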

Parameters:

{
  repository: string,               // Required: repository name
  filepath: string,                 // Required: exact file path
  page?: number,                    // Optional: page number for large files
  force_full?: boolean             // Optional: force full content (use with caution)
}

Output Schema (NEW):

{
  file: {
    repository: string,
    filepath: string,
    filename: string,
    extension: string,
    file_type: string,
    file_size: number,
    created_at: string,
    content: string,
    analysis?: {
      summary?: string,
      key_concepts?: string[],
      technical_level?: string,
      purpose?: string
    },
    // Workflow-specific metadata
    complexity?: string,
    workflow_type?: string,
    packages_used?: string[],
    workflow_purpose?: string,
    best_use_cases?: string[],
    // Module-specific metadata
    package_code?: string,
    model_family?: string,
    semantic_purpose?: string,
    title?: string,
    // Pagination info
    pagination?: {
      needsPagination: boolean,
      currentPage: number,
      totalPages: number,
      actualContentSize: number
    }
  },
  found: boolean,
  error?: string
}

Example Usage:

// Get first page of large file
mcp__mfaitools__get_file_content({
  repository: "pest",
  filepath: "pestman1/The_PEST_Control_File_part05.md",
  page: 1
})

// Get complete small file
mcp__mfaitools__get_file_content({
  repository: "flopy", 
  filepath: "flopy/mf6/modflow/mfgwfwel.py"
})

7. ℹ️ get_modflow_ai_info - MODFLOW AI Overview

Get comprehensive information about MODFLOW AI capabilities and resources.

Purpose: Provide an overview of MODFLOW AI, available repositories, tools, and usage guidance.

Key Features:

  • Dynamic repository listing from database
  • Comprehensive tool documentation
  • Usage statistics (optional)
  • Getting started guide
  • Example queries
  • No required parameters - returns all information by default

Parameters:

{
  include_stats?: boolean           // Optional: include database statistics (default: true)
}

Example Usage:

// Get complete MODFLOW AI information
mcp__mfaitools__get_modflow_ai_info()

// Get info without statistics
mcp__mfaitools__get_modflow_ai_info({
  include_stats: false
})

Returns:

  • What MODFLOW AI is and its purpose
  • List of all available repositories (dynamically fetched)
  • Available search tools and their usage
  • Database statistics (file counts, etc.)
  • Getting started examples
  • Common use cases

🎛️ Advanced User Controls

Search Strategy Implementation Status

| Phase | Feature | Status | Description |
| --- | --- | --- | --- |
| 1.1 | Rich Array Display | | User-controlled metadata display |
| 1.2 | Enhanced Formatting | | Compact format, array limits, truncation |
| 2.1 | Search Strategies | | 5 search types with targeted approaches |
| 2.2 | Filters | | Package, model family, category filtering |
| 3.1 | Field Search | | User-controlled field inclusion |

Boolean Parameter Parsing

Important: MCP clients may pass boolean parameters as strings. Our tools accept both forms and parse them automatically:

  • String "false" → Boolean false
  • String "true" → Boolean true
  • Boolean false → Boolean false
  • Boolean true → Boolean true

This ensures include_snippet=false actually disables snippets!
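The mapping above can be applied to a whole argument object in one pass. The sketch below is one possible shape for that, not the project's code: `coerceBooleans` is a hypothetical helper, and the defaults in the usage example are assumptions except for include_github, which is documented as defaulting to true.

```typescript
// Accepts real booleans and the strings "true"/"false" (any case); anything else yields the default.
function parseBool(value: unknown, defaultValue: boolean): boolean {
  if (typeof value === "boolean") return value;
  if (typeof value === "string") {
    const normalized = value.trim().toLowerCase();
    if (normalized === "true") return true;
    if (normalized === "false") return false;
  }
  return defaultValue;
}

// Resolves every flag named in `defaults` from the raw tool arguments.
function coerceBooleans<K extends string>(
  args: Record<string, unknown>,
  defaults: Record<K, boolean>
): Record<K, boolean> {
  const resolved = {} as Record<K, boolean>;
  for (const key of Object.keys(defaults) as K[]) {
    resolved[key] = parseBool(args[key], defaults[key]);
  }
  return resolved;
}

const flags = coerceBooleans(
  { include_snippet: "false", include_github: true, include_source: "maybe" },
  { include_snippet: true, include_github: true, include_source: false }
);
console.log(flags);
// { include_snippet: false, include_github: true, include_source: false }
```

Unrecognized strings such as "maybe" fall back to the default instead of being coerced with JavaScript truthiness, which is what would otherwise turn the string "false" into true.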

🗃️ Database Schema

Repository Coverage

Documentation Repositories (repository_files table)
  • mf6: MODFLOW 6 documentation
  • pest: Parameter Estimation documentation
  • pestpp: PEST++ enhanced version
  • pest_hp: PEST_HP parallel version
  • mfusg: MODFLOW-USG unstructured grid
  • plproc: Parameter list processor
  • gwutils: Groundwater utilities
Code Repositories
  • flopy: Python MODFLOW package
    • flopy_modules (928 kB): API documentation, 13 MB indexes
    • flopy_workflows: Tutorial implementations
  • pyemu: Python uncertainty analysis
    • pyemu_modules (56 kB): API documentation, 2.9 MB indexes
    • pyemu_workflows: Analysis workflows

Rich Metadata Arrays

FloPy Modules:

  • user_scenarios[]: Real-world usage examples with context
  • related_concepts[]: Connected packages/concepts with explanations
  • typical_errors[]: Common mistakes and debugging info

PyEMU Modules:

  • use_cases[]: Practical usage scenarios
  • statistical_concepts[]: Mathematical/statistical concepts
  • common_pitfalls[]: Common mistakes and warnings
  • pest_integration[]: PEST software integration details

🚀 Setup Instructions

1. Create OAuth Applications

GitHub OAuth App
  1. Go to GitHub Settings > Developer settings > OAuth Apps
  2. Click "New OAuth App"
  3. Configure:
    • Application name: MCP MFAI Tools
    • Homepage URL: https://your-worker-name.your-subdomain.workers.dev
    • Authorization callback URL: https://your-worker-name.your-subdomain.workers.dev/callback
  4. Save Client ID and Client Secret
Google OAuth App
  1. Go to Google Cloud Console
  2. Create project → Configure the OAuth consent screen → Create OAuth 2.0 Client ID (the Google+ API has been shut down and is not needed for sign-in)
  3. Configure:
    • Application type: Web application
    • Authorized redirect URIs: https://your-worker-name.your-subdomain.workers.dev/callback
  4. Save Client ID and Client Secret

2. Configure Cloudflare Workers

# Create KV namespace for OAuth sessions
# (newer Wrangler releases; older releases use: wrangler kv:namespace create OAUTH_KV)
wrangler kv namespace create OAUTH_KV

# Update wrangler.toml with the returned ID

Update wrangler.toml:

[[kv_namespaces]]
binding = "OAUTH_KV"
id = "your-kv-namespace-id"  # Replace with actual ID

[vars]
ALLOWED_GITHUB_USERS = "your-username,other-user"
ALLOWED_GOOGLE_USERS = "your-email@gmail.com,other@email.com"
DEBUG = "true"
DEVELOPMENT_MODE = "false"  # NEVER set to "true" in production!

3. Set Secrets

# Database connection (Neon PostgreSQL)
wrangler secret put MODFLOW_AI_MCP_01_CONNECTION_STRING

# GitHub OAuth credentials
wrangler secret put GITHUB_CLIENT_ID
wrangler secret put GITHUB_CLIENT_SECRET

# Google OAuth credentials  
wrangler secret put GOOGLE_CLIENT_ID
wrangler secret put GOOGLE_CLIENT_SECRET

# Cookie encryption (generate with: openssl rand -base64 32)
wrangler secret put COOKIE_ENCRYPTION_KEY

4. Deploy

# Install dependencies
pnpm install

# Automated deployment
./scripts/deploy.sh

# Or manual deployment
npx wrangler deploy

# Update secrets easily
./scripts/update-secrets.sh

💻 Development

Two Server Options Available

HTTP Server (Production-like)
# Development mode (no OAuth required)
pnpm run dev

# Test all tools
pnpm run test:client

# Access at http://localhost:8787

Development Features:

  • No authentication required
  • Mock user created automatically
  • All tools available
  • Status page with configuration info
STDIO Server (Local MCP)
# Run local STDIO MCP server
cd stdio
npm run dev

# Test with interactive client
npm run test:interactive

# Test all tools
npm run test

STDIO Features:

  • Same bulletproof parsing as HTTP server
  • All 7 tools working through MCP protocol
  • Ultra-flexible repository parameters
  • Single source of truth - uses same tool implementations
  • Real-time logs showing parameter parsing

Production Testing

# Test with OAuth (requires setup)
pnpm run dev:prod

# View deployment logs
pnpm run tail

# Check production logs
npx wrangler tail your-worker-name --format pretty

Adding New Tools

1. Create Tool File
// src/tools/my-advanced-tool.ts
import type { NeonQueryFunction } from "@neondatabase/serverless";

export const myAdvancedToolSchema = {
  name: "my_advanced_tool",
  description: "Advanced tool with user controls",
  inputSchema: {
    type: 'object',
    properties: {
      query: { type: 'string', description: 'Search query' },
      advanced_mode: { type: 'boolean', description: 'Enable advanced features' },
      options: {
        type: 'object',
        properties: {
          include_metadata: { type: 'boolean' },
          max_depth: { type: 'number' }
        }
      }
    },
    required: ['query']
  }
};

export async function myAdvancedTool(args: any, sql: NeonQueryFunction<false, false>) {
  try {
    // Parse boolean values for MCP compatibility
    const parseBool = (value: any, defaultValue: boolean): boolean => {
      if (typeof value === 'boolean') return value;
      if (typeof value === 'string') {
        if (value.toLowerCase() === 'false') return false;
        if (value.toLowerCase() === 'true') return true;
      }
      return defaultValue;
    };

    const { query } = args;
    const advanced_mode = parseBool(args.advanced_mode, false);
    const include_metadata = parseBool(args.options?.include_metadata, true);

    // Implement your advanced logic here
    console.log(`[MY ADVANCED TOOL] Processing: ${query}, advanced: ${advanced_mode}`);

    // Return MCP-compatible response
    return {
      content: [{
        type: "text" as const,
        text: `Advanced tool executed: ${query}`
      }]
    };

  } catch (error) {
    // Flag the result as an error so MCP clients can tell failures from normal output
    return {
      isError: true,
      content: [{
        type: "text" as const,
        text: `Error: ${error instanceof Error ? error.message : 'Unknown error'}`
      }]
    };
  }
}

2. Register Tool
// src/mcp-agent.ts
import { myAdvancedToolSchema, myAdvancedTool } from "./tools/my-advanced-tool.js";

// Add to toolsList
const toolsList = [
  // ... existing tools
  {
    name: myAdvancedToolSchema.name,
    description: myAdvancedToolSchema.description,
    inputSchema: myAdvancedToolSchema.inputSchema,
  }
];

// Add handler
switch (name) {
  // ... existing cases
  case 'my_advanced_tool':
    return await myAdvancedTool(args, this.sql);
}

🔐 Security & Access Control

Authentication Flow

  1. User visits MCP endpoint → Redirected to OAuth selection
  2. User selects provider (GitHub/Google) → OAuth flow
  3. Server validates user against allowlist → Issues encrypted session
  4. Authenticated user accesses MCP tools

Security Features

  • OAuth 2.0 with GitHub and Google providers
  • User allowlists for both GitHub usernames and Google emails
  • Encrypted session cookies with secure token handling
  • No public access - all tools require authentication
  • Environment isolation between development and production
  • Comprehensive logging for security monitoring

User Management

# wrangler.toml
[vars]
ALLOWED_GITHUB_USERS = "user1,user2,user3"
ALLOWED_GOOGLE_USERS = "email1@gmail.com,email2@company.com"
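The allowlist check in step 3 of the authentication flow can be sketched as follows. The two variable names come from wrangler.toml; the `AllowlistEnv` interface, both function names, the case-insensitive comparison, and the deny-by-default behaviour for an empty list are assumptions.

```typescript
interface AllowlistEnv {
  ALLOWED_GITHUB_USERS?: string;
  ALLOWED_GOOGLE_USERS?: string;
}

// Splits a comma-separated allowlist into trimmed, lowercase entries.
function parseAllowlist(value: string | undefined): string[] {
  return (value === undefined ? "" : value)
    .split(",")
    .map((entry) => entry.trim().toLowerCase())
    .filter((entry) => entry.length > 0);
}

// True only when the identity (GitHub username or Google email) is listed
// for the given provider. An empty or missing allowlist denies everyone.
function isAllowed(provider: "github" | "google", identity: string, env: AllowlistEnv): boolean {
  const allowlist = parseAllowlist(
    provider === "github" ? env.ALLOWED_GITHUB_USERS : env.ALLOWED_GOOGLE_USERS
  );
  return allowlist.indexOf(identity.trim().toLowerCase()) !== -1;
}

const exampleEnv: AllowlistEnv = {
  ALLOWED_GITHUB_USERS: "user1,user2,user3",
  ALLOWED_GOOGLE_USERS: "email1@gmail.com,email2@company.com",
};
console.log(isAllowed("github", "User2", exampleEnv)); // true
console.log(isAllowed("google", "stranger@example.com", exampleEnv)); // false
```

Keeping the two lists separate per provider means a GitHub username that happens to match a Google email prefix never grants access.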

🔧 Troubleshooting

Common Issues

Authentication Problems
"Authentication failed" / "Access denied"

Solutions:

  • Verify your GitHub username or Google email is in allowlist
  • Check wrangler.toml environment variables
  • Ensure OAuth redirect URLs match deployed worker URL
  • Clear browser cookies and retry authentication
Database Connection Issues
"Database connection error"

Solutions:

  • Verify MODFLOW_AI_MCP_01_CONNECTION_STRING secret is set correctly
  • Test Neon database connectivity outside of Cloudflare
  • Check database credentials and permissions
  • Review Cloudflare Workers logs for detailed error messages
Boolean Parameter Issues
include_snippet=false still shows snippets

Solutions:

  • This was fixed in our implementation with the parseBool helper
  • MCP clients may pass booleans as strings - our tools handle this automatically
  • Verify you're using the latest deployed version

Development Debugging

# Check deployment status
npx wrangler tail your-worker-name --format pretty

# Local development with full logging
pnpm run dev

# Test specific tools
pnpm run test:client

# Check configuration
curl https://your-worker-name.your-subdomain.workers.dev/

📊 Recent Improvements & Version History

Latest Version: Bulletproof Parameter Parsing + STDIO Server (2025)

✅ Recently Completed Features
  • 🛡️ BULLETPROOF REPOSITORY PARSING - Ultra-flexible parameter parsing accepts ANY format!
  • 📡 STDIO SERVER WORKING - Local MCP server with same bulletproof parsing
  • 🎯 VSCode Agent Compatible - Handles all formats VSCode agents can generate
  • 🔐 OAuth Authentication Fixed - GitHub and Google sign-in working perfectly
  • 🎨 Glassmorphism Login UI - Beautiful provider selection with animated backgrounds
  • 👥 Complete User Management - 15 GitHub users + 11 Google users in production allowlist
  • 🔧 CORS Issues Resolved - Proper headers for authenticated MCP connections
  • 🔍 Query Parsing Improved - Fixed plainto_tsquery for simple queries, to_tsquery for advanced
  • 📄 Pagination Feature - Automatic pagination for large files (70KB+) with page navigation
🎛️ Complete Tool Set
  • 7 specialized tools covering tutorials, code, documentation, semantic search, file retrieval, and the MODFLOW AI overview
  • Rich metadata display with user-controlled arrays and snippets
  • Advanced filtering by package code, model family, complexity, and repository
  • Automatic acronym expansion with centralized MODFLOW/PEST mappings
  • GitHub URL integration for direct access to source code
  • Comprehensive error handling with detailed debugging information
🔧 Technical Improvements
  • Production deployment on Cloudflare Workers Edge with global performance
  • Robust authentication flow with encrypted session management
  • Database optimization with proper plainto_tsquery usage for reliability
  • Clean modular architecture with separation of concerns
  • Comprehensive logging for debugging and monitoring
🛠️ Tool Specialization Status
  • search_tutorials: ✅ Working - Tutorial and workflow discovery
  • search_code: ✅ Working - API and module documentation
  • search_docs: ✅ Working - Theory and reference material
  • semantic_search_tutorials: ✅ Working - Concept-based tutorial discovery
  • semantic_search_docs: ✅ Working - Semantic documentation search
  • get_file_content: ✅ Working - Complete file retrieval with pagination
  • get_modflow_ai_info: ✅ Working - MODFLOW AI overview
🚀 Deployment Status
  • Live Production URL: https://mcp-mfai-tools.little-grass-273a.workers.dev
  • Authentication: Fully functional OAuth with GitHub and Google
  • User Access: Controlled allowlist with 26 authorized users
  • Performance: Edge deployment with global CDN
  • Reliability: All tools tested and working in production

Design Philosophy

User Control: Every feature is explicitly controlled by user parameters - no "intelligent" assumptions or hardcoded behavior.

Performance: Efficient SQL queries with proper indexing and caching strategies.

Reliability: Comprehensive error handling and fallback mechanisms.

Extensibility: Clean, modular architecture for easy feature additions.

🔮 Community & Contributing

Getting Involved

This project is designed to serve the MODFLOW/PEST community with powerful, user-controlled search capabilities. We welcome:

  • Feature requests based on real user needs
  • Performance improvements and optimization suggestions
  • Documentation improvements and usage examples
  • Integration suggestions with other groundwater modeling tools

Development Guidelines

  • No hardcoding - everything must be user-controlled
  • Comprehensive testing - all features must be thoroughly tested
  • Clear documentation - every parameter and option explained
  • Performance first - efficient queries and minimal latency
  • Security focused - proper authentication and access control

Support Channels

  • Issues: GitHub Issues for bug reports and feature requests
  • Documentation: This README and technical documentation in docs/
  • Examples: Working examples in examples/ directory
  • Community: MODFLOW user forums and mailing lists

🎉 Mastra Agent Integration

NEW: MODFLOW AI Agent with Mastra Framework

We've created a Mastra Agent that integrates all MCP tools into a conversational AI assistant!

Features
  • 🤖 Interactive Playground: Web UI at http://localhost:4113 for testing
  • 🔌 Full MCP Integration: All 7 MODFLOW AI tools available through the agent
  • 💬 Conversational Interface: Natural language queries with intelligent tool selection
  • 🚀 API Access: REST API endpoint for programmatic access
  • 🧪 Comprehensive Testing: Unit, integration, E2E, and manual tests included
Quick Start
cd mfai-mcp-agent
pnpm install
pnpm dev

# Open browser to http://localhost:4113
# Select "MODFLOW Documentation Assistant" agent
# Start asking questions!
Critical Implementation Note

MCP tools MUST be loaded at build time using await mcp.getTools():

// ✅ CORRECT - Load tools at build time
export const modflowAgent = new Agent({
  name: 'MODFLOW Documentation Assistant',
  model: openai('gpt-4o-mini'),
  tools: await mcp.getTools(), // CRITICAL: Build-time loading!
});

// ❌ WRONG - Dynamic loading does NOT work
const agent = new Agent({...});
agent.tools = await mcp.getTools(); // This will fail!

See mfai-mcp-agent/README.md for complete documentation.

📝 Recent Updates

January 11, 2025 - OutputSchema Implementation for Structured Responses

✅ Implemented outputSchema for get_file_content Tool
  • Achievement: Successfully added MCP outputSchema support for structured JSON responses
  • Problem Solved: MCP SDK error "Tool has an output schema but did not return structured content"
  • Solution: Tools with outputSchema must return both structuredContent and content fields
  • Optimization: Reduced response size by 50% using minimal text in content field
  • Implementation:
    • Added comprehensive outputSchema definition to get_file_content tool
    • Modified STDIO server to wrap responses correctly for tools with outputSchema
    • Fixed console.log output corrupting the STDIO protocol stream by logging with console.error (stderr) instead
  • Testing: All tests passing with proper structured data validation
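
The rule described above, that a tool with an outputSchema returns both structuredContent and content, can be sketched as a small wrapper. This is an illustration under assumptions: the `FilePage` fields and the helper name `wrapStructured` are hypothetical and do not reproduce the project's actual outputSchema.

```typescript
// Shape of an MCP tool result that carries structured data plus a text block.
interface ToolResult<T> {
  structuredContent: T;
  content: Array<{ type: "text"; text: string }>;
}

// Hypothetical payload for one page of a file; the real schema may differ.
interface FilePage {
  filename: string;
  page: number;
  totalPages: number;
  content: string;
}

// Wrap structured data with a short text summary instead of repeating the full payload.
function wrapStructured<T>(data: T, summary: string): ToolResult<T> {
  return {
    structuredContent: data,
    content: [{ type: "text", text: summary }],
  };
}

const result = wrapStructured<FilePage>(
  { filename: "pestpp-ies.md", page: 1, totalPages: 5, content: "..." },
  "pestpp-ies.md: page 1 of 5",
);

// Log to stderr: on a STDIO transport, stdout is reserved for protocol messages.
console.error(JSON.stringify(result.content));
```

Keeping the text block to a one-line summary is one way to avoid sending the same data twice, which is in the spirit of the response-size reduction noted above.
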

January 10, 2025 - Mastra Agent Integration

✅ Created Mastra Agent with MCP Tools
  • Achievement: Successfully integrated all 7 MCP tools into a Mastra agent
  • Solution: Discovered that MCP tools must be loaded at build time, not dynamically
  • Implementation: Created modflow-build-time.ts with proper tool loading pattern
  • Testing: Full test suite with unit, integration, E2E, and manual tests
  • Documentation: Updated CLAUDE.md with critical solution for future reference

January 8, 2025 - Critical Bug Fixes

✅ Fixed get_file_content Pagination Issues
  • Problem: Large files (>70KB) were failing with "invalid escape string" errors
  • Root Cause: PostgreSQL's SUBSTRING function was interpreting escape sequences in JSON/notebook content
  • Solution: Replaced SUBSTRING with SUBSTR function which treats content as raw text
  • Impact: All file types now load correctly including complex Jupyter notebooks and documentation
✅ Optimized Page Size for MCP Token Limits
  • Problem: Large pages exceeded the 25,000-token response limit enforced by the MCP client
  • Solution: Reduced page size from 70KB to 30KB per page
  • Result: gpr_emulation_hosaki.ipynb (5.3MB) now properly paginated into 179 pages
✅ Improved Pagination Architecture
  • Enhancement: Separated metadata checking from content loading
  • New Functions:
    • checkFileMetadata() - Gets file size without loading content
    • loadFileContent() - Handles pagination with proper SUBSTR queries
  • Benefit: Prevents loading entire large files into memory before pagination
✅ Enhanced Observability
  • Added: Cloudflare Workers observability configuration
  • Benefit: Better debugging and monitoring of production issues
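
The pagination described in these fixes comes down to simple arithmetic on the file length, followed by a SUBSTR call for the requested page. The sketch below assumes that 30KB means 30 * 1024 characters and that lengths are measured in characters (which is what PostgreSQL's SUBSTR counts). The names `planPage` and `pageQuery`, the table `files`, and its columns are hypothetical. The explicit `::int` casts are a precaution chosen for this sketch, so that the numeric parameters cannot be mistaken for text arguments.

```typescript
const PAGE_SIZE = 30 * 1024; // assumed interpretation of the 30KB page size

interface PagePlan {
  totalPages: number;
  start: number; // 1-based character offset, as SUBSTR expects
  length: number; // number of characters on this page
}

// Work out which slice of the file a page covers, using only the file length.
function planPage(totalLength: number, page: number, pageSize: number = PAGE_SIZE): PagePlan {
  const totalPages = Math.max(1, Math.ceil(totalLength / pageSize));
  if (!Number.isInteger(page) || page < 1 || page > totalPages) {
    throw new RangeError(`page must be between 1 and ${totalPages}`);
  }
  const start = (page - 1) * pageSize + 1;
  const length = Math.min(pageSize, totalLength - (start - 1));
  return { totalPages, start, length };
}

// Build a parametrized query for one page. Table and column names are hypothetical.
function pageQuery(filepath: string, plan: PagePlan) {
  return {
    text: "SELECT SUBSTR(content, $2::int, $3::int) AS page_content FROM files WHERE filepath = $1",
    values: [filepath, plan.start, plan.length] as const,
  };
}
```

Because `planPage` needs only the length, it fits the split between a metadata check and content loading: the size can be fetched first, and only the requested slice is read afterwards.
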

Known Working Examples

  • ✅ pestpp-ies.md (147KB, 5 pages)
  • ✅ gpr_emulation_hosaki.ipynb (5.3MB, 179 pages)
  • ✅ All FloPy/PyEMU modules and workflows
  • ✅ All PEST/MODFLOW documentation files
  • ✅ Mastra agent with full MCP tool integration

📄 License

MIT License. See the license file in the repository for details.

🤝 Acknowledgments

Built for the MODFLOW/PEST community with comprehensive search capabilities across:

  • MODFLOW 6 documentation and examples
  • FloPy Python package modules and workflows
  • PyEMU uncertainty analysis tools and tutorials
  • PEST parameter estimation documentation
  • MODFLOW-USG unstructured grid resources

Built with ❤️ for the groundwater modeling community

Empowering researchers, consultants, and students with intelligent access to MODFLOW/PEST knowledge