package-context-mcp

platypusrex/package-context-mcp

A Model Context Protocol (MCP) server that provides intelligent semantic search across software package documentation and repository content.

Tools
  1. indexRepository

    Index any Git repository directly by URL.

  2. indexPackage

    Index a package by name from supported registries.

  3. searchDocs

    Perform semantic search across indexed documentation.

  4. listRepositories

    List all indexed repositories with filtering options.

Package Documentation MCP Server

A Model Context Protocol (MCP) server that provides intelligent semantic search across software package documentation and repository content. Using a repository-centric architecture with native vector embeddings, this server helps LLMs like Claude understand third-party packages by indexing comprehensive documentation, source code, examples, and more across multiple programming ecosystems.

🚀 Features

  • 🏗️ Repository-Centric Architecture: Repositories as source of truth with package registries as discovery layers
  • 🔍 Semantic Search: Native libSQL vector operations with OpenAI embeddings for intelligent content retrieval
  • 🌍 Multi-Language Support: JavaScript/TypeScript, Python, Rust, Go with ecosystem-aware documentation patterns
  • 📦 Multi-Registry Discovery: npm, PyPI, crates.io, Go modules with automatic repository resolution
  • 🧠 Intelligent Content Processing: Language-aware chunking, deduplication, and importance scoring
  • ⚡ Native Vector Operations: F32_BLOB storage with blazing-fast similarity search
  • 🔧 MCP Integration: HTTP transport compatible with Claude Desktop, Windsurf, Cursor, and other MCP clients
  • 💾 Persistent Storage: libSQL database with type-safe Drizzle ORM operations

🛠️ Installation

Prerequisites

  • Node.js 22+
  • npm/pnpm/yarn
  • Git (for repository cloning)
  • OpenAI API key (for embeddings)

Setup

  1. Clone the repository

    git clone <repository-url>
    cd package-context-mcp
    
  2. Install dependencies

    pnpm install
    
  3. Configure environment

    cp .env.example .env
    # Edit .env with your configuration:
    # OPENAI_API_KEY=your_openai_api_key
    # TURSO_DATABASE_URL=file:./package-context-mcp.db (default)
    # TURSO_AUTH_TOKEN=noop (default for local)
    # PORT=5309 (default)
    
  4. Run database migrations

    pnpm run db:generate
    pnpm run db:migrate
    

🔧 Configuration

The server uses environment variables with sensible defaults:

| Variable | Default | Description |
| --- | --- | --- |
| OPENAI_API_KEY | - | Required: OpenAI API key for embeddings |
| TURSO_DATABASE_URL | file:./package-context-mcp.db | libSQL database URL |
| TURSO_AUTH_TOKEN | noop | Database auth token (for Turso cloud) |
| PORT | 5309 | Server port |

For local development, the server uses a local libSQL database with native vector support.
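The variables and defaults listed above can be expressed as a small loader. This is a minimal sketch, not the server's actual code: `loadConfig` and the `ServerConfig` shape are hypothetical names chosen for illustration; only the variable names and default values come from this README.

```typescript
// Hypothetical config shape; the field names are illustrative only.
interface ServerConfig {
  openaiApiKey: string;
  databaseUrl: string;
  authToken: string;
  port: number;
}

// Reads the documented variables from an env-like record and applies the documented defaults.
function loadConfig(env: Record<string, string | undefined>): ServerConfig {
  const openaiApiKey = env.OPENAI_API_KEY;
  if (!openaiApiKey) {
    // OPENAI_API_KEY is the only documented variable without a default.
    throw new Error("OPENAI_API_KEY is required");
  }

  const port = Number(env.PORT ?? "5309");
  if (!Number.isInteger(port) || port <= 0 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }

  return {
    openaiApiKey,
    databaseUrl: env.TURSO_DATABASE_URL ?? "file:./package-context-mcp.db",
    authToken: env.TURSO_AUTH_TOKEN ?? "noop",
    port,
  };
}
```

In a Node.js process this would typically be called as `loadConfig(process.env)`, which fails at startup rather than on the first embedding request when the API key is missing.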

🚀 Usage

Development Mode

pnpm dev

Production Mode

pnpm build
pnpm start

MCP Inspector (Testing)

pnpm inspect

The server will start on http://localhost:5309 with the MCP endpoint at /model-context.

🔌 MCP Integration

This server uses HTTP transport and works with any MCP client that supports HTTP connections, including modern AI-powered IDEs and development tools.

Claude Desktop

Add to your Claude Desktop MCP configuration using mcp-remote:

{
  "mcpServers": {
    "package-context": {
      "command": "npx",
      "args": [
        "mcp-remote",
        "http://localhost:5309/model-context"
      ]
    }
  }
}

IDE Integration

The server works with modern AI-powered development environments:

Windsurf IDE:

  • Configure MCP server connection in Windsurf settings
  • Use HTTP transport: http://localhost:5309/model-context

Cursor IDE:

  • Add MCP server configuration for HTTP endpoint
  • Access package documentation directly in your development workflow

Other MCP-Compatible Tools:

  • Any tool supporting MCP over HTTP can connect
  • Use the endpoint: http://localhost:5309/model-context
  • Transport type: streamable-http
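For context on what travels over that endpoint: MCP messages are JSON-RPC 2.0, and a tool invocation is a `tools/call` request carrying the tool name and its arguments. The sketch below only builds such a request body. `buildToolCall` is a hypothetical helper, not part of this server or of any SDK, and a real session must first complete the MCP `initialize` handshake, so an MCP client library is usually the better choice.

```typescript
// JSON-RPC 2.0 request shape that MCP uses for tool invocations.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: builds the body a client would send to the MCP endpoint.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Example: the searchDocs tool documented in this README.
const request = buildToolCall(1, "searchDocs", {
  query: "authentication middleware examples",
});
const body = JSON.stringify(request);
```

The resulting `body` is what a client would POST to http://localhost:5309/model-context once the session is initialized.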

MCP Inspector

Test the server using the MCP Inspector:

npx @modelcontextprotocol/inspector http --url http://localhost:5309/model-context

🛠️ Available Tools

indexRepository

Index any Git repository directly by URL.

Parameters:

  • repositoryUrl (required): Git repository URL (GitHub, GitLab, Bitbucket, etc.)
  • forceReindex (optional): Force reindex if repository already exists (default: false)

Example:

// Index React repository
await indexRepository({ 
  repositoryUrl: "https://github.com/facebook/react" 
})

// Force reindex existing repository
await indexRepository({ 
  repositoryUrl: "https://github.com/vercel/next.js",
  forceReindex: true 
})

indexPackage

Index a package by name from supported registries.

Parameters:

  • packageName (required): Name of the package
  • registry (optional): Registry to search - npm, pypi, crates, go (default: npm)
  • version (optional): Specific version to index (defaults to latest)
  • forceReindex (optional): Force reindex if repository already exists (default: false)

Example:

// Index latest React from npm
await indexPackage({ packageName: "react" })

// Index specific Python package version
await indexPackage({ 
  packageName: "fastapi", 
  registry: "pypi", 
  version: "0.104.1" 
})

// Index Rust crate
await indexPackage({ 
  packageName: "serde", 
  registry: "crates" 
})

// Index Go module
await indexPackage({ 
  packageName: "github.com/gin-gonic/gin", 
  registry: "go" 
})
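The optional parameters above imply a defaulting step before indexing starts. A minimal sketch of that step follows, assuming hypothetical type and function names (`IndexPackageInput`, `resolveIndexPackageInput`); the string "latest" is a placeholder chosen for this sketch, since the README only says the version defaults to the latest release.

```typescript
type Registry = "npm" | "pypi" | "crates" | "go";

// Mirrors the documented indexPackage parameters.
interface IndexPackageInput {
  packageName: string;
  registry?: Registry;
  version?: string;
  forceReindex?: boolean;
}

interface ResolvedIndexPackageInput {
  packageName: string;
  registry: Registry;
  version: string; // "latest" is a placeholder used by this sketch
  forceReindex: boolean;
}

// Applies the documented defaults: registry "npm", latest version, no forced reindex.
function resolveIndexPackageInput(input: IndexPackageInput): ResolvedIndexPackageInput {
  const packageName = input.packageName.trim();
  if (packageName.length === 0) {
    throw new Error("packageName is required");
  }
  return {
    packageName,
    registry: input.registry ?? "npm",
    version: input.version ?? "latest",
    forceReindex: input.forceReindex ?? false,
  };
}
```

For example, `resolveIndexPackageInput({ packageName: "react" })` resolves to the npm registry with no forced reindex, matching the first call shown above.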

searchDocs

Semantic search across indexed documentation.

Parameters:

  • query (required): Search query string
  • repositoryId (optional): Scope search to specific repository (accepts URLs, slugs, or IDs)

Example:

// Search across all repositories
await searchDocs({ query: "authentication middleware examples" })

// Search within specific repository
await searchDocs({ 
  query: "useState hook patterns", 
  repositoryId: "https://github.com/facebook/react" 
})

// Search using repository slug
await searchDocs({ 
  query: "async/await patterns", 
  repositoryId: "github.com/microsoft/TypeScript" 
})
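Because `repositoryId` accepts URLs and slugs interchangeably, some normalization has to happen before lookup. The sketch below shows one plausible way to reduce both forms to the same `host/owner/name` slug. `toRepositorySlug` is a hypothetical function and the rules are assumptions; the server's actual matching logic is not described in this README.

```typescript
// Hypothetical normalizer: reduces a repository URL or slug to "host/owner/name".
function toRepositorySlug(input: string): string {
  const stripped = input
    .trim()
    .replace(/^[a-z][a-z0-9+.-]*:\/\//i, "") // drop a scheme such as "https://"
    .replace(/\/+$/, "") // drop trailing slashes
    .replace(/\.git$/i, ""); // drop a ".git" suffix

  const parts = stripped.split("/");
  const host = (parts[0] ?? "").toLowerCase(); // host names are case-insensitive
  return [host, ...parts.slice(1)].join("/"); // keep owner/name casing as given
}
```

With this, `https://github.com/facebook/react` and `github.com/facebook/react` resolve to the same key, while an opaque ID passes through unchanged.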

listRepositories

List all indexed repositories with filtering options.

Parameters:

  • provider (optional): Filter by Git provider (github, gitlab, bitbucket)
  • language (optional): Filter by primary programming language

Example:

// List all repositories
await listRepositories({})

// List only GitHub repositories
await listRepositories({ provider: "github" })

// List TypeScript repositories
await listRepositories({ language: "TypeScript" })

fetchRepositoryFile

Retrieve specific files from indexed repositories.

Parameters:

  • repositoryId (required): Repository identifier
  • filePath (required): Path to file in repository
  • branch (optional): Git branch (defaults to repository default branch)

Status: Coming soon - currently provides web URLs for manual access.

🌐 Debug API Endpoints

For development and debugging, the server exposes REST endpoints:

  • GET / - Server status and information
  • GET /debug/repositories - List all indexed repositories
  • GET /debug/repositories/:id - Get specific repository details
  • GET /debug/docs/:repositoryId - Get documentation chunks for repository

🏗️ Architecture

┌─────────────────────┐    ┌─────────────────────┐    ┌─────────────────────┐
│   MCP Client        │    │   Hono Server       │    │   libSQL DB         │
│  (Claude Desktop)   │◄──►│  (index.ts)         │◄──►│  (Native Vectors)   │
└─────────────────────┘    └─────────────────────┘    └─────────────────────┘
                                      │
                        ┌─────────────┴─────────────┐
                        ▼                           ▼
              ┌─────────────────────┐    ┌─────────────────────┐
              │ RepositoryIndexing  │    │ RegistryDiscovery   │
              │ Service             │    │ Service             │
              └─────────────────────┘    └─────────────────────┘
                        │                           │
              ┌─────────┴─────────┐                 │
              ▼                   ▼                 ▼
    ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
    │ ChunkingService │ │ VectorEmbedding │ │ Language        │
    │ (LangChain)     │ │ Service         │ │ Analysis        │
    └─────────────────┘ └─────────────────┘ └─────────────────┘

Core Components

Repository-Centric Services:

  • RepositoryIndexingService: Git repository cloning, content extraction, and indexing
  • RegistryDiscoveryService: Multi-registry package discovery and repository resolution
  • LanguageAnalysisService: Ecosystem detection and language-aware documentation patterns

Content Processing Pipeline:

  • ChunkingService: LangChain-powered intelligent text splitting with language awareness
  • ContentHashingService: Content normalization, hashing, and deduplication
  • NativeVectorEmbeddingService: OpenAI embeddings with native libSQL F32_BLOB storage
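The normalize, hash, and deduplicate steps attributed to ContentHashingService can be illustrated with a short sketch. The function names, the normalization rules, and the choice of SHA-256 are all assumptions made for this example; the README does not specify which hash or normalization the service uses.

```typescript
import { createHash } from "node:crypto";

// Normalizes line endings and runs of whitespace so trivially different copies hash identically.
function normalizeContent(text: string): string {
  return text
    .replace(/\r\n/g, "\n")
    .replace(/[ \t]+/g, " ")
    .replace(/\n{3,}/g, "\n\n")
    .trim();
}

// SHA-256 over the normalized text (the hash algorithm is an assumption of this sketch).
function hashContent(text: string): string {
  return createHash("sha256").update(normalizeContent(text), "utf8").digest("hex");
}

// Keeps the first chunk seen for each content hash.
function dedupeChunks(chunks: string[]): Map<string, string> {
  const unique = new Map<string, string>();
  for (const chunk of chunks) {
    const key = hashContent(chunk);
    if (!unique.has(key)) {
      unique.set(key, chunk);
    }
  }
  return unique;
}
```

Deduplicating before embedding matters because identical content, such as a license header repeated across files, would otherwise be sent to the embedding API and stored once per occurrence.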

Database Schema:

  • repositories: Primary source of truth for indexed repositories
  • registry_packages: Package registry metadata linked to repositories
  • documentation_chunks: Chunked content with metadata and importance scoring
  • chunk_content: Deduplicated content storage with hashing
  • chunk_embeddings: Native F32_BLOB vector embeddings for semantic search
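To make the role of the chunk_embeddings table concrete, the sketch below ranks chunks against a query embedding in memory. It is only an illustration of similarity ranking: the server delegates this to libSQL's native vector operations, and the use of cosine similarity, along with the names `cosineSimilarity` and `topK`, is an assumption of this example.

```typescript
// Cosine similarity between two equal-length embedding vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("vectors must have the same non-zero length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  return denominator === 0 ? 0 : dot / denominator;
}

interface EmbeddedChunk {
  id: string;
  embedding: number[];
}

// Returns the k chunks most similar to the query, best match first.
function topK(query: number[], chunks: EmbeddedChunk[], k: number): { id: string; score: number }[] {
  return chunks
    .map((chunk) => ({ id: chunk.id, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((left, right) => right.score - left.score)
    .slice(0, k);
}
```

Running the same ranking inside the database avoids loading every embedding into application memory, which is the practical benefit of storing vectors natively.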

🌍 Supported Ecosystems

✅ Fully Supported

  • npm (JavaScript/TypeScript) - package.json, Node.js ecosystem
  • PyPI (Python) - setup.py, pyproject.toml, Python ecosystem
  • crates.io (Rust) - Cargo.toml, Rust ecosystem
  • Go modules - go.mod, Go ecosystem
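The indicator files named in this list suggest a simple detection pass over a repository's root directory. The following is a minimal sketch under that assumption; `detectEcosystems` is a hypothetical function, and the real service may also weigh the repository's primary language.

```typescript
type Ecosystem = "npm" | "pypi" | "crates" | "go";

// Indicator files from the list above, paired with the ecosystem each one signals.
const INDICATOR_FILES: ReadonlyArray<readonly [string, Ecosystem]> = [
  ["package.json", "npm"],
  ["pyproject.toml", "pypi"],
  ["setup.py", "pypi"],
  ["Cargo.toml", "crates"],
  ["go.mod", "go"],
];

// Returns every ecosystem whose indicator file appears among the root-level file names.
function detectEcosystems(rootFiles: string[]): Ecosystem[] {
  const present = new Set(rootFiles);
  const found: Ecosystem[] = [];
  for (const [file, ecosystem] of INDICATOR_FILES) {
    if (present.has(file) && !found.includes(ecosystem)) {
      found.push(ecosystem);
    }
  }
  return found;
}
```

Returning a list rather than a single value accounts for repositories that ship bindings for more than one ecosystem, such as a Rust crate with an npm wrapper.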

🔄 Planned

  • Maven (Java/Kotlin/Scala) - pom.xml, build.gradle
  • RubyGems (Ruby) - Gemfile, .gemspec
  • Packagist (PHP) - composer.json

🎯 Language-Aware Features

  • Ecosystem Detection: Automatic detection based on indicator files and primary language
  • Documentation Patterns: Ecosystem-specific file patterns (README, examples, docs, tests)
  • Content Processing: Language-aware chunking with appropriate separators and boundaries
  • Source Type Classification: Intelligent categorization (readme, documentation, source, test, example, config)
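Source type classification can be approximated from file paths alone. The sketch below assigns one of the six categories listed above using path-based rules. The rules and the name `classifySourceType` are hypothetical; the server's actual heuristics are not documented here.

```typescript
type SourceType = "readme" | "documentation" | "source" | "test" | "example" | "config";

// Hypothetical path-based rules; the first matching rule wins.
function classifySourceType(filePath: string): SourceType {
  const path = filePath.toLowerCase();
  const name = path.split("/").pop() ?? path;

  if (/^readme(\.|$)/.test(name)) return "readme";
  if (
    /(^|\/)(tests?|__tests__)\//.test(path) ||
    /\.(test|spec)\.[a-z]+$/.test(name) ||
    /_test\.[a-z]+$/.test(name)
  ) {
    return "test";
  }
  if (/(^|\/)examples?\//.test(path)) return "example";
  if (/(^|\/)docs?\//.test(path) || /\.(md|mdx|rst)$/.test(name)) return "documentation";
  if (/\.(json|toml|ya?ml)$/.test(name) || name === "go.mod") return "config";
  return "source";
}
```

A label like this could feed the importance scoring mentioned earlier, for example by ranking readme and documentation chunks above test files, though that weighting is also an assumption.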

🚧 Development

Scripts

| Command | Description |
| --- | --- |
| pnpm dev | Start development server with hot reload |
| pnpm build | Build TypeScript to JavaScript |
| pnpm start | Start production server |
| pnpm lint | Run Biome linter |
| pnpm format | Format code with Biome |
| pnpm db:generate | Generate database migrations |
| pnpm db:migrate | Run database migrations |
| pnpm inspect | Start MCP Inspector |

Database Operations

The project uses Drizzle ORM with libSQL and native vector support:

# After modifying schema.ts
pnpm run db:generate  # Generate migration files
pnpm run db:migrate   # Apply migrations

Local Development Database

For local development, the server automatically creates a libSQL database file (package-context-mcp.db) with native vector support enabled.

🗺️ Roadmap

✅ Current Implementation (Repository-Centric Foundation)

  • Repository-centric architecture with multi-provider Git support
  • Native libSQL vector operations with F32_BLOB storage
  • Multi-registry package discovery (npm, PyPI, crates.io, Go)
  • Language-aware content processing and documentation patterns
  • Semantic search with OpenAI embeddings and deduplication
  • Intelligent chunking with LangChain and content hashing

🔄 Phase 2: Enhanced Content Understanding

  • AST-based code analysis with Tree-sitter integration
  • Function, class, and type extraction from source code
  • Enhanced API discovery and usage pattern detection
  • Cross-reference resolution and symbol linking

🔄 Phase 3: Advanced Features

  • Direct file fetching from indexed repositories
  • Package-scoped search and filtering capabilities
  • Migration guide detection and version comparison
  • Performance optimizations for large repositories

🔄 Phase 4: Ecosystem Expansion

  • Maven ecosystem support (Java/Kotlin/Scala)
  • RubyGems and Packagist integration
  • Enhanced multi-language AST support
  • Cross-package relationship mapping

🀝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Development Guidelines

  • Use TypeScript for all new code
  • Follow the existing code style (Biome configuration)
  • Write descriptive commit messages
  • Add tests for new functionality
  • Update documentation as needed

📝 License

This project is licensed under the ISC License - see the LICENSE file for details.

🙋‍♂️ Support

🎯 Use Cases

For Developers

  • Learning: Quickly understand unfamiliar packages across multiple programming languages
  • Integration: Get comprehensive context when integrating third-party libraries
  • API Discovery: Find relevant functions, classes, and usage patterns through semantic search
  • Cross-Ecosystem: Work with packages from different language ecosystems in a unified interface

For AI-Powered IDEs

  • Contextual Code Assistance: Windsurf, Cursor, and other AI IDEs can access real package documentation
  • Intelligent Suggestions: AI assistants get accurate, up-to-date information about dependencies
  • Cross-Reference Navigation: Understand how packages relate to each other across your project
  • Documentation Lookup: Instant access to examples, API docs, and usage patterns without leaving your IDE

For AI Assistants

  • Accurate Code Generation: Generate code using real package APIs and documented patterns
  • Contextual Documentation: Provide detailed explanations with actual package context
  • Best Practices: Recommend usage patterns based on official examples and community practices
  • Multi-Language Support: Assist with packages from JavaScript, Python, Rust, Go, and more

💡 Example Workflows

Discovering a New Package

// 1. Index a package by name
await indexPackage({ packageName: "fastapi", registry: "pypi" })

// 2. Search for specific functionality
await searchDocs({ 
  query: "authentication middleware setup",
  repositoryId: "github.com/tiangolo/fastapi" 
})

// 3. Explore related examples
await searchDocs({ query: "JWT token validation examples" })

Repository-First Workflow

// 1. Index repository directly
await indexRepository({ 
  repositoryUrl: "https://github.com/microsoft/TypeScript" 
})

// 2. Search for TypeScript-specific patterns
await searchDocs({ 
  query: "generic constraints and conditional types",
  repositoryId: "github.com/microsoft/TypeScript"
})

Cross-Language Comparison

// Index equivalent packages in different languages
await indexPackage({ packageName: "express", registry: "npm" })
await indexPackage({ packageName: "fastapi", registry: "pypi" })
await indexPackage({ packageName: "actix-web", registry: "crates" })
await indexPackage({ packageName: "github.com/gin-gonic/gin", registry: "go" })

// Compare web framework patterns
await searchDocs({ query: "middleware authentication patterns" })

Built with ❤️ using TypeScript, Hono, Drizzle ORM, libSQL, LangChain, and the Model Context Protocol