dawiddutoit/librarian-mcp-server
Librarian MCP Server
A high-performance MCP (Model Context Protocol) server that provides intelligent file indexing and lightning-fast search capabilities for your repositories. Librarian automatically creates and maintains an index of your project files, enabling instant file discovery without repeated filesystem traversal.
Why Librarian?
When working with large codebases, finding files quickly is crucial. Traditional file search methods require traversing the entire filesystem each time, which can be slow. Librarian solves this by maintaining an efficient index at .claude/workspace/workspace.yml, providing:
- Instant Results: Search through thousands of files in milliseconds
- Smart Indexing: Automatically respects .gitignore patterns
- Type-Aware: Built-in understanding of 19+ programming languages
- Zero Configuration: Works out of the box with any git repository
Features
- 🚀 Fast File Search - Search files by name pattern without filesystem traversal
- 🔍 Regex Search - Use regular expressions for advanced file searching
- 📁 Type-based Search - Find files by type (Python, JavaScript, TypeScript, etc.)
- 📊 Index Statistics - Get insights about your repository structure
- 🔄 Index Refresh - Update the index when files change
- 🚫 Gitignore Support - Automatically respects .gitignore patterns
- 📦 Language Support - Recognizes 19+ programming languages and file types
- 🔧 MCP Protocol - Full compliance with Model Context Protocol standards
Installation
Install from GitHub (recommended)
# Using uvx (recommended)
uvx --from git+https://github.com/dawiddutoit/librarian-mcp-server.git librarian-mcp-server
# Or install with pip
pip install git+https://github.com/dawiddutoit/librarian-mcp-server.git
Install for development
git clone https://github.com/dawiddutoit/librarian-mcp-server.git
cd librarian-mcp-server
uv pip install -e .
Configuration
Add to your .mcp.json configuration file:
{
  "mcpServers": {
    "librarian": {
      "command": "uvx",
      "args": ["librarian-mcp-server"],
      "type": "stdio",
      "description": "File indexer and retrieval for projects"
    }
  }
}
How It Works
- Automatic Indexing: On first run, Librarian scans your repository and creates an index at .claude/workspace/workspace.yml
- Fast Searching: All searches use the index for instant results without filesystem traversal
- Repository Detection: Automatically detects the repository root by finding the .git directory
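The repository detection step can be sketched as a walk up the directory tree. This is a minimal illustration, not Librarian's actual code: the function names `find_repo_root` and `index_path_for` are hypothetical, and only the `.git` marker and the index path come from the description above.

```python
from pathlib import Path
from typing import Optional


def find_repo_root(start: Path) -> Optional[Path]:
    """Return the nearest ancestor of `start` (inclusive) that contains `.git`.

    Hypothetical helper: `.git` may be a directory or, in worktrees and
    submodules, a file, so only existence is checked.
    """
    current = start.resolve()
    for candidate in (current, *current.parents):
        if (candidate / ".git").exists():
            return candidate
    return None


def index_path_for(repo_root: Path) -> Path:
    """Build the index location named in this README, relative to the root."""
    return repo_root / ".claude" / "workspace" / "workspace.yml"
```

Calling `find_repo_root(Path.cwd())` from any subdirectory of a checkout would return the checkout root, or `None` when no `.git` entry is found on the way up.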
Available Tools
search_files
Search for files by name pattern (substring match).
Parameters:
- query: Search pattern
- case_sensitive: Whether to perform case-sensitive search (default: false)
- limit: Maximum results to return (default: 100)
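To make the three parameters concrete, here is a rough sketch of a substring search over an in-memory list of indexed paths. It assumes the match is made against the file name rather than the full path, which the README does not state; `search_index_by_name` is a hypothetical stand-in, not the server's implementation.

```python
from pathlib import PurePosixPath


def search_index_by_name(paths, query, case_sensitive=False, limit=100):
    """Return up to `limit` paths whose file name contains `query`.

    Assumption: matching is done on the final path component only.
    """
    needle = query if case_sensitive else query.lower()
    results = []
    for path in paths:
        if len(results) >= limit:
            break  # stop scanning once the cap is reached
        name = PurePosixPath(path).name
        haystack = name if case_sensitive else name.lower()
        if needle in haystack:
            results.append(path)
    return results
```

Because the scan only touches strings already held in memory, no filesystem traversal happens at query time.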
search_files_regex
Search for files using regular expressions.
Parameters:
- pattern: Regular expression pattern
- case_sensitive: Whether to perform case-sensitive search (default: false)
- limit: Maximum results to return (default: 100)
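A regex search over the same kind of index could look like the sketch below. It assumes the pattern is applied to the relative path with `re.search`, and that `case_sensitive=False` maps to `re.IGNORECASE`; neither detail is confirmed by the README, and `search_index_by_regex` is a hypothetical name.

```python
import re


def search_index_by_regex(paths, pattern, case_sensitive=False, limit=100):
    """Return up to `limit` paths matching the regular expression `pattern`.

    Assumption: the pattern may match anywhere in the relative path.
    """
    flags = 0 if case_sensitive else re.IGNORECASE
    compiled = re.compile(pattern, flags)  # raises re.error on a bad pattern
    results = []
    for path in paths:
        if len(results) >= limit:
            break
        if compiled.search(path):
            results.append(path)
    return results
```

Compiling the pattern once up front keeps the per-path cost low and surfaces an invalid pattern before any scanning starts.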
search_by_type
Search for files by programming language or file type.
Parameters:
- file_type: Type to search for (python, javascript, typescript, kotlin, etc.)
- pattern: Optional name pattern to filter results
- limit: Maximum results to return (default: 100)
Supported file types:
- python: .py, .pyi, .pyx, .pxd
- javascript: .js, .mjs, .cjs
- typescript: .ts, .tsx, .d.ts
- kotlin: .kt, .kts
- java: .java
- go: .go
- rust: .rs
- cpp: .cpp, .cxx, .cc, .hpp, .hxx, .h++
- c: .c, .h
- csharp: .cs
- ruby: .rb
- php: .php
- swift: .swift
- markdown: .md, .markdown
- json: .json
- yaml: .yml, .yaml
- xml: .xml
- html: .html, .htm
- css: .css, .scss, .sass, .less
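One way to map a file name to one of these types is a longest-suffix lookup, which matters for compound extensions such as .d.ts. The sketch below uses a subset of the table above; the names `EXTENSIONS_BY_TYPE` and `detect_type` are hypothetical, and the server may resolve types differently.

```python
# Subset of the extension table listed above, for illustration only.
EXTENSIONS_BY_TYPE = {
    "python": [".py", ".pyi", ".pyx", ".pxd"],
    "javascript": [".js", ".mjs", ".cjs"],
    "typescript": [".ts", ".tsx", ".d.ts"],
    "cpp": [".cpp", ".cxx", ".cc", ".hpp", ".hxx", ".h++"],
    "c": [".c", ".h"],
    "yaml": [".yml", ".yaml"],
}

# Flatten to (suffix, type) pairs, longest suffix first, so that
# ".d.ts" is tried before ".ts" and ".cc" before ".c".
SUFFIXES = sorted(
    ((ext, file_type) for file_type, exts in EXTENSIONS_BY_TYPE.items() for ext in exts),
    key=lambda pair: len(pair[0]),
    reverse=True,
)


def detect_type(path):
    """Return the type name for `path`, or None if no suffix matches."""
    name = path.rsplit("/", 1)[-1].lower()
    for ext, file_type in SUFFIXES:
        if name.endswith(ext):
            return file_type
    return None
```

A lookup like this would let a type filter such as `file_type="typescript"` be answered from metadata stored in the index instead of re-inspecting each file.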
refresh_index
Refresh the file index by rescanning the repository.
get_index_stats
Get statistics about the current file index, including file counts by type.
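As a rough idea of what per-type statistics involve, the sketch below counts index entries by type. The entry shape (dicts with `type` and `size` keys) and the returned field names are assumptions made for illustration; the README does not document the actual output format.

```python
from collections import Counter


def summarize_index(entries):
    """Aggregate hypothetical index entries into totals and per-type counts."""
    by_type = Counter(entry["type"] for entry in entries)
    return {
        "total_files": len(entries),
        "total_size": sum(entry["size"] for entry in entries),
        "files_by_type": dict(by_type),
    }
```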
Resources
index://stats
Returns current index statistics as JSON.
index://config
Returns server configuration including repository path and supported file types.
Index Location
The index is stored at .claude/workspace/workspace.yml in your repository root. This file contains:
- File paths and metadata
- File types and sizes
- Last modification times
- SHA256 hashes for change detection
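Hash-based change detection can be sketched with the standard library alone. The helper names `sha256_of` and `has_changed` are hypothetical, and how Librarian actually combines hashes with modification times is not described here.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def has_changed(path: Path, recorded_hash: str) -> bool:
    """Compare the file's current hash with the one stored in the index."""
    return sha256_of(path) != recorded_hash
```

A refresh could then skip re-indexing any file whose stored hash still matches, at the cost of reading each file once.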
Ignored Files
Librarian automatically ignores:
- Files and directories in .gitignore
- Hidden files and directories (starting with .)
- Common build/dependency directories: node_modules, __pycache__, venv, env, build, dist, target
- The .git and .claude directories
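The built-in rules (everything except .gitignore parsing) can be approximated with a small path filter. This is a sketch under the assumption that paths are relative to the repository root and use forward slashes; `is_ignored` is a hypothetical name, and .gitignore handling is deliberately left out.

```python
from pathlib import PurePosixPath

# Directory names taken from the list above.
IGNORED_DIRS = {"node_modules", "__pycache__", "venv", "env", "build", "dist", "target"}


def is_ignored(relative_path: str) -> bool:
    """Apply the hidden-file and build-directory rules to a relative path."""
    parts = PurePosixPath(relative_path).parts
    # Hidden files and directories, which also covers .git and .claude.
    if any(part.startswith(".") for part in parts):
        return True
    # Only directory components are checked against the build/dependency set.
    return any(part in IGNORED_DIRS for part in parts[:-1])
```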
Examples
Basic File Search
# Search for files containing "user" in the name
result = await search_files("user")
# Returns: UserController.py, user_model.py, user.test.js
# Case-sensitive search
result = await search_files("README", case_sensitive=True)
# Returns: README.md (but not readme.txt)
Regex Search
# Find all test files
result = await search_files_regex(r".*\.test\.(js|py|ts)$")
# Find files with version numbers
result = await search_files_regex(r"v\d+\.\d+")
Search by File Type
# Find all Python files
result = await search_by_type("python")
# Find TypeScript files with "component" in the name
result = await search_by_type("typescript", pattern="component")
How It Works
- Automatic Detection: Librarian automatically detects your repository root by finding the .git directory
- Smart Indexing: On first run, it scans your repository and creates an index, ignoring files in .gitignore
- Fast Retrieval: All searches use the pre-built index for instant results
- Easy Updates: Use the refresh_index tool to update the index when files change
Supported File Types
Librarian recognizes and categorizes files by type:
- Python: .py, .pyi, .pyx, .pxd
- JavaScript: .js, .mjs, .cjs
- TypeScript: .ts, .tsx, .d.ts
- Java: .java
- Kotlin: .kt, .kts
- Go: .go
- Rust: .rs
- C/C++: .c, .h, .cpp, .hpp, .cc
- C#: .cs
- Ruby: .rb
- PHP: .php
- Swift: .swift
- Web: .html, .css, .scss
- Data: .json, .yaml, .xml
- Documentation: .md, .markdown
Performance
Librarian is designed for speed:
- Initial indexing: ~1000 files/second
- Search operations: <10ms for repos with 10,000+ files
- Memory efficient: Index size is typically <1% of repository size
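To check the search latency figure on your own machine, a quick timing harness over synthetic paths is enough. The sketch below measures a plain in-memory substring scan, not Librarian itself; the path layout and the name `timed_search` are made up for the test, and results will vary by hardware.

```python
import time

# 10,000 synthetic relative paths standing in for an index.
SYNTHETIC_PATHS = [f"src/module_{i}/file_{i}.py" for i in range(10_000)]


def timed_search(paths, needle):
    """Return the matching paths and the elapsed wall-clock time in milliseconds."""
    start = time.perf_counter()
    hits = [path for path in paths if needle in path]
    elapsed_ms = (time.perf_counter() - start) * 1000
    return hits, elapsed_ms


if __name__ == "__main__":
    hits, elapsed_ms = timed_search(SYNTHETIC_PATHS, "file_42")
    print(f"{len(hits)} hits in {elapsed_ms:.2f} ms")
```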
Development
Prerequisites
- Python 3.10+
- UV package manager
Setup
# Clone the repository
git clone https://github.com/dawiddutoit/librarian-mcp-server.git
cd librarian-mcp-server
# Install dependencies with UV
uv pip install -e ".[dev]"
Running Tests
# Run all tests
pytest
# Run with coverage
pytest --cov=librarian_mcp_server
# Run specific test file
pytest tests/test_integration.py
Building
# Build distribution packages
uv build
# The built packages will be in dist/
ls dist/
Contributing
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
Troubleshooting
Index not updating?
Run the refresh_index tool to manually update the index:
{"tool": "refresh_index"}
Files missing from search?
Check if they're ignored by .gitignore. Librarian respects gitignore patterns.
Performance issues?
For very large repositories (>50,000 files), the initial indexing might take a few seconds. Subsequent searches remain fast.
License
MIT License - see the license file in the repository for details.
Acknowledgments
Built with the Model Context Protocol to provide seamless integration with AI assistants.