Knowledge Server
A specialized MCP (Model Context Protocol) server that makes large technical documentation accessible to LLMs through intelligent chunking and retrieval. Originally designed for OpenAPI specifications but now supports general knowledge management including markdown documents.
Overview
The Knowledge Server addresses a fundamental problem: large technical documentation (OpenAPI specs, markdown docs, etc.) cannot fit in LLM context windows. It provides intelligent chunking, semantic search, and automatic reference resolution to deliver complete, accurate information through a simple interface.
Key Features
- Universal Document Support: OpenAPI specifications (JSON/YAML) and markdown documents
- Intelligent Chunking: Context-aware splitting with reference tracking
- Semantic Search: Vector-based similarity search with configurable embedding models
- Reference Resolution: Automatic expansion of related content and dependencies
- Research Agent: Intelligent ReAct agent for comprehensive analysis
- MCP Integration: Standard Model Context Protocol server for LLM tools
- Configurable LLM Support: Local models (GGUF) and cloud providers (AWS Bedrock)
Quick Start
Installation
# Clone and setup
git clone <repository-url>
cd knowledge-server
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
Configuration
Create a .env file in the project root:
# Required: Document directories
OPENAPI_SPECS_DIR=/path/to/your/openapi/specs
# Vector Store Configuration
VECTOR_STORE_DIR=./data/vectorstore
VECTOR_STORE_COLLECTION=knowledge_base
EMBEDDING_MODEL=dunzhang/stella_en_1.5B_v5
EMBEDDING_DEVICE=mps # mps, cpu, cuda
MAX_TOKENS=8192
# API Index Configuration
API_INDEX_PATH=./data/api_index.json
# Knowledge Retriever Configuration
RETRIEVAL_MAX_PRIMARY_RESULTS=5
RETRIEVAL_MAX_TOTAL_CHUNKS=15
RETRIEVAL_MAX_DEPTH=3
RETRIEVAL_TIMEOUT_MS=5000
CONTEXT_PRIORITIZE_PRIMARY=true
# MCP Server Configuration
MCP_SERVER_NAME=Knowledge Server
MCP_SERVER_HOST=localhost
MCP_SERVER_PORT=8000
# Processing Configuration
SKIP_HIDDEN_FILES=true
SUPPORTED_EXTENSIONS=.json,.yaml,.yml
LOG_PROCESSING_PROGRESS=true
Index Your Documents
# Index both OpenAPI specs and markdown documents
knowledge-server index
# Index only OpenAPI specifications
knowledge-server index --skip-markdown
# Index only markdown documents
knowledge-server index --skip-openapi
# Specify custom markdown directory
knowledge-server index --markdown-dir /path/to/docs
# Control chunk size for markdown (default: 1000 tokens, max: 8000)
knowledge-server index --max-tokens 1500
Start MCP Server
# Start the MCP server
knowledge-server serve
# With verbose output for debugging
knowledge-server serve -v
Usage
As MCP Server
The primary use case is as an MCP server providing two main tools:
search_api(query, max_response_length, max_chunks, include_references, max_depth)
Search your indexed knowledge base and return relevant chunks.
research_api(question)
Use the intelligent ReAct agent for comprehensive analysis and implementation guidance.
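These tools can be exercised from any MCP-compatible client. Below is a minimal sketch using the official mcp Python SDK over stdio; the paths are placeholders to adapt, and the tool arguments follow the signatures above:

import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Launch the Knowledge Server as a stdio subprocess (placeholder paths).
server = StdioServerParameters(
    command="/path/to/knowledge-server/venv/bin/python",
    args=["-m", "src.mcp_server.server"],
    cwd="/path/to/knowledge-server",
)

async def main():
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Focused chunk retrieval.
            chunks = await session.call_tool(
                "search_api",
                arguments={"query": "user authentication", "max_chunks": 10},
            )
            # Broader analysis via the ReAct research agent.
            answer = await session.call_tool(
                "research_api",
                arguments={"question": "What are the pagination best practices?"},
            )
            print(chunks, answer)

asyncio.run(main())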
Direct CLI Usage
For testing and development:
# Ask questions about your documentation
knowledge-server ask "How do I authenticate with the user API?"
# Use the research agent for comprehensive analysis
knowledge-server research "What are the best practices for pagination in this API?"
# Advanced search options
knowledge-server ask "API rate limits" --max-chunks 30 --max-depth 2 --no-references
MCP Client Configuration
Important: All MCP configurations must use the Python executable from the virtual environment (venv/bin/python) to ensure all dependencies are available.
Claude Desktop
Add to your Claude Desktop configuration file:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
{
"mcpServers": {
"knowledge-server": {
"command": "/path/to/knowledge-server/venv/bin/python",
"args": [
"-m", "src.mcp_server.server"
],
"cwd": "/path/to/knowledge-server",
"env": {
"PYTHONPATH": "/path/to/knowledge-server"
}
}
}
}
Or use the convenience script (make sure it is executable: chmod +x run_server.sh):
{
"mcpServers": {
"knowledge-server": {
"command": "/path/to/knowledge-server/run_server.sh",
"cwd": "/path/to/knowledge-server"
}
}
}
VS Code with Cline
Add to your Cline MCP settings:
{
"mcpServers": {
"knowledge-server": {
"command": "/path/to/knowledge-server/venv/bin/python",
"args": ["-m", "src.mcp_server.server"],
"cwd": "/path/to/knowledge-server",
"env": {
"PYTHONPATH": "/path/to/knowledge-server"
}
}
}
}
Generic MCP Client
The server implements the standard MCP protocol and can be used with any compatible client:
# Direct execution
cd /path/to/knowledge-server
source venv/bin/activate
python -m src.mcp_server.server
Document Processing
OpenAPI Specifications
The server processes OpenAPI 3.0/3.1 specifications:
- Supported formats: JSON and YAML
- Intelligent chunking: Operations grouped with related schemas
- Reference resolution: Automatic $ref expansion, so operations arrive with the schemas they depend on (sketched after this list)
- Metadata extraction: Comprehensive tagging and categorization
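To illustrate the idea behind $ref expansion, here is a simplified sketch of resolving a local reference. This is not the server's actual resolver: the sample file name is hypothetical, and JSON-pointer escape sequences and remote references are ignored.

import yaml

def resolve_local_ref(spec: dict, ref: str) -> dict:
    """Follow a local JSON pointer such as '#/components/schemas/Pet'."""
    node = spec
    for part in ref.lstrip("#/").split("/"):
        node = node[part]
    return node

# Hypothetical sample spec from the samples/ directory.
with open("samples/petstore.yaml") as f:
    spec = yaml.safe_load(f)

print(resolve_local_ref(spec, "#/components/schemas/Pet"))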
Markdown Documents
Supports structured markdown processing:
- Header-based chunking: Sections are split at header boundaries, up to a configurable token limit (sketched after this list)
- Reference tracking: Cross-document links and references
- Navigation building: Automatic section hierarchy
- Content analysis: Semantic categorization
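A simplified picture of header-based chunking, for illustration only: the real processor also tracks cross-references and builds the section hierarchy, and counts tokens rather than words.

import re

def chunk_markdown(text: str, max_tokens: int = 1000) -> list[str]:
    """Split markdown at headers, merging adjacent sections up to a rough size budget."""
    # Keep each header together with the body that follows it.
    sections = re.split(r"(?m)^(?=#{1,6} )", text)
    chunks, current = [], ""
    for section in sections:
        # Approximate the token count with a simple word count for this sketch.
        if current and len((current + section).split()) > max_tokens:
            chunks.append(current)
            current = section
        else:
            current += section
    if current:
        chunks.append(current)
    return chunks

# Hypothetical input file.
for chunk in chunk_markdown(open("docs/example.md").read(), max_tokens=1000):
    print(len(chunk.split()), "words")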
Architecture
The Knowledge Server uses a modular architecture:
- Document Processors: Handle OpenAPI and markdown parsing
- Vector Store Manager: ChromaDB integration with configurable embeddings
- Knowledge Retriever: Two-stage retrieval (semantic search, then reference expansion), sketched after this list
- Research Agent: LangGraph-based intelligent analysis
- MCP Server: Standard protocol interface for LLM tools
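The two-stage retrieval flow can be pictured roughly as follows. This is an illustration only, not the retriever's actual code: store.search, store.get, and chunk.references are assumed interfaces, and the limits mirror the RETRIEVAL_* settings.

def retrieve(store, query: str, max_primary: int = 5, max_total: int = 15, max_depth: int = 3):
    """Stage 1: semantic search. Stage 2: breadth-first expansion of referenced chunks."""
    primary = store.search(query, limit=max_primary)      # assumed vector-store API
    seen = {chunk.id for chunk in primary}
    results, frontier = list(primary), list(primary)
    for _ in range(max_depth):
        next_frontier = []
        for chunk in frontier:
            for ref_id in chunk.references:               # assumed chunk attribute
                if ref_id not in seen and len(results) < max_total:
                    ref_chunk = store.get(ref_id)
                    seen.add(ref_id)
                    results.append(ref_chunk)
                    next_frontier.append(ref_chunk)
        frontier = next_frontier
    return results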
See the docs/ directory for detailed component documentation.
Configuration Reference
Environment Variables
| Variable | Description | Default |
|---|---|---|
| OPENAPI_SPECS_DIR | Directory containing OpenAPI specs | Required |
| VECTOR_STORE_DIR | Vector store persistence directory | ./data/vectorstore |
| VECTOR_STORE_COLLECTION | Vector store collection name | knowledge_base |
| EMBEDDING_MODEL | Sentence-transformers model | dunzhang/stella_en_1.5B_v5 |
| EMBEDDING_DEVICE | Device for embeddings | mps |
| MAX_TOKENS | Maximum tokens per chunk | 8192 |
| API_INDEX_PATH | Path to API index file | ./data/api_index.json |
| RETRIEVAL_MAX_PRIMARY_RESULTS | Max primary search results | 5 |
| RETRIEVAL_MAX_TOTAL_CHUNKS | Max total chunks retrieved | 15 |
| RETRIEVAL_MAX_DEPTH | Max reference expansion depth | 3 |
| RETRIEVAL_TIMEOUT_MS | Retrieval timeout in milliseconds | 5000 |
| MCP_SERVER_NAME | MCP server display name | Knowledge Server |
| MCP_SERVER_HOST | MCP server host | localhost |
| MCP_SERVER_PORT | MCP server port | 8000 |
Embedding Models
Supported embedding models (via sentence-transformers):
- dunzhang/stella_en_1.5B_v5 (default) - High-quality English embeddings
- sentence-transformers/all-MiniLM-L6-v2 - Fast, lightweight
- sentence-transformers/all-mpnet-base-v2 - Good balance of speed and quality
- BAAI/bge-large-en-v1.5 - State-of-the-art English embeddings
Device Support
- cpu: CPU-only processing (default, most compatible)
- mps: Apple Silicon GPU acceleration (recommended for M1/M2/M3 Macs)
- cuda: NVIDIA GPU acceleration
- auto: Automatically select the best available device (see the sketch below)
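For the auto option, selection typically follows hardware availability. A minimal check with PyTorch, as a sketch of the general approach rather than the server's exact logic:

import torch

def pick_device() -> str:
    """Prefer CUDA, then Apple Silicon MPS, and fall back to CPU."""
    if torch.cuda.is_available():
        return "cuda"
    if torch.backends.mps.is_available():
        return "mps"
    return "cpu"

print(pick_device())  # e.g. "mps" on an M-series Mac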
Development
Testing
# Run all tests
source venv/bin/activate
python -m pytest tests/ -v
# Run specific test suites
python -m pytest tests/openapi_processor/ -v
python -m pytest tests/markdown_processor/ -v
python -m pytest tests/integration/ -v
Code Quality
# Format and lint
source venv/bin/activate
black src/ tests/
isort src/ tests/
python -m pytest tests/
Project Structure
knowledge-server/
├── src/
│   ├── cli/                  # Command-line interface
│   ├── mcp_server/           # MCP protocol server
│   ├── openapi_processor/    # OpenAPI document processing
│   ├── markdown_processor/   # Markdown document processing
│   ├── vector_store/         # ChromaDB integration
│   ├── retriever/            # Knowledge retrieval engine
│   ├── research_agent/       # Intelligent analysis agent
│   ├── llm/                  # LLM provider abstraction
│   └── utils/                # Shared utilities
├── tests/                    # Comprehensive test suite
├── data/                     # Generated indices and vector store
├── docs/                     # Documentation and specifications
└── samples/                  # Example documents and configurations
Troubleshooting
Common Issues
MCP server not starting
- Check that all dependencies are installed: pip install -r requirements.txt
- Verify the .env file configuration
- Ensure the directories in your config exist and are readable
No search results
- Run indexing first: knowledge-server index
- Check that document directories contain supported files
- Verify ChromaDB persistence directory is writable
Embedding model download fails
- Check internet connectivity
- Try a different embedding model
- Use CPU device if GPU memory is insufficient
LLM provider errors
- Verify AWS credentials and permissions for Bedrock
- Check local model path and GGUF format for local provider
- Ensure sufficient memory for local models
Performance Optimization
- Use GPU acceleration for embeddings when available
- Adjust VECTOR_SEARCH_LIMIT based on your use case
- Consider using smaller embedding models for faster indexing
- Increase the max_chunks parameter for comprehensive but slower searches
License
MIT License - see the LICENSE file for details.
Contributing
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Ensure all tests pass
- Submit a pull request
See the project documentation for development guidelines and coding standards.