KB-Bridge
A Model Context Protocol (MCP) server for intelligent knowledge base search and retrieval with support for multiple backend providers.
Installation
```bash
pip install kbbridge
```
Quick Start
Configuration
Create a .env file with your retrieval backend credentials:
```bash
# Required - Retrieval Backend Configuration
RETRIEVAL_ENDPOINT=https://api.dify.ai/v1  # Example: Dify endpoint
RETRIEVAL_API_KEY=your-retrieval-api-key
LLM_API_URL=https://your-llm-service.com/v1
LLM_MODEL=gpt-4o
LLM_API_TOKEN=your-token-here

# Optional
RERANK_URL=https://your-rerank-api.com
RERANK_MODEL=your-rerank-model
```
Supported Backends:
| Backend | Status | Notes |
|---|---|---|
| Dify | Supported | Currently available |
| Others | Planned | Additional backends coming soon |
See env.example for all available configuration options.
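The required variables above can be sanity-checked at startup before the server accepts requests. A minimal sketch, assuming a plain mapping of environment variables (the `validate_config` helper is illustrative, not part of the kbbridge package):

```python
import os

# Required variables, matching the configuration section above.
REQUIRED_VARS = [
    "RETRIEVAL_ENDPOINT",
    "RETRIEVAL_API_KEY",
    "LLM_API_URL",
    "LLM_MODEL",
    "LLM_API_TOKEN",
]

def validate_config(env=os.environ):
    """Return the required variables that are missing or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]

# Example: only the endpoint is set, so the other four are reported missing.
missing = validate_config({"RETRIEVAL_ENDPOINT": "https://api.dify.ai/v1"})
print(missing)
```

Failing fast on missing credentials gives a clearer error than a mid-request backend failure.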
Running the Server
```bash
# Start server
python -m kbbridge.server --host 0.0.0.0 --port 5210

# Or using Makefile (if available)
make start
```
The server runs on http://0.0.0.0:5210, with the MCP endpoint at http://0.0.0.0:5210/mcp.
Deployment Options
Option 1: Docker (Local Development / Simple Deployments)
For local development or simple single-container deployments:
```bash
# Build the image
docker build -t kbbridge:latest .

# Run with environment variables
docker run -d \
  --name kbbridge \
  -p 5210:5210 \
  --env-file .env \
  kbbridge:latest
```
For production deployments, use container orchestration platforms like Kubernetes with your preferred deployment method.
Features
- Backend Integration: Extensible architecture supporting multiple retrieval backends
- Multiple Search Methods: Hybrid, semantic, keyword, and full-text search
- Quality Reflection: Automatic answer quality evaluation and refinement
- Custom Instructions: Domain-specific query guidance
Workflow
KB-Bridge follows a multi-stage pipeline to ensure high-quality answers:
```mermaid
flowchart LR
    Start([User Query]) --> Preprocess[Query Preprocessing<br/>Rewriting & Understanding]
    Preprocess --> FileDiscovery[File Discovery<br/>Find Relevant Files]
    FileDiscovery --> Search[Search Stages]
    Search --> Direct[Direct Approach<br/>Simple Retrieval]
    Search --> Advanced[Advanced Approach<br/>File-level Processing]
    Direct --> Candidates
    Advanced --> Candidates[Collect Candidates]
    Candidates --> Synthesis[Answer Synthesis<br/>Rerank & Format]
    Synthesis --> Reflection{Reflection<br/>Enabled?}
    Reflection -->|Yes| Reflect[Quality Check<br/>& Refinement]
    Reflection -->|No| Final
    Reflect --> Final([Final Answer])

    style Start fill:#e1f5ff
    style Final fill:#c8e6c9
    style FileDiscovery fill:#fff9c4
    style Direct fill:#fff9c4
    style Advanced fill:#fff9c4
    style Reflect fill:#ffccbc
    style Synthesis fill:#e1bee7
```
Stage Details
Query Preprocessing (Optional)
- Query Rewriting: LLM-based expansion/relaxation to improve recall
- Query Understanding: Extract intent and decompose complex queries
File Discovery
- Semantic search to identify relevant files (recall-focused)
- Optional quality evaluation with automatic search expansion if quality is low
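The automatic expansion behavior can be pictured as a simple loop: if the best candidate's score falls below a quality threshold, widen the search and retry. A sketch under stated assumptions (the function names, thresholds, and the `(file_id, score)` result shape are illustrative, not the kbbridge API):

```python
def discover_files(search_fn, query, top_k=5, min_score=0.5, max_top_k=40):
    """Retry discovery with a wider top_k while result quality looks low.

    search_fn(query, top_k) should return (file_id, score) pairs sorted by
    descending score.
    """
    while True:
        results = search_fn(query, top_k)
        if results and results[0][1] >= min_score:
            return results
        if top_k >= max_top_k:
            return results  # stop widening; return best effort
        top_k = min(top_k * 2, max_top_k)

# Stub backend: scores improve once enough candidates are requested.
def stub_search(query, top_k):
    return [("doc-a", 0.8 if top_k >= 10 else 0.3), ("doc-b", 0.2)][:top_k]

print(discover_files(stub_search, "safety protocols"))
```

The cap on `top_k` keeps a persistently low-quality query from expanding indefinitely.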
Search Stages (Parallel)
- Direct Approach: Simple query → retrieval → answer extraction (fallback)
- Advanced Approach: File-level processing with content boosting for precision
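Running the two approaches concurrently can be sketched with `asyncio.gather`; the two coroutines below are stand-ins for the real retrieval stages, not the actual DatasetProcessor internals:

```python
import asyncio

async def direct_search(query):
    # Simple retrieval: fast path, also serves as the fallback.
    return [{"text": f"direct hit for {query!r}", "score": 0.6}]

async def advanced_search(query):
    # File-level processing with content boosting for precision.
    return [{"text": f"boosted hit for {query!r}", "score": 0.9}]

async def collect_candidates(query):
    # Both stages run in parallel; their results are merged downstream.
    direct, advanced = await asyncio.gather(
        direct_search(query), advanced_search(query)
    )
    return direct + advanced

candidates = asyncio.run(collect_candidates("safety protocols"))
print(len(candidates))
```

Because neither stage waits on the other, overall latency is bounded by the slower of the two rather than their sum.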
Answer Synthesis
- Rerank candidates by relevance (if reranking service available)
- Combine and deduplicate using LLM
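A minimal sketch of the rerank-and-deduplicate step; the scoring is a stand-in for the external reranking service, and the merge here is rule-based rather than LLM-driven:

```python
def synthesize(candidates):
    """Deduplicate candidates by normalized text (keeping the best score),
    then sort by descending relevance."""
    best = {}
    for cand in candidates:
        key = " ".join(cand["text"].split()).lower()
        if key not in best or cand["score"] > best[key]["score"]:
            best[key] = cand
    return sorted(best.values(), key=lambda c: c["score"], reverse=True)

ranked = synthesize([
    {"text": "Wear a hard hat.", "score": 0.4},
    {"text": "wear a  hard hat.", "score": 0.7},  # duplicate, better score
    {"text": "Report incidents within 24h.", "score": 0.9},
])
print([c["score"] for c in ranked])
```

Normalizing whitespace and case before comparing keeps near-identical chunks from appearing twice in the final answer.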
Quality Reflection (Optional)
- Evaluate answer quality and refine if needed (up to max_iterations)
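The reflection step can be sketched as an evaluate-then-refine loop capped at `max_iterations` (`evaluate` and `refine` are stub callables here, not the kbbridge implementation):

```python
def reflect(answer, evaluate, refine, max_iterations=3):
    """Refine an answer until the evaluator accepts it or the cap is hit."""
    for _ in range(max_iterations):
        ok, feedback = evaluate(answer)
        if ok:
            break
        answer = refine(answer, feedback)
    return answer

# Stubs: the evaluator rejects answers shorter than 20 characters.
evaluate = lambda a: (len(a) >= 20, "too short")
refine = lambda a, fb: a + " (expanded with more detail)"

print(reflect("Wear a hard hat.", evaluate, refine))
```

The iteration cap guarantees termination even when the evaluator never accepts the refined answer.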
Implementation Status
The orchestrator (DatasetProcessor) currently implements stages 1-3 and 5-8. File Discovery Quality Evaluation (stage 4) is implemented but not yet integrated into the pipeline. See .doc/FILE_DISCOVERY_EVALUATION_CONFIG.md for details.
Available Tools
- `assistant`: Intelligent search and answer extraction from knowledge bases
- `file_discover`: Discover relevant files using retriever + optional reranking
- `file_lister`: List files in knowledge base datasets
- `keyword_generator`: Generate search keywords using LLM
- `retriever`: Retrieve information using various search methods
- `file_count`: Get file count in knowledge base dataset
Usage Examples
Basic Query
```python
import asyncio
from fastmcp import Client

async def main():
    async with Client("http://localhost:5210/mcp") as client:
        result = await client.call_tool(
            "assistant",
            {
                "resource_id": "resource-id",
                "query": "What are the safety protocols?",
            },
        )
        print(result.content[0].text)

asyncio.run(main())
```
With Custom Instructions
```python
await client.call_tool("assistant", {
    "resource_id": "hr_dataset",
    "query": "What is the maternity leave policy?",
    "custom_instructions": "Focus on HR compliance and legal requirements.",
})
```
With Query Rewriting
```python
await client.call_tool("assistant", {
    "resource_id": "resource-id",
    "query": "What are the safety protocols?",
    "enable_query_rewriting": True,  # Enables LLM-based query expansion/relaxation
})
```
With Document Filtering
```python
await client.call_tool("assistant", {
    "resource_id": "resource-id",
    "query": "What are the safety protocols?",
    "document_name": "safety_manual.pdf",  # Limit search to specific document
})
```
Integration with Dify
You can plug KB-Bridge into a Dify Agent Workflow instead of calling MCP tools directly:

1. Configure MCP Connection
   - MCP server URL: `http://localhost:5210/mcp`
   - Add auth headers: `X-RETRIEVAL-ENDPOINT`, `X-RETRIEVAL-API-KEY`, `X-LLM-API-URL`, `X-LLM-MODEL`
2. Create an Agent Workflow
   - Add an "MCP Tool" node
   - Select tool: `assistant`
   - Map workflow variables to `resource_id`, `query`, and other tool parameters
3. Run Queries
   - User input → Agent → MCP `assistant` tool → Structured answer with citations
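The per-request auth headers mirror the `.env` variables from the configuration section. A sketch of assembling them (the header names come from the list above; the `build_headers` helper itself is illustrative):

```python
def build_headers(env):
    """Map environment-style config to the per-request MCP auth headers."""
    mapping = {
        "RETRIEVAL_ENDPOINT": "X-RETRIEVAL-ENDPOINT",
        "RETRIEVAL_API_KEY": "X-RETRIEVAL-API-KEY",
        "LLM_API_URL": "X-LLM-API-URL",
        "LLM_MODEL": "X-LLM-MODEL",
    }
    return {header: env[var] for var, header in mapping.items() if var in env}

headers = build_headers({
    "RETRIEVAL_ENDPOINT": "https://api.dify.ai/v1",
    "RETRIEVAL_API_KEY": "your-retrieval-api-key",
    "LLM_API_URL": "https://your-llm-service.com/v1",
    "LLM_MODEL": "gpt-4o",
})
print(sorted(headers))
```

Passing credentials per request this way lets one KB-Bridge instance serve multiple Dify workflows with different backends.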
Development
```bash
# Install development dependencies
pip install -e ".[dev]"

# Run tests
pytest tests/

# Format code
black kbbridge/ tests/

# Lint code
ruff check kbbridge/ tests/
```
License
Apache-2.0