bsreeram08/Cogni-Docs
CogniDocs - Documentation MCP Server - Flexible Backend
CogniDocs is a Model Context Protocol (MCP) server that provides AI assistants with the ability to search and query documentation. It now supports flexible backend configurations to meet different privacy and infrastructure requirements.
What's New - Flexible Backend Architecture
This version introduces a complete backend abstraction layer that allows you to choose your preferred technology stack:
Storage Options
- ChromaDB - Open-source vector database
Embedding Options
- Xenova/Transformers - Local, privacy-focused embeddings
- Transformers.js (@huggingface/transformers) - Official HF JS runtime (WASM by default on server)
Provider registry and provider-agnostic configuration (New)
We now use a plugin-style provider registry with auto-registration. Configuration no longer references specific providers in the schema; instead you specify:
```bash
# Storage
STORAGE_NAME=chroma
STORAGE_OPTIONS={"url":"http://localhost:8000"}
# Embeddings
EMBEDDINGS_NAME=xenova
EMBEDDINGS_OPTIONS={"model":"Xenova/all-MiniLM-L6-v2","maxBatchSize":50}
# Alternative (Transformers.js)
# EMBEDDINGS_NAME=transformersjs
# EMBEDDINGS_OPTIONS={"model":"Xenova/all-MiniLM-L6-v2","device":"wasm","pooling":"mean","normalize":true,"maxBatchSize":50}
```
Notes:
- Providers self-register via side-effect imports in `app/*/providers/index.ts` (e.g., `app/embeddings/providers/index.ts`, `app/storage/providers/index.ts`, `app/chunking/providers/index.ts`).
- Adding a provider is as simple as adding a new file that calls `register*Provider()` (see the sketch below this list).
- Old variables such as `STORAGE_PROVIDER`, `EMBEDDING_PROVIDER`, `CHROMA_URL`, and `XENOVA_MODEL` are still parsed for backward compatibility but are deprecated.
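For illustration, a hypothetical provider file might look like the sketch below. The `registerEmbeddingsProvider` name and the returned service shape are assumptions based on the registry pattern described above, not the project's exact API; check `app/embeddings/embedding-factory.ts` for the real signatures.

```typescript
// app/embeddings/providers/my-provider.ts (hypothetical)
// Assumed API: registerEmbeddingsProvider(name, factory) is an illustrative
// stand-in for whatever app/embeddings/embedding-factory.ts actually exports.
import { registerEmbeddingsProvider } from "../embedding-factory";

registerEmbeddingsProvider("my-provider", (options: { dimensions?: number }) => {
  const dimensions = options.dimensions ?? 384;
  return {
    // A real provider would call its model here; zero vectors keep the
    // sketch self-contained.
    async embed(texts: string[]): Promise<number[][]> {
      return texts.map(() => new Array<number>(dimensions).fill(0));
    },
  };
});
```

To activate it, re-export the file from `app/embeddings/providers/index.ts` so the side-effect import registers it at startup.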
Chunking updates
- Default chunker is LangChain with the recursive strategy.
- Recommended defaults: `chunkSize=3000`, `chunkOverlap=150`.
- Configure via `CHUNKING_NAME=langchain` and `CHUNKING_OPTIONS={"strategy":"recursive","chunkSize":3000,"chunkOverlap":150}`.
- Additional strategies in the LangChain provider:
  - `intelligent`: content-type-aware splitting (adapts separators and sizes for code, markdown, HTML, etc.).
  - `semantic`: initial split plus adjacent-merge when the cosine similarity of neighboring chunks' embeddings is above a threshold (see the sketch after this list).
- The Chonkie provider normalizes outputs to strings, so `Chunk.text` is always a string.
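To make the `semantic` strategy concrete, here is a minimal sketch of adjacent-merge by cosine similarity. It is not the provider's actual implementation; the `embedAll` helper and the default values mirror the options shown later (`semanticSimilarityThreshold`, `semanticMaxMergeChars`) but are assumptions.

```typescript
// Sketch: merge adjacent chunks whose embeddings are highly similar.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB) || 1);
}

async function mergeAdjacent(
  chunks: string[],
  embedAll: (texts: string[]) => Promise<number[][]>, // assumed helper
  threshold = 0.9, // cf. semanticSimilarityThreshold
  maxMergeChars = 6000, // cf. semanticMaxMergeChars
): Promise<string[]> {
  if (chunks.length === 0) return [];
  const vectors = await embedAll(chunks);
  const merged: string[] = [];
  let current = chunks[0];
  for (let i = 1; i < chunks.length; i++) {
    const similar = cosine(vectors[i - 1], vectors[i]) >= threshold;
    if (similar && current.length + chunks[i].length <= maxMergeChars) {
      current += "\n" + chunks[i]; // same topic: extend the current chunk
    } else {
      merged.push(current); // topic boundary: start a new chunk
      current = chunks[i];
    }
  }
  merged.push(current);
  return merged;
}
```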
Agentic Document Processing (ingestion, optional) (TODO)
Agent-guided chunking and annotation can dramatically improve search quality for large, multi-topic docs by aligning chunks to topic boundaries and enriching them with metadata (topic tags, section headings, code language, entities, summaries, and quality scores). This is designed to be an optional, provider-agnostic stage at ingestion time.
Learn more: see `docs/agentic-processing.md`.
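As a rough illustration of what "enriching chunks with metadata" could produce, an annotated chunk might be modeled as below. Every field name here is hypothetical, since this stage is still a TODO.

```typescript
// Hypothetical shape for an agent-annotated chunk; none of these fields
// are implemented yet.
interface AnnotatedChunk {
  text: string;
  topicTags: string[]; // e.g., ["authentication", "rate-limits"]
  sectionHeading?: string; // nearest heading in the source document
  codeLanguage?: string; // set when the chunk is mostly code
  entities: string[]; // API names, products, and other entities
  summary: string; // one-sentence summary for reranking
  qualityScore: number; // 0..1 confidence assigned by the agent
}
```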
Architecture

```mermaid
---
config:
layout: dagre
theme: redux
look: neo
---
flowchart LR
subgraph subGraph0["Storage Providers"]
chroma["ChromaDB"]
storage["Storage Layer"]
end
subgraph subGraph1["Embedding Providers"]
xenova["Xenova"]
embeddings["Embedding Layer"]
end
subgraph subGraph2["Chunking Providers"]
langchain["LangChain (default)"]
chunking["Chunking Layer"]
chonkie["Chonkie"]
builtin["Builtin"]
end
client["MCP Client (Claude)"] --- upload["HTTP Upload Server"]
web["Web UI (Optional)"] --- upload
upload --> abstractions["Provider Abstractions\n(Storage / Embeddings / Chunking)"]
abstractions --> storage & embeddings & chunking
storage --> chroma
embeddings --> xenova
chunking --> langchain & chonkie & builtin
```
Quick Start
Privacy-Focused Setup (Local Only)
```bash
# Clone and install
git clone <repository>
cd cogni-docs
bun install
# Configure for local processing
cp .env.example .env
# Edit .env with provider-agnostic config:
STORAGE_NAME=chroma
STORAGE_OPTIONS={"url":"http://localhost:8000"}
EMBEDDINGS_NAME=xenova
EMBEDDINGS_OPTIONS={"model":"Xenova/all-MiniLM-L6-v2","maxBatchSize":50}
# Chunking (default: LangChain recursive)
CHUNKING_NAME=langchain
CHUNKING_OPTIONS={"strategy":"recursive","chunkSize":3000,"chunkOverlap":150}
# Start server (Upload + MCP on the same port)
bun run upload-server:prod   # default :3001 (set HTTP_PORT); use upload-server:dev for watch mode
```
Hybrid Setup (ChromaDB + Local Embeddings)
```bash
# Start ChromaDB
docker run -p 8000:8000 chromadb/chroma
# Configure app
STORAGE_NAME=chroma
STORAGE_OPTIONS={"url":"http://localhost:8000"}
EMBEDDINGS_NAME=xenova
EMBEDDINGS_OPTIONS={"model":"Xenova/all-MiniLM-L6-v2"}
# Start app
bun run upload-server:prod
```
Configuration Options
Environment Variables
| Variable | Type | Description |
|---|---|---|
| `HTTP_PORT` | number | Upload server port (examples use 3001; config default is 8787) |
| `STORAGE_NAME` | string | Storage provider name (e.g., `chroma`) |
| `STORAGE_OPTIONS` | JSON | Provider-specific options as JSON (e.g., `{"url":"http://localhost:8000"}` or `{"projectId":"..."}`) |
| `EMBEDDINGS_NAME` | string | Embeddings provider name (e.g., `xenova`) |
| `EMBEDDINGS_OPTIONS` | JSON | Provider-specific options as JSON (e.g., `{"model":"Xenova/all-MiniLM-L6-v2"}`) |
| `CHUNKING_NAME` | string | Chunking provider name: `langchain` (default), `chonkie`, or `builtin` |
| `CHUNKING_OPTIONS` | JSON | Provider-specific chunking options as JSON (e.g., `{"strategy":"recursive","chunkSize":3000,"chunkOverlap":150}`) |
| `CHUNK_SIZE` | number | Back-compat: target chunk size (default: 3000) |
| `CHUNK_OVERLAP` | number | Back-compat: overlap between chunks (default: 150) |
| `MAX_CHUNK_SIZE` | number | Back-compat: hard cap for chunk size (default: 5000) |
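Because `STORAGE_OPTIONS`, `EMBEDDINGS_OPTIONS`, and `CHUNKING_OPTIONS` arrive as JSON strings, the config layer has to parse and validate them. The project describes `app/config/app-config.ts` as Zod-validated; the sketch below shows one way that could look, with an illustrative schema rather than the real one.

```typescript
import { z } from "zod";

// Illustrative subset of a provider-agnostic config schema; the actual
// schema in app/config/app-config.ts may differ.
const configSchema = z.object({
  storageName: z.string().default("chroma"),
  storageOptions: z.record(z.unknown()).default({}),
  embeddingsName: z.string().default("xenova"),
  embeddingsOptions: z.record(z.unknown()).default({}),
});

// Parse a JSON-valued env var, failing loudly on malformed JSON.
function parseJsonEnv(name: string): Record<string, unknown> {
  const raw = process.env[name];
  if (!raw) return {};
  try {
    return JSON.parse(raw);
  } catch {
    throw new Error(`${name} must be valid JSON, got: ${raw}`);
  }
}

export const config = configSchema.parse({
  storageName: process.env.STORAGE_NAME,
  storageOptions: parseJsonEnv("STORAGE_OPTIONS"),
  embeddingsName: process.env.EMBEDDINGS_NAME,
  embeddingsOptions: parseJsonEnv("EMBEDDINGS_OPTIONS"),
});
```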
See `.env.example` for complete configuration options.
Chunking strategies (LangChain)
- `recursive` (default):

  ```bash
  CHUNKING_NAME=langchain
  CHUNKING_OPTIONS={"strategy":"recursive","chunkSize":3000,"chunkOverlap":150}
  ```

- `intelligent` (content-type aware: code/markdown/HTML get tuned separators and sizes):

  ```bash
  CHUNKING_NAME=langchain
  CHUNKING_OPTIONS={"strategy":"intelligent","chunkSize":3000,"chunkOverlap":150,"contentTypeAware":true}
  ```

- `semantic` (adjacent merge by embedding similarity):

  ```bash
  CHUNKING_NAME=langchain
  CHUNKING_OPTIONS={"strategy":"semantic","chunkSize":3000,"chunkOverlap":150,"contentTypeAware":true,"semanticSimilarityThreshold":0.9,"semanticMaxMergeChars":6000,"semanticBatchSize":64}
  ```

Notes:
- Tweak `semanticSimilarityThreshold` (typically 0.85–0.92) per corpus.
- If embeddings are unavailable, the provider should fall back to the initial split (no merges).
Deprecated (still parsed for backward compatibility): `STORAGE_PROVIDER`, `EMBEDDING_PROVIDER`, `CHROMA_URL`, `XENOVA_MODEL`, `MAX_BATCH_SIZE`, `UPLOAD_SERVER_PORT`, `UPLOAD_SERVER_HOST`.
Technology Stack Comparison
| Feature | ChromaDB + Xenova |
|---|---|
| Privacy | ✅ Self-hosted |
| Performance | ✅ Good |
| Scalability | ✅ High |
| Setup Complexity | ⚠️ Medium |
| Cost | 💰 Infrastructure |
| Offline Support | ⚠️ Partial |
Use Cases
Enterprise/Production
✅ ChromaDB + Xenova
- Automatic scaling
- Enterprise security
- Managed infrastructure
Privacy-Sensitive
✅ ChromaDB + Xenova
- No external cloud dependencies
- Complete data control
- Works in air-gapped environments
Development/Research
✅ ChromaDB + Xenova
- Easy experimentation
- Good performance
- Flexible deployment
Project Structure

```
app/
├── index.ts                   # Starts HTTP Upload + MCP server
├── config/
│   └── app-config.ts          # Zod-validated, provider-agnostic config
├── chunking/                  # Chunking interface, factory, and providers
│   ├── chunker-interface.ts
│   ├── chunking-factory.ts
│   └── providers/             # Providers: langchain (default), chonkie, builtin
├── storage/
│   ├── storage-interface.ts   # Storage interface
│   ├── chroma-storage.ts      # ChromaDB implementation
│   └── storage-factory.ts     # Provider registry + factory
├── embeddings/
│   ├── embedding-interface.ts # Embeddings interface
│   ├── embedding-factory.ts   # Provider registry + factory
│   └── providers/             # Embedding providers (e.g., Xenova)
├── server/
│   └── mcp-server.ts          # MCP tools + SSE transport (/sse, /messages)
├── ingest/
│   └── chunker.ts             # Ingestion entrypoint; uses chunking service
└── parsers/
    ├── pdf.ts                 # PDF parser
    ├── html.ts                # HTML parser
    └── text.ts                # Plain text parser
```
API Endpoints
Upload Server (Port 3001)
- `GET /health` - Service health check with provider status
- `GET /sets` - List documentation sets
- `POST /sets` - Create documentation set
- `GET /sets/:setId` - Get specific set
- `GET /sets/:setId/documents` - List documents in set
- `POST /sets/:setId/upload` - Upload documents
- `DELETE /sets/:setId/documents/:docId` - Delete document
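As a usage sketch, creating a set and uploading a file from TypeScript might look like this. The endpoints match the list above; the `id` field on the create response is an assumption.

```typescript
// Create a documentation set, then upload a file into it.
const created = await fetch("http://localhost:3001/sets", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ name: "My API Docs", description: "REST API docs" }),
});
const set = await created.json(); // assumes the response carries an id field

const form = new FormData();
form.append("files", Bun.file("docs/guide.pdf"), "guide.pdf");
await fetch(`http://localhost:3001/sets/${set.id}/upload`, {
  method: "POST",
  body: form,
});
```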
MCP Server (HTTP SSE)
- Transport: `GET /sse` (event stream), `POST /messages` (JSON messages)
- Tools:
  - `list_documentation_sets` - List available sets
  - `get_documentation_set` - Get details about a specific set
  - `search_documentation` - Vector search within a set
  - `agentic_search` - Extractive, context-grounded answers
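Calling a tool follows the same pattern as the `agentic_search` example later in this README; here is a hedged sketch for `search_documentation`, where `mcp` stands in for whatever MCP client handle your assistant exposes.

```typescript
// `mcp` is a placeholder for an MCP-compatible client handle.
declare const mcp: {
  callTool(name: string, args: Record<string, unknown>): Promise<unknown>;
};

const results = await mcp.callTool("search_documentation", {
  setId: "your-set-id",
  query: "How are rate limits enforced?",
  limit: 5,
});
```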
Development

```bash
# Install dependencies
bun install

# Development with file watching
bun run upload-server:dev # Upload+MCP server with hot reload
bun run web:dev           # Web UI development server

# Type checking
bun run typecheck

# Build for production
bun run web:build
```
Health Monitoring

Check service status:

```bash
curl http://localhost:3001/health
```
Response includes:
- Overall service health
- Storage provider status
- Embedding provider status
- System uptime
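For automated checks, the endpoint can be polled from TypeScript as sketched below; the `status` field name is an assumption based on the response description above.

```typescript
// Poll the health endpoint and fail fast if the service looks degraded.
// Inspect the real response shape before relying on specific fields.
const health = await fetch("http://localhost:3001/health").then((r) => r.json());
if (health.status !== "ok") {
  console.error("Service degraded:", health);
  process.exit(1);
}
```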
Contributing
The flexible backend architecture makes it easy to add new providers:
- Storage Provider: Implement the `StorageService` interface
- Embedding Provider: Implement the `EmbeddingService` interface
- Update Factories: Add to the respective factory files
- Configuration: Add options to the config schema
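As a minimal sketch of the first step, an in-memory storage provider for tests might look like the following. The method names are illustrative, not the real `StorageService` contract; see `app/storage/storage-interface.ts` for that.

```typescript
// Hypothetical toy storage provider; method names are illustrative.
interface StoredChunk {
  id: string;
  text: string;
  embedding: number[];
}

class InMemoryStorage {
  private sets = new Map<string, StoredChunk[]>();

  async upsert(setId: string, items: StoredChunk[]): Promise<void> {
    this.sets.set(setId, [...(this.sets.get(setId) ?? []), ...items]);
  }

  async query(setId: string, embedding: number[], limit: number): Promise<StoredChunk[]> {
    // Brute-force dot-product ranking; fine for tests, not for production.
    const score = (v: number[]) => v.reduce((sum, x, i) => sum + x * embedding[i], 0);
    return [...(this.sets.get(setId) ?? [])]
      .sort((a, b) => score(b.embedding) - score(a.embedding))
      .slice(0, limit);
  }
}
```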
License
MIT License - see LICENSE file for details.
A Model Context Protocol (MCP) server that provides AI assistants with the ability to search and query documentation using a local-first, provider-agnostic backend.
Architecture
This project implements a dual-server architecture:
- HTTP Upload Server - For document ingestion and management
- MCP Server - For AI assistants to query documentation
Key Features
- Multi-format parsing: PDF, HTML, and plain text documents
- Agentic search: Extractive answers grounded in your documentation via MCP tools
- Multi-tenant: Multiple documentation sets with isolated search
- Modern stack: Bun runtime, TypeScript, Elysia framework
Quick Start
Prerequisites
- Bun runtime installed
- Docker (optional) for ChromaDB
Setup
1. Clone and install dependencies: `bun install`
2. Configure environment: `cp .env.example .env`, then edit `.env` with your provider-agnostic settings
3. Start the upload server: `bun run upload-server`
4. In another terminal, start the MCP server: `bun run mcp-server`
Usage
1. Upload Documentation
Create a documentation set and upload files:
```bash
# Create a documentation set
curl -X POST http://localhost:3001/sets \
-H "Content-Type: application/json" \
-d '{"name": "My API Docs", "description": "REST API documentation"}'
# Upload documents (PDF, HTML, TXT)
curl -X POST http://localhost:3001/sets/{SET_ID}/upload \
-F "files=@documentation.pdf" \
-F "files=@api-guide.html"
2. Query via MCP
The MCP server exposes four tools:
- `list_documentation_sets` - List all available documentation sets
- `get_documentation_set` - Get details about a specific set
- `search_documentation` - Basic vector search within a set
- `agentic_search` - Agentic, context-grounded answers from your docs
3. Agentic Search Example
```typescript
// In Claude or another MCP-compatible AI assistant
await mcp.callTool("agentic_search", {
setId: "your-set-id",
query: "How do I authenticate API requests?",
limit: 10,
});
```
Configuration
Environment Variables
```bash
# Core
HTTP_PORT=3001
# Provider-agnostic
STORAGE_NAME=chroma
STORAGE_OPTIONS={"url":"http://localhost:8000"}
EMBEDDINGS_NAME=xenova
EMBEDDINGS_OPTIONS={"model":"Xenova/all-MiniLM-L6-v2","maxBatchSize":50}
# Chunking
CHUNKING_NAME=langchain
CHUNKING_OPTIONS={"strategy":"recursive","chunkSize":3000,"chunkOverlap":150}
CHUNK_SIZE=3000
CHUNK_OVERLAP=150
MAX_CHUNK_SIZE=5000
```
Development
Scripts
```bash
bun run upload-server:dev  # Hot reload Upload+MCP server
bun run upload-server:prod # Production Upload+MCP server
bun run web:dev # Web UI dev
bun run typecheck          # Type checking
```
Adding New Document Types
1. Create a parser in `app/parsers/`
2. Register/route the MIME type alongside the existing parsers
3. Ensure the chunking strategy in `app/ingest/chunker.ts` suits the new type
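For example, a Markdown parser (Markdown support is listed under Future Enhancements) could start as the sketch below; the `ParsedDocument` shape is an assumption modeled on the existing pdf/html/text parsers.

```typescript
// app/parsers/markdown.ts (hypothetical)
// ParsedDocument is assumed; match whatever the existing parsers return.
interface ParsedDocument {
  text: string;
  metadata: { contentType: string };
}

export async function parseMarkdown(buffer: ArrayBuffer): Promise<ParsedDocument> {
  const text = new TextDecoder("utf-8").decode(buffer);
  return { text, metadata: { contentType: "text/markdown" } };
}
```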
Architecture Decisions
Why Bun?
- Performance: Fast startup and runtime
- TypeScript native: No compilation step needed
- Modern toolchain: Built-in testing, bundling, package management
Troubleshooting
SSE transport disconnects
- Prefer `bun run upload-server:prod` (non-watch) for stability.
- Ensure your MCP client uses `GET /sse` (not POST) and `POST /messages`.
- If the IDE session gets stale, reload the MCP client to re-handshake.
ChromaDB connectivity
- Verify Chroma is running and `STORAGE_OPTIONS={"url":"http://localhost:8000"}` is set.
- Check `GET /health` for storage status; restart Chroma if it is down.
Embedding model setup
- Xenova/Transformers.js models download on first run; allow network access once if needed.
- Adjust `EMBEDDINGS_OPTIONS` (e.g., `maxBatchSize`) if you see memory warnings.
- If you change the model or provider, embedding dimensions may differ. Use a fresh collection or reingest to avoid mixing dimensions.
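A cheap guard against silently mixing dimensions, as a sketch (how you obtain the collection's expected dimension depends on your storage provider):

```typescript
// Refuse to ingest vectors whose dimension differs from the collection's.
function assertDimension(vectors: number[][], expectedDim: number): void {
  for (const v of vectors) {
    if (v.length !== expectedDim) {
      throw new Error(
        `Embedding dimension ${v.length} != collection dimension ${expectedDim}; ` +
          "reingest into a fresh collection after changing models.",
      );
    }
  }
}
```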
Future Enhancements
- Support for more document formats (DOCX, Markdown)
- Document metadata search and filtering
- Batch upload improvements
- Vector search optimization
- Authentication for upload server
- Metrics and monitoring
Contributing
This project follows these coding guidelines:
- TypeScript with proper typing
- Functional programming patterns
- Modular architecture
- Comprehensive error handling
License
MIT