mcp_server4j

jeremylem/mcp_server4j


MCP Server 4J - Local Knowledge Base

Java implementation of a local knowledge base using the Model Context Protocol (MCP). Query your documents with hybrid search (BM25 + vector similarity).

Features

  • Hybrid search: BM25 keyword + vector semantic similarity (30% + 70% weights)
  • Dual storage: In-memory Lucene BM25 + ChromaDB vector store
  • MCP protocol: Model Context Protocol server implementation
  • Multi-format support: PDF, Markdown, TXT via Apache Tika

Quick Start

Prerequisites

  • Docker and Docker Compose
  • Java 21+ (for local development)
  • Maven 3.8+ (for local development)

1. Add Documents

documents/
├── mybook.pdf
├── notes.md
└── article.txt

2. Start Services

docker-compose up -d
# ChromaDB: port 8000
# MCP Server: port 8001

3. Ingest Documents

docker-compose run --rm -v "$(pwd)/documents:/docs" mcp-server ingest \
  --docs_dir "/docs" --chroma-host chroma --chroma-port 8000

4. Query Your Knowledge Base

Via MCP JSON-RPC endpoint:

curl -X POST http://localhost:8001/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
      "name": "query_knowledge_base",
      "arguments": {
        "query": "What is the CAP theorem?",
        "topK": 5,
        "useHybrid": true
      }
    }
  }'

Via REST API:

curl -X POST http://localhost:8001/api/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the CAP theorem?",
    "topK": 5,
    "useHybrid": true
  }'
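
From Java, the same REST call can be issued with the JDK's built-in `java.net.http` client. The endpoint and JSON field names below are taken from the curl example above; the class name is purely illustrative.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class QueryClient {
    // Builds the same request as the curl example; the JSON fields
    // (query, topK, useHybrid) mirror the REST payload shown above.
    static HttpRequest buildQuery(String baseUrl, String query, int topK) {
        String body = String.format(
            "{\"query\": \"%s\", \"topK\": %d, \"useHybrid\": true}", query, topK);
        return HttpRequest.newBuilder()
            .uri(URI.create(baseUrl + "/api/query"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();
    }

    public static void main(String[] args) {
        HttpRequest req = buildQuery("http://localhost:8001", "What is the CAP theorem?", 5);
        System.out.println(req.method() + " " + req.uri());
        // With the server from step 2 running, send it with:
        // HttpClient.newHttpClient().send(req, HttpResponse.BodyHandlers.ofString()).body();
    }
}
```

With the services from step 2 running, uncommenting the `send` line posts the query and returns the JSON response body.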

Ingestion Pipeline

Documents → Finder → Loader → Chunker → BM25 Index + Vector Store

Key Components:

  • RecursiveDocumentFinder - Recursively discovers documents in a directory tree
  • MultiFormatDocumentLoader - PDF, Markdown, TXT via Apache Tika
  • RecursiveDocumentChunker - 512-char chunks, 50-char overlap
  • LuceneBM25Indexer - In-memory keyword index (Apache Lucene)
  • ChromaVectorSearch - Embeddings via all-MiniLM-L6-v2
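
The chunking arithmetic (512-char chunks, 50-char overlap) can be sketched in plain Java. Note that the real RecursiveDocumentChunker, built on LangChain4j's recursive splitter, prefers paragraph and sentence boundaries; this sketch shows only the fixed-size sliding window.

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkingSketch {
    // Fixed-size sliding window: each chunk is at most `size` chars and
    // starts `size - overlap` chars after the previous one.
    static List<String> chunk(String text, int size, int overlap) {
        if (overlap >= size) throw new IllegalArgumentException("overlap must be < size");
        List<String> chunks = new ArrayList<>();
        int step = size - overlap;
        for (int start = 0; start < text.length(); start += step) {
            chunks.add(text.substring(start, Math.min(start + size, text.length())));
            if (start + size >= text.length()) break;
        }
        return chunks;
    }

    public static void main(String[] args) {
        String doc = "0123456789".repeat(120); // 1200 characters
        List<String> chunks = chunk(doc, 512, 50);
        System.out.println(chunks.size() + " chunks"); // prints: 3 chunks
    }
}
```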

Retrieval Pipeline

Query → BM25 Search + Vector Search → Score Fusion → Ranked Results

Key Components:

  • BaselineRetriever - Orchestrates hybrid search (30% BM25 + 70% vector)
  • LuceneBM25Indexer - BM25 keyword search with Lucene
  • ChromaVectorSearch - Semantic similarity via LangChain4j
  • HybridScoreFusion - Weighted score combination and normalization
  • KnowledgeBaseTool - MCP protocol interface
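
The fusion step can be sketched as below. The 30/70 weighting comes from the README itself, but the normalization scheme inside HybridScoreFusion is not spelled out here, so min-max normalization is assumed.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class FusionSketch {
    // Min-max normalize scores to [0, 1] so BM25 and cosine scores are comparable.
    static Map<String, Double> normalize(Map<String, Double> scores) {
        double min = scores.values().stream().mapToDouble(Double::doubleValue).min().orElse(0.0);
        double max = scores.values().stream().mapToDouble(Double::doubleValue).max().orElse(1.0);
        double range = max - min;
        Map<String, Double> out = new HashMap<>();
        scores.forEach((id, s) -> out.put(id, range == 0 ? 1.0 : (s - min) / range));
        return out;
    }

    // fused = 0.3 * normalized BM25 + 0.7 * normalized vector score;
    // a chunk missing from one ranker contributes 0 for that component.
    static Map<String, Double> fuse(Map<String, Double> bm25, Map<String, Double> vector) {
        Map<String, Double> nb = normalize(bm25), nv = normalize(vector);
        Set<String> ids = new HashSet<>(nb.keySet());
        ids.addAll(nv.keySet());
        Map<String, Double> fused = new HashMap<>();
        for (String id : ids) {
            fused.put(id, 0.3 * nb.getOrDefault(id, 0.0) + 0.7 * nv.getOrDefault(id, 0.0));
        }
        return fused;
    }

    public static void main(String[] args) {
        Map<String, Double> bm25 = Map.of("a", 12.0, "b", 4.0);
        Map<String, Double> vector = Map.of("b", 0.9, "c", 0.6);
        System.out.println(fuse(bm25, vector));
    }
}
```

A chunk that scores well on both rankers ("b" above) outranks one that scores well on only one, which is the point of the hybrid approach.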

Core Interfaces

  • KeywordIndexer - BM25 indexing and search operations
  • DocumentLoader - Multi-format document parsing
  • DocumentChunker - Text splitting strategies
  • IngestionPipeline - End-to-end ingestion workflow
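
For orientation, here is a minimal sketch of what a KeywordIndexer-style contract might look like, paired with a toy term-overlap scorer. The actual interface signatures and LuceneBM25Indexer live in the source tree, so treat every name and method below as hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class KeywordIndexerSketch {
    // Hypothetical shape of the KeywordIndexer contract (illustrative only).
    interface KeywordIndexer {
        void index(String chunkId, String text);
        List<String> search(String query, int topK);
    }

    // Toy term-overlap scorer standing in for LuceneBM25Indexer
    // (which uses Lucene's real BM25 similarity).
    static class NaiveIndexer implements KeywordIndexer {
        private final Map<String, Set<String>> docs = new LinkedHashMap<>();

        public void index(String chunkId, String text) {
            docs.put(chunkId, new HashSet<>(Arrays.asList(text.toLowerCase().split("\\W+"))));
        }

        public List<String> search(String query, int topK) {
            Set<String> q = new HashSet<>(Arrays.asList(query.toLowerCase().split("\\W+")));
            List<String> ids = new ArrayList<>(docs.keySet());
            // Sort descending by number of query terms matched.
            ids.sort((a, b) -> Long.compare(overlap(docs.get(b), q), overlap(docs.get(a), q)));
            return ids.subList(0, Math.min(topK, ids.size()));
        }

        private static long overlap(Set<String> doc, Set<String> q) {
            return q.stream().filter(doc::contains).count();
        }
    }

    public static void main(String[] args) {
        KeywordIndexer idx = new NaiveIndexer();
        idx.index("c1", "The CAP theorem limits distributed systems");
        idx.index("c2", "Markdown parsing with Apache Tika");
        System.out.println(idx.search("CAP theorem", 1)); // prints: [c1]
    }
}
```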

Differences from Python Version

| Aspect | Python Version | Java Version |
| --- | --- | --- |
| Language | Python 3.11 | Java 21 |
| Framework | FastMCP + FastAPI | Spring Boot + MCP protocol |
| DI Container | Manual wiring | Spring IoC container |
| BM25 Library | rank-bm25 (in-memory) | Apache Lucene (in-memory) |
| Vector Store | ChromaDB Python client | LangChain4j ChromaDB integration |
| Embedding Model | Sentence Transformers | LangChain4j ONNX (all-MiniLM-L6-v2) |
| Document Loading | LangChain Python loaders | Apache Tika (universal) |
| Chunking | LangChain RecursiveCharacterTextSplitter | LangChain4j DocumentSplitters.recursive() |
| Configuration | Hardcoded constants | Externalized config classes |
| Persistence | In-memory BM25, ChromaDB volume | In-memory BM25, ChromaDB volume |
| Code Size | ~200 lines | ~2000 lines |

Why Java?

Advantages:

  • Strong type safety and compile-time error detection
  • Spring Boot ecosystem (DI, config management, testing)
  • Native Lucene BM25 implementation (no external BM25 library needed)
  • ONNX runtime for embeddings (no Python dependencies)

Tradeoffs:

  • More verbose (~10x code size vs Python)
  • Higher memory footprint (~500MB vs ~200MB)

Configuration

Retrieval Settings

Edit src/main/resources/application.yml:

retrieval:
  bm25-weight: 0.3           # Keyword importance (0-1)
  vector-weight: 0.7         # Semantic importance (0-1)
  candidate-pool-size: 20    # Candidates before fusion

Or set environment variables:

RETRIEVAL_BM25_WEIGHT=0.3
RETRIEVAL_VECTOR_WEIGHT=0.7
RETRIEVAL_CANDIDATE_POOL_SIZE=20
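
As a sketch, the same settings could be carried by a small typed object. The class and accessor names below are hypothetical, and the check that the two weights sum to 1.0 is an assumption inferred from the 0.3/0.7 defaults, not a documented constraint.

```java
// Illustrative model of the retrieval settings; the real config classes
// in the source may be shaped differently (e.g. Spring @ConfigurationProperties).
public record RetrievalSettings(double bm25Weight, double vectorWeight, int candidatePoolSize) {
    public RetrievalSettings {
        // Assumption: the two weights are meant to sum to 1.0.
        if (Math.abs(bm25Weight + vectorWeight - 1.0) > 1e-9) {
            throw new IllegalArgumentException("bm25-weight + vector-weight should sum to 1.0");
        }
        if (candidatePoolSize <= 0) {
            throw new IllegalArgumentException("candidate-pool-size must be positive");
        }
    }

    // Mirrors the defaults in application.yml above.
    public static RetrievalSettings defaults() {
        return new RetrievalSettings(0.3, 0.7, 20);
    }

    public static void main(String[] args) {
        System.out.println(defaults());
    }
}
```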

Ingestion Settings

Chunk size and overlap are configured in the ingestion pipeline:

  • Default chunk size: 512 characters
  • Default overlap: 50 characters

To customize, modify RecursiveDocumentChunker initialization in your configuration.

Development

Local Build

# Compile and package
mvn clean package

# Run unit tests only
mvn clean test

# Run with integration tests (requires Docker for ChromaDB)
mvn clean verify

Docker Build

# Build image
docker-compose build mcp-server

# Rebuild without cache
docker-compose build --no-cache mcp-server

Running Locally (without Docker)

# Start ChromaDB
docker run -p 8000:8000 chromadb/chroma:0.4.24

# Build the JAR
mvn clean package

# Run ingestion CLI
java -jar target/mcp-server4j-1.0.0-SNAPSHOT.jar ingest \
  --docs_dir ./documents \
  --chroma-host localhost \
  --chroma-port 8000

# Run MCP server
java -jar target/mcp-server4j-1.0.0-SNAPSHOT.jar

Troubleshooting

No Search Results

The BM25 index is in-memory and must be rebuilt on each server restart:

# Re-run ingestion to rebuild BM25 index
docker-compose run --rm -v "$(pwd)/documents:/docs" mcp-server ingest \
  --docs_dir "/docs" --chroma-host chroma --chroma-port 8000

ChromaDB Connection Failed

# Check ChromaDB is running
docker-compose ps chroma

# Check ChromaDB logs
docker-compose logs chroma

# Restart ChromaDB
docker-compose restart chroma

No Results from Vector Search

# Check ChromaDB has documents
curl http://localhost:8000/api/v1/collections/baseline_kb

If count is 0, re-run ingestion.

Out of Memory

Increase Docker memory limit or Java heap size:

# In Dockerfile, modify:
ENTRYPOINT ["java", "-Xmx1g", "-jar", "app.jar"]

Performance

Benchmark (29 markdown files, 873 chunks):

  • Ingestion: ~30 seconds
  • Query latency: ~20-30ms average
  • Recall@5: 100% on test queries
  • Memory: ~500MB Java heap + ChromaDB storage
  • Startup: ~5 seconds (Spring Boot + model loading)

References