rag-mcp-server

CipherScout/rag-mcp-server

RAG MCP Server

Self-hosted Model Context Protocol (MCP) server that crawls documentation sites with Crawl4AI, stores semantic chunks in PostgreSQL + pgvector, and exposes real-time job visibility through a Next.js dashboard.

Toolchain & Stack

  • Backend: Python 3.12, FastMCP, async job pipeline, uv for dependency management
  • Frontend: Next.js 15 App Router, TypeScript, Tailwind CSS, pnpm workspace
  • Data: PostgreSQL 16 + pgvector (HNSW index), Alembic migrations
  • Embeddings: Local Ollama (embeddinggemma:300m) reachable from containers via host.docker.internal:11434
  • Containerization: Docker Compose v2 with dedicated Dockerfile.backend and Dockerfile.frontend
  • Command Runner: just consolidates install/build/test/dev workflows

Architectural details live in docs/architecture/ (see tech-stack.md, source-tree.md, coding-standards.md for enforced conventions).
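
As an illustration of the embeddings entry above, here is a minimal sketch of how a backend process might request vectors from the local Ollama instance. It assumes Ollama's /api/embed endpoint; the function names build_embed_request and embed are hypothetical and are not taken from this repository.

```python
import json
import urllib.request

# Host and model come from the stack list above. The /api/embed route is
# an assumption about which Ollama endpoint the project calls.
OLLAMA_URL = "http://host.docker.internal:11434/api/embed"
EMBED_MODEL = "embeddinggemma:300m"


def build_embed_request(texts: list[str]) -> urllib.request.Request:
    """Build (but do not send) a POST request asking Ollama to embed texts."""
    payload = json.dumps({"model": EMBED_MODEL, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(texts: list[str], timeout: float = 30.0) -> list[list[float]]:
    """Send the request and return one vector per input text."""
    with urllib.request.urlopen(build_embed_request(texts), timeout=timeout) as resp:
        return json.load(resp)["embeddings"]
```

When running outside a container, the host part of the URL would typically be localhost rather than host.docker.internal.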

Prerequisites

  • Python 3.12+
  • Node.js 18.18+ and pnpm (npm install -g pnpm)
  • uv
  • just
  • Docker & Docker Compose
  • Local Ollama with embeddinggemma:300m pulled

Quick Start

# 1. Clone repo
git clone <repo-url> && cd rag-mcp-server

# 2. Configure environment variables
cp .env.example .env
# Edit .env if needed (defaults work for local development)

# 3. Start PostgreSQL with pgvector
just docker-up               # Start PostgreSQL container in background

# 4. Verify pgvector installation
docker exec rag-mcp-postgres psql -U raguser -d rag_mcp_db -c "SELECT * FROM pg_extension WHERE extname = 'vector';"
# Expected: Row with extname='vector' (version 0.8.1 or later)

# 5. Install dependencies
just install-frontend        # pnpm install --frozen-lockfile in /frontend
just install-server          # uv sync - resolve backend deps

# 6. Launch services (in separate terminals)
just run-server              # uv run python app.py (FastMCP server, runs migrations)
just run-frontend            # pnpm dev (Next.js dashboard)

# 7. Run tests
just test-server             # uv run pytest
just test-frontend           # pnpm test:unit

Common commands (justfile)

| Command | Description |
| --- | --- |
| just install-server | Install backend dependencies via uv |
| just install-frontend | Install frontend dependencies via pnpm |
| just build-server | Build backend package via uv |
| just build-frontend | Build Next.js app |
| just run-server | Run FastMCP server (applies migrations) |
| just run-frontend | Run Next.js dev server |
| just test-server | Backend pytest suite (/server/tests) |
| just test-frontend | Frontend unit suite |
| just docker-up | Start PostgreSQL container in background |
| just docker-down | Stop PostgreSQL container |
| just docker-logs | View PostgreSQL container logs |
| just docker-reset | Stop container and remove volumes (fresh start) |
| just compose-up / just compose-down | Manage full Docker Compose stack (when backend/frontend containerized) |

Database Setup

PostgreSQL with pgvector

The project uses PostgreSQL 16 with the pgvector extension for vector similarity search. Docker Compose handles the setup automatically.
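
To make "vector similarity search" concrete, the sketch below computes cosine distance in plain Python, which is the quantity pgvector's <=> operator returns. Whether this project ranks chunks by cosine distance or by another pgvector metric is an assumption, and the helper names are hypothetical.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 - cosine similarity (0 = same direction, 2 = opposite)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def nearest(query: list[float], chunks: list[tuple[str, list[float]]], k: int = 3):
    """Rank (label, vector) pairs by distance to the query, closest first."""
    return sorted(chunks, key=lambda c: cosine_distance(query, c[1]))[:k]
```

This is an exact, brute-force ranking. An HNSW index trades a small amount of recall for much faster lookups over large tables, so the database may return an approximate version of the same ordering.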

Configuration:

  • Database credentials are defined in .env (copy from .env.example)
  • Default values: raguser / ragpass / rag_mcp_db
  • Connection string: postgresql://raguser:ragpass@localhost:5432/rag_mcp_db
  • Data persisted in local directory: ./data/postgres/
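
The connection string above can be split into its parts with the standard library. This is a minimal sketch, assuming the backend reads a DATABASE_URL variable (the name used in the troubleshooting notes); database_settings is a hypothetical helper, not a function from this repository.

```python
import os
from urllib.parse import urlsplit

# Default taken from the configuration list above.
DEFAULT_URL = "postgresql://raguser:ragpass@localhost:5432/rag_mcp_db"


def database_settings(env=None) -> dict:
    """Split the connection string into the parts a driver typically needs."""
    env = os.environ if env is None else env
    parts = urlsplit(env.get("DATABASE_URL", DEFAULT_URL))
    # Accept driver-qualified schemes such as postgresql+asyncpg as well.
    if not parts.scheme.startswith("postgres"):
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    return {
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port or 5432,
        "dbname": parts.path.lstrip("/"),
    }
```

Calling database_settings({}) falls back to the documented defaults, which makes it easy to confirm that a changed port or password in .env is actually being picked up.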

Starting the database:

just docker-up

Verifying pgvector extension:

docker exec rag-mcp-postgres psql -U raguser -d rag_mcp_db -c "SELECT * FROM pg_extension WHERE extname = 'vector';"

Expected output: Row with extname='vector' and extversion='0.8.1' (or later)
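
If the version check is scripted, compare versions numerically rather than as strings, because a plain string comparison would rank "0.10.0" below "0.8.1". The helpers below are a hypothetical sketch and assume extversion is a dotted numeric string.

```python
def version_tuple(version: str) -> tuple[int, ...]:
    """Turn '0.8.1' into (0, 8, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.strip().split("."))


def pgvector_is_recent(extversion: str, minimum: str = "0.8.1") -> bool:
    """Return True if the installed extension is at least the minimum version."""
    return version_tuple(extversion) >= version_tuple(minimum)
```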

Viewing logs:

just docker-logs

Fresh start (removes all data):

just docker-down
rm -rf data/postgres
mkdir -p data/postgres
just docker-up

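Note: PostgreSQL data is stored locally in ./data/postgres/. To back up your database, stop the container first (just docker-down) and then copy this directory; copying the files while PostgreSQL is running can produce an inconsistent backup.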

Troubleshooting

Port 5432 already in use:

  • Check if PostgreSQL is running locally: lsof -i :5432
  • Stop local PostgreSQL: brew services stop postgresql (macOS) or sudo systemctl stop postgresql (Linux)
  • Or change port in docker-compose.yml: "5433:5432" and update DATABASE_URL in .env
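
Where lsof is unavailable (for example on Windows), the same check can be done from Python. This is a small sketch with a hypothetical helper name; it only reports whether something is accepting TCP connections on the port, not which process owns it.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception
        return sock.connect_ex((host, port)) == 0
```

For example, port_in_use(5432) returning True before just docker-up has been run suggests a local PostgreSQL instance is already bound to the port.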

Permission errors on data volume:

  • Ensure the ./data/postgres/ directory is writable
  • If using Linux, PostgreSQL container runs as user postgres (UID 999)
  • Fix: sudo chown -R 999:999 data/postgres/ on Linux systems

Docker daemon not running:

  • Start Docker Desktop (macOS/Windows)
  • Or start Docker service: sudo systemctl start docker (Linux)

pgvector extension not found:

  • Verify using pgvector/pgvector:pg16 image in docker-compose.yml
  • Check init script executed: docker logs rag-mcp-postgres | grep "CREATE EXTENSION"
  • If missing, manually create: docker exec rag-mcp-postgres psql -U raguser -d rag_mcp_db -c "CREATE EXTENSION IF NOT EXISTS vector;"

Database Migrations (Alembic)

The project uses Alembic for database schema migrations. Migrations are automatically applied on server startup.

Migration Commands:

# Apply all pending migrations
just alembic-up

# Rollback last migration
just alembic-down

# View migration history
just alembic-history

# Check current migration revision
just alembic-current

# Create a new migration (for future use when models exist)
just alembic-revision "migration message"

Automatic Migration on Startup:

The FastMCP server automatically runs alembic upgrade head before starting, ensuring the database schema is always up-to-date. If migrations fail, the server will not start.
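
The startup behavior described above amounts to a gate in front of the server. The following is a minimal sketch of that pattern, not the project's actual app.py: run_migrations and main are hypothetical names, and invoking the alembic CLI through subprocess is an assumption (the real server may call Alembic's Python API instead).

```python
import subprocess
import sys
from typing import Callable, Optional, Sequence

ALEMBIC_CMD = ["alembic", "upgrade", "head"]


def run_migrations(run: Optional[Callable[[Sequence[str]], int]] = None) -> None:
    """Apply pending migrations; exit the process if Alembic reports failure."""
    if run is None:
        # Default runner: execute the command and return its exit code.
        run = lambda cmd: subprocess.run(cmd).returncode
    code = run(ALEMBIC_CMD)
    if code != 0:
        sys.exit(f"alembic upgrade head failed with exit code {code}; server not started")


def main(start_server: Callable[[], None],
         run: Optional[Callable[[Sequence[str]], int]] = None) -> None:
    run_migrations(run)  # gate: nothing below runs if migrations fail
    start_server()
```

Passing the runner as a parameter keeps the gate testable without a live database: a stub that returns a non-zero code should prevent start_server from ever being called.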

Migration Workflow:

  1. Creating Migrations: Once SQLAlchemy models are defined (future stories), use just alembic-revision "description" to create a new migration file
  2. Applying Migrations: Run just alembic-up or let the server apply them automatically on startup
  3. Rolling Back: Use just alembic-down to undo the last migration

Troubleshooting:

Database connection failed during migration:

  • Ensure PostgreSQL is running: just docker-up
  • Verify DATABASE_URL in .env matches your PostgreSQL credentials
  • Check connection: docker exec rag-mcp-postgres psql -U raguser -d rag_mcp_db -c "SELECT 1;"

Revision not found:

  • View migration history: just alembic-history
  • Ensure you're in the repository root when running commands
  • Check that /server/migrations/versions/ contains migration files

Manual migration execution if server fails to start:

  • Run migrations manually: just alembic-up
  • Check migration status: just alembic-current
  • View error details in server logs

Repository Layout (excerpt)

/
├── docs/                # PRD, architecture, stories (sharded)
├── frontend/            # Next.js App Router app (pnpm workspace)
│   ├── app/
│   ├── components/
│   ├── lib/
│   ├── __tests__/
│   └── playwright/
├── server/              # FastMCP service
│   ├── app.py
│   ├── jobs/
│   ├── integrations/
│   ├── persistence/
│   └── tests/           # unit + integration suites
├── migrations/          # Alembic env & version scripts
├── scripts/             # Helper scripts invoked via just recipes
├── docker-compose.yml
├── Dockerfile.backend
├── Dockerfile.frontend
├── justfile
├── .gitignore
└── README.md

References

  • PRD: docs/prd.md (sharded under docs/prd/)
  • Architecture suite: docs/architecture/
  • Story library: docs/stories/ with naming convention [epic].[story_number].story.md (e.g., 1.1.story.md)
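
The story naming convention can be checked or sorted programmatically. This sketch assumes both the epic and the story number are plain integers, as in the 1.1.story.md example; parse_story_filename is a hypothetical helper.

```python
import re
from typing import NamedTuple, Optional

# Pattern for "[epic].[story_number].story.md", e.g. "1.1.story.md"
STORY_RE = re.compile(r"^(?P<epic>\d+)\.(?P<story>\d+)\.story\.md$")


class StoryId(NamedTuple):
    epic: int
    story: int


def parse_story_filename(name: str) -> Optional[StoryId]:
    """Return (epic, story) for a conforming filename, or None otherwise."""
    match = STORY_RE.match(name)
    return StoryId(int(match["epic"]), int(match["story"])) if match else None
```

Sorting with key=parse_story_filename orders 1.2.story.md before 1.10.story.md, which a plain alphabetical sort would get wrong.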

For detailed requirements, acceptance criteria, and sequencing, consult the PRD and corresponding epics/stories. Each story’s Dev Notes summarize the relevant architecture context to keep build tasks streamlined.