webauthn-mcp

dqj1998/webauthn-mcp

A Model Context Protocol server designed to provide comprehensive knowledge and support for WebAuthn, a web standard for secure authentication.

WebAuthn MCP (Local RAG)

A local, retrieval-only MCP server focused on the WebAuthn specification. It crawls the spec, extracts and cleans content, chunks with section awareness, embeds locally (CPU by default), stores in a persistent vector DB, and exposes a single MCP tool named ask for semantic retrieval.

No external LLMs or API keys are required by default.

Start in minutes

  1. Install dependencies
  • python -m venv .venv
  • source .venv/bin/activate (Windows: .venv\Scripts\activate)
  • pip install -U pip
  • pip install -r requirements.txt
  2. Configure
  • cp .env-example .env
  • Set at minimum: WEBAUTHN_SPEC_URL and VECTOR_DB_DIR
  3. Build/refresh the index
  • python -m webauthn_mcp.cli build-index --log-level INFO
  • Optional dry-run: python -m webauthn_mcp.cli build-index --dry-run
  • Manifest written under VECTOR_DB_DIR/INDEX_NAME/INDEX_VERSION/manifest.json
  4. Run MCP server (choose one)
  • Stdio (default): python -m webauthn_mcp.cli serve
  • Or: python server.py
  • HTTP (streamable via FastMCP): python -m webauthn_mcp.cli serve-http
  5. Use the ask tool via an MCP client. Example payload: {"prompt":"How is WebAuthn registration ceremony defined?","top_k":5,"include_snippets":true}

Features

  • Local crawler (BFS) with:
    • .env configuration
    • robots.txt respect (optional)
    • polite concurrency, retries/backoff for 429/503 with Retry-After + jitter
    • per-host minimum delay and robots.txt crawl-delay support
    • URL fragment canonicalization to avoid duplicate anchor fetches
    • ETag/Last-Modified conditional requests
    • Incremental crawl cache (URL metadata, content hash, timestamps)
  • Content extraction and cleaning:
    • Trafilatura if available; BeautifulSoup readability-like fallback
    • Removes navigation/boilerplate; preserves headings, anchors, lists, code, tables
    • Normalizes whitespace, converts relative links to absolute
  • Section-aware chunking:
    • Split by headings, then by paragraphs with target ~800 tokens and ~120 overlap
    • Keep code blocks intact when possible
    • Stable chunk fingerprint for deduplication across runs
  • Local embeddings and persistence:
    • Sentence-Transformers by default (e.g., all-MiniLM-L6-v2 or BAAI/bge-small-en)
    • Embedding cache keyed by chunk fingerprint and model
    • Persistent vector store: Chroma (default) with FAISS fallback
  • Retrieval pipeline:
    • Query normalization and expansion (acronyms, heading hints)
    • ANN with MMR diversity (sketched after this list); optional local cross-encoder rerank
    • De-duplicated, stable ordering by score
  • MCP tool:
    • ask: Retrieval-only, returns structured passages/snippets
    • Validates inputs, rate-limited to prevent abuse
  • Indexing commands:
    • Build or refresh index end-to-end
    • Dry-run support and manifest with statistics
  • Quality:
    • Duplicate and near-duplicate suppression (content hash + SimHash)
    • Unicode normalization and boilerplate removal
    • Optional acronym expansion for WebAuthn terms
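
The MMR step in the retrieval pipeline trades pure relevance against diversity so that the top results are not near-duplicates of each other. The following is a minimal sketch of MMR re-ranking over cosine similarities; the function name, signature, and lambda default are illustrative assumptions, not the project's actual implementation.

import numpy as np

def mmr_select(query_vec, doc_vecs, k=5, lambda_mult=0.7):
    """Pick k diverse-but-relevant chunk indices via Maximal Marginal Relevance."""
    # Normalize so dot products are cosine similarities.
    doc_vecs = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    query_vec = query_vec / np.linalg.norm(query_vec)
    relevance = doc_vecs @ query_vec
    selected, candidates = [], list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        if not selected:
            selected.append(candidates.pop(int(np.argmax(relevance[candidates]))))
            continue
        # Penalize candidates that are too similar to chunks already selected.
        sim_to_selected = (doc_vecs[candidates] @ doc_vecs[selected].T).max(axis=1)
        scores = lambda_mult * relevance[candidates] - (1 - lambda_mult) * sim_to_selected
        selected.append(candidates.pop(int(np.argmax(scores))))
    return selected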

Requirements

  • Python 3.10+
  • macOS, Linux, or Windows
  • No GPU required (CPU-only by default)

Install dependencies:

python -m venv .venv
. .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install -U pip
pip install -r requirements.txt

Key packages: mcp, fastmcp, httpx, beautifulsoup4, lxml, trafilatura, sentence-transformers, chromadb, faiss-cpu, pytest.

HTTPX baseline

  • Baseline: httpx>=0.28.1. No alternate compatibility profiles required.

Configuration

Copy .env-example to .env and adjust.

Required keys:

  • WEBAUTHN_SPEC_URL
  • VECTOR_DB_DIR

Recommended keys (sensible defaults provided):

  • INDEX_NAME, INDEX_VERSION
  • CRAWL_MAX_DEPTH, CRAWL_MAX_PAGES, CRAWL_CONCURRENCY, USER_AGENT, TIMEOUT_SECONDS, CRAWL_MIN_DELAY_SECONDS, RESPECT_ROBOTS, ALLOWED_HOSTS, INCLUDE_SITEMAP
  • EMBEDDING_MODEL, EMBEDDING_DEVICE, EMBEDDING_BATCH_SIZE
  • CHUNK_SIZE_TOKENS, CHUNK_OVERLAP_TOKENS
  • VECTOR_BACKEND (chroma|faiss)
  • RERANKER_MODEL (optional)
  • ASK_RPM, ASK_BURST (rate limit)

See the provided .env-example for details.
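
As an illustration of how a few of these keys could map onto typed settings, here is a minimal sketch using python-dotenv and a dataclass. The class and field names, and the use of python-dotenv, are assumptions made for illustration; this is not the project's actual settings module.

import os
from dataclasses import dataclass
from dotenv import load_dotenv  # assumption: python-dotenv installed (or export the variables yourself)

load_dotenv()  # read .env from the working directory

@dataclass(frozen=True)
class Settings:
    spec_url: str
    vector_db_dir: str
    embedding_model: str
    chunk_size_tokens: int
    vector_backend: str

settings = Settings(
    spec_url=os.environ["WEBAUTHN_SPEC_URL"],    # required
    vector_db_dir=os.environ["VECTOR_DB_DIR"],   # required
    embedding_model=os.getenv("EMBEDDING_MODEL", "all-MiniLM-L6-v2"),
    chunk_size_tokens=int(os.getenv("CHUNK_SIZE_TOKENS", "800")),
    vector_backend=os.getenv("VECTOR_BACKEND", "chroma"),
)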

Build or Refresh the Index

End-to-end crawl/extract/chunk/embed/upsert:

python -m webauthn_mcp.cli build-index --log-level INFO
  • Uses incremental crawl cache and deduplicated chunk fingerprints to avoid re-downloading or re-embedding unchanged content.
  • Creates or updates a manifest at: VECTOR_DB_DIR/INDEX_NAME/INDEX_VERSION/manifest.json

Dry run (no writes):

python -m webauthn_mcp.cli build-index --dry-run
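
To spot-check a build without the CLI, you can pretty-print the manifest directly. The path below matches the default index visible in this repo; adjust it to your own VECTOR_DB_DIR/INDEX_NAME/INDEX_VERSION.

import json
from pathlib import Path

# Default location in this repo; change to VECTOR_DB_DIR/INDEX_NAME/INDEX_VERSION.
manifest_path = Path("vector_db/webauthn-spec/v1/manifest.json")
print(json.dumps(json.loads(manifest_path.read_text()), indent=2))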

Full rebuild (wipe and re-embed)

Use this when you want to completely recreate the vector DB (e.g., corruption, stale state, or to force a clean slate).

  1. Stop any running servers
  • Stdio server: python -m webauthn_mcp.cli serve (or python server.py)
  • HTTP server: python -m webauthn_mcp.cli serve-http
  2. Remove the current index data (Chroma/FAISS + ID index + manifest)

Paths for the default index visible in this repo:

  • vector_db/webauthn-spec/v1/chroma
  • vector_db/webauthn-spec/v1/ids.json
  • vector_db/webauthn-spec/v1/manifest.json
  • vector_db/webauthn-spec/v1/faiss (only if FAISS was used)

macOS/Linux example:

rm -rf vector_db/webauthn-spec/v1/chroma
rm -f  vector_db/webauthn-spec/v1/ids.json
rm -f  vector_db/webauthn-spec/v1/manifest.json
rm -rf vector_db/webauthn-spec/v1/faiss
  3. Optional: clear caches to force recomputation
  • Embeddings cache: .cache/embeddings
  • Crawl cache: .cache/crawl

macOS/Linux example:

rm -f  .cache/embeddings/cache_*.jsonl
rm -rf .cache/crawl
  4. Rebuild the index
python -m webauthn_mcp.cli build-index --log-level INFO
  5. Validate
  • Print the manifest: python -m webauthn_mcp.cli manifest
  • Verify that the files under VECTOR_DB_DIR/INDEX_NAME/INDEX_VERSION (manifest.json, ids.json) were refreshed
  • Use DEBUG logs if diagnosing: python -m webauthn_mcp.cli build-index --log-level DEBUG

Notes:

  • Ensure the .env values VECTOR_DB_DIR, INDEX_NAME, and INDEX_VERSION are consistent between build and serve.
  • The ID index prevents re-upserting existing chunk IDs; deleting it is required for a true full rebuild.
  • If Chroma initialization fails, the code falls back to FAISS automatically; clear both backends' folders when switching.

Run the MCP Server

  • Via the CLI:
python -m webauthn_mcp.cli serve
  • Or with the top-level entrypoint:
python server.py

The server runs over stdio using the official Python MCP library and registers a single tool: ask.

Run the HTTP Server (Streamable HTTP via FastMCP)

This project also exposes the MCP tools over an HTTP server using FastMCP’s streamable-http transport. This is optional and does not replace the stdio server.

Installation:

  • FastMCP is included in requirements; install project dependencies with:
pip install -r requirements.txt

Environment:

  • Configure host/port via .env:
    • MCP_SERVER_HOST (default 0.0.0.0)
    • MCP_SERVER_PORT (default 8080)

Run:

python -m webauthn_mcp.cli serve-http

Available tools:

  • ask: Same retrieval-only tool as the stdio server
  • echo: A simple smoke-test tool to verify the HTTP endpoint is reachable
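
One way to verify the HTTP endpoint is reachable from Python is to list the tools with the FastMCP client. This is a sketch only: the URL assumes the default host/port above and FastMCP's default /mcp path, and the client API may vary with the installed FastMCP version, so adjust as needed.

import asyncio
from fastmcp import Client  # high-level FastMCP client

async def main():
    # Assumes MCP_SERVER_PORT=8080 and the default streamable-http path /mcp.
    async with Client("http://127.0.0.1:8080/mcp") as client:
        tools = await client.list_tools()
        print([tool.name for tool in tools])  # expect "ask" and "echo"

asyncio.run(main())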

Notes:

  • The HTTP entrypoint is wired into the CLI as the serve-http command.
  • FastMCP is included in the base requirements (requirements.txt).

MCP Tool: ask

  • Name: ask
  • Description: Ask questions about WebAuthn and retrieve relevant specification passages
  • Input JSON:
    • prompt: string (required)
    • top_k: int (optional, default 5, bounded [1, 20])
    • score_threshold: float (optional)
    • include_snippets: bool (optional, default true)
  • Output JSON:
    • count: number of hits
    • hits: list of objects:
      • url, title, heading_path, anchor_id, score, chunk_id, metadata
      • snippet when include_snippets is true (short highlighted excerpt)
  • Behavior:
    • Retrieval-only; does not call external LLMs
    • Returns an error if the index is empty, advising you to run the indexer first
    • Enforces simple rate limiting

Example payload:

{
  "prompt": "How is WebAuthn registration ceremony defined?",
  "top_k": 5,
  "include_snippets": true
}

Example result (truncated):

{
  "count": 3,
  "hits": [
    {
      "url": "https://www.w3.org/TR/webauthn/#sctn-create-credential",
      "title": "Web Authentication: An API for accessing Public Key Credentials Level 3",
      "heading_path": "Registration > Ceremony",
      "anchor_id": "sctn-create-credential",
      "score": 0.78,
      "snippet": "…The **registration** **ceremony** consists of…",
      "chunk_id": "c7f4e…",
      "metadata": { "token_count": 732, "...": "..." }
    }
  ]
}
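
For reference, here is one way to call ask from Python over stdio using the official mcp SDK's client helpers. The client code below is not part of this project; treat it as a sketch.

import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    # Launch the stdio server as a subprocess and open an MCP session to it.
    params = StdioServerParameters(command="python", args=["-m", "webauthn_mcp.cli", "serve"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool(
                "ask",
                {"prompt": "How is WebAuthn registration ceremony defined?",
                 "top_k": 5,
                 "include_snippets": True},
            )
            print(result)

asyncio.run(main())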

Project Structure

  • Constants and helpers
  • .env loading and dataclass settings
  • Text normalization, token counting, MMR, and snippet helpers
  • Token-bucket rate limiter
  • BFS crawler with robots.txt, ETag, and cache support
  • Trafilatura/BeautifulSoup extraction with headings and anchors
  • Section-aware splitter and chunk fingerprints
  • Sentence-Transformers embeddings with a JSONL cache
  • Chroma persistent store with FAISS fallback
  • Semantic search, MMR, and optional rerank
  • End-to-end indexing pipeline and manifest
  • MCP server wiring and the ask tool
  • CLI commands
  • stdio MCP entrypoint wrapper (server.py)

Persistence and Incrementality

  • All state resides under VECTOR_DB_DIR:
    • Chroma/FAISS index data
    • Manifest (doc/chunk counts, model, timestamps)
  • Crawl cache resides under .cache/crawl by default
  • Embedding cache resides under .cache/embeddings by default
  • Idempotent indexing by content hash + chunk fingerprint
  • Subsequent runs avoid re-downloading unchanged pages and re-embedding unchanged chunks

Security and Reliability

  • Retrieval-only; no generation and no external LLM calls
  • Input validation and bounds for ask
  • Rate limiting to protect the server (a token-bucket sketch follows this list)
  • Network timeouts and retry backoff
  • robots.txt respect when enabled
  • CPU-only execution by default
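
The ask rate limit is configured via ASK_RPM and ASK_BURST. As a rough illustration of the token-bucket idea (not the project's actual limiter), a sketch:

import time

class TokenBucket:
    """Allow bursts up to `burst` calls, refilling at `rpm` calls per minute."""
    def __init__(self, rpm: float, burst: int):
        self.rate = rpm / 60.0        # tokens added per second
        self.capacity = float(burst)  # maximum bucket size
        self.tokens = float(burst)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

bucket = TokenBucket(rpm=30, burst=5)  # e.g. ASK_RPM=30, ASK_BURST=5
print(bucket.allow())                  # True until the burst is exhausted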

Testing

Run the test suite:

pytest -q

The suite covers:

  • Chunking stability and fingerprinting
  • Embedding cache behavior (no duplicate appends)
  • Indexing idempotency (no duplicate upserts)
  • Retrieval ranking and snippets using synthetic fixtures
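
To give a flavor of the chunking-stability checks, here is a minimal, self-contained pytest sketch. The fingerprint helper is illustrative only, not the project's implementation.

import hashlib

def fingerprint(text: str) -> str:
    # Illustrative stand-in: hash whitespace-normalized text so cosmetic
    # reflows do not change a chunk's identity across runs.
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def test_fingerprint_is_stable_across_reflows():
    a = "The registration ceremony consists of several steps."
    b = "The registration   ceremony\nconsists of several steps."
    assert fingerprint(a) == fingerprint(b)

def test_fingerprint_changes_when_content_changes():
    assert fingerprint("credentials.create") != fingerprint("credentials.get")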

Troubleshooting

  • Index is empty error:
    • Run: python -m webauthn_mcp.cli build-index
  • Slow first run:
    • Initial crawl and model download may take time; subsequent runs are incremental and cached
  • 429 Too Many Requests during crawl:
    • Set CRAWL_CONCURRENCY=1 in .env
    • Increase CRAWL_MIN_DELAY_SECONDS to 2–3
    • Keep ALLOWED_HOSTS minimal; anchor URLs are de-duplicated via canonicalization
  • Chroma issues:
    • Set VECTOR_BACKEND=faiss in .env to force FAISS
  • Reranker not available:
    • Leave RERANKER_MODEL empty to skip reranking

License

See the repository for license details.