nicsuzor/zotmcp
ZotMCP
MCP server for semantic search and literature review across a shared Zotero academic library.
Features
- Semantic Search - Vector-based search across library items
- Citation Retrieval - Get properly formatted academic citations
- Similar Items - Find related works by similarity
- Author Search - Find all works by specific authors
- 6 Specialized Tools - Search, retrieve, and analyze academic literature
MCP Client Configuration
{
  "mcpServers": {
    "zotero": {
      "command": "docker",
      "args": ["run", "--rm", "-i", "us-central1-docker.pkg.dev/prosocial-443205/reg/zotmcp:latest"]
    }
  }
}
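Where this JSON lives depends on your MCP client; for Claude Desktop, for example, it goes in the claude_desktop_config.json settings file. Restart the client after editing so it picks up the new server.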
Available Tools
- search - Semantic search across library (primary tool)
- get_item - Retrieve full text and metadata by Zotero key
- get_similar_items - Find works similar to a given item
- search_by_author - Find all works by an author
- get_collection_info - Library statistics and metadata
- assisted_search - LLM-assisted literature review (experimental)
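For colleagues who want to script against the server rather than use a chat client, a minimal sketch using the official MCP Python SDK is shown below, driving the same Docker image over stdio as in the configuration above. The example query text and the "query" argument name are assumptions for illustration; list_tools returns the authoritative tool schemas.

import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Launch the ZotMCP container over stdio, mirroring the client configuration above.
server = StdioServerParameters(
    command="docker",
    args=["run", "--rm", "-i",
          "us-central1-docker.pkg.dev/prosocial-443205/reg/zotmcp:latest"],
)

async def main() -> None:
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # Inspect the tools the server exposes and their input schemas.
            tools = await session.list_tools()
            print([t.name for t in tools.tools])

            # Call the semantic search tool; the "query" argument name is an
            # assumption -- confirm it against the schema returned above.
            result = await session.call_tool("search", {"query": "platform governance"})
            print(result.content)

asyncio.run(main())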
Distribution
The Docker image includes:
- ✅ All dependencies (FastMCP, ChromaDB, Pydantic)
- ✅ ChromaDB vectors (baked in, ~3GB)
- ✅ No local Python setup required
Colleagues just need to:
- Install Docker Desktop
- Pull image:
docker pull us-central1-docker.pkg.dev/prosocial-443205/reg/zotmcp:latest
- Configure MCP client (see above)
- Restart client
Architecture
ZotMCP is an MCP (Model Context Protocol) server that provides semantic search and literature review capabilities for a shared Zotero academic library.
- Zotero Library: prosocial group library
- ChromaDB Collection: prosocial_zot
- Embedding Model: Google gemini-embedding-001 (3072 dimensions)
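To make these pieces concrete, here is a hypothetical sketch of the query path: embed the query with gemini-embedding-001 (via the google-genai SDK) and run a nearest-neighbour search against the prosocial_zot collection. The local ChromaDB path, the environment-based API key, and the exact client calls are assumptions for illustration; the server's own code may differ.

import chromadb
from google import genai

# Assumed local path to the baked-in ChromaDB data; the image's real path may differ.
chroma = chromadb.PersistentClient(path="/data/chromadb")
collection = chroma.get_collection("prosocial_zot")

# Embed the query with the same model used at indexing time (3072 dimensions).
gemini = genai.Client()  # reads the Gemini API key from the environment
response = gemini.models.embed_content(
    model="gemini-embedding-001",
    contents="content moderation on social media platforms",
)
query_vector = response.embeddings[0].values

# Nearest-neighbour search over the stored chunk vectors.
results = collection.query(query_embeddings=[query_vector], n_results=5)
for doc, meta in zip(results["documents"][0], results["metadatas"][0]):
    print(meta, doc[:120])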
The ChromaDB collection is built by the Buttermilk vectorization pipeline:
- Full text is sourced from the Zotero group library first
- Where full text is not available, it is extracted from the source PDF
- Each source is split by a semantic splitter into chunks of approximately 1,000 tokens, with an overlap of 250 tokens (a minimal chunking sketch follows below)
- Citations are generated by an LLM from the first page of full text, in the format:
Authors (Year). Title. Outlet
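For orientation only, a fixed-size sliding-window splitter with the numbers above might look like the sketch below. It uses a naive whitespace tokeniser, which is a simplification: the actual pipeline uses a semantic splitter and a real tokenizer, so this is not the Buttermilk implementation.

def sliding_window_chunks(text: str, chunk_tokens: int = 1000, overlap: int = 250) -> list[str]:
    """Split text into ~chunk_tokens-token chunks, with overlap tokens shared
    between neighbouring chunks. Whitespace splitting stands in for a tokenizer."""
    tokens = text.split()
    step = chunk_tokens - overlap  # advance 750 tokens per chunk
    chunks = []
    for start in range(0, len(tokens), step):
        window = tokens[start:start + chunk_tokens]
        if not window:
            break
        chunks.append(" ".join(window))
        if start + chunk_tokens >= len(tokens):
            break
    return chunks

# Each chunk would then be embedded with gemini-embedding-001 and written to the
# prosocial_zot collection along with its Zotero key and citation metadata.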