raphael-nogueira/weaviate_mcp_server
# Weaviate MCP Server

A Model Context Protocol (MCP) server for seamless integration with Weaviate, enabling semantic search in knowledge bases through conversational AI.

## 🏗️ Architecture
```mermaid
flowchart TD
    A[Documents<br/>JSON/CSV/MD/TXT] --> B[Population Script<br/>populate_knowledge_base.rb]
    B --> C[Text Processing<br/>& Chunking]
    C --> D[Weaviate Database<br/>Vector Storage]
    E[OpenAI/Cohere<br/>Vectorization API] --> D
    F[MCP Server<br/>weaviate-mcp-server] --> D
    F --> G[MCP Client<br/>Cursor/Claude/Continue]
    H[User Query] --> G
    G --> F
    F --> I[GraphQL Query<br/>Semantic Search]
    I --> D
    D --> J[Vector Results]
    J --> F
    F --> G
    G --> K[AI Response]
    style A fill:#e1f5fe
    style D fill:#f3e5f5
    style F fill:#e8f5e8
    style G fill:#fff3e0
    style K fill:#fce4ec
```
## 🚀 Features
- MCP Server for vector queries in Weaviate
- Knowledge Base Population with multiple file formats
- Semantic Search using GraphQL
- Intelligent Chunking for large documents
- Automatic Schema creation
- Real-time Queries with filtering
## ⚡ Quick Start

### Prerequisites

- Ruby 3.4.3+
- Docker and Docker Compose
- OpenAI API key (recommended)

### Setup
```bash
# Clone and install
git clone https://github.com/your-username/weaviate_mcp_server.git
cd weaviate_mcp_server
bundle install

# Start Weaviate
docker compose up -d

# Populate knowledge base
./bin/populate_knowledge_base.rb examples/sample_documents.json

# Test MCP server
ruby examples/example_usage.rb
```
## 📊 MCP Server

### MCP Client Integration

Configure with MCP clients like Cursor, Continue, or Claude Desktop:

```json
{
  "servers": {
    "weaviate": {
      "command": "ruby",
      "args": ["/path/to/weaviate_mcp_server/bin/weaviate-mcp-server"]
    }
  }
}
```
### Available Tools

- `weaviate_query`: Semantic search with GraphQL
  - Property selection control
  - Result limiting and filtering
  - Vector similarity search
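As a sketch of how a client invokes this tool programmatically, the snippet below builds the JSON-RPC 2.0 `tools/call` envelope for `weaviate_query`. The helper name and default values are illustrative, not part of the server; the argument names follow the tool description above.

```ruby
require 'json'

# Illustrative helper (not part of this repo): builds the JSON-RPC 2.0
# envelope an MCP client sends to invoke the weaviate_query tool.
def build_weaviate_query_request(id:, class_name:, query:, limit: 5, properties: %w[title content])
  {
    jsonrpc: '2.0',
    id: id,
    method: 'tools/call',
    params: {
      name: 'weaviate_query',
      arguments: {
        class_name: class_name,
        query: query,
        limit: limit,
        properties: properties
      }
    }
  }
end

puts JSON.pretty_generate(
  build_weaviate_query_request(id: 1, class_name: 'Document', query: 'machine learning concepts')
)
```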
## 📚 Knowledge Base Population

### Supported Formats

- JSON: Arrays or single objects with `title` and `content`
- CSV: Headers required; uses `content` and `title` columns
- Text/Markdown: Automatic processing or chunking
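For reference, a hypothetical input file in the JSON format described above can be generated like this. The optional fields (`category`, `author`) are assumptions based on the schema section below; only `title` and `content` are required.

```ruby
require 'json'

# Hypothetical sample input for the population script: an array of objects,
# each carrying at least `title` and `content`.
documents = [
  { title: 'Vector Databases 101', content: 'Vector databases store embeddings for similarity search.', category: 'AI' },
  { title: 'Chunking Strategies',  content: 'Large documents are split into smaller chunks before indexing.', author: 'Jane Doe' }
]

File.write('documents.json', JSON.pretty_generate(documents))
```

The resulting `documents.json` can then be passed straight to `./bin/populate_knowledge_base.rb documents.json`.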
### Basic Usage

```bash
# Add documents
./bin/populate_knowledge_base.rb documents.json

# Split large files into chunks
./bin/populate_knowledge_base.rb -s 1000 large_document.md

# Custom class name
./bin/populate_knowledge_base.rb -c "Articles" content.json

# List existing classes
./bin/populate_knowledge_base.rb -l
```
### Schema

Automatic schema creation with:

- `title`, `content`, `source_file`
- `category`, `author`, `created_at`
- `chunk_index` (for split texts)
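A class definition covering these properties might look like the following Ruby hash. This is a sketch in Weaviate's standard schema format of what the automatic schema creation plausibly sends; the exact payload (and data types) is defined by the population script.

```ruby
# Sketch of a Weaviate class definition covering the properties listed above.
# Data types are assumptions; the actual payload the script sends may differ.
DOCUMENT_SCHEMA = {
  class: 'Document',
  vectorizer: 'text2vec-openai',
  properties: [
    { name: 'title',       dataType: ['text'] },
    { name: 'content',     dataType: ['text'] },
    { name: 'source_file', dataType: ['text'] },
    { name: 'category',    dataType: ['text'] },
    { name: 'author',      dataType: ['text'] },
    { name: 'created_at',  dataType: ['date'] },
    { name: 'chunk_index', dataType: ['int'] }
  ]
}.freeze
```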
## 🔧 Configuration

### Weaviate Setup

```yaml
# compose.yml
services:
  weaviate:
    image: cr.weaviate.io/semitechnologies/weaviate:1.24.6
    ports:
      - "8080:8080"
    environment:
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      DEFAULT_VECTORIZER_MODULE: 'text2vec-openai'
      ENABLE_MODULES: 'text2vec-openai,text2vec-cohere,text2vec-huggingface'
      OPENAI_API_KEY: ${OPENAI_API_KEY}
```
### Vectorization

Set environment variables for your chosen provider:

- `OPENAI_API_KEY` (default)
- `COHERE_API_KEY`
- `HUGGINGFACE_API_KEY`
## 🎯 Use Cases
- Documentation Search: Technical docs with semantic understanding
- Customer Support: FAQ and knowledge base queries
- Research: Academic papers and content discovery
- RAG Applications: Retrieval Augmented Generation workflows
## 🔍 Query Examples

### MCP Client Query

```json
{
  "jsonrpc": "2.0",
  "method": "tools/call",
  "params": {
    "name": "weaviate_query",
    "arguments": {
      "class_name": "Document",
      "query": "machine learning concepts",
      "limit": 5,
      "properties": ["title", "content"]
    }
  }
}
```
### With Filters

```json
{
  "where_filter": {
    "path": ["category"],
    "operator": "Equal",
    "valueText": "AI"
  }
}
```
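Under the hood, arguments like these translate into a Weaviate GraphQL `Get` query with `nearText` search and an optional `where` filter. The following is a hedged sketch of that translation; the server's real query builder may differ.

```ruby
require 'json'

# Illustrative translation of weaviate_query arguments into a Weaviate
# GraphQL Get query. Strings are JSON-escaped; the operator is a GraphQL
# enum and stays unquoted.
def build_graphql(class_name:, query:, limit:, properties:, where_filter: nil)
  where = if where_filter
            path  = JSON.generate(where_filter[:path])
            value = JSON.generate(where_filter[:valueText])
            ", where: {path: #{path}, operator: #{where_filter[:operator]}, valueText: #{value}}"
          else
            ''
          end
  <<~GRAPHQL
    {
      Get {
        #{class_name}(nearText: {concepts: [#{JSON.generate(query)}]}, limit: #{limit}#{where}) {
          #{properties.join("\n          ")}
        }
      }
    }
  GRAPHQL
end

puts build_graphql(
  class_name: 'Document', query: 'machine learning concepts', limit: 5,
  properties: %w[title content],
  where_filter: { path: ['category'], operator: 'Equal', valueText: 'AI' }
)
```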
## 🚀 RAG Integration

Perfect for RAG applications:

```ruby
# 1. Query relevant documents
documents = weaviate_query(query: user_question, limit: 3)

# 2. Create context for the LLM
context = documents.map { |doc| doc['content'] }.join("\n\n")
prompt = "Context: #{context}\n\nQuestion: #{user_question}"
```
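To make that flow runnable end to end, here is a self-contained sketch in which `fetch_documents` is a stand-in for the real `weaviate_query` tool call and returns canned results:

```ruby
# Stand-in for the real weaviate_query tool call; returns canned results
# so the prompt-building step can run on its own.
def fetch_documents(_query, limit: 3)
  [
    { 'title' => 'Doc A', 'content' => 'Weaviate stores vector embeddings.' },
    { 'title' => 'Doc B', 'content' => 'MCP exposes tools to AI clients.' }
  ].first(limit)
end

# Assemble the retrieved chunks into a single prompt for the LLM.
def build_rag_prompt(user_question)
  documents = fetch_documents(user_question)
  context = documents.map { |doc| doc['content'] }.join("\n\n")
  "Context: #{context}\n\nQuestion: #{user_question}"
end

puts build_rag_prompt('How does semantic search work?')
```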
## 🐛 Troubleshooting

### Common Issues

- Connection Error: check `docker ps` and `curl http://localhost:8080/v1/.well-known/ready`
- Vectorization Error: verify API keys and module configuration
- Encoding Issues: ensure UTF-8 encoding; use the `-v` flag for logs
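The readiness check above can also be scripted. This sketch wraps Weaviate's standard `/v1/.well-known/ready` endpoint; the helper name is illustrative.

```ruby
require 'net/http'
require 'uri'

# Probes Weaviate's readiness endpoint; returns false on any network error.
def weaviate_ready?(base_url = 'http://localhost:8080')
  uri = URI.join(base_url, '/v1/.well-known/ready')
  Net::HTTP.get_response(uri).is_a?(Net::HTTPSuccess)
rescue StandardError
  false
end

puts weaviate_ready? ? 'Weaviate is ready' : 'Weaviate is not reachable'
```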
### Verification

```bash
# Check documents
./bin/populate_knowledge_base.rb --count Document

# Monitor Weaviate
docker compose logs weaviate
```
## 🤝 Contributing

1. Fork the repository
2. Create a feature branch
3. Add tests for new functionality
4. Submit a pull request

### Development

```bash
bundle install
bundle exec rspec    # Run tests
bundle exec rubocop  # Check code style
```
## 📄 License

MIT License - see the LICENSE file for details.
## 📞 Support

- Issues: for bug reports
- Discussions: for questions
- Examples: for usage patterns
Built with ❤️ for the AI community