hovav698/superlinked_mcp_server
Superlinked MCP Server
A Model Context Protocol (MCP) server that provides semantic search and RAG capabilities using Superlinked's vector search framework. This server enables AI assistants to create and query vector indexes from structured data files with support for multiple search dimensions (text similarity, recency, categories, and numbers).
Overview
This project creates an MCP server that wraps Superlinked's InMemoryExecutor, allowing any MCP-compatible client (like Claude Desktop) to perform sophisticated semantic search operations on structured data. The server provides tools to preview data files, create multi-dimensional vector indexes, and query them using natural language.
How It Works
The server uses Superlinked's vector search capabilities to create in-memory indexes that combine multiple search dimensions:
- Text Similarity: Semantic search using sentence-transformer embeddings
- Recency: Time-based ranking for timestamp fields
- Number: Numeric value ranking (prices, ratings, etc.)
- Category: Categorical matching and filtering
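The Recency dimension ranks on timestamp fields, which the create_index tool expects as Unix timestamps. If your data stores dates as strings, one option is to convert them before indexing. The sketch below is only an illustration: `to_unix_timestamp` is a hypothetical helper, the sample record is made up, and it assumes ISO 8601 strings with naive values treated as UTC.

```python
from datetime import datetime, timezone


def to_unix_timestamp(value: str) -> int:
    """Convert an ISO 8601 date or datetime string to Unix seconds (hypothetical helper)."""
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Assumption: values without a timezone are treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


# Illustrative record, not taken from the sample data.
records = [{"description": "Fuel prices rise", "date": "2024-01-15"}]
for record in records:
    record["timestamp"] = to_unix_timestamp(record["date"])
```

The converted `timestamp` column can then be mapped to the recency space type when the index is created.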
When you create an index, the server:
- Analyzes your data file structure
- Creates appropriate vector spaces based on your column mapping
- Applies custom weights to each dimension
- Ingests the data into an in-memory index
- Makes it available for semantic queries
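To make the effect of the weights concrete, here is a toy sketch of how per-dimension scores might combine into a single ranking. It is not Superlinked's implementation: it simply assumes each dimension already yields a score between 0 and 1 and takes a weighted sum. The function names, record IDs, and scores are all invented for illustration.

```python
def combined_score(dimension_scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted sum of per-dimension scores; unweighted dimensions contribute nothing."""
    return sum(weights.get(name, 0.0) * score for name, score in dimension_scores.items())


# Made-up per-dimension scores for two records.
candidates = {
    "old_but_relevant": {"description": 0.9, "timestamp": 0.1},
    "new_but_vague": {"description": 0.4, "timestamp": 0.9},
}


def rank(weights: dict[str, float]) -> list[str]:
    """Return record IDs ordered from highest to lowest combined score."""
    return sorted(
        candidates,
        key=lambda record_id: combined_score(candidates[record_id], weights),
        reverse=True,
    )


text_heavy = rank({"description": 0.8, "timestamp": 0.3})
time_heavy = rank({"description": 0.3, "timestamp": 0.8})
```

With the text-heavy weights the relevant but older record ranks first; swapping the weights moves the newer record to the top, which is the behaviour you tune when adjusting weights.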
Features
- Multi-format support: CSV and JSON data files
- Flexible schema: Dynamic schema creation based on your data
- Multi-dimensional search: Combine text, time, numbers, and categories
- Custom weighting: Control the importance of each search dimension
- In-memory execution: Fast queries with no external vector database or service to run
- MCP compatibility: Works with any MCP client
Installation
Prerequisites
- Python 3.10+
- pip package manager
Setup
- Clone this repository:
git clone <repository-url>
cd rag_repo
- Install dependencies:
pip install -r requirements.txt
- Set the environment variables:
# Required for Streamlit demo
export ANTHROPIC_API_KEY=your_api_key_here
# Optional: Change Claude model (default: claude-haiku-4-5-20251001)
export CLAUDE_MODEL=claude-haiku-4-5-20251001
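For reference, a Python application could read these variables roughly as follows. This is a sketch, not the demo's actual code: `load_demo_settings` is a hypothetical helper, and only the variable names and the default model come from the instructions above.

```python
import os

DEFAULT_MODEL = "claude-haiku-4-5-20251001"  # default stated in the setup instructions


def load_demo_settings(env=os.environ) -> dict:
    """Read the demo's settings from the environment (hypothetical helper)."""
    api_key = env.get("ANTHROPIC_API_KEY")
    if not api_key:
        raise RuntimeError("ANTHROPIC_API_KEY is not set; the Streamlit demo needs it")
    # CLAUDE_MODEL is optional and falls back to the documented default.
    return {"api_key": api_key, "model": env.get("CLAUDE_MODEL", DEFAULT_MODEL)}
```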
MCP Server Tools
The server provides three main tools:
1. preview_file
Preview the structure and contents of a data file before indexing. Use this tool to understand your data structure and decide which spaces to create for indexing.
Parameters:
- file_path (string): Path to data file (CSV or JSON)
- rows (integer, optional): Number of sample rows to preview (default: 5)
Returns:
- Total row count
- Column names and data types
- Sample records
Example use case: "Show me what's in sample_data/business_news.json"
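To give a feel for the kind of summary this tool returns, here is a standard-library sketch that produces a row count, column types, and sample records. It is not the server's code: `preview_records` and its return keys are hypothetical, and it assumes a JSON file holds a top-level array of objects.

```python
import csv
import json
from pathlib import Path


def preview_records(file_path: str, rows: int = 5) -> dict:
    """Summarize a CSV or JSON file (hypothetical helper, not the server's code)."""
    path = Path(file_path)
    suffix = path.suffix.lower()
    if suffix == ".csv":
        # Note: csv.DictReader yields every value as a string.
        with path.open(newline="", encoding="utf-8") as handle:
            records = list(csv.DictReader(handle))
    elif suffix == ".json":
        # Assumption: the file holds a top-level array of objects.
        records = json.loads(path.read_text(encoding="utf-8"))
    else:
        raise ValueError(f"Unsupported file type: {path.suffix}")

    # Record the Python type name of the first value seen for each column.
    columns: dict[str, str] = {}
    for record in records:
        for key, value in record.items():
            columns.setdefault(key, type(value).__name__)

    return {"total_rows": len(records), "columns": columns, "sample": records[:rows]}
```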
2. create_index
Create a searchable vector index from structured data.
Parameters:
- file_path (string): Path to data file to index
- column_mapping (dict): Maps column names to space types
  - text_similarity: For text fields (semantic search)
  - recency: For timestamp/date fields (Unix timestamps)
  - number: For numeric fields
  - category: For categorical fields
- weights (dict): Maps column names to weight values (0.0 to 1.0)
  - Higher weight = more influence on ranking
  - Example: {"description": 0.8, "timestamp": 0.3}
Returns:
- Index name (derived from filename)
- Column configuration
- Ingestion statistics
- Applied weights
Example use case: "Create an index from business_news.json with text search on the description field and recency on the date field. Put more weight on the description."
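A client may want to sanity-check its arguments before calling this tool. The sketch below checks the space types and weight range described above and derives the index name from the filename. `check_index_arguments` is a hypothetical helper; the rule that every weighted column must also appear in the column mapping is an assumption, not documented server behaviour.

```python
from pathlib import Path

# Space types listed in the create_index parameters.
SPACE_TYPES = {"text_similarity", "recency", "number", "category"}


def check_index_arguments(file_path: str, column_mapping: dict, weights: dict) -> str:
    """Validate create_index arguments and return the expected index name (hypothetical)."""
    for column, space_type in column_mapping.items():
        if space_type not in SPACE_TYPES:
            raise ValueError(f"{column}: unknown space type {space_type!r}")
    for column, weight in weights.items():
        # Assumption: a weight only makes sense for a column that has a space.
        if column not in column_mapping:
            raise ValueError(f"{column}: weight given for an unmapped column")
        if not 0.0 <= weight <= 1.0:
            raise ValueError(f"{column}: weight {weight} is outside 0.0 to 1.0")
    # The index name is the filename without its extension.
    return Path(file_path).stem


index_name = check_index_arguments(
    "sample_data/business_news.json",
    {"description": "text_similarity", "timestamp": "recency"},
    {"description": 0.8, "timestamp": 0.3},
)
```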
3. query_index
Search an index using natural language queries.
Parameters:
- index_name (string): Name of the index (filename without extension)
- query_text (string): Natural language search query
- limit (integer, optional): Maximum results to return (default: 5)
Returns:
- Array of results with:
  - id: Record identifier
  - score: Similarity score
  - fields: All indexed fields and their values
Example use case: "Search the business_news index for articles about strikes"
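Under the hood, an MCP client invokes a tool by sending a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a message for query_index using the parameters listed above. `build_tool_call` is a hypothetical helper; in practice an MCP client library constructs this message and handles the initialization handshake for you.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


request = build_tool_call(
    1,
    "query_index",
    {"index_name": "business_news", "query_text": "articles about strikes", "limit": 5},
)
```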
Integration with External MCP Clients
Claude Desktop
To use this server with Claude Desktop:
- Locate your Claude Desktop configuration file:
  - macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  - Windows: %APPDATA%\Claude\claude_desktop_config.json
- Add the server configuration:
{
"mcpServers": {
"superlinked-rag": {
"command": "python",
"args": ["/absolute/path/to/rag_repo/mcp_server.py"],
"type": "stdio"
}
}
}
- Restart Claude Desktop.
- The server tools will now be available to Claude. You can ask Claude to:
- Preview your data files
- Create vector indexes
- Perform semantic searches
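Editing the configuration file by hand is usually enough. If you would rather script it, the following sketch merges the server entry into an existing configuration without touching other entries. `add_server_entry` is a hypothetical helper, not part of this repository; the entry it writes mirrors the configuration shown above.

```python
import json
from pathlib import Path


def add_server_entry(config_path: str, server_script: str, name: str = "superlinked-rag") -> dict:
    """Add or update one MCP server entry in a Claude Desktop config (hypothetical helper)."""
    path = Path(config_path)
    # Start from the existing configuration so other servers are preserved.
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": "python", "args": [server_script], "type": "stdio"}
    path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config


# Example call (the paths are placeholders):
# add_server_entry("claude_desktop_config.json", "/absolute/path/to/rag_repo/mcp_server.py")
```

Back up the configuration file before running a script like this, and restart Claude Desktop afterwards so the change is picked up.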
Streamlit Demo App
This repository includes a Streamlit chatbot demo that uses the Claude Agent SDK as an MCP client. This demonstrates how to build a conversational interface with the MCP server.
Running the Demo
- Run the Streamlit app:
streamlit run streamlit_chatbot.py
- Open your browser to the provided URL (typically http://localhost:8501).
- IMPORTANT: There is a small unfixed bug. Right after a fresh environment installation, the MCP server may not load on the first app start. Refresh the browser once after the Streamlit app loads; this is only needed one time and fixes the problem.
Demo Features
- Real-time streaming: See AI reasoning steps as they happen
- Conversation memory: Multi-turn conversations with context
- Tool visualization: Watch the AI use MCP tools
- Interactive UI: Clean interface with execution step display
Example Conversation Flow
- "I have this file: sample_data/business_news.json. Show me what's in it."
- "Create an index for semantic search on the description field and time-based ranking on the date field."
- "Who wanted to have a strike?"
- "This news is old, put more weight on the date field."
- "Any news related to gas? What does it say?"