mcp-pinecone by NimbleBrainInc - MCP Server

Pinecone MCP Server

MCP server for Pinecone vector database. Store and search embeddings for similarity search, semantic search, and RAG (Retrieval Augmented Generation) applications.

Features

Index Management: Create, list, describe, and delete vector indexes
Vector Operations: Upsert, query, fetch, update, and delete vectors
Similarity Search: Find similar vectors with cosine, euclidean, or dot product metrics
Metadata Filtering: Hybrid search with metadata filters
Namespaces: Data isolation for multi-tenancy
Collections: Create backups from indexes
Statistics: Get vector counts and index stats

Setup

Prerequisites

Pinecone account
API key and environment name

Environment Variables

PINECONE_API_KEY (required): Your Pinecone API key
PINECONE_ENVIRONMENT (required): Your Pinecone environment

How to get credentials:

Go to app.pinecone.io
Sign up or log in
Navigate to API Keys section
Copy your API key
Note your environment (e.g., us-west1-gcp, us-east-1-aws)
Store as PINECONE_API_KEY and PINECONE_ENVIRONMENT

Index Types

Serverless (Recommended)

Pay per usage
Auto-scaling
No infrastructure management
Available regions: AWS (us-east-1, us-west-2), GCP (us-central1, us-west1), Azure (eastus)

Pod-based

Fixed capacity
Dedicated resources
More control over performance
Higher cost

Vector Dimensions

Match your embedding model:

OpenAI text-embedding-ada-002: 1536 dimensions
OpenAI text-embedding-3-small: 1536 dimensions
OpenAI text-embedding-3-large: 3072 dimensions
sentence-transformers/all-MiniLM-L6-v2: 384 dimensions
sentence-transformers/all-mpnet-base-v2: 768 dimensions

Distance Metrics

cosine - Cosine similarity (recommended for most use cases)
euclidean - Euclidean distance
dotproduct - Dot product similarity

Available Tools

Index Management

`list_indexes`

List all indexes in the project.

Example:

indexes = await list_indexes()

`create_index`

Create a new vector index.

Parameters:

name (string, required): Index name
dimension (int, required): Vector dimension
metric (string, optional): Distance metric (default: 'cosine')
spec_type (string, optional): 'serverless' or 'pod' (default: 'serverless')
cloud (string, optional): 'aws', 'gcp', or 'azure' (default: 'aws')
region (string, optional): Region (default: 'us-east-1')

Example:

index = await create_index(
    name="my-index",
    dimension=1536,  # OpenAI embeddings
    metric="cosine",
    spec_type="serverless",
    cloud="aws",
    region="us-east-1"
)

`describe_index`

Get index configuration and status.

Example:

info = await describe_index(index_name="my-index")

`delete_index`

Delete an index.

Example:

result = await delete_index(index_name="my-index")

Vector Operations

`upsert_vectors`

Insert or update vectors with metadata.

Parameters:

index_name (string, required): Index name
vectors (list, required): List of vector objects
namespace (string, optional): Namespace (default: "")

Vector format:

{
    "id": "vec1",
    "values": [0.1, 0.2, 0.3, ...],  # Must match index dimension
    "metadata": {"key": "value"}  # Optional
}

Example:

result = await upsert_vectors(
    index_name="my-index",
    vectors=[
        {
            "id": "doc1",
            "values": [0.1, 0.2, ...],  # 1536 dimensions
            "metadata": {
                "title": "Document 1",
                "category": "tech",
                "year": 2024
            }
        },
        {
            "id": "doc2",
            "values": [0.3, 0.4, ...],
            "metadata": {
                "title": "Document 2",
                "category": "science"
            }
        }
    ],
    namespace="production"
)

`query_vectors`

Query similar vectors.

Parameters:

index_name (string, required): Index name
vector (list, optional): Query vector (use this OR id)
id (string, optional): Vector ID to use as query (use this OR vector)
top_k (int, optional): Number of results (default: 10)
namespace (string, optional): Namespace (default: "")
include_values (bool, optional): Include vectors (default: False)
include_metadata (bool, optional): Include metadata (default: True)
filter (dict, optional): Metadata filter

Example:

# Query by vector
results = await query_vectors(
    index_name="my-index",
    vector=[0.1, 0.2, ...],  # Your query embedding
    top_k=5,
    filter={"category": {"$eq": "tech"}, "year": {"$gte": 2023}},
    include_metadata=True
)

# Query by existing vector ID
results = await query_vectors(
    index_name="my-index",
    id="doc1",
    top_k=5
)

Response:

{
  "matches": [
    {
      "id": "doc1",
      "score": 0.95,
      "metadata": {"title": "Document 1", "category": "tech"}
    }
  ]
}

`fetch_vectors`

Fetch vectors by IDs.

Example:

vectors = await fetch_vectors(
    index_name="my-index",
    ids=["doc1", "doc2", "doc3"],
    namespace="production"
)

`update_vector`

Update vector values or metadata.

Example:

# Update values
result = await update_vector(
    index_name="my-index",
    id="doc1",
    values=[0.5, 0.6, ...]
)

# Update metadata
result = await update_vector(
    index_name="my-index",
    id="doc1",
    set_metadata={"category": "updated", "year": 2025}
)

`delete_vectors`

Delete vectors.

Example:

# Delete by IDs
result = await delete_vectors(
    index_name="my-index",
    ids=["doc1", "doc2"]
)

# Delete by filter
result = await delete_vectors(
    index_name="my-index",
    filter={"year": {"$lt": 2020}}
)

# Delete all in namespace
result = await delete_vectors(
    index_name="my-index",
    delete_all=True,
    namespace="test"
)

Statistics & Utility

`describe_index_stats`

Get index statistics.

Example:

stats = await describe_index_stats(index_name="my-index")
# Returns: dimension, totalVectorCount, indexFullness, namespaces

`list_vector_ids`

List all vector IDs.

Example:

ids = await list_vector_ids(
    index_name="my-index",
    namespace="production",
    prefix="doc",
    limit=100
)

`create_collection`

Create a collection (backup) from an index.

Example:

collection = await create_collection(
    name="my-backup",
    source_index="my-index"
)

Namespaces

Namespaces provide data isolation within an index:

# Production data
await upsert_vectors(index_name="my-index", vectors=[...], namespace="prod")

# Test data
await upsert_vectors(index_name="my-index", vectors=[...], namespace="test")

# Query only production
results = await query_vectors(index_name="my-index", vector=[...], namespace="prod")

Metadata Filtering

Filter vectors during queries using metadata:

Operators:

$eq - Equal
$ne - Not equal
$gt - Greater than
$gte - Greater than or equal
$lt - Less than
$lte - Less than or equal
$in - In array
$nin - Not in array

Examples:

# Simple filter
filter={"category": {"$eq": "tech"}}

# Range filter
filter={"year": {"$gte": 2020, "$lte": 2024}}

# Multiple conditions
filter={
    "$and": [
        {"category": {"$eq": "tech"}},
        {"year": {"$gte": 2020}}
    ]
}

# OR condition
filter={
    "$or": [
        {"category": {"$eq": "tech"}},
        {"category": {"$eq": "science"}}
    ]
}

# In array
filter={"category": {"$in": ["tech", "science", "engineering"]}}

RAG Example with OpenAI

import openai

# 1. Generate embedding
response = openai.Embedding.create(
    input="What is machine learning?",
    model="text-embedding-ada-002"
)
query_embedding = response['data'][0]['embedding']

# 2. Query Pinecone
results = await query_vectors(
    index_name="knowledge-base",
    vector=query_embedding,
    top_k=3,
    include_metadata=True
)

# 3. Get context from results
context = "\n".join([match['metadata']['text'] for match in results['matches']])

# 4. Generate answer with context
answer = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": f"Answer based on this context:\n{context}"},
        {"role": "user", "content": "What is machine learning?"}
    ]
)

Rate Limits

Free Tier (Starter)

100,000 operations/month
1 pod/index
100 indexes max

Paid Tiers

Standard: $70/month, unlimited operations
Enterprise: Custom pricing, dedicated support

Best Practices

Match dimensions: Ensure vector dimensions match index
Use namespaces: Separate prod/test/dev data
Add metadata: Enable hybrid search and filtering
Batch upserts: Insert multiple vectors per request
Use serverless: For most applications (cost-effective)
Monitor usage: Track vector count and operations
Create backups: Use collections for important data
Optimize queries: Use appropriate top_k values

Common Use Cases

Semantic Search: Find similar documents or products
RAG: Retrieval for LLM context
Recommendation Systems: Similar item recommendations
Duplicate Detection: Find near-duplicate content
Anomaly Detection: Identify outliers
Image Search: Visual similarity search
Chatbot Memory: Store conversation context

Error Handling

Common errors:

401 Unauthorized: Invalid API key
404 Not Found: Index or vector not found
400 Bad Request: Invalid dimensions or parameters
429 Too Many Requests: Rate limit exceeded
503 Service Unavailable: Pinecone service issue

NimbleBrainInc/mcp-pinecone

Pinecone MCP Server

Features

Setup

Prerequisites

Environment Variables

Index Types

Serverless (Recommended)

Pod-based

Vector Dimensions

Distance Metrics

Available Tools

Index Management

`list_indexes`

`create_index`

`describe_index`

`delete_index`

Vector Operations

`upsert_vectors`

`query_vectors`

`fetch_vectors`

`update_vector`

`delete_vectors`

Statistics & Utility

`describe_index_stats`

`list_vector_ids`

`create_collection`

Namespaces

Metadata Filtering

RAG Example with OpenAI

Rate Limits

Free Tier (Starter)

Paid Tiers

Best Practices

Common Use Cases

Error Handling

API Documentation

Support