Enterprise API Context Engine (MCP)

A sophisticated Model Context Protocol (MCP) Server that acts as a "Context Engine" for e-commerce platforms, providing AI agents with precise, progressively disclosed API specifications using multi-layered hybrid search.

🚀 Features

  • Multi-Layered Search Pipeline: Hard Filtering → Hybrid Retrieval → Reranking
  • Hybrid Embeddings: Dense semantic + Sparse keyword vectors for comprehensive matching
  • Progressive Disclosure: Find relevant endpoints first, then inspect full specifications
  • Service-Oriented Architecture: Organized by microservices with rich metadata
  • AI Agent Friendly: Natural language queries with intelligent result ranking
  • Local Processing: Uses FastEmbed for efficient on-device embedding generation

🏗️ Architecture

Search Pipeline

Query → Hard Filtering (Metadata) → Hybrid Search (Dense + Sparse) → Reranking → Results
  1. Hard Filtering: Pre-computation metadata filters (service, method, path, tags)
  2. Hybrid Retrieval: Combines dense embeddings (semantic) with sparse vectors (keywords)
  3. Reranking: Cross-encoder scoring for final relevance ranking
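
In code, the first two stages of this pipeline can be sketched roughly as follows. This is only a minimal illustration using FastEmbed and the qdrant-client hybrid query API; the payload key "service" and the rank-fusion strategy are assumptions made for the example, and the actual logic lives in app/engine.py.

# Hypothetical sketch of hard filtering + hybrid retrieval;
# app/engine.py may differ in structure and parameters.
from fastembed import TextEmbedding, SparseTextEmbedding
from qdrant_client import QdrantClient, models

dense_model = TextEmbedding("BAAI/bge-small-en-v1.5")
sparse_model = SparseTextEmbedding("prithivida/Splade_PP_en_v1")
client = QdrantClient(url="http://localhost:6333")

def search_endpoints(query: str, service: str | None = None, top_k: int = 20):
    # 1. Hard filtering: constrain candidates by metadata before vector search
    #    ("service" is an assumed payload key used here for illustration)
    query_filter = None
    if service:
        query_filter = models.Filter(must=[
            models.FieldCondition(key="service", match=models.MatchValue(value=service))
        ])

    # 2. Hybrid retrieval: dense (semantic) + sparse (keyword) query vectors,
    #    fused here with reciprocal rank fusion
    dense_vec = list(dense_model.embed([query]))[0]
    sparse_vec = list(sparse_model.embed([query]))[0]
    response = client.query_points(
        collection_name="api_endpoints",
        prefetch=[
            models.Prefetch(query=dense_vec.tolist(), using="text-dense",
                            filter=query_filter, limit=top_k),
            models.Prefetch(query=models.SparseVector(indices=sparse_vec.indices.tolist(),
                                                      values=sparse_vec.values.tolist()),
                            using="text-sparse-new", filter=query_filter, limit=top_k),
        ],
        query=models.FusionQuery(fusion=models.Fusion.RRF),
        limit=top_k,
    )
    # 3. Reranking (not shown): the fused candidates are rescored with
    #    BAAI/bge-reranker-base before the top results are returned.
    return response.points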

Technology Stack

  • Language: Python 3.11+
  • Vector Database: Qdrant with named vectors for hybrid search
  • Embeddings: FastEmbed (BAAI/bge-small-en-v1.5 + prithivida/Splade_PP_en_v1)
  • Orchestration: LlamaIndex for RAG pipeline management
  • Server: FastAPI with MCP SDK integration
  • Reranking: BAAI/bge-reranker-base for result refinement

🚀 Quick Start

Prerequisites

  • Python 3.11+
  • Docker (for Qdrant)
  • uv package manager

Setup

# Clone and setup
git clone <repository>
cd enterprise-api-context-engine

# Start Qdrant vector database
docker compose up -d

# Install dependencies
uv sync

# Index OpenAPI specifications
uv run python app/ingest.py

# Start the server
uv run main.py

Test the System

# Run validation tests
uv run python scripts/test_agent.py

# Test specific search queries
uv run python scripts/test_agent.py --test-search "calculate shipping costs"

# Validate collection data integrity
uv run python scripts/test_collection.py

📋 API Endpoints

Search Endpoints

  • POST /api/search - Search for API endpoints
  • GET /api/services - List available services
  • GET /api/specs/{service} - Get full OpenAPI specification

MCP Tool Simulation

  • POST /tools/find-api-endpoints - Find endpoints by intent
  • GET /tools/get-service-spec/{service} - Get service specification

Example Queries

# Search for shipping-related endpoints
curl -X POST http://localhost:8000/api/search \
  -H "Content-Type: application/json" \
  -d '{"query": "calculate shipping costs", "service": "logistics"}'

# List all services
curl http://localhost:8000/api/services

# Get full checkout service specification
curl http://localhost:8000/api/specs/checkout-payment

📁 Project Structure

enterprise-api-context-engine/
├── app/                     # Core application modules
│   ├── engine.py            # Multi-layer search engine
│   ├── ingest.py            # OpenAPI ingestion pipeline
│   └── server.py            # FastAPI server with MCP tools
├── data/specs/              # OpenAPI specifications
│   ├── product-catalog.yaml
│   ├── cart-service.yaml
│   ├── checkout-payment.yaml
│   └── logistics-shipping.yaml
├── scripts/                 # Utility scripts
│   ├── test_agent.py        # Validation test script
│   └── test_collection.py   # Collection validation
├── docker-compose.yml       # Qdrant configuration
└── pyproject.toml           # Project dependencies

🔍 How It Works

1. Data Ingestion

  • Parses OpenAPI YAML specifications
  • Creates one document per API endpoint
  • Extracts rich metadata (service, method, path, parameters, responses)
  • Generates both dense and sparse embeddings
  • Stores in Qdrant with metadata filtering support
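
A simplified outline of such an ingestion step is sketched below. It assumes an x-service-name field at the top of each spec and uses illustrative payload keys; the real pipeline in app/ingest.py is built on LlamaIndex and processes documents in batches, so treat this only as a rough sketch.

# Illustrative ingestion outline; assumes the api_endpoints collection already exists.
from pathlib import Path

import yaml
from fastembed import TextEmbedding, SparseTextEmbedding
from qdrant_client import QdrantClient, models

HTTP_METHODS = {"get", "post", "put", "patch", "delete"}

dense_model = TextEmbedding("BAAI/bge-small-en-v1.5")
sparse_model = SparseTextEmbedding("prithivida/Splade_PP_en_v1")
client = QdrantClient(url="http://localhost:6333")

points, point_id = [], 0
for spec_file in Path("data/specs").glob("*.yaml"):
    spec = yaml.safe_load(spec_file.read_text())
    service = spec.get("x-service-name", spec_file.stem)  # assumed metadata key

    # One document per endpoint, with metadata kept for hard filtering
    for path, operations in spec.get("paths", {}).items():
        for method, op in operations.items():
            if method.lower() not in HTTP_METHODS:
                continue
            text = f"{method.upper()} {path}: {op.get('summary', '')} {op.get('description', '')}"
            dense = list(dense_model.embed([text]))[0]
            sparse = list(sparse_model.embed([text]))[0]
            points.append(models.PointStruct(
                id=point_id,
                vector={
                    "text-dense": dense.tolist(),
                    "text-sparse-new": models.SparseVector(
                        indices=sparse.indices.tolist(),
                        values=sparse.values.tolist()),
                },
                payload={"service": service, "method": method.upper(),
                         "path": path, "tags": op.get("tags", [])},
            ))
            point_id += 1

client.upsert(collection_name="api_endpoints", points=points)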

2. Search Process

  • Intent Recognition: Natural language query processing
  • Service Filtering: Optional service-specific search
  • Hybrid Matching: Combines semantic similarity with keyword matching
  • Result Ranking: Reranks based on cross-encoder scoring
  • Progressive Disclosure: Returns endpoint summaries, then full specs on demand

3. Agent Interaction Pattern

Agent: "I need to calculate shipping costs"
System: Returns POST /shipping/rates endpoint
Agent: Gets full specification for implementation details
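
Over the REST API, that exchange maps onto two calls. The sketch below assumes the server is running locally on port 8000; the service identifier "logistics" is borrowed from the search example above, and the exact identifier accepted by /api/specs/{service} may differ.

# Sketch of the progressive-disclosure flow against the local server.
import requests

# Step 1: ask for candidate endpoints matching a natural-language intent
hits = requests.post(
    "http://localhost:8000/api/search",
    json={"query": "calculate shipping costs", "service": "logistics"},
    timeout=30,
)
print(hits.json())  # ranked endpoint summaries, e.g. POST /shipping/rates

# Step 2: fetch the full specification only for the service that matched
spec = requests.get("http://localhost:8000/api/specs/logistics", timeout=30)
print(spec.text[:500])  # full OpenAPI document for implementation details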

🛠️ Development

Adding New API Specifications

  1. Create YAML file in data/specs/ with x-service-name metadata
  2. Run ingestion: uv run python app/ingest.py
  3. Test search functionality

Modifying Search Behavior

  • Adjust retrieval parameters in app/engine.py
  • Change embedding models in app/ingest.py
  • Modify reranking logic in app/engine.py

Performance Tuning

  • Vector dimensions: 384 (dense) for memory efficiency
  • Retrieval: Top 20 results for high recall
  • Reranking: Top 5 results for precision
  • Batch processing for ingestion efficiency

🔧 Configuration

Qdrant Settings

  • Collection: api_endpoints
  • Dense Vectors: text-dense (384 dims, cosine distance)
  • Sparse Vectors: text-sparse-new (SPLADE encoding)
  • Port: 6333 (HTTP), 6334 (gRPC)
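
For reference, a collection with this exact layout could be created as sketched below; in practice the ingestion step normally handles collection setup.

# Sketch: create the hybrid collection described above.
from qdrant_client import QdrantClient, models

client = QdrantClient(url="http://localhost:6333")
client.create_collection(
    collection_name="api_endpoints",
    vectors_config={
        "text-dense": models.VectorParams(size=384, distance=models.Distance.COSINE),
    },
    sparse_vectors_config={
        "text-sparse-new": models.SparseVectorParams(),
    },
)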

Embedding Models

  • Dense: BAAI/bge-small-en-v1.5 (semantic similarity)
  • Sparse: prithivida/Splade_PP_en_v1 (keyword matching)
  • Reranker: BAAI/bge-reranker-base (result scoring)

🐛 Troubleshooting

Common Issues

  • Qdrant Connection: Ensure Docker container is running (docker compose ps)
  • Model Downloads: Network issues may affect initial model downloads
  • Sparse Encoder: The custom FastEmbed integration bypasses the torch dependency
  • Collection Structure: Validate data integrity with test_collection.py

Debug Commands

# Check Qdrant status
docker compose logs qdrant

# Test ingestion with verbose logging
uv run python -u app/ingest.py

# Validate collection data integrity
uv run python scripts/test_collection.py

# Validate search functionality
uv run python scripts/test_agent.py --verbose

📊 Performance Metrics

  • Indexing Speed: ~22 documents/second once the embedding models are downloaded (the first run is slower while models are fetched)
  • Search Latency: <100ms for typical queries
  • Memory Usage: Compact 384-dimensional dense vectors keep the memory footprint low
  • Accuracy: The hybrid approach balances precision and recall

🤝 Contributing

  1. Follow the existing code structure and patterns
  2. Add comprehensive logging for debugging
  3. Test with scripts/test_agent.py and scripts/test_collection.py before submitting changes
  4. Update documentation for new features

📄 License

[Add your license information here]

🙏 Acknowledgments

  • LlamaIndex for the RAG framework
  • Qdrant for vector database capabilities
  • FastEmbed for efficient local embeddings
  • FastAPI for the web framework