# Enterprise API Context Engine (MCP)
A Model Context Protocol (MCP) server that acts as a "Context Engine" for e-commerce platforms, providing AI agents with precise, progressively disclosed API specifications through multi-layered hybrid search.
## 🚀 Features
- Multi-Layered Search Pipeline: Hard Filtering → Hybrid Retrieval → Reranking
- Hybrid Embeddings: Dense semantic + Sparse keyword vectors for comprehensive matching
- Progressive Disclosure: Find relevant endpoints first, then inspect full specifications
- Service-Oriented Architecture: Organized by microservices with rich metadata
- AI Agent Friendly: Natural language queries with intelligent result ranking
- Local Processing: Uses FastEmbed for efficient on-device embedding generation
## 🏗️ Architecture
### Search Pipeline
```
Query → Hard Filtering (Metadata) → Hybrid Search (Dense + Sparse) → Reranking → Results
```
- Hard Filtering: Metadata filters (service, method, path, tags) applied before retrieval
- Hybrid Retrieval: Combines dense embeddings (semantic) with sparse vectors (keywords)
- Reranking: Cross-encoder scoring for final relevance ranking
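A rough sketch of the first two layers with `qdrant-client` (the actual implementation lives in `app/engine.py` and may differ; `search_endpoints` and its arguments are illustrative, while the collection and vector names follow the Configuration section below):

```python
from qdrant_client import QdrantClient, models

client = QdrantClient(url="http://localhost:6333")

def search_endpoints(dense_vec: list[float], sparse_indices: list[int],
                     sparse_values: list[float], service: str | None = None,
                     top_k: int = 20):
    """Layers 1-2: hard metadata filter, then fused dense + sparse retrieval."""
    # Layer 1: hard filtering on payload metadata (e.g. restrict to one service)
    flt = None
    if service is not None:
        flt = models.Filter(must=[
            models.FieldCondition(key="service", match=models.MatchValue(value=service)),
        ])
    # Layer 2: hybrid retrieval over the two named vectors, fused with RRF
    return client.query_points(
        collection_name="api_endpoints",
        prefetch=[
            models.Prefetch(query=dense_vec, using="text-dense",
                            filter=flt, limit=top_k),
            models.Prefetch(query=models.SparseVector(indices=sparse_indices,
                                                      values=sparse_values),
                            using="text-sparse-new", filter=flt, limit=top_k),
        ],
        query=models.FusionQuery(fusion=models.Fusion.RRF),
        limit=top_k,
    ).points
```

The third layer, cross-encoder reranking, is sketched under "How It Works" below.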
### Technology Stack
- Language: Python 3.11+
- Vector Database: Qdrant with named vectors for hybrid search
- Embeddings: FastEmbed (BAAI/bge-small-en-v1.5 + prithivida/Splade_PP_en_v1)
- Orchestration: LlamaIndex for RAG pipeline management
- Server: FastAPI with MCP SDK integration
- Reranking: BAAI/bge-reranker-base for result refinement
## 🚀 Quick Start
### Prerequisites
- Python 3.11+
- Docker (for Qdrant)
- UV package manager
### Setup

```bash
# Clone the repository and enter it
git clone <repository>
cd enterprise-api-context-engine

# Start the Qdrant vector database
docker compose up -d

# Install dependencies
uv sync

# Index the OpenAPI specifications
uv run python app/ingest.py

# Start the server
uv run main.py
```
### Test the System

```bash
# Run validation tests
uv run python scripts/test_agent.py

# Test specific search queries
uv run python scripts/test_agent.py --test-search "calculate shipping costs"

# Validate collection data integrity
uv run python scripts/test_collection.py
```
## 📋 API Endpoints
### Search Endpoints

- `POST /api/search` - Search for API endpoints
- `GET /api/services` - List available services
- `GET /api/specs/{service}` - Get the full OpenAPI specification
### MCP Tool Simulation

- `POST /tools/find-api-endpoints` - Find endpoints by intent
- `GET /tools/get-service-spec/{service}` - Get a service specification
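A minimal sketch of what the tool-simulation route could look like in `app/server.py`; the request model, handler body, and response shape are assumptions, only the route path comes from this README:

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class FindEndpointsRequest(BaseModel):
    query: str                  # natural-language intent
    service: str | None = None  # optional hard filter on service name

@app.post("/tools/find-api-endpoints")
def find_api_endpoints(req: FindEndpointsRequest) -> dict:
    # The real server would delegate to the multi-layer search engine;
    # a canned result keeps this sketch self-contained and runnable.
    return {
        "query": req.query,
        "endpoints": [
            {"service": "logistics", "method": "POST", "path": "/shipping/rates",
             "summary": "Calculate shipping costs"},
        ],
    }
```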
### Example Queries

```bash
# Search for shipping-related endpoints
curl -X POST http://localhost:8000/api/search \
  -H "Content-Type: application/json" \
  -d '{"query": "calculate shipping costs", "service": "logistics"}'

# List all services
curl http://localhost:8000/api/services

# Get the full checkout service specification
curl http://localhost:8000/api/specs/checkout-payment
```
## 📁 Project Structure

```
enterprise-api-context-engine/
├── app/                      # Core application modules
│   ├── engine.py             # Multi-layer search engine
│   ├── ingest.py             # OpenAPI ingestion pipeline
│   └── server.py             # FastAPI server with MCP tools
├── data/specs/               # OpenAPI specifications
│   ├── product-catalog.yaml
│   ├── cart-service.yaml
│   ├── checkout-payment.yaml
│   └── logistics-shipping.yaml
├── scripts/                  # Utility scripts
│   ├── test_agent.py         # Validation test script
│   └── test_collection.py    # Collection validation
├── docker-compose.yml        # Qdrant configuration
└── pyproject.toml            # Project dependencies
```
## 🔍 How It Works
### 1. Data Ingestion
- Parses OpenAPI YAML specifications
- Creates one document per API endpoint
- Extracts rich metadata (service, method, path, parameters, responses)
- Generates both dense and sparse embeddings
- Stores in Qdrant with metadata filtering support
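A condensed sketch of this flow using `fastembed` and `qdrant-client` directly (the real `app/ingest.py` may route this through LlamaIndex instead; the payload fields mirror the metadata listed above):

```python
import yaml
from fastembed import TextEmbedding, SparseTextEmbedding
from qdrant_client import QdrantClient, models

dense_model = TextEmbedding("BAAI/bge-small-en-v1.5")             # 384-dim dense
sparse_model = SparseTextEmbedding("prithivida/Splade_PP_en_v1")  # SPLADE sparse
client = QdrantClient(url="http://localhost:6333")

with open("data/specs/logistics-shipping.yaml") as f:
    spec = yaml.safe_load(f)

points = []
for path, operations in spec.get("paths", {}).items():
    for method, op in operations.items():
        if method not in {"get", "post", "put", "patch", "delete"}:
            continue  # skip non-operation keys such as "parameters"
        # One document per endpoint: summary + description as embedding text
        text = f"{method.upper()} {path}: {op.get('summary', '')} {op.get('description', '')}"
        dense = next(iter(dense_model.embed([text]))).tolist()
        sparse = next(iter(sparse_model.embed([text])))
        points.append(models.PointStruct(
            id=len(points),
            vector={
                "text-dense": dense,
                "text-sparse-new": models.SparseVector(
                    indices=sparse.indices.tolist(),
                    values=sparse.values.tolist()),
            },
            # Rich metadata payload powers the hard-filtering layer
            payload={"service": spec.get("x-service-name"),
                     "method": method.upper(), "path": path,
                     "tags": op.get("tags", [])},
        ))

client.upsert(collection_name="api_endpoints", points=points)
```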
### 2. Search Process
- Intent Recognition: Natural language query processing
- Service Filtering: Optional service-specific search
- Hybrid Matching: Combines semantic similarity with keyword matching
- Result Ranking: Reranks based on cross-encoder scoring
- Progressive Disclosure: Returns endpoint summaries, then full specs on demand
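The reranking step might look like this, assuming FastEmbed's cross-encoder support is used (the actual reranking code in `app/engine.py` may differ; `candidates` is a hypothetical list of endpoint dicts produced by hybrid retrieval):

```python
from fastembed.rerank.cross_encoder import TextCrossEncoder

reranker = TextCrossEncoder("BAAI/bge-reranker-base")

def rerank(query: str, candidates: list[dict], top_k: int = 5) -> list[dict]:
    """Score each (query, endpoint summary) pair; keep the top_k best."""
    texts = [c["summary"] for c in candidates]      # e.g. the top 20 retrieved hits
    scores = list(reranker.rerank(query, texts))    # one relevance score per candidate
    ranked = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
    return [c for c, _ in ranked[:top_k]]
```

The top-20-in, top-5-out shape matches the Performance Tuning defaults below.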
### 3. Agent Interaction Pattern
Agent: "I need to calculate shipping costs"
System: Returns POST /shipping/rates endpoint
Agent: Gets full specification for implementation details
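In client code, this two-step progressive disclosure could look like the following sketch using `requests` (the JSON field names `results` and `service` are assumptions about the response shape):

```python
import requests

BASE = "http://localhost:8000"

# Step 1: find candidate endpoints from a natural-language intent
hits = requests.post(f"{BASE}/api/search",
                     json={"query": "calculate shipping costs"}).json()

# Step 2: fetch the full OpenAPI spec only for the winning service
service = hits["results"][0]["service"]  # assumed response shape
spec = requests.get(f"{BASE}/api/specs/{service}").json()
print(f"Fetched full spec for {service!r}")
```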
## 🛠️ Development
### Adding New API Specifications

- Create a YAML file in `data/specs/` with `x-service-name` metadata (see the example below)
- Run ingestion: `uv run python app/ingest.py`
- Test the search functionality
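A minimal spec skeleton for a new service might look like this; `returns-service` and its single path are made up for illustration, and the root-level `x-service-name` field is what the ingestion step keys on:

```yaml
openapi: 3.0.3
x-service-name: returns-service   # hypothetical new microservice
info:
  title: Returns Service API
  version: "1.0.0"
paths:
  /returns:
    post:
      summary: Create a return request
      tags: [returns]
      responses:
        "201":
          description: Return request created
```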
### Modifying Search Behavior

- Adjust retrieval parameters in `app/engine.py`
- Change embedding models in `app/ingest.py`
- Modify reranking logic in `app/engine.py`
### Performance Tuning
- Vector dimensions: 384 (dense) for memory efficiency
- Retrieval: Top 20 results for high recall
- Reranking: Top 5 results for precision
- Batch processing for ingestion efficiency
## 🔧 Configuration
### Qdrant Settings

- Collection: `api_endpoints`
- Dense Vectors: `text-dense` (384 dims, cosine distance)
- Sparse Vectors: `text-sparse-new` (SPLADE encoding)
- Ports: 6333 (HTTP), 6334 (gRPC)
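Those settings correspond to a Qdrant collection created roughly as follows (a sketch with `qdrant-client`; the repo may create the collection through its ingestion pipeline instead):

```python
from qdrant_client import QdrantClient, models

client = QdrantClient(url="http://localhost:6333")  # HTTP port from docker-compose.yml

client.create_collection(
    collection_name="api_endpoints",
    vectors_config={
        # Named dense vector: 384-dim bge-small embeddings, cosine distance
        "text-dense": models.VectorParams(size=384, distance=models.Distance.COSINE),
    },
    sparse_vectors_config={
        # Named sparse vector holding SPLADE keyword weights
        "text-sparse-new": models.SparseVectorParams(),
    },
)
```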
### Embedding Models
- Dense: BAAI/bge-small-en-v1.5 (semantic similarity)
- Sparse: prithivida/Splade_PP_en_v1 (keyword matching)
- Reranker: BAAI/bge-reranker-base (result scoring)
## 🐛 Troubleshooting
### Common Issues

- Qdrant Connection: Ensure the Docker container is running (`docker compose ps`)
- Model Downloads: Network issues may affect the initial model downloads
- Sparse Encoder: Custom FastEmbed integration bypasses the torch dependency
- Collection Structure: Validate data integrity with `scripts/test_collection.py`
### Debug Commands

```bash
# Check Qdrant status
docker compose logs qdrant

# Test ingestion with verbose logging
uv run python -u app/ingest.py

# Validate collection data integrity
uv run python scripts/test_collection.py

# Validate search functionality
uv run python scripts/test_agent.py --verbose
```
## 📊 Performance Metrics
- Indexing Speed: ~22 documents/second (after the initial model download)
- Search Latency: <100ms for typical queries
- Memory Usage: Efficient 384-dim vectors
- Accuracy: Hybrid approach balances precision and recall
## 🤝 Contributing
- Follow the existing code structure and patterns
- Add comprehensive logging for debugging
- Test with `scripts/test_agent.py` and `scripts/test_collection.py` before submitting changes
- Update documentation for new features
## 📄 License
[Add your license information here]
## 🙏 Acknowledgments
- LlamaIndex for the RAG framework
- Qdrant for vector database capabilities
- FastEmbed for efficient local embeddings
- FastAPI for the web framework