
DCP MCP Server - RAG System

A Model Context Protocol (MCP) server implementation with Retrieval-Augmented Generation (RAG) capabilities for intelligent document processing and question-answering.

🚀 Features

  • RAG System: Advanced retrieval-augmented generation for accurate document-based responses
  • Document Processing: Support for multiple document formats (PDF, TXT, DOCX, etc.)
  • Vector Search: Efficient semantic search using embeddings
  • Context-Aware Responses: Generate responses based on retrieved document context
  • MCP Protocol: Standardized Model Context Protocol implementation
  • Web Interface: User-friendly web interface for document upload and querying

📋 Table of Contents

  • Features
  • Installation
  • Quick Start
  • Usage
  • API Documentation
  • Configuration
  • Architecture
  • Development
  • Contributing
  • Performance
  • Troubleshooting
  • License

🛠 Installation

Prerequisites

  • Python 3.8+
  • pip or conda
  • Git

Setup

  1. Clone the repository

    git clone https://github.com/owaisnaveed00-hue/dcp-mcp-server.git
    cd dcp-mcp-server
    
  2. Create virtual environment

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
    
  3. Install dependencies

    pip install -r requirements.txt
    
  4. Set up environment variables

    cp .env.example .env
    # Edit .env with your configuration
    

🚀 Quick Start

  1. Start the server

    python app.py
    
  2. Access the web interface

    • Open your browser to http://localhost:5000
    • Upload documents to build your knowledge base
    • Ask questions about your documents
  3. Use the API

    curl -X POST http://localhost:5000/api/query \
         -H "Content-Type: application/json" \
         -d '{"query": "What is the main topic of the document?"}'
    

💡 Usage

Web Interface

  1. Upload Documents

    • Navigate to the upload page
    • Select your documents (PDF, TXT, DOCX)
    • Documents are automatically processed and indexed
  2. Query Documents

    • Use the search interface to ask questions
    • Get context-aware responses based on your documents
    • View source citations and confidence scores

API Usage

Upload Document

```python
import requests

# Upload a PDF; the server processes and indexes it automatically
with open('document.pdf', 'rb') as f:
    files = {'file': f}
    response = requests.post('http://localhost:5000/api/upload', files=files)

print(response.json())
```
Query Documents

```python
import requests

# top_k controls how many passages are retrieved as context
query = {
    "query": "What are the key findings?",
    "top_k": 5,
    "temperature": 0.7
}

response = requests.post('http://localhost:5000/api/query', json=query)
result = response.json()
print(result['answer'])
```
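The response also carries the retrieval metadata documented in the next section, so a client can show citations alongside the answer. A minimal follow-up, assuming the response shape shown under "Query Response" below:

```python
# Print source citations with their similarity scores
for source in result.get('sources', []):
    print(f"{source['document']} (page {source['page']}): score {source['score']:.2f}")
print(f"Confidence: {result.get('confidence')}")
```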

📚 API Documentation

Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | `/api/upload` | Upload and process documents |
| POST | `/api/query` | Query the RAG system |
| GET | `/api/documents` | List uploaded documents |
| DELETE | `/api/documents/<id>` | Delete a document |
| GET | `/api/health` | Health check |
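The document-management endpoints follow the same pattern as the query call in the Quick Start. A quick sketch with curl (response bodies are omitted because their exact shape is not documented here):

```bash
# List uploaded documents
curl http://localhost:5000/api/documents

# Delete a document (replace <id> with an id from the list call)
curl -X DELETE http://localhost:5000/api/documents/<id>

# Health check
curl http://localhost:5000/api/health
```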

Request/Response Examples

Query Request

```json
{
  "query": "What is machine learning?",
  "top_k": 3,
  "temperature": 0.7,
  "max_tokens": 500
}
```

Query Response

```json
{
  "answer": "Machine learning is a subset of artificial intelligence...",
  "sources": [
    {
      "document": "ml_guide.pdf",
      "page": 5,
      "content": "Machine learning algorithms...",
      "score": 0.95
    }
  ],
  "confidence": 0.92,
  "processing_time": 1.23
}
```

⚙️ Configuration

Environment Variables

```bash
# Database
DATABASE_URL=sqlite:///rag_system.db

# Vector Store
VECTOR_STORE_TYPE=chroma  # or faiss, pinecone
CHROMA_PERSIST_DIRECTORY=./chroma_db

# Model Configuration
EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
LLM_MODEL=gpt-3.5-turbo
LLM_API_KEY=your_openai_api_key

# Server Configuration
HOST=0.0.0.0
PORT=5000
DEBUG=False
```
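How the server reads these values is an implementation detail, but a typical Python setup loads them with python-dotenv. A minimal sketch, assuming that package is among the dependencies:

```python
import os

from dotenv import load_dotenv  # assumption: python-dotenv is installed

load_dotenv()  # copy variables from .env into the process environment

HOST = os.getenv("HOST", "0.0.0.0")
PORT = int(os.getenv("PORT", "5000"))
VECTOR_STORE_TYPE = os.getenv("VECTOR_STORE_TYPE", "chroma")
LLM_API_KEY = os.environ["LLM_API_KEY"]  # fail fast if the key is missing
```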

Model Options

  • Embedding Models: sentence-transformers, OpenAI embeddings
  • LLM Models: OpenAI GPT, Anthropic Claude, local models
  • Vector Stores: Chroma, FAISS, Pinecone, Weaviate
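The default embedding model named in the configuration above is published on the sentence-transformers hub, so it can be exercised directly with that library. An illustration of generating embeddings with it (independent of this server's internal code, which may wire things differently):

```python
from sentence_transformers import SentenceTransformer

# The model named by EMBEDDING_MODEL in the configuration above
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

chunks = [
    "Machine learning is a subset of artificial intelligence.",
    "RAG combines retrieval with text generation.",
]
embeddings = model.encode(chunks)
print(embeddings.shape)  # (2, 384): this model produces 384-dimensional vectors
```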

🏗 Architecture

```text
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Web Client    │    │   API Server    │    │   RAG Engine    │
│                 │◄──►│                 │◄──►│                 │
│ - Upload UI     │    │ - Flask/FastAPI │    │ - Doc Parser    │
│ - Query UI      │    │ - Authentication│    │ - Embedding Gen │
│ - Results UI    │    │ - Rate Limiting │    │ - Vector Search │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                │                        │
                                ▼                        ▼
                       ┌─────────────────┐    ┌─────────────────┐
                       │   Database      │    │  Vector Store   │
                       │                 │    │                 │
                       │ - Document Meta │    │ - Embeddings    │
                       │ - User Sessions │    │ - Similarity    │
                       │ - Query History │    │ - Indexing      │
                       └─────────────────┘    └─────────────────┘
```
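In code terms, a request through this pipeline is: embed the query, search the vector store for similar chunks, then generate an answer from the retrieved context. The following runnable sketch uses toy stand-ins (bag-of-words "embeddings", an in-memory list as the "vector store") purely to make the control flow concrete; it is not this repository's implementation:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy embedding: bag-of-words term counts (a real system uses a model)
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Toy vector store: (chunk, embedding) pairs held in memory
documents = [
    "Machine learning is a subset of artificial intelligence.",
    "Vector stores index embeddings for similarity search.",
]
index = [(doc, embed(doc)) for doc in documents]

def retrieve_context(query: str, top_k: int = 1) -> str:
    q = embed(query)
    hits = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return "\n".join(doc for doc, _ in hits[:top_k])

context = retrieve_context("What is machine learning?")
# A real system would now send this prompt to the configured LLM
print(f"Context:\n{context}\n\nQuestion: What is machine learning?")
```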

🔧 Development

Project Structure

```text
dcp-mcp-server/
├── app.py                 # Main application entry point
├── requirements.txt       # Python dependencies
├── .env.example           # Environment variables template
├── src/
│   ├── rag/               # RAG system implementation
│   │   ├── __init__.py
│   │   ├── document_parser.py
│   │   ├── embedding_generator.py
│   │   ├── vector_store.py
│   │   └── query_engine.py
│   ├── api/               # API endpoints
│   │   ├── __init__.py
│   │   ├── routes.py
│   │   └── middleware.py
│   └── models/            # Data models
│       ├── __init__.py
│       ├── document.py
│       └── query.py
├── templates/             # HTML templates
├── static/                # Static files (CSS, JS)
└── tests/                 # Test files
```

Running Tests

```bash
# Install test dependencies
pip install -r requirements-test.txt

# Run tests
pytest tests/

# Run with coverage
pytest --cov=src tests/
```
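As an example of the kind of test that lives under tests/, here is a minimal end-to-end check of the health endpoint. It is hypothetical: it assumes a server already running locally, and the repository's actual tests may use a Flask test client instead:

```python
# tests/test_health.py (illustrative)
import requests

def test_health_endpoint():
    response = requests.get("http://localhost:5000/api/health")
    assert response.status_code == 200
```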

Code Quality

```bash
# Format code
black src/ tests/

# Lint code
flake8 src/ tests/

# Type checking
mypy src/
```

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch (`git checkout -b feature/amazing-feature`)
  3. Commit your changes (`git commit -m 'Add amazing feature'`)
  4. Push to the branch (`git push origin feature/amazing-feature`)
  5. Open a Pull Request

Development Guidelines

  • Follow PEP 8 style guidelines
  • Write comprehensive tests for new features
  • Update documentation for API changes
  • Ensure backward compatibility

📊 Performance

Benchmarks

| Metric | Value |
|--------|-------|
| Document Processing | ~100 pages/second |
| Query Response Time | ~2-5 seconds |
| Vector Search Speed | ~1,000 queries/second |
| Memory Usage | ~2 GB for 10k documents |

Optimization Tips

  • Use GPU acceleration for embedding generation
  • Implement document chunking strategies (a minimal sketch follows this list)
  • Cache frequently accessed embeddings
  • Use efficient vector store configurations
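Chunking is the tip that most often needs code. A minimal sketch of fixed-size chunking with overlap, so text spanning a boundary still lands intact in at least one chunk (the sizes are illustrative defaults, not values this project prescribes):

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping fixed-size character chunks."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]

# A 1,200-character document yields chunks starting at offsets 0, 450, 900
print(len(chunk_text("x" * 1200)))  # -> 3
```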

🐛 Troubleshooting

Common Issues

  1. Out of Memory Errors

    • Reduce batch size for document processing
    • Use smaller embedding models
    • Implement document chunking
  2. Slow Query Performance

    • Optimize vector store configuration
    • Use approximate nearest neighbor search
    • Implement result caching (see the sketch after this list)
  3. Poor Response Quality

    • Adjust top_k parameter
    • Fine-tune embedding model
    • Improve document preprocessing
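For the result-caching suggestion, the simplest version is an in-memory memo keyed on the exact query string. A sketch (a production system would also bound staleness, since cached answers do not reflect newly uploaded documents):

```python
from functools import lru_cache

def run_rag_query(query: str, top_k: int) -> str:
    # Stand-in for the real retrieval + generation call
    return f"answer for {query!r} (top_k={top_k})"

@lru_cache(maxsize=256)
def cached_query(query: str, top_k: int = 5) -> str:
    # Repeated identical queries skip retrieval and generation entirely
    return run_rag_query(query, top_k)

cached_query("What is machine learning?")
cached_query("What is machine learning?")  # served from cache
print(cached_query.cache_info().hits)  # -> 1
```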

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

📞 Support


Made with ❤️ by the DCP MCP Server team