shilpathota/RAG_VertexAI_MCP
Enhanced RAG System: Chat Summarization + Document QA
The Enhanced RAG System is a comprehensive solution for chat summarization and document question answering, now upgraded with the Model Context Protocol (MCP) for seamless client-backend interaction.
Project Overview
This project started as a Chat Summarization API using Vertex AI Gemini to generate concise summaries from chat conversations.
It has now been enhanced to support:
- Uploading Research Papers / Documents 📄
- Semantic Search inside documents using FAISS
- Natural Language Question Answering from uploaded content
✅ Now MCP-Compliant!
This system has been upgraded with MCP (Model Context Protocol) — a standardized interface for unified interaction between the client and backend LLM services.
Built using:
- FAISS (Facebook AI Similarity Search) for local vector database
- SentenceTransformer (all-MiniLM-L6-v2) for embeddings
- Vertex AI Gemini for final answer generation
- FastAPI for API management
- Postman for API testing
Features
- Upload PDF documents.
- Generate dense embeddings using SentenceTransformer.
- Store embeddings locally in a FAISS index.
- Send natural language questions.
- Retrieve relevant document chunks via semantic search.
- Generate intelligent, human-like answers using Vertex AI Gemini.
💡 Example Flow (via MCP)
1. Upload a research paper:
POST /mcp/upload
- Form-Data:
  - file: your.pdf
  - metadata (JSON, as the metadata field):
{
  "request_id": "uuid",
  "context": {
    "type": "upload_doc",
    "data_sources": [],
    "query": "{\"source\": \"chatbot_ui\"}"
  }
}
2. Ask a question:
POST /mcp
{
  "request_id": "uuid",
  "context": {
    "type": "doc_query",
    "data_sources": ["uploaded_docs"],
    "query": "What is the methodology used in this paper?"
  }
}
3. Summarize a chat:
POST /mcp
{
  "request_id": "uuid",
  "context": {
    "type": "summarization",
    "data_sources": [],
    "query": "Customer: Hi, I have an issue...\nAgent: Can you please share your ID?"
  }
}
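The question and summarization calls above can be scripted with a small standard-library client. The helper names and the localhost:8765 base URL (taken from the setup section) are assumptions, not part of the repository:

```python
import json
import uuid
from urllib import request

BASE = "http://localhost:8765"  # assumes the server from Setup Instructions

def make_envelope(req_type, data_sources, query):
    """Build the MCP request envelope shared by all /mcp calls."""
    return {
        "request_id": str(uuid.uuid4()),
        "context": {
            "type": req_type,
            "data_sources": data_sources,
            "query": query,
        },
    }

def post_mcp(envelope):
    """POST a JSON envelope to /mcp and decode the JSON reply."""
    req = request.Request(
        f"{BASE}/mcp",
        data=json.dumps(envelope).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    # Ask a question about previously uploaded documents.
    print(post_mcp(make_envelope(
        "doc_query", ["uploaded_docs"],
        "What is the methodology used in this paper?")))
    # Summarize a raw chat transcript.
    print(post_mcp(make_envelope(
        "summarization", [],
        "Customer: Hi, I have an issue...\nAgent: Can you please share your ID?")))
```

The file upload is easiest with a multipart-capable HTTP client (e.g. the third-party requests library), since /mcp/upload expects form-data rather than a JSON body.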
System Architecture (Block Diagram)
Updated Architecture with MCP
+------------+       +--------------------+       +------------------+
| MCP Client | <---> | FastAPI MCP Server | ----> | Vertex AI Gemini |
+------------+       +--------------------+       +------------------+
                              |                            ^
                              v                            |
                   Semantic Search (FAISS) ----------------+
                              ^
                              |
                     Document Embeddings
                    (SentenceTransformer)
                              ^
                              |
                  Uploaded PDFs (pdfplumber)
Tech Stack
- FastAPI — Lightweight API framework
- FAISS — Vector similarity search engine
- SentenceTransformers — For text embeddings
- Vertex AI Gemini — LLM for content generation
- pdfplumber — For PDF text extraction
- Postman — API testing tool
Setup Instructions
- Clone the repository
git clone https://github.com/shilpathota/RAGWithVertexAI.git
cd RAGWithVertexAI
- Create a virtual environment
python -m venv venv
source venv/bin/activate # On Windows use venv\Scripts\activate
- Install requirements
pip install -r requirements.txt
- Environment Variables
Create a .env file:
GOOGLE_APPLICATION_CREDENTIALS=path/to/your/service-account.json
GOOGLE_PROJECT_ID=your-gcp-project-id
GOOGLE_LOCATION=your-gcp-region
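A minimal sketch of reading these variables at startup using only the standard library; the repository may load them differently (e.g. via python-dotenv), and the function name is illustrative:

```python
import os

def load_settings():
    """Read the required GCP settings, failing fast when one is missing."""
    return {
        "credentials": os.environ["GOOGLE_APPLICATION_CREDENTIALS"],
        "project_id": os.environ["GOOGLE_PROJECT_ID"],
        "location": os.environ["GOOGLE_LOCATION"],
    }
```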
- Run the API Server
uvicorn app.main:app --port 8765 --reload
🔌 MCP API Endpoints
1. Unified Request Handling (MCP)
POST /mcp
- Request:
{
"request_id": "uuid",
"context": {
"type": "doc_query" | "summarization",
"data_sources": [],
"query": "..."
}
}
2. Document Upload (MCP)
POST /mcp/upload
- Form-Data:
- file: PDF file
- metadata: JSON with context and source
Future Enhancements
- Document chunking for long PDFs
- Streaming response from Gemini (for long answers)
- Real-time upload and QA interface (frontend)
- Multilingual document support
- Auto Summarization of full research papers
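As a sketch of the first enhancement, a fixed-size chunker with overlap might look like the following; the sizes and function name are illustrative:

```python
def chunk_text(text, size=500, overlap=50):
    """Split text into chunks of at most `size` characters,
    with `overlap` characters shared between consecutive chunks
    so sentences cut at a boundary still appear whole somewhere."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]
```

Each chunk would then be embedded and indexed separately, so retrieval can point at the relevant part of a long PDF rather than the whole document.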
Author
#AI #MachineLearning #VertexAI #MCP #LLM #FAISS #FastAPI #DocumentSummarization #ChatSummarization #Python #RAG #GitHubProject