Iamkrmayank/mcp-server
Agno MCP Orchestration System
A comprehensive web scraping orchestration framework with automatic fallback mechanisms, built on the Model Context Protocol (MCP).
🎯 System Overview
The Agno orchestration system provides a robust, production-ready solution for web data extraction with:
- Multi-source scraping with intelligent fallback
- Comprehensive logging with timestamps and performance metrics
- MCP integration for seamless tool coordination
- Natural language queries with automatic preprocessing
- Transparent feedback on execution status
Architecture
User Input
↓
Agno Orchestrator (preprocesses & coordinates)
↓
MCP Tools Framework (manages execution)
↓
Primary: Tavily API (fast, structured extraction)
↓ (on failure)
Fallback: Jina API (semantic understanding)
↓
Formatted Response + Execution Logs
🚀 Quick Start
Prerequisites
Installation
- Clone the repository:
git clone https://github.com/yourusername/mcp-server.git
cd mcp-server
- Install dependencies:
pip install -r requirements.txt
- Configure environment variables:
# Copy example environment file
cp .env.example .env
# Edit .env with your API keys
# Required:
TAVILY_API_KEY=your_tavily_api_key_here
# Optional:
JINA_API_KEY=your_jina_api_key_here
# Configuration (optional):
LOG_LEVEL=INFO
LOG_FILE=agno_system.log
MAX_RETRIES=3
TIMEOUT_SECONDS=30
Usage
As MCP Server
Run as an MCP server (for integration with MCP-compatible clients):
python -m src.server
As CLI Tool
Run standalone queries from the command line:
# Basic usage
python -m src.cli "Tell me about Microsoft 2024 report"
# With custom formatting
python -m src.cli "What happened in AI in 2024?" --format markdown
# With more results
python -m src.cli "Latest climate change news" --max-results 10
# Show statistics
python -m src.cli "Search query" --stats
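The flags used in the examples above could be wired up with argparse roughly as follows. This is a minimal sketch, not the project's actual src/cli.py: the build_parser name, the default values, and the accepted --format choices (taken from the structured/JSON/Markdown options listed under Core Components) are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown in the CLI examples."""
    parser = argparse.ArgumentParser(prog="python -m src.cli")
    parser.add_argument("query", help="Natural language search query")
    parser.add_argument(
        "--format",
        choices=["structured", "json", "markdown"],
        default="structured",  # assumed default
        help="Output format",
    )
    parser.add_argument(
        "--max-results",
        type=int,
        default=5,  # assumed default
        help="Maximum number of results to return",
    )
    parser.add_argument(
        "--stats",
        action="store_true",
        help="Print system statistics after the query",
    )
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is runnable anywhere.
args = build_parser().parse_args(["Latest climate change news", "--max-results", "10"])
print(args.query, args.format, args.max_results, args.stats)
```

Note that argparse exposes --max-results as args.max_results, with the hyphen converted to an underscore.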
Programmatic Usage
import asyncio
from src.agno_orchestrator import AgnoOrchestrator

async def main():
    # Initialize orchestrator
    orchestrator = AgnoOrchestrator()

    # Process a request
    response = await orchestrator.process_request(
        user_input="Tell me about Microsoft 2024 report",
        max_results=5
    )

    # Format and display
    if response.success:
        formatted = orchestrator.format_response(response, format_type="markdown")
        print(formatted)
    else:
        print(f"Error: {response.error}")

asyncio.run(main())
📋 Core Components
1. Agno Orchestrator (src/agno_orchestrator.py)
Central control layer that:
- Preprocesses natural language queries
- Coordinates tool execution with priority-based selection
- Manages automatic fallback mechanisms
- Provides transparent user feedback
- Tracks system statistics
Key Methods:
- process_request(user_input, **kwargs) - Main entry point for queries
- format_response(response, format_type) - Format results as structured, JSON, or Markdown
- get_statistics() - Retrieve system performance metrics
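The README does not document what query preprocessing involves. As an illustration only, a preprocessing step might normalize whitespace and strip conversational filler before the query reaches a search tool. The preprocess_query function and the filler phrases below are hypothetical and are not taken from the project's code.

```python
import re

# Hypothetical filler phrases; the project's real preprocessing rules are not documented here.
_FILLER = re.compile(
    r"^(please\s+)?(tell me about|search for|find)\s+",
    re.IGNORECASE,
)


def preprocess_query(user_input: str) -> str:
    """Collapse whitespace, drop one leading filler phrase, and trim trailing punctuation."""
    text = " ".join(user_input.split())
    text = _FILLER.sub("", text)
    return text.rstrip("?.! ")


print(preprocess_query("  Tell me about   Microsoft 2024 report? "))
```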
2. MCP Tools Framework (src/mcp_tools_integration.py)
Unified interface for tool management:
- Dynamic tool registration and discovery
- Priority-based execution ordering
- Result validation with confidence scoring
- Execution history tracking
- Automatic failover handling
Key Classes:
- BaseTool - Abstract base for all scraping tools
- ToolRegistry - Central registry for tool management
- MCPToolsFramework - Main coordination framework
- ToolResult - Standardized result container
3. Scraping Tools
Tavily Tool (src/tools/tavily_tool.py)
- Priority: 0 (highest)
- Purpose: Fast, structured web extraction
- Features:
  - Configurable search depth (basic/advanced)
  - Domain filtering (include/exclude)
  - Automatic confidence calculation
  - Rich result metadata
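To make the search depth and domain filtering options concrete, here is a sketch of how a request body for Tavily might be assembled. The build_tavily_payload helper is hypothetical. The field names (search_depth, include_domains, exclude_domains, max_results, include_answer) are believed to match Tavily's search endpoint, but treat them as assumptions and confirm them against the current API reference. Authentication and the HTTP call itself are deliberately left out.

```python
from typing import Any, Optional


def build_tavily_payload(
    query: str,
    search_depth: str = "basic",
    max_results: int = 5,
    include_domains: Optional[list] = None,
    exclude_domains: Optional[list] = None,
) -> dict:
    """Assemble a request body; field names are assumed, see the note above."""
    if search_depth not in ("basic", "advanced"):
        raise ValueError("search_depth must be 'basic' or 'advanced'")
    payload: dict = {
        "query": query,
        "search_depth": search_depth,
        "max_results": max_results,
        "include_answer": True,  # ask for a summary answer alongside the results
    }
    # Only send the domain filters when the caller actually supplied them.
    if include_domains:
        payload["include_domains"] = list(include_domains)
    if exclude_domains:
        payload["exclude_domains"] = list(exclude_domains)
    return payload


payload = build_tavily_payload(
    "Microsoft 2024 report",
    search_depth="advanced",
    include_domains=["microsoft.com"],
)
print(payload)
```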
Jina Tool (src/tools/jina_tool.py)
- Priority: 1 (fallback)
- Purpose: Semantic web understanding
- Features:
  - Search API integration
  - Reader API for content extraction
  - Fallback search mechanisms
  - Works without API key (rate-limited)
4. Logging System (src/logging_system.py)
Comprehensive logging with:
- UTC timestamps on all events
- Operation-level tracking (start/end)
- Duration measurements (milliseconds)
- Success/failure status
- Data quality metrics (confidence, completeness)
- Fallback event logging
Log Format:
2024-10-14 15:30:45 UTC - AgnoSystem - INFO - [Tavily_Request] Status: success | Duration: 1250.45ms | Quality: {"confidence": 0.85}
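A line in this shape can be produced with the standard logging module alone. The sketch below is one way to do it and is not necessarily how src/logging_system.py is written; the make_logger helper is hypothetical.

```python
import io
import json
import logging
import time


def make_logger(stream) -> logging.Logger:
    """Build a logger whose lines follow the 'timestamp UTC - name - level - message' shape."""
    formatter = logging.Formatter(
        fmt="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
        datefmt="%Y-%m-%d %H:%M:%S UTC",
    )
    formatter.converter = time.gmtime  # render asctime in UTC instead of local time
    handler = logging.StreamHandler(stream)
    handler.setFormatter(formatter)
    logger = logging.getLogger("AgnoSystem")
    logger.setLevel(logging.INFO)
    logger.handlers = [handler]  # replace any handlers left over from earlier calls
    logger.propagate = False
    return logger


buffer = io.StringIO()
log = make_logger(buffer)
log.info(
    "[Tavily_Request] Status: success | Duration: %.2fms | Quality: %s",
    1250.45,
    json.dumps({"confidence": 0.85}),
)
print(buffer.getvalue(), end="")
```

Setting the formatter's converter to time.gmtime is what keeps the timestamp in UTC; without it, asctime uses local time and the literal "UTC" suffix would be misleading.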
5. Configuration Management (src/config.py)
Centralized configuration:
- Environment variable loading
- Configuration validation
- Default value management
- Secure API key handling
🔧 Configuration Options
| Variable | Description | Default | Required |
|---|---|---|---|
| TAVILY_API_KEY | Tavily API key | - | Yes* |
| JINA_API_KEY | Jina API key | - | No |
| LOG_LEVEL | Logging level | INFO | No |
| LOG_FILE | Log file path | agno_system.log | No |
| TIMEOUT_SECONDS | Request timeout (seconds) | 30 | No |
| MAX_RETRIES | Max retry attempts | 3 | No |
| MIN_CONFIDENCE | Min confidence threshold (0.0-1.0) | 0.5 | No |
*At least one API key (Tavily or Jina) must be configured.
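The table and its footnote translate fairly directly into a loader that applies defaults and validates the result. The following is a minimal sketch under that reading; the Settings class and load_settings function are hypothetical names, and src/config.py may be organized differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    tavily_api_key: Optional[str]
    jina_api_key: Optional[str]
    log_level: str
    log_file: str
    timeout_seconds: int
    max_retries: int
    min_confidence: float


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping, applying the defaults from the table above."""
    settings = Settings(
        tavily_api_key=env.get("TAVILY_API_KEY") or None,  # treat empty strings as unset
        jina_api_key=env.get("JINA_API_KEY") or None,
        log_level=env.get("LOG_LEVEL", "INFO"),
        log_file=env.get("LOG_FILE", "agno_system.log"),
        timeout_seconds=int(env.get("TIMEOUT_SECONDS", "30")),
        max_retries=int(env.get("MAX_RETRIES", "3")),
        min_confidence=float(env.get("MIN_CONFIDENCE", "0.5")),
    )
    # Footnote rule: at least one API key must be present.
    if not (settings.tavily_api_key or settings.jina_api_key):
        raise ValueError("Configure at least one of TAVILY_API_KEY or JINA_API_KEY")
    if not 0.0 <= settings.min_confidence <= 1.0:
        raise ValueError("MIN_CONFIDENCE must be between 0.0 and 1.0")
    return settings


# Pass an explicit mapping so the example does not depend on the real environment.
settings = load_settings({"JINA_API_KEY": "example-key"})
print(settings.timeout_seconds, settings.min_confidence)
```

Accepting the environment as a parameter keeps the loader easy to test without touching the process environment.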
📊 System Features
Priority-Based Execution
Tools are executed in priority order:
- Tavily (priority 0) - Primary fast extraction
- Jina (priority 1) - Semantic fallback
On failure or low confidence, the system automatically tries the next tool.
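The ordering rule above can be sketched as a loop over tools sorted by priority. This is an illustration of the idea, not the MCPToolsFramework implementation: the Tool and Result classes, run_with_fallback, and the two fake tools are all hypothetical stand-ins.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, List, Optional


@dataclass
class Result:
    source: str
    data: list
    confidence: float


@dataclass
class Tool:
    name: str
    priority: int  # 0 is tried first
    run: Callable[[str], Awaitable[Result]]


async def run_with_fallback(
    tools: List[Tool], query: str, min_confidence: float = 0.5
) -> Optional[Result]:
    """Try each tool in ascending priority order until one returns a usable result."""
    for tool in sorted(tools, key=lambda t: t.priority):
        try:
            result = await tool.run(query)
        except Exception:
            continue  # failure or timeout: move on to the next tool
        if result.data and result.confidence >= min_confidence:
            return result
        # empty data or low confidence: fall through to the next tool
    return None


async def fake_tavily(query: str) -> Result:
    raise TimeoutError("simulated timeout")


async def fake_jina(query: str) -> Result:
    return Result(source="Jina", data=[query], confidence=0.7)


tools = [Tool("Jina", 1, fake_jina), Tool("Tavily", 0, fake_tavily)]
result = asyncio.run(run_with_fallback(tools, "Latest AI news"))
print(result.source)
```

Here the simulated Tavily tool raises, so the loop moves on and the result comes from the simulated Jina tool.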
Automatic Fallback
# Fallback triggers when:
1. Primary tool fails (error, timeout)
2. Result confidence below threshold
3. Empty or invalid data returned
# Fallback is logged:
{
"timestamp": "2024-10-14T15:30:45.123Z",
"operation": "Fallback_Triggered",
"from": "Tavily",
"to": "Jina",
"reason": "Tavily request timeout"
}
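The three trigger conditions and the logged event can be expressed as two small functions. This is a sketch under assumptions: the function names, the wording of the reason strings, and the 0.5 threshold default (borrowed from MIN_CONFIDENCE) are illustrative rather than taken from the project.

```python
from datetime import datetime, timezone
from typing import Optional, Sequence


def fallback_reason(
    error: Optional[str],
    confidence: float,
    results: Sequence,
    min_confidence: float = 0.5,
) -> Optional[str]:
    """Return why a fallback is needed, or None when the primary result is usable."""
    if error is not None:  # 1. primary tool failed (error, timeout)
        return error
    if not results:  # 3. empty or invalid data returned
        return "Empty or invalid data returned"
    if confidence < min_confidence:  # 2. result confidence below threshold
        return f"Confidence {confidence:.2f} below threshold {min_confidence:.2f}"
    return None


def fallback_event(source: str, target: str, reason: str) -> dict:
    """Build a record in the same shape as the logged fallback example."""
    now = datetime.now(timezone.utc)
    return {
        "timestamp": now.isoformat(timespec="milliseconds").replace("+00:00", "Z"),
        "operation": "Fallback_Triggered",
        "from": source,
        "to": target,
        "reason": reason,
    }


reason = fallback_reason(error="Tavily request timeout", confidence=0.0, results=[])
if reason is not None:
    event = fallback_event("Tavily", "Jina", reason)
    print(event)
```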
Confidence Scoring
Each result includes a confidence score (0.0-1.0) based on:
- Number of results found
- Quality scores from source
- Presence of summary answer
- Content completeness
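The README lists the inputs to the score but not how they are combined. One plausible combination is a weighted sum clamped to the 0.0-1.0 range, sketched below. The weights, the expected_results normalizer, and the confidence_score name are assumptions made for illustration; the project's actual formula may differ.

```python
from typing import Sequence


def confidence_score(
    result_count: int,
    source_scores: Sequence[float],
    has_answer: bool,
    completeness: float,
    expected_results: int = 5,
) -> float:
    """Combine the four listed signals into one score in [0.0, 1.0] (assumed weights)."""
    count_part = min(result_count / expected_results, 1.0) if expected_results > 0 else 0.0
    quality_part = sum(source_scores) / len(source_scores) if source_scores else 0.0
    answer_part = 1.0 if has_answer else 0.0
    score = (
        0.3 * count_part       # number of results found
        + 0.4 * quality_part   # quality scores from the source
        + 0.1 * answer_part    # presence of a summary answer
        + 0.2 * completeness   # content completeness, already in [0, 1]
    )
    return round(max(0.0, min(score, 1.0)), 2)


print(confidence_score(5, [0.9, 0.8], has_answer=True, completeness=1.0))
```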
Transparent Feedback
Users receive clear status messages:
- ✓ Success: "Results retrieved successfully from Tavily."
- ✓ Fallback Used: "Primary source unavailable, fallback mechanism used."
- ⚠ Timeout: "Request timed out. Please try again."
- ✗ Failure: "Unable to retrieve results. All sources failed."
📝 Logging Standards
Every operation logs:
- Timestamp - UTC ISO format
- Operation Name - e.g., "Tavily_Request"
- Duration - Execution time in milliseconds
- Status - success/failure/timeout/in_progress
- Error Message - If applicable
- Data Quality - Confidence and completeness metrics
- Metadata - Additional context
Example log entry:
{
"timestamp": "2024-10-14T15:30:45.123456Z",
"operation": "Tavily_Request_End",
"status": "success",
"duration_ms": 1250.45,
"error": null,
"data_quality": {
"confidence": 0.85
},
"metadata": {
"result_count": 5,
"search_depth": "basic"
}
}
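An entry with these fields can be produced by wrapping each operation in a timing context manager. The sketch below is one possible approach, not the project's logging code: timed_operation is a hypothetical helper, and it appends JSON strings to a list where a real system would write to a log handler.

```python
import json
import time
from contextlib import contextmanager
from datetime import datetime, timezone


@contextmanager
def timed_operation(name: str, sink: list, **metadata):
    """Time a block of work and append one JSON log entry to sink when it finishes."""
    now = datetime.now(timezone.utc)
    entry = {
        "timestamp": now.isoformat(timespec="microseconds").replace("+00:00", "Z"),
        "operation": name,
        "status": "in_progress",
        "duration_ms": None,
        "error": None,
        "data_quality": {},
        "metadata": metadata,
    }
    start = time.perf_counter()
    try:
        yield entry  # the caller may fill in data_quality while working
        entry["status"] = "success"
    except TimeoutError as exc:
        entry["status"] = "timeout"
        entry["error"] = str(exc)
        raise
    except Exception as exc:
        entry["status"] = "failure"
        entry["error"] = str(exc)
        raise
    finally:
        entry["duration_ms"] = round((time.perf_counter() - start) * 1000, 2)
        sink.append(json.dumps(entry))


records = []
with timed_operation("Tavily_Request_End", records, result_count=5, search_depth="basic") as entry:
    entry["data_quality"]["confidence"] = 0.85
print(records[0])
```

Because the entry is written in the finally clause, failures and timeouts are logged with their duration before the exception propagates.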
🛠️ Development
Project Structure
mcp-server/
├── src/
│ ├── __init__.py
│ ├── server.py # MCP server entry point
│ ├── cli.py # Command-line interface
│ ├── config.py # Configuration management
│ ├── logging_system.py # Comprehensive logging
│ ├── agno_orchestrator.py # Main orchestration logic
│ ├── mcp_tools_integration.py # Tool framework
│ └── tools/
│ ├── __init__.py
│ ├── tavily_tool.py # Tavily integration
│ └── jina_tool.py # Jina integration
├── requirements.txt # Python dependencies
├── .env.example # Example environment file
├── .gitignore # Git ignore rules
└── README.md # This file
Adding New Tools
- Create a new tool class inheriting from BaseTool:
from src.mcp_tools_integration import BaseTool, ToolResult, ToolStatus

class MyCustomTool(BaseTool):
    def __init__(self, api_key=None):
        # Priority 0 runs first; a larger number means lower priority (tried later as a fallback)
        super().__init__(name="MyTool", priority=2)
        self.api_key = api_key

    def is_available(self) -> bool:
        # The tool is usable only when an API key was supplied
        return bool(self.api_key)

    async def execute(self, query: str, **kwargs) -> ToolResult:
        # Implement your scraping logic
        try:
            # ... scraping code ...
            return ToolResult(
                status=ToolStatus.SUCCESS,
                data={"results": []},
                confidence=0.8
            )
        except Exception as e:
            return ToolResult(
                status=ToolStatus.FAILURE,
                error=str(e)
            )
- Register the tool in the orchestrator:
from src.tools.my_custom_tool import MyCustomTool
# In AgnoOrchestrator._register_tools()
custom_tool = MyCustomTool(api_key=custom_api_key)
self.framework.register_tool(custom_tool)
Running Tests
# Test CLI
python -m src.cli "test query" --format json
# Test specific queries
python -m src.cli "Microsoft 2024 report" --stats
python -m src.cli "Latest AI news" --max-results 10 --format markdown
📈 System Statistics
Track system performance:
stats = orchestrator.get_statistics()
# Returns:
{
"total_requests": 100,
"successful_requests": 95,
"success_rate": 95.0,
"fallback_count": 12,
"fallback_rate": 12.0,
"available_tools": 2,
"registered_tools": 2
}
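The rate fields in this output are percentages derived from the raw counters. A minimal sketch of that bookkeeping follows; the Stats class and its record method are hypothetical, and the tool counts (available_tools, registered_tools) are omitted because they would come from the tool registry.

```python
from dataclasses import dataclass


@dataclass
class Stats:
    total_requests: int = 0
    successful_requests: int = 0
    fallback_count: int = 0

    def record(self, success: bool, used_fallback: bool) -> None:
        """Update the counters after one request."""
        self.total_requests += 1
        if success:
            self.successful_requests += 1
        if used_fallback:
            self.fallback_count += 1

    def summary(self) -> dict:
        """Return counters plus rates as percentages; rates are 0.0 before any request."""
        def pct(part: int) -> float:
            if self.total_requests == 0:
                return 0.0
            return round(100.0 * part / self.total_requests, 1)

        return {
            "total_requests": self.total_requests,
            "successful_requests": self.successful_requests,
            "success_rate": pct(self.successful_requests),
            "fallback_count": self.fallback_count,
            "fallback_rate": pct(self.fallback_count),
        }


stats = Stats()
for i in range(100):
    # Simulated traffic: the first 5 requests fail, the first 12 use the fallback.
    stats.record(success=i >= 5, used_fallback=i < 12)
print(stats.summary())
```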
🔒 Security
- API keys stored in environment variables
- Keys never logged or exposed in output
- HTTPS for all API communications
- No sensitive data stored in logs
🐛 Troubleshooting
"No scraping tools are available"
- Check that at least one API key is configured in .env
- Verify API keys are valid
"Request timed out"
- Increase TIMEOUT_SECONDS in .env
- Check internet connection
- Verify API services are operational
"All tools failed"
- Check API key validity
- Review logs in agno_system.log
- Ensure query is well-formed
- Try with different queries
Low confidence results
- Adjust MIN_CONFIDENCE threshold
- Use more specific queries
- Try with search_depth=advanced for Tavily
📄 License
MIT License - See LICENSE file for details
🤝 Contributing
Contributions welcome! Please:
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
📞 Support
For issues, questions, or feature requests:
- Open an issue on GitHub
- Check existing issues for solutions
- Review logs for debugging information
🎉 Acknowledgments
Built with:
- MCP (Model Context Protocol)
- Tavily API
- Jina AI
- Python asyncio and httpx
Version: 1.0.0
Last Updated: October 2025