synechron-code/text2sql-mcp
Synechron Text2SQL MCP Server
Synechron's Text2SQL MCP Server is a Model Context Protocol (MCP) server. It provides MCP clients such as Cursor, VSCode, and Claude Desktop with natural language access to relational databases of various types, enabling users to ask questions of their data in natural language.
The server improves the quality of the results it generates by using a combination of schema- and row-aware semantic search: schemas and a sample of rows are indexed, enabling the server to generate more suitable queries.
Features and Capabilities
The Text2SQL MCP server provides the following capabilities:
- Natural Language to SQL: Natural language questions are converted into optimized SQL queries using advanced language models.
- Multi-Database Support: Compatibility with PostgreSQL, MySQL, SQLite and Microsoft Fabric databases.
- Retrieval Augmented Generation (RAG): Enhanced query generation by combining schema metadata and sampled row content.
- Table Indexing: Creates embeddings for database schemas and sample data, improving query relevance.
- Response Synthesis: Generates natural language summaries of query results.
- Markdown Formatting: Creates more readable responses.
- Multiple Model Support: Works with OpenAI, Azure OpenAI, AWS Bedrock, and Ollama models.
Sample Data
The server ships with two example datasets.
- Cyber Threats
- ESG Ratings
Sample Client
A sample REPL is also included.
Quick Start with Docker
Note: To use your own databases, further configuration is required; see the Configuration section below.
Running the Server with Docker
docker run -it -p8000:8000 \
-v ~/.aws:/home/appuser/.aws:ro \
-e MODEL_API_TYPE=Bedrock \
-e AWS_PROFILE=AWSAdministratorAccess-880502554482 \
-e FASTMCP_HOST=0.0.0.0 \
709825985650.dkr.ecr.us-east-1.amazonaws.com/synechron/mcp-text2sql:0.0.10-2025.07.04-rc3 server
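If the container starts correctly, the MCP endpoint answers JSON-RPC requests at http://localhost:8000/mcp. The following curl call is only a rough connectivity check, a sketch of an MCP initialize request; the protocolVersion value and the exact response framing may vary by version.
# Send a minimal MCP initialize request to the Streamable HTTP endpoint
curl -s -X POST http://localhost:8000/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-03-26","capabilities":{},"clientInfo":{"name":"curl-check","version":"0.1"}}}'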
Running the Sample Client with Docker
docker run -it \
-v ~/.aws:/home/appuser/.aws:ro \
-e AWS_PROFILE=AWSAdministratorAccess-880502554482 \
-e BEDROCK_MODEL_ID=us.anthropic.claude-3-7-sonnet-20250219-v1:0 \
-e BEDROCK_API_VERSION=2025-01-01-preview \
-e MCP_SERVER_HOST=host.docker.internal \
709825985650.dkr.ecr.us-east-1.amazonaws.com/synechron/mcp-text2sql:0.0.10-2025.07.04-rc3 client
Example Client Usage
Connected to server with tools: ['Text2Sql_Cyber_Threats', 'Text2Sql_ESG_Ratings']
MCP Client Started!
Type your queries or 'quit' to exit.
Query: What kind of data is in the Cyber Threat datasource?
The Cyber Threat datasource contains comprehensive information about cybersecurity incidents with the following data fields:
1. country - The country where the cyber attack took place
2. year - The year when the cyber attack occurred
3. attack_type - The type of cyber attack (e.g., Phishing, Ransomware, DDoS, Man-in-the-Middle, SQL Injection)
4. target_industry - The industry that was targeted (e.g., Education, Retail, IT, Telecommunications, Healthcare, Government, Banking)
5. financial_loss_(in_million_$) - The financial impact of the attack in millions of dollars
6. number_of_affected_users - How many users were affected by the attack
7. attack_source - Where the attack originated from (e.g., Hacker Group, Nation-state, Insider, Unknown)
8. security_vulnerability_type - The type of security vulnerability exploited (e.g., Unpatched Software, Weak Passwords, Social Engineering)
9. defense_mechanism_used - What defense was in place (e.g., VPN, Firewall, Antivirus, AI-based Detection)
10. incident_resolution_time_(in_hours) - How long it took to resolve the incident in hours
The database tracks cybersecurity incidents across different countries, industries, and years, including details about attack methods, financial impact, affected users, vulnerabilities exploited, and resolution times.
Using the Server with an Existing MCP Client
The server should be run with Docker. Use either of the following configurations to add it to an MCP client. The first connects to an already running server over Streamable HTTP; the second has the client launch the server itself over stdio via Docker (a quick stdio smoke test is shown after it):
{
"servers": {
"text2sql": {
"type": "streamable-http",
"url": "http://localhost:8000/mcp"
}
}
}
{
"mcpServers": {
"text2sql": {
"command": "docker",
"args": [
"run",
"-i",
"--rm",
"--env-file",
".env",
"709825985650.dkr.ecr.us-east-1.amazonaws.com/synechron/mcp-text2sql:0.0.10-2025.07.04-rc3",
"server"
]
}
}
}
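With the stdio configuration the MCP client launches the server itself, so the .env passed via --env-file should leave MODE at its stdio default rather than shttp. As a rough smoke test outside of any MCP client, you can pipe a single initialize message into the same command; this is only a sketch, since a real client keeps the pipe open and follows up with an initialized notification.
# Sketch: send one JSON-RPC initialize request over stdio and print the reply
echo '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-03-26","capabilities":{},"clientInfo":{"name":"stdio-check","version":"0.1"}}}' | \
  docker run -i --rm --env-file .env \
  709825985650.dkr.ecr.us-east-1.amazonaws.com/synechron/mcp-text2sql:0.0.10-2025.07.04-rc3 server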
Configuration
The Text2SQL MCP server can be configured with a combination of environment variables and JSON configuration.
Server Configuration
Valves Config | Environment Variable | Default | Description
---|---|---|---
ENABLED | | False | Enable the tool
ENABLE_RAG | | True | Use AI to synthesize a response from the data collected
ENABLE_MARKDOWN | | False | Format the response in Markdown
ENABLE_PANDAS | | True | Use pandas to execute and parse the SQL query; otherwise use llama_index
ENABLE_TABLE_INDEXING | | True | Use the table retriever to create embeddings for the database schema and sample data, optimizing context for the LLM
FORCE_REINDEXING | | False | Force the tool to regenerate vector indexes for tables and sample data
DATABASE_CONFIG | DB_CONFIG | Default db config | JSON-formatted tool configuration with database connection string, tables, context, and prompts
IGNORE_SCHEMA | | Default system tables | Comma-separated list of schemas to ignore
CREDENTIAL_SERVICE | CREDENTIAL_SERVICE | | Select authentication type (Azure, AWS, Google, None)
MODEL_API_TYPE / MODEL_TYPE | MODEL_TYPE | OpenAI | Select model API type (OpenAI, Azure OpenAI, Bedrock, Ollama)
API_ENDPOINT | OPENAI_ENDPOINT | http://litellm-proxy:4000 | OpenAI API endpoint
API_ID | API_ACCESS_ID | | Access key ID
API_KEY | OPENAI_API_KEY | | OpenAI API key
API_REGION | AWS_REGION | us-east-1 | Region for cloud-hosted models
API_VERSION | | 2025-01-01-preview | OpenAI API version
LLM_MODEL_NAME | | gpt-4.1-mini | Text-to-SQL model
EMBED_MODEL_NAME | | text-embedding-3-small | Embedding model
MAX_RETRY | | 3 | Maximum number of retries for failing queries
MAX_RESULTS | | 200 | Maximum number of rows in a response
MAX_DATAFRAME | | 1000 | Maximum number of rows in a dataframe sent to the LLM to synthesize a response
SAMPLE_RATIO | | 0.0001 | Fraction of the dataset to index. Higher values take longer to index but give better results
SAMPLE_MAX_SIZE | | 50 | Maximum number of rows to sample. Supersedes SAMPLE_RATIO
DEBUG | | False | Debug mode
MODE | MODE | stdio | Mode in which the MCP server should operate. Use shttp for Streamable HTTP
AUTH_ENABLED | AUTH_ENABLED | false | Whether or not to require MCP clients to be authenticated
AUTH_ISSUER | AUTH_ISSUER | | Required if AUTH_ENABLED is true
AUTH_JWKS_URI | AUTH_JWKS_URI | | Required if AUTH_ENABLED is true
AWS_ACCESS_KEY_ID | AWS_ACCESS_KEY_ID | | Required if using AWS Bedrock for embeddings
AWS_SECRET_ACCESS_KEY | AWS_SECRET_ACCESS_KEY | | Required if using AWS Bedrock for embeddings
LOG_LEVEL | LOG_LEVEL | INFO | Logging level (DEBUG, INFO, WARNING, ERROR)
MCP_SERVER_API_KEY | MCP_SERVER_API_KEY | | API key for access to the LLM / embeddings. Configure with AWS_SECRET_ACCESS_KEY, a LiteLLM key, or an OpenAI API key
MCP_SERVER_DATA | MCP_SERVER_DATA | data | Directory for temporary data files
TEXT2SQL_VALVES_JSON | TEXT2SQL_VALVES_JSON | valves.json | Full path to the configuration JSON; see the example below
Example .env File
LOG_LEVEL=INFO
PYTHONUNBUFFERED=1
TEXT2SQL_VALVES_JSON=/home/appuser/config/config.json
MODE=shttp
FASTMCP_HOST=0.0.0.0
FASTMCP_PORT=8000
AUTH_ENABLED=true
AUTH_JWKS_URI=https://cognito-idp.us-east-1.amazonaws.com/us-east-1_/.well-known/jwks.json
AUTH_ISSUER=https://cognito-idp.us-east-1.amazonaws.com/us-east-1_
AWS_PROFILE=default
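The .env above assumes AWS Bedrock with credentials taken from an AWS profile. A comparable OpenAI-backed setup might look like the sketch below; the variable names come from the configuration table above, the API key is a placeholder, and the OPENAI_ENDPOINT value is an assumption for a direct connection rather than the default LiteLLM proxy.
LOG_LEVEL=INFO
PYTHONUNBUFFERED=1
TEXT2SQL_VALVES_JSON=/home/appuser/config/config.json
MODE=shttp
FASTMCP_HOST=0.0.0.0
FASTMCP_PORT=8000
MODEL_API_TYPE=OpenAI
# Assumption: point directly at OpenAI instead of the default LiteLLM proxy
OPENAI_ENDPOINT=https://api.openai.com/v1
OPENAI_API_KEY=<your-openai-api-key>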
Example TEXT2SQL_VALVES_JSON (valves.json)
The Text2SQL MCP server is configured using a JSON file (referenced by TEXT2SQL_VALVES_JSON) that specifies database connections, model settings, and other parameters. Here's an example configuration:
{
"ENABLED": true,
"ENABLE_RAG": false,
"ENABLE_MARKDOWN": true,
"ENABLE_PANDAS": true,
"ENABLE_TABLE_INDEXING": true,
"FORCE_REINDEXING": false,
"DATABASE_CONFIG": {
"Cyber_Threats": {
"description": "A comprehensive dataset tracking cybersecurity incidents, attack vectors, threat types, and affected countries.",
"topics": [
"Cyber Threats",
"Attacks",
"Targets"
],
"url": "sqlite:///file:/home/appuser/config/cyber_threats.db?uri=true",
"tables": {
"cyber_threats": "The Global Cybersecurity Threats Dataset (2015-2024) provides extensive data on cyberattacks, malware types, targeted industries, and affected countries."
},
"prompts": [
{
"name": "biggest_cyber_threat",
"description": "Which type of cyber threat has caused the biggest financial loss?"
}
]
},
"ESG_Ratings": {
"description": "S&P 500 Companies ESG Insights & Risk Scores for Informed Decisions.",
"topics": [
"ESG Ratings",
"ESG Risk",
"Sustainability"
],
"url": "sqlite:///file:/home/appuser/config/esg_ratings.db?uri=true",
"tables": {
"esg_ratings": "A comprehensive dataset tracking ESG ratings for S&P 500 companies."
}
}
},
"IGNORE_SCHEMA": "information_schema, INFORMATION_SCHEMA, _rsc, db_accessadmin, db_backupoperator, db_datareader, db_datawriter, db_ddladmin, db_denydatareader, db_denydatawriter, db_owner, db_securityadmin, guest, queryinsights, sys, pg_catalog",
"MODEL_API_TYPE": "Bedrock",
"AWS_PROFILE": "default",
"API_VERSION": "2025-01-01-preview",
"LLM_MODEL_NAME": "us.anthropic.claude-3-7-sonnet-20250219-v1:0",
"EMBED_MODEL_NAME": "amazon.titan-embed-text-v2:0",
"MAX_RETRY": 3,
"MAX_RESULTS": 100,
"MAX_DATAFRAME": 2000,
"LOG_LEVEL": "INFO"
}
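To run the server against this configuration, one option is to mount a local directory containing config.json and the SQLite files at /home/appuser/config and pass the remaining settings via --env-file. This is only a sketch: the mount point is an assumption based on the example paths above and in the sample .env, so adjust it to wherever TEXT2SQL_VALVES_JSON points, and the ~/.aws mount is only needed when using Bedrock with an AWS profile.
docker run -it -p8000:8000 \
  --env-file .env \
  -v "$(pwd)/config:/home/appuser/config:ro" \
  -v ~/.aws:/home/appuser/.aws:ro \
  709825985650.dkr.ecr.us-east-1.amazonaws.com/synechron/mcp-text2sql:0.0.10-2025.07.04-rc3 server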
Client Configuration
The following environment variables can be provided to the sample MCP client:
Option | Default | Description
---|---|---
JWT_ACCESS_TOKEN | false | Token used to authenticate the client with the server if the server has AUTH_ENABLED set to true
MCP_SERVER_HOST | | Host of the MCP server. For a server running in Docker on the same host, use host.docker.internal
MCP_SERVER_PORT | 8000 | Port of the MCP server
BEDROCK_MODEL_ID | | Required if using AWS Bedrock, e.g. us.anthropic.claude-3-7-sonnet-20250219-v1:0
BEDROCK_API_VERSION | | Required if using AWS Bedrock, e.g. 2023-06-01-preview
AWS_PROFILE | | Required if using AWS Bedrock, e.g. default
AZURE_DEPLOYMENT_MODEL | | Required if using Azure OpenAI. Ignored if BEDROCK_MODEL_ID is set. For Azure authentication, see DefaultAzureCredential
AZURE_API_VERSION | | Required if using Azure OpenAI. Ignored if BEDROCK_MODEL_ID is set
OTEL_SDK_DISABLED | false | Set to true to disable the OpenTelemetry SDK for the Crew.AI client
CREWAI_DISABLE_TELEMETRY | false | Set to true to disable Crew.AI telemetry
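As an example, running the sample client against an Azure OpenAI deployment instead of Bedrock could look roughly like the command below. The deployment name and API version are placeholders, and Azure credentials must be discoverable by DefaultAzureCredential, for instance via the standard AZURE_TENANT_ID, AZURE_CLIENT_ID, and AZURE_CLIENT_SECRET variables; this is a sketch, not a verified invocation.
docker run -it \
  -e MCP_SERVER_HOST=host.docker.internal \
  -e AZURE_DEPLOYMENT_MODEL=<your-deployment-name> \
  -e AZURE_API_VERSION=<api-version> \
  -e AZURE_TENANT_ID=<tenant-id> -e AZURE_CLIENT_ID=<client-id> -e AZURE_CLIENT_SECRET=<client-secret> \
  709825985650.dkr.ecr.us-east-1.amazonaws.com/synechron/mcp-text2sql:0.0.10-2025.07.04-rc3 client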
Tools
The Text2SQL MCP server dynamically creates MCP Tools and Prompts for the configured databases in the following format:
- Text2Sql_DatabaseName: Query a specific database using natural language
  - Parameters:
    - query: Natural language query to execute against the database
License
This project is licensed under the MIT License.