CSI-Genomics-and-Data-Analytics-Core/nci-gdc-mcp-server
GDC MCP Server
A Model Context Protocol (MCP) server for accessing the Genomic Data Commons (GDC) API with intelligent response size management.
Features
- Response Size Management: 900K character limit with intelligent truncation
- GraphQL First: Prioritizes GraphQL for better context window usage
- Complete Tool Suite: 5 specialized tools for GDC data access
- Production Ready: Handles large queries safely without MCP token overflow
Installation
pip install -r requirements.txt
Usage
Start the Server
python gdc_mcp_server.py
Available Tools
gdc_graphql_query ⭐ PREFERRED
Execute GraphQL queries against the NCI GDC GraphQL API.
Recommended Limits:
- Cases: ≤50 records
- Files: ≤30 records
- SSMs: ≤20 records
- Genes: ≤100 records
- Projects: ≤86 records
Parameters:
- query (string, required): The GraphQL query string
- variables (string, optional): JSON string of variables for the query
Example:
query ProjectsEdges($filters: FiltersArgument) {
  projects {
    hits(filters: $filters) {
      total
      edges { node { project_id primary_site disease_type } }
    }
  }
}
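To show how the two parameters fit together, here is a minimal sketch of packaging a query and its variables as tool arguments. The helper `make_graphql_arguments` is hypothetical and not part of the server. The filter field path `projects.project_id` is an illustrative assumption; use `gdc_schema_introspection` to confirm field names.

```python
import json
from typing import Optional

# Hypothetical helper (not part of the server): builds the argument dict
# described in the parameter list, with variables serialized as a JSON string.
def make_graphql_arguments(query: str, variables: Optional[dict] = None) -> dict:
    arguments = {"query": query}
    if variables is not None:
        arguments["variables"] = json.dumps(variables)
    return arguments

PROJECTS_QUERY = """
query ProjectsEdges($filters: FiltersArgument) {
  projects {
    hits(filters: $filters) {
      total
      edges { node { project_id primary_site disease_type } }
    }
  }
}
"""

# Assumed field path, for illustration only.
project_filter = {
    "op": "in",
    "content": {"field": "projects.project_id", "value": ["TCGA-BRCA", "TCGA-LUAD"]},
}

args = make_graphql_arguments(PROJECTS_QUERY, {"filters": project_filter})
print(args["variables"])
```

Keeping the filter in `variables` rather than inlining it in the query text means the same query string can be reused with different filters.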
gdc_rest_query ⚠️ FALLBACK ONLY
Execute REST API queries. Use only when GraphQL isn't suitable.
Parameters:
- endpoint (string, required): The REST endpoint (projects/cases/files/genes/ssms)
- filters (string, optional): JSON filter string
- fields (string, optional): Comma-separated or JSON array of fields
- size (number, optional): Records to return (≤100 recommended, default: 10)
Example:
endpoint="projects"
fields=["project_id", "name", "disease_type", "primary_site"]
size=100
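For orientation, the example above corresponds roughly to a GDC REST request like the one built below. This is a sketch under assumptions: `build_rest_url` is a hypothetical helper, and the server's actual request construction may differ. The base URL is the public GDC API host.

```python
import json
from urllib.parse import urlencode

GDC_API_BASE = "https://api.gdc.cancer.gov"

# Hypothetical helper: turns the tool parameters into a GDC REST URL.
def build_rest_url(endpoint, fields=None, filters=None, size=10):
    params = {"size": size}
    if fields:
        # The GDC REST API takes fields as a comma-separated list.
        params["fields"] = ",".join(fields)
    if filters:
        # Filters are sent as a JSON-encoded string.
        params["filters"] = json.dumps(filters)
    return f"{GDC_API_BASE}/{endpoint}?{urlencode(params)}"

url = build_rest_url(
    "projects",
    fields=["project_id", "name", "disease_type", "primary_site"],
    size=100,
)
print(url)
```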
gdc_build_filter 🔧 HELPER
Build GDC filter objects for use with GraphQL variables.
Parameters:
- operator (string, required): Filter operator (in, =, >=, <=, and, or)
- field (string, required): Field path (e.g., cases.project.project_id)
- values (string, required): JSON string of values
- combine_with (string, optional): JSON string of existing filter to combine with
Example:
operator="in"
field="cases.project.project_id"
values='["TCGA-BRCA", "TCGA-LUAD"]'
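The filter object this tool produces follows the GDC filter format, in which each condition is an `op` plus a `content` block. The sketch below shows one way such a builder could work. It is not the server's implementation, and combining with `and` when `combine_with` is given is an assumption.

```python
import json

# Illustrative builder: parameter names mirror the tool, behavior is assumed.
def build_filter(operator, field, values, combine_with=None):
    new_filter = {
        "op": operator,
        "content": {"field": field, "value": json.loads(values)},
    }
    if combine_with is None:
        return new_filter
    # Assumption: an existing filter is joined to the new one with "and".
    existing = json.loads(combine_with)
    return {"op": "and", "content": [existing, new_filter]}

project_filter = build_filter(
    "in", "cases.project.project_id", '["TCGA-BRCA", "TCGA-LUAD"]'
)
combined = build_filter(
    "=", "cases.primary_site", '"Breast"',
    combine_with=json.dumps(project_filter),
)
print(json.dumps(combined, indent=2))
```

The resulting dictionary can be passed as the `filters` variable of a GraphQL query.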
gdc_schema_introspection 🔍 DISCOVERY
Get GDC GraphQL schema information to understand available fields and types.
Parameters:
- type_name (string, optional): Specific type to introspect (e.g., "Case", "File")
Example:
type_name="File" # Get File type schema
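Schema discovery of this kind is typically done with standard GraphQL introspection. The sketch below builds such a request using the `__type` and `__schema` fields from the GraphQL specification. The function `introspection_request` is hypothetical, and whether the server issues exactly these queries is an assumption.

```python
# Standard GraphQL introspection queries; the selected subfields are a choice
# made for this sketch.
TYPE_QUERY = """
query IntrospectType($name: String!) {
  __type(name: $name) {
    name
    kind
    fields { name type { name kind } }
  }
}
"""

SCHEMA_QUERY = """
query ListTypes {
  __schema { types { name kind } }
}
"""

# Hypothetical helper: picks the query based on whether a type name is given.
def introspection_request(type_name=None):
    if type_name:
        return {"query": TYPE_QUERY, "variables": {"name": type_name}}
    return {"query": SCHEMA_QUERY}

request = introspection_request("File")
print(request["variables"])
```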
gdc_quick_count 📊 CONVENIENCE
Quick count tool for common GDC entities with built-in filter support.
Parameters:
- entity_type (string, required): Type of entity to count (files, cases, projects)
- filters (string, optional): JSON string of filters
Example:
entity_type="files"
filters='{"op": "=", "content": {"field": "data_type", "value": "Gene Expression Quantification"}}'
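One plausible way to obtain such a count is to read the `pagination.total` value that GDC REST responses include, without fetching any records. The sketch below assumes that approach. Both helpers are hypothetical, `size=0` is assumed to be accepted by the endpoint, and the sample response uses a placeholder total, not a real GDC count.

```python
import json

# Hypothetical helper: request parameters for a count-only query.
def count_request_params(filters=None):
    params = {"size": 0}  # assumption: no hits are needed, only pagination
    if filters:
        json.loads(filters)  # raises ValueError early if the JSON is invalid
        params["filters"] = filters
    return params

# Hypothetical helper: reads the total from a GDC REST response body.
def extract_total(response):
    return response["data"]["pagination"]["total"]

params = count_request_params(
    '{"op": "=", "content": {"field": "data_type", '
    '"value": "Gene Expression Quantification"}}'
)

# Placeholder response with the general shape of a GDC REST reply.
sample_response = {"data": {"hits": [], "pagination": {"total": 1234}}}
print(extract_total(sample_response))
```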
Response Size Management
Problem: Responses to large queries can exceed MCP's 1MB response size limit, causing the call to fail.
Solution:
- 900K character limit with smart truncation
- Pre-query warnings for large parameters
- Preserves metadata while limiting response size
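The truncation idea can be sketched as follows. This is a simplified illustration, not the server's code: the function name, the `hits` key, and the added `truncated` bookkeeping fields are assumptions. Only the 900K character limit comes from the description above.

```python
import json

MAX_CHARS = 900_000  # character limit stated above

# Illustrative truncation: keep top-level metadata, drop trailing hits
# until the serialized response fits within max_chars.
def truncate_response(payload, max_chars=MAX_CHARS):
    text = json.dumps(payload)
    if len(text) <= max_chars:
        return text
    hits = list(payload.get("hits", []))
    meta = {k: v for k, v in payload.items() if k != "hits"}
    original_count = len(hits)
    while hits:
        hits.pop()
        trimmed = dict(
            meta,
            hits=hits,
            truncated=True,
            original_hit_count=original_count,
            returned_hit_count=len(hits),
        )
        text = json.dumps(trimmed)
        if len(text) <= max_chars:
            break
    # If the metadata alone exceeds the limit, this sketch returns it as-is.
    return text

payload = {"total": 3, "hits": [{"id": "a" * 50}, {"id": "b" * 50}, {"id": "c" * 50}]}
result = json.loads(truncate_response(payload, max_chars=200))
print(result["truncated"], result["returned_hit_count"])
```

Re-serializing after every dropped record is simple but slow for very large responses. A real implementation would more likely estimate the cut point or trim in larger steps.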
Support
For API-specific questions, refer to: