nci-gdc-mcp-server

CSI-Genomics-and-Data-Analytics-Core/nci-gdc-mcp-server

GDC MCP Server

A Model Context Protocol (MCP) server for accessing the Genomic Data Commons (GDC) API with intelligent response size management.

Features

  • Response Size Management: 900K character limit with intelligent truncation
  • GraphQL First: Prioritizes GraphQL for better context window usage
  • Complete Tool Suite: 5 specialized tools for GDC data access
  • Production Ready: Handles large queries safely without MCP token overflow

Installation

pip install -r requirements.txt

Usage

Start the Server

python gdc_mcp_server.py

Available Tools

gdc_graphql_query PREFERRED

Execute GraphQL queries against the NCI GDC GraphQL API.

Recommended Limits:

  • Cases: ≤50 records
  • Files: ≤30 records
  • SSMs: ≤20 records
  • Genes: ≤100 records
  • Projects: ≤86 records

Parameters:

  • query (string, required): The GraphQL query string
  • variables (string, optional): JSON string of variables for the query

Example:

query ProjectsEdges($filters: FiltersArgument) {
  projects {
    hits(filters: $filters) {
      total
      edges { node { project_id primary_site disease_type } }
    }
  }
}
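
Because variables is passed as a JSON string rather than an object, it can help to build the tool arguments programmatically. The following is a minimal sketch of that step, not the server's own code; the filter field and value are illustrative assumptions, and the exact field path the GDC schema expects may differ.

```python
import json

# GraphQL query from the example above, with the filter passed as a variable.
QUERY = """
query ProjectsEdges($filters: FiltersArgument) {
  projects {
    hits(filters: $filters) {
      total
      edges { node { project_id primary_site disease_type } }
    }
  }
}
"""

# GDC-style filter object. The field path and value are assumptions
# chosen for illustration only.
filters = {
    "op": "in",
    "content": {"field": "projects.project_id", "value": ["TCGA-BRCA"]},
}

# The tool documents `variables` as a JSON string, so serialize the dict.
tool_arguments = {
    "query": QUERY,
    "variables": json.dumps({"filters": filters}),
}

print(tool_arguments["variables"])
```

Serializing with json.dumps avoids the quoting mistakes that are easy to make when writing nested JSON strings by hand.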

gdc_rest_query ⚠️ FALLBACK ONLY

Execute REST API queries. Use only when GraphQL isn't suitable.

Parameters:

  • endpoint (string, required): The REST endpoint (projects/cases/files/genes/ssms)
  • filters (string, optional): JSON filter string
  • fields (string, optional): Comma-separated or JSON array of fields
  • size (number, optional): Records to return (≤100 recommended, default: 10)

Example:

endpoint="projects"
fields=["project_id", "name", "disease_type", "primary_site"]
size=100
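
For orientation, the parameters above correspond roughly to a direct request against the public GDC REST API. The sketch below only builds the URL and makes no network call; build_rest_url is a hypothetical helper, and how the server itself translates these parameters is an assumption.

```python
import json
from urllib.parse import urlencode

GDC_API_BASE = "https://api.gdc.cancer.gov"  # public GDC REST API base URL


def build_rest_url(endpoint, fields=None, size=10, filters=None):
    """Build a GDC REST URL. Hypothetical helper, not part of the server."""
    params = {"size": size}
    if fields:
        # The GDC API accepts fields as a comma-separated list.
        params["fields"] = ",".join(fields)
    if filters:
        # Filters are sent as a JSON-encoded string.
        params["filters"] = json.dumps(filters)
    return f"{GDC_API_BASE}/{endpoint}?{urlencode(params)}"


url = build_rest_url(
    "projects",
    fields=["project_id", "name", "disease_type", "primary_site"],
    size=100,
)
print(url)
```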

gdc_build_filter 🔧 HELPER

Build GDC filter objects for use with GraphQL variables.

Parameters:

  • operator (string, required): Filter operator (in, =, >=, <=, and, or)
  • field (string, required): Field path (e.g., cases.project.project_id)
  • values (string, required): JSON string of values
  • combine_with (string, optional): JSON string of existing filter to combine with

Example:

operator="in"
field="cases.project.project_id"
values='["TCGA-BRCA", "TCGA-LUAD"]'
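
The filter object this produces presumably follows the standard GDC shape of an op plus a content block. A minimal sketch under that assumption is shown here; build_filter is a hypothetical stand-in, not the server's implementation, and joining with "and" when combine_with is given is also an assumption.

```python
import json


def build_filter(operator, field, values, combine_with=None):
    """Hypothetical sketch of a GDC filter builder (not the server's code)."""
    new_filter = {
        "op": operator,
        "content": {"field": field, "value": json.loads(values)},
    }
    if combine_with is None:
        return new_filter
    # Assumption: an existing filter is merged with a logical AND.
    existing = json.loads(combine_with)
    return {"op": "and", "content": [existing, new_filter]}


project_filter = build_filter(
    "in", "cases.project.project_id", '["TCGA-BRCA", "TCGA-LUAD"]'
)
combined = build_filter(
    "=", "data_type", '"Gene Expression Quantification"',
    combine_with=json.dumps(project_filter),
)
print(json.dumps(combined, indent=2))
```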

gdc_schema_introspection 🔍 DISCOVERY

Get GDC GraphQL schema information to understand available fields and types.

Parameters:

  • type_name (string, optional): Specific type to introspect (e.g., "Case", "File")

Example:

type_name="File"  # Get File type schema
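
Under the hood, a lookup like this can be expressed with GraphQL's standard __type introspection field. The sketch below builds such a query string; whether the server issues exactly this query is an assumption, and introspection_query is a hypothetical name.

```python
import json


def introspection_query(type_name):
    """Build a standard GraphQL __type introspection query (sketch only)."""
    # json.dumps yields a double-quoted, escaped string literal.
    quoted = json.dumps(type_name)
    return (
        "{ __type(name: %s) { name kind "
        "fields { name type { name kind } } } }" % quoted
    )


query = introspection_query("File")
print(query)
```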

gdc_quick_count 📊 CONVENIENCE

Quick count tool for common GDC entities with built-in filter support.

Parameters:

  • entity_type (string, required): Type of entity to count (files, cases, projects)
  • filters (string, optional): JSON string of filters

Example:

entity_type="files"
filters='{"op": "=", "content": {"field": "data_type", "value": "Gene Expression Quantification"}}'

Response Size Management

Problem: Responses to large queries can exceed the roughly 1 MB MCP response size limit, which causes the call to fail.

Solution:

  • 900K character limit with smart truncation
  • Pre-query warnings for large parameters
  • Preserves metadata while limiting response size
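
One way to picture this behavior is a truncation step that drops records until the serialized response fits, while keeping top-level metadata. This is a hypothetical sketch, not the server's actual logic; the "hits" key, the added "truncated" and "original_hit_count" fields, and the function name are all assumptions.

```python
import json

MAX_CHARS = 900_000  # character limit stated in this README


def truncate_response(payload, max_chars=MAX_CHARS):
    """Drop trailing records until the serialized payload fits.

    Assumes records live in payload["hits"] and that every other
    top-level key is metadata to preserve.
    """
    if len(json.dumps(payload)) <= max_chars:
        return payload
    hits = list(payload.get("hits", []))  # copy so the input is not mutated
    result = dict(payload, truncated=True, original_hit_count=len(hits))
    while hits:
        hits.pop()  # remove the last record and re-check the size
        result["hits"] = hits
        if len(json.dumps(result)) <= max_chars:
            break
    return result


sample = {"total": 3, "hits": ["a" * 50, "b" * 50, "c" * 50]}
trimmed = truncate_response(sample, max_chars=150)
print(len(trimmed["hits"]), trimmed["truncated"])
```

Removing one record at a time keeps the sketch simple; a binary search over the record count would reduce the number of serialization passes for very large responses.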

Support

For API-specific questions, refer to: