dnaerys/onekgpd-mcp
The 1000 Genomes Project dataset MCP Server provides natural language access to a comprehensive genomic dataset hosted online in the Dnaerys variant store.
1000 Genomes Project Dataset MCP Server
Natural language access to the 1000 Genomes Project dataset, hosted online in the Dnaerys variant store
The dataset was sequenced and aligned to GRCh38 by the New York Genome Center
- 2504 unrelated samples from the phase three panel
- additional 698 samples from 602 family trios
- 3202 samples total (1598 males, 1604 females)
- dataset details
Key Features
- real-time access to 138 044 724 unique variants and about 442 billion individual genotypes in 3202 samples
- variant, sample, and genotype selection based on coordinates, annotations, zygosity
- filtering by VEP, ClinVar, gnomAD AF and AlphaMissense annotations
- filtering by inheritance model (de novo, heterozygous dominant, homozygous recessive)
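The inheritance-model filters can be read in terms of trio genotypes. The following is a minimal sketch, assuming biallelic sites and genotypes encoded as alternate-allele counts (0, 1, 2). The class, enum, and rules are hypothetical simplifications for illustration and do not describe how the Dnaerys variant store implements these filters.

```java
import java.util.Optional;

// Illustrative sketch only: the class, enum, and genotype encoding are
// assumptions made for this example, not part of the onekgpd-mcp API.
final class TrioInheritance {

    enum Model { DE_NOVO, HETEROZYGOUS_DOMINANT, HOMOZYGOUS_RECESSIVE }

    /**
     * Classifies one biallelic site in a child/mother/father trio.
     * Each genotype is an alternate-allele count:
     * 0 (homozygous reference), 1 (heterozygous), 2 (homozygous alternate).
     */
    static Optional<Model> classify(int child, int mother, int father) {
        for (int g : new int[] {child, mother, father}) {
            if (g < 0 || g > 2) {
                throw new IllegalArgumentException("genotype must be 0, 1 or 2: " + g);
            }
        }
        // De novo: the child carries the alternate allele, neither parent does.
        if (child >= 1 && mother == 0 && father == 0) {
            return Optional.of(Model.DE_NOVO);
        }
        // Heterozygous dominant (simplified): heterozygous child, exactly one
        // heterozygous parent, the other parent homozygous reference.
        if (child == 1 && mother + father == 1) {
            return Optional.of(Model.HETEROZYGOUS_DOMINANT);
        }
        // Homozygous recessive: homozygous-alternate child, both parents carriers.
        if (child == 2 && mother == 1 && father == 1) {
            return Optional.of(Model.HOMOZYGOUS_RECESSIVE);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(classify(1, 0, 0)); // Optional[DE_NOVO]
        System.out.println(classify(1, 1, 0)); // Optional[HETEROZYGOUS_DOMINANT]
        System.out.println(classify(2, 1, 1)); // Optional[HOMOZYGOUS_RECESSIVE]
        System.out.println(classify(0, 1, 1)); // Optional.empty
    }
}
```

Real inheritance analysis also has to account for phenotype status, sex chromosomes, and genotype quality, none of which this sketch covers.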
Deployments
Remote MCP service is available online via Streamable HTTP: https://db.dnaerys.org:443/mcp
For local builds with stdio and http transports, see the Installation section below.
Architecture
The MCP server is implemented as a Java EE service that accesses the 1KGP dataset via gRPC calls to the public Dnaerys variant store service.
- service implementation is based on Quarkus MCP Server
- provides MCP over Streamable HTTP, HTTP/SSE and STDIO transports
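To make the Streamable HTTP transport more concrete, here is a minimal sketch of a hand-rolled MCP `initialize` request using the JDK's built-in HTTP client. It assumes a locally running server on `http://localhost:9000/mcp`; the class name, the `clientInfo` values, and the chosen protocol version string are placeholders for this example, and a real MCP client library would normally handle this handshake.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Sketch of a manual MCP "initialize" call over Streamable HTTP.
// Class name and clientInfo values are assumptions made for illustration.
final class McpInitializeProbe {

    // JSON-RPC 2.0 payload for the MCP initialize handshake.
    static final String INITIALIZE_BODY = """
            {"jsonrpc":"2.0","id":1,"method":"initialize","params":{
              "protocolVersion":"2025-03-26",
              "capabilities":{},
              "clientInfo":{"name":"example-probe","version":"0.0.1"}}}
            """;

    static HttpRequest buildInitializeRequest(String endpoint) {
        return HttpRequest.newBuilder(URI.create(endpoint))
                .timeout(Duration.ofSeconds(30))
                .header("Content-Type", "application/json")
                // The server may answer with plain JSON or with an SSE stream.
                .header("Accept", "application/json, text/event-stream")
                .POST(HttpRequest.BodyPublishers.ofString(INITIALIZE_BODY))
                .build();
    }

    public static void main(String[] args) throws Exception {
        String endpoint = args.length > 0 ? args[0] : "http://localhost:9000/mcp";
        // Reads the whole body as text; if the server replies with an SSE
        // stream, this returns once the server closes that stream.
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(buildInitializeRequest(endpoint), HttpResponse.BodyHandlers.ofString());
        System.out.println("HTTP status: " + response.statusCode());
        System.out.println(response.body());
    }
}
```

Passing a different URL as the first argument points the same probe at another deployment.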
Examples
Regulatory Variant Impact on Known Disease Genes
Identify healthy individuals in the KGP dataset carrying ClinVar-validated pathogenic variants in SCN5A, KCNH2, or LDLR. For each carrier, conduct a cis-regulatory scan within a 70kb window to identify variants in high linkage disequilibrium that are statistically associated with Haplotype-Specific Expression (HSE) or Allelic Imbalance. Analyze if these secondary variants disrupt Transcription Factor Binding Sites (TFBS) in proximal promoters, alter uORF-mediated translation kinetics, or modify mRNA stability motifs in the 3'UTR. Evaluate the hypothesis that resilience is achieved through a "Transcriptional Damping" mechanism, where the pathogenic allele is preferentially silenced or the wild-type allele is hyper-activated, ensuring the total pool of functional protein remains above the critical phenotypic threshold.
- Building on the SCN5A damping findings, the next phase should investigate if resilience in 150 heterozygous carriers is also mediated by trans-acting chaperones that stabilize the functional 50% of sodium channels. Task Statement: Identify healthy SCN5A pathogenic carriers from the previous "Transcriptional Damping" cohort and perform a genome-wide scan for Trans-acting Proteostatic Modifiers. Search for enrichment of gain-of-function variants or high-expression eQTLs in cardiac-specific ion channel chaperones and trafficking regulators, specifically SNTA1 (Syntrophin-alpha 1), GPD1L, and RANGRF (MOG1).
- similar study for KCNH2 and LDLR
Metabolic Pathway Redundancy
Cellular redox homeostasis is maintained by two parallel antioxidant systems: the glutathione system and the thioredoxin system. Complete loss of either GSR or TXNRD1 is incompatible with mammalian development, yet population databases contain individuals carrying variants predicted to impair enzyme function. Identify clusters of individuals in the KGP cohort who carry multiple 'Moderate' impact VEP variants across both systems. Reasoning through the AlphaMissense structural implications, can you detect a 'balancing act' where a loss of efficiency in Glutathione reductase is consistently paired with high-confidence benign or potentially activating variants in the Thioredoxin system? Synthesize a model of 'Redox Robustness' based on the co-occurrence of these variants across the cohort.
Macromolecular structural complexes
The human RNA Exosome (Exo-9 core) is a "dead machine" that acts as a scaffold. In lower organisms the ring itself can degrade RNA. In humans, the 9-subunit ring has lost all its catalytic teeth and is purely a structural tunnel that guides RNA into the catalytic subunits (DIS3 or EXOSC10) attached at the bottom. Since RNA is a highly negatively charged polymer, the residues lining this pore are typically positively charged (Lysine, Arginine), but not too "sticky" or RNA will jam. So, to reach the "shredder" at the bottom it must slide through a narrow pore formed by the Exo-9 ring.
The task: analyse all missense variants in the healthy KGP cohort that map to the internal pore-lining residues of the Exo-9 ring. Look for 'charge-swap' variants where a positive residue (K, R) is replaced by a negative one (D, E). If an individual is healthy despite having a 'negative patch' in the tunnel that should repel RNA, do they carry a compensatory variant in the cap subunits (EXOSC1, 2, 3) that widens the entrance? Use a 3D electrostatic surface map to determine if the 'healthy' cohort maintains a specific electrostatic gradient.
Structural Intolerance
In which cardiac-related genes, e.g. ion channels, do variants in the KGP dataset near catalytic residues or ligand-binding pockets show strong depletion compared to flanking residues (±20 amino acids)?
Available Tools
Description for 30 tools and parameters can be found
Installation
Project can be run locally with MCP over stdio and/or http transports
Option A - build & run locally
- build the project and package it as a single über-jar:
  - jar is located in target/onekgpd-mcp-runner.jar and includes all dependencies
./mvnw package -DskipTests -Dquarkus.package.jar.type=uber-jar
- run it locally with dev profile
  - both stdio and http transports are enabled
  - http transport is on quarkus
  - project expects JRE 21 to be available at runtime
java -Dquarkus.profile=dev -jar <full path>/onekgpd-mcp-runner.jar
Option B - build & run in docker
- in order to run in docker, stdio transport needs to be disabled to prevent application from stopping itself due to closed stdio in containers
  - it's already configured in prod profile
  - it's the default configuration overall
- build with prod profile
docker build -f Dockerfile -t onekgpd-mcp .
- run as you prefer, e.g.
docker run -p 9000:9000 --name onekgpd-mcp --rm onekgpd-mcp
Option C - pull from Docker Hub
- pull prebuilt image; stdio transport disabled, http transport on port 9000
docker pull dnaerys/onekgpd-mcp:latest
- run
docker run -p 9000:9000 --name onekgpd-mcp --rm dnaerys/onekgpd-mcp:latest
Connecting with MCP clients
- to connect via http transport, remote or local, simply direct the client to an appropriate destination, e.g. http://localhost:9000/mcp or https://db.dnaerys.org:443/mcp
- to connect via stdio transport, MCP client should start application with dev profile and with a full path to the jar file
  - e.g. for Claude Desktop and stdio transport add to claude_desktop_config.json:
{
  "mcpServers": {
    "OneKGPd": {
      "command": "java",
      "args": ["-Dquarkus.profile=dev", "-jar", "/full/path/onekgpd-mcp-runner.jar"]
    }
  }
}
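For readers curious what a client does with such a stdio entry, the following is a rough sketch: it starts the jar with the same command and arguments, writes one newline-delimited JSON-RPC `initialize` message to the process's stdin, and prints the first line read from its stdout. The class name and `clientInfo` values are hypothetical, and the sketch assumes the server writes only protocol messages to stdout; an actual MCP client does considerably more.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Sketch of how a client might use the stdio configuration; names are illustrative.
final class StdioLaunchSketch {

    // Mirrors "command" and "args" of the "OneKGPd" entry in the JSON config.
    static List<String> launchCommand(String jarPath) {
        return List.of("java", "-Dquarkus.profile=dev", "-jar", jarPath);
    }

    public static void main(String[] args) throws Exception {
        String jarPath = args.length > 0 ? args[0] : "/full/path/onekgpd-mcp-runner.jar";
        Process server = new ProcessBuilder(launchCommand(jarPath))
                .redirectError(ProcessBuilder.Redirect.INHERIT) // keep server stderr visible
                .start();
        try (BufferedWriter toServer = new BufferedWriter(
                     new OutputStreamWriter(server.getOutputStream(), StandardCharsets.UTF_8));
             BufferedReader fromServer = new BufferedReader(
                     new InputStreamReader(server.getInputStream(), StandardCharsets.UTF_8))) {
            // stdio transport: one JSON-RPC message per line, no embedded newlines.
            toServer.write("{\"jsonrpc\":\"2.0\",\"id\":1,\"method\":\"initialize\",\"params\":{"
                    + "\"protocolVersion\":\"2025-03-26\",\"capabilities\":{},"
                    + "\"clientInfo\":{\"name\":\"example-probe\",\"version\":\"0.0.1\"}}}");
            toServer.write("\n");
            toServer.flush();
            // Blocks until the server writes its first line to stdout.
            System.out.println(fromServer.readLine());
        } finally {
            server.destroy();
        }
    }
}
```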
Verification
How many variants exist in the 1000 Genomes Project?
License
This project is licensed under the Apache License 2.0 - see the file for details.