mcp-ncbi-blast

sian-davies/mcp-ncbi-blast


The NCBI BLAST MCP Server is a Gradio web application that performs DNA sequence similarity searches using NCBI's BLAST service, allowing users to identify organisms and genes from DNA sequences.

NCBI BLAST MCP Server

An MCP server and Gradio web application that performs DNA sequence similarity searches using NCBI's BLAST service. The user submits a DNA sequence and gets back the top ten matching sequences from the NCBI database, which makes it possible to identify the organism and gene.

This app functions as a Model Context Protocol (MCP) Server for integration with AI assistants.

Model Context Protocol (MCP) is an open protocol that standardizes how 
applications provide context to LLMs. MCP enables models to interact 
with the world. Learn more at modelcontextprotocol.io

Created for the Hugging Face Gradio Agents & MCP Hackathon 2025, Track 1 (mcp-server-track): Extend the capabilities of your favorite LLM by building a Gradio app to accomplish any specific task.

MCP/Gradio app on Hugging Face Spaces: Agents-MCP-Hackathon/ncbi-blast-mcp-server

Demo video with Cursor Desktop: https://youtu.be/yCpaTvcDeqM

Features

  • MCP Server for AI assistant integration
  • Web interface and programmatic API access
  • DNA sequence similarity search using NCBI BLAST
  • Submit DNA sequences in FASTA or raw format
  • Automatic sequence validation and cleaning
  • Returns top 10 hits/matches in JSON format

Quick Start in Cursor IDE

Clone or download this repository, then run the following in the Cursor terminal:

pip install -r requirements.txt
python app.py

Open the local URL printed at startup (http://localhost:7860) to access the web interface.
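
Because the app also acts as an MCP server, an MCP client such as Cursor needs the server's MCP endpoint in addition to the web URL. The sketch below only prints a client configuration entry. It assumes the app exposes Gradio's conventional MCP endpoint path (/gradio_api/mcp/sse), which this repository does not confirm, and the server name "ncbi-blast" is a made-up label.

```python
import json

# Assumption: the app runs locally on the URL given in the Quick Start.
BASE_URL = "http://localhost:7860"
# Assumption: Gradio's conventional MCP endpoint path; check the app's
# startup log for the actual path before relying on it.
MCP_PATH = "/gradio_api/mcp/sse"

# "ncbi-blast" is a hypothetical label; any name can be used.
config = {"mcpServers": {"ncbi-blast": {"url": BASE_URL + MCP_PATH}}}

# Print the entry so it can be pasted into the MCP client's settings file.
print(json.dumps(config, indent=2))
```

The printed JSON is meant for the MCP settings file of the client in use; the exact file location depends on the client.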

Web UI Usage

Input: Paste your DNA sequence in FASTA format or as raw nucleotides

>My sequence
AGTCTGNYRGWACGT

or just:

AGTCTGNYRGWACGT

Output: The top sequence matches are returned as JSON

Programmatic Usage

Query the running server programmatically with Python:

pip install gradio_client

Then, in Python:

from gradio_client import Client

# Query sequence, split across lines with implicit string concatenation
# so that no newlines or spaces end up inside the sequence itself.
seq = (
    "ATGGACACCTACTCCTCTGGAGAAGATTTAGTTATTAAGACACGAAAACCGTATACAATTACCAAGCAACGGGAACGATGGACAGAGGAGGAGCATAA"
    "TAGGTTTCTAGAAGCCTTAAAACTCTATGGGCGAGCGTGGCAACGTATCGAAGAACATATAGGAACCAAGACTGCTGTGCAGATCAGAAGTCATGCACA"
    "GAAATTCTTTACAAAGTTGGAGAAGGAAGCTCTTGTGAAAGGAGTTCCAATAAGACAAGCTATTGACATAGAGATTCCTCCTCCGCGCCCTAAAAGGAA"
    "ACCAAGCAATCCTTATCCTCGAAAGACTGGTGTGGCAACACCTAGTCTGCAGGTGGGAGCAAAGGATGGGAATAATTCATCATCAGTTTCTTCTTCCTG"
    "CACTGCCACTGGTAAACAAATACTGGACTTGGAAAGAGAACCACTACCTGAGAAACCTGATGGAGATGAAAAGCAAGAAAATGCCAAAGAAAACCAGGA"
    "TGAGGGAAATTTCTCTGAAGTTTTAACCCTTTTCCAAGAAGCTCCGTGTACGTCCTTGTCTTCAGTGGACAAAGATTCCATTCGAACACTGGCGGCACC"
)

client = Client("<spacename>/ncbi-blast-mcp-server")
result = client.predict(
    seq=seq,
    api_name="/blast_ui",
)

View the top result:

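import json

# The JSON string with the BLAST results is the second element of the result.
blast_data = json.loads(result[1])

if "error" not in blast_data:
    # Hits are ordered best match first.
    top_hit = blast_data["top_hits"][0]
    print(f"Organism: {top_hit['description']}")
    print(f"Identity: {top_hit['identity_percent']}%")
else:
    print(f"BLAST error: {blast_data['error']}")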
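
To look beyond the first match, the same parsed JSON can be summarized hit by hit. This is a minimal sketch, assuming every entry in top_hits carries the description and identity_percent keys used above. The helper name summarize_hits is hypothetical, and the sample payload is made up for illustration; it is not real BLAST output.

```python
import json


def summarize_hits(raw_json, limit=10):
    """Return one formatted line per hit, or a single error line."""
    data = json.loads(raw_json)
    if "error" in data:
        return [f"BLAST error: {data['error']}"]
    lines = []
    # Rank hits in the order the server returned them, starting at 1.
    for rank, hit in enumerate(data.get("top_hits", [])[:limit], start=1):
        lines.append(f"{rank:>2}. {hit['identity_percent']}%  {hit['description']}")
    return lines


# Made-up placeholder payload that mimics the assumed response shape.
sample = json.dumps({
    "top_hits": [
        {"description": "Example organism gene A", "identity_percent": 99.5},
        {"description": "Example organism gene B", "identity_percent": 97.0},
    ]
})

for line in summarize_hits(sample):
    print(line)
```

With a live server, passing result[1] in place of sample should give the same kind of listing.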

Input Requirements

  • Single sequence only (no multiple FASTA entries)
  • DNA sequences only (protein and RNA searches are not yet supported)
  • Maximum length: 3,000 base pairs
  • Supports IUPAC DNA codes including ambiguous bases
  • Automatically removes headers and whitespace
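
The requirements above can be checked locally before a sequence is submitted. The following is a sketch of one way to do that, not the server's actual validation code: the helper name clean_sequence, the error messages, and the exact set of accepted characters (the standard IUPAC nucleotide codes, without gap symbols) are assumptions.

```python
import re

MAX_LENGTH = 3000  # maximum length in base pairs, as listed above
# Standard IUPAC DNA codes: the four bases plus the ambiguity codes.
IUPAC_DNA = set("ACGTRYSWKMBDHVN")


def clean_sequence(text):
    """Hypothetical helper: return a cleaned, uppercase DNA string.

    Raises ValueError if the input breaks one of the listed requirements.
    """
    lines = [ln.strip() for ln in text.strip().splitlines()]
    headers = [ln for ln in lines if ln.startswith(">")]
    if len(headers) > 1:
        raise ValueError("only a single sequence is supported")
    # Drop the FASTA header (if any) and all whitespace, then uppercase.
    body = "".join(ln for ln in lines if not ln.startswith(">"))
    seq = re.sub(r"\s+", "", body).upper()
    if not seq:
        raise ValueError("empty sequence")
    invalid = sorted(set(seq) - IUPAC_DNA)
    if invalid:
        raise ValueError(f"invalid characters: {''.join(invalid)}")
    if len(seq) > MAX_LENGTH:
        raise ValueError(f"sequence longer than {MAX_LENGTH} bp")
    return seq


print(clean_sequence(">My sequence\nAGTCTGNYRGWACGT"))
```

Raising an exception on bad input keeps the sketch short; the server itself reports problems through an "error" key in its JSON output, as the check in the programmatic example suggests.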

Notes

  • Uses NCBI's public BLAST service
  • Results typically available within 30-180 seconds
  • Respects NCBI usage guidelines
  • Removes "PREDICTED:" prefixes from descriptions
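
The prefix removal mentioned in the last note can be illustrated with a small helper. This is a sketch under assumptions rather than the server's own code: the function name strip_predicted is hypothetical, and it assumes the prefix appears at the start of the description in upper case.

```python
import re

# Matches a leading "PREDICTED:" plus any whitespace that follows it.
_PREDICTED_PREFIX = re.compile(r"^\s*PREDICTED:\s*")


def strip_predicted(description):
    """Hypothetical helper: drop a leading 'PREDICTED:' from a hit description."""
    return _PREDICTED_PREFIX.sub("", description)


# Placeholder description, not taken from real BLAST output.
print(strip_predicted("PREDICTED: Example organism example gene, mRNA"))
```

Descriptions without the prefix pass through unchanged.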

Future Work

  • Enable multi-sequence inputs
  • Enable protein and RNA search functionality