neuralx-dev/biotools-mcp-server
If you are the rightful owner of biotools-mcp-server and would like to certify it and/or have it hosted online, please leave a comment on the right or send an email to dayong@mcphub.com.
The Biotools MCP Server is a Model Context Protocol server that provides access to biotools and scientific literature through PubMed APIs, enabling AI applications to efficiently search and manage scientific publications and research data.
Biotools MCP Server
A comprehensive Model Context Protocol (MCP) server for bioinformatics research, providing AI applications with access to major biological databases and analysis tools including PubMed, UniProt, NCBI GenBank, KEGG, PDB, and more.
🧬 Available Tools (37 Total)
📚 Literature Research Tools (3 tools)
search_pubmed
Search PubMed for scientific publications with advanced filtering capabilities.
- Purpose: Find relevant research papers and studies
- Input: Search terms (e.g., "CRISPR gene editing", "cancer genomics")
- Returns: Publication metadata, abstracts, authors, and bibliographic information
get_publication_details
Retrieve comprehensive details for a specific publication by PMID.
- Purpose: Get complete bibliographic record with all metadata
- Input: PubMed ID (PMID)
- Returns: Full MEDLINE record including authors, affiliations, funding sources, MeSH terms, citations, and publication history
get_publication_abstract
Extract the full abstract for a specific publication.
- Purpose: Get structured abstract content with section labels
- Input: PubMed ID (PMID)
- Returns: Complete abstract text with metadata
🧬 Protein Analysis Tools (3 tools)
search_uniprot
Search the UniProtKB database for proteins with comprehensive field extraction.
- Purpose: Find proteins by name, function, organism, or other criteria
- Input: Search query (e.g., "insulin", "kinase AND human", "P04637")
- Returns: Protein entries with 90+ comprehensive fields including function, domains, and cross-references
get_protein_entry
Get detailed information for a specific protein by UniProt accession.
- Purpose: Retrieve complete protein annotation and functional data
- Input: UniProt accession number (e.g., "P38398")
- Returns: Exhaustive protein data including domains, PTMs, variants, tissue specificity, disease associations, and cross-references to 80+ databases
get_protein_sequence
Retrieve protein sequence in FASTA format with structural context.
- Purpose: Get amino acid sequence with feature annotations
- Input: UniProt accession number
- Returns: FASTA sequence with complete metadata and structural features
🧬 Nucleotide Sequence Analysis Tools (4 tools)
get_nucleotide_sequence
Retrieve nucleotide sequences from GenBank, RefSeq, or Ensembl databases.
- Purpose: Get DNA/RNA sequences with complete annotation
- Input: Accession number (e.g., "NM_000546") and database preference
- Returns: Complete sequence records with features, references, and comprehensive annotation
compare_annotations
Compare genomic annotations between prokaryotic and eukaryotic sequences.
- Purpose: Analyze differences in gene structure and annotation between organism types
- Input: Two sequence identifiers for comparison
- Returns: Detailed comparison of features, similarities, and differences with biological insights
find_intron_exons
Detect intron-exon boundaries in gene sequences with splice site analysis.
- Purpose: Analyze gene structure and identify coding regions
- Input: Gene sequence identifier with optional organism context
- Returns: Complete gene structure with exons, introns, splice sites, and coding sequence analysis
align_promoters
Align promoter regions from multiple genes to discover conserved regulatory elements.
- Purpose: Find conserved motifs and regulatory sequences in gene promoters
- Input: List of 2-10 gene identifiers
- Returns: Promoter alignment with conserved elements and regulatory motif analysis
🧪 Enhanced Protein Analysis Tools (3 tools)
get_cross_references
Get comprehensive cross-references for a protein from multiple databases.
- Purpose: Find related information across KEGG, Pfam, PDB, InterPro, GO, and other databases
- Input: UniProt accession number
- Returns: Complete cross-references from 80+ databases including structures, domains, pathways, and functional annotations
analyze_ptms
Analyze post-translational modifications with functional impact assessment.
- Purpose: Identify and analyze protein modifications and their biological significance
- Input: UniProt accession number with optional PTM type filters
- Returns: Complete PTM analysis with functional impact predictions and confidence scoring
get_pathway_data
Get detailed pathway information and metabolic context for a protein.
- Purpose: Understand protein function in biological pathways and networks
- Input: UniProt accession number with database preference (KEGG, Reactome, etc.)
- Returns: Complete pathway networks with reactions, modules, and related proteins
🧬 DNA Analysis Tools (4 tools)
analyze_gc_content
Calculate GC percentage and nucleotide composition of a DNA sequence.
- Purpose: Analyze sequence composition and identify compositional bias
- Input: DNA sequence string
- Returns: GC content, AT content, nucleotide counts, skew analysis, and sequence characteristics
find_restriction_sites
Identify restriction enzyme cut sites in DNA sequence using REBASE database motifs.
- Purpose: Find restriction enzyme recognition sites for cloning and molecular biology
- Input: DNA sequence and optional enzyme list
- Returns: Restriction sites by enzyme, fragment analysis, and cutting pattern visualization
predict_orfs
Scan all 6 reading frames for start/stop codons to detect open reading frames (ORFs).
- Purpose: Identify potential protein-coding regions in DNA sequences
- Input: DNA sequence with minimum length threshold
- Returns: ORF locations by reading frame, amino acid sequences, and statistical analysis
assemble_fragments
Assemble short DNA sequences into one using overlap-based merging.
- Purpose: Reconstruct longer sequences from overlapping fragments
- Input: Array of DNA fragments with optional overlap parameters
- Returns: Assembled sequence, overlap analysis, and assembly statistics
🧬 Protein Sequence Tools (3 tools)
predict_protein_properties
Predict molecular weight, isoelectric point, amino acid composition, and other physicochemical properties.
- Purpose: Calculate protein physical and chemical characteristics
- Input: Protein sequence (amino acid string)
- Returns: Molecular weight, pI, hydropathy, instability index, amino acid composition, and stability predictions
predict_transmembrane_regions
Identify transmembrane helices using hydropathy analysis and TMHMM-like algorithms.
- Purpose: Predict membrane-spanning regions and protein topology
- Input: Protein sequence with analysis parameters
- Returns: Transmembrane helices, topology predictions, signal peptides, and localization analysis
scan_protein_motifs
Detect functional motifs and domains using PROSITE patterns and other databases.
- Purpose: Find functional sites and regulatory elements in proteins
- Input: Protein sequence with database preference
- Returns: Functional motifs, phosphorylation sites, glycosylation sites, and regulatory predictions
🔍 Sequence Similarity Tools (5 tools)
blast_search
Run BLAST (nucleotide or protein) search against NCBI databases to find similar sequences.
- Purpose: Find similar sequences and identify homologs
- Input: Query sequence with database and program selection
- Returns: BLAST hits with alignments, E-values, bit scores, and statistical analysis
psi_blast_search
Run PSI-BLAST for deeper homology detection using iterative profile construction.
- Purpose: Detect distant homologs through profile-based searching
- Input: Protein sequence with iteration parameters
- Returns: Profile-enhanced hits, iteration summary, and remote homolog detection
align_sequences_global
Perform Needleman-Wunsch global alignment to compare two sequences end-to-end.
- Purpose: Align entire sequences to compare overall similarity
- Input: Two sequences with scoring parameters
- Returns: Global alignment with identity, similarity, gaps, and quality assessment
align_sequences_local
Perform Smith-Waterman local alignment to find the best local similarity between sequences.
- Purpose: Find regions of local similarity between sequences
- Input: Two sequences with scoring parameters
- Returns: Local alignment with similarity regions and coverage analysis
generate_dotplot
Generate dot plot visualization for pairwise sequence comparison to identify similarity patterns.
- Purpose: Visualize sequence similarity patterns and detect rearrangements
- Input: Two sequences with window and threshold parameters
- Returns: Dot plot coordinates, similarity regions, and pattern analysis
🧬 Multiple Alignment Tools (4 tools)
multiple_sequence_alignment
Align 2-20 protein or nucleotide sequences using progressive alignment algorithms.
- Purpose: Create multiple sequence alignments for comparative analysis
- Input: Array of sequences with alignment parameters
- Returns: Multiple alignment with conservation analysis and quality metrics
highlight_conserved_regions
Find and analyze conserved regions in a multiple sequence alignment.
- Purpose: Identify functionally important conserved regions
- Input: Aligned sequences with conservation thresholds
- Returns: Conserved regions, consensus sequences, and functional predictions
generate_sequence_logo
Create sequence logo data from multiple alignment to visualize conservation patterns.
- Purpose: Generate conservation logos for motif visualization
- Input: Aligned sequences with information content thresholds
- Returns: Logo data with information content, residue frequencies, and motif analysis
export_alignment
Export multiple sequence alignment in various formats (FASTA, PHYLIP, Clustal, MSF).
- Purpose: Convert alignments to different formats for external tools
- Input: Aligned sequences with format selection
- Returns: Formatted alignment file with usage instructions
🏗️ Structure & RNA Tools (4 tools)
get_protein_structure
Retrieve 3D protein structure data from PDB database with comprehensive metadata.
- Purpose: Get experimental protein structure information
- Input: PDB ID (4-character code)
- Returns: Structure metadata, chain information, ligands, resolution, and experimental details
analyze_secondary_structure
Analyze protein secondary structure from PDB data or predict from sequence.
- Purpose: Determine protein secondary structure elements
- Input: Protein sequence with optional PDB structure
- Returns: Secondary structure composition, helices, sheets, turns, and structural classification
predict_rna_secondary_structure
Predict RNA secondary structure using thermodynamic algorithms.
- Purpose: Predict RNA folding and stability
- Input: RNA sequence
- Returns: Secondary structure, base pairs, loops, stems, and thermodynamic analysis
scan_rna_motifs
Identify functional RNA motifs and regulatory elements in sequence.
- Purpose: Find functional RNA elements and regulatory sites
- Input: RNA sequence with structure context option
- Returns: Regulatory motifs, structural elements, and functional predictions
🌳 Phylogenetics Tools (2 tools)
build_phylogenetic_tree
Build phylogenetic tree from multiple sequences using Neighbor-Joining, UPGMA, or Maximum Parsimony.
- Purpose: Construct evolutionary trees from sequence data
- Input: 3-50 sequences with method selection and bootstrap options
- Returns: Phylogenetic tree in Newick format with branch lengths and support values
compare_phylogenetic_trees
Compare two phylogenetic trees using Robinson-Foulds distance and other metrics.
- Purpose: Assess topological differences between trees
- Input: Two sets of sequences for tree comparison
- Returns: Tree comparison metrics, topological differences, and statistical assessment
📊 Documentation Tools (2 tools)
log_analysis_parameters
Record workflow parameters, data, and results for reproducibility and tracking.
- Purpose: Document analysis workflows for reproducibility
- Input: Tool name, parameters, input data, and results
- Returns: Structured log entry with performance metrics and metadata
generate_resource_map
Create comprehensive guide of bioinformatics databases, tools, and workflow recommendations.
- Purpose: Generate personalized resource guides and workflow recommendations
- Input: Focus areas and analysis history
- Returns: Curated database list, algorithm recommendations, workflow guides, and citations
🚀 Quick Start
Prerequisites
- Node.js 18+
- npm or yarn
Installation
git clone <repository-url>
cd biotools-mcp-server
npm install
npm run build
npm start
Testing
npm run inspect
📖 Usage Example
// Search for BRCA1-related publications
{
"tool": "search_pubmed",
"arguments": {
"term": "BRCA1 mutations breast cancer",
"max_results": 10
}
}
// Get detailed protein information
{
"tool": "get_protein_entry",
"arguments": {
"accession": "P38398" // BRCA1 protein
}
}
// Analyze DNA sequence GC content
{
"tool": "analyze_gc_content",
"arguments": {
"sequence": "ATCGATCGATCGATCG"
}
}
// Build phylogenetic tree
{
"tool": "build_phylogenetic_tree",
"arguments": {
"sequences": [
{"id": "seq1", "sequence": "MKLLLLLL..."},
{"id": "seq2", "sequence": "MKLLLLLL..."},
{"id": "seq3", "sequence": "MKLLLLLL..."}
],
"method": "neighbor-joining",
"bootstrap_replicates": 100
}
}
🔧 Configuration
Add to your MCP client configuration:
{
"mcpServers": {
"biotools": {
"command": "node",
"args": ["path/to/biotools-mcp-server/build/index.js"]
}
}
}
📋 Features
- 37 comprehensive tools covering all major bioinformatics analysis types
- 11+ major biological databases integrated (PubMed, UniProt, NCBI, KEGG, PDB, etc.)
- Complete research workflows from literature review to phylogenetic analysis
- Advanced algorithms including BLAST, multiple alignment, phylogenetics, and structure prediction
- Reproducible analysis with comprehensive logging and documentation tools
- Flexible input/output supporting multiple sequence formats and databases
🧬 Analysis Categories
Sequence Analysis
- DNA/RNA composition and structure analysis
- ORF prediction and restriction mapping
- Protein property prediction and motif scanning
- Transmembrane region and secondary structure prediction
Comparative Analysis
- Sequence similarity searching (BLAST, PSI-BLAST)
- Pairwise and multiple sequence alignment
- Phylogenetic tree construction and comparison
- Conservation analysis and motif discovery
Database Integration
- Literature mining from PubMed
- Protein data from UniProt
- Nucleotide sequences from GenBank/RefSeq/Ensembl
- Structural data from PDB
- Pathway information from KEGG and Reactome
Advanced Analytics
- RNA secondary structure prediction
- Post-translational modification analysis
- Cross-database reference mapping
- Fragment assembly and sequence reconstruction
🤝 Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Test with
npm run inspect - Submit a pull request
📄 License
MIT License - see LICENSE file for details.