coreymhudson/mcp-fasta
MCP FASTA
A Model Context Protocol (MCP) server for working with FASTA files in VS Code with Claude. This server provides tools for parsing, analyzing, and manipulating FASTA sequence files commonly used in bioinformatics.
Features
- Load FASTA files - Parse FASTA files and extract sequence metadata
- Summarize sequences - Get statistics about sequence lengths and counts
- Fetch specific sequences - Retrieve individual sequences by ID
- Filter by length - Find sequences within specified length ranges
- Write FASTA files - Create new FASTA files from sequence data
- Validate sequences - Check sequence format and composition
- Reverse complement - Generate reverse complement of DNA sequences
- Translate sequences - Convert DNA to protein using genetic codes
- Search patterns - Find motifs, patterns, or subsequences
- Calculate GC content - Analyze nucleotide composition and GC content
- Split FASTA files - Divide large files into smaller chunks
- Merge FASTA files - Combine multiple files with duplicate handling
- Extract subsequences - Extract regions by coordinates
- Find duplicates - Identify duplicate sequences by ID or content
Installation
- Clone the repository:
git clone <repository-url>
cd mcp-fasta
- Install dependencies:
npm install
- Build the project:
npm run build
Usage
With VS Code and Claude
The server is automatically configured for VS Code with Claude. Once built, the MCP server will be available in Claude with the following tools:
Available Tools
- load_fasta - Load a FASTA file and parse into sequence records
  - Input: path (string) - Path to the FASTA file
  - Output: Array of sequence records with ID, description, and length
- summarize_fasta - Summarize number and length of sequences in a FASTA file
  - Input: path (string) - Path to the FASTA file
  - Output: Statistics including number of sequences, average length, longest, and shortest
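The statistics that summarize_fasta reports can be computed in a single pass over the sequence lengths. The following is a minimal sketch, not the project's code: `LengthSummary` and `summarizeLengths` are hypothetical names, and returning zeros for an empty file is an assumption.

```typescript
// Hypothetical summary shape mirroring the statistics listed for summarize_fasta.
interface LengthSummary {
  count: number;
  averageLength: number;
  longest: number;
  shortest: number;
}

// Compute count, mean, maximum, and minimum in one pass over the lengths.
function summarizeLengths(lengths: number[]): LengthSummary {
  if (lengths.length === 0) {
    // Assumption: an empty input yields zeros rather than an error.
    return { count: 0, averageLength: 0, longest: 0, shortest: 0 };
  }
  let total = 0;
  let longest = -Infinity;
  let shortest = Infinity;
  for (const length of lengths) {
    total += length;
    longest = Math.max(longest, length);
    shortest = Math.min(shortest, length);
  }
  return {
    count: lengths.length,
    averageLength: total / lengths.length,
    longest,
    shortest,
  };
}
```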
- get_sequence_by_id - Fetch a single sequence from a FASTA file by ID
  - Input:
    - path (string) - Path to the FASTA file
    - id (string) - Sequence ID to fetch
  - Output: Complete sequence record with ID, description, and sequence
- filter_fasta_by_length - Return all sequences in a file whose length falls within a given range
  - Input:
    - path (string) - Path to the FASTA file
    - minLength (number) - Minimum sequence length
    - maxLength (number) - Maximum sequence length
  - Output: Array of matching sequences with metadata
- write_fasta - Write sequences to a FASTA file
  - Input:
    - path (string) - Output path for the FASTA file
    - sequences (array) - Array of sequence objects with id, sequence, and optional description
  - Output: Confirmation message with number of sequences written
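To show what serializing sequence objects to FASTA text involves, here is a minimal sketch. `SequenceInput` and `formatFasta` are hypothetical names, and wrapping sequence lines at 60 characters is an assumption; the width the server actually uses is not stated here.

```typescript
// Hypothetical input shape matching the fields listed for write_fasta.
interface SequenceInput {
  id: string;
  sequence: string;
  description?: string;
}

// Serialize records to FASTA text: one ">" header line per record,
// followed by the sequence wrapped at lineWidth characters.
function formatFasta(sequences: SequenceInput[], lineWidth: number = 60): string {
  if (lineWidth < 1) {
    throw new RangeError("lineWidth must be at least 1");
  }
  const lines: string[] = [];
  for (const { id, sequence, description } of sequences) {
    lines.push(description ? `>${id} ${description}` : `>${id}`);
    for (let i = 0; i < sequence.length; i += lineWidth) {
      lines.push(sequence.slice(i, i + lineWidth));
    }
  }
  return lines.length > 0 ? lines.join("\n") + "\n" : "";
}
```

The returned string could then be written to the output path with any file API.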
- validate_sequence - Validate sequences for proper nucleotide/amino acid composition
  - Input:
    - path (string) - Path to the FASTA file
    - sequenceType (enum) - Type to validate: "dna", "rna", "protein", or "auto"
  - Output: Validation results with invalid characters and detected types
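Composition checks of this kind usually compare each character against an allowed alphabet. The sketch below is illustrative only: the alphabets are assumptions (strict bases plus N, and the 20 standard amino acids plus the stop symbol), and `findInvalidCharacters` and `detectType` are hypothetical helpers, with `detectType` standing in for the "auto" option.

```typescript
type SequenceType = "dna" | "rna" | "protein";

// Assumed alphabets; the server may also accept IUPAC ambiguity codes.
const ALPHABETS: Record<SequenceType, string> = {
  dna: "ACGTN",
  rna: "ACGUN",
  protein: "ACDEFGHIKLMNPQRSTVWY*",
};

// Return the distinct characters of the sequence that are not in the alphabet.
function findInvalidCharacters(sequence: string, type: SequenceType): string[] {
  const allowed = ALPHABETS[type];
  const invalid: string[] = [];
  for (const ch of sequence.toUpperCase().split("")) {
    if (allowed.indexOf(ch) === -1 && invalid.indexOf(ch) === -1) {
      invalid.push(ch);
    }
  }
  return invalid;
}

// Try the alphabets from most to least restrictive and return the first fit.
function detectType(sequence: string): SequenceType | "unknown" {
  const candidates: SequenceType[] = ["dna", "rna", "protein"];
  for (const type of candidates) {
    if (findInvalidCharacters(sequence, type).length === 0) {
      return type;
    }
  }
  return "unknown";
}
```

Because the DNA alphabet is tried first, a short peptide made only of A, C, G, and T would be reported as DNA; detection from composition alone is inherently a heuristic.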
- reverse_complement - Generate reverse complement of DNA sequences
  - Input:
    - path (string) - Path to the FASTA file
    - outputPath (string, optional) - Output file for reverse complement sequences
    - sequenceIds (array, optional) - Specific sequence IDs to process
  - Output: Reverse complement sequences with statistics
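For readers unfamiliar with the operation, a reverse complement reverses the sequence and swaps each base for its pairing partner (A with T, C with G). A minimal sketch, assuming uppercase output and mapping any unrecognized character to N (both choices are assumptions, and `reverseComplement` is a hypothetical name):

```typescript
// Watson-Crick pairing for DNA; N stays N.
const COMPLEMENT: Record<string, string> = { A: "T", T: "A", C: "G", G: "C", N: "N" };

// Reverse the sequence, then replace every base with its complement.
function reverseComplement(sequence: string): string {
  return sequence
    .toUpperCase()
    .split("")
    .reverse()
    .map((base) => COMPLEMENT[base] ?? "N") // assumption: unknown bases become N
    .join("");
}
```

For example, `reverseComplement("ATGC")` returns `"GCAT"`.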
- translate_sequence - Translate DNA sequences to protein using genetic code
  - Input:
    - path (string) - Path to the FASTA file
    - readingFrame (number) - Reading frame (1 to 3 forward, -1 to -3 reverse)
    - geneticCode (string, optional) - Genetic code table ("standard", "vertebrate_mitochondrial", "bacterial")
    - outputPath (string, optional) - Output file for translated sequences
  - Output: Translated protein sequences with statistics
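The reading-frame convention can be illustrated with a small translation sketch. It covers only the standard genetic code, encoded in the compact 64-letter form with bases ordered T, C, A, G. The function names are hypothetical, and two behaviors are assumptions: negative frames are read from the start of the reverse complement, and codons containing unknown bases translate to X.

```typescript
const BASES = "TCAG";
// Standard genetic code; codon index = first * 16 + second * 4 + third.
const STANDARD_CODE = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG";

function reverseComplementDna(dna: string): string {
  const complement: Record<string, string> = { A: "T", T: "A", C: "G", G: "C" };
  return dna.split("").reverse().map((b) => complement[b] ?? "N").join("");
}

// Translate DNA in one of six frames: 1..3 forward, -1..-3 on the reverse strand.
function translateDna(dna: string, readingFrame: number = 1): string {
  const frame = Math.abs(readingFrame);
  if (!Number.isInteger(readingFrame) || frame < 1 || frame > 3) {
    throw new RangeError("readingFrame must be 1, 2, 3, -1, -2, or -3");
  }
  const upper = dna.toUpperCase();
  const strand = readingFrame < 0 ? reverseComplementDna(upper) : upper;
  let protein = "";
  // Step through complete codons only; a trailing partial codon is dropped.
  for (let i = frame - 1; i + 3 <= strand.length; i += 3) {
    const first = BASES.indexOf(strand.charAt(i));
    const second = BASES.indexOf(strand.charAt(i + 1));
    const third = BASES.indexOf(strand.charAt(i + 2));
    const known = first >= 0 && second >= 0 && third >= 0;
    protein += known ? STANDARD_CODE.charAt(first * 16 + second * 4 + third) : "X";
  }
  return protein;
}
```

With this sketch, `translateDna("ATGGCCTAA", 1)` gives `"MA*"`, where `*` marks the stop codon.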
- search_sequence - Search for patterns, motifs, or subsequences
  - Input:
    - path (string) - Path to the FASTA file
    - pattern (string) - Search pattern (supports IUPAC codes and regex)
    - searchType (enum) - Type: "exact", "regex", or "iupac"
    - caseSensitive (boolean, optional) - Case sensitivity
    - includeReverseComplement (boolean, optional) - Search reverse strand
  - Output: Match positions and context for all sequences
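An IUPAC search is commonly implemented by expanding each ambiguity code into a regular-expression character class. The sketch below illustrates that idea and is not taken from the project: `findIupacMatches` is a hypothetical helper, positions are 0-based, matching is case-insensitive, and overlapping matches are reported (all assumptions).

```typescript
// IUPAC nucleotide codes expanded to regular-expression character classes.
const IUPAC_CLASSES: Record<string, string> = {
  A: "A", C: "C", G: "G", T: "T",
  R: "[AG]", Y: "[CT]", S: "[CG]", W: "[AT]", K: "[GT]", M: "[AC]",
  B: "[CGT]", D: "[AGT]", H: "[ACT]", V: "[ACG]", N: "[ACGT]",
};

// Return the 0-based start position of every (possibly overlapping) match.
function findIupacMatches(sequence: string, pattern: string): number[] {
  if (pattern.length === 0) {
    throw new Error("pattern must not be empty");
  }
  const source = pattern
    .toUpperCase()
    .split("")
    .map((code) => {
      const expansion = IUPAC_CLASSES[code];
      if (expansion === undefined) {
        throw new Error(`Unsupported IUPAC code: ${code}`);
      }
      return expansion;
    })
    .join("");
  // A lookahead matches without consuming input, so overlapping hits are found.
  const regex = new RegExp(`(?=${source})`, "gi");
  const positions: number[] = [];
  let match: RegExpExecArray | null;
  while ((match = regex.exec(sequence)) !== null) {
    positions.push(match.index);
    regex.lastIndex = match.index + 1; // zero-length match: advance manually
  }
  return positions;
}
```

Searching the reverse strand could be done by running the same function on the reverse complement of the sequence and converting the positions back.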
- calculate_gc_content - Calculate GC content and nucleotide statistics
  - Input:
    - path (string) - Path to the FASTA file
    - windowSize (number, optional) - Window size for sliding window analysis
  - Output: GC content, nucleotide counts, and optional sliding window data
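GC content is the percentage of G and C among the counted bases, and a sliding-window analysis repeats that calculation over consecutive stretches of the sequence. A minimal sketch follows, with hypothetical function names and three assumptions: ambiguous bases such as N are excluded from the denominator, windows do not overlap by default, and a trailing partial window is skipped.

```typescript
// Percentage of G and C among unambiguous bases (A, C, G, T, U).
function gcContent(sequence: string): number {
  let gc = 0;
  let total = 0;
  for (const base of sequence.toUpperCase().split("")) {
    if (base === "G" || base === "C") {
      gc += 1;
      total += 1;
    } else if (base === "A" || base === "T" || base === "U") {
      total += 1;
    }
  }
  return total === 0 ? 0 : (gc / total) * 100;
}

// GC content of each full window; start positions are 0-based.
function slidingGcContent(
  sequence: string,
  windowSize: number,
  step: number = windowSize,
): { start: number; gc: number }[] {
  if (windowSize < 1 || step < 1) {
    throw new RangeError("windowSize and step must be at least 1");
  }
  const windows: { start: number; gc: number }[] = [];
  for (let start = 0; start + windowSize <= sequence.length; start += step) {
    windows.push({ start, gc: gcContent(sequence.slice(start, start + windowSize)) });
  }
  return windows;
}
```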
- split_fasta - Split FASTA files into multiple files
  - Input:
    - path (string) - Path to the FASTA file
    - splitBy (enum) - Split method: "count", "size", or "individual"
    - value (number, optional) - Sequences per file or max size in MB
    - outputDir (string) - Output directory
    - prefix (string, optional) - Filename prefix
  - Output: Information about created files
- merge_fasta - Merge multiple FASTA files
  - Input:
    - inputPaths (array) - Array of input file paths
    - outputPath (string) - Output merged file path
    - removeDuplicates (boolean, optional) - Remove duplicate sequences
    - addFilePrefix (boolean, optional) - Add filename to sequence IDs
  - Output: Merge statistics and sequence information
- extract_subsequence - Extract subsequences by coordinates
  - Input:
    - path (string) - Path to the FASTA file
    - coordinates (array) - Array of extraction coordinates with sequenceId, start, end
    - outputPath (string, optional) - Output file for extracted sequences
  - Output: Extracted subsequences with coordinate information
- find_duplicates - Find duplicate sequences
  - Input:
    - path (string) - Path to the FASTA file
    - duplicateType (enum) - Type: "id", "sequence", or "both"
    - caseSensitive (boolean, optional) - Case sensitivity for sequence comparison
    - outputDuplicates (string, optional) - Output file for duplicates
    - outputUnique (string, optional) - Output file for unique sequences
  - Output: Duplicate analysis and grouping information
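Duplicate detection by content can be sketched as grouping record IDs under their sequence. The example below covers only the "sequence" comparison; `SequenceRecord` and `findDuplicateSequences` are hypothetical names, and treating case-insensitive comparison as the default is an assumption.

```typescript
// Hypothetical minimal record shape for this example.
interface SequenceRecord {
  id: string;
  sequence: string;
}

// Group record IDs by sequence content and keep groups with more than one member.
function findDuplicateSequences(
  records: SequenceRecord[],
  caseSensitive: boolean = false,
): string[][] {
  const groups = new Map<string, string[]>();
  for (const record of records) {
    const key = caseSensitive ? record.sequence : record.sequence.toUpperCase();
    const ids = groups.get(key);
    if (ids) {
      ids.push(record.id);
    } else {
      groups.set(key, [record.id]);
    }
  }
  return Array.from(groups.values()).filter((ids) => ids.length > 1);
}
```

Detecting duplicate IDs would follow the same pattern with `record.id` as the grouping key.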
Command Line Claude Usage
To use with command line Claude, simply run the startup script from the project directory:
./start-claude.sh
This script automatically:
- Changes to the correct directory
- Starts Claude with the proper MCP configuration
- Loads all FASTA tools automatically
Alternatively, you can run Claude manually:
cd "/path/to/mcp-fasta"
claude --mcp-config mcp-config-portable.json
Manual Server Usage
You can also run the MCP server manually for debugging:
npm start
The server uses stdio transport and communicates via JSON-RPC.
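When debugging over stdio, it helps to know what a tool invocation looks like on the wire. In MCP, tools are invoked with the JSON-RPC method `tools/call`, and stdio messages are newline-delimited JSON. The sketch below builds such a request for load_fasta; the file path is a hypothetical placeholder, and a real session must complete the MCP `initialize` handshake before calling tools.

```typescript
// A JSON-RPC 2.0 request asking the server to run the load_fasta tool.
const loadFastaRequest = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: {
    name: "load_fasta",
    arguments: { path: "example.fasta" }, // hypothetical file path
  },
};

// With the stdio transport, each message is a single line of JSON
// terminated by a newline and written to the server's stdin.
const wireMessage: string = JSON.stringify(loadFastaRequest) + "\n";
```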
Development
Scripts
- npm run build - Compile TypeScript to JavaScript
- npm start - Start the MCP server
- npm run dev - Build and start in one command
Project Structure
src/
├── server.ts                    # Main MCP server
├── tools/                       # Tool implementations
│   ├── loadFasta.ts             # Load and parse FASTA files
│   ├── summarizeFasta.ts        # Generate sequence statistics
│   ├── getSequence.ts           # Fetch sequences by ID
│   ├── filterFasta.ts           # Filter sequences by length
│   ├── writeFasta.ts            # Write sequences to FASTA files
│   ├── validateSequence.ts      # Validate sequence composition
│   ├── reverseComplement.ts     # Generate reverse complement
│   ├── translateSequence.ts     # Translate DNA to protein
│   ├── searchSequence.ts        # Search patterns and motifs
│   ├── calculateGC.ts           # Calculate GC content
│   ├── splitFasta.ts            # Split FASTA files
│   ├── mergeFasta.ts            # Merge multiple FASTA files
│   ├── extractSubsequence.ts    # Extract by coordinates
│   └── findDuplicates.ts        # Find duplicate sequences
└── utils/
    └── fastaParser.ts           # FASTA file parsing utilities
FASTA Format Support
This server supports standard FASTA format files:
>sequence_id_1 Description of sequence 1
ATCGATCGATCGATCG
ATCGATCGATCGATCG
>sequence_id_2 Description of sequence 2
GCTAGCTAGCTAGCTA
GCTAGCTAGCTAGCTA
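To make the format concrete: a line starting with `>` opens a record, the first word after `>` is the ID, the rest of that line is the description, and all following lines up to the next header are joined into one sequence. The parser below is a minimal sketch of that reading, not the contents of `fastaParser.ts`; `ParsedRecord` and `parseFasta` are hypothetical names.

```typescript
// Hypothetical record shape; the project's actual types may differ.
interface ParsedRecord {
  id: string;
  description: string;
  sequence: string;
}

// Parse FASTA text into records, joining multi-line sequences.
function parseFasta(text: string): ParsedRecord[] {
  const records: ParsedRecord[] = [];
  let current: ParsedRecord | null = null;
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "") continue; // skip blank lines
    if (line.startsWith(">")) {
      const header = line.slice(1).trim();
      const spaceIndex = header.search(/\s/);
      current = {
        id: spaceIndex === -1 ? header : header.slice(0, spaceIndex),
        description: spaceIndex === -1 ? "" : header.slice(spaceIndex + 1).trim(),
        sequence: "",
      };
      records.push(current);
    } else if (current !== null) {
      // Sequence data before the first header is ignored (an assumption).
      current.sequence += line;
    }
  }
  return records;
}
```

Applied to the example above, this yields two records with IDs `sequence_id_1` and `sequence_id_2`, each holding a 32-base sequence.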
Requirements
- Node.js 18+
- TypeScript 5.1+
- VS Code with Claude extension (for integrated usage)
License
See LICENSE file for details.