hyennnnnnn/celltypist-mcp
CellTypist MCP Server
An MCP (Model Context Protocol) server for automated cell type annotation in scRNA-seq analysis. It exposes CellTypist to AI clients, so you can drive annotation with natural-language requests.
🎯 What can it do?
- Automatic cell type annotation using pre-trained CellTypist models
- List available models with descriptions and metadata
- Download models from CellTypist repository
- Train custom models on your own annotated data
- Extract marker genes for specific cell types
- Majority voting for robust predictions based on local subclusters
- Visualization with dotplot comparing predictions to reference labels
🧬 About CellTypist
CellTypist is an automated cell type annotation tool for scRNA-seq datasets based on logistic regression classifiers. It provides:
- Fast and accurate predictions using regularized linear models
- Pre-trained models for various tissues and cell types
- Majority voting approach to refine predictions
- Custom model training capabilities
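To make the majority voting idea concrete, here is a minimal pure-Python sketch, not CellTypist's actual implementation. It assumes each cell already has a predicted label and a cluster assignment, and it gives every cell the most common label in its cluster. The function name `majority_vote` and the toy data are illustrative only.

```python
from collections import Counter, defaultdict


def majority_vote(predicted_labels, clusters):
    """Replace each cell's label with the most common label in its cluster."""
    if len(predicted_labels) != len(clusters):
        raise ValueError("predicted_labels and clusters must have the same length")

    # Count how often each label occurs inside each cluster.
    counts_by_cluster = defaultdict(Counter)
    for label, cluster in zip(predicted_labels, clusters):
        counts_by_cluster[cluster][label] += 1

    # Pick the dominant label per cluster and broadcast it back to the cells.
    winner = {c: counts.most_common(1)[0][0] for c, counts in counts_by_cluster.items()}
    return [winner[c] for c in clusters]


# Toy example: six cells in two clusters, with one outlier prediction per cluster.
labels = ["T cells", "T cells", "B cells", "B cells", "B cells", "T cells"]
clusters = ["0", "0", "0", "1", "1", "1"]
voted = majority_vote(labels, clusters)
print(voted)
```

In this toy example the single outlier in each cluster is overruled by its neighbours, which is the refinement effect that majority voting aims for.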
📦 Installation
From source
git clone <repository-url>
cd celltypist-mcp
pip install -e .
🚀 Quick Start
Run locally with stdio transport
celltypist-mcp run
Run with a pre-loaded dataset
celltypist-mcp run --data /path/to/your/data.h5ad
Run with SSE transport (for remote access)
celltypist-mcp run --transport sse --port 8000 --host 0.0.0.0
🔧 Configuration
For AI Clients (e.g., Claude Desktop, Cherry Studio)
Add to your MCP client configuration:
{
  "mcpServers": {
    "celltypist": {
      "command": "celltypist-mcp",
      "args": ["run"]
    }
  }
}
With pre-loaded data
{
  "mcpServers": {
    "celltypist": {
      "command": "celltypist-mcp",
      "args": ["run", "--data", "/path/to/your/data.h5ad"]
    }
  }
}
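If you prefer to generate the client configuration rather than edit it by hand, a small script along these lines may help. It is only a sketch: the helper name `build_client_config` is hypothetical, and where the resulting JSON needs to be saved depends on your MCP client.

```python
import json


def build_client_config(data_path=None, server_name="celltypist"):
    """Build the mcpServers entry shown above, optionally with --data."""
    args = ["run"]
    if data_path is not None:
        # Mirrors the "With pre-loaded data" variant of the configuration.
        args += ["--data", data_path]
    return {"mcpServers": {server_name: {"command": "celltypist-mcp", "args": args}}}


config = build_client_config("/path/to/your/data.h5ad")
print(json.dumps(config, indent=2))
```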
Remote SSE connection
First, run the server on your machine:
celltypist-mcp run --transport sse --port 8000
Then point your MCP client at the server's SSE endpoint:
http://localhost:8000/sse
🛠️ Available Tools
1. celltypist_list_models
List all available CellTypist models with descriptions.
Example usage:
"Show me available CellTypist models"
"List all immune cell type models"
2. celltypist_annotate
Annotate cell types in your scRNA-seq data.
Parameters:
- model: Model name (e.g., "Immune_All_High.pkl")
- majority_voting: Enable majority voting (default: False)
- over_clustering: Column in adata.obs for clustering (optional)
- mode: "best match" or "prob match" (default: "best match")
- p_thres: Probability threshold for multi-label (default: 0.5)
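The difference between the two modes can be sketched in a few lines of plain Python. This is an illustration under assumptions, not the server's or CellTypist's code: the function `assign_labels` is hypothetical, the probabilities are made up, and joining multiple labels with "|" and reporting "Unassigned" when nothing passes the threshold reflect how CellTypist is understood to report such cases.

```python
def assign_labels(probabilities, mode="best match", p_thres=0.5):
    """Turn per-cell-type probabilities for one cell into a label string."""
    if mode == "best match":
        # Always return the single most probable cell type.
        return max(probabilities, key=probabilities.get)
    if mode == "prob match":
        # Keep every cell type above the threshold: zero, one, or several labels.
        passing = [cell_type for cell_type, p in probabilities.items() if p > p_thres]
        return "|".join(passing) if passing else "Unassigned"
    raise ValueError(f"unknown mode: {mode!r}")


# Made-up probabilities for one cell; one-vs-rest scores need not sum to 1.
probs = {"T cells": 0.62, "NK cells": 0.55, "B cells": 0.03}
best = assign_labels(probs)
multi = assign_labels(probs, mode="prob match", p_thres=0.5)
none = assign_labels({"T cells": 0.4, "B cells": 0.3}, mode="prob match")
print(best, multi, none, sep="\n")
```

"best match" always commits to one label, while "prob match" can flag ambiguous or unknown cells, which is why p_thres only matters in the second mode.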
Example usage:
"Annotate my cells using the Immune_All_High model"
"Run CellTypist with majority voting on my data"
"Use the Immune_All_Low model with leiden clustering for majority voting"
3. celltypist_download_model
Download CellTypist models.
Parameters:
- model: Model name or list of names (None downloads all)
- force_update: Force update to latest version (default: False)
Example usage:
"Download the Immune_All_High model"
"Download all available CellTypist models"
"Update the Immune_All_Low model to the latest version"
4. celltypist_get_model_info
Get detailed information about a specific model.
Parameters:
model: Model name
Example usage:
"What cell types are in the Immune_All_High model?"
"Show me information about the Immune_All_Low model"
"How many features does the Immune_All_High model use?"
5. celltypist_extract_markers
Extract top marker genes for a specific cell type.
Parameters:
- model: Model name
- cell_type: Cell type name
- top_n: Number of top markers (default: 10)
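Because CellTypist models are logistic regression classifiers, marker extraction can be thought of as ranking genes by their coefficient for the requested cell type. The sketch below illustrates that idea only; the function `top_markers` is hypothetical and the coefficient values are invented for the example.

```python
def top_markers(coefficients, top_n=10):
    """Return the top_n genes with the largest coefficients for one cell type."""
    ranked = sorted(coefficients.items(), key=lambda item: item[1], reverse=True)
    return [gene for gene, _ in ranked[:top_n]]


# Invented coefficients for a B cell classifier (illustrative values only).
b_cell_coefficients = {
    "CD79A": 2.1,
    "MS4A1": 1.8,
    "CD3E": -1.2,
    "CD19": 1.5,
    "LYZ": -0.4,
}
markers = top_markers(b_cell_coefficients, top_n=3)
print(markers)
```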
Example usage:
"What are the top marker genes for T cells in Immune_All_High?"
"Show me 20 marker genes for macrophages"
"Extract markers for B cells from the Immune_All_Low model"
6. celltypist_train
Train a custom CellTypist model.
Parameters:
- labels: Column in adata.obs with cell type labels
- model_name: Filename to save the model
- use_SGD: Use SGD learning for large datasets (default: False)
- C: Inverse of L2 regularization strength; smaller values mean stronger regularization (default: 1.0)
- max_iter: Maximum iterations (optional)
- feature_selection: Enable feature selection (default: False)
- top_genes: Number of top genes to select per cell type when feature selection is enabled (default: 300)
Example usage:
"Train a CellTypist model using the 'cell_type' column and save it as 'my_model.pkl'"
"Create a custom model with SGD learning and feature selection"
7. celltypist_dotplot
Generate a dotplot comparing predictions with reference labels.
Parameters:
- use_as_reference: Column in adata.obs with reference labels
- use_as_prediction: "predicted_labels" or "majority_voting" (default: "majority_voting")
- save: Filename to save figure (optional)
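The comparison behind such a plot is essentially a cross-tabulation: for each reference group, what fraction of its cells received each predicted label. The following pure-Python sketch computes that table under this assumption; the function `label_fractions` and the toy data are hypothetical, and the real figure is drawn by CellTypist and may encode more than these fractions.

```python
from collections import Counter, defaultdict


def label_fractions(reference, predicted):
    """For each reference group, compute the fraction of cells per predicted label."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")

    counts = defaultdict(Counter)
    for ref, pred in zip(reference, predicted):
        counts[ref][pred] += 1

    return {
        ref: {pred: n / sum(c.values()) for pred, n in c.items()}
        for ref, c in counts.items()
    }


# Toy data: four cells in reference cluster "0", two in cluster "1".
reference = ["0", "0", "0", "0", "1", "1"]
predicted = ["T cells", "T cells", "T cells", "B cells", "B cells", "B cells"]
fractions = label_fractions(reference, predicted)
print(fractions)
```

A reference group dominated by one predicted label indicates good agreement, while a group split across several labels deserves a closer look.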
Example usage:
"Create a dotplot comparing CellTypist predictions with my cell_type labels"
"Visualize the majority voting results against leiden clusters"
"Generate a dotplot and save it as 'results.png'"
📊 Typical Workflow
1. List available models: "What CellTypist models are available?"
2. Download a model (if not already downloaded): "Download the Immune_All_High model"
3. Annotate your cells: "Annotate my cells using Immune_All_High with majority voting"
4. Visualize results: "Create a dotplot comparing predictions with my manual annotations"
5. Extract markers (optional): "What are the marker genes for T cells in this model?"
🧪 Example Conversations
Example 1: Quick annotation
User: "I have scRNA-seq data loaded. Can you annotate the cell types?"
Assistant: [Lists available models]
User: "Use the Immune_All_High model"
Assistant: [Runs celltypist_annotate and shows results]
Example 2: Custom model training
User: "I want to train my own CellTypist model"
Assistant: "What column contains your cell type labels?"
User: "The 'cell_type' column"
Assistant: [Runs celltypist_train and saves the model]
🔬 Data Requirements
- Input data should be in AnnData format (.h5ad)
- Expression matrix should be normalized to 10,000 counts per cell and then log1p-transformed
- For training: cell type labels should be in adata.obs
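A quick way to sanity-check the normalization requirement is to undo the log1p transform and confirm that each cell sums to roughly 10,000. The sketch below does this for a single cell given as a plain list of values; the function name `looks_log1p_normalized` and the tolerance are assumptions for illustration, and with real data you would apply the same idea to the rows of the expression matrix.

```python
import math


def looks_log1p_normalized(cell_values, target_sum=10_000, tolerance=1.0):
    """Check that expm1 of a cell's values sums to about target_sum."""
    total = sum(math.expm1(value) for value in cell_values)
    return abs(total - target_sum) <= tolerance


# Build a correctly prepared cell from toy raw counts.
raw_counts = [3, 0, 7, 10]
scale = 10_000 / sum(raw_counts)
normalized = [math.log1p(count * scale) for count in raw_counts]

ok = looks_log1p_normalized(normalized)
bad = looks_log1p_normalized(raw_counts)  # raw counts fail the check
print(ok, bad)
```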
📝 Notes
- The first time you use a model, it will be downloaded automatically
- Majority voting uses an existing clustering if you provide one; otherwise the data is clustered automatically
- Trained models are saved locally and can be reused
- All results are saved to adata.obs columns prefixed with celltypist_
🔗 Related Projects
- CellTypist - The original CellTypist tool
- MCP - Model Context Protocol
- Scanpy - Single-cell analysis in Python