fastomop/omcp_py
The OMCP Python Sandbox Server is a secure, MCP-compliant Python code execution environment with Docker-based sandboxing, designed for safe and isolated Python code execution.
OMCP Python Sandbox Server
Overview
A secure, Docker-based Python sandbox server using the Model Context Protocol (MCP) for isolated code execution and advanced healthcare analytics. This project enables secure processing of Synthea synthetic healthcare data with PostgreSQL OMOP CDM integration and LLM-powered analytics.
Key Features
- Secure Sandboxing: Isolated Docker containers with resource limits and user isolation
- Healthcare Data Pipeline: Synthea-to-PostgreSQL with OMOP CDM mapping
- LLM Integration: Natural language queries for healthcare analytics
- Advanced Analytics: Structured and LLM-friendly data exploration
- MCP Protocol: Model Context Protocol for AI agent integration
- Docker Integration: Containerized PostgreSQL database with data persistence
Architecture
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   MCP Client    │───▶│  FastMCP Server  │───▶│  Docker Sandbox │
│   (AI Agent)    │    │    (main.py)     │    │   (Isolated)    │
└─────────────────┘    └──────────────────┘    └─────────────────┘
                                │                       │
                                ▼                       ▼
                       ┌──────────────────┐    ┌─────────────────┐
                       │  PostgreSQL DB   │    │  Synthea CSV    │
                       │   (OMOP CDM)     │    │ (Mounted Data)  │
                       └──────────────────┘    └─────────────────┘
Prerequisites
- Python 3.8+ with pip
- Docker & Docker Compose
- Synthea CSV files (optional, for healthcare data processing)
Using uv for environment management
This project is configured to use uv for environment management. uv creates and manages Python virtual environments and can install the dependencies declared in pyproject.toml under tool.uv.
Quick start using uv:
# Install uv (see https://astral.sh/uv for instructions)
# Then create a uv-managed venv and install dependencies:
scripts/setup_uv.sh
source .venv/bin/activate
If you prefer not to use uv, you can still create a regular venv and install the packages listed in pyproject.toml or requirements.txt.
Quick Start
1. Clone and Setup
git clone https://github.com/fastomop/omcp_py.git
cd omcp_py
# Install dependencies
pip install -r requirements.txt
2. Start PostgreSQL Database
# Start the OMOP database
docker-compose up -d db
# Verify it's running
docker-compose ps
3. Prepare Data (Optional)
Place your Synthea CSV files in the synthetic_data/ directory:
synthetic_data/
├── patients.csv      # Patient demographics
├── encounters.csv    # Healthcare encounters
├── conditions.csv    # Medical conditions
└── ...
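Before loading data, it can help to confirm that the expected files are in place. The following is a minimal sketch, not part of the project: the helper name missing_synthea_files is hypothetical, and it checks only the three files named in the listing above, since the full set is not spelled out here.

```python
from pathlib import Path

# Only the three files named in the listing above; the real export has more.
EXPECTED_FILES = ("patients.csv", "encounters.csv", "conditions.csv")


def missing_synthea_files(data_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are not present in data_dir."""
    root = Path(data_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_synthea_files("synthetic_data")
    if missing:
        print("Missing Synthea files:", ", ".join(missing))
    else:
        print("All expected Synthea files found.")
```

Run it from the repository root so that the relative synthetic_data path resolves.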
4. Start the MCP Server
# Set Python path
export PYTHONPATH=src
# Start the server
python src/omcp_py/main.py
5. Connect with MCP Client
Use MCP Inspector or your preferred MCP client:
# Install MCP Inspector
npm install -g @modelcontextprotocol/inspector
# Connect to the server
mcp-inspector python src/omcp_py/main.py
Then open http://127.0.0.1:6274 in your browser.
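If you would rather connect from Python than from the Inspector, the sketch below uses the official MCP Python SDK (the mcp package) to spawn the server and list its tools. It assumes the server speaks MCP over stdio, which the mcp-inspector command above suggests but this README does not state outright. The helper names server_command and list_server_tools are illustrative only.

```python
import asyncio
import sys


def server_command(entrypoint="src/omcp_py/main.py"):
    """Command and arguments used to spawn the server as a subprocess."""
    return sys.executable, [entrypoint]


async def list_server_tools():
    # Imported lazily so this file loads even without the `mcp` package.
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    command, args = server_command()
    # PYTHONPATH=src mirrors the export in step 4.
    params = StdioServerParameters(command=command, args=args, env={"PYTHONPATH": "src"})
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.list_tools()
            return [tool.name for tool in result.tools]


# Pass --run to actually start the server; requires `pip install mcp`.
if __name__ == "__main__" and "--run" in sys.argv:
    print(asyncio.run(list_server_tools()))
```

Listing tools is a safe first call because it needs no arguments; the exact argument names each tool expects should be read from the returned tool schemas rather than guessed.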
Healthcare Data Workflow
Complete Synthea-to-PostgreSQL Pipeline
# 1. Create sandbox and install packages
sandbox_id = await mcp.create_sandbox()
await mcp.install_package(sandbox_id, "pandas psycopg2-binary sqlalchemy")
# 2. Create OMOP CDM schema
await mcp.create_omop_schema(sandbox_id)
# 3. Load Synthea data
await mcp.load_synthea_to_postgres(sandbox_id, "/synthetic_data")
# 4. Run analytics
await mcp.analyze_omop_data(sandbox_id, "basic")
await mcp.llm_dataframe_operation(sandbox_id, "Count total patients")
Available MCP Tools
| Tool | Description | Example |
|---|---|---|
| create_sandbox | Create isolated Python environment | create_sandbox() |
| install_package | Install Python packages | install_package(sandbox_id, "pandas") |
| create_omop_schema | Create OMOP CDM database schema | create_omop_schema(sandbox_id) |
| load_synthea_to_postgres | Load Synthea CSV to PostgreSQL | load_synthea_to_postgres(sandbox_id, "/synthetic_data") |
| analyze_omop_data | Run structured analytics | analyze_omop_data(sandbox_id, "basic") |
| llm_dataframe_operation | Natural language queries | llm_dataframe_operation(sandbox_id, "Count patients") |
| execute_sql_in_sandbox | Direct SQL execution | execute_sql_in_sandbox(sandbox_id, "SELECT COUNT(*) FROM person") |
| remove_sandbox | Clean up sandbox | remove_sandbox(sandbox_id, force=True) |
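Because every sandbox should eventually be passed to remove_sandbox, it can be convenient to pair creation and cleanup. The sketch below wraps the two calls in an async context manager. It assumes a client object that exposes the tools as async methods, as in the workflow example; FakeClient is a hypothetical stand-in so the snippet runs without a server.

```python
import asyncio
from contextlib import asynccontextmanager


@asynccontextmanager
async def sandbox(client):
    """Create a sandbox and always remove it, even if the body raises."""
    sandbox_id = await client.create_sandbox()
    try:
        yield sandbox_id
    finally:
        await client.remove_sandbox(sandbox_id, force=True)


class FakeClient:
    """Hypothetical stand-in that records calls instead of contacting a server."""

    def __init__(self):
        self.calls = []

    async def create_sandbox(self):
        self.calls.append("create_sandbox")
        return "sandbox-1"

    async def install_package(self, sandbox_id, package):
        self.calls.append(f"install_package:{package}")

    async def remove_sandbox(self, sandbox_id, force=False):
        self.calls.append(f"remove_sandbox:force={force}")


async def demo():
    client = FakeClient()
    try:
        async with sandbox(client) as sandbox_id:
            await client.install_package(sandbox_id, "pandas")
            raise RuntimeError("simulated failure inside the sandbox")
    except RuntimeError:
        pass  # cleanup has already run by the time we get here
    return client.calls


calls = asyncio.run(demo())
print(calls)
```

The recorded calls end with remove_sandbox even though the body raised, which is the point of putting the cleanup in a finally block.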
Analytics Examples
Basic Counts
{
"total_patients": 1000,
"total_visits": 5000,
"total_conditions": 8000
}
Demographics Analysis
[
{
"gender_concept_id": 8507,
"patient_count": 500,
"avg_age": 45.2
}
]
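The gender_concept_id values in this output are OMOP standard concept IDs (8507 is MALE, 8532 is FEMALE). As a small sketch of post-processing, the hypothetical helper below adds a readable label to each row; it assumes the demographics result has the shape shown above.

```python
import json

# Standard OMOP gender concept IDs.
GENDER_CONCEPTS = {8507: "MALE", 8532: "FEMALE"}


def label_demographics(rows):
    """Return copies of the rows with an added human-readable 'gender' key."""
    labelled = []
    for row in rows:
        label = GENDER_CONCEPTS.get(row["gender_concept_id"], "UNKNOWN")
        labelled.append({**row, "gender": label})
    return labelled


raw = '[{"gender_concept_id": 8507, "patient_count": 500, "avg_age": 45.2}]'
rows = label_demographics(json.loads(raw))
print(rows[0]["gender"], rows[0]["patient_count"])
```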
LLM Natural Language Queries
# These work with natural language
await mcp.llm_dataframe_operation(sandbox_id, "Count total patients")
await mcp.llm_dataframe_operation(sandbox_id, "Show age distribution")
await mcp.llm_dataframe_operation(sandbox_id, "Count unique conditions")
await mcp.llm_dataframe_operation(sandbox_id, "Show gender distribution")
Configuration
Environment Variables
Create a .env file or set environment variables:
# Sandbox Configuration
SANDBOX_TIMEOUT=300
MAX_SANDBOXES=10
DOCKER_IMAGE=fastomop/sandbox:python-3.11-slim # recommended prebuilt sandbox image
DEBUG=false
LOG_LEVEL=INFO
# Database Configuration
DB_HOST=localhost
DB_PORT=5432
DB_USER=omop_user
DB_PASSWORD=omop_pass
DB_NAME=omop
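To illustrate how these variables might be consumed, here is a minimal sketch that reads them into a typed settings object and builds a PostgreSQL URL. The Settings class and load_settings function are hypothetical and are not the project's own configuration code; the defaults simply mirror the values listed above.

```python
import os
from dataclasses import dataclass
from urllib.parse import quote


@dataclass(frozen=True)
class Settings:
    sandbox_timeout: int
    max_sandboxes: int
    db_host: str
    db_port: int
    db_user: str
    db_password: str
    db_name: str

    @property
    def db_url(self):
        # Percent-encode credentials so special characters do not break the URL.
        user = quote(self.db_user, safe="")
        password = quote(self.db_password, safe="")
        return f"postgresql://{user}:{password}@{self.db_host}:{self.db_port}/{self.db_name}"


def load_settings(env=None):
    """Read settings from a mapping (os.environ by default) with the defaults above."""
    env = os.environ if env is None else env
    return Settings(
        sandbox_timeout=int(env.get("SANDBOX_TIMEOUT", "300")),
        max_sandboxes=int(env.get("MAX_SANDBOXES", "10")),
        db_host=env.get("DB_HOST", "localhost"),
        db_port=int(env.get("DB_PORT", "5432")),
        db_user=env.get("DB_USER", "omop_user"),
        db_password=env.get("DB_PASSWORD", "omop_pass"),
        db_name=env.get("DB_NAME", "omop"),
    )


settings = load_settings({})  # empty mapping: defaults only
print(settings.db_url)
```

Passing a plain dict instead of os.environ keeps the function easy to test and makes overrides explicit, for example load_settings({"DB_PORT": "5433"}).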
Docker Compose
The docker-compose.yml provides:
- PostgreSQL 15 with OMOP database
- Persistent data storage
- Synthea data directory mounting
Testing
Run Integration Tests
python tests/test_synthea_integration.py
Run Workflow Demo
./scripts/demo.sh
Demo and prebuilt sandbox image
We provide a prebuilt sandbox Dockerfile and a convenience demo script to run an end-to-end local demo.
- Build the prebuilt sandbox image (optional but recommended):
docker build -t fastomop/sandbox:python-3.11-slim -f docker/sandbox/Dockerfile .
- Run the demo (builds image, starts DB, launches server, runs a local client and prints DB counts):
./scripts/demo.sh
If you have a DuckDB snapshot at synthetic_data/synthea.duckdb and want the demo to load Synthea into Postgres, run:
./scripts/demo.sh --load-duckdb
If port 5432 on your host is already in use, pass an alternate host port to the demo script or set DB_PORT in your environment (or .env) before running:
# Use port 5433 for the host mapping
./scripts/demo.sh --db-port 5433 --load-duckdb
# or export DB_PORT beforehand
export DB_PORT=5433
./scripts/demo.sh --load-duckdb
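To find out whether 5432 is taken before starting the demo, a quick check from Python is to try binding the port. This is a sketch with hypothetical helper names; the candidate ports are examples, and the check is inherently racy because another process can claim the port between the check and the Docker start.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def pick_db_port(candidates=(5432, 5433, 5434)):
    """Return the first free port from candidates, or None if all are taken."""
    for port in candidates:
        if port_is_free(port):
            return port
    return None


if __name__ == "__main__":
    print("Suggested DB_PORT:", pick_db_port())
```

The printed value can then be passed to the demo script with --db-port or exported as DB_PORT.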
Notes:
- The sandbox manager auto-joins the docker-compose network (if detected) so sandboxes can resolve the db service name when running under docker compose.
- If you use a host Postgres instance, set DB_HOST=host.docker.internal or enable host-gateway resolution.
Test Individual Components
# Test file structure
python -c "import src.omcp_py.main; print('Main module loads successfully')"
# Test Docker Compose
docker-compose config
Security Features
- Container Isolation: Each sandbox runs in its own isolated Docker container
- Resource Limits: CPU and memory restrictions per sandbox
- User Isolation: Non-root user execution
- Network Security: Controlled network access
- File System: Read-only filesystem with temporary mounts
- Capability Dropping: Removed dangerous Linux capabilities
- Auto-cleanup: Automatic removal of inactive sandboxes
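As an illustration of how the features above translate into container settings, the sketch below builds keyword arguments for containers.run in the Docker SDK for Python. Every value here (memory size, CPU share, UID, tmpfs size, network) is an assumption chosen for the example, not the project's actual configuration, and hardened_run_options and start_sandbox are hypothetical names.

```python
def hardened_run_options(memory="512m", cpus=1.0, network="none"):
    """Build keyword arguments for docker-py's containers.run()."""
    return {
        "detach": True,
        "mem_limit": memory,                     # resource limit: memory
        "nano_cpus": int(cpus * 1_000_000_000),  # resource limit: CPU, in 1e-9 CPUs
        "user": "1000:1000",                     # non-root user (example UID:GID)
        "read_only": True,                       # read-only root filesystem
        "tmpfs": {"/tmp": "size=64m"},           # writable scratch space only
        "cap_drop": ["ALL"],                     # drop Linux capabilities
        "security_opt": ["no-new-privileges"],
        "network_mode": network,                 # "none" disables networking
    }


def start_sandbox(image, command):
    """Start a container with the options above (needs Docker and `pip install docker`)."""
    import docker  # imported lazily so the options helper works without the SDK

    client = docker.from_env()
    return client.containers.run(image, command, **hardened_run_options())


if __name__ == "__main__":
    for key, value in hardened_run_options(cpus=0.5).items():
        print(f"{key} = {value!r}")
```

A sandbox that has to reach the db service cannot use network "none"; in that case the compose network name would be passed instead.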
Documentation
- Detailed workflow documentation
- Complete tool documentation
- Environment and deployment setup
- System design and components
Advanced Usage
Custom Data Mapping
Extend the Synthea-to-OMOP mapping in load_synthea_to_postgres:
synthea_mappings = {
    'custom_data.csv': {
        'table': 'omop_cdm.custom_table',
        'columns': {
            'custom_id': 'person_id',
            'custom_date': 'birth_datetime'
        }
    }
}
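To make the mapping structure concrete, here is a sketch of how a 'columns' entry could be applied to CSV rows: each source column is renamed to its target column and unmapped columns are dropped. The function map_rows is hypothetical, and the real load_synthea_to_postgres may handle mappings differently.

```python
import csv
import io


def map_rows(csv_text, columns):
    """Yield dicts keyed by target column names, keeping only mapped columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        yield {target: row[source] for source, target in columns.items()}


mapping = {
    "custom_data.csv": {
        "table": "omop_cdm.custom_table",
        "columns": {"custom_id": "person_id", "custom_date": "birth_datetime"},
    }
}

# Inline sample standing in for the contents of custom_data.csv.
sample = "custom_id,custom_date,ignored\n1,1980-05-01,x\n"
rows = list(map_rows(sample, mapping["custom_data.csv"]["columns"]))
print(rows)
```

Values stay as strings at this stage; type conversion (for example parsing dates) would be a separate step before insertion.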
Additional OMOP Tables
Extend the schema to include more OMOP CDM tables:
drug_exposure
procedure_occurrence
measurement
observation
Custom Analytics
Create domain-specific analytics:
# Custom Python code to run inside the sandbox
code = '''
import pandas as pd
from sqlalchemy import create_engine

# Adjust the connection URL so it matches your DB_* settings
engine = create_engine('postgresql://omcp:postgres@db:5432/omcp')
df = pd.read_sql("SELECT * FROM omop_cdm.person", engine)

# Your custom analysis here: patient count and approximate mean age per gender
result = df.groupby('gender_concept_id').agg({
    'person_id': 'count',
    # Current year minus mean birth year, so this entry holds an age in years
    'birth_datetime': lambda x: pd.Timestamp.now().year - pd.to_datetime(x).dt.year.mean()
}).to_dict()
print(result)
'''
await mcp.execute_python_code(sandbox_id, code)
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request
License
MIT License - see the license file for details.
Acknowledgments
- Model Context Protocol for the MCP specification
- FastMCP for the Python MCP implementation
- Synthea for synthetic healthcare data
- OMOP CDM for healthcare data standards
Support
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Documentation: Wiki
Built by Zhangshu Joshua Jiang and the wider FastOMCP team