jweingardt12/mlb_mcp
Pybaseball MCP Server is a FastMCP-based server that provides access to MLB and Fangraphs baseball data using the pybaseball library.
MLB Stats MCP Server
A Model Context Protocol (MCP) server that provides comprehensive MLB baseball statistics with video highlights through the pybaseball library.
Overview
This MCP server enables AI assistants to access real-time MLB statistics, historical data, and video highlights. It provides seven powerful tools for querying baseball data, from individual player performance to team statistics and advanced Statcast metrics.
Features
🎯 7 Core Tools
1. get_player_stats
Get detailed Statcast data for any MLB player with optional date filtering.
- Parameters:
  - name (required): Player name (e.g., "Mike Trout", "Ronald Acuña Jr.")
  - start_date (optional): Start date in YYYY-MM-DD format
  - end_date (optional): End date in YYYY-MM-DD format
- Returns: Player statistics including hits, home runs, exit velocity, launch angle, and barrel rate
2. get_team_stats
Retrieve comprehensive team batting or pitching statistics for any season.
- Parameters:
  - team (required): Team name or abbreviation (e.g., "Yankees", "NYY", "Red Sox")
  - year (required): Season year (1871-present)
  - stat_type (optional): "batting" (default) or "pitching"
- Returns: Complete team statistics for the specified season
3. get_leaderboard
Access statistical leaderboards for any MLB stat category.
- Parameters:
  - stat (required): Statistic abbreviation (e.g., "HR", "AVG", "ERA", "K")
  - season (required): Season year
  - leaderboard_type (optional): "batting" or "pitching"
  - limit (optional): Number of results (default: 10)
- Returns: Top players for the specified statistic
4. statcast_leaderboard
Query advanced Statcast data with filtering, sorting, and video highlight links.
- Parameters:
  - start_date (required): Start date in YYYY-MM-DD format
  - end_date (required): End date in YYYY-MM-DD format
  - result (optional): Filter by outcome (e.g., "home_run", "single", "double")
  - min_ev (optional): Minimum exit velocity filter
  - min_pitch_velo (optional): Minimum pitch velocity filter
  - sort_by (optional): Sort metric:
    - "exit_velocity" (default) - Hardest hit balls
    - "distance" - Longest hits
    - "launch_angle" - Optimal launch angles
    - "pitch_velocity" - Fastest pitches
    - "spin_rate" - Highest spin rates
    - "xba" - Highest expected batting average
    - "xwoba" - Highest expected weighted on-base average
    - "barrel" - Barrel rate (perfect contact)
  - limit (optional): Number of results (default: 10)
  - order (optional): Sort order - "desc" (default) or "asc"
  - group_by (optional): Group results by "team" for team-wide rankings
- Returns: Detailed play-by-play data with video links (or team aggregations when group_by="team")
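The filter, sort, and limit parameters above can be pictured with a small sketch. This is not the server's implementation: the `leaderboard` function and the flat record fields are assumptions for illustration, and only a subset of the documented parameters is modeled.

```python
from typing import Any, Dict, List, Optional


def leaderboard(
    rows: List[Dict[str, Any]],
    result: Optional[str] = None,
    min_ev: Optional[float] = None,
    sort_by: str = "exit_velocity",
    limit: int = 10,
    order: str = "desc",
) -> List[Dict[str, Any]]:
    """Hypothetical helper: filter plays, sort by one metric, add a rank."""
    kept = []
    for row in rows:
        if result is not None and row.get("result") != result:
            continue
        ev = row.get("exit_velocity")
        if min_ev is not None and (ev is None or ev < min_ev):
            continue
        # Rows without the sort metric cannot be ranked, so skip them.
        if row.get(sort_by) is None:
            continue
        kept.append(row)
    kept.sort(key=lambda r: r[sort_by], reverse=(order == "desc"))
    return [{**row, "rank": i} for i, row in enumerate(kept[:limit], start=1)]


plays = [
    {"player": "A", "result": "home_run", "exit_velocity": 110.0, "distance": 420.0},
    {"player": "B", "result": "single", "exit_velocity": 115.0, "distance": 200.0},
    {"player": "C", "result": "home_run", "exit_velocity": 105.0, "distance": 450.0},
]
longest_homers = leaderboard(plays, result="home_run", sort_by="distance")
```

Here `longest_homers` ranks player C ahead of player A, since only home runs are kept and they are ordered by distance.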
5. team_season_stats
Get fast team season averages for Statcast metrics. Optimized for queries like "which team hits the ball hardest?".
- Parameters:
  - year (required): Season year (e.g., 2025)
  - stat (optional): Metric to analyze
    - "exit_velocity" (default) - Average exit velocity
    - "distance" - Average and max distance
    - "launch_angle" - Average launch angle
    - "barrel_rate" - Percentage of barrels (perfect contact)
    - "hard_hit_rate" - Percentage of balls hit 95+ mph
    - "sweet_spot_rate" - Percentage hit at optimal launch angles (8-32°)
  - min_result_type (optional): Filter by result type
    - "batted_ball" - All balls in play
    - "home_run" - Home runs only
    - "hit" - All hits (singles, doubles, triples, home runs)
- Returns: Team rankings with averages, counts, and other statistics
- Performance: Uses sampling strategy (every 7th day) and 24-hour caching for instant responses
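As a rough illustration of the hard_hit_rate and sweet_spot_rate definitions above, the sketch below computes both from (exit velocity, launch angle) pairs. The function name and input shape are hypothetical, and treating the 95 mph and 8-32° boundaries as inclusive is an assumption.

```python
from typing import Dict, Iterable, Tuple


def contact_rates(batted_balls: Iterable[Tuple[float, float]]) -> Dict[str, float]:
    """Hypothetical helper: percentages from (exit_velocity_mph, launch_angle_deg) pairs."""
    balls = list(batted_balls)
    if not balls:
        return {"hard_hit_rate": 0.0, "sweet_spot_rate": 0.0}
    # Hard hit: 95+ mph (boundary assumed inclusive).
    hard_hit = sum(1 for ev, _ in balls if ev >= 95.0)
    # Sweet spot: launch angle between 8 and 32 degrees (assumed inclusive).
    sweet_spot = sum(1 for _, la in balls if 8.0 <= la <= 32.0)
    total = len(balls)
    return {
        "hard_hit_rate": 100.0 * hard_hit / total,
        "sweet_spot_rate": 100.0 * sweet_spot / total,
    }


sample = [(101.2, 25.0), (88.0, 10.0), (95.0, 40.0), (70.5, -5.0)]
rates = contact_rates(sample)
```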
6. team_pitching_stats
Get fast team pitching averages for Statcast metrics. Optimized for queries like "which team has the best pitching staff?".
- Parameters:
  - year (required): Season year (e.g., 2025)
  - stat (optional): Metric to analyze
    - "velocity" (default) - Average and max pitch velocity
    - "spin_rate" - Average and max spin rate
    - "movement" - Pitch break (horizontal, vertical, total)
    - "whiff_rate" - Swing-and-miss percentage
    - "chase_rate" - Swings at pitches outside the zone
    - "zone_rate" - Percentage of pitches in strike zone
    - "ground_ball_rate" - Ground balls per balls in play
    - "xera" - Expected ERA based on quality of contact
  - pitch_type (optional): Filter to specific pitch type
    - "FF" - 4-seam fastball
    - "SL" - Slider
    - "CH" - Changeup
    - "CU" - Curveball
    - "SI" - Sinker
    - "FC" - Cutter
    - "FS" - Splitter
- Returns: Team pitching rankings with averages, counts, and other statistics
- Performance: Uses sampling strategy (every 7th day) and 24-hour caching for instant responses
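The 24-hour caching mentioned above could be built in many ways. The following is a minimal time-to-live cache sketch, not the server's actual code; the `TTLCache` class, its method names, and the injectable clock are all assumptions made for illustration.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Hypothetical in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_compute(self, key: Hashable, compute: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh: reuse the stored value
        value = compute()
        self._store[key] = (now, value)
        return value


# Usage: cache team pitching results for 24 hours, keyed by the query arguments.
fake_now = [0.0]
cache = TTLCache(ttl_seconds=24 * 60 * 60, clock=lambda: fake_now[0])
calls = []


def expensive_query() -> int:
    calls.append(1)
    return len(calls)


first = cache.get_or_compute((2025, "velocity"), expensive_query)
second = cache.get_or_compute((2025, "velocity"), expensive_query)
fake_now[0] = 24 * 60 * 60  # a full day later the entry has expired
third = cache.get_or_compute((2025, "velocity"), expensive_query)
```

Keying on the full argument tuple keeps results for different stats or pitch types separate.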
7. statcast_count
Count Statcast events matching specific criteria. Optimized for multi-year queries like "how many 475+ ft home runs since 2023?"
- Parameters:
  - start_date (required): Start date in YYYY-MM-DD format
  - end_date (required): End date in YYYY-MM-DD format
  - result_type (optional): Type to count - 'home_run' (default), 'hit', 'batted_ball', or a specific event such as 'double'
  - min_distance (optional): Minimum distance in feet (e.g., 475)
  - max_distance (optional): Maximum distance in feet
  - min_exit_velocity (optional): Minimum exit velocity in mph
  - max_exit_velocity (optional): Maximum exit velocity in mph
- Returns: Total count, yearly breakdown, and top 5 examples with video links
- Performance:
  - Multi-year queries: Samples 3 days per month (~90x fewer API calls)
  - 6-12 months: Weekly sampling (~7x fewer API calls)
  - <6 months: Complete data with chunking
  - 24-hour caching for all queries
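One way to picture the adaptive sampling tiers above is the sketch below. The exact thresholds (more than 365 days for the monthly tier, 180 days or more for the weekly tier) and the choice of the 1st, 11th, and 21st as the three sampled days are assumptions, not details taken from the server.

```python
from datetime import date, timedelta
from typing import List


def sample_dates(start: date, end: date) -> List[date]:
    """Hypothetical helper: pick which days to query based on range length."""
    span_days = (end - start).days + 1
    all_days = [start + timedelta(days=i) for i in range(span_days)]
    if span_days > 365:
        # Multi-year tier: three days per month (assumed: 1st, 11th, 21st).
        return [d for d in all_days if d.day in (1, 11, 21)]
    if span_days >= 180:
        # Roughly 6-12 months: every 7th day.
        return all_days[::7]
    # Short ranges: keep every day and rely on chunked queries instead.
    return all_days


multi_year = sample_dates(date(2023, 1, 1), date(2024, 12, 31))
one_season = sample_dates(date(2024, 4, 1), date(2024, 10, 31))
one_month = sample_dates(date(2024, 7, 1), date(2024, 7, 31))
```

Because counts come from a sample in the first two tiers, totals would need to be scaled up, which makes them estimates rather than exact figures.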
🎥 Video Highlights Integration
Every result from statcast_leaderboard includes detailed metrics and video access points:
    {
      "rank": 1,
      "player": "Ronald Acuña Jr.",
      "date": "2024-07-20",
      "exit_velocity": 113.7,
      "launch_angle": 23.0,
      "distance": 456.0,
      "result": "home_run",
      "pitch_velocity": 95.2,
      "pitch_type": "FF",
      "spin_rate": 2450,
      "xba": 0.920,
      "xwoba": 1.823,
      "barrel": true,
      "description": "Ronald Acuña Jr. homers (13) on a fly ball to center field.",
      "video_links": {
        "game_highlights_url": "https://www.mlb.com/gameday/745890/video",
        "film_room_search": "https://www.mlb.com/video/search?q=Ronald+Acuna+Jr.+2024-07-20",
        "game_pk": "745890",
        "api_highlights_endpoint": "https://statsapi.mlb.com/api/v1/schedule?gamePk=745890&hydrate=game(content(highlights(highlights)))"
      }
    }
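The video_links object in the sample result follows a regular pattern, so it can plausibly be derived from the game ID, player name, and date. The sketch below reproduces the URL shapes shown in that sample; `build_video_links` is a hypothetical helper, and the accent-stripping step is an assumption inferred from "Acuña" appearing as "Acuna" in the search URL.

```python
import unicodedata
from typing import Dict, Union
from urllib.parse import quote_plus


def build_video_links(game_pk: Union[int, str], player: str, game_date: str) -> Dict[str, str]:
    """Hypothetical helper: build highlight links in the shape of the sample result."""
    # Strip diacritics so "Acuña" becomes "Acuna" in the search query.
    ascii_name = (
        unicodedata.normalize("NFKD", player).encode("ascii", "ignore").decode("ascii")
    )
    query = quote_plus(f"{ascii_name} {game_date}")
    return {
        "game_highlights_url": f"https://www.mlb.com/gameday/{game_pk}/video",
        "film_room_search": f"https://www.mlb.com/video/search?q={query}",
        "game_pk": str(game_pk),
        "api_highlights_endpoint": (
            "https://statsapi.mlb.com/api/v1/schedule"
            f"?gamePk={game_pk}&hydrate=game(content(highlights(highlights)))"
        ),
    }


links = build_video_links(745890, "Ronald Acuña Jr.", "2024-07-20")
```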
🏟️ Smart Team Recognition
The server intelligently handles team names:
- Nicknames: "Orioles" → "BAL", "Red Sox" → "BOS"
- Cities and full names: "Boston" → "BOS", "New York Yankees" → "NYY"
- Historical teams: "Expos" → "MON", "Indians" → "CLE"
- All 30 current MLB teams supported with common variations
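A lookup of this kind can be sketched with a normalized alias table. The table below covers only the examples listed above, and both `TEAM_ALIASES` and `resolve_team` are hypothetical names, not the server's own.

```python
from typing import Optional

# Hypothetical alias table: lowercase name variants mapped to abbreviations.
TEAM_ALIASES = {
    "orioles": "BAL",
    "red sox": "BOS",
    "boston": "BOS",
    "yankees": "NYY",
    "new york yankees": "NYY",
    "expos": "MON",
    "indians": "CLE",
}
KNOWN_ABBREVIATIONS = set(TEAM_ALIASES.values())


def resolve_team(name: str) -> Optional[str]:
    """Return a team abbreviation, or None when the name is not recognized."""
    # Normalize case and collapse repeated whitespace before matching.
    key = " ".join(name.lower().split())
    if key.upper() in KNOWN_ABBREVIATIONS:
        return key.upper()
    return TEAM_ALIASES.get(key)
```

For example, `resolve_team("Red Sox")` and `resolve_team(" bos ")` both give "BOS", while an unknown name gives `None` so the caller can report a clear error.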
📊 Team-Wide Rankings
New team aggregation feature for statcast_leaderboard:
- Group results by team to see team-wide performance
- Calculates averages, maximums, and counts for each metric
- Perfect for questions like "Which team hits the hardest home runs?"
- Returns comprehensive team statistics including barrel counts and expected metrics
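The grouping described above amounts to a per-team average, maximum, and count. The sketch below does this in plain Python so it runs without dependencies; the `team` field name and the output keys are assumptions, and the real server may well use pandas for this step.

```python
from collections import defaultdict
from typing import Any, Dict, List


def aggregate_by_team(rows: List[Dict[str, Any]], metric: str) -> List[Dict[str, Any]]:
    """Hypothetical helper: average, max, and count of one metric per team."""
    values_by_team: Dict[str, List[float]] = defaultdict(list)
    for row in rows:
        value = row.get(metric)
        if value is not None:
            values_by_team[row["team"]].append(value)
    summary = [
        {"team": team, "avg": sum(vals) / len(vals), "max": max(vals), "count": len(vals)}
        for team, vals in values_by_team.items()
    ]
    # Rank teams by their average, highest first.
    summary.sort(key=lambda s: s["avg"], reverse=True)
    return summary


homers = [
    {"team": "ATL", "exit_velocity": 110.0},
    {"team": "ATL", "exit_velocity": 104.0},
    {"team": "NYY", "exit_velocity": 108.0},
]
ranking = aggregate_by_team(homers, "exit_velocity")
```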
Installation
Deploy on Smithery
- Fork this repository
- Connect your GitHub account to Smithery
- Deploy directly from your repository
Local Development
# Clone the repository
git clone https://github.com/yourusername/mlb_mcp.git
cd mlb_mcp/san-juan
# Install dependencies
pip install -r requirements.txt
# Run the server
python -m server
Claude Desktop Configuration
Add to your Claude Desktop configuration:
    {
      "mcpServers": {
        "mlb-stats": {
          "command": "python",
          "args": ["-m", "server"],
          "cwd": "/path/to/mlb_mcp/san-juan"
        }
      }
    }
Usage Examples
Find the Longest Home Runs
Query: "Show me the longest home runs from yesterday"
Tool: statcast_leaderboard("2024-07-20", "2024-07-20", "home_run", None, None, "distance", 10)
Hardest Hit Balls on Fast Pitches
Query: "What were the hardest hit balls on 99+ mph pitches last week?"
Tool: statcast_leaderboard("2024-07-14", "2024-07-20", None, 0, 99.0, "exit_velocity", 10)
Fastest Pitches Thrown
Query: "Show me the fastest pitches thrown yesterday"
Tool: statcast_leaderboard("2024-07-20", "2024-07-20", None, None, None, "pitch_velocity", 10)
Highest Spin Rate Pitches
Query: "What pitches had the highest spin rate today?"
Tool: statcast_leaderboard("2024-07-20", "2024-07-20", None, None, None, "spin_rate", 10)
Best Quality Contact (Barrels)
Query: "Show me the best quality contact this week"
Tool: statcast_leaderboard("2024-07-14", "2024-07-20", None, None, None, "barrel", 10)
Player Season Stats
Query: "Get Mike Trout's stats for this season"
Tool: get_player_stats("Mike Trout", "2024-04-01", "2024-10-01")
Team Performance
Query: "How are the Red Sox doing this year?"
Tool: get_team_stats("Red Sox", 2024, "batting")
League Leaders
Query: "Who's leading the league in home runs?"
Tool: get_leaderboard("HR", 2024, "batting", 10)
Team-Wide Rankings (statcast_leaderboard)
Query: "Which team has the hardest hit home runs this season?"
Tool: statcast_leaderboard("2024-04-01", "2024-10-01", "home_run", None, None, "exit_velocity", 10, "desc", "team")
Query: "Show me teams with the longest average home run distance this month"
Tool: statcast_leaderboard("2024-07-01", "2024-07-31", "home_run", None, None, "distance", 10, "desc", "team")
Query: "Which teams have the highest average pitch velocity?"
Tool: statcast_leaderboard("2024-07-20", "2024-07-20", None, None, None, "pitch_velocity", 10, "desc", "team")
Team Season Averages (team_season_stats)
Query: "What team averages the hardest hit balls in 2025?"
Tool: team_season_stats(2025, "exit_velocity")
Query: "Which team has the highest barrel rate this season?"
Tool: team_season_stats(2025, "barrel_rate")
Query: "Show me teams with the highest hard-hit rate on home runs only"
Tool: team_season_stats(2025, "hard_hit_rate", "home_run")
Query: "What team hits the ball the farthest on average?"
Tool: team_season_stats(2025, "distance")
Team Pitching Analysis (team_pitching_stats)
Query: "Which team throws the hardest in 2025?"
Tool: team_pitching_stats(2025, "velocity")
Query: "What team has the best slider spin rate?"
Tool: team_pitching_stats(2025, "spin_rate", "SL")
Query: "Which pitching staff gets the most swings and misses?"
Tool: team_pitching_stats(2025, "whiff_rate")
Query: "Show me teams with the highest ground ball rate"
Tool: team_pitching_stats(2025, "ground_ball_rate")
Query: "Which team has the lowest expected ERA based on contact quality?"
Tool: team_pitching_stats(2025, "xera")
Counting Queries (statcast_count)
Query: "How many home runs over 475 ft have been hit since 2023?"
Tool: statcast_count("2023-01-01", "2025-12-31", "home_run", min_distance=475)
Query: "Count all 110+ mph batted balls this season"
Tool: statcast_count("2025-04-01", "2025-10-31", "batted_ball", min_exit_velocity=110)
Query: "How many home runs between 400-450 feet were hit last year?"
Tool: statcast_count("2024-04-01", "2024-10-31", "home_run", min_distance=400, max_distance=450)
Query: "Total hits with exit velocity over 100 mph since 2022"
Tool: statcast_count("2022-01-01", "2025-12-31", "hit", min_exit_velocity=100)
Technical Details
Architecture
- Transport: stdio (Model Context Protocol)
- Framework: FastMCP for protocol implementation
- Data Source: pybaseball library (MLB official data)
- Language: Python 3.11+
- Deployment: Docker container via Smithery
Performance Optimizations
- Query Chunking: Automatically splits large date ranges into 5-day chunks to handle Baseball Savant's 30,000 row limit
- Response Caching: 15-minute cache for repeated queries, 24-hour cache for team season stats
- Vectorized Operations: Uses NumPy for efficient team identification instead of slower pandas apply() operations
- Sampling Strategy:
  - team_season_stats and team_pitching_stats: Every 7th day sampling
  - statcast_count: Adaptive sampling (3 days/month for multi-year, weekly for 6-12 months)
- Specialized Tools: Dedicated tools for common aggregate queries that would time out with full data:
  - team_season_stats for batting metrics
  - team_pitching_stats for pitching metrics
  - statcast_count for counting queries across multiple years
- Lazy Loading: Heavy dependencies (pandas, numpy, pybaseball) loaded only when needed for fast startup
- Efficient Filtering: Applies filters sequentially to minimize data processing overhead
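The query chunking described in the list above can be sketched as a date-range splitter. This is an illustration under stated assumptions rather than the server's code: `chunk_date_range` is a hypothetical name, and each chunk is treated as an inclusive start and end date.

```python
from datetime import date, timedelta
from typing import List, Tuple


def chunk_date_range(start: date, end: date, chunk_days: int = 5) -> List[Tuple[date, date]]:
    """Split an inclusive date range into consecutive chunks of at most chunk_days."""
    if end < start:
        raise ValueError("end must not be before start")
    chunks = []
    current = start
    while current <= end:
        # Each chunk covers chunk_days days, clipped to the end of the range.
        chunk_end = min(current + timedelta(days=chunk_days - 1), end)
        chunks.append((current, chunk_end))
        current = chunk_end + timedelta(days=1)
    return chunks


chunks = chunk_date_range(date(2024, 7, 1), date(2024, 7, 12))
```

Each (start, end) pair can then be fetched as its own query and the results concatenated, which keeps every request under the row limit.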
Key Features
- Error Handling: Comprehensive error messages for common issues
- Type Safety: Proper type conversion for JSON serialization
- Video Integration: Automatic video highlight links for all plays and team top performances
- Smart Team Lookup: Handles full names, abbreviations, cities, and historical teams
File Structure
san-juan/
├── server.py # Main MCP server implementation
├── requirements.txt # Python dependencies
├── smithery.yaml # Smithery deployment configuration
├── Dockerfile # Container configuration
└── README.md # This file
Requirements
- Python 3.11 or higher
- Internet connection for MLB data access
- Smithery account for deployment (optional)
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
This project uses publicly available MLB data through the pybaseball library. All MLB data is property of MLB Advanced Media.
Built with pybaseball and FastMCP