Blockbuster Index MCP Server
The Blockbuster Index MCP Server calculates the Blockbuster Index for all U.S. states by aggregating multiple retail and digital commerce signals. Each signal runs as an independent ECS Fargate task on a daily schedule, fetching data from various retail APIs, computing weighted scores, and uploading the results to S3 for use by the Blockbuster Index website.
The Blockbuster Index is an AI-powered exploration of how consumer buying habits have shifted across the United States from traditional brick-and-mortar retail to online commerce. Inspired by the cultural decline of physical video rental stores like Blockbuster, this project builds a unique state-by-state index using signals that reflect the tension between digital and analog purchasing behavior.
This MCP server is part of the Blockbuster Index Project which includes the following repositories:
- Blockbuster Index MCP Server: The Blockbuster Index calculation server (this repository).
- Blockbuster Index Project Client: The Blockbuster Index Next.js client.
- Blockbuster Index Lambda@Edge: The Lambda@Edge functions for the Blockbuster Index.
Table of Contents
- Project Overview
- Architecture Overview
- Signal Calculations
- Blockbuster Index Calculation
- ECS Task Scheduling
- Environments
- Tech Stack
- Setup Instructions
- Running Signals Locally
- Development Workflow
- Commits & Commitizen
- Linting & Formatting
- Unit Tests & Code Coverage
- Error & Performance Monitoring
- Environment Variables
- Build & Deployment
- Connecting to the Bastion Host
- License
Project Overview
This project calculates the Blockbuster Index by aggregating distinct retail footprint signals for each U.S. state. The server fetches data from various retail APIs, applies weighted calculations, and uploads the results to S3 for use by the Blockbuster Index website.
The calculation process currently includes the following signals, with more on the way:
- Amazon – E-commerce adoption and digital retail presence through job posting analysis with sliding window aggregation for improved accuracy.
- Census – Retail market maturity using U.S. Census Bureau data: retail establishments per 100,000 people.
- Broadband – Internet infrastructure and connectivity metrics.
- Walmart – Physical retail presence and technology job distribution through dual-signal analysis of Walmart job postings.
Each signal is weighted and combined to generate a comprehensive score that reflects the balance between digital and physical retail activity in each state.
Signal Inversion: Some signals are inverted after normalization so that higher scores always indicate greater digital adoption.
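As an illustrative sketch only (the actual weights, normalization, and helper names in this repository may differ), normalizing each signal, inverting where needed, and combining the results into a weighted score might look like this:

// Illustrative sketch; signal weights and helper names are hypothetical.
type StateScores = Record<string, number>;

function normalize(scores: StateScores, invert = false): StateScores {
  const values = Object.values(scores);
  const min = Math.min(...values);
  const range = Math.max(...values) - min || 1; // Guard against a zero range.
  return Object.fromEntries(
    Object.entries(scores).map(([state, value]) => {
      const normalized = (value - min) / range;
      // Inversion preserves the convention: higher score = greater digital adoption.
      return [state, invert ? 1 - normalized : normalized];
    }),
  );
}

function combine(signals: { scores: StateScores; weight: number }[]): StateScores {
  const index: StateScores = {};
  for (const { scores, weight } of signals) {
    for (const [state, value] of Object.entries(scores)) {
      index[state] = (index[state] ?? 0) + weight * value;
    }
  }
  return index;
}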
Architecture Overview
The Blockbuster Index MCP Server employs a modular microservices architecture where each signal runs as an independent ECS Fargate task. This design provides several key advantages:
Modular Signal Processing
- Independent Deployment: Each signal can be deployed, updated, and scaled independently.
- Fault Isolation: A failure in one signal doesn't affect the others.
- Resource Optimization: Each task can be configured with appropriate CPU/memory for its specific workload.
- Parallel Execution: Signals can run concurrently, reducing total processing time.
Task Architecture
- Amazon Signal Task: Web scraping and job posting analysis with sliding window aggregation.
- Census Signal Task: Demographic data processing and analysis.
- Broadband Signal Task: Infrastructure and connectivity metrics.
- Walmart Signal Task: Web scraping and job posting analysis with sliding window aggregation.
- Blockbuster Index Task: Signal aggregation and final index calculation.
Data Flow
- Signal Collection: Each signal task fetches and processes its respective data.
- Data Persistence: Signal-specific data is stored in DynamoDB tables.
- S3 Storage: Individual signal results are uploaded to S3 with versioned filenames for use by the Blockbuster Index website.
- Index Calculation: The blockbuster index task downloads all signal results and computes the final index.
- Result Publication: Final index is uploaded to S3 for consumption by the website and stored in DynamoDB tables.
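For illustration, publishing the final index to S3 with AWS SDK v3 might look like the following sketch; the bucket name and object key are hypothetical, not the project's actual values:

import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3';

// Illustrative sketch; bucket and key are hypothetical.
const s3 = new S3Client({});

async function publishIndex(index: Record<string, number>): Promise<void> {
  await s3.send(
    new PutObjectCommand({
      Bucket: 'example-blockbuster-index-bucket',
      Key: `data/index-${new Date().toISOString()}.json`, // Versioned filename.
      Body: JSON.stringify(index),
      ContentType: 'application/json',
    }),
  );
}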
Signal Calculations
See the signal calculations documentation for detailed information about how each signal is calculated and how the signals are combined to create the Blockbuster Index.
ECS Task Scheduling
The system employs a sophisticated scheduling architecture using AWS EventBridge to orchestrate the execution of all signals and the final index calculation.
Execution Schedule
- Signal Tasks: Run daily at staggered intervals to avoid resource contention.
- Blockbuster Index Task: Runs daily after all signals complete.
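For example, staggered schedules expressed in EventBridge cron syntax might look like the following; the times shown are illustrative, not the project's actual schedule:

- Amazon signal: cron(0 8 * * ? *)
- Walmart signal: cron(30 8 * * ? *)
- Blockbuster index task: cron(0 11 * * ? *)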
Task Dependencies
The scheduling system ensures proper execution order:
- All signal tasks run independently and in parallel.
- Each signal uploads its results to DynamoDB tables and S3 upon completion.
- The blockbuster index task waits for all signal results before executing.
- Final index is published to S3 for website consumption and DynamoDB tables for historical trends.
Error Handling
- Retry Logic: Failed tasks are automatically retried with exponential backoff.
- Monitoring: CloudWatch alarms trigger on task failures.
- Rollback Capability: Previous versions can be quickly restored.
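The retry logic above follows a standard exponential backoff pattern. A minimal TypeScript sketch of the idea (the helper below is hypothetical, not the project's actual code):

// Illustrative exponential backoff sketch; not the project's actual retry code.
async function withRetry<T>(task: () => Promise<T>, maxAttempts = 3): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await task();
    } catch (error) {
      if (attempt >= maxAttempts) throw error;
      // Wait 1s, 2s, 4s, ... between attempts.
      await new Promise((resolve) => setTimeout(resolve, 1000 * 2 ** (attempt - 1)));
    }
  }
}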
Environments
The Blockbuster Index operates in multiple environments to ensure smooth development, testing, and production workflows. Configuration files and environment variables should be set to point to the correct environment (dev/prod), depending on the stage of the application. Separate CloudFront distributions exist for each environment.
Tech Stack
The Blockbuster Index MCP Server is built using modern technologies to ensure reliability, scalability, and maintainability.
- AWS ECS Fargate: Containerized deployment platform that runs each signal as an independent scheduled task without managing servers.
- AWS CloudFormation: Infrastructure as Code (IaC) is used to define and provision AWS resources like ECS tasks, EventBridge rules, and IAM roles.
- AWS EventBridge: Triggers scheduled executions of each signal and the index combiner with precise timing and dependency management.
- AWS S3: Stores the calculated signal data and final index for consumption by the website, ensuring high availability and durability.
- AWS DynamoDB: Provides persistent storage for signal-specific data including job posting records and sliding window aggregates, enabling efficient data processing and historical analysis.
- AWS CloudWatch: Provides logging and monitoring capabilities for all tasks, including structured logging with Bunyan.
- Docker: Containerization technology used to package each signal for consistent deployment across environments.
- Webpack: Bundles the TypeScript code for production deployment with optimization and minification.
- Jest: JavaScript testing framework used for unit and integration testing, ensuring code reliability and preventing regressions.
- ESLint & Prettier: Linting and formatting tools that enforce code consistency, reduce syntax errors, and improve maintainability.
- Commitizen: A tool for enforcing a standardized commit message format, improving version control history and making collaboration more structured.
- Husky & Lint-Staged: Git hooks that ensure code quality by running linting and formatting before commits.
- Bunyan: Structured logging library that provides JSON-formatted logs for better parsing and analysis.
- AWS SDK v3: Modern AWS SDK for JavaScript that provides type-safe access to AWS services like S3.
- TypeScript: Provides type safety and enhanced developer experience for the server codebase.
- Node.js: Runtime environment for executing the server application.
- Puppeteer: Headless browser automation for web scraping Amazon and Walmart job postings.
This tech stack ensures that the Blockbuster Index MCP Server remains performant, secure, and easily maintainable while leveraging AWS infrastructure for scalability and reliability.
Setup Instructions
To clone the repository, install dependencies, and run the project locally, follow these steps:
- Clone the repository:
git clone https://github.com/jessemull/blockbuster-index-mcp-server.git
- Navigate into the project directory:
cd blockbuster-index-mcp-server
- Install the dependencies:
npm install
- Set up environment variables. Please see the Environment Variables section.
Running Signals Locally
Individual Signal Execution
In development mode, signals write results to local files instead of S3. To run each signal independently for development and testing:
npm run signal -- <signalName>
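For example, to run the Amazon signal (assuming signal names match the lowercase identifiers of the signals listed above):

npm run signal -- amazon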
Container-Based Testing
To test signals in a containerized environment that mirrors production:
npm run signal:container -- <signalName>
ECS Task Management
Tasks can be triggered manually from the command line. To trigger a specific task:
npm run ecs:run -- <signalName>
To run all tasks:
npm run ecs:run:all
Development Workflow
The Blockbuster Index MCP Server follows a structured development workflow that ensures code quality, testing, and proper deployment practices.
Development Process
- Feature Development: Create feature branches from main for new functionality
- Local Testing: Run signals locally using the provided npm scripts
- Code Quality: Ensure all linting and formatting standards are met
- Unit Testing: Write and run comprehensive unit tests with 80% coverage
- Pull Request: Submit PR with proper commitizen-formatted commits
- CI/CD Pipeline: Automated testing and quality checks via GitHub Actions
- Deployment: Use GitHub workflows for targeted signal deployment
Signal Development
When developing new signals or modifying existing ones:
- Local Development: Use npm run signal -- <signalName> for local testing.
- Container Testing: Use npm run signal:container -- <signalName> for production-like testing.
- ECS Testing: Use npm run ecs:run -- <signalName> for AWS environment testing.
- Deployment: Use the GitHub Actions deploy workflow for production deployment.
Code Standards
- TypeScript: All code must be written in TypeScript with proper type definitions.
- ESLint: Code must pass all linting rules without warnings.
- Prettier: Code must be properly formatted using Prettier.
- Jest: All new code must include comprehensive unit tests.
- Coverage: Maintain a minimum of 80% code coverage across all metrics.
Commits & Commitizen
This project uses Commitizen to ensure commit messages follow a structured format and versioning is consistent. Commit linting is enforced via a pre-commit husky hook.
Making a Commit
To make a commit in the correct format, run the following command. Commitizen will walk you through the creation of a structured commit message and versioning:
npm run commit
Linting & Formatting
This project uses ESLint and Prettier for code quality enforcement. Linting is enforced during every CI/CD pipeline to ensure consistent standards.
Linting Commands
Run linting:
npm run lint
Fix linting issues automatically:
npm run lint:fix
Formatting Commands
Format using Prettier:
npm run format
Check formatting:
npm run format:check
Pre-Commit Hook
Lint-staged is configured to run linting before each commit. The commit will be blocked if linting fails, ensuring code quality at the commit level.
Unit Tests & Code Coverage
Unit Tests
This project uses Jest for testing. Code coverage is enforced during every CI/CD pipeline. The build will fail if any tests fail or coverage drops below 80%.
Run tests:
npm run test
Run tests in watch mode:
npm run test:watch
Open coverage report:
npm run coverage:open
Code Coverage
Coverage thresholds are enforced at 80% for all metrics. The build will fail if coverage drops below this threshold.
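In Jest, thresholds like this are typically declared in the Jest configuration. A sketch matching the 80% requirement (the repository's actual configuration may differ):

// jest.config.js (illustrative sketch)
module.exports = {
  coverageThreshold: {
    global: {
      branches: 80,
      functions: 80,
      lines: 80,
      statements: 80,
    },
  },
};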
Error & Performance Monitoring
This project uses AWS CloudWatch for server-side error and performance monitoring with structured logging.
Configuration
CloudWatch logging is configured with environment-specific settings. Logs are sent to CloudWatch with structured JSON formatting for better parsing and analysis.
CloudWatch Logging
The server uses Bunyan for structured logging with the following features:
- JSON-formatted logs for better parsing and analysis.
- Performance metrics for tracking calculation times.
- Signal-specific logging for monitoring individual signal performance.
- Error context for debugging and monitoring.
- CloudWatch integration for centralized log management.
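A minimal Bunyan setup along these lines might look like the following sketch; the logger name and log fields are illustrative:

import bunyan from 'bunyan';

// Illustrative sketch; logger name and fields are hypothetical.
const logger = bunyan.createLogger({ name: 'blockbuster-index', level: 'info' });

// Structured, JSON-formatted entry with signal-specific context and a timing metric.
logger.info({ signal: 'amazon', durationMs: 1250 }, 'Signal calculation complete');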
Environment Variables
See the environment variables documentation for a complete list of all environment variables used by the Blockbuster Index MCP Server, organized by signal type and functionality.
Build & Deployment
The Blockbuster Index MCP Server uses a sophisticated CI/CD pipeline with GitHub Actions to enable independent deployment of each signal. This modular approach allows for targeted updates and rollbacks without affecting the entire system.
Build Process
Each signal is built using a shared webpack configuration, with the SIGNAL_TYPE environment variable determining which signal to compile:
SIGNAL_TYPE=<signalName> npm run build
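As a sketch of how a shared configuration might select an entry point per signal (the directory layout shown is hypothetical, not necessarily the repository's):

// webpack.config.js (illustrative sketch; entry path layout is hypothetical)
// TypeScript loader and plugin configuration omitted for brevity.
const signalType = process.env.SIGNAL_TYPE;

module.exports = {
  entry: `./src/signals/${signalType}/index.ts`,
  target: 'node',
  mode: 'production',
  output: { filename: `${signalType}.js` },
};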
Docker Container
The application is containerized using Docker for consistent deployment across environments. A shared Dockerfile is used with the SIGNAL_TYPE build argument to create signal-specific containers:
# Select which signal this image is built for.
ARG SIGNAL_TYPE
# Expose the selected signal to the application at runtime.
ENV SIGNAL_TYPE=$SIGNAL_TYPE
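For example, building a signal-specific image might look like this (the image tag is illustrative):

docker build --build-arg SIGNAL_TYPE=amazon -t blockbuster-index-amazon .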
GitHub Workflows
The project includes four GitHub Actions workflows that enable granular control over deployment and execution.
Deploy Workflow
Deploy individual signals to ECS Fargate with full CI/CD pipeline. Triggered manually from GitHub Actions UI with parameter selection.
Process:
- Code Quality Checks: Runs linting and unit tests with 80% coverage threshold.
- Docker Build: Creates signal-specific container image with versioned tag.
- ECR Push: Uploads image to AWS ECR with environment-specific repository.
- CloudFormation Deployment: Updates ECS task definition with new container image.
- Task Execution: Automatically runs the deployed task to verify functionality.
Rollback Workflow
Quickly rollback a specific signal to a previous container image version. Triggered manually when issues are detected with a recent deployment.
Process:
- CloudFormation Update: Deploys task definition with previous container image.
- Verification: Ensures rollback completes successfully.
Run Task Workflow
Manually trigger execution of any signal task for testing or data refresh. Triggered manually for on-demand signal execution.
Process:
- Task Launch: Executes ECS task with current task definition.
- Monitoring: Provides task ARN for CloudWatch monitoring.
Pull Request Workflow
Automated quality gates for all pull requests. Automatically triggered on all pull requests to main branch.
Process:
- Build Verification: Ensures code compiles correctly.
- Linting: Enforces code style and quality standards.
- Unit Testing: Runs comprehensive test suite with coverage reporting.
- Coverage Threshold: Enforces 80% minimum coverage requirement.
Infrastructure
CloudFormation Templates
Infrastructure is managed using AWS CloudFormation templates with environment-specific parameterization:
- blockbuster-index-task-definition.yaml: Defines ECS task definitions and scheduled execution rules for all signals with environment-specific configurations.
- blockbuster-index-cluster.yaml: Defines the ECS cluster, IAM roles, security groups, and related resources.
- blockbuster-index-dynamo-db.yaml: Defines DynamoDB tables for signal data storage with environment-specific naming.
- blockbuster-index-broadband-s3.yaml: Defines S3 buckets for broadband data storage.
Task Definitions
Each signal has its own ECS task definition with optimized resource allocation and environment-specific configurations:
- Amazon Task: Higher CPU allocation for web scraping operations with Puppeteer.
- Census Task: Balanced CPU/memory for API processing and data analysis.
- Broadband Task: Optimized for large dataset processing with S3 integration.
- Blockbuster Index Task: Lightweight aggregation and calculation with minimal resource requirements.
- Walmart Task: Higher CPU allocation for web scraping operations with Puppeteer.
Environment Parameterization
All infrastructure components use environment-specific naming and configuration:
- Resource Naming: All AWS resources include the environment name (e.g., blockbuster-index-amazon-task-dev).
- Configuration Mapping: Environment-specific settings for logging levels, S3 buckets, and DynamoDB tables.
- Security Groups: Environment-specific network security configurations.
- Scheduling: Environment-specific EventBridge rules for task execution.
Connecting to the Bastion Host
To connect to the AWS EC2 bastion host and access AWS resources, you can use the following command:
npm run bastion
Environment Variables
The following environment variables must be set in a .env file in the root of the project:

| Variable | Description |
| --- | --- |
| SSH_HOST | The bastion host IP. |
| SSH_PRIVATE_KEY_PATH | Path to the bastion host private SSH key. |
| SSH_USER | The bastion host username. |
Ensure you have the appropriate permissions set on your SSH key for secure access.
License
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.