babylonjs-mcp

VibeCAD/babylonjs-mcp


VibeCAD is a cutting-edge 3D scene manipulation system that leverages natural language processing and the Model Context Protocol (MCP) to create and manage 3D objects in a Babylon.js environment.

VibeCAD - AI-Powered 3D Scene Manipulation with MCP

VibeCAD is an innovative 3D scene manipulation system that combines natural language processing with the Model Context Protocol (MCP) to create and manage 3D objects in a Babylon.js environment. Users can create 3D scenes using simple text commands through either a web interface or Claude Desktop.

Core Functionality

VibeCAD enables users to:

  • Create 3D objects using natural language commands
  • Manipulate objects (move, rotate, scale, color)
  • Manage scene objects through an intuitive GUI
  • Use AI assistants (OpenAI or Claude) to interpret commands
  • View the scene in real time with Babylon.js

Current Capabilities

This version of the MCP server can create cylinder, cube, and sphere primitives. Torus creation does not work yet: the browser console logs in the GUI show that a Babylon.js torus is being generated, but it is not rendered. The create, delete, and select tools all work; the list tool does not. Next steps would involve extending the MCP server to handle more complex shapes.

Architecture Overview

graph TB
    subgraph "Input Sources"
        User1[User via Browser]
        User2[User via Claude Desktop]
    end
    
    subgraph "AI Processing"
        OpenAI[OpenAI API]
        Claude[Claude Desktop]
    end
    
    subgraph MCPServer["MCP Server"]
        STDIO[STDIO Interface]
        HTTP[HTTP API :8081]
        WS[WebSocket Server :8080]
        Core[MCP Core Logic]
    end
    
    subgraph "Frontend"
        GUI[React GUI :5173]
        Babylon[Babylon.js Scene]
        WSClient[WebSocket Client]
    end
    
    User1 -->|Text Input| GUI
    GUI -->|API Call| OpenAI
    OpenAI -->|Structured Commands| HTTP
    
    User2 -->|Text Input| Claude
    Claude -->|MCP Protocol| STDIO
    
    HTTP --> Core
    STDIO --> Core
    Core -->|Commands| WS
    WS -->|Real-time Updates| WSClient
    WSClient --> Babylon
    
    style GUI fill:#e1f5fe
    style MCPServer fill:#fff3e0
    style Babylon fill:#c8e6c9

MCP Server Capabilities

Available Tools

1. create_object

Creates a new 3D object in the scene.

{
  "shape": "box|sphere|cylinder|cone|torus",
  "name": "unique_object_name",
  "position": { "x": 0, "y": 1, "z": 0 },
  "size": 2,
  "color": { "r": 1, "g": 0, "b": 0 }
}

2. delete_object

Removes an object from the scene.

{
  "name": "object_name"
}

3. select_object

Selects an object for manipulation.

{
  "name": "object_name"
}

4. list_objects

Lists all objects currently in the scene.

{}
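
The `create_object` payload above can be mirrored as TypeScript types with a small validator. This is only a sketch: the README does not say which fields are required or what ranges are accepted, so the rules below (every field required, positive `size`, color channels in 0..1) are assumptions, and `validateCreateObject` is a hypothetical helper rather than something taken from the project.

```typescript
// Argument types mirroring the JSON shape listed above.
type Shape = "box" | "sphere" | "cylinder" | "cone" | "torus";

interface Vec3 { x: number; y: number; z: number; }
interface Rgb { r: number; g: number; b: number; }

interface CreateObjectArgs {
  shape: Shape;
  name: string;
  position: Vec3;
  size: number;
  color: Rgb;
}

const SHAPES: readonly string[] = ["box", "sphere", "cylinder", "cone", "torus"];

const isNum = (v: unknown): v is number => typeof v === "number" && isFinite(v);
const isRecord = (v: unknown): v is Record<string, unknown> =>
  typeof v === "object" && v !== null;

// Hypothetical validator: returns a list of problems, empty when the payload is usable.
// Assumption: every field is required and color channels lie in the 0..1 range.
function validateCreateObject(input: unknown): string[] {
  if (!isRecord(input)) return ["payload must be an object"];
  const errors: string[] = [];

  const shape = input["shape"];
  if (typeof shape !== "string" || SHAPES.indexOf(shape) === -1) {
    errors.push("shape must be one of " + SHAPES.join(", "));
  }

  const name = input["name"];
  if (typeof name !== "string" || name.length === 0) {
    errors.push("name must be a non-empty string");
  }

  const pos = input["position"];
  if (!isRecord(pos) || !isNum(pos["x"]) || !isNum(pos["y"]) || !isNum(pos["z"])) {
    errors.push("position must have numeric x, y, z");
  }

  const size = input["size"];
  if (!isNum(size) || size <= 0) {
    errors.push("size must be a positive number");
  }

  const color = input["color"];
  const inUnit = (v: unknown): boolean => isNum(v) && v >= 0 && v <= 1;
  if (!isRecord(color) || !inUnit(color["r"]) || !inUnit(color["g"]) || !inUnit(color["b"])) {
    errors.push("color must have r, g, b in the range 0..1");
  }

  return errors;
}
```

A caller could run the validator first and treat the payload as `CreateObjectArgs` only when the returned list is empty.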

Project Structure

VibeCAD/
├── babylonjs-mcpV2/          # MCP Server
│   ├── src/
│   │   ├── core/             # Core MCP logic
│   │   │   ├── mcp-server.ts
│   │   │   ├── types.ts
│   │   │   ├── babylon-scene.ts
│   │   │   ├── command-parser.ts
│   │   │   └── websocket-client.ts
│   │   └── mcp-stdio-server.ts  # Main server file
│   ├── dist/                 # Compiled JS
│   └── package.json
│
└── gui/gui/                  # Frontend Application
    ├── src/
    │   ├── App.tsx           # Main React component
    │   ├── App.css           # Styles
    │   └── main.tsx          # Entry point
    ├── public/
    └── package.json

Technical Architecture

WebSocket Connection

  • Purpose: Real-time bidirectional communication between MCP server and GUI
  • Port: 8080 (configurable via MCP_WS_PORT)
  • Protocol: Standard WebSocket with JSON message format
  • Features:
    • Auto-reconnection (5 attempts, 3-second intervals)
    • Connection status monitoring
    • Message acknowledgment system
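
The auto-reconnection behaviour listed above (5 attempts, 3 seconds apart) could be modelled roughly as follows. This is a minimal sketch, not the project's implementation: `ReconnectPolicy` and `onConnectionLost` are hypothetical names, and the connect and scheduling functions are injected so the logic can run without a real WebSocket.

```typescript
// Hypothetical helper; names are illustrative and not taken from the project.
class ReconnectPolicy {
  private attempts = 0;
  private readonly maxAttempts: number;
  private readonly delayMs: number;

  // Defaults follow the documented values: 5 attempts at 3-second intervals.
  constructor(maxAttempts: number = 5, delayMs: number = 3000) {
    this.maxAttempts = maxAttempts;
    this.delayMs = delayMs;
  }

  // Returns the delay before the next attempt, or null once the budget is spent.
  nextDelay(): number | null {
    if (this.attempts >= this.maxAttempts) return null;
    this.attempts += 1;
    return this.delayMs;
  }

  // Call after a successful connection so a later drop gets a fresh budget.
  reset(): void {
    this.attempts = 0;
  }
}

// Schedules one reconnect attempt, or reports that the client should give up.
// `connect` and `schedule` are injected; in a browser, `schedule` could be setTimeout.
function onConnectionLost(
  policy: ReconnectPolicy,
  connect: () => void,
  schedule: (fn: () => void, ms: number) => void,
): boolean {
  const delay = policy.nextDelay();
  if (delay === null) return false; // budget exhausted; caller can show "disconnected"
  schedule(connect, delay);
  return true;
}
```

Keeping the attempt counter separate from the socket makes the give-up condition easy to surface in the connection status indicator.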

HTTP API Wrapper

  • Purpose: Bridge between web-based AI services and MCP server
  • Port: 8081 (configurable via MCP_HTTP_PORT)
  • Endpoints:
    • GET /health - Server status and connection info
    • POST /tools/:toolName - Execute MCP tools
  • Features:
    • CORS enabled for browser access
    • RESTful design
    • Structured JSON responses
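
As an illustration of the two endpoints, the sketch below builds the request descriptions a client might send. It assumes that tool arguments travel as the JSON request body of `POST /tools/:toolName`, which the README implies but does not spell out; `buildToolRequest` and `buildHealthRequest` are hypothetical helpers.

```typescript
interface ToolRequest {
  url: string;
  method: "GET" | "POST";
  headers: Record<string, string>;
  body?: string;
}

// Removes trailing slashes so "http://host/" and "http://host" behave the same.
function trimBase(baseUrl: string): string {
  return baseUrl.replace(/\/+$/, "");
}

// Builds the request for POST /tools/:toolName.
// Assumption: the tool arguments are sent as the JSON request body.
function buildToolRequest(baseUrl: string, toolName: string, args: object = {}): ToolRequest {
  return {
    url: trimBase(baseUrl) + "/tools/" + encodeURIComponent(toolName),
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(args),
  };
}

// Builds the request for GET /health.
function buildHealthRequest(baseUrl: string): ToolRequest {
  return { url: trimBase(baseUrl) + "/health", method: "GET", headers: {} };
}

const createCube = buildToolRequest("http://localhost:8081", "create_object", {
  shape: "box",
  name: "cube1",
  position: { x: 0, y: 1, z: 0 },
  size: 2,
  color: { r: 1, g: 0, b: 0 },
});
```

The resulting `url`, `method`, `headers`, and `body` can be passed to `fetch` or any other HTTP client; the shape of the response is not documented here, so it is left out.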

Communication Paths

Browser Path (OpenAI Integration)

  1. User enters natural language command in GUI
  2. GUI sends text to OpenAI API
  3. OpenAI interprets and returns structured tool calls
  4. GUI makes HTTP POST to MCP server's HTTP API
  5. MCP server processes command and sends via WebSocket
  6. GUI receives WebSocket message and updates 3D scene

Claude Desktop Path

  1. User enters command in Claude Desktop
  2. Claude interprets using MCP tools directly
  3. Claude calls MCP server via STDIO interface
  4. MCP server processes command and sends via WebSocket
  5. GUI receives WebSocket message and updates 3D scene
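
Both paths end with the GUI receiving a WebSocket message and updating the scene. The sketch below shows one way that last step might look. The wire format is not documented in this README, so the `{ type, payload }` envelope is an assumption, and `SceneAdapter` is a hypothetical stand-in for whatever wraps the Babylon.js scene in the GUI.

```typescript
// Hypothetical stand-in for the GUI's Babylon.js scene wrapper.
interface SceneAdapter {
  create(name: string, shape: string): void;
  remove(name: string): void;
  select(name: string): void;
}

// Parses one raw WebSocket frame and applies it to the scene.
// Assumed envelope: { "type": "<tool name>", "payload": { "name": ..., "shape": ... } }.
// Returns false for malformed or unknown messages instead of throwing,
// so a single bad frame does not break the update loop.
function applySceneMessage(raw: string, scene: SceneAdapter): boolean {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return false;
  }
  if (typeof parsed !== "object" || parsed === null) return false;

  const { type, payload } = parsed as { type?: unknown; payload?: unknown };
  if (typeof payload !== "object" || payload === null) return false;

  const { name, shape } = payload as { name?: unknown; shape?: unknown };
  if (typeof name !== "string") return false;

  switch (type) {
    case "create_object":
      if (typeof shape !== "string") return false;
      scene.create(name, shape);
      return true;
    case "delete_object":
      scene.remove(name);
      return true;
    case "select_object":
      scene.select(name);
      return true;
    default:
      return false;
  }
}
```

In the GUI this function would be called from the WebSocket client's message handler with the frame's text data.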

Architecture Justification

Why HTTP API?

  • Browser Security: Browsers cannot directly access STDIO interfaces
  • CORS Support: Enables cross-origin requests from web applications
  • RESTful Design: Familiar pattern for web developers
  • Stateless: Each request is independent, improving reliability
  • Debugging: Easy to test with tools like curl or Postman

Why WebSocket for GUI Updates?

  • Real-time: Instant updates without polling
  • Bidirectional: Allows for future features like state sync
  • Efficient: Lower overhead than HTTP for frequent updates
  • Event-driven: Natural fit for UI updates

Why Dual Interface (STDIO + HTTP)?

  • Flexibility: Supports both desktop and web clients
  • MCP Compliance: STDIO for standard MCP protocol
  • Web Compatibility: HTTP for browser-based AI services
  • Unified Logic: Single codebase serves both interfaces
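
The "Unified Logic" point can be pictured as a single core that both adapters call into. The following is a simplified sketch under stated assumptions, not the contents of `mcp-server.ts`: `McpCore`, `handleToolCall`, and the result and broadcast shapes are all hypothetical.

```typescript
interface SceneObject { name: string; shape: string; }
type ToolResult = { ok: true; data?: unknown } | { ok: false; error: string };
type Broadcast = (message: { type: string; payload: unknown }) => void;

// One core instance serves every transport; `broadcast` stands in for the WebSocket push.
class McpCore {
  private readonly objects = new Map<string, SceneObject>();
  private selected: string | null = null;
  private readonly broadcast: Broadcast;

  constructor(broadcast: Broadcast) {
    this.broadcast = broadcast;
  }

  handleToolCall(tool: string, args: Record<string, unknown>): ToolResult {
    const rawName = args["name"];
    const name = typeof rawName === "string" ? rawName : "";
    switch (tool) {
      case "create_object": {
        if (name === "" || this.objects.has(name)) {
          return { ok: false, error: "name must be non-empty and unique" };
        }
        const obj: SceneObject = { name, shape: String(args["shape"]) };
        this.objects.set(name, obj);
        this.broadcast({ type: tool, payload: args });
        return { ok: true, data: obj };
      }
      case "delete_object": {
        if (!this.objects.delete(name)) return { ok: false, error: "unknown object: " + name };
        if (this.selected === name) this.selected = null;
        this.broadcast({ type: tool, payload: { name } });
        return { ok: true };
      }
      case "select_object": {
        if (!this.objects.has(name)) return { ok: false, error: "unknown object: " + name };
        this.selected = name;
        this.broadcast({ type: tool, payload: { name } });
        return { ok: true };
      }
      case "list_objects":
        return { ok: true, data: Array.from(this.objects.values()) };
      default:
        return { ok: false, error: "unknown tool: " + tool };
    }
  }
}

// An HTTP handler and a STDIO handler would both reduce to the same call.
const sent: unknown[] = [];
const core = new McpCore((message) => { sent.push(message); });
const fromHttp = core.handleToolCall("create_object", { name: "cube1", shape: "box" });
const fromStdio = core.handleToolCall("list_objects", {});
```

Because the transports only translate requests into `handleToolCall`, adding a new tool means touching one place rather than two.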

Configuration

Environment Variables

MCP Server
  • MCP_WS_PORT - WebSocket port (default: 8080)
  • MCP_HTTP_PORT - HTTP API port (default: 8081)
GUI
  • VITE_MCP_WS_URL - WebSocket URL (default: ws://localhost:8080)
  • VITE_MCP_HTTP_URL - HTTP API URL (default: http://localhost:8081)
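
A server could read these variables along the following lines. This is a sketch only: `parsePort` and `loadServerConfig` are hypothetical helpers, and falling back to the default on an invalid value (rather than aborting startup) is an assumption.

```typescript
interface ServerConfig {
  wsPort: number;
  httpPort: number;
}

// Parses a port from an environment value, falling back to the documented default.
// Assumption: missing, non-numeric, or out-of-range values fall back silently.
function parsePort(value: string | undefined, fallback: number): number {
  if (value === undefined || value.trim() === "") return fallback;
  const port = Number(value);
  const valid = Math.floor(port) === port && port > 0 && port <= 65535;
  return valid ? port : fallback;
}

// The environment is passed in so the function can be exercised without a real process.
function loadServerConfig(env: Record<string, string | undefined>): ServerConfig {
  return {
    wsPort: parsePort(env["MCP_WS_PORT"], 8080),
    httpPort: parsePort(env["MCP_HTTP_PORT"], 8081),
  };
}

const defaults = loadServerConfig({});
const custom = loadServerConfig({ MCP_WS_PORT: "9000", MCP_HTTP_PORT: "not-a-port" });
```

On the server the argument would typically be `process.env`. On the GUI side, Vite exposes variables prefixed with `VITE_` through `import.meta.env`, so the two URL settings would be read from there instead.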

Getting Started

Prerequisites

  • Node.js 18+
  • npm or yarn
  • OpenAI API key (for browser path)
  • Claude Desktop (for MCP path)

Installation

  1. Install MCP Server
cd babylonjs-mcpV2
npm install
npm run build:server
  2. Install GUI
cd gui/gui
npm install

Running the Application

  1. Start MCP Server
cd babylonjs-mcpV2
node dist/mcp-stdio-server.js
  2. Start GUI
cd gui/gui
npm run dev
  3. Open Browser: navigate to http://localhost:5173

Usage Examples

Via Browser (OpenAI)

  • "Create a red cube"
  • "Make a blue sphere above the cube"
  • "Delete the cube"
  • "Create three green cylinders in a row"

Via Claude Desktop

Configure Claude Desktop with:

{
  "mcpServers": {
    "babylonjs": {
      "command": "node",
      "args": ["/path/to/babylonjs-mcpV2/dist/mcp-stdio-server.js"]
    }
  }
}

Then ask Claude to create 3D objects using the available tools.

Future Enhancements

  • Additional MCP tools (move, rotate, scale, change color)
  • Advanced object grouping and hierarchies
  • Scene persistence and loading
  • Multi-user collaboration
  • Export to 3D file formats
  • Physics simulation integration

License

MIT License - See LICENSE file for details