expert-enigma: A GNN-Powered Model Context Protocol Server for Ruby

A next-generation, local-first Model Context Protocol (MCP) server for Ruby repositories. It uses Graph Neural Networks (GNNs) to provide LLMs and coding agents with a deep, structural understanding of code, far beyond simple text analysis.

The Problem

Modern LLM-based coding agents are powerful, but they often lack a true understanding of a project's architecture. When analyzing dynamic languages like Ruby, they rely on text-based heuristics and miss the rich structural relationships within the code (inheritance, method calls, composition). This leads to shallow, context-poor responses.

The Solution

This project provides a highly intelligent context server that speaks the standard MCP language. Instead of just parsing text, it transforms Ruby code into a graph and uses a Graph Neural Network to create sophisticated embeddings that capture the code's structure and intent.

The core innovation is leveraging the research and models from the jubilant-palm-tree project, which demonstrated that GNNs can learn meaningful representations of Ruby ASTs.

Core Concepts

  • AST to Graph Transformation: Ruby files are parsed into Abstract Syntax Trees (ASTs), which are then converted into rich graph structures where nodes represent code entities (classes, methods) and edges represent their relationships (calls, inherits, includes); a sketch of this step follows this list.
  • GNN-Powered Embeddings: We use a pre-trained GNN model (in ONNX format) to generate vector embeddings for each code symbol. Unlike text embeddings, these vectors capture the structural similarity and complexity of the code, allowing for powerful semantic search.
  • Lightweight & Local-First: The entire engine is designed to run with minimal overhead on a developer's machine. It uses an embedded database solution (SQLite with the sqlite-vss extension for vector search) that requires no external services.
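
To make the first concept concrete, here is a minimal sketch of the AST-to-graph step, using the same parser gem the symbol extractor relies on. The node labels and the single :child edge type are illustrative; the real graph builder ported from jubilant-palm-tree presumably encodes richer node features and edge types (calls, inherits, includes).

```ruby
require "parser/current"

# Parse a small snippet into an AST (single-quoted heredoc so the
# interpolation is parsed as source, not executed here).
source = <<~'RUBY'
  class Greeter
    def greet(name)
      puts "Hello, #{name}"
    end
  end
RUBY

ast = Parser::CurrentRuby.parse(source)

nodes = []
edges = []

# Depth-first walk: every AST node becomes a graph node, and each
# parent/child pair becomes a structural edge.
walk = lambda do |node, parent_id|
  return unless node.is_a?(Parser::AST::Node)
  id = nodes.size
  nodes << node.type
  edges << [parent_id, id, :child] unless parent_id.nil?
  node.children.each { |child| walk.call(child, id) }
end
walk.call(ast, nil)

puts "#{nodes.size} nodes, #{edges.size} edges"
p nodes.first(5)  # => [:class, :const, :def, :args, :arg]
```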

Architecture & Implementation Details

The data pipeline is designed for a rich, local-first experience:

Ruby Files -> AST Parser -> Symbol/Reference Extractor -> GNN Inference (ONNX) -> SQLite DB -> MCP API -> LLM Agent
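
Read as code, that flow might be wired together roughly as below. Only the file paths and gem names are confirmed by this README; the class constants are inferred from the lib/expert_enigma/ paths, and every method name and signature here is a hypothetical placeholder for the data flow, not the actual API.

```ruby
require "parser/current"
require "sqlite3"

# Hypothetical wiring; constructor arguments and method names are assumptions.
extractor = ExpertEnigma::SymbolExtractor.new
generator = ExpertEnigma::EmbeddingGenerator.new("models/gnn.onnx")
db        = SQLite3::Database.new("expert_enigma.db")

Dir.glob("**/*.rb").each do |path|
  ast = Parser::CurrentRuby.parse(File.read(path))
  symbols, references = extractor.extract(ast)        # assumed signature
  embeddings = symbols.map { |s| generator.embed(s) } # assumed signature
  # ...persist the AST, symbols, references, and embeddings to SQLite...
end
```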

Key Components

  • scripts/05_build_database.rb: This is the main script for indexing a repository. It scans for Ruby files, extracts symbols and references, generates embeddings, and populates the SQLite database.
  • lib/expert_enigma/symbol_extractor.rb: A class that uses the parser gem to traverse the AST of a Ruby file and extract definitions (classes, modules, methods) and references (usages) of symbols.
  • lib/expert_enigma/embedding_generator.rb: This class loads the pre-trained GNN model (in .onnx format) and uses the onnxruntime gem to generate vector embeddings for method ASTs (see the usage sketch after this list).
  • lib/expert_enigma/ast_explorer.rb: A utility class for querying and navigating a file's AST, with methods to find nodes by type or ID and to retrieve a node's ancestors.
  • lib/mcp_server.rb: A Sinatra-based web server that exposes the MCP API endpoints. It queries the SQLite database to provide information about the codebase.
  • expert_enigma.db: An SQLite database containing the indexed data for the repository, including file ASTs, symbols, references, and vector embeddings for methods.
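
For the embedding step in particular, the core surface of the onnxruntime gem looks like this. The input names and tensor shapes below are assumptions; the real ones depend on how the jubilant-palm-tree model was exported and should be read from model.inputs.

```ruby
require "onnxruntime"

# Load the exported GNN once at startup (path is illustrative).
model = OnnxRuntime::Model.new("models/gnn_embedding.onnx")

# Inspect the graph to see what the exported model actually expects.
p model.inputs   # names, shapes, and element types of each input tensor
p model.outputs

# Placeholder tensors; real names and shapes must match model.inputs.
node_features = Array.new(4) { Array.new(16, 0.0) }  # assumed: 4 nodes x 16 features
edge_index    = [[0, 0, 2], [1, 2, 3]]               # assumed: COO edge list

result = model.predict(
  "node_features" => node_features,  # assumed input name
  "edge_index"    => edge_index      # assumed input name
)
```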

Progress & Implemented Features

The project has a functional core covering all planned features in the "Core AST Inspection" and "Semantic & Cross-File Analysis" categories.

Completed

  • Phase 1: Core Integration
    • Ported the graph and embedding generation logic from jubilant-palm-tree.
    • Set up the SQLite database schema (symbols, files, relations).
    • Integrated sqlite-vss for vector storage and search; a schema sketch follows this list.
  • Phase 2: Indexer & API
    • Built the main indexer process for full repository scans.
    • Implemented all core MCP API endpoints for inspection and analysis.
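
A plausible shape for the Phase 1 schema and the sqlite-vss integration, sketched with the sqlite3 gem. Table names, columns, and the embedding dimension are assumptions; only the database file name and the use of sqlite-vss are confirmed above.

```ruby
require "sqlite3"

db = SQLite3::Database.new("expert_enigma.db")

# Illustrative schema ("refs" rather than "references", an SQL keyword).
db.execute_batch <<~SQL
  CREATE TABLE IF NOT EXISTS files   (id INTEGER PRIMARY KEY, path TEXT, ast TEXT);
  CREATE TABLE IF NOT EXISTS symbols (id INTEGER PRIMARY KEY, file_id INTEGER,
                                      name TEXT, kind TEXT, line INTEGER);
  CREATE TABLE IF NOT EXISTS refs    (id INTEGER PRIMARY KEY, symbol_id INTEGER,
                                      file_id INTEGER, line INTEGER);
SQL

# sqlite-vss ships as a loadable extension; vss0 depends on vector0. The
# hardcoded paths are exactly the portability issue flagged under Next Steps.
db.enable_load_extension(true)
db.load_extension("/path/to/vector0")
db.load_extension("/path/to/vss0")

# Virtual table for nearest-neighbour search; 64 is an assumed dimension.
db.execute "CREATE VIRTUAL TABLE IF NOT EXISTS vss_methods USING vss0(embedding(64))"
```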

Next Steps

  • Phase 3: Stabilize & Refine Core Features
    • Functional Search: Replace the placeholder random vector in the /search endpoint with a real query embedding mechanism (a sketch of the intended flow follows this list).
    • Robust VSS Loading: Make the loading of the sqlite-vss extension portable by removing hardcoded paths.
    • Server Refactoring: Refactor mcp_server.rb to reduce code duplication for database connections and AST loading.
  • Phase 4: Code Transformation & Real-time Indexing
    • Implement the code transformation endpoints (/replace_node_text, etc.).
    • Add a file watcher for real-time, incremental indexing.
  • Phase 5: Tooling & DX
    • Create a simple CLI for starting the server and managing the index.
    • Develop a GitHub Actions workflow for CI-based index generation.
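
For the Functional Search item, the intended flow is presumably: embed the query, then ask sqlite-vss for the nearest method embeddings. A hedged sketch of what the inside of the Sinatra /search handler might become, assuming the vss_methods table from the earlier sketch and a yet-to-be-written embed_query helper:

```ruby
# Hypothetical /search flow; embed_query, the table, and the column are assumptions.
query_vector = embed_query(params["query"])  # must land in the GNN's vector space

rows = db.execute(<<~SQL, [query_vector.to_json, params.fetch("limit", "5").to_i])
  SELECT rowid, distance
  FROM vss_methods
  WHERE vss_search(embedding, ?)
  LIMIT ?
SQL
```

The hard part is embed_query itself: the GNN embeds method ASTs, not free text, so a text query needs either its own encoder into the same vector space or a proxy such as embedding a synthesized snippet, which is presumably why the endpoint currently falls back to a random vector.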

API Documentation

The server runs on http://localhost:65432. All endpoints return JSON.

  • GET /
    Health check. Parameters: none.
    Example: curl http://localhost:65432/
  • GET /list_files
    Lists all indexed files in the repository. Parameters: none.
    Example: curl http://localhost:65432/list_files
  • GET /get_ast
    Retrieves the full AST for a single file. Parameters: file_path (string).
    Example: curl "http://localhost:65432/get_ast?file_path=test/test_file_1.rb"
  • GET /get_symbols
    Returns all symbols for a given file. Parameters: file_path (string).
    Example: curl "http://localhost:65432/get_symbols?file_path=test/test_file_1.rb"
  • GET /query_nodes
    Finds nodes of a specific type in a file's AST. Parameters: file_path, type.
    Example: curl "http://localhost:65432/query_nodes?file_path=test/test_file_1.rb&type=def"
  • GET /get_node_details
    Retrieves details for a specific node by its ID. Parameters: file_path, node_id.
    Example: curl "http://localhost:65432/get_node_details?file_path=test/test_file_1.rb&node_id=root.children.0"
  • GET /get_ancestors
    Returns the ancestor nodes for a given node ID. Parameters: file_path, node_id.
    Example: curl "http://localhost:65432/get_ancestors?file_path=test/test_file_1.rb&node_id=root.children.0.children.2.children.0"
  • GET /find_definition
    Finds the definition of a symbol by name. Parameters: name (string).
    Example: curl "http://localhost:65432/find_definition?name=MyClass"
  • GET /find_references
    Finds all references to a symbol by name. Parameters: name (string).
    Example: curl "http://localhost:65432/find_references?name=my_method"
  • GET /get_call_hierarchy
    Gets inbound and outbound calls for a method. Parameters: file_path, line.
    Example: curl "http://localhost:65432/get_call_hierarchy?file_path=test/test_file_1.rb&line=3"
  • GET /search
    Vector search for methods (currently a placeholder; see Next Steps). Parameters: query, limit.
    Example: curl "http://localhost:65432/search?query=database&limit=5"
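
Any HTTP client works as well as curl; for example, with nothing but Ruby's standard library (endpoint and parameter taken from the list above):

```ruby
require "net/http"
require "json"

# Look up where MyClass is defined, mirroring the curl example above.
uri = URI("http://localhost:65432/find_definition")
uri.query = URI.encode_www_form(name: "MyClass")

response = Net::HTTP.get_response(uri)
puts JSON.pretty_generate(JSON.parse(response.body))
```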

Testing Approach

The server's endpoints are tested using a set of controlled Ruby files in the test/ directory. The testing process is as follows:

  1. Create Test Files: The test/ directory contains Ruby files with a known structure of classes, modules, methods, and references.
  2. Build Test Database: The scripts/05_build_database.rb script is configured to scan only the test/ directory, creating a clean expert_enigma.db with only the test data.
  3. Verify with curl: The MCP server is started, and curl commands are used to systematically test each endpoint against the known content of the test files, verifying the JSON output.

This approach ensures that the core functionality of the server is working as expected before moving on to more complex features.
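
Step 3 is easy to script as well; a minimal smoke test in plain Ruby, hitting a few endpoints from the list above against the known test files:

```ruby
require "net/http"
require "json"

BASE = "http://localhost:65432"

# Each endpoint should answer 200 with parseable JSON once the test
# database has been built from the test/ directory.
[
  "/",
  "/list_files",
  "/get_symbols?file_path=test/test_file_1.rb"
].each do |path|
  response = Net::HTTP.get_response(URI("#{BASE}#{path}"))
  raise "#{path} returned #{response.code}" unless response.code == "200"
  JSON.parse(response.body)  # raises if the body is not valid JSON
  puts "#{path} OK"
end
```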

License

This project is licensed under the MIT License. See the LICENSE file for details.