mcp-fs-server by bramburn - MCP Server

MCP Semantic Watcher

This project provides an MCP server that watches a code repository, indexes code snippets using Ollama embeddings, and stores them in Qdrant for semantic search. It leverages Tree-sitter for parsing code structures and Chokidar for efficient file watching.

Features

Real-time Code Indexing: Monitors a specified repository path for file changes.
Semantic Search: Indexes code snippets as embeddings using Ollama and allows searching via natural language queries.
Code Parsing: Utilizes Tree-sitter for structured parsing of supported programming languages.
Qdrant Integration: Stores embeddings and metadata in a Qdrant vector database.
MCP Server: Exposes tools for semantic search and index management via the Model Context Protocol.

Setup and Installation

Prerequisites

Node.js: Version 18 or higher is recommended.
npm or yarn: Package manager for Node.js.
Qdrant: A running Qdrant instance. The default URL is http://localhost:6333. If your instance requires authentication, set QDRANT_API_KEY (see the Configuration section).
Ollama: Ollama must be installed and running, with a compatible embedding model downloaded (default: nomic-embed-text).
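
Before starting the server, it can help to confirm that both services respond. The following is a minimal sketch, not part of this project: `preflight` and `hasModel` are hypothetical helper names, the Ollama base URL assumes Ollama's standard local port (11434), and the check relies on Qdrant's `GET /collections` and Ollama's `GET /api/tags` endpoints plus the global `fetch` available in Node.js 18+.

```typescript
// Shape of the JSON returned by Ollama's GET /api/tags (only the field used here).
interface OllamaTags {
  models?: { name: string }[];
}

// True if `model` is installed. Ollama reports names with a tag suffix,
// e.g. "nomic-embed-text:latest", so a bare name matches any tag.
function hasModel(tags: OllamaTags, model: string): boolean {
  const names = (tags.models ?? []).map((m) => m.name);
  return names.some((n) => n === model || n.split(":")[0] === model);
}

// Hypothetical helper: returns a list of problems, empty when both services look ready.
// The Ollama URL is an assumption; this README does not define a variable for it.
async function preflight(
  qdrantUrl = "http://localhost:6333",
  ollamaUrl = "http://localhost:11434",
  model = "nomic-embed-text",
  apiKey?: string,
): Promise<string[]> {
  const problems: string[] = [];
  try {
    // Qdrant expects the key in the "api-key" header when authentication is enabled.
    const headers: Record<string, string> = apiKey ? { "api-key": apiKey } : {};
    const res = await fetch(`${qdrantUrl}/collections`, { headers });
    if (!res.ok) problems.push(`Qdrant responded with HTTP ${res.status}`);
  } catch {
    problems.push(`Qdrant is not reachable at ${qdrantUrl}`);
  }
  try {
    const res = await fetch(`${ollamaUrl}/api/tags`);
    const tags = (await res.json()) as OllamaTags;
    if (!hasModel(tags, model)) problems.push(`Ollama model "${model}" is not installed`);
  } catch {
    problems.push(`Ollama is not reachable at ${ollamaUrl}`);
  }
  return problems;
}
```

Calling `preflight()` and printing the returned array gives a quick readiness report; an empty array means both checks passed.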

Steps

Clone the Repository:

git clone <repository-url>
cd mcp-fs-server

Install Node.js Dependencies:
```
npm install
# or
yarn install
```
Download WASM Grammars: This project uses Tree-sitter for parsing code. The necessary WASM grammar files need to be downloaded.
```
npm run setup
# or
yarn setup
```
This command runs node scripts/wasm-installer.js, which downloads the required Tree-sitter WASM files into the ./wasm directory.
Configure Environment Variables (Optional): The server can be configured using environment variables. If not set, default values will be used. See the Configuration section for details.

Running the Server

Build the Project: Compile the TypeScript code into JavaScript.
```
npm run build
# or
yarn build
```
Start the Server: Run the compiled application.
```
npm start
# or
yarn start
```
The server then reads MCP requests from STDIN and writes responses to STDOUT.
Development Watch Mode: To automatically recompile TypeScript on file changes during development:
```
npm run watch
# or
yarn watch
```

Usage

MCP Server Configuration

This server works with any MCP-compatible client, such as Claude Desktop. Register it in the client's JSON configuration:

Claude Desktop Configuration

Add to your Claude Desktop claude_desktop_config.json file:

{
  "mcpServers": {
    "semantic-watcher": {
      "command": "node",
      "args": ["/path/to/mcp-fs-server/build/index.js"],
      "env": {
        "REPO_PATH": "/path/to/your/codebase",
        "QDRANT_URL": "http://localhost:6333",
        "QDRANT_COLLECTION": "my_codebase",
        "OLLAMA_MODEL": "nomic-embed-text",
        "LOG_LEVEL": "info"
      }
    }
  }
}

Example with Multiple Environments

{
  "mcpServers": {
    "semantic-watcher-dev": {
      "command": "node",
      "args": ["/path/to/mcp-fs-server/build/index.js"],
      "env": {
        "REPO_PATH": "/path/to/dev/project",
        "QDRANT_COLLECTION": "dev_codebase",
        "OLLAMA_MODEL": "nomic-embed-text",
        "LOG_LEVEL": "debug"
      }
    },
    "semantic-watcher-prod": {
      "command": "node",
      "args": ["/path/to/mcp-fs-server/build/index.js"],
      "env": {
        "REPO_PATH": "/path/to/production/project",
        "QDRANT_URL": "https://qdrant.example.com",
        "QDRANT_API_KEY": "your-api-key-here",
        "QDRANT_COLLECTION": "prod_codebase",
        "OLLAMA_MODEL": "nomic-embed-text",
        "LOG_LEVEL": "warn"
      }
    }
  }
}

Configuration Requirements

Script Path: Pass the absolute path to the built index.js file in args
Environment Variables: Set the variables your setup needs in the env section; any variable left unset falls back to its default
Repository Path: REPO_PATH must point to the codebase you want to index
Dependencies: Ensure Qdrant and Ollama are running and accessible
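
The requirements above can be checked mechanically before restarting the client. This is an illustrative sketch only: `validateEntry` and `isAbsolutePath` are hypothetical names, the absolute-path test is a simplified heuristic for POSIX and Windows drive paths, and the warning about relative REPO_PATH values assumes the server resolves them against whatever working directory the client launches it from.

```typescript
// One entry under "mcpServers", shaped like the JSON examples in this README.
interface McpServerEntry {
  command: string;
  args: string[];
  env?: Record<string, string>;
}

// Simplified check: POSIX absolute paths or Windows drive-letter paths.
function isAbsolutePath(p: string): boolean {
  return p.startsWith("/") || /^[A-Za-z]:[\\/]/.test(p);
}

// Returns human-readable problems; an empty array means the entry looks usable.
function validateEntry(entry: McpServerEntry): string[] {
  const problems: string[] = [];
  const script = entry.args[0];
  if (!script || !isAbsolutePath(script)) {
    problems.push("args[0] should be an absolute path to build/index.js");
  }
  const repoPath = entry.env?.REPO_PATH;
  if (!repoPath) {
    problems.push("REPO_PATH is not set; the default ./target-repo would be used");
  } else if (!isAbsolutePath(repoPath)) {
    problems.push("REPO_PATH is relative; it may resolve against the client's working directory");
  }
  const url = entry.env?.QDRANT_URL;
  if (url && !/^https?:\/\//.test(url)) {
    problems.push("QDRANT_URL should start with http:// or https://");
  }
  return problems;
}
```

Running `validateEntry` over each value of `mcpServers` after parsing claude_desktop_config.json with `JSON.parse` would flag the most common path mistakes.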

Direct Usage

The server can also be run directly as an MCP service that communicates over STDIN/STDOUT. You interact with it by sending MCP requests, which in practice means calling the tools it exposes.

Tools

The MCP Semantic Watcher exposes the following tools:

`semantic_search`

Description: Searches the indexed codebase using semantic vector search. It finds code snippets semantically similar to your natural language query.
Input Schema:
- query (string, required): The natural language query to search for.
- limit (number, optional): The maximum number of results to return. Defaults to 5, with a maximum of 20.
Output: Returns a formatted string containing search results, including file path, line numbers, score, and the code snippet.
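
The input schema can be expressed as a TypeScript type, with the documented default (5) and maximum (20) applied on the client side. This is a sketch under assumptions: `normalizeSearchArgs` is a hypothetical helper, and clamping out-of-range values (including a lower bound of 1) is one possible policy; the server itself may reject such values instead.

```typescript
// Arguments accepted by semantic_search, per the schema above.
interface SemanticSearchArgs {
  query: string;
  limit?: number;
}

const DEFAULT_LIMIT = 5; // documented default
const MAX_LIMIT = 20; // documented maximum

// Trims the query and forces limit into the range 1..MAX_LIMIT.
function normalizeSearchArgs(args: SemanticSearchArgs): Required<SemanticSearchArgs> {
  const query = args.query.trim();
  if (query.length === 0) {
    throw new Error("query is required");
  }
  const requested = Number.isFinite(args.limit)
    ? Math.floor(args.limit as number)
    : DEFAULT_LIMIT;
  const limit = Math.min(Math.max(requested, 1), MAX_LIMIT);
  return { query, limit };
}
```

For example, `normalizeSearchArgs({ query: "retry logic", limit: 50 })` yields a limit of 20, and omitting `limit` yields 5.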

`refresh_index`

Description: Manually triggers a full re-scan and re-indexing of the configured repository path (REPO_PATH). This is useful if new files are added or if you want to ensure the index is up-to-date.
Input Schema: None.
Output: A confirmation message indicating the refresh process has started or completed.

Example Prompts (MCP Tool Calls)

These examples show how you might call the tools using an MCP request structure.

Example: Semantic Search

To search for code related to "how to initialize the Qdrant client" and get up to 3 results:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "semantic_search",
    "arguments": {
      "query": "How is the Qdrant client initialized?",
      "limit": 3
    }
  }
}

Example: Refresh Index

To manually trigger a re-scan of the repository:

{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/call",
  "params": {
    "name": "refresh_index",
    "arguments": {}
  }
}
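
On the wire, MCP messages are JSON-RPC 2.0, and the MCP specification names the tool-invocation method `tools/call`. Over the stdio transport, each message is a single line of JSON terminated by a newline. The sketch below builds such requests; `toolCall` and `frame` are hypothetical helper names, not part of this project. A real client must also complete the MCP `initialize` handshake before calling tools, which is omitted here.

```typescript
// A JSON-RPC 2.0 request that invokes an MCP tool.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1; // each request needs a unique id so responses can be matched

function toolCall(name: string, args: Record<string, unknown> = {}): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Serializes one message per line; JSON.stringify escapes any newlines
// inside string values, so the payload itself stays on a single line.
function frame(request: ToolCallRequest): string {
  return JSON.stringify(request) + "\n";
}

const search = toolCall("semantic_search", {
  query: "How is the Qdrant client initialized?",
  limit: 3,
});
const refresh = toolCall("refresh_index");
```

Writing `frame(search)` to the server process's STDIN and reading lines from its STDOUT is enough for manual experiments; for anything beyond that, an MCP client library handles the handshake and response matching.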

Configuration

The server's behavior can be customized using environment variables:

Variable	Description	Default Value
`QDRANT_URL`	URL of the Qdrant instance.	`http://localhost:6333`
`QDRANT_API_KEY`	API key for Qdrant authentication.	(None)
`OLLAMA_MODEL`	Name of the Ollama model for generating embeddings (e.g., `nomic-embed-text`).	`nomic-embed-text`
`QDRANT_COLLECTION`	Name of the Qdrant collection to use for storing embeddings.	`codebase_context`
`REPO_PATH`	Path to the code repository to watch and index.	`./target-repo`
`WASM_PATH`	Path to the directory containing Tree-sitter WASM grammars.	`./wasm`
`LOG_PATH`	Directory where log files are stored. Creates daily log files with name format `mcp-server-YYYY-MM-DD.log`.	`./logs`
`MAX_FILE_SIZE`	Maximum file size in bytes to index (e.g., `1048576` for 1MB).	`1048576`
`MIN_CHUNK_SIZE`	Minimum character length for a code chunk to be considered for indexing.	`50`
`CHUNK_OVERLAP`	Number of lines to overlap between chunks when using simple line-based splitting.	`10`
`CHUNK_LINES`	Number of lines per chunk when using simple line-based splitting.	`50`
`VECTOR_SIZE`	Dimension of the vectors stored in Qdrant. Must match the embedding model's output dimension.	`768`
`SEARCH_LIMIT`	Default number of results to return for semantic search queries.	`5`
`LOG_LEVEL`	Controls the verbosity of logs (`info`, `debug`, `warn`, `error`).	`info`
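
To make the chunking settings concrete, here is a minimal sketch of line-based splitting driven by `CHUNK_LINES`, `CHUNK_OVERLAP`, and `MIN_CHUNK_SIZE` with the defaults from the table. It is an illustration of how these three values interact, not the project's actual splitter: `readInt` and `chunkByLines` are hypothetical names, and details such as how the last partial chunk is handled are assumptions.

```typescript
type Env = Record<string, string | undefined>;

// Reads an integer setting, falling back to the documented default when unset or invalid.
function readInt(env: Env, name: string, fallback: number): number {
  const parsed = Number.parseInt(env[name] ?? "", 10);
  return Number.isNaN(parsed) ? fallback : parsed;
}

interface Chunk {
  startLine: number; // 1-based, inclusive
  endLine: number; // 1-based, inclusive
  text: string;
}

function chunkByLines(source: string, env: Env = {}): Chunk[] {
  const chunkLines = readInt(env, "CHUNK_LINES", 50);
  const overlap = readInt(env, "CHUNK_OVERLAP", 10);
  const minChunkSize = readInt(env, "MIN_CHUNK_SIZE", 50);
  // Each chunk starts (chunkLines - overlap) lines after the previous one;
  // the guard keeps the loop advancing if overlap >= chunkLines.
  const step = Math.max(1, chunkLines - overlap);
  const lines = source.split("\n");
  const chunks: Chunk[] = [];
  for (let start = 0; start < lines.length; start += step) {
    const end = Math.min(start + chunkLines, lines.length);
    const text = lines.slice(start, end).join("\n");
    // Skip chunks shorter than MIN_CHUNK_SIZE characters.
    if (text.length >= minChunkSize) {
      chunks.push({ startLine: start + 1, endLine: end, text });
    }
    if (end === lines.length) break; // the final lines are already covered
  }
  return chunks;
}
```

With the defaults, a 120-line file produces chunks covering lines 1-50, 41-90, and 81-120, so consecutive chunks share 10 lines. Passing `process.env` as the second argument applies any overrides set in the environment.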

Contributing

See the contributing guidelines for details on how to contribute to this project.

License

This project is licensed under the MIT License - see the license file for details.