Local MCP Hub
A portable Model Context Protocol (MCP) hub that bridges the VS Code Continue extension with remote Ollama servers and local MCP tools. This project enables seamless AI-assisted development with access to powerful code analysis tools and documentation lookup capabilities.
Note: This project is entirely written by human-guided AI.
Architecture
┌─────────────────┐ HTTP/OpenAI API ┌─────────────────┐ HTTP API ┌─────────────────┐
│ VS Code │ ◄─────────────────────► │ Local MCP Hub │ ◄──────────────► │ Ollama Server │
│ Continue Ext. │ │ │ │ (Local/Remote) │
└─────────────────┘ │ │ └─────────────────┘
│ │
│ ┌───────────▼──────────┐
│ │ MCP Processes │
│ │ │
│ │ ┌─────────────────┐ │
│ │ │ Serena │ │
│ │ │ (Code Analysis) │ │
│ │ └─────────────────┘ │
│ │ │
│ │ ┌─────────────────┐ │
│ │ │ Context7 │ │
│ │ │ (Library Docs) │ │
│ │ └─────────────────┘ │
│ └──────────────────────┘
└─────────────────┘
Key Features
- OpenAI API Compatibility: Drop-in replacement for the OpenAI API, for use with the Continue extension
- Remote Ollama Support: Connect to Ollama servers running on different machines
- Dual Model Architecture: Separate fast and full models for optimal performance
- MCP Tool Integration: Access to advanced code analysis via Serena MCP server
- Library Documentation: Context7 integration for instant access to library docs
- Intelligent Tool Selection: Two-stage LLM process for smart tool selection and argument generation
- Streaming Responses: Real-time response streaming for a better user experience (see the consumption sketch after this list)
- Code Completion: Fill-in-middle (FIM) support for autocomplete functionality
- Request Logging: Comprehensive logging with configurable levels
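As a concrete illustration of the streaming feature, the sketch below consumes the hub's OpenAI-compatible streaming endpoint from Node 18+. This is a minimal sketch, not the project's code: it assumes the default port, uses the model name from the config.json example later in this README, and for simplicity assumes each SSE line arrives in a single chunk.

```typescript
// Minimal sketch (not the project's code) of consuming the hub's
// OpenAI-compatible streaming chat endpoint. Requires Node 18+ (global fetch).
async function streamChat(prompt: string): Promise<void> {
  const res = await fetch("http://localhost:3002/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "qwen2.5:latest", // model name from the config.json example below
      stream: true,
      messages: [{ role: "user", content: prompt }],
    }),
  });

  // OpenAI-compatible streams are server-sent events:
  // lines of "data: {json}" terminated by "data: [DONE]".
  const decoder = new TextDecoder();
  for await (const chunk of res.body as unknown as AsyncIterable<Uint8Array>) {
    for (const line of decoder.decode(chunk, { stream: true }).split("\n")) {
      if (!line.startsWith("data: ") || line.includes("[DONE]")) continue;
      const delta = JSON.parse(line.slice(6)).choices?.[0]?.delta?.content;
      if (delta) process.stdout.write(delta);
    }
  }
}

streamChat("Explain what this hub does in one sentence.");
```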
Prerequisites
- Node.js 18+ and npm
- Python 3.9+ (for Serena MCP server)
- Ollama server running locally or remotely
- VS Code with Continue extension installed
Installation
Clone Repository
git clone https://github.com/yergin/local-mcp-hub
cd local-mcp-hub
Platform-Specific Installation
Linux/macOS
chmod +x install.sh
./install.sh
macOS (Alternative)
chmod +x install-mac.sh
./install-mac.sh
Windows
install.bat
Configure Ollama Server
Edit config.json to point to your Ollama server:
{
"ollama": {
"host": "http://your-ollama-server:11434",
"model": "qwen2.5:latest",
"fast_model": "qwen2.5:0.5b"
}
}
Start the Hub
npm start
The hub will start on port 3002 by default.
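Once started, a quick way to confirm the hub is up (assuming the default port):

```sh
curl http://localhost:3002/health
```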
Configure Continue Extension
Copy the contents of continue-config.yaml to your Continue extension configuration file:
- Linux/macOS: ~/.continue/config.yaml
- Windows: %USERPROFILE%\.continue\config.yaml
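For reference, a Continue model entry pointing at the hub typically looks like the sketch below. This is illustrative only; the continue-config.yaml shipped with this repository is the authoritative template (the model name here is taken from the config.json example above):

```yaml
models:
  - name: Local MCP Hub
    provider: openai
    model: qwen2.5:latest
    apiBase: http://localhost:3002/v1
    roles:
      - chat
      - autocomplete
```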
Environment Variables
| Variable | Description | Default |
|---|---|---|
| PORT | Server port | 3002 |
| LOG_LEVEL | Logging level (debug, info, warn, error) | info |
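Both variables can be set inline when launching the hub (3005 below is just an arbitrary example port):

```sh
PORT=3005 LOG_LEVEL=debug npm start
```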
Configuration Files
config.json
Main configuration file controlling:
- ollama: Ollama server connection and model settings
- hub: Server port, logging, and CORS configuration
- mcps: Enabled MCP servers list and timeout configurations
  - initializationTimeoutMs: Time to wait for MCP servers to start (default: 60000ms)
  - toolCallTimeoutMs: Maximum time for individual tool operations (default: 45000ms)
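Putting the documented settings together, a full config.json might look like the sketch below. The ollama keys, log_level, and the two timeout fields come from this README; the remaining key names (for example port and the enabled list) are assumptions, so treat the shipped config.json as authoritative:

```json
{
  "ollama": {
    "host": "http://your-ollama-server:11434",
    "model": "qwen2.5:latest",
    "fast_model": "qwen2.5:0.5b"
  },
  "hub": {
    "port": 3002,
    "log_level": "info"
  },
  "mcps": {
    "enabled": ["serena", "context7"],
    "initializationTimeoutMs": 60000,
    "toolCallTimeoutMs": 45000
  }
}
```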
prompts.json
Advanced prompt configuration for:
- connectionTest: Ollama connection verification prompts
- toolSelection: Tool selection logic and templates
- argumentGeneration: Argument generation for fast/full models
- codeCompletion: Code completion prompt templates
- toolGuidance: Usage hints and model routing for tools
- responseGeneration: Response templates for tool results
- systemMessages: User-facing messages for various states
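The exact schema lives in prompts.json itself. As a rough sketch of the shape (the top-level keys are documented above, but the value structure shown here is only an assumption):

```json
{
  "connectionTest": { "prompt": "Reply with OK if you can read this." },
  "toolSelection": { "template": "Choose the best tool for: {{request}}" },
  "systemMessages": { "initializing": "MCP tools are still starting up..." }
}
```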
API Endpoints
Core OpenAI Compatible Endpoints
- POST /v1/chat/completions - Chat completions with tool support
- POST /v1/completions - Code completions (FIM support)
- GET /v1/models - Available models list
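For example, a minimal non-streaming chat request against the hub, assuming the default port and the model name from the config.json example:

```sh
curl http://localhost:3002/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5:latest",
    "messages": [{"role": "user", "content": "List the tools you can use"}]
  }'
```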
Administrative Endpoints
- GET /health - Health check and initialization status
- GET /v1/tools - List available MCP tools
- POST /v1/admin/reload-prompts - Reload prompt configuration
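These can be exercised directly, assuming the default port:

```sh
curl http://localhost:3002/v1/tools
curl -X POST http://localhost:3002/v1/admin/reload-prompts
```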
Health Check Response
{
"status": "healthy",
"timestamp": "2025-01-20T10:30:00.000Z",
"ollama_host": "http://10.0.0.24:11434",
"mcps_enabled": 2,
"mcp_tools_initialized": true,
"mcp_tools_count": 15
}
Logging
Logs are written to:
- Console: Colored, human-readable format
- File: .tmp/local-mcp-hub.log in JSON format
Key log categories:
- HTTP requests and responses
- MCP tool calls and results
- Ollama communication
- Performance timing data
- Error tracking and debugging
Project Structure
local-mcp-hub/
├── src/ # Source code (modular architecture)
│ ├── hub.ts # Main application server (coordination)
│ ├── ollama-client.ts # Ollama server communication
│ ├── mcp-manager.ts # MCP process lifecycle & tool calls
│ ├── tool-selector.ts # Intelligent tool selection & argument generation
│ └── request-processor.ts # Request parsing & response formatting
├── mcps/ # MCP server implementations (downloaded during installation)
│ ├── serena/ # Code analysis MCP server
│ └── context7/ # Library documentation MCP server
├── patches/ # Patch files applied to downloaded dependencies
│ └── serena-python-detection.patch # Automatic Python detection for Serena
├── config.json # Main configuration
├── prompts.json # Prompt templates and tool guidance
├── continue-config.yaml # Continue extension configuration
├── install.sh # Linux/macOS installer
├── install-mac.sh # macOS-specific installer
├── install.bat # Windows installer
└── .tmp/ # Runtime logs and debug files (created when run)
Source Code Architecture
The codebase follows a modular architecture with clear separation of concerns:
- hub.ts - Main application server that coordinates all components
- ollama-client.ts - Handles all communication with Ollama servers
- mcp-manager.ts - Manages MCP process lifecycle and tool execution
- tool-selector.ts - Implements intelligent tool selection using a two-stage LLM process
- request-processor.ts - Handles request parsing and response formatting
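To make the two-stage flow concrete, here is a minimal TypeScript sketch of how a tool selector of this kind might work. This is illustrative only: the function names and prompts below are assumptions, not the actual code in tool-selector.ts; only the fast/full model names come from the config example earlier in this README.

```typescript
// Illustrative sketch only; requires Node 18+ (global fetch).
interface McpTool {
  name: string;
  description: string;
}

async function askOllama(model: string, prompt: string): Promise<string> {
  // Minimal non-streaming call to Ollama's /api/generate endpoint.
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model, prompt, stream: false }),
  });
  const data = (await res.json()) as { response: string };
  return data.response;
}

// Stage 1: the fast model picks a tool name from the available list.
// Stage 2: the full model generates JSON arguments for that tool.
async function selectToolAndArgs(
  userMessage: string,
  tools: McpTool[],
): Promise<{ tool: string; args: unknown }> {
  const toolList = tools.map((t) => `- ${t.name}: ${t.description}`).join("\n");

  const toolName = (
    await askOllama(
      "qwen2.5:0.5b", // fast model
      `Pick the single best tool for this request.\nTools:\n${toolList}\n` +
        `Request: ${userMessage}\nAnswer with the tool name only.`,
    )
  ).trim();

  const argsJson = await askOllama(
    "qwen2.5:latest", // full model
    `Generate JSON arguments for the tool "${toolName}" to handle:\n` +
      `${userMessage}\nAnswer with a JSON object only.`,
  );
  // Sketch assumption: the model returns clean JSON (real code would validate).
  return { tool: toolName, args: JSON.parse(argsJson) };
}
```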
Available Tools
| Tool Name | Description |
|---|---|
list_dir | Lists all non-gitignored files and directories in the given directory (optionally with recursion). Returns a JSON object with the names of directories and files within the given directory. |
find_file | Finds non-gitignored files matching the given file mask within the given relative path. Returns a JSON object with the list of matching files. |
replace_regex | Replaces one or more occurrences of the given regular expression. This is the preferred way to replace content in a file whenever the symbol-level tools are not appropriate. Even large sections of code can be replaced by providing a concise regular expression of the form "beginning.*?end-of-text-to-be-replaced". Always try to use wildcards to avoid specifying the exact content of the code to be replaced, especially if it spans several lines. IMPORTANT: REMEMBER TO USE WILDCARDS WHEN APPROPRIATE! I WILL BE VERY UNHAPPY IF YOU WRITE LONG REGEXES WITHOUT USING WILDCARDS INSTEAD!. |
search_for_pattern | Offers a flexible search for arbitrary patterns in the codebase, including the possibility to search in non-code files. Generally, symbolic operations like find_symbol or find_referencing_symbols should be preferred if you know which symbols you are looking for. Pattern Matching Logic: For each match, the returned result will contain the full lines where the substring pattern is found, as well as optionally some lines before and after it. The pattern will be compiled with DOTALL, meaning that the dot will match all characters including newlines. This also means that it never makes sense to have .* at the beginning or end of the pattern, but it may make sense to have it in the middle for complex patterns. If a pattern matches multiple lines, all those lines will be part of the match. Be careful to not use greedy quantifiers unnecessarily, it is usually better to use non-greedy quantifiers like .*? to avoid matching too much content. File Selection Logic: The files in which the search is performed can be restricted very flexibly. Using restrict_search_to_code_files is useful if you are only interested in code symbols (i.e., those symbols that can be manipulated with symbolic tools like find_symbol). You can also restrict the search to a specific file or directory, and provide glob patterns to include or exclude certain files on top of that. The globs are matched against relative file paths from the project root (not to the relative_path parameter that is used to further restrict the search). Smartly combining the various restrictions allows you to perform very targeted searches. Returns A JSON object mapping file paths to lists of matched consecutive lines (with context, if requested). |
restart_language_server | Use this tool only on explicit user request or after confirmation. It may be necessary to restart the language server if the user performs edits not through Serena, so the language server state becomes outdated and further editing attempts lead to errors. If such editing errors happen, you should suggest using this tool. |
get_symbols_overview | Gets an overview of the given file or directory. For each analyzed file, we list the top-level symbols in the file (name_path, kind). Use this tool to get a high-level understanding of the code symbols. Calling this is often a good idea before more targeted reading, searching or editing operations on the code symbols. Before requesting a symbol overview, it is usually a good idea to narrow down the scope of the overview by first understanding the basic directory structure of the repository that you can get from memories or by using the list_dir and find_file tools (or similar). Returns a JSON object mapping relative paths of all contained files to info about top-level symbols in the file (name_path, kind). |
find_symbol | Retrieves information on all symbols/code entities (classes, methods, etc.) based on the given name_path, which represents a pattern for the symbol's path within the symbol tree of a single file. The returned symbol location can be used for edits or further queries. Specify depth > 0 to retrieve children (e.g., methods of a class). The matching behavior is determined by the structure of name_path, which can either be a simple name (e.g. "method") or a name path like "class/method" (relative name path) or "/class/method" (absolute name path). Note that the name path is not a path in the file system but rather a path in the symbol tree within a single file. Thus, file or directory names should never be included in the name_path. For restricting the search to a single file or directory, the within_relative_path parameter should be used instead. The retrieved symbols' name_path attribute will always be composed of symbol names, never file or directory names. Key aspects of the name path matching behavior: - Trailing slashes in name_path play no role and are ignored. - The name of the retrieved symbols will match (either exactly or as a substring) the last segment of name_path, while other segments will restrict the search to symbols that have a desired sequence of ancestors. - If there is no starting or intermediate slash in name_path, there is no restriction on the ancestor symbols. For example, passing method will match against symbols with name paths like method, class/method, class/nested_class/method, etc. - If name_path contains a / but doesn't start with a /, the matching is restricted to symbols with the same ancestors as the last segment of name_path. For example, passing class/method will match against class/method as well as nested_class/class/method but not method. - If name_path starts with a /, it will be treated as an absolute name path pattern, meaning that the first segment of it must match the first segment of the symbol's name path. For example, passing /class will match only against top-level symbols like class but not against nested_class/class. Passing /class/method will match against class/method but not nested_class/class/method or method. Returns JSON string: a list of symbols (with locations) matching the name. |
find_referencing_symbols | Finds symbols that reference the symbol at the given name_path. The result will contain metadata about the referencing symbols as well as a short code snippet around the reference (unless include_body is True, then the short snippet will be omitted). Note that among other kinds of references, this function can be used to find (direct) subclasses of a class, as subclasses are referencing symbols that have the kind class. Returns a list of JSON objects with the symbols referencing the requested symbol. |
replace_symbol_body | Replaces the body of the symbol with the given name_path. |
insert_after_symbol | Inserts the given body/content after the end of the definition of the given symbol (via the symbol's location). A typical use case is to insert a new class, function, method, field or variable assignment. |
insert_before_symbol | Inserts the given body/content before the beginning of the definition of the given symbol (via the symbol's location). A typical use case is to insert a new class, function, method, field or variable assignment. It also can be used to insert a new import statement before the first symbol in the file. |
write_memory | Write some information about this project that can be useful for future tasks to a memory. Use markdown formatting for the content. The information should be short and to the point. The memory name should be meaningful, such that from the name you can infer what the information is about. It is better to have multiple small memories than to have a single large one because memories will be read one by one and we only ever want to read relevant memories. This tool is either called during the onboarding process or when you have identified something worth remembering about the project from the past conversation. |
read_memory | Read the content of a memory file. This tool should only be used if the information is relevant to the current task. You can infer whether the information is relevant from the memory file name. You should not read the same memory file multiple times in the same conversation. |
list_memories | List available memories. Any memory can be read using the read_memory tool. |
delete_memory | Delete a memory file. Should only happen if a user asks for it explicitly, for example by saying that the information retrieved from a memory file is no longer correct or no longer relevant for the project. |
activate_project | Activates the project with the given name. |
remove_project | Removes a project from the Serena configuration. |
switch_modes | Activates the desired modes, like ["editing", "interactive"] or ["planning", "one-shot"]. |
get_current_config | Print the current configuration of the agent, including the active and available projects, tools, contexts, and modes. |
check_onboarding_performed | Checks whether project onboarding was already performed. You should always call this tool before beginning to actually work on the project/after activating a project, but after calling the initial instructions tool. |
onboarding | Call this tool if onboarding was not performed yet. You will call this tool at most once per conversation. Returns instructions on how to create the onboarding information. |
think_about_collected_information | Think about the collected information and whether it is sufficient and relevant. This tool should ALWAYS be called after you have completed a non-trivial sequence of searching steps like find_symbol, find_referencing_symbols, search_files_for_pattern, read_file, etc. |
think_about_task_adherence | Think about the task at hand and whether you are still on track. Especially important if the conversation has been going on for a while and there has been a lot of back and forth. This tool should ALWAYS be called before you insert, replace, or delete code. |
think_about_whether_you_are_done | Whenever you feel that you are done with what the user has asked for, it is important to call this tool. |
summarize_changes | Summarize the changes you have made to the codebase. This tool should always be called after you have fully completed any non-trivial coding task, but only after the think_about_whether_you_are_done call. |
prepare_for_new_conversation | Instructions for preparing for a new conversation. This tool should only be called on explicit user request. |
initial_instructions | Get the initial instructions for the current coding project. If you haven't received instructions on how to use Serena's tools in the system prompt, you should always call this tool before starting to work (including using any other tool) on any programming task, the only exception being when you are asked to call activate_project, which you should then call before. |
resolve-library-id | Resolves a package/product name to a Context7-compatible library ID and returns a list of matching libraries. You MUST call this function before 'get-library-docs' to obtain a valid Context7-compatible library ID UNLESS the user explicitly provides a library ID in the format '/org/project' or '/org/project/version' in their query. Selection Process: 1. Analyze the query to understand what library/package the user is looking for 2. Return the most relevant match based on: - Name similarity to the query (exact matches prioritized) - Description relevance to the query's intent - Documentation coverage (prioritize libraries with higher Code Snippet counts) - Trust score (consider libraries with scores of 7-10 more authoritative) Response Format: - Return the selected library ID in a clearly marked section - Provide a brief explanation for why this library was chosen - If multiple good matches exist, acknowledge this but proceed with the most relevant one - If no good matches exist, clearly state this and suggest query refinements For ambiguous queries, request clarification before proceeding with a best-guess match. |
get-library-docs | Fetches up-to-date documentation for a library. You must call 'resolve-library-id' first to obtain the exact Context7-compatible library ID required to use this tool, UNLESS the user explicitly provides a library ID in the format '/org/project' or '/org/project/version' in their query. |
Troubleshooting
Common Issues
Hub won't start
- Check if port 3002 is available
- Verify Node.js version is 18+
- Check config.json syntax
Ollama connection failed
- Verify Ollama server is running
- Check network connectivity to remote host
- Confirm model names exist on Ollama server
MCP tools not working
- Check .tmp/local-mcp-hub.log for MCP process errors
- Verify the Python environment for Serena (requires Python 3.9+)
- Ensure all dependencies are installed
Continue extension not connecting
- Verify Continue config.yaml matches provided template
- Check VS Code developer console for errors
- Confirm the hub health endpoint responds: http://localhost:3002/health
Debug Mode
Enable debug logging in config.json:
{
"hub": {
"log_level": "debug"
}
}
Or set environment variable:
LOG_LEVEL=debug npm start
Known Issues
Serena Language Server Initialization Issues
Symptoms:
- Serena takes a long time to initialize (25+ seconds) or fails to start
- Code analysis tools are not available
- Timeout errors in the logs
Root Cause: The Pyright language server cannot start due to missing Python installation or incompatible Python command.
Automatic Resolution: This issue has been resolved with an automatic Python detection patch that is applied during installation. The system now:
- Auto-detects the correct Python command (python3, python, or py)
- Provides clear error messages if no suitable Python installation is found
- Logs the detected Python command for troubleshooting
Manual Troubleshooting: If you still encounter issues after installation, ensure:
1. Python is installed and accessible:

   ```sh
   python3 --version   # or: python --version
   ```

2. Pyright is available:

   ```sh
   # Install via pip
   pip install pyright
   # Or install via npm (alternative)
   npm install -g pyright
   ```

3. Check the logs for the detected Python command:

   ```
   Using Python command: python3
   ```
Expected Behavior: After proper installation, Serena should initialize in under 5 seconds and display the detected Python command in the logs.
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Test thoroughly with different Ollama models
- Submit a pull request
Please ensure all changes maintain compatibility with the OpenAI API specification and Continue extension requirements.
License
MIT License - see the LICENSE file for details.
Acknowledgements
- Ollama - Local language model server
- Continue - VS Code AI coding assistant
- Serena - Semantic code analysis toolkit
- Context7 - Documentation search MCP
- Model Context Protocol - Standardized tool integration protocol