Beginnersguide138/cc-mcp
If you are the rightful owner of cc-mcp and would like to certify it and/or have it hosted online, please leave a comment on the right or send an email to henry@mcphub.com.
The Context Consistent MCP Server is designed to facilitate seamless communication and data exchange between various applications using the Model Context Protocol.
๐ง CC-MCP (Context Consistent MCP Server)
๐ Language / ่จ่ช / ่ฏญ่จ
- English (Current)
- - Japanese version
- - Chinese version
โ ๏ธ Important: Please Check the Proper Usage Guide
- - Essential knowledge for session management
- - ใปใใทใงใณ็ฎก็ใฎๅฟ ้ ็ฅ่ญ
- - ไผ่ฏ็ฎก็ๅฟ ๅค็ฅ่ฏ
๐ Revolutionary AI Context Management: Solving LLM Memory Loss in Long Conversations
Long-term conversational consistency management system for LLM-powered AI agents
๐ The Problem We Solve
Traditional LLMs suffer from "intent forgetting" in long conversations:
- โ Lose track of the original goal after a few turns
- โ Forget important constraints and decisions
- โ Provide inconsistent responses across dialogue sessions
- โ Require users to constantly remind the AI of context
CC-MCP provides intelligent context management tools to help LLM maintain consistency.
Add this to your MCP client settings (e.g., Cline's cline_mcp_settings.json
):
{
"mcpServers": {
"cc-mcp": {
"autoApprove": [
"process_user_message",
"start_session",
"get_debug_info",
"list_sessions",
"get_session_stats",
"export_context",
"import_context"
],
"disabled": false,
"timeout": 60,
"type": "stdio",
"command": "uv",
"args": [
"run",
"--directory",
"/path/to/your/cc-mcp",
"main.py"
],
"env": {
"CLASSIFIER_API_URL": "https://api.openai.com/v1/chat/completions",
"CLASSIFIER_API_KEY": "your_openai_api_key_here",
"CLASSIFIER_MODEL": "gpt-4o-mini"
}
}
}
}
- Environment configured (see Quick Start)
Replace
/path/to/your/cc-mcp
with your actual repository path.
For Ollama users: When using Ollama instead of OpenAI, modify the environment variables as follows:
- Change
CLASSIFIER_API_URL
to your Ollama server URL (e.g.,http://localhost:11434/v1/chat/completions
) - Change
CLASSIFIER_MODEL
to your preferred Ollama model (e.g.,llama3.2
) - Remove or leave empty
CLASSIFIER_API_KEY
as it's not required for Ollama
Example configuration for Ollama:
"env": {
"CLASSIFIER_API_URL": "http://localhost:11434/v1/chat/completions",
"CLASSIFIER_MODEL": "llama3.2"
}
Prerequisites:
- Python 3.13+
- uv package manager
- LLM API access (Azure OpenAI, OpenAI, etc.) or a local LLM such as Ollama.
- Environment configured (see Quick Start)
โจ Key Features
๐๏ธ Hierarchical Context Architecture
Core Context (Long-term) โโโบ Problem definitions & core objectives
โ
Evolving Context (Mid-term) โโโบ Constraints, decisions & refinements
โ
Turn Context (Short-term) โโโบ Recent conversation flow
๐ง Intelligent Intent Classification
- 5 Intent Categories:
PROBLEM_DEFINITION
,CONSTRAINT_ADDITION
,REFINEMENT
,QUESTION
,UNCLEAR
Detailed Intent Descriptions (as defined in intent_classifier.py
):
The classification logic is based on a prompt that instructs the LLM to analyze the user's message and assign one or more of the following labels:
-
PROBLEM_DEFINITION
: The user is defining the central problem they want to solve.- Example from prompt: "I'd like to automatically summarize meeting minutes with AI, is there a good way to do that?"
-
CONSTRAINT_ADDITION
: The user is adding constraints or conditions, such as budget or timeline.- Example from prompt: "Sounds good. However, please limit the models used to open-source ones."
-
REFINEMENT
: The user is making an existing requirement more specific or is modifying it. -
QUESTION
: The user is asking a simple question. -
UNCLEAR
: The message cannot be clearly classified into any of the above categories.
๐ง Intelligent Context Management
- Context-Aware: Automatically organizes context across three hierarchical levels
- Purpose-Driven: Maintains awareness of core problems and objectives
- Constraint-Compliant: Tracks and applies accumulated constraints and decisions
How Memory Management is Implemented:
The "Intelligent context pruning and optimization" is achieved through a multi-faceted approach within the HierarchicalContextStore
(context_store.py
):
-
Hierarchical Context Storage: The system categorizes information into three tiers, preventing context degradation:
- Core Context: Holds the primary problem definition, ensuring the main goal is never lost.
- Evolving Context: Stores a list of constraints and refinements, allowing the solution space to be progressively narrowed.
- Turn Context: A short-term memory that holds the last few conversational turns to maintain immediate conversational flow.
-
Automatic Turn Context Pruning: To prevent memory bloat, the
turn_context
is automatically trimmed. The_trim_turn_context()
method ensures that only the most recent messages (default: 6) are kept, discarding older, less relevant turns. -
Keyword-Based Optimization: Instead of relying on full-text history, the system extracts and stores weighted keywords for
Core
andEvolving
contexts. Theget_context_summary()
method then provides a condensed, keyword-driven summary to the LLM, which is far more efficient to process than raw conversational history.
This structured, rule-based approach ensures that memory is managed efficiently and intelligently, keeping the most critical information accessible without overwhelming the LLM.
๐ Statistical Keyword Extraction
- TF-IDF Powered: Identifies and extracts the most significant keywords from the conversation using the TF-IDF algorithm.
- Contextual Summarization: Provides a condensed, keyword-based summary of the core problem and evolving constraints, which is more efficient for the LLM to process.
- Dynamic Corpus: The keyword extraction engine continuously updates its document corpus with every new conversation, improving the accuracy of its IDF calculations over time.
How Keyword Extraction is Implemented (as defined in keyword_extraction.py
):
The statistical keyword extraction uses a sophisticated TF-IDF implementation with several optimizations:
-
Term Frequency (TF) Calculation: The
_calculate_tf()
method computes how frequently each word appears in a specific message relative to the total word count in that message. This identifies words that are emphasized within the current context. -
Inverse Document Frequency (IDF) Calculation: The
_calculate_idf()
method determines how rare a word is across all conversations in the corpus. The formula used isIDF = log(total_documents / documents_containing_term)
, making rare words more valuable. -
TF-IDF Scoring: The
extract_keywords()
method combines TF and IDF scores (TF-IDF = TF ร IDF
) to identify words that are both frequent in the current message and rare across all conversations, indicating topic-specific importance. -
Multilingual Tokenization: The
_tokenize()
method uses regex patterns to extract Japanese (hiragana, katakana, kanji) and English words while filtering out common stop words in both languages. -
Corpus Management: The system maintains a growing document corpus that improves IDF accuracy over time. Each new message updates the
_document_frequency
dictionary and_total_documents
count, making future extractions more precise.
This approach ensures that extracted keywords represent the most contextually relevant terms while filtering out common, less meaningful words.
โก Performance Excellence
- Ultra-Fast: Works with less-capable models like
gpt-3.5-tubo
,gpt-4o-mini
orgemma3:4b
- Scalable: Handles multiple concurrent sessions
- Memory Efficient: Intelligent context pruning and optimization
๐พ Session Persistence
- Stateful Operation: All conversation sessions and keyword statistics are automatically saved to and restored from the
persistence/
directory, ensuring no data is lost between server restarts. - Robustness: This makes the server resilient, allowing context to be maintained even if the server crashes or is rebooted.
๐ฏ Real-World Impact
Before CC-MCP:
User: "I want to build an AI assistant app"
AI: "Sure! Here are some general approaches..."
[10 messages later]
User: "Remember, budget is 500K yen, 3 months timeline"
AI: "What project are we talking about?" โ
After CC-MCP:
User: "I want to build an AI assistant app"
AI: "Great! Let me help you design this system..."
[10 messages later]
User: "What about deployment options?"
AI: "For your AI assistant app (budget: 500K yen, 3-month timeline, security priority), here are deployment strategies that fit your constraints..." โ
๐ ๏ธ Technical Architecture
graph TD
subgraph CC-MCP Server
direction LR
C[Intent Classifier]
D[Context Store]
E[Keyword Extractor]
end
A[MCP Client Application] -- MCP Tools --> B((main.py))
B -- Manages --> C
B -- Manages --> D
B -- Manages --> E
C -- Calls --> G[External LLM API]
Component Descriptions:
This server is built on three core components that work together to provide context-aware responses:
-
Intent Classifier
(intent_classifier.py
):- Function: This component analyzes the user's message to determine its underlying purpose.
- Implementation: It sends the user's message to an external LLM (like GPT or a local Ollama model) with a carefully crafted prompt that asks the LLM to categorize the message into one of five intents:
PROBLEM_DEFINITION
,CONSTRAINT_ADDITION
,REFINEMENT
,QUESTION
, orUNCLEAR
.
-
Context Store
(context_store.py
):- Function: This is the "memory" of the system. It organizes and stores the conversational context in a hierarchical structure.
- Implementation: It uses a three-tiered system: a
core_context
for the main problem, anevolving_context
for constraints and refinements, and aturn_context
for the most recent messages. It automatically prunes the short-termturn_context
to manage memory efficiently.
-
Keyword Extractor
(keyword_extraction.py
):- Function: This engine identifies the most significant terms from user messages that are stored in the
core
andevolving
contexts. - Implementation: It uses a classic TF-IDF (Term Frequency-Inverse Document Frequency) algorithm. It calculates how often a word appears in a message (TF) and how rare that word is across all conversations (IDF). The combination of these scores allows it to pinpoint keywords that are uniquely important to the current topic.
- Function: This engine identifies the most significant terms from user messages that are stored in the
MCP Tools Available:
process_user_message
- Core message processing with context managementexport_context
- Export conversation state as JSONimport_context
- Restore conversation stateclear_context
- Reset all context dataget_debug_info
- Detailed debugging informationstart_session
/end_session
- Session lifecycle managementlist_sessions
- Active session monitoringget_session_stats
- Performance analytics
Transport Protocol:
The default transport protocol is stdio
. SSE (Server-Sent Events) is also supported for direct HTTP communication, using the endpoint http://127.0.0.1:8001/sse/
.
๐ฆ Quick Start
Prerequisites
- Python 3.13+
- uv package manager
- LLM API access (Azure OpenAI, OpenAI, etc.) or a local LLM such as Ollama.
- Environment configured (see Quick Start)
Installation
# Clone the repository
git clone https://github.com/Beginnersguide138/cc-mcp.git
cd cc-mcp
# Install dependencies
uv sync
# Configure environment
cp .env.example .env
# Edit .env with your API credentials
Updating the Repository
To receive the latest updates after cloning the repository, use the following Git commands:
# Fetch the latest changes from the remote repository
git fetch origin
# Pull the latest changes from the main branch
git pull origin main
# Update dependencies (if they have changed)
uv sync
It's recommended to regularly update your local repository to:
- Get the latest features
- Receive bug fixes
- Ensure compatibility with the most recent version
Environment Configuration
# .env file example - Only 3 variables needed!
CLASSIFIER_API_URL=https://api.openai.com/v1/chat/completions
CLASSIFIER_API_KEY=sk-your_openai_api_key_here
CLASSIFIER_MODEL=gpt-3.5-turbo
# Alternative providers (see .env.example for full list):
# Azure OpenAI, Anthropic Claude, Google Gemini, Ollama supported
# For Ollama (Local LLM):
# CLASSIFIER_API_URL=http://localhost:11434/v1/chat/completions
# CLASSIFIER_API_KEY=not_required # Ollama doesn't need auth; leave empty with the latest version
# CLASSIFIER_MODEL=llama3
Launch Server
# Start CC-MCP server with stdio transport (default for clients like Cline)
uv run main.py
๐ฆ Using Ollama (Local LLM)
CC-MCP now supports Ollama for local LLM usage without API costs. This feature was added in a recent commit.
- Install and run Ollama:
# Install Ollama from https://ollama.com/
# Download and run a model
ollama run llama3
- Configure CC-MCP for Ollama in
.env
:
CLASSIFIER_API_URL=http://localhost:11434/v1/chat/completions
CLASSIFIER_API_KEY=not_required # Ollama doesn't need auth; leave empty
CLASSIFIER_MODEL=llama3 # Or: mistral, gemma, phi, qwen, etc.
- Update your MCP client configuration:
Simply change the
env
section in your existingcc-mcp
configuration:
"env": {
"CLASSIFIER_API_URL": "http://localhost:11434/v1/chat/completions",
"CLASSIFIER_API_KEY": "not_required",
"CLASSIFIER_MODEL": "llama3"
}
That's it! CC-MCP will now use your local Ollama model for intent classification.
๐ก Usage Examples
1. Basic Message Processing
import asyncio
from cc_mcp import process_user_message
async def demo():
# Define core problem
result = await process_user_message(
"I want to build an AI assistant for customer support"
)
print("Intent:", result["metadata"]["intent_classification"]["intent"])
# Expected: ["PROBLEM_DEFINITION"]
# Add constraints
result = await process_user_message(
"Budget is $50,000 and must be completed in 2 months"
)
print("Intent:", result["metadata"]["intent_classification"]["intent"])
# Expected: ["CONSTRAINT_ADDITION"]
# Ask unrelated question - context still preserved!
result = await process_user_message(
"What's the weather like today?"
)
# AI response will still maintain awareness of the support system project!
asyncio.run(demo())
2. Context Management
# Export current conversation state
context_data = await export_context(session_id="my-session")
print(f"Core Problem: {context_data['core_problem']}")
print(f"Active Constraints: {len(context_data['constraints'])}")
# Clear context for fresh start
await clear_context(session_id="my-session")
# Import previously saved context
await import_context(json_state=context_data, session_id="my-session")
3. Multi-Session Management
# Start new session
session_info = await start_session()
session_id = session_info["session_id"]
# Process messages in specific session
await process_user_message(
message="Design a mobile app",
session_id=session_id
)
# Check session statistics
stats = await get_session_stats(session_id=session_id)
print(f"Messages: {stats['total_messages']}")
print(f"Active Constraints: {stats['active_constraints']}")
๐ง Advanced Configuration
Custom Intent Labels
# Extend intent classification in intent_classifier.py
CUSTOM_INTENTS = {
"TECHNICAL_SPEC": "User is providing technical specifications",
"BUDGET_CHANGE": "User is modifying budget constraints",
"TIMELINE_UPDATE": "User is adjusting project timeline"
}
Context Storage Optimization
# Configure in context_store.py
CONTEXT_CONFIG = {
"max_core_contexts": 3, # Maximum core problems per session
"max_evolving_items": 10, # Maximum constraints/refinements
"turn_history_limit": 5, # Recent conversation turns to keep
"auto_cleanup": True # Automatic memory optimization
}
API Parameter Tuning
# Optimize for your use case
CLASSIFIER_PARAMS = {
"temperature": 0.1, # Lower for consistency
"max_tokens": 150, # Sufficient for classification
"top_p": 0.9 # Focus on high-probability responses
}
MAIN_LLM_PARAMS = {
"temperature": 0.7, # Higher for creativity
"max_tokens": 2000, # Generous for detailed responses
"presence_penalty": 0.1 # Encourage diverse vocabulary
}
๐ International Support
CC-MCP is designed with internationalization in mind:
- Japanese-Optimized: Native Japanese intent classification
- English Support: Full English documentation and examples
- Extensible: Easy to add support for additional languages
- Unicode-Safe: Handles all character encodings properly
๐ค Contributing
We welcome contributions from the global community!
How to Contribute:
- Fork the repository
- Create a feature branch (
git checkout -b feature/amazing-feature
) - Commit your changes (
git commit -m 'Add amazing feature'
) - Push to the branch (
git push origin feature/amazing-feature
) - Open a Pull Request
Development Setup:
# Clone your fork
git clone https://github.com/YOUR-USERNAME/cc-mcp.git
cd cc-mcp
# Install development dependencies
uv sync --group dev
# Run pre-commit hooks
pre-commit install
# Run the full test suite
uv run pytest
๐ Documentation
Usage Guides (Essential Reading!)
- ๐ - Complete guide for proper usage
- ๐ - ๆญฃใใไฝฟใๆนใฎๅฎๅ จใฌใคใ
- ๐ - ๆญฃ็กฎไฝฟ็จ็ๅฎๆดๆๅ
๐ Support & Community
- ๐ Report Issues - Bug reports and feature requests
- ๐ฌ Discussions - Community Q&A
- ๐ง Contact - Direct support
๐ Recognition
CC-MCP represents a breakthrough in AI conversation management:
- ๐ฅ First Production-Ready Solution for LLM intent forgetting
- โก Sub-300ms Performance with enterprise-grade reliability
- ๐ฌ Research-Backed architecture with proven effectiveness
- ๐ Open Source commitment to advancing AI technology
๐ License
This project is licensed under the MIT License - see the file for details.
๐ Acknowledgments
- MCP Protocol Team - For the excellent Model Context Protocol
- OpenAI & Azure OpenAI - For powerful LLM APIs
- FastAPI Community - For the robust web framework
- UV Team - For the lightning-fast Python package manager
๐ Ready to revolutionize your AI conversations?
Built with โค๏ธ by the CC-MCP Team
Empowering consistent AI conversations worldwide