DatalandMCP
This repository contains an MCP server that allows LLMs to access data from Dataland. Additionally, the open-source chat client LibreChat can be launched within this repository and serve as the MCP host.
Prerequisites
- Have Docker Desktop installed
- Create a personal account on https://dataland.com and https://test.dataland.com
Quick Start with Docker Compose
The easiest way to get started is by using Docker Compose, which starts the MCP server and LibreChat with one command.
Note: LibreChat requires some configuration prior to launching the service. See instructions below.
Create a `.env` file at the project root with your Dataland API key:
DATALAND_API_KEY=your_api_key_here
You can create your API key as described here.
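Before launching, it can help to confirm that the key is actually in place. A minimal sketch, assuming a POSIX shell; the `check_dataland_env` helper name is ours and not part of the repository:

```shell
#!/bin/sh
# check_dataland_env: verify that a .env file exists and contains a
# non-empty DATALAND_API_KEY entry (illustrative helper, not part of the repo).
check_dataland_env() {
  env_file="${1:-.env}"
  if [ ! -f "$env_file" ]; then
    echo "missing $env_file"
    return 1
  fi
  key=$(grep -E '^DATALAND_API_KEY=' "$env_file" | head -n 1 | cut -d '=' -f 2-)
  if [ -z "$key" ]; then
    echo "DATALAND_API_KEY is not set in $env_file"
    return 1
  fi
  echo "DATALAND_API_KEY present"
}
```

Running it from the project root before the launch script fails fast on a missing or empty key instead of surfacing the problem later inside the containers.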
Launch
From the repository root directory, start both the MCP server and host:
./deployment/local_deployment.sh --profile all
After successful launch:
- LibreChat will be available at http://localhost:3080
- The DatalandMCP server streams via http://localhost:8001/mcp
- MCP server documentation (Swagger UI) will be accessible at http://localhost:8000/DatalandMCP/docs
To stop the services:
docker compose --profile all down
Startup Options
There are three profiles for launching and stopping only specific services, selected via the `--profile` flag.
| `--profile` | DatalandMCP | LibreChat |
|---|---|---|
| `mcp` | ✅ | ❌ |
| `librechat` | ❌ | ✅ |
| `all` | ✅ | ✅ |
Note: The startup option `librechat` is intended for cases where an instance of the MCP server is deployed on a remote machine and you only want to run LibreChat locally.
Docker Volumes (User data, configurations, ...)
LibreChat stores specific data (e.g. user accounts) in volumes that are preserved between service restarts.
# List all volumes
docker volume ls
# Stop the services first
docker compose --profile all down
# Remove specific volumes to start fresh
docker volume rm <volume-name1> <volume_name2>
# Start services again
./deployment/local_deployment.sh --profile all
Configure LibreChat
LibreChat can be configured via a `librechat.yaml` file in the project root.
Note: A separate file `.env.librechat` contains the environment variables needed for LibreChat. It contains no private secrets and does not need to be modified.
The DatalandMCP server is already configured in LibreChat, which expects the server to run on port 8001. If you use a different port, you must also change it in the `librechat.yaml` file.
```yaml
# DatalandMCP Server Connection
mcpServers:
  Dataland:
    type: http
    url: http://host.docker.internal:8001/mcp
    timeout: 60000
```
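If you do change the port, a quick way to double-check what `librechat.yaml` currently points at is to read it back out of the `url` line. A sketch; the `mcp_port_from_yaml` helper name and the pattern it matches are our assumptions based on the configuration shown above:

```shell
#!/bin/sh
# mcp_port_from_yaml: print the port of the first MCP url line
# (e.g. "url: http://host.docker.internal:8001/mcp") found in the given file.
mcp_port_from_yaml() {
  grep -E 'url:.*:[0-9]+/mcp' "$1" | head -n 1 | sed -E 's#.*:([0-9]+)/mcp.*#\1#'
}
```

Comparing its output against the port the MCP server actually listens on catches the most common misconfiguration before LibreChat starts.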
The following steps illustrate how to connect an Azure OpenAI model to LibreChat.
1. Stop running services:

   docker compose --profile all down

2. Add API key: Add the `API_KEY` of the deployed model to the `.env` file:

   AZURE_OPENAI_API_KEY=your_api_key_here
3. Add model to config file: Open the `librechat.yaml` file located in the project root. Go to the `endpoints` object and uncomment the `azureOpenAI` configuration:

   ```yaml
   # Azure OpenAI configuration
   endpoints:
     azureOpenAI:
       titleModel: "" # Name of the deployed model in Azure, e.g. "d-fine-azure-gpt-5".
       groups:
         - group: "" # Arbitrary name, e.g. "dataland-group"
           apiKey: "${AZURE_OPENAI_API_KEY}" # Azure OpenAI API key from .env
           instanceName: "" # Azure resource name, e.g. "dataland-mcp-resource"
           version: "" # API version, e.g. "2024-12-01-preview"
           models:
             displayed-model-name: # Change to the name of the deployed model in Azure, e.g. "d-fine-azure-gpt-5".
               deploymentName: "" # Name of the deployed model in Azure, e.g. "d-fine-azure-gpt-5".
               version: "" # Same API version as above, e.g. "2024-12-01-preview"
   ```

   Fill out the configuration with the corresponding values of your deployed model. Note that the `displayed-model-name` key also needs to be changed to your model name.
4. Add model specifications: Within the `librechat.yaml` file, go to the `modelSpecs` object and uncomment the configuration:

   ```yaml
   # Model Specification
   modelSpecs:
     enforce: false
     prioritize: true
     list:
       - name: "dataland-mcp-gpt-X" # Unique identifier, change accordingly
         label: "Dataland Assistant (GPT-X)" # Displayed name, change accordingly
         default: true
         description: "Retrieves and analyzes ESG data from Dataland."
         preset:
           endpoint: "azureOpenAI"
           model: "your_model_here" # Model name of the configured endpoint below
           ...
   ```

   For `model`, use the name of the deployed model (`titleModel` from the previous step). Amend `name` and `label` according to the GPT version used. The other values must not be changed.
5. Start the services:

   ./deployment/local_deployment.sh --profile all

   Note: Initially, LibreChat may report that the connection to the MCP server has failed. This occurs because the LibreChat service starts more quickly than the DatalandMCP service, so the MCP server might not yet be running. As soon as the server is running, LibreChat will connect without requiring a restart.

6. Create a LibreChat account: Navigate to http://localhost:3080 and create an account.

7. Select the Dataland MCP server: Upon successful connection with the MCP server, a button will appear in the chat window. Select Dataland and start chatting.
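The startup-order note above can also be handled with a small readiness poll: wait until an HTTP endpoint of the MCP service answers before opening LibreChat. A sketch, assuming `curl` is available; the helper name is ours, and the Swagger UI URL in the example is the default from this README (the `/mcp` endpoint itself may reject plain GET requests, so the docs page is a simpler probe):

```shell
#!/bin/sh
# wait_for_http: poll a URL until it responds or a timeout (seconds) elapses.
wait_for_http() {
  url="$1"
  timeout="${2:-60}"
  elapsed=0
  until curl -fsS -o /dev/null "$url" 2>/dev/null; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "timed out waiting for $url"
      return 1
    fi
    sleep 2
    elapsed=$((elapsed + 2))
  done
  echo "$url is up"
}

# Example: wait_for_http http://localhost:8000/DatalandMCP/docs 120
```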
Troubleshooting
WSL Segmentation Errors (Windows Users)
If you're running on Windows with WSL and encounter segmentation faults during `pdm install`, this is likely due to insufficient RAM allocation. Create a `.wslconfig` file in your Windows user directory (`C:\Users\[username]\.wslconfig`) with the following content:
[wsl2]
memory=8GB
processors=4
swap=2GB
Adjust the memory allocation based on your system's available RAM. After creating the file, restart WSL by running `wsl --shutdown` in PowerShell and then reopen your WSL terminal.
Development Setup
For development purposes, you may want to set up the environment locally without Docker.
Prerequisites for Development
- Have Python 3.11 or 3.12 installed
- Have PDM installed on your machine (on Windows, open Command Prompt and execute the following command to download PDM, then restart your PC):
powershell -ExecutionPolicy ByPass -c "irm https://pdm-project.org/install-pdm.py | py -"
- Have Java installed (if you have attended the d-fine Basic IT training during onboarding, you should already have it). It is recommended to use the IntelliJ IDEA Community Edition.
- Have Visual Studio Code or PyCharm Community Edition installed.
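The Python requirement above can be checked before installing anything else. A small sketch; the `supported_python` helper name is ours:

```shell
#!/bin/sh
# supported_python: check whether a version string (e.g. from `python --version`)
# matches the Python 3.11/3.12 requirement stated above.
supported_python() {
  case "$1" in
    3.11|3.11.*|3.12|3.12.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: supported_python "$(python --version 2>&1 | cut -d ' ' -f 2)" && echo "supported"
```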
Clone this repository to a designated folder via `git clone`.
Dataland Client
- Create a `.env` file at the project root based on the `.env.example` file. Set `DATALAND_MCP_ROOT_DIR` to the repository root on your machine and `DATALAND_API_KEY` to your API key (you can create one as described here).
- Execute `.\bin\setup_dev_environment.sh` using a Git Bash shell from your repository root.