DatalandMCP by d-fine - MCP Server

DatalandMCP

This repository contains an MCP server that allows LLMs to access data from Dataland. Additionally, the open-source chat client LibreChat can be launched within this repository and serve as the MCP host.

Prerequisites
Quick Start with Docker Compose
- Launch
- Configure LibreChat
Troubleshooting
Development Setup
- Dataland Client

Prerequisites

Have Docker Desktop installed
Create a personal account on https://dataland.com and https://test.dataland.com

Quick Start with Docker Compose

The easiest way to get started is by using Docker Compose, which starts the MCP server and LibreChat with one command.

Note: LibreChat requires some configuration prior to launching the service. See instructions below.

Create a .env file at the project root with your Dataland API key:

DATALAND_API_KEY=your_api_key_here

You can create your API key as described here.

Launch

From the repository root directory, start both the MCP server and host:

./deployment/local_deployment.sh --profile all

After successful launch:

LibreChat will be available at http://localhost:3080
DatalandMCP server streams via http://localhost:8001/mcp
MCP server documentation (Swagger UI) will be accessible at http://localhost:8000/DatalandMCP/docs

To stop the services:

docker compose --profile all down

Startup Options

There are three different profiles to only launch and stop specific services. These can be triggered via the --profile flag.

`--profile`	DatalandMCP	LibreChat
`mcp`	✅	❌
`librechat`	❌	✅
`all`	✅	✅

Note: The startup option librechat is to be used if an instance of the MCP server is deployed on a remote machine and the user only wants to run LibreChat locally.

Docker Volumes (User data, configurations, ...)

LibreChat stores specific data (e.g. user accounts) in volumes that are preserved between service restarts.

# List all volumes
docker volume ls

# Stop the services first
docker compose --profile all down

# Remove specific volumes to start fresh
docker volume rm <volume-name1> <volume_name2>

# Start services again
./deployment/local_deployment.sh --profile all

Configure LibreChat

LibreChat can be configured via a librechat.yaml file in the project root.

Note: A separate file .env.librechat contains the environment variables needed for LibreChat. They do not contain private secrets and do not need to be modified.

The DatalandMCP server is already configured in LibreChat which expects the server to run on port 8001. Using other ports will require to amend the port also in the librechat.yaml file.

# DatalandMCP Server Connection
mcpServers:
  Dataland:
    type: http
    url: http://host.docker.internal:8001/mcp
    timeout: 60000

The following steps illustrate how to connect an Azure OpenAI model to LibreChat.

Stop running services:
```
docker compose --profile all down
```
Add API key: Add the API_KEY of the deployed model to the .env file:
```
AZURE_OPENAI_API_KEY=your_api_key_here
```

Add model to config file: Open the librechat.yaml file located in the project root. Go to the endpoints object and uncomment the azureOpenAI configuration:

# Azure OpenAI configuration
endpoints:
  azureOpenAI:
    titleModel: "" # Name of the deployed model in Azure, e.g. "d-fine-azure-gpt-5".
    groups:
      - group: "" # Arbitrary name, e.g. "dataland-group"
        apiKey: "${AZURE_OPENAI_API_KEY}" # Azure OpenAI API KEY from .env
        instanceName: "" # Azure resource name, e.g. "dataland-mcp-resource"
        version: "" # API version, e.g. "2024-12-01-preview"
        models:
          displayed-model-name: # Change to name of the deployed model in Azure, e.g. "d-fine-azure-gpt-5".
            deploymentName: "" # Name of the deployed model in Azure, e.g. "d-fine-azure-gpt-5".
            version: "" # API version same as above, e.g. "2024-12-01-preview"

Fill out the configuration with the corresponding values of your deployed model. Note that the displayed-model-name key also needs to be changed to your model name.

Add model specifications: Within the librechat.yaml file, go to the modelSpecs object and uncomment the configuration.

# Model Specification
modelSpecs:
  enforce: false
  prioritize: true
  list:
    - name: "dataland-mcp-gpt-X" # Unique identifier, change accordingly
      label: "Dataland Assistant (GPT-X)" # Displayed name, change accordingly
      default: true
      description: "Retrieves and analyzes ESG data from Dataland."
      preset:
        endpoint: "azureOpenAI"
        model: "your_model_here" # Model name of the configured endpoint below
        ...

For model use the name of the deployed model (titleModel from the previous step). Amend name and label according to the used GPT version. The other values must not be changed.

Start the services:
```
./deployment/local_deployment.sh --profile all
```
Note: Initially, LibreChat may report that the connection to the MCP server has failed. This occurs because the LibreChat service starts more quickly than the DatalandMCP service; hence, the MCP server might not yet be running. As soon as the server is running, LibreChat will connect without requiring a restart.
Create a LibreChat account: Navigate to http://localhost:3080 and create an account.
Select the Dataland MCP server: Upon successful connection with the MCP server, a button will appear in the chat window. Select Dataland and start chatting.

Troubleshooting

WSL Segmentation Errors (Windows Users)

If you're running on Windows with WSL and encounter segmentation faults during pdm install, this is likely due to insufficient RAM allocation. Create a .wslconfig file in your Windows user directory (C:\Users\[username]\.wslconfig) with the following content:

[wsl2]
memory=8GB
processors=4
swap=2GB

Adjust the memory allocation based on your system's available RAM. After creating the file, restart WSL by running wsl --shutdown in PowerShell and then reopen your WSL terminal.

Development Setup

For development purposes, you may want to set up the environment locally without Docker.

Prerequisites for Development

Have Python 3.11 or 3.12 installed
Have PDM installed on your machine (on Windows, open Command Prompt and execute the following command to download PDM, then restart your PC):
```
powershell -ExecutionPolicy ByPass -c "irm https://pdm-project.org/install-pdm.py | py -"
```
Have Java installed (if you have attended the d-fine Basic IT training during onboarding you should already have it). It is recommended to use the IntelliJ IDEA Community Edition.
Have Visual Studio Code or PyCharm Community Edition installed.

Clone this repository to a designated folder via git clone.

Dataland Client

Create a .env file at the project root based on the .env.example file. Set DATALAND_MCP_ROOT_DIR to the repository root on your machine and DATALAND_API_KEY to your API key (you can create one as described here).
Execute .\bin\setup_dev_environment.sh using a Git Bash shell from your repository root.

d-fine/DatalandMCP