ibm-watsonxdata-dl-retrieval-mcp-server

IBM/ibm-watsonxdata-dl-retrieval-mcp-server

Watsonx.data Document Library Retrieval MCP Server

The Watsonx.data Document Library Retrieval MCP Server is a Model Context Protocol (MCP)-compliant service that seamlessly connects AI agents with document libraries in watsonx.data, enabling intelligent data retrieval and interaction.

Key Features

  • Dynamic Discovery & Registration
    Automatically detects and registers document libraries as MCP tools.

  • Natural Language Interface
    Query document libraries using conversational language and receive human-readable responses.

  • Minimal Configuration
    Deploy with a simple setup and no complex configuration.

  • Framework-Agnostic Integration
    Plug directly into your preferred agentic framework through native MCP compatibility.


Overview

  • Protocol: Model Context Protocol (MCP)
  • Purpose: Acts as a bridge between agentic AI frameworks and watsonx.data document libraries
  • Supported Environments: IBM Cloud Pak for Data (CPD), Watsonx SaaS
  • Agent Compatibility: The agentic framework must support the MCP standard (via SSE or Stdio).
    Note: This server will not function with agents that do not support MCP.

Prerequisites

  • Python version 3.11 or later
  • Access to your CPD or SaaS environment
  • Access credentials and a CA certificate bundle for CPD
  • An agent framework that supports MCP
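
The Python version requirement above can be checked before installation. The following is a minimal sketch; the helper name `meets_minimum` is hypothetical and is not part of the server package.

```python
import sys

# Minimum interpreter version stated in the prerequisites.
MIN_VERSION = (3, 11)


def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if meets_minimum():
        print("Python version OK")
    else:
        print(f"Python {MIN_VERSION[0]}.{MIN_VERSION[1]} or later is required")
```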

Getting CA Bundle for CPD

  1. Log in to your OpenShift cluster:

    oc login -u kubeadmin -p '<your_openshift_password>' https://<your_openshift_cpd_url>:6443
    
  2. Extract the root CA bundle:

    oc get configmap kube-root-ca.crt -o jsonpath='{.data.ca\.crt}' > cabundle.crt
    

NOTE: Use the OpenShift login command shown above. The username and password are the ones you use to log in to the OpenShift portal.
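
Before pointing the server at the extracted file, it can help to confirm that cabundle.crt really contains PEM certificates (an empty file usually means the `oc` command failed). The sketch below uses only the Python standard library; the function names are hypothetical, and the check only looks for PEM markers without validating the certificates themselves.

```python
import re
from pathlib import Path

# A PEM certificate is delimited by BEGIN/END CERTIFICATE marker lines.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.+?-----END CERTIFICATE-----",
    re.DOTALL,
)


def count_pem_certificates(text: str) -> int:
    """Count the PEM certificate blocks found in `text`."""
    return len(PEM_BLOCK.findall(text))


def check_bundle(path: str) -> int:
    """Return the number of certificates in the file, or raise if there are none."""
    text = Path(path).read_text(encoding="utf-8")
    count = count_pem_certificates(text)
    if count == 0:
        raise ValueError(f"{path} contains no PEM certificates")
    return count
```

For example, `check_bundle("cabundle.crt")` returns the number of certificates found, or raises `ValueError` if the file holds none.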


Setup

Step 1: Install Python (version 3.11 or later)

Step 2: Create a virtual environment

python -m venv .venv

Step 3: Activate the virtual environment

source .venv/bin/activate  # macOS/Linux
.venv\Scripts\activate     # Windows

Step 4: Install the uv package manager

pip install uv

Step 5: Install the MCP server package

pip install ibm-watsonxdata-dl-retrieval-mcp-server

Configuration

For Cloud Pak for Data (CPD):

export CPD_ENDPOINT="<cpd-endpoint>"
export CPD_USERNAME="<cpd-username>"
export CPD_PASSWORD="<cpd-password>"
export CA_BUNDLE_PATH="<absolute_path_to_cabundle.crt>"
export LH_CONTEXT="CPD"

For Watsonx SaaS:

export WATSONX_DATA_API_KEY="<api-key>"
export WATSONX_DATA_RETRIEVAL_ENDPOINT="<retrieval-service-endpoint>"
export DOCUMENT_LIBRARY_API_ENDPOINT="<document-library-endpoint>"
export WATSONX_DATA_TOKEN_GENERATION_ENDPOINT="<token-generation-endpoint>"
export LH_CONTEXT="SAAS"
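
A quick pre-flight check can catch a missing variable before the server starts. The sketch below mirrors the two variable lists above; treating every listed variable as required is an assumption, and `missing_variables` is a hypothetical helper, not part of the server package.

```python
import os

# Variables listed in this section for each LH_CONTEXT value.
# Assumption: all of them are required by the server.
REQUIRED_VARS = {
    "CPD": ["CPD_ENDPOINT", "CPD_USERNAME", "CPD_PASSWORD", "CA_BUNDLE_PATH"],
    "SAAS": [
        "WATSONX_DATA_API_KEY",
        "WATSONX_DATA_RETRIEVAL_ENDPOINT",
        "DOCUMENT_LIBRARY_API_ENDPOINT",
        "WATSONX_DATA_TOKEN_GENERATION_ENDPOINT",
    ],
}


def missing_variables(env=None):
    """Return the names of variables that are unset or empty for the chosen context."""
    env = os.environ if env is None else env
    context = env.get("LH_CONTEXT", "")
    if context not in REQUIRED_VARS:
        # LH_CONTEXT itself is missing or is neither "CPD" nor "SAAS".
        return ["LH_CONTEXT"]
    return [name for name in REQUIRED_VARS[context] if not env.get(name)]


if __name__ == "__main__":
    missing = missing_variables()
    print("Missing:", ", ".join(missing) if missing else "none")
```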


Running the Server

uv run ibm-watsonxdata-dl-retrieval-mcp-server

By default, the server runs in SSE transport mode on port 8000.

Transport: SSE

uv run ibm-watsonxdata-dl-retrieval-mcp-server --port <desired_port> --transport sse

Transport: stdio

uv run ibm-watsonxdata-dl-retrieval-mcp-server --port <desired_port> --transport stdio
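
When the server is started from a script rather than a shell, the same command line can be assembled programmatically. This is a minimal sketch that uses only the `--port` and `--transport` flags shown above; `build_command` is a hypothetical helper name.

```python
def build_command(transport="sse", port=8000):
    """Build the argument list for starting the server with the given transport."""
    if transport not in ("sse", "stdio"):
        raise ValueError(f"unsupported transport: {transport!r}")
    return [
        "uv", "run", "ibm-watsonxdata-dl-retrieval-mcp-server",
        "--port", str(port),
        "--transport", transport,
    ]


# Usage (blocks until the server exits):
#   import subprocess
#   subprocess.run(build_command("sse", 9000), check=True)
```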

Integrating with WXO

Prerequisite:

Install the WXO ADK and complete the initial setup. Refer to the documentation for more details: https://developer.watson-orchestrate.ibm.com

Transport: STDIO

To add the MCP server to WXO using the stdio transport, follow the steps below.

  1. Create a connection:

    orchestrate connections add -a <app id>

  2. Configure the connection:

    orchestrate connections configure --app-id <app id> --environment draft -t team -k key_value

  3. Set the credentials:

    orchestrate connections set-credentials --app-id=<app id> --env draft -e WATSONX_DATA_API_KEY="<api_key>" -e WATSONX_DATA_RETRIEVAL_ENDPOINT="<wxd retrieval endpoint>" -e DOCUMENT_LIBRARY_API_ENDPOINT="<DL endpoint>" -e WATSONX_DATA_TOKEN_GENERATION_ENDPOINT="<token generation endpoint>" -e LH_CONTEXT="SAAS"
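
The set-credentials command is long and easy to mistype. The sketch below assembles it from a dictionary, using only the flags that appear in the command above (`--app-id`, `--env`, `-e`); the helper name `set_credentials_args` and the placeholder values are illustrative.

```python
import shlex


def set_credentials_args(app_id, credentials, environment="draft"):
    """Build the argument list for `orchestrate connections set-credentials`."""
    args = [
        "orchestrate", "connections", "set-credentials",
        f"--app-id={app_id}",
        "--env", environment,
    ]
    # One `-e KEY=value` pair per credential, in insertion order.
    for key, value in credentials.items():
        args += ["-e", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Placeholder values only; substitute your own before running the command.
    example = {"WATSONX_DATA_API_KEY": "<api_key>", "LH_CONTEXT": "SAAS"}
    print(shlex.join(set_credentials_args("<app id>", example)))
```

Passing the returned list to `subprocess.run` avoids shell quoting problems with values that contain spaces or special characters.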

Example for SaaS:

orchestrate toolkits import \
    --kind mcp \
    --name "mcp-toolkit" \
    --description "MCP server for watsonx retrieval service" \
    --package "ibm-watsonxdata-dl-retrieval-mcp-server" \
    --command "uv run ibm-watsonxdata-dl-retrieval-mcp-server --port <port> --transport stdio" \
    --language python \
    --tools "*" \
    --app-id <app id>

Transport: SSE

  1. Install mcp-proxy:

    pip install mcp-proxy

  2. Run ibm-watsonxdata-dl-retrieval-mcp-server with the SSE transport.

Once these prerequisites are met, the tools can be added as a toolkit in WXO.

Example:

orchestrate toolkits import \
  --kind mcp \
  --name mcp_toolkit \
  --description "MCP server (hosted, SSE)" \
  --package "mcp-proxy" \
  --language python \
  --command "uvx mcp-proxy https://<mcp server endpoint>/sse" \
  --tools "*"

NOTE:
When WXO runs in SaaS and the MCP server runs locally, expose the MCP server endpoint so that WXO can reach it, if required.

Refer to the WXO documentation for more details: https://www.ibm.com/docs/en/watsonx/watson-orchestrate/base?topic=tools-importing-from-mcp-server
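
Before importing the toolkit, it can be useful to confirm that the host and port of the MCP server accept TCP connections from the machine that will call it. The sketch below is a plain socket check with a hypothetical helper name; it only shows that something is listening, not that the /sse endpoint behaves correctly.

```python
import socket


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Usage, assuming the server runs locally on the default port:
#   is_reachable("localhost", 8000)
```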


Integrating with other Agentic Frameworks

For more examples of using the Watsonx.data Document Library Retrieval MCP Server with agentic frameworks, refer to the examples.

Limitations

  • Environment credentials cannot be changed during runtime.
    • To change credentials, either:
      • Start a new server with new environment variables, OR
      • Source new environment variables and restart the server.
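
Because credentials are fixed for the lifetime of a server process, one way to switch them from a script is to start a fresh process with an updated environment. The sketch below only prepares that environment; `environment_with` is a hypothetical helper, and the commented launch line assumes the `uv run` command from the Running the Server section.

```python
import os


def environment_with(new_credentials, base=None):
    """Return a copy of `base` (default: the current environment) with overrides applied."""
    env = dict(os.environ) if base is None else dict(base)
    env.update(new_credentials)
    return env


# Usage: stop the old server process, then start a new one with the new values.
#   import subprocess
#   subprocess.Popen(
#       ["uv", "run", "ibm-watsonxdata-dl-retrieval-mcp-server"],
#       env=environment_with({"WATSONX_DATA_API_KEY": "<new-api-key>"}),
#   )
```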

Tool Naming

Each document library is registered with a unique tool name:

tool_name = <library_name><library_id>

Example:

invoice_document_library77e4b4dd_479e_4406_acc4_ce154c96266c
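
The naming rule can be reproduced in a few lines, which is handy when an agent needs to know a tool name in advance. In the example above, the library ID appears with underscores where a UUID would normally have hyphens; the sketch below assumes that this replacement is the only transformation applied, which the source does not state explicitly. The function name is hypothetical.

```python
def tool_name(library_name: str, library_id: str) -> str:
    """Concatenate the library name and ID into a tool name.

    Assumption: hyphens in the ID are replaced with underscores,
    as inferred from the example in this section.
    """
    return f"{library_name}{library_id.replace('-', '_')}"


if __name__ == "__main__":
    print(tool_name("invoice_document_library", "77e4b4dd-479e-4406-acc4-ce154c96266c"))
```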