Wikipedia MCP Server
A Model Context Protocol (MCP) server that provides tools to fetch and format Wikipedia content for The Wiki Game, where models can compete on time/clicks to go from one Wikipedia page to another.
Overview
This server exposes several tools that can:
- Fetch the main text content and a list of internal links from a Wikipedia page.
- Provide a "game-friendly" HTML version of a Wikipedia page, where only internal article links are active, designed for experiences like the "Wikipedia race game."
- Extract all unique Wikipedia article links from a page, with options to return full URLs or just page titles to minimize token usage.
It uses `httpx` for asynchronous HTTP requests and `BeautifulSoup4` for HTML parsing.
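To make the pattern concrete, the snippet below sketches the fetch-and-parse idea with `httpx` and BeautifulSoup. It is an illustration of the approach, not the server's actual code; the `fetch_article_text` helper is invented for this example.

```python
# Minimal sketch of the fetch-and-parse pattern: download a page asynchronously
# with httpx and pull the article body out with BeautifulSoup.
# Illustrative only; not the server's actual implementation.
import asyncio

import httpx
from bs4 import BeautifulSoup

async def fetch_article_text(page_title: str) -> str:
    url = f"https://en.wikipedia.org/wiki/{page_title.replace(' ', '_')}"
    async with httpx.AsyncClient(follow_redirects=True) as client:
        response = await client.get(url)
        response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    content = soup.select_one("div#mw-content-text")  # selector noted in Development Notes
    return content.get_text(" ", strip=True) if content else ""

if __name__ == "__main__":
    print(asyncio.run(fetch_article_text("Python (programming language)"))[:500])
```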
Features
- Structured Content Extraction: Get clean text and a list of internal links from any Wikipedia page.
- Wikipedia Race Game Support: Fetch a version of a Wikipedia page tailored for "click-based" navigation games, with external links and clutter removed.
- Minimal Link Extraction: Efficiently retrieve all Wikipedia article links from a page, returning either full URLs or just titles.
- Asynchronous Operations: Built with `async` and `httpx` for non-blocking network requests.
Prerequisites
- Python 3.12+
- `mcp[cli]` library
- `httpx` library
- `beautifulsoup4` library
Installation
- Clone the repository:

  ```
  git clone https://github.com/hunterpaulson/wikipedia-mcp-server.git
  cd wikipedia-mcp-server
  ```

- Set up a Python virtual environment (recommended):

  ```
  python -m venv .venv
  source .venv/bin/activate  # On Windows use `.\.venv\Scripts\activate`
  ```

- Install the project and its dependencies. The project uses `pyproject.toml` to define its dependencies. If you are using `uv`:

  ```
  uv pip install -e .
  ```

  This command installs the project in "editable" mode, which is convenient for development. It automatically picks up the dependencies listed in `pyproject.toml`.

  Alternatively, using `pip`:

  ```
  pip install -e .
  ```

  This also installs the project in editable mode.
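As a quick sanity check that the dependencies resolved into the active environment, you can run a snippet like the following (purely illustrative):

```python
# Quick import check for the core dependencies.
# Run inside the activated virtual environment.
import bs4
import httpx
import mcp  # noqa: F401  (imported only to confirm it is installed)

print("httpx", httpx.__version__)
print("beautifulsoup4", bs4.__version__)
```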
Running the Server
You can run the MCP server in a couple of ways:
- Using the MCP development tool (recommended for development). This provides an inspector UI to easily test your tools:

  ```
  mcp dev server.py
  ```

- Directly executing the Python script:

  ```
  python server.py
  ```
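For programmatic use, an MCP client can also launch the server over stdio and call its tools directly. The sketch below uses the official Python MCP SDK's stdio client; it assumes `server.py` is in the current directory and exposes the tools documented below.

```python
# Sketch: launch the server over stdio and call one of its tools via the MCP Python SDK.
# Assumes server.py lives in the current directory; adjust the command/args as needed.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main() -> None:
    server = StdioServerParameters(command="python", args=["server.py"])
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool(
                "fetch_wikipedia_page", {"page_title": "Artificial intelligence"}
            )
            print(result.content)

asyncio.run(main())
```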
Available MCP Tools
Once the server is running, the following tools will be available to compatible MCP clients (e.g., via the MCP Inspector started with `mcp dev`):
1. fetch_wikipedia_page
Fetches the main text content and internal links of a Wikipedia page.
- Arguments:
  - `page_title` (str): The title of the Wikipedia page (e.g., "Python (programming language)").
- Returns: (Dict) A dictionary containing:
  - `"main_text"`: The main article text.
  - `"wikipedia_links"`: A list of full URLs to other Wikipedia articles found on the page.

  Or, in case of an error:
  - `"error"`: A string describing the error.
  - `"main_text"`: Empty string.
  - `"wikipedia_links"`: Empty list.
- Example Call (in MCP Inspector):
  - Tool: `fetch_wikipedia_page`
  - `page_title`: "Artificial intelligence"
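For illustration, a successful result is shaped roughly like this (the values below are invented, not real tool output):

```python
# Illustrative shape of a successful fetch_wikipedia_page result; values are made up.
result = {
    "main_text": "Artificial intelligence (AI) is ...",
    "wikipedia_links": [
        "https://en.wikipedia.org/wiki/Machine_learning",
        "https://en.wikipedia.org/wiki/Alan_Turing",
    ],
}
```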
2. fetch_next_wikipedia_page
Fetches and processes a Wikipedia page for the "Wikipedia race game." Returns HTML content where only internal Wikipedia article links are active. References, external links, and non-essential elements (like navboxes, category links, edit links) are removed or de-linked. Infoboxes (sidebars with summary information) are preserved.
- Arguments:
  - `page_title` (str): The title of the Wikipedia page (e.g., "Philosophy").
- Returns: (str) An HTML string of the processed page content, suitable for rendering in a simple web view for the game. In case of an error, returns an HTML string containing the error message (e.g., `<p>Error fetching page...</p>`).
- Example Call (in MCP Inspector):
  - Tool: `fetch_next_wikipedia_page`
  - `page_title`: "Game theory"
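The de-linking described above can be pictured with a small BeautifulSoup sketch. This illustrates the general approach only, not the server's exact implementation; the helper name `keep_only_article_links` is invented.

```python
# Sketch of the filtering idea: keep only links that look like internal article links
# ("/wiki/Title" without a namespace colon) and de-link everything else.
from bs4 import BeautifulSoup

def keep_only_article_links(html: str) -> str:
    soup = BeautifulSoup(html, "html.parser")
    for a in soup.find_all("a", href=True):
        href = a["href"]
        is_article = href.startswith("/wiki/") and ":" not in href[len("/wiki/"):]
        if not is_article:
            a.unwrap()  # drop the link but keep its text
    return str(soup)
```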
3. extract_all_wikipedia_links
Fetches a Wikipedia page and extracts all unique links to other Wikipedia articles found anywhere on the page.
- Arguments:
  - `page_title` (str): The title of the Wikipedia page (e.g., "World Wide Web").
  - `titles_only` (bool, optional, default: `True`):
    - If `True`, returns only the page titles (e.g., "Tim Berners-Lee").
    - If `False`, returns full URLs (e.g., "https://en.wikipedia.org/wiki/Tim_Berners-Lee").
- Returns: (List[str]) A list of unique Wikipedia page titles or full URLs. If an error occurs, returns a list containing a single string with the error message.
- Example Call (in MCP Inspector):
  - Tool: `extract_all_wikipedia_links`
  - `page_title`: "Hypertext Transfer Protocol"
  - `titles_only`: `True` (or `False` to get full URLs)
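The extraction and `titles_only` behavior can be sketched as follows. This is an assumed outline of the behavior documented above, not the server's verbatim code; the helper name `extract_links` is invented.

```python
# Sketch: collect unique /wiki/ article links from a page and return either
# page titles or full URLs, preserving first-seen order.
from urllib.parse import unquote

from bs4 import BeautifulSoup

def extract_links(html: str, titles_only: bool = True) -> list[str]:
    soup = BeautifulSoup(html, "html.parser")
    seen: dict[str, None] = {}  # dict keys keep entries unique and ordered
    for a in soup.find_all("a", href=True):
        href = a["href"]
        if href.startswith("/wiki/") and ":" not in href[len("/wiki/"):]:
            title = unquote(href[len("/wiki/"):]).replace("_", " ")
            key = title if titles_only else f"https://en.wikipedia.org{href}"
            seen.setdefault(key, None)
    return list(seen)
```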
Development Notes
- The server is configured using `FastMCP` from the `mcp` library.
- HTML parsing relies on specific Wikipedia page structures (e.g., `div#mw-content-text`, `div.mw-parser-output`). Significant changes to Wikipedia's layout might require updates to the parsing logic.
- Error handling is included for HTTP issues and unexpected exceptions during processing.
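For orientation, a FastMCP-based server of this shape is typically wired up roughly as in the sketch below. This is an assumed outline, not the actual contents of `server.py`.

```python
# Rough shape of a FastMCP server exposing a tool like fetch_wikipedia_page.
# Assumed outline only; the repository's server.py may differ.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("wikipedia")

@mcp.tool()
async def fetch_wikipedia_page(page_title: str) -> dict:
    """Fetch the main text and internal links of a Wikipedia page."""
    ...  # httpx request + BeautifulSoup parsing would go here

if __name__ == "__main__":
    mcp.run()
```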