How it works
Topic Builder lets you use AI to collect a set of Wikipedia articles about any topic. You explain what the topic is, and what sorts of articles should be central to it, peripheral but still relevant, or outside the scope of your topic. The AI gets a menu of common exploration strategies, and handles the mechanical work — running searches, traversing categories, fetching descriptions.
Getting a more complete topic
The AI will produce a serviceable topic on its own, but it tends to stop where stopping is easy: round-number article counts, the obvious core, the first strategies it tried. The difference between a "fine" topic and a great one is mostly how involved you are at five high-leverage moments.
- Push back during scoping. The AI will propose a default scope. State explicitly what's in and what's out — biographies of key people, sub-events, geographic or sister-language variants, "in popular culture" articles, lists. Ambiguity in scope becomes ambiguity in the corpus.
- Don't accept the first "looks done." LLMs gravitate to round-number stopping points and pattern-complete to "this seems comprehensive." Ask "what kinds of articles haven't we tried to find yet?" — the AI usually has more strategies left when prompted.
- Run complementary strategies and compare. If you've harvested via WikiProject, also try a category-tree or list-page pull. The diff between two strategies surfaces both noise (off-topic articles that only one strategy picked up) and gaps (on-topic articles that one strategy found and the other missed).
- Spot-check before exporting. Name 3–5 articles you'd expect to see — ideally obscure ones, not the obvious core. If any are missing, ask the AI to investigate; one missed article often reveals a strategy gap affecting dozens of similar ones.
- Bring your domain expertise to edge calls. For ambiguous articles, your judgment about whether something is "really" the topic is what the AI doesn't have. Don't outsource these calls — your time on edges is the highest-leverage time you'll spend.
The five recommendations above are the patterns we already know help. The built-in strategies — categories, WikiProjects, list pages, Wikidata, search — cover a lot, but they aren't the whole space. If you (or the AI) have other ideas for where articles might live — an authoritative reference list, a SPARQL query, the references section of a few key articles, a topic-specific infobox template — say so. The tools are general-purpose enough that most approaches can be worked in, and a domain expert's instincts often beat the standard menu.
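The compare-strategies and spot-check recommendations can also be tried offline against exported title lists. The following is a minimal sketch, not part of Topic Builder itself: the function names `compare_pulls` and `missing_expected` are hypothetical, and the sample titles are made up for illustration.

```python
def compare_pulls(pull_a, pull_b):
    """Split two collections of article titles into shared and one-sided sets."""
    a, b = set(pull_a), set(pull_b)
    return {
        "both": a & b,     # found by both strategies
        "only_a": a - b,   # candidates for noise in A, or gaps in B
        "only_b": b - a,   # candidates for noise in B, or gaps in A
    }


def missing_expected(expected, corpus):
    """Return the expected titles that are absent from the corpus, in order."""
    corpus_set = set(corpus)
    return [title for title in expected if title not in corpus_set]


# Hypothetical sample data standing in for two strategy pulls.
wikiproject_pull = ["Human trafficking", "Debt bondage", "Palermo protocols"]
category_pull = ["Human trafficking", "Debt bondage", "Forced labour", "List of slaves"]

diff = compare_pulls(wikiproject_pull, category_pull)
print(sorted(diff["only_a"]))  # ['Palermo protocols']
print(sorted(diff["only_b"]))  # ['Forced labour', 'List of slaves']

# Spot-check: name a few titles you expect and see which are missing.
combined = wikiproject_pull + category_pull
print(missing_expected(["Debt bondage", "Child labour"], combined))  # ['Child labour']
```

The one-sided sets still need human judgment: the code only shows where two pulls disagree, not which side is right.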
Sign in
Topic Builder uses your Wikimedia account to scope topics to you. Sign in once per device, paste the resulting token to your AI, and the rest of the conversation is yours.
The AI calls authenticate to bind your identity to the session, and may offer to remember the token for future chats.
Topics are private by default: only you can see or modify them. To share, ask the AI to set the topic's visibility to public_read (anyone reads, only you edit) or public_edit (any signed-in user reads and edits). If a token is ever leaked, ask the AI to call revoke_my_token and get a fresh one at /oauth/login.
Once signed in, visit /topics to see every topic you own and download a fresh simple or enriched CSV for any of them on demand.
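The visibility levels described in this section can be read as a small permission table. The sketch below is one way to model it and is not the server's actual code: the `Visibility` enum, the `private` value name, and the `can_read` / `can_edit` helpers are all assumptions made for illustration. It also assumes that public_edit requires sign-in for reading as well as editing, which is a literal reading of the wording in this section.

```python
from enum import Enum


class Visibility(Enum):
    PRIVATE = "private"          # assumed name for the default level
    PUBLIC_READ = "public_read"  # anyone reads, only the owner edits
    PUBLIC_EDIT = "public_edit"  # any signed-in user reads and edits


def can_read(visibility, owner, user):
    """user is a username, or None for a visitor who is not signed in."""
    if user is not None and user == owner:
        return True
    if visibility is Visibility.PUBLIC_READ:
        return True
    if visibility is Visibility.PUBLIC_EDIT:
        return user is not None
    return False


def can_edit(visibility, owner, user):
    """Only the owner edits, unless the topic is public_edit."""
    if user is not None and user == owner:
        return True
    return visibility is Visibility.PUBLIC_EDIT and user is not None


# Hypothetical usernames.
print(can_read(Visibility.PRIVATE, "alice", "bob"))      # False
print(can_read(Visibility.PUBLIC_READ, "alice", None))   # True
print(can_edit(Visibility.PUBLIC_READ, "alice", "bob"))  # False
print(can_edit(Visibility.PUBLIC_EDIT, "alice", "bob"))  # True
```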
Connect to Claude
Detailed setup for Claude on the web. Choose the path that matches your account.
Individual Claude account (Personal / Pro)
"I want to build a list of all Wikipedia articles about human trafficking. Let's start with reconnaissance."
Organization account — admin setup Admin
As an org admin, you add the connector once and it becomes available to all members.
Organization account — using an enabled connector Member
Your admin has already added the Wikipedia Topic Builder connector. You just need to enable it in a chat.
"I want to build a list of all Wikipedia articles about human trafficking. Let's start with reconnaissance."
Connect to ChatGPT
ChatGPT supports remote MCP servers as custom connectors. Availability and menu names vary by plan (Plus / Pro / Business / Enterprise / Edu) and have shifted as OpenAI has rolled out MCP support — if any step below doesn't match your UI, check OpenAI's MCP documentation.
ChatGPT custom connector
"I want to build a list of all Wikipedia articles about human trafficking. Let's start with reconnaissance."
Tool-calling quality depends on the model you pick. For long multi-step topic builds, use a reasoning-capable model.
Other MCP clients
Any MCP-capable client (Cursor, Zed, Continue, custom agents built on the protocol) can connect to the URL above. Consult your client's documentation for how to add a remote MCP server with no connector-level authentication — user sign-in happens in chat via the steps under Sign in above.
Available tools
The AI drives these directly — you don't need to call them by hand.
Output format
By default, export_csv emits a single-column CSV — one article title per row, no header, plain UTF-8 — ready to feed into the Wiki Education Impact Visualizer, which reads exactly that shape.
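If you want to inspect the default export yourself, a few lines of Python are enough. This is a minimal sketch assuming the shape described above (one title per row, no header, UTF-8); `read_titles` is a hypothetical helper name and the sample rows are made up. It uses the `csv` module rather than splitting on newlines because a title containing a comma would be quoted in the file.

```python
import csv
import io


def read_titles(text_file):
    """Return the first column of each non-empty row as a list of titles."""
    return [row[0] for row in csv.reader(text_file) if row and row[0].strip()]


# In-memory stand-in for a downloaded file. For a real file, use:
#   open(path, newline="", encoding="utf-8")
sample = io.StringIO(
    "Human trafficking\n"
    '"Example title, with a comma"\n'
    "Debt bondage\n"
)

titles = read_titles(sample)
print(len(titles))  # 3
print(titles[1])    # Example title, with a comma
```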
Pass enriched=True for a richer six-column variant with a header row: title, wikidata_qid, description, score, source_labels (pipe-separated), first_added_at. The file is UTF-8 with a BOM so that Excel detects the encoding. It is useful for manual review, downstream tooling, or future Impact Visualizer filtering on topic centrality.
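The enriched variant needs two small adjustments when read in code: decode with `utf-8-sig` so the BOM does not end up glued to the first header name, and split source_labels on the pipe character. The sketch below assumes the six column names listed above; `read_enriched` is a hypothetical helper, and every value in the sample row (QID, score, labels, timestamp) is a placeholder, not real export data. The score and first_added_at fields are left as strings because their exact formats are not specified here.

```python
import csv
import io


def read_enriched(binary_file):
    """Parse an enriched export into a list of dicts, one per article."""
    # utf-8-sig strips the leading BOM if present.
    text = io.TextIOWrapper(binary_file, encoding="utf-8-sig", newline="")
    rows = []
    for row in csv.DictReader(text):
        labels = row["source_labels"]
        row["source_labels"] = labels.split("|") if labels else []
        rows.append(row)
    return rows


# In-memory stand-in for a downloaded file. For a real file, use:
#   open(path, "rb")
raw = (
    "\ufeff"  # BOM, as described for the enriched export
    "title,wikidata_qid,description,score,source_labels,first_added_at\r\n"
    'Human trafficking,Q0,"Placeholder description, with a comma",'
    "9,category|search,placeholder-timestamp\r\n"
).encode("utf-8")

rows = read_enriched(io.BytesIO(raw))
print(rows[0]["title"])          # Human trafficking
print(rows[0]["source_labels"])  # ['category', 'search']
```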
For an end-to-end handoff that skips the CSV download, use publish_topic instead of export_csv. It mints an https://impact-visualizer.wmcloud.org/imports/<handle> URL for the user to click; Impact Visualizer then fetches the snapshot server-side from https://topic-builder.wikiedu.org/packages/<handle> and creates the Topic and ArticleBag in one transaction. Centrality scores are carried along per article. The snapshot is frozen at publish time, so re-publish to refresh it.