{"id":1261,"date":"2023-09-15T20:20:59","date_gmt":"2023-09-15T20:20:59","guid":{"rendered":"https:\/\/enterprise.wikimedia.com\/?p=1261"},"modified":"2026-06-01T16:41:21","modified_gmt":"2026-06-01T16:41:21","slug":"structured-contents-wikipedia-infobox","status":"publish","type":"post","link":"https:\/\/enterprise.wikimedia.com\/blog\/structured-contents-wikipedia-infobox\/","title":{"rendered":"Wikipedia API Parsed Infobox. Introducing Structured Contents"},"content":{"rendered":"\n<p>We are excited to announce the addition of a new <a href=\"https:\/\/enterprise.wikimedia.com\/api\/structured-contents\/\" data-type=\"page\" data-id=\"2070\">Structured Contents initiative<\/a> and endpoint to our On-demand REST API. We&#8217;ve heard all your requests for a more machine-readable API for Wikimedia data, and this is the next leap forward on that path. The response data from this new beta endpoint is similar in structure to the current <code>\/v2\/articles<\/code> endpoint, but includes a <strong>fully parsed Wikipedia infobox<\/strong>! 
(With more to come soon.)<\/p>\n\n\n\n<p>Here&#8217;s what&#8217;s covered in this article:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"#what-is-the-wikipedia-infobox\">What is the Wikipedia Infobox?<\/a><\/li>\n\n\n\n<li><a href=\"#structured-data-not-blobs\">More Structured Data, not blobs<\/a><\/li>\n\n\n\n<li><a href=\"#get-wikipedia-infobox-api\">How to get Wikipedia Infobox content<\/a><\/li>\n\n\n\n<li><a href=\"#roadmap\">Roadmap: Then, Now, and What\u2019s Next<\/a><\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading has-x-large-font-size\" id=\"what-is-the-wikipedia-infobox\" style=\"font-style:normal;font-weight:400\">What is the Wikipedia Infobox?<\/h2>\n\n\n\n<div class=\"wp-block-columns is-layout-flex wp-container-core-columns-is-layout-28f84493 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:65%\">\n<p>The infobox is a panel that commonly appears in the top right of a Wikipedia article. It summarizes key facts and statistics that appear in the article. Depending on the article subject you may see key dates, biographical information, important images, scientific data, and more. 
The infobox is a desired resource for data reusers because Wikipedia editors follow strict guidelines and work hard to keep the infobox populated with the article\u2019s most pertinent and current metadata.<\/p>\n\n\n\n<p class=\"has-small-font-size\"><em>The examples here are from the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Josephine_Baker\">Josephine Baker page<\/a>.<\/em><\/p>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-container-core-column-is-layout-f8561097 wp-block-column-is-layout-flow\" style=\"padding-top:0;padding-right:0;padding-bottom:0;padding-left:0;flex-basis:35%\">\n<figure class=\"wp-block-image size-full has-custom-border\"><img loading=\"lazy\" decoding=\"async\" width=\"500\" height=\"578\" src=\"https:\/\/enterprise.wikimedia.com\/uploads\/2023\/09\/what-is-infobox-short.jpg\" alt=\"Screenshot of a Wikipedia page with the right sidebar, known as the infobox, highlighted.\" class=\"wp-image-1279\" style=\"border-radius:4px\" srcset=\"https:\/\/enterprise.wikimedia.com\/uploads\/2023\/09\/what-is-infobox-short.jpg 500w, https:\/\/enterprise.wikimedia.com\/uploads\/2023\/09\/what-is-infobox-short-260x300.jpg 260w\" sizes=\"auto, (max-width: 500px) 100vw, 500px\" \/><\/figure>\n<\/div>\n<\/div>\n\n\n\n<h2 class=\"wp-block-heading has-x-large-font-size\" id=\"structured-data-not-blobs\" style=\"font-style:normal;font-weight:400\">More Structured Data, not blobs<\/h2>\n\n\n\n<p>In our <a href=\"https:\/\/enterprise.wikimedia.com\/blog\/new-api-features-april-2023\/\" data-type=\"post\" data-id=\"1089\">spring update<\/a> we covered architectural upgrades and a handful of new features including a new field called <em>abstract<\/em> which provides a <a href=\"https:\/\/enterprise.wikimedia.com\/docs\/data-dictionary\/#abstract\">summary of the article content<\/a>. 
We have also recently introduced an object that adds the <a href=\"https:\/\/enterprise.wikimedia.com\/docs\/data-dictionary\/#image\">main <em>image<\/em><\/a> of the article to the articles payload.<\/p>\n\n\n\n<div class=\"wp-block-columns is-style-default is-layout-flex wp-container-core-columns-is-layout-9fa5bd33 wp-block-columns-is-layout-flex\" style=\"padding-right:0;padding-left:0\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<pre title=\"Example: abstract summary field \" class=\"wp-block-code has-tiny-font-size\"><code lang=\"json\" class=\"language-json\">\"abstract\": \"Freda Josephine Baker, naturalised as Jos\u00e9phine Baker, was an American-born French dancer, singer and actress. Her career was centered primarily in Europe, mostly in her adopted France. She was the first black woman to star in a major motion picture, the 1927 silent film Siren of the Tropics...\"<\/code><\/pre>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<pre title=\"Example: image object\" class=\"wp-block-code has-tiny-font-size\"><code lang=\"json\" class=\"language-json\">\"image\": {\n   \"content_url\": \"https:\/\/upload.wikimedia.org\/wikipedia\/commons\/0\/0b\/Baker_Harcourt_1940_2.jpg\",\n   \"width\": 540,\n   \"height\": 756\n}<\/code><\/pre>\n<\/div>\n<\/div>\n\n\n\n<p>These existing features were the first of a broader, ongoing endeavor to improve the machine readability of Wikimedia data. 
The new Structured Contents endpoint continues this evolution of Wikimedia Enterprise APIs as we actively work to expose Wikimedia project wikitext\/HTML blobs as structured JSON.<\/p>\n\n\n\n<h3 class=\"wp-block-heading has-large-font-size\" id=\"parsed-wikipedia-infobox-json\" style=\"font-style:normal;font-weight:400\">Structured Contents Parses Wikipedia Infobox into JSON<\/h3>\n\n\n\n<div class=\"wp-block-columns are-vertically-aligned-top is-layout-flex wp-container-core-columns-is-layout-7fc3d43a wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-vertically-aligned-top is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:66.66%\">\n<p>Until now, if you wanted any data located within infoboxes, you\u2019d have to parse a very large project dump file, find the article you wanted, parse through the entire wikitext markup blob, and then grab the data needed. Extracting clean data this way is doable, but difficult and time-consuming.<\/p>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-vertically-aligned-top is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:33.33%\">\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>&#8220;Is there an API for the Wikipedia infobox?&#8221;<\/p>\n<cite> Everybody<\/cite><\/blockquote>\n<\/div>\n<\/div>\n\n\n\n<p>What if you could hit one endpoint and get exactly the data you need in standard formatted JSON? 
Well, it\u2019s here!<\/p>\n\n\n\n<p><strong>The Structured Contents beta endpoint now returns Wikipedia article infobox data as parsed JSON!<\/strong><\/p>\n\n\n\n<div style=\"height:1px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<figure class=\"wp-block-image size-full has-custom-border is-style-default\"><img loading=\"lazy\" decoding=\"async\" width=\"1000\" height=\"637\" src=\"https:\/\/enterprise.wikimedia.com\/uploads\/2023\/09\/wikitext-blob-vs-json-infobox.png\" alt=\"Wikitext blob text on left versus parsed infobox from Structured Contents endpoint.\" class=\"wp-image-1292\" style=\"border-radius:8px\" srcset=\"https:\/\/enterprise.wikimedia.com\/uploads\/2023\/09\/wikitext-blob-vs-json-infobox.png 1000w, https:\/\/enterprise.wikimedia.com\/uploads\/2023\/09\/wikitext-blob-vs-json-infobox-300x191.png 300w, https:\/\/enterprise.wikimedia.com\/uploads\/2023\/09\/wikitext-blob-vs-json-infobox-768x489.png 768w\" sizes=\"auto, (max-width: 1000px) 100vw, 1000px\" \/><figcaption class=\"wp-element-caption\">Wikitext blob on left versus Structured Contents beta parsed JSON on right<\/figcaption><\/figure>\n\n\n\n<p>Structured Contents is part of our On-demand API, so you can use request filtering to grab the data you need and bypass the rest of the payload. 
This means you can send a request for just the article&#8217;s URL, summary (abstract), infoboxes, and images; you choose which fields are returned.<\/p>\n\n\n\n<h2 class=\"wp-block-heading has-x-large-font-size\" id=\"get-wikipedia-infobox-api\" style=\"font-style:normal;font-weight:400\">How to get Wikipedia Infobox content<\/h2>\n\n\n\n<ol class=\"wp-block-list\">\n<li><a href=\"https:\/\/enterprise.wikimedia.com\/signup\/\">Sign up for a free Wikimedia Enterprise account<\/a><\/li>\n\n\n\n<li>Get your API keys from the <a href=\"https:\/\/enterprise.wikimedia.com\/docs\/authentication\/\">Authentication endpoint<\/a><\/li>\n\n\n\n<li>Make a request to the <a href=\"https:\/\/enterprise.wikimedia.com\/docs\/on-demand\/#structured-contents-article-lookup-beta\" data-type=\"page\" data-id=\"210\">On-demand Structured Contents beta endpoint<\/a><\/li>\n\n\n\n<li>Enjoy your structured article content with parsed infobox JSON<\/li>\n<\/ol>\n\n\n\n<p>If you don\u2019t already have an account, <a href=\"https:\/\/enterprise.wikimedia.com\/signup\/\">sign up<\/a>. It&#8217;s free, and in a few minutes you can start making requests. 
<a href=\"https:\/\/enterprise.wikimedia.com\/contact\/\">Let us know<\/a> if you need more.<\/p>\n\n\n\n<div class=\"wp-block-buttons is-content-justification-center is-layout-flex wp-container-core-buttons-is-layout-a89b3969 wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button\"><a class=\"wp-block-button__link has-secondary-background-color has-background wp-element-button\" href=\"\/signup\/\">Get your API keys and Go!<\/a><\/div>\n<\/div>\n\n\n\n<p>The <a href=\"https:\/\/enterprise.wikimedia.com\/docs\/on-demand\/\" data-type=\"page\" data-id=\"210\">Developer Docs for the On-demand API<\/a> have been updated to include the new Structured Contents endpoint, and a Data Dictionary entry for the <a href=\"https:\/\/enterprise.wikimedia.com\/docs\/data-dictionary\/#infoboxes\" data-type=\"page\" data-id=\"199\">infoboxes schema<\/a> has been added too.<\/p>\n\n\n\n<p>We have also released <a href=\"\/docs\/#sdk\">SDKs, written in Go and Python, on GitHub<\/a> to help you get started with any of our APIs. 
Working examples of how to query the Structured Contents endpoint using our <a href=\"https:\/\/github.com\/wikimedia-enterprise\/wme-sdk-go\/blob\/main\/example\/structured-contents\/main.go\" target=\"_blank\" rel=\"noreferrer noopener\">Go SDK<\/a> and <a href=\"https:\/\/github.com\/wikimedia-enterprise\/wme-sdk-python\/blob\/main\/example\/structured-contents\/structuredcontents.py\" target=\"_blank\" rel=\"noreferrer noopener\">Python SDK<\/a> are included as well.<\/p>\n\n\n\n<p>Wikimedia volunteers can log in to <a rel=\"noreferrer noopener\" href=\"https:\/\/wikitech.wikimedia.org\/wiki\/Help:Cloud_Services_introduction\" target=\"_blank\">Wikimedia Cloud Services<\/a> to get access.<\/p>\n\n\n\n<h2 class=\"wp-block-heading has-x-large-font-size\" id=\"roadmap\" style=\"font-style:normal;font-weight:400\">Roadmap: Then, Now, and What\u2019s Next<\/h2>\n\n\n\n<p>When <a href=\"https:\/\/enterprise.wikimedia.com\/blog\/hello-world\/\" data-type=\"post\" data-id=\"878\">we launched Wikimedia Enterprise<\/a>, our mission was clear: to create a modern, machine-readable API that could meet the reliability and scalability demands of high-volume reusers of Wikimedia project data. We&#8217;re just getting started.<\/p>\n\n\n\n<p>The <a href=\"https:\/\/enterprise.wikimedia.com\/api\/structured-contents\/\" data-type=\"page\" data-id=\"2070\">Structured Contents<\/a> endpoint marks our inaugural beta release. Releasing in beta makes our development process more transparent and lets us deliver improvements to reusers more often.<\/p>\n\n\n\n<p><strong>We&#8217;re actively seeking your feedback<\/strong> and would love to hear about your experience with it, including your thoughts, requirements, and critiques. Please don&#8217;t hesitate to reach out to us. 
All account holders can use the &#8220;Contact Support&#8221; form in their account dashboard to share input with our team.<\/p>\n\n\n\n<p>Thank you for reading!<\/p>\n\n\n\n<div class=\"wp-block-buttons is-content-justification-center is-layout-flex wp-container-core-buttons-is-layout-a89b3969 wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button\"><a class=\"wp-block-button__link has-secondary-background-color has-background wp-element-button\" href=\"\/signup\/\">Sign up for free to get started!<\/a><\/div>\n<\/div>\n\n\n\n<p><strong>Update:<\/strong> <a href=\"https:\/\/enterprise.wikimedia.com\/blog\/structured-contents-snapshot-api\/\" data-type=\"post\" data-id=\"1866\">Structured Contents snapshots now available<\/a> too!<\/p>\n\n\n\n<p><strong>Update:<\/strong> <a href=\"https:\/\/enterprise.wikimedia.com\/blog\/article-sections-and-description\/\" data-type=\"post\" data-id=\"1488\">APIs now include parsed Article sections and short description<\/a>!<\/p>\n\n\n\n<div class=\"wp-block-group is-layout-constrained wp-block-group-is-layout-constrained\">\n<div style=\"height:30px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p class=\"has-small-font-size\">\u2014 <a href=\"https:\/\/meta.wikimedia.org\/wiki\/User:CReynolds_(WMF)\" target=\"_blank\" rel=\"noreferrer noopener\">Chuck Reynolds<\/a>, Staff Product Manager, Growth<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>We&#8217;ve heard all your requests for a more machine-readable API for Wikimedia data. We are announcing a new Structured Contents endpoint with the fully parsed contents of Wikipedia article Infoboxes in JSON! Jump into the article to read about it and get started. 
<\/p>\n","protected":false},"author":2,"featured_media":1301,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[7],"tags":[12,18],"class_list":["post-1261","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-releases","tag-ondemand","tag-structured-contents"],"_links":{"self":[{"href":"https:\/\/enterprise.wikimedia.com\/wp-json\/wp\/v2\/posts\/1261","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/enterprise.wikimedia.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/enterprise.wikimedia.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/enterprise.wikimedia.com\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/enterprise.wikimedia.com\/wp-json\/wp\/v2\/comments?post=1261"}],"version-history":[{"count":10,"href":"https:\/\/enterprise.wikimedia.com\/wp-json\/wp\/v2\/posts\/1261\/revisions"}],"predecessor-version":[{"id":3418,"href":"https:\/\/enterprise.wikimedia.com\/wp-json\/wp\/v2\/posts\/1261\/revisions\/3418"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/enterprise.wikimedia.com\/wp-json\/wp\/v2\/media\/1301"}],"wp:attachment":[{"href":"https:\/\/enterprise.wikimedia.com\/wp-json\/wp\/v2\/media?parent=1261"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/enterprise.wikimedia.com\/wp-json\/wp\/v2\/categories?post=1261"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/enterprise.wikimedia.com\/wp-json\/wp\/v2\/tags?post=1261"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}