The artificial intelligence (AI) systems that drive today’s technologies, from chatbots to search engines, rely heavily on a single source: Wikipedia. With over 7 million articles in English and licensing that permits free reuse, the platform is a goldmine of high-quality training data.
But will the online encyclopedia adopt AI technology itself? On the BBC Science Focus Instant Genius podcast, Wikipedia founder Jimmy Wales expressed optimism about using AI for editing and error detection, though he remains uncertain about its role in drafting complete articles.
“What excites me about AI is its potential to aid the Wikipedia community,” Wales remarked.
Wales elaborated on various methods he’s exploring, mentioning a tool designed to analyze brief Wikipedia entries and their sources to pinpoint missing information and unsupported claims. “I’ve found that I’m quite adept at it,” he noted.
He also emphasized that this experimentation is not limited to his own efforts. The Wikimedia Foundation, which operates Wikipedia, has a dedicated machine learning team developing AI tools for the Wikipedia community.
“Many individuals are engaged in maintaining Wikipedia,” Wales stated. “[These tools] represent an exciting initiative that enhances quality.”
When queried about the prospect of AI drafting Wikipedia entries soon, Wales was skeptical.
“I’m not ruling it out completely, but it seems unlikely in the short term. From a Wikipedia perspective, the current models still fall short.”
One area where Wikipedia’s founder sees potential for AI is in mitigating bias within the encyclopedia itself. For instance, research indicates that 20 percent of biographies on Wikipedia are about women, and these entries often focus more on family, relationships, or appearance.
In light of these statistics, Wales proposed, “It’s feasible to envision AI continuously scanning Wikipedia for certain types of bias and alerting us to areas we should focus on.”
However, he also raised concerns about biases present in large language models (LLMs), as many are trained extensively on data from Wikipedia: “Model trainers must be vigilant about this issue, reflecting deeply on it.”

Despite these concerns, Wales insists that few online spaces rival Wikipedia for quality training data.
“Fortunately, we don’t have an AI model trained exclusively on Twitter. That would result in a rather peculiar and hostile model,” he remarked.
“It’s crucial to have training materials that are factual, well-considered, and thoughtful.”
He summed up, saying, “Broadly speaking, the more fact-driven and extensive the language models we have, the better it is.”
Jimmy Wales’ new book, The Seven Rules of Trust, is now available for purchase.
Source: www.sciencefocus.com












