Science May 10, 2025

AI’s Hallucinations Are Intensifying—and They’re Here to Stay

Errors Tend to Occur with AI-Generated Content

Paul Taylor/Getty Images

AI chatbots from tech giants like OpenAI and Google have received several reasoning upgrades in recent months. Ideally, these upgrades would lead to more reliable answers, but recent tests indicate that some newer models perform worse than their predecessors. The errors these chatbots make, known as “hallucinations,” have been a persistent problem that developers have struggled to eliminate.

Hallucination is a broad term for certain kinds of errors made by the large language models (LLMs) that power systems such as OpenAI’s ChatGPT and Google’s Gemini. It primarily refers to instances where these models present false information as fact, but it can also describe an answer that is accurate yet irrelevant to the question posed.

A technical report from OpenAI evaluating its latest LLMs showed that the o3 and o4-mini models, released in April, have significantly higher hallucination rates than the earlier o1 model introduced in late 2024. For instance, when summarizing publicly available facts about people, o3 hallucinated 33% of the time and o4-mini 48% of the time, whereas o1 had a rate of only 16%.

This issue is not exclusive to OpenAI. A popular leaderboard from the company Vectara, which assesses hallucination rates, indicates that some reasoning models from other companies, including the DeepSeek-R1 model, have higher hallucination rates than their developers’ previous models. Models of this type work through several reasoning steps before giving an answer.

An OpenAI spokesperson stated, “Hallucinations are not inherently more prevalent in reasoning models, though we are actively working to reduce the higher rates of hallucination we saw in o3 and o4-mini. We will continue our research on hallucinations across all models to improve accuracy and reliability.”

Hallucinations can severely limit some potential applications of LLMs. A model that frequently produces misinformation is unsuitable as a research assistant, a legal bot that cites fictitious cases could get lawyers into trouble, and a customer service agent that presents obsolete policies as current creates headaches for businesses.

Initially, AI companies expected this problem to be resolved over time. Early models did tend to hallucinate less with each update, but the recent rise in hallucination rates complicates this narrative.

Vectara’s leaderboard ranks models based on their factual consistency in summarizing documents. It indicates that, at least for systems from OpenAI and Google, “hallucination rates are roughly comparable for reasoning and non-reasoning models,” as noted by Forrest Sheng Bao from Vectara. Google has not provided further comment. For the purposes of the leaderboard, the specific hallucination rates matter less than each model’s overall ranking, according to Bao.

However, these rankings may not be the best way to compare AI models. For one, they conflate different types of hallucinations. The Vectara team pointed out that the DeepSeek-R1 model had a 14.3% hallucination rate, but many of these hallucinations were “benign”: answers supported by logical reasoning that do not actually appear in the original text.
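To see why conflating categories matters, here is a minimal Python sketch of how a summarization-based hallucination rate might be tallied. It is only an illustration: the label names, the function and the 20 sample labels are invented, and it does not reproduce Vectara's actual method, which the article does not describe.

```python
from collections import Counter

def hallucination_rates(labels):
    """Return (overall_rate, harmful_only_rate) for per-summary labels.

    Each label is "consistent", "benign" or "harmful". These three
    categories are hypothetical, chosen only for this illustration.
    """
    if not labels:
        raise ValueError("need at least one labelled summary")
    counts = Counter(labels)
    total = len(labels)
    # Overall rate counts every unsupported statement, benign or not.
    overall = (counts["benign"] + counts["harmful"]) / total
    # Stricter rate counts only statements that are actually wrong.
    harmful_only = counts["harmful"] / total
    return overall, harmful_only

# Invented toy data: 20 graded summaries.
sample = ["consistent"] * 16 + ["benign"] * 3 + ["harmful"] * 1
overall, harmful_only = hallucination_rates(sample)
print(f"overall: {overall:.0%}, harmful only: {harmful_only:.0%}")
```

With these made-up labels, the same model looks four times worse when benign additions are counted as hallucinations, which is the kind of ambiguity a single leaderboard figure can hide.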

Another issue with these rankings is that tests based on text summarization “reveal nothing about the percentage of incorrect output” when LLMs are used for other tasks, as stated by Emily Bender at the University of Washington. She suggests that leaderboard results don’t provide a comprehensive evaluation of this technology, particularly since LLMs are not designed solely for text summarization.

These models generate responses by repeatedly answering the question, “What is the next word?”, so they are not processing information in the traditional sense. Nevertheless, many technology companies continue to use the term “hallucination” to describe output errors.
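The “next word” loop can be illustrated with a toy bigram model in Python. This is a deliberately simplified sketch: real LLMs use neural networks over tokens and much longer contexts, and the corpus and function names here are hypothetical.

```python
import random
from collections import defaultdict

def build_bigrams(text):
    """Map each word to the list of words that followed it in the text."""
    words = text.split()
    table = defaultdict(list)
    for cur, nxt in zip(words, words[1:]):
        table[cur].append(nxt)
    return table

def generate(table, start, max_words=8, seed=0):
    """Repeatedly pick a plausible next word until no continuation exists."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    out = [start]
    for _ in range(max_words - 1):
        candidates = table.get(out[-1])
        if not candidates:
            break  # the last word never had a successor in the corpus
        out.append(rng.choice(candidates))
    return " ".join(out)

# Hypothetical miniature "training corpus".
corpus = "the court ruled on the case and the court cited the case"
table = build_bigrams(corpus)
print(generate(table, "the"))
```

Every word pair in the output was seen in the corpus, so the result reads plausibly, yet nothing in the loop checks whether the sentence as a whole is true.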

“The term ‘hallucination’ is doubly problematic,” says Bender. “On one hand, it implies that false output is abnormal and could potentially be mitigated, while on the other hand, it inaccurately anthropomorphizes the machine since large language models lack awareness.”

Arvind Narayanan from Princeton University argues that the issue extends beyond hallucinations. Models can also produce errors by utilizing unreliable sources or outdated information. Merely increasing training data and computational power may not rectify the problems.

We may have to accept the reality of error-prone AI, as Narayanan said in a recent social media post. In some cases, it may be best to use such models only for tasks where fact-checking the answer is still faster than doing the research yourself. The best approach, however, might be to avoid relying on AI chatbots for factual information altogether.

Source: www.newscientist.com
