Science May 10, 2025

AI’s Hallucinations Are Intensifying—and They’re Here to Stay


Errors Tend to Occur with AI-Generated Content

Paul Taylor/Getty Images

AI chatbots from tech giants like OpenAI and Google have received several reasoning upgrades in recent months. Ideally, these upgrades would lead to more reliable answers, but recent tests indicate that the newer models sometimes perform worse than their predecessors. Errors called “hallucinations” have been a persistent problem that developers have struggled to eliminate.

Hallucination is a broad term for certain kinds of errors made by the large language models (LLMs) that power systems like OpenAI’s ChatGPT and Google’s Gemini. It primarily refers to instances where these models present false information as fact, but it can also describe a generated answer that is accurate yet irrelevant to the question posed.

A technical report from OpenAI evaluating its latest LLMs revealed that the o3 and o4-mini models, released in April, exhibit significantly higher hallucination rates than the earlier o1 model introduced in late 2024. For instance, when summarizing publicly available facts about people, o3 hallucinated 33% of the time and o4-mini 48% of the time, whereas the o1 model had a rate of only 16%.

This issue is not exclusive to OpenAI. A popular leaderboard that assesses hallucination rates shows that some reasoning models from other companies, including the DeepSeek-R1 model, have higher hallucination rates than their developers’ previous models. Reasoning models of this kind work through several reasoning steps before reaching a conclusion.

An OpenAI spokesperson stated, “Hallucinations are not inherently more prevalent in reasoning models, though we are actively working to reduce the higher rates of hallucination we saw in o3 and o4-mini. We will continue our research on hallucinations across all models to improve accuracy and reliability.”

Some potential applications of LLMs can be significantly impeded by hallucinations. A model that frequently produces misinformation is unsuitable as a research assistant, and a legal bot that cites fictitious cases could get lawyers into trouble. A customer service agent that presents obsolete policies as current can also create significant problems for a business.

Initially, AI companies believed they would resolve these issues over time. Historically, models had shown reduced hallucinations with each update, yet the recent spikes in hallucination rates complicate this narrative.

Vectara’s leaderboard ranks models based on their factual consistency in summarizing documents. It indicates that, at least for systems from OpenAI and Google, “hallucination rates are roughly comparable for reasoning and non-reasoning models,” as noted by Forrest Sheng Bao from Vectara. Google has not provided further comment. For leaderboard assessments, the specific hallucination rates are less significant than each model’s overall ranking, according to Bao.

However, these rankings may not be the best way to compare AI models. For one, they conflate different types of hallucinations. The Vectara team pointed out that although the DeepSeek-R1 model had a 14.3% hallucination rate, many of these hallucinations were “benign”: answers supported by logical reasoning but not present in the original text the model was asked to summarize.
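To make the point about conflated categories concrete, here is a minimal sketch of how a summary-based hallucination rate could be computed, and how much the figure changes depending on whether “benign” cases are counted. The labels, the `SummaryJudgment` type, and the `hallucination_rate` function are all hypothetical illustrations. They are not Vectara’s method, and the numbers are invented.

```python
from dataclasses import dataclass


@dataclass
class SummaryJudgment:
    """One summary, as labeled by a hypothetical judge (human or automated)."""
    unsupported: bool  # summary contains a claim not found in the source text
    benign: bool       # the unsupported claim is still reasonable or true


def hallucination_rate(judgments, count_benign=True):
    """Return the fraction of summaries counted as hallucinated."""
    if not judgments:
        raise ValueError("need at least one judgment")
    flagged = sum(
        1 for j in judgments
        if j.unsupported and (count_benign or not j.benign)
    )
    return flagged / len(judgments)


# Invented labels for 10 summaries: 3 are unsupported, 2 of those are benign.
judgments = (
    [SummaryJudgment(unsupported=True, benign=True)] * 2
    + [SummaryJudgment(unsupported=True, benign=False)]
    + [SummaryJudgment(unsupported=False, benign=False)] * 7
)

print(hallucination_rate(judgments))                      # 0.3
print(hallucination_rate(judgments, count_benign=False))  # 0.1
```

In this toy data the headline rate is three times higher when benign additions are counted, which is why a single percentage can be hard to interpret on its own.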

Another issue with these rankings is that tests based on text summaries “reveal nothing about the percentage of incorrect output” when LLMs are used for other tasks, according to Emily Bender at the University of Washington. She suggests that leaderboard results are not the best way to evaluate this technology, particularly since LLMs are not designed solely for text summarization.

These models work by repeatedly answering the question, “What is a likely next word?” to formulate responses, so they do not process information in the usual sense of trying to understand what a body of text says. Nevertheless, many technology companies continue to use the term “hallucination” to describe output errors.
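The “next word” loop can be illustrated with a toy sketch. This is a deliberately simplified stand-in: real LLMs use neural networks over sub-word tokens rather than word-pair counts, and the corpus and the function names `train_bigrams` and `generate` are made up for this example. The generation loop has the same shape, though: pick a likely next word, append it, and repeat.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers


def generate(followers, start, max_words=8):
    """Repeatedly append the most frequent next word (greedy choice)."""
    output = [start]
    for _ in range(max_words - 1):
        options = followers.get(output[-1])
        if not options:
            break  # no known continuation for the last word
        output.append(options.most_common(1)[0][0])
    return " ".join(output)


# Invented training text for illustration only.
corpus = "the court cited the case and the court dismissed the appeal"
model = train_bigrams(corpus)
print(generate(model, "the", max_words=5))  # the court cited the court
```

The output is fluent, yet “the court cited the court” never appears in the training text and asserts something it never said. Nothing in the loop checks whether a statement is true. It only follows which word tends to come next.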

“The term ‘hallucination’ is doubly problematic,” says Bender. “On one hand, it implies that false output is abnormal and could potentially be mitigated, while on the other hand, it inaccurately anthropomorphizes the machine since large language models lack awareness.”

Arvind Narayanan from Princeton University argues that the issue extends beyond hallucinations. Models can also produce errors by utilizing unreliable sources or outdated information. Merely increasing training data and computational power may not rectify the problems.

We may have to accept the reality of error-prone AI, as Narayanan said in a recent social media post. In some circumstances, it may be prudent to use such models only for tasks where fact-checking the AI’s answer is still faster than doing the research yourself. The best approach might be to avoid relying on AI chatbots for factual information altogether.

Source: www.newscientist.com
