Could machines surpass human intelligence?
Image credit: chan2545/iStockphoto/Getty Images
To hear the leaders of AI companies tell it, the coming decade will transform human history: we are entering an era of “radical abundance”, an optimistic vision in which breakthroughs in high-energy physics arrive and space colonization begins. Yet researchers working with today’s most powerful AI systems tell a different story. In practice, even the top-performing models struggle with basic tasks that most people find easy. So who should we believe?
According to Sam Altman, head of OpenAI, and Demis Hassabis, head of Google DeepMind, a transformative AI system is on the horizon. In a recent blog post, Altman predicted that the 2030s will be wildly different from any decade before, suggesting we might go from major breakthroughs in materials science one year to true high-bandwidth brain-computer interfaces the next.
In an interview with Wired, Hassabis likewise projected a momentous decade ahead, claiming that artificial general intelligence (AGI) will begin to crack major challenges such as curing serious diseases, leading to healthier and longer lives. If all of this transpires, he says, it should bring an era of maximum human flourishing in which humanity travels to the stars.
This optimistic outlook relies heavily on the assumption that large language models (LLMs) such as ChatGPT will keep improving as they are given more data and computing power. This “scaling” approach has held up in recent years, but there are signs it is beginning to falter. OpenAI’s GPT-4.5 model, for example, showed only modest gains over its predecessor, GPT-4, despite likely costing hundreds of millions of dollars to train. And that expense is dwarfed by spending still to come, with Meta reportedly set to announce a $15 billion investment in its attempt to achieve “superintelligence”.
Money isn’t the only way the industry is tackling these problems, though. AI companies have also turned to “reasoning” models, such as OpenAI’s o1, released last year. These models use more computing time and take longer to produce a response, feeding their own output back into themselves in an iterative process meant to mimic human “thinking”. Noam Brown at OpenAI, who had previously warned about the limits of AI progress, said last year that o1 and models like it showed this kind of scaling could continue.
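To make that idea concrete, the sketch below is a minimal, hypothetical illustration of how a reasoning model spends extra computation at answer time: it repeatedly generates intermediate text and feeds it back into its own context before committing to a final answer. The `call_llm` function is a stand-in for any text-generation API, not OpenAI’s or DeepSeek’s actual implementation.

```python
# Hypothetical sketch of test-time "reasoning": intermediate output is fed
# back into the model's context before a final answer is produced.
# `call_llm` is a placeholder, not a real API from any provider.

def call_llm(prompt: str) -> str:
    """Stand-in for a call to a hosted language model."""
    return "(model output would appear here)"

def answer_with_reasoning(question: str, rounds: int = 3) -> str:
    context = question
    for _ in range(rounds):
        # Ask the model to extend its own working-out, then append that
        # output so the next round can read and build on it.
        step = call_llm(context + "\n\nContinue thinking step by step.")
        context += "\n" + step
    # Only after the iterative "thinking" does the model commit to an answer.
    return call_llm(context + "\n\nNow give only the final answer.")

print(answer_with_reasoning("How many moves does a 3-disc Tower of Hanoi take?"))
```

The extra rounds are why these models consume more tokens and take longer to respond than standard LLMs.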
Recent research suggests, however, that these reasoning models can stumble even on simple logic puzzles. A study by Apple researchers found that models including DeepSeek’s reasoning model and Anthropic’s Claude “thinking” model faltered on basic tasks. The researchers reported that the models showed limitations in exact computation, failing to apply explicit algorithms and reasoning inconsistently across puzzles.
The researchers tested the models on several puzzles, including one in which a person must transport items in the fewest possible moves, and the Tower of Hanoi, in which discs must be moved between pegs one at a time without ever placing a larger disc on a smaller one. The models could handle the simplest versions, but their performance fell apart as complexity increased. The study also found that for more complex problems, where one would expect the models to spend longer “thinking”, they actually used fewer “tokens” (the chunks of data that models process), suggesting the “thinking” time they display may be illusory.
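For reference, the Tower of Hanoi is exactly the kind of task an explicit algorithm handles easily. The classic recursive solution below is a plain Python sketch, not code from the Apple study; it solves any number of discs, and its move count of 2^n − 1 shows how quickly the puzzle ramps up in complexity as discs are added.

```python
# Classic recursive Tower of Hanoi solver, illustrating the kind of explicit
# algorithm the Apple researchers say the models fail to apply.
# (Illustrative sketch only; not taken from the study.)

def hanoi(n, source="A", target="C", spare="B", moves=None):
    """Return the list of moves that transfers n discs from source to target."""
    if moves is None:
        moves = []
    if n == 0:
        return moves
    hanoi(n - 1, source, spare, target, moves)   # park n-1 discs on the spare peg
    moves.append((source, target))               # move the largest remaining disc
    hanoi(n - 1, spare, target, source, moves)   # stack the n-1 discs back on top
    return moves

if __name__ == "__main__":
    for n in (3, 7, 10):
        print(n, "discs ->", len(hanoi(n)), "moves")  # always 2**n - 1
```

Even a 10-disc version requires 1023 moves, illustrating how fast these puzzles outgrow shallow pattern matching.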
“It’s concerning, because these are problems that can be easily solved,” says Artur Garcez at City, University of London. “We mastered symbolic AI reasoning techniques for these tasks 50 years ago.” The new systems may eventually be fixed and improved enough to work through complex problems, but this research suggests it is unlikely to happen simply by increasing the size of the models or the computing power given to them, says Garcez.
The new results also highlight these models’ persistent difficulty with scenarios they haven’t encountered in their training data, says Nikos Aletras at the University of Sheffield. “In general, they perform well at information retrieval, summarization and related tasks because that is what they were trained on, so they can come across as impressive without being truly adaptive,” says Aletras. “Apple’s research has highlighted a significant blind spot.”
Other research also indicates that extending “thinking” time can hurt an AI model’s performance. Soumya Suvra Ghosal and colleagues at the University of Maryland tested DeepSeek’s models and found that longer “chains of thought” reduced accuracy on mathematical reasoning tests. On one mathematical benchmark, tripling the number of tokens a model used improved its performance by around 5%, but using 10 to 15 times as many tokens dropped its score by roughly 17%.
In some cases, the “chain of thought” an AI produces bears little relation to the final answer it gives. When testing DeepSeek’s models on navigating simple mazes, Subbarao Kambhampati at Arizona State University found that even when the AI solved the problem, its “chain of thought” contained mistakes that weren’t reflected in the final solution. What’s more, feeding the AI a meaningless “chain of thought” could sometimes produce more accurate answers.
“Our results challenge the prevailing assumption that intermediate tokens or ‘chains of thought’ provide a meaningful trace of the internal reasoning of AI models,” says Kambhampati.
Taken together, the recent studies suggest that words like “thinking” and “reasoning” are misleading labels for what these AI models do, says Anna Rogers at the IT University of Copenhagen. “Many of the leading techniques I have encountered in this field have been accompanied by vague, cognitively inspired analogies that ultimately proved incorrect.”
Andreas Vlachos at the University of Cambridge points out that LLMs still have clear applications in text generation and other tasks, but says the latest results suggest Altman and Hassabis may struggle to crack the complex problems they expect to solve within the next few years.
“Fundamentally, there is a mismatch between what these models are trained to do, which is predicting the next word, and what we are trying to get them to do, which is to reason,” says Vlachos.
OpenAI disagrees. “Our research shows that reasoning methods like chain of thought can significantly improve performance on complex problems, and we are actively working to extend these capabilities through better training, evaluation and model design,” says a spokesperson. DeepSeek did not respond to a request for comment.
Source: www.newscientist.com