AI chatbots that have been compromised pose a risk because they can surface dangerous knowledge absorbed from illicit material encountered during their training, researchers have warned.
The warning comes amid a worrying trend of chatbots being “jailbroken” to circumvent their built-in safety controls. Those safeguards are designed to prevent the systems from delivering harmful, biased, or inappropriate responses to users’ questions.
Powerful chatbots built on large language models (LLMs), such as ChatGPT, Gemini, and Claude, are trained on vast amounts of content from the internet.
Even with attempts to filter out harmful content from their training datasets, LLMs can still learn about illegal activities—including hacking, money laundering, insider trading, and bomb-making. Security protocols are intended to prevent the use of such information in their answers.
In a report on the risks, the researchers found that it is surprisingly easy to trick many AI-powered chatbots into producing harmful and illegal content, emphasizing that the threat is “immediate, concrete, and alarming.”
The authors caution that “what was once limited to state actors and organized crime may now be accessible to anyone with a laptop or smartphone.”
The study, conducted by Professor Lior Rokach and Dr. Michael Fire of Ben Gurion University of the Negev in Israel, highlights an escalating threat from “dark LLMs”: models built without safety measures or altered through jailbreaks. Some are openly promoted as having “no ethical guardrails” and as willing to assist with illegal activities such as cybercrime and fraud.
Jailbreaking uses specially crafted prompts to manipulate chatbots into giving responses that are normally prohibited. It works by exploiting the tension between the chatbot’s primary goal of following the user’s instructions and its secondary goal of avoiding harmful, biased, unethical, or illegal outputs. The prompts typically construct scenarios in which the program prioritizes helpfulness over its safety constraints.
To illustrate the issue, researchers created a universal jailbreak that breached several prominent chatbots, enabling them to answer questions that should normally be denied. Once compromised, LLMs consistently produced responses to nearly all inquiries, according to the report.
“It was astonishing to see the extent of knowledge this system holds,” Fire noted, citing examples that included hacking computer networks and step-by-step guides for drug manufacturing and other criminal activities.
“What makes this threat distinct from previous technical challenges is an unparalleled combination of accessibility, scalability, and adaptability,” Rokach added.
The researchers reached out to leading LLM providers to inform them of the universal jailbreak, but reported that the response was “overwhelmingly inadequate.” Some companies did not reply, while others claimed that the jailbreak threat lay outside the parameters of their bounty programs, which encourage ethical hackers to report software vulnerabilities.
The report argues that technology companies should screen training data more rigorously, implement robust firewalls to block dangerous queries and responses, and develop “machine unlearning” techniques so chatbots can “forget” any illicit information they absorb. Dark LLMs should be regarded as a “serious security threat,” comparable to unlicensed weapons and explosives, with providers held accountable.
Dr. Ihsen Alouani, an AI security expert at Queen’s University Belfast, highlighted that jailbreak attacks on LLMs could lead to significant risks, ranging from detailed weapon-building instructions to sophisticated disinformation campaigns, social engineering, and automated fraud.
“A crucial part of the solution is for companies to not only rely on front-end safeguards but to also invest meaningfully in red teaming and enhancing model-level robustness. Clear standards and independent oversight are essential to adapt to the evolving threat landscape,” he stated.
Professor Peter Garraghan, an AI security authority at Lancaster University, emphasized, “Organizations need to treat LLMs as they would any other vital software component.”
“While jailbreaking is a concern, understanding the entire AI stack is vital for genuine accountability. The real security requirements involve responsible design and deployment, not merely responsible disclosure,” he added.
OpenAI, the developer behind ChatGPT, said its latest o1 model can reason about its safety policies, which improves its resistance to jailbreak attempts, and affirmed its ongoing research into making its models more robust.
Meta, Google, Microsoft, and Anthropic were also approached for comment. Microsoft responded with a link to a blog post detailing its work to safeguard against jailbreaks.
Source: www.theguardian.com