The UK’s new AI Safety Institute (AISI) has found that advanced AI systems can mislead human users, produce biased results, and lack adequate safeguards against the dissemination of harmful information.
Announcing the initial findings of its research into advanced AI systems known as large language models (LLMs), the technology behind tools such as chatbots and image generators, the institute identified a range of concerns.
The institute found that basic prompts can bypass LLM safeguards, allowing chatbots such as ChatGPT to be used for “dual-use” tasks, a term for applications with both military and civilian purposes.
According to AISI, “Using basic prompting techniques, users were able to instantly defeat the LLM’s safeguards and gain assistance with dual-use tasks.” The institute also noted that more sophisticated “jailbreak” techniques could be developed by relatively unskilled attackers within a few hours.
The research showed that LLMs can help novices plan cyberattacks and are capable of creating social media personas to spread disinformation.
Comparing AI models with web searches, the institute found that both provide roughly the same level of information, but AI models are prone to “hallucinations”, producing inaccurate advice.
Image generators were found to produce racially biased results. The institute also discovered that AI agents can deceive human users in certain scenarios.
AISI is currently testing advanced AI systems and evaluating their safety, while also sharing information with third parties. The institute focuses on the misuse of AI models, their impact on humans, and their ability to perform harmful tasks.
AISI clarified that it does not have the capacity to test all released models and is not responsible for declaring these systems “safe.” The institute emphasized that it is not a regulator but provides a secondary check on AI systems.
Source: www.theguardian.com