OpenAI has announced its latest artificial intelligence model, GPT-4o, which will soon power some versions of the company’s ChatGPT product. The upgraded ChatGPT can respond quickly to text, audio and video input from real-time conversation partners, while speaking with intonation and diction that convey emotion and personality.
The company demonstrated the new voice mode’s emotional mimicry on May 13 in a supposedly live OpenAI presentation featuring both the ChatGPT mobile app and a new desktop app. The conversational AI feature, which speaks in a female-sounding voice and responds to the name ChatGPT, seemed more reminiscent of the personal AI voiced by Scarlett Johansson in the 2013 sci-fi film “Her” than of the stylized, robotic responses of typical voice assistant technology.
“The new GPT-4o voice-to-voice interaction is more similar to human-to-human interaction,” says Michelle Cohn at the University of California, Davis. “This is largely due to the low latency, but even more important is the level of emotional expression the audio produces.”
During a conversation with Mira Murati, the company’s CTO, and two other employees, ChatGPT powered by GPT-4o commented on OpenAI’s Mark Chen’s heavy, fast-paced breathing, telling him to slow down because “you’re not a vacuum cleaner” and suggesting breathing techniques. The AI also visually inspected a drawing by OpenAI’s Barret Zoph showing words and a heart, and gushed in response: “Oh, I see you wrote ‘I love ChatGPT’. That’s so sweet.”
The new ChatGPT also verbally coached conversation partners through solving a simple linear equation, explained the functions of computer code and interpreted a graph showing temperature lines peaking in summer. When prompted, the AI even retold a bedtime story it had made up, switching to increasingly dramatic narration and ending in song.
Sam Altman, CEO and co-founder of OpenAI, said in a post on the X platform that the new voice mode will first be available to paid ChatGPT Plus subscribers in the coming weeks.
ChatGPT was able to conversationally recover from occasional technical glitches. When asked to interpret the facial expression and emotions in a selfie by OpenAI’s Zoph, the AI at first described a wooden surface from an earlier image before being prompted to evaluate the most recent one.
“Ah, there we go. You look very happy and cheerful, with a big smile and a touch of excitement,” said ChatGPT. “Whatever is going on, you seem to be in a good mood. Would you like to share the source of that good mood?”
When told during the live demo that the good mood came from showing off “how helpful and awesome you are”, the AI responded: “Stop it, you’re making me blush.”
However, Murati acknowledged that the updated version of ChatGPT using GPT-4o, which she says will eventually be made available to free ChatGPT users as well, introduces new safety risks by taking in and interpreting real-time information. She said OpenAI is working on building “mitigations against abuse”.
“The demo is impressive because it is very difficult to have seamless multimodal conversations,” says Peter Henderson at Princeton University in New Jersey. “However, as you add more modalities, safety becomes more difficult and more critical. With this expansion of the inputs the model takes in, it will likely take some time to identify potential safety failure modes.”
Henderson also said he was “interested” to see how OpenAI’s privacy terms will apply once ChatGPT users start sharing input such as live audio and video, and whether free users will be able to opt out of having their data collected to train future OpenAI models.
“Since the model appears to be hosted off-device, the ability to share your desktop screen with the model over the internet and to continuously stream audio and video will, I think, further amplify the challenges around how that data is stored and used for this particular product launch,” Henderson says.
More human-like AI chatbots also pose another risk. Bots that can fake empathy in voice conversations may sound more approachable and persuasive, according to a study by Cohn and her colleagues. As a result, people may be more likely to trust potentially inaccurate information and biased stereotypes generated by large language models such as GPT-4.
“This has important implications for how people search for and receive guidance from large language models, especially since they don’t always produce accurate information,” Cohn says.
Source: www.newscientist.com