OpenAI has announced Sora, a state-of-the-art artificial intelligence system that can turn text descriptions into photo-realistic videos. The video generation model has added to the excitement over advances in AI technology, along with growing concerns about how synthetic deepfake videos will exacerbate misinformation and disinformation during a critical election year around the world.
The Sora AI model can currently create videos up to 60 seconds long using text instructions alone or text combined with images. One demonstration video begins with a text prompt describing a “stylish woman walking down a Tokyo street filled with warmly glowing neon lights and animated city signs.” Other examples include more fantastical scenarios, such as dogs frolicking in the snow, vehicles driving down roads, and sharks swimming through the air between city skyscrapers.
“As with other techniques in generative AI, there is no reason to believe that text-to-video will not continue to advance rapidly, moving us closer and closer to a time when it will be difficult to distinguish the fake from the real,” says Hany Farid at the University of California, Berkeley. “Combining this technology with AI-powered voice cloning could open up entirely new ground when it comes to creating deepfakes of people saying and doing things they never actually did.”
Sora is based in part on OpenAI's existing technologies, including the image generator DALL-E and the GPT large language models. Text-to-video AI models have lagged somewhat behind other generative technologies in terms of realism and accessibility, but the Sora demonstrations are “an order of magnitude more believable and less cartoonish” than their predecessors, says Rachel Tobac, co-founder of SocialProof Security, a white-hat hacking organization focused on social engineering.
To achieve this higher level of realism, Sora combines two different AI approaches. The first is a diffusion model, similar to those used in AI image generators such as DALL-E; these models learn to gradually convert randomized image pixels into a coherent image. The second is the transformer architecture, which is used to contextualize and stitch together sequential data. For example, large language models use the transformer architecture to assemble words into generally comprehensible sentences. In Sora's case, OpenAI broke video clips down into visual “spacetime patches” that the model's transformer architecture could process.
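To make the idea of “spacetime patches” concrete, here is a minimal sketch of how a video array might be cut into transformer-ready tokens. The function name and all patch sizes are illustrative assumptions, not OpenAI's actual values or code.

```python
import numpy as np

def to_spacetime_patches(video, t_size=2, p_size=4):
    """Split a video array of shape (T, H, W, C) into flattened
    spacetime patches: small blocks spanning a few frames in time and
    a few pixels in space, each becoming one token vector.
    t_size and p_size are illustrative, not OpenAI's real settings."""
    T, H, W, C = video.shape
    assert T % t_size == 0 and H % p_size == 0 and W % p_size == 0
    blocks = video.reshape(T // t_size, t_size,
                           H // p_size, p_size,
                           W // p_size, p_size, C)
    # Group the three block indices together, then flatten each
    # (t_size, p_size, p_size, C) block into a single token vector.
    patches = blocks.transpose(0, 2, 4, 1, 3, 5, 6)
    return patches.reshape(-1, t_size * p_size * p_size * C)

# A tiny 8-frame, 16x16, RGB "video" yields 4*4*4 = 64 tokens,
# each of dimension 2*4*4*3 = 96.
video = np.random.rand(8, 16, 16, 3)
tokens = to_spacetime_patches(video)
print(tokens.shape)  # (64, 96)
```

In this toy version, a transformer would then attend over the 64 token vectors, letting it relate events across both space and time in the clip.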
Sora's videos still contain plenty of mistakes, such as a walking person's left and right legs swapping places, a chair floating randomly in mid-air, or a bitten cookie magically showing no bite mark. Still, Jim Fan, a senior research scientist at NVIDIA, took to the social media platform X to praise Sora as a “data-driven physics engine” that can simulate worlds.
The fact that Sora's videos still display strange glitches when depicting complex scenes with lots of movement suggests that such deepfake videos will remain detectable for now, says Arvind Narayanan at Princeton University. But he also cautioned that in the long run “we will need to find other ways to adapt as a society.”
OpenAI has held off on making Sora publicly available while it conducts “red team” exercises, in which experts attempt to break the AI model's safeguards in order to assess its potential for misuse. An OpenAI spokesperson said the select group currently testing Sora comprises “experts in areas such as misinformation, hateful content, and bias.”
This testing is vital because synthetic videos could let malicious actors generate false footage in order to, for instance, harass someone or sway a political election. Misinformation and disinformation fueled by AI-generated deepfakes ranks as a major concern for leaders in academia, business, government, and other fields, as well as for AI experts.
“Sora is absolutely capable of creating videos that could deceive everyday people,” Tobac said. “Videos don't need to be perfect to be believable, as many people still don't realize that video can be manipulated as easily as pictures.”
Tobac said AI companies will need to collaborate with social media networks and governments to handle the scale of misinformation and disinformation likely to arise once Sora becomes available to the public. Defenses could include implementing unique identifiers, or “watermarks,” on AI-generated content.
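As a rough illustration of what an invisible identifier on generated media might look like, here is a toy least-significant-bit watermark. This is purely pedagogical and is not OpenAI's method; real provenance schemes (such as C2PA-style signed metadata) are far more robust against cropping, compression, and tampering.

```python
def embed_watermark(pixels, tag):
    """Hide an ASCII tag in the least significant bits of a flat list
    of 8-bit pixel values. A toy sketch of invisible watermarking."""
    bits = [(byte >> i) & 1 for byte in tag.encode() for i in range(7, -1, -1)]
    assert len(bits) <= len(pixels), "image too small for tag"
    out = list(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit  # overwrite only the lowest bit
    return out

def extract_watermark(pixels, tag_len):
    """Recover a tag_len-byte ASCII tag from the pixels' lowest bits."""
    bits = [p & 1 for p in pixels[: tag_len * 8]]
    data = bytes(
        sum(bit << (7 - i) for i, bit in enumerate(bits[k:k + 8]))
        for k in range(0, len(bits), 8)
    )
    return data.decode()

stamped = embed_watermark([128] * 64, "AI-GEN")
print(extract_watermark(stamped, 6))  # AI-GEN
```

Because only the lowest bit of each pixel changes, the mark is imperceptible to viewers but trivially readable by detection tools; its fragility is exactly why production systems favor cryptographically signed metadata instead.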
When asked whether OpenAI has plans to make Sora more widely available in 2024, an OpenAI spokesperson said the company is “taking several important safety steps ahead of making Sora available in OpenAI's products.” For example, the company already uses automated processes aimed at preventing its commercial AI models from generating extreme violence, sexual content, hateful imagery, and depictions of real politicians or celebrities. With more people than ever before participating in elections this year, those safety measures will be crucial.
topic:
- artificial intelligence
- video
Source: www.newscientist.com