“First came text, then images, and now OpenAI has a model for generating videos,” shouted Mashable the other day. The creators of ChatGPT and Dall-E had announced Sora, a text-to-video diffusion model. Excited commentary erupted around the web about what will no doubt become known as T2V, covering the usual gamut from “Is this the end of [insert threatened activity here]?” to “meh” and everything in between.
Sora (the name means “sky” in Japanese) is not the first T2V tool, but it looks more sophisticated than earlier efforts such as Meta’s video-generation AI. It can turn a short text description into a detailed, high-resolution film clip up to a minute long. For example, a prompt such as “a cat waking up its sleeping owner, demanding breakfast; the owner tries to ignore the cat, but the cat tries new tactics, and finally the owner pulls a secret stash of treats from under the pillow to hold the cat off a little longer” produces a slick video clip that would go viral on any social network.
Cute, eh? Well, up to a point. OpenAI is being uncharacteristically candid about its tool’s limitations. It may, for example, “struggle to accurately simulate the physics of complex scenes”.
That’s putting it mildly. One of the videos in the sample set illustrates the model’s difficulties. It was generated from the prompt “photorealistic close-up video of two pirate ships battling each other as they sail inside a cup of coffee”. At first sight it’s impressive. But then you notice that one of the ships is moving quickly in an inexplicable way, and it becomes clear that while Sora may know a good deal about how light reflects off fluids, it knows little or nothing about the physical laws governing the movement of galleons.
Other limitations? Sora can be a little hazy about cause and effect: “a person can take a bite of a cookie, and afterwards the cookie may have no bite mark”. Tut, tut. It may also confuse “the spatial details of a prompt, for example mixing up left and right”. And so on.
Still, it’s a start, and it will doubtless get better with the next billion teraflops of computing power. And while Hollywood studio bosses can go on sleeping peacefully in their king-sized beds, Sora will soon be good enough to replace some kinds of stock video.
Yet for all its concessions about the tool’s limitations, OpenAI maintains that Sora “serves as the foundation for models that can understand and simulate the real world”, a capability the company says would be a “significant milestone” on the road to artificial general intelligence (AGI).
This is where things get interesting. OpenAI’s corporate goal is to attain the holy grail of AGI, and the company appears to believe that generative AI is a tangible step towards it. The trouble is that achieving AGI means building machines that understand the real world at least as well as we do, which among other things requires an understanding of the physics of moving objects. The implicit bet of the OpenAI project, then, is that, given enough computing power, a machine that can predict how pixels will move on a screen will eventually learn how the physical objects those pixels depict behave in the real world. In other words, the bet is that extrapolating the machine-learning paradigm will eventually lead to superintelligent machines.
But an AI that could navigate the real world would need to understand more than how the laws of physics operate in it; it would also need to understand how humans behave in it. And to anyone who has followed the work of Alison Gopnik, that seems a bit of a stretch for the kind of machines the world currently calls “AI”.
Gopnik is renowned for her research into how children learn. Watching her TED talk, What Do Babies Think?, would be a salutary experience for technologists who imagine that their technology is the answer to the question of intelligence. Decades spent studying the sophisticated information-gathering and decision-making that babies do when they play led her to the conclusion that “babies and toddlers are like humanity’s research and development arm”. This columnist, who spent a year observing a granddaughter’s first year of development, and in particular how she began to understand cause and effect, is inclined to agree. If Sam Altman and his OpenAI colleagues are really interested in AGI, perhaps they should spend some time with babies.
What I’ve been reading
Algorithmic politics
Henry Farrell has written a seminal essay on the political economy of AI.
Bot habits
There is a thoughtful piece in the Atlantic by Albert Fox Cahn and Bruce Schneier on how chatbots are changing the way we talk.
No call-up
Science fiction author Charlie Stross has written a blog post on why Britain couldn’t reintroduce conscription, even if it wanted to.
Source: www.theguardian.com