If you thought AI would stop at generating images, think again: the revolution has only begun. Today, OpenAI unveiled Sora, a generative video model that creates videos from text inputs. Although the project is still in the testing stage and not yet available to the public, this post shares everything we know at the moment.
What Exactly Can This Video Generator Do?
According to OpenAI’s announcement, the Sora AI model can generate videos of up to one minute from text prompts. This is impressive, considering that competing tools haven’t been able to reach that duration. At 1080p video quality, Sora can also create complex scenes, include multiple characters and, most importantly, use context from the physical world to create these videos.
Another standout feature is the ability to generate videos in different styles. For example, you could request a photorealistic video, an animated one or one in black and white.
Video Credit: OpenAI
You’ll also be able to upload a still image and get an animated video from it. You can also extend existing videos or fill in missing frames.
How Does the Sora AI Model Work?
The Sora AI model builds on previous OpenAI research, including the recaptioning technique from DALL·E 3 for its visual training data, as well as the GPT models. It also gets its scaling performance from a transformer architecture. In its announcement, the company describes Sora as a diffusion model that generates a video by starting with one that looks like static noise and gradually transforming it by removing the noise over many steps.
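To make the "start from static noise and refine it over several steps" idea more concrete, here is a minimal toy sketch in Python. It is not Sora's actual method: the function names (`toy_denoise`, `mean_abs_error`) are made up for illustration, the "frame" is just a short list of numbers, and the denoising step cheats by knowing the target in advance. A real diffusion model would instead use a trained neural network, guided by the text prompt, to predict what noise to remove at each step.

```python
import random


def toy_denoise(target, steps=10, seed=0):
    """Start from random noise and move toward `target` over `steps` steps.

    Returns the list of intermediate frames, noisy first and clean last.
    """
    rng = random.Random(seed)
    # Begin with a "frame" that is pure static noise.
    frame = [rng.gauss(0.0, 1.0) for _ in target]
    history = [list(frame)]
    for step in range(steps):
        remaining = steps - step
        # Hypothetical stand-in for a learned denoiser: remove an equal
        # share of the remaining noise at every step.
        frame = [x + (t - x) / remaining for x, t in zip(frame, target)]
        history.append(list(frame))
    return history


def mean_abs_error(frame, target):
    """Average distance between a frame and the clean target."""
    return sum(abs(x - t) for x, t in zip(frame, target)) / len(target)


if __name__ == "__main__":
    target = [0.0, 0.25, 0.5, 0.75, 1.0]  # stands in for a clean video frame
    for i, frame in enumerate(toy_denoise(target)):
        print(f"step {i}: error {mean_abs_error(frame, target):.3f}")
```

Running it prints an error that shrinks at every step until the noisy starting frame matches the target, which is the same broad pattern the announcement describes, only at a vastly smaller scale.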
Shortcomings of the Current Sora AI Model
Because the video generator is still in the research and testing stage, it has some notable shortcomings. For starters, it doesn’t yet reliably understand cause and effect. This means you can’t count on it for precise details, such as a person drinking a glass of milk and the level of the liquid dropping accordingly.
OpenAI also notes that it may confuse spatial details like left and right. All of this suggests that weird hands may be the least of our worries when generating videos of human activity.
Privacy and Security Concerns
Celebrities are already being harassed with deepfake images created using AI photo generators, so it’s only natural to worry that the number of deepfakes could skyrocket once video is in the mix. There are also concerns about child protection, as well as a rise in sexual and violent content.
OpenAI addresses this in its release by emphasizing that Sora is designed for positive and beneficial use cases only. While that description is still vague, the company says the model will reject any text input that requests extreme violence, sexual content, hateful imagery, celebrity likenesses or the intellectual property of others. Hopefully, this protection will extend to the work of filmmakers as well.
The company also plans to add a detection classifier, along with metadata, that’ll let users know when a video has been generated by the Sora AI model, which is a step in the right direction.
Overall, this model shows a lot of promise. However, beyond the concerns raised above, we can’t really know its full potential until we try it ourselves. This is why we’re looking forward to the official rollout.