Video output from Sora, OpenAI's AI-powered text-to-video generator.
OpenAI Unveils Sora, Its Impressive AI Video Generator

By Michelle Hawley
OpenAI launches Sora, the next frontier in AI video generation offering hyper-realistic content.

The Gist

  • Innovative AI video generation. OpenAI introduces Sora, a groundbreaking AI tool capable of generating hyper-realistic videos up to a minute long from text prompts.
  • Exclusive access and future expansion. Currently, Sora is available only to a select group of red teamers and creative professionals for testing and feedback, focusing on identifying potential risks and improving the model's capabilities.
  • Technological advancements and ethical considerations. Sora represents a significant step toward more sophisticated AI capabilities, using diffusion models and transformer architecture for video generation.  

OpenAI announced an exciting new tool — Sora, an AI model that can generate hyper-realistic video from text. 

Sora can create videos up to a minute long featuring highly detailed scenes, complex camera motion and multiple characters with vibrant emotions, OpenAI wrote in a recent tweet. One example is generated historical footage of California during the gold rush.

Sora builds on the tech of DALL-E, OpenAI’s text-to-image generation tool. Sora not only understands what users ask for in the text prompt, but also how those things exist in the physical world. 

Sora Only Available to Select Users 

Sora is not currently available to the general public. It’s only available to red teamers — experts in areas like misinformation, bias and hateful content — to test critical areas for harm or risk. OpenAI also granted access to a handful of visual artists, designers and filmmakers in an attempt to gain feedback on how to improve the model for creative professionals. 

OpenAI has not yet released information on when Sora will be available for general use, and there is no waitlist users can join. However, if you want to see the AI model in action, plenty of users (along with OpenAI) are sharing their experiences online. 

Sam Altman, CEO of OpenAI, also requested prompts for Sora videos on Twitter, wanting to show off the AI model in action. He followed up by telling users not to “hold back on detail or difficulty.”

Related Article: Midjourney vs. DALL-E 2 vs. Stable Diffusion. Which AI Image Generator Is Best for Marketers?

How Sora Works 

Sora is a diffusion model that builds on past research in DALL-E and GPT models. It uses the recaptioning technique from DALL-E 3, which involves generating highly descriptive captions for the visual training data. As a result, the model can follow users’ text instructions more accurately.

Similar to GPT models, Sora uses a transformer architecture, which gives it strong scaling performance. The model generates a video by starting with one that looks like static noise, then gradually transforming it by removing the noise over many steps.

OpenAI's Sora video generation process
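The noise-removal idea can be illustrated with a toy sketch. This is not Sora's actual method: a real diffusion model uses a trained neural network, conditioned on the text prompt, to predict the noise. Here the hypothetical `toy_noise_predictor` cheats by comparing against a known clean frame, and every name in the block is an assumption made for illustration.

```python
import random


def toy_noise_predictor(frame, clean_frame):
    """Stand-in for a trained network (assumption: it simply knows the answer).

    Returns the difference between the current frame and the clean frame.
    """
    return [p - c for p, c in zip(frame, clean_frame)]


def denoise_step(frame, predicted_noise, step_size):
    """Remove a fraction of the predicted noise from every pixel."""
    return [p - step_size * n for p, n in zip(frame, predicted_noise)]


def generate(clean_frame, steps=50, step_size=0.1, seed=0):
    """Start from pure Gaussian noise and denoise it over many small steps."""
    rng = random.Random(seed)
    frame = [rng.gauss(0.0, 1.0) for _ in clean_frame]  # looks like static
    for _ in range(steps):
        noise = toy_noise_predictor(frame, clean_frame)
        frame = denoise_step(frame, noise, step_size)
    return frame


if __name__ == "__main__":
    target = [0.0, 0.5, 1.0, 0.25]  # a tiny four-pixel "frame"
    result = generate(target)
    print([round(p, 2) for p in result])
```

Each step removes only a fraction of the remaining noise, so the output approaches the clean frame gradually rather than in one jump, which is the behavior the article describes.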

While Sora generates videos from text, users can also prompt it with other inputs, such as pre-existing images or videos. For instance, users can create an image with DALL-E, then ask Sora to animate that image. 

Video-to-video editing is also an option. Users can upload videos to Sora and use the diffusion model to edit the video — like changing the video’s setting, connecting two input videos with a seamless transition or extending videos backward or forward in time to produce an infinite loop. 

And while Sora is currently the talk of the internet for its impressively realistic videos, the model is also capable of generating images of up to 2048x2048 resolution.

Sora Still Has Weaknesses

According to OpenAI, Sora still has some imperfections: it may struggle with accurately simulating the physics of a scene or understanding specific instances of cause and effect.

One example the company gave: a person might take a bite out of a cookie, but afterward the cookie may not have a bite mark.

Many of the videos shared online have these tell-tale AI signs, like this video shared by Altman on Twitter where a woman giving a cooking demonstration has a magically disappearing spoon. 

Or this video showing a pack of coyotes that seem to merge and unmerge from each other. OpenAI commented on the video, “Animals or people can spontaneously appear, especially in scenes containing many entities.” 

AI Video Safety and Concerns 

OpenAI is building tools to help detect misleading content, such as a detection classifier that can tell when a video was generated by Sora. 

In addition to developing new tools and techniques, the company plans to use existing safety methods built for DALL-E 3. These include prompt transformations, which rewrite submitted text to comply with guidelines (such as not using public figure names), and blocklists, which can block certain images from being generated.
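A rough sketch of what a prompt transformation and a blocklist check could look like follows. OpenAI has not published its implementation, so this is purely illustrative: the names `PUBLIC_FIGURES`, `BLOCKED_TERMS`, `transform_prompt` and `is_blocked` are hypothetical, the sample data is invented, and applying the blocklist to prompt text rather than to output images is an assumption.

```python
import re

# Hypothetical data for illustration only.
PUBLIC_FIGURES = {"Jane Example": "a middle-aged woman in a suit"}
BLOCKED_TERMS = {"gore"}


def transform_prompt(prompt):
    """Rewrite a prompt so that listed names become generic descriptions."""
    for name, generic in PUBLIC_FIGURES.items():
        prompt = re.sub(re.escape(name), generic, prompt, flags=re.IGNORECASE)
    return prompt


def is_blocked(prompt):
    """Return True if any word in the prompt appears on the blocklist."""
    words = set(re.findall(r"[a-z']+", prompt.lower()))
    return not words.isdisjoint(BLOCKED_TERMS)


if __name__ == "__main__":
    print(transform_prompt("Jane Example riding a bike"))
    print(is_blocked("a calm lake at sunrise"))
```

A production system would likely rely on trained classifiers rather than fixed word lists, since simple string matching is easy to evade.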

OpenAI also plans to work with global policymakers, educators and artists to understand concerns and identify positive use cases for the new technology. However, the company said, “Despite extensive research and testing, we cannot predict all of the beneficial ways people will use our technology, nor all the ways people will abuse it.” 

With easy access to AI-generated content, many are concerned about the rise of misinformation, going so far as to say AI is a potential threat to democracy. 

In a PBS NewsHour interview, Laura Barrón-López, White House correspondent, said that while AI has been used before in past elections, “AI generative tools are now more widely available, and they’re much more sophisticated.” And while some companies have decided to label AI content, they aren’t outright banning it, with Twitter not even agreeing to label AI-generated content that might be fake.

Political interference is not the only concern when it comes to generative AI video, either. Many public figures have been caught in the crossfire of fake videos, with notable examples including an AI version of Tom Hanks promoting dental insurance, an AI-generated stand-up special of the late comedian George Carlin and even sexually explicit AI content of music superstar Taylor Swift.

Related Article: What Brands Need to Know About AI Image Generation Models

Sora Not the Only AI Video Generator

Sora is not the first AI model that can produce video from text prompts — but it may be the most impressive.

AI video generators started cropping up in late 2022 with Meta’s Make-A-Video, followed by models such as Runway’s Gen-1 and Google’s Lumiere. However, most of these models produce low-quality and glitchy results that are only a few seconds long.

OpenAI’s Sora, on the other hand, can produce videos up to a minute long, with the added ability of creating transition videos that can stitch multiple videos together seamlessly. While Sora can produce high-quality animated content, many of its videos are full of rich detail that makes them easy to mistake for real-life content, especially for those not used to looking for the tell-tale signs of AI.

Is Sora the Path to AGI?

Artificial general intelligence (AGI), a type of intelligence where a machine can understand, learn and think like a human, is still only a hypothetical. But that’s not stopping companies like OpenAI, Microsoft, Meta and others from trying to make it a reality.

Ultimately, said OpenAI, Sora serves as the foundation for models that can understand and simulate the real world, “a capability we believe will be an important milestone for achieving AGI.”

Altman tweeted after the release of Sora that OpenAI is “extremely focused on making AGI.”

About the Author

Michelle Hawley

Michelle Hawley is an experienced journalist who specializes in reporting on the impact of technology on society. As a senior editor at Simpler Media Group and a reporter for CMSWire and Reworked, she provides in-depth coverage of a range of important topics including employee experience, leadership, customer experience, marketing and more. With an MFA in creative writing and background in inbound marketing, she offers unique insights on the topics of leadership, customer experience, marketing and employee experience. Michelle previously contributed to publications like The Press Enterprise and The Ladders. She currently resides in Pennsylvania with her two dogs.

Main image: OpenAI