Google’s Veo generates 1080p AI-created videos lasting one minute

Google has announced a revolutionary new AI model named “Veo” that will support video generation, tailored to users’ creative visions. Google is also upgrading its image generation model, bringing it to its third generation in Imagen 3.

Bard was one of our first glimpses of Google’s modern large language models. It first launched about a year ago, with major changes coming to the platform in recent months. One of the biggest was a complete rebranding of the user-facing AI tool as Gemini, a name that has since spread across the company’s product line, with Gemini Pro and with Gemini Nano in current and upcoming devices.

Just before Bard was renamed Gemini, Google added the ability to request images through the AI’s conversational model. Asking for an image of a cow on a boat would yield exactly that, in whatever style suits you. This process was powered by Imagen 2, which was the first version to be publicly available.

Google’s Veo model

Today, Google is announcing two creative generation models, Veo and Imagen 3. Veo is the more exciting of the two, because it’s something the public hasn’t been able to try yet. The model is designed specifically for video generation and, like other modern models, understands visual semantics and natural language. This approach to video generation produces results that can be creatively adapted to fit particular styles.

Google notes that the Veo model will be able to understand “cinematic terms” in user prompts, like aerial shots and timelapse formats. Veo is capable of generating 1080p videos that can run longer than a minute, outperforming current models like OpenAI’s Sora, which has a maximum length of 60 seconds.

Veo builds on years of work on our generative video models including Generative Query Network (GQN), DVD-GAN, Imagen-Video, Phenaki, WALT, VideoPoet and Lumiere – combining architecture, scaling laws and other innovative techniques to improve output quality and resolution.

Google invites creators and filmmakers to put Veo through its paces to shape the model so that it can accommodate a wide variety of artistic styles and use cases.

Imagen 3

The Imagen model is also getting a substantial update. Imagen 3 is positioned as Google’s “highest quality” text-to-image model and offers some improvements over the Imagen 2 model we saw in Gemini and Bard.

Imagen 3 is meant to bring a higher level of detail to images, with fewer visual artifacts and impurities. Images can also be more photorealistic and lifelike when the prompt calls for it.

Perhaps the biggest improvement is Imagen 3’s ability to render text. This has been a comical weakness of text-to-image models like DALL-E and Adobe Firefly. Google is positioning the new model as a way to create personalized images with text, like greeting cards or photos with messages. How well it actually renders text remains to be seen, but it’s a promising improvement.

Veo and Imagen 3 can be used in a private preview via VideoFX from Google Labs. VideoFX will use SynthID to ensure that content created is digitally watermarked and responsibly generated.

Those who want to try the new models can sign up through Google’s waitlist.

News Source : 9to5google.com