
Watch it and weep (or smile): Synthesia’s AI video avatars now feature emotions

Generative AI has captured the public imagination by creating elaborate, plausibly real text and images from verbal prompts. But the catch, and there is often a catch, is that the results are far from perfect when you look a little closer.

People sprout strange fingers, floor tiles slide away, and math problems are precisely that: problems, since sometimes they don’t add up.

Today, Synthesia, one of the ambitious AI startups working in video (specifically, personalized avatars designed for business users to create promotional, training and other video content), is releasing an update that it hopes will help it overcome some of the challenges of its particular field. Its latest version features avatars, built from real humans captured in its studio, that provide more emotion, better lip tracking and what it says are more expressive, natural, human-like movements when given text to generate videos.

This release follows impressive progress made by the company to date. Unlike other generative AI players like OpenAI, which has built a two-pronged strategy – raising public awareness with consumer tools like ChatGPT while developing a B2B offering, with its APIs used by independent developers as well as by giant companies – Synthesia leans toward the approach taken by other leading AI startups.

In the same way that Perplexity focuses on generative AI search, Synthesia focuses on creating the most human-like generative video avatars possible. More precisely, it seeks to do so only for the enterprise market and for use cases such as training and marketing.

This focus has allowed Synthesia to stand out in what has become a very crowded AI market, one that risks becoming commoditized as hype settles into longer-term concerns such as ARR, unit economics and the operational costs of implementing AI.

Synthesia describes its new expressive avatars, the version released today, as a first of their kind: “the world’s first avatars generated entirely with AI.” The avatars are built on large pre-trained models, and Synthesia says its breakthrough lies in how those models are combined to achieve multimodal distributions that more closely mimic the way humans speak.

These are generated on the fly, Synthesia explains, which is meant to be closer to the experience we have when we speak or react in life. That contrasts with the way many avatar-based AI video tools work today: typically, these are actually many pieces of video that are quickly stitched together to create facial responses that correspond, more or less, to the scripts fed into them. The goal is to appear less robotic and more realistic.

Previous version:

New version:

As you can see in the two examples here, one from the previous version of Synthesia and one from the version released today, there is still some way to go in terms of development, something that CEO Victor Riparbelli himself admits.

“Of course, we’re not 100% there yet, but it will be very, very soon, by the end of the year. It will be so mind-blowing,” he told TechCrunch. “I think you can also see that the AI part is very subtle. In humans, there is so much information in the smallest details, the smallest movements of our facial muscles. I don’t think we could ever sit down and describe: ‘yes, you smile like that when you’re happy but it’s fake, right?’ It’s such a complex thing for humans to describe, but it can be (captured in) deep learning networks. They are actually able to understand the model and reproduce it in a predictable way.” The next thing to work on, he said, is the hands.

“The hands are super tough,” he added.

The focus on B2B also helps Synthesia further anchor its messaging and products on “safe” use of AI. This is essential, especially given the enormous concern today about deepfakes and the use of AI for malicious purposes such as disinformation and fraud. Despite this, Synthesia has not managed to completely avoid controversy on this front. As we have previously highlighted, Synthesia’s technology has already been misused to produce propaganda in Venezuela and false information promoted by pro-China social media accounts.

The company said today that it has taken additional steps to try to clamp down on such use. Last month, it said, it updated its policies “to restrict the type of content people can create, investing in early detection of bad faith actors, increasing the number of teams working on AI security and experimenting with content identification technologies such as C2PA.”

Despite these challenges, the company has continued to grow.

Synthesia was last valued at $1 billion when it raised $90 million. Notably, that funding round took place almost a year ago, in June 2023.

Riparbelli (pictured above, right, with fellow co-founders Steffen Tjerrild, Professor Lourdes Agapito and Professor Matthias Niessner) said in an interview earlier this month that there are currently no plans to raise more, although that doesn’t really answer the question of whether Synthesia is being approached proactively. (Note: We are very excited to hear the real human Riparbelli speak at one of our London events in May, where I will definitely be asking about this again. Please come along if you are in town.)

What we know for sure is that building and operating AI is very expensive, and Synthesia has built and operated a lot of it.

Before the current version launched, some 200,000 people had created more than 18 million video presentations in some 130 languages using Synthesia’s 225 existing avatars, the company said. (It has not specified how many users are on its paid tiers, but there are plenty of big-name customers, including Zoom, the BBC, DuPont and many others, and businesses do pay.) The startup’s hope, of course, is that with the new version released today, these numbers will increase even more.

techcrunch
