Google DeepMind announced Monday, January 6, that it has created a new team to work on “massive” generative models that would “simulate the world.” These models represent the next stage of advancement in artificial intelligence (AI) capabilities for decision-making, planning and creativity.
World models are computational frameworks that help AI systems understand and simulate the real or virtual world. They are an essential part of teaching AI systems to navigate an environment and have widespread applications in robotics, gaming and autonomous systems.
For example, autonomous vehicles use world models to simulate traffic and road conditions. World models can also be used to train general AI robots in different environments. A common problem is the lack of rich, diverse and safe training environments for so-called embodied AI.
A DeepMind job posting published Monday said scaling AI models was also important for the technology’s evolution.
“We believe that expanding pre-training on video and multimodal data is on the critical path to artificial general intelligence. World models will power many areas, such as visual reasoning and simulation, embodied agent planning, and real-time interactive entertainment,” the job posting states. PYMNTS contacted Google but has not yet received a response.
Tim Brooks, who left OpenAI in October to join Google DeepMind, will lead the team. At OpenAI, Brooks co-led the development of Sora, its video generation model that went viral upon its unveiling due to its sophistication.
According to job postings for the team, the new hires will “collaborate and build on” the work of the teams behind Gemini (Google’s flagship large multimodal model), Veo (video generation model) and Genie (world model).
Google DeepMind’s focus on world models follows that of AI startup World Labs, which came out of stealth last September. The startup develops large world models. Led by Stanford AI pioneer Fei-Fei Li, the startup is funded by AI pioneer and Nobel laureate Geoffrey Hinton, Salesforce CEO Marc Benioff, LinkedIn co-founder Reid Hoffman and former Google Chairman Eric Schmidt, as well as Andreessen Horowitz, NEA, NVentures and others.
Google DeepMind has already developed several world models, including Genie and Genie 2. Genie 2 can turn text and images into 3D worlds that respond to a user’s actions in that environment. (Genie created only 2D worlds.)
Genie 2 learns from a large video dataset. An autoencoder first compresses the video frames into simpler, meaningful representations. A transformer model then analyzes these compressed frames and predicts how the video will progress, step by step, using a method similar to how text generation models like ChatGPT work.
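The loop described above (compress frames, predict the next compressed frame, feed the prediction back in) can be illustrated with a minimal Python sketch. This is not Genie 2’s implementation: the names `encode`, `decode`, `predict_next` and `rollout` are hypothetical, the learned autoencoder and transformer are replaced by fixed toy functions, and the user actions that Genie 2 responds to are left out.

```python
from typing import Callable, List

Frame = List[float]   # a flattened "video frame" of pixel intensities
Latent = List[float]  # its compressed representation


def encode(frame: Frame, factor: int = 2) -> Latent:
    """Toy stand-in for a learned encoder: average each group of `factor` pixels.

    Assumes len(frame) is divisible by `factor`.
    """
    return [sum(frame[i:i + factor]) / factor for i in range(0, len(frame), factor)]


def decode(latent: Latent, factor: int = 2) -> Frame:
    """Toy stand-in for a learned decoder: repeat each latent value `factor` times."""
    return [value for value in latent for _ in range(factor)]


def predict_next(history: List[Latent]) -> Latent:
    """Toy stand-in for the transformer: linear extrapolation from the last two latents."""
    if len(history) < 2:
        return list(history[-1])
    prev, last = history[-2], history[-1]
    return [2 * b - a for a, b in zip(prev, last)]


def rollout(frames: List[Frame], steps: int,
            predictor: Callable[[List[Latent]], Latent] = predict_next) -> List[Frame]:
    """Encode the prompt frames, then generate `steps` new frames one at a time.

    Each predicted latent is appended to the history and fed back in,
    which is the autoregressive (step-by-step) part of the loop.
    """
    history = [encode(frame) for frame in frames]
    generated = []
    for _ in range(steps):
        next_latent = predictor(history)
        history.append(next_latent)
        generated.append(decode(next_latent))
    return generated


# Two short prompt frames whose values rise by 1.0 per frame.
prompt = [[0.0, 0.0, 1.0, 1.0], [1.0, 1.0, 2.0, 2.0]]
future = rollout(prompt, steps=3)
print(future)
```

In a real system the predictor would be a trained model operating on much larger representations. The sketch only shows why generation proceeds one frame at a time, much as a text model produces one token at a time.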
Trained on a large-scale video dataset, Genie 2 can display object interactions, complex character animation, physics (such as gravity and water splash effects) and the modeling of other agents’ behavior. The worlds it creates can last up to a minute, though most last between 10 and 20 seconds.
Google DeepMind’s increased focus on world models will further refine the capabilities of its AI systems as it competes with OpenAI, Meta, Microsoft and Amazon in serving businesses.
The latest effort adds to its already rich portfolio of innovations, one of which, AlphaFold2, recently led to Nobel Prize wins for CEO Demis Hassabis and John M. Jumper. AlphaFold2 is an AI model that predicted the structure of virtually all known proteins, thereby solving a 50-year-old biochemical challenge.
In a paper published in October, Google DeepMind researchers said they trained a large language model called the Habermas Machine to act as an AI mediator, helping small groups in the U.K. find common ground on controversial issues such as Brexit or immigration. To do this, it drafted a “group statement” that reflected their shared views.