Too many models

How many AI models is too many? It depends on how you look at it, but 10 a week is probably a bit much. That’s roughly how many we’ve seen over the last few days, and it’s increasingly hard to tell whether and how these models compare to one another, if that was ever possible to begin with. So what’s the point?

We’re at a strange moment in the evolution of AI, though of course it’s been pretty strange the whole time. We’re seeing a proliferation of models, large and small, from niche developers to large, well-funded ones.

Let’s go through this week’s list, shall we? I’ve tried to condense what sets each model apart as much as possible.

  • LLaMa-3: Meta’s latest flagship “open” large language model. (The term “open” is currently contested, but this project is nonetheless widely used by the community.)
  • Mistral 8×22: A rather large mixture-of-experts model from the French company that has been drifting away from the openness it once embraced.
  • Stable Diffusion 3 Turbo: An upgraded SD3 to go with Stability’s new open API. Borrowing “turbo” from OpenAI’s model nomenclature is a bit weird, but OK.
  • Adobe Acrobat AI Assistant: “Talk to your documents” from the 800-pound document gorilla. However, I’m pretty sure this is primarily a wrapper for ChatGPT.
  • Reka Core: From a small team formerly employed by Big AI, a multimodal model built from scratch that is at least nominally competitive with the big dogs.
  • Idefics2: A more open multimodal model, built on recent, smaller models from Mistral and Google.
  • OLMo-1.7-7B: A larger version of the AI2 LLM, among the most open on the market, and a springboard towards a future 70B scale model.
  • Pile-T5: A version of the old reliable T5 model trained on the Pile dataset. The same T5 you know and love, but better at coding.
  • Cohere Compass: An “embedding model” (if you don’t already know what that is, don’t worry) focused on incorporating multiple data types to cover more use cases.
  • Imagine Flash: Meta’s latest image generation model, leveraging a new distillation method to speed up delivery without compromising too much on quality.
  • Limitless: “Personalized AI powered by what you’ve seen, said, or heard.” It’s a web app, a Mac app, a Windows app, and a wearable device. 😬

That makes 11, since one more was announced while I was writing this. And to be clear, these aren’t all of the models released or previewed this week! They’re just the ones we saw and discussed. If we relaxed the conditions for inclusion a little, there would be dozens: fine-tunes of existing models, combos like Idefics 2, experimental or niche ones, and so on. Not to mention this week’s new tools for building (torchtune) and fighting back against (Glaze 2.0) generative AI!

What are we to make of this endless avalanche? Next week, even if it doesn’t bring the ten or twenty releases we saw in this one, will surely bring at least five or six of the caliber mentioned above. We can’t “review” them all. So how can we help you, our readers, understand and keep up with them all?

Well, the truth is that you don’t need to follow them, and neither does just about anyone else. There has been a shift in the AI space: some models, like ChatGPT and Gemini, have evolved into entire web platforms spanning multiple use cases and endpoints. Other large language models, like LLaMa or OLMo, though they technically share a basic architecture, don’t actually fill the same role. They are meant to live in the background as a service or component, not in the foreground as a brand.

There is some deliberate confusion between these two kinds of things, because model developers want to borrow a little of the fanfare we tend to associate with major AI platform releases like your GPT-4V or Gemini Ultra. Everyone wants you to think their release is important. And while it is probably important to someone, that someone is almost certainly not you.

Think of it like another broad, diverse category, such as cars. When they were first invented, you just bought “a car.” Then, a little later, you could choose between a big car, a small car, and a tractor. Nowadays, hundreds of cars are released every year, but you probably don’t need to know about even one in ten of them, because nine out of ten are not a car you need, or even a car in the sense you understand the word. We are moving from the big/small/tractor era of AI to the era of proliferation, and even AI specialists can’t keep up with and test every model that comes out.

The other side of this story is that we were already at this stage long before ChatGPT and the other big models came out. Far fewer people were reading about it 7 or 8 years ago, but we covered it nonetheless, because it was clearly a technology waiting for its defining moment, which arrived in due time. Papers, models, and research were constantly being published, and conferences like SIGGRAPH and NeurIPS were filled with machine learning engineers comparing notes and building on one another’s work. Here’s a story about visual understanding that I wrote in 2011!

This activity continues every day. But because AI has become big business (arguably the biggest in tech right now), these developments carry a little more weight, since people are curious whether one of them might be as big a leap over ChatGPT as ChatGPT was over its predecessors.

The simple truth is that none of these models is going to be that kind of big step, because OpenAI’s advance was built on a fundamental change to machine learning architecture that every other company has since adopted and that has not yet been superseded. Incremental improvements, like a point or two better on a synthetic benchmark, or slightly more convincing language or images, are all we can hope for at the moment.

Does this mean that none of these models matter? Certainly not. You can’t get from version 2.0 to 3.0 without 2.1, 2.2, 2.2.1, and so on, and that is what researchers and engineers are diligently working on. Sometimes these advances are significant, correcting serious shortcomings or exposing unexpected vulnerabilities. We try to cover the most interesting ones, but that’s only a fraction of the total. We’re currently working on a piece collecting all the models we think the ML-curious should know about, and it’s on the order of a dozen.

Don’t worry: when a big one comes along, you’ll know about it, and not just because TechCrunch is covering it. It will be as obvious to you as it is to us.
