
Google admits its AI Overviews need improvement, but we’re all helping it beta test

Google is embarrassed by its AI Overviews, too. After a deluge of dunks and memes over the past week mocking the poor quality and outright misinformation coming from the tech giant’s new AI-powered search feature, the company published a sort of mea culpa on Thursday. Google, a company whose name is synonymous with web search and whose brand centers on “organizing the world’s information” and putting it at your fingertips, actually wrote in a blog post that “some odd, inaccurate or unhelpful AI Overviews certainly did show up.”

That’s an understatement.

The admission of failure, penned by Google Vice President and Head of Search Liz Reid, seems a testament to how the rush to fold AI technology into everything has somehow made Google Search worse.

In the post, titled “About Last Week” (did this one get past PR?), Reid explains the many ways its AI Overviews make mistakes. While they don’t “hallucinate” or make things up the way other large language models (LLMs) might, she says, they can get things wrong for “other reasons,” such as “misinterpreting queries, misinterpreting a nuance of language on the web, or not having a lot of great information available.”

Reid also noted that some of the screenshots shared on social media over the past week were fake, while others were of nonsensical queries, like “How many rocks should I eat?”, which no one had ever really searched for before. Since there’s little factual information on that topic, Google’s AI guided a user to satirical content. (In the case of the rocks, the satirical content had been republished on a geological software provider’s website.)

It’s worth pointing out that if you had Googled “How many rocks should I eat?” and were presented with a bunch of useless links, or even a joke article, you wouldn’t be surprised. What people reacted to was the confidence with which the AI replied that “geologists recommend eating at least one small rock per day” as if it were a factual answer. It may not be a “hallucination” in technical terms, but the end user doesn’t care. It’s crazy.

What’s also troubling is that Reid claims Google “tested the feature extensively before launch,” including with “robust red teaming efforts.”

So, no one at Google has a sense of humor? Did no one think of any prompts that would generate bad results?

Additionally, Google downplayed the AI feature’s reliance on Reddit user data as a source of knowledge and truth. Although people have routinely appended “Reddit” to their searches for so long that Google finally made it a built-in search filter, Reddit is not a body of factual knowledge. And yet the AI points to Reddit forum posts to answer questions, without understanding when firsthand Reddit knowledge is helpful and when it isn’t — or worse, when it’s a troll.

Today, Reddit makes bank by licensing its data to companies like Google, OpenAI and others to train their models, but that doesn’t mean users want Google’s AI deciding when to search Reddit for an answer, or suggesting that someone’s opinion is fact. There’s a nuance to knowing when to search Reddit, and Google’s AI doesn’t understand it yet.

As Reid admits, “Forums are often a great source of authentic, first-hand information, but in some cases they can lead to less-than-helpful advice, like using glue to get cheese to stick to pizza,” referring to one of the AI feature’s most spectacular failures over the past week.

Google AI overview suggests adding glue to get cheese to stick to pizza, and it turns out the source is an 11-year-old Reddit comment from user F*cksmith 🙂 pic.twitter.com/uDPAbsAKeO

– Peter Yang (@petergyang) May 23, 2024

If last week was a disaster, at least Google is iterating quickly in response — or so it says.

The company says it has looked at examples from AI Overviews and identified patterns where it could do better, including building better detection mechanisms for nonsensical queries, limiting the use of user-generated content for responses that could offer misleading advice, adding triggering restrictions for queries where AI Overviews were not proving helpful, not showing AI Overviews for hard news topics “where freshness and factuality matter,” and adding additional triggering refinements to its protections for health searches.

As AI companies build ever-improving chatbots every day, the question is not whether they will ever overtake Google Search in helping us understand the world’s information, but whether Google Search will ever catch up on AI to challenge them in return.

As ridiculous as Google’s mistakes are, it’s too early to count it out of the race, especially given the massive scale of Google’s beta-testing team, which is essentially anybody who uses search.

“There’s nothing quite like having millions of people using the feature with many novel searches,” Reid says.

Source: TechCrunch
