Google’s “AI Overview” can give false, misleading, and dangerous answers

invertedpanda

Ars Tribunus Militum
1,936
Subscriptor
It should be noted that the issue is actually rarely the AI itself, but Google's ranking system and featured snippets.

In most cases where I've tested these "bad AI results", the actual problem is that the AI is just rephrasing the top result that generates the featured snippet. As an example, the "How many rocks should I eat per day" one that's been making the rounds is actually just some gray-hat SEO of a featured snippet for a fracking company that cites an Onion article (SEO is complicated, folks).

So the problem already existed in the form of featured snippets: now it's just rephrased with AI.

Of course, Google has some real issues with handling these shifty SEO strategies, and it plays whack-a-mole constantly. I actually shut down one of my own websites because Google was unable or unwilling to handle low-effort content farms that gobble up content like mine, rewrite it using AI, and use a handful of additional techniques to edge out my original content in the rankings.
 
Upvote
85 (116 / -31)

torp

Ars Tribunus Militum
2,836
Subscriptor

This is a great example of:

  • Garbage in, garbage out. Even the LLM says it's from a Reddit post.
  • People having unrealistic expectations about LLMs. Perhaps this will convince everyone that they're parroting what they're fed and have no understanding or self-consciousness.
  • Google shooting itself in the foot. It's one thing to give a result like the Reddit suggestion as a link to the original post on Reddit. It's another thing entirely to put it in this overview, where it sounds like it's endorsed by Google.
 
Upvote
231 (235 / -4)

Jiggers

Ars Scholae Palatinae
868
As more and more of the training data is fake posts and web pages generated by AI, this is going to get a lot funnier, right up until it's not and the adults in charge wake up.

Don't forget Google is run on the social media model, where engagement and delivering eyeballs are valued ahead of quality, facts, and editorial responsibility, so it's no wonder their first public-facing AI is fraught with poor quality and fiction, with almost no editorial oversight.
 
Last edited:
Upvote
185 (186 / -1)

CarlSagan82

Smack-Fu Master, in training
77
Subscriptor
I listened to a podcast with Steve Gibson (GRC) this week, and he shared an email he sent out 25 years ago, when he first discovered Google and hardly anyone knew what this new search engine was. He described how incredible it was at the time and how it has degraded since. If this isn't the final straw, what is?
 
Last edited:
Upvote
88 (91 / -3)

balthazarr

Ars Praefectus
5,719
Subscriptor++
So far, I've found AI searches to be helpful, but only as a starting point. I use them for difficult-to-find fire code references in my job: whatever comes back from the AI, I then check against CAN/ULC or NFPA to verify its accuracy. It still saves me a ton of time, but my own liability, as well as that of my employer, would skyrocket if I ever took the AI's response as the only one to put into my work for our customers.
Upvote
115 (117 / -2)

I am of the opinion (especially in this situation, as a former firefighter myself) that fire code or any safety-critical use case is inappropriate for general AI searches/summaries. Possibly Retrieval Augmented Generation (RAG) built around your use-case domain would be accurate enough, but that's not what you are likely using. Your result validation is good, but for every person who does that, there are 10 who don't, or who aren't as diligent as you are. So please be careful about how you present this idea to others in your industry, so people don't get killed because of misuse or misunderstanding of this technology.
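A rough sketch of what "RAG built around your domain" could look like; the corpus, section labels, and scoring below are made-up placeholders (not real NFPA text), and real systems would use embedding search rather than keyword overlap:

```python
# Minimal RAG sketch: answer only from a vetted, domain-specific corpus,
# and surface the source so a human can verify it.
CORPUS = {
    "NFPA 72 sec. A (placeholder)": "visible notification appliances shall meet minimum intensity requirements",
    "NFPA 72 sec. B (placeholder)": "audible appliances shall exceed ambient sound levels",
}

def retrieve(query: str, corpus: dict) -> tuple:
    # Naive keyword-overlap retrieval; the principle -- ground the answer
    # in retrieved, citable text -- is what matters here.
    words = set(query.lower().split())
    return max(corpus.items(),
               key=lambda kv: len(words & set(kv[1].lower().split())))

section, passage = retrieve("minimum intensity for visible strobes", CORPUS)
# Constrain the model to the retrieved passage and force a citation.
prompt = (f"Answer using ONLY the passage below, and cite its section.\n"
          f"[{section}] {passage}\n"
          f"Question: What is the minimum intensity for visible strobes?")
print(prompt)
```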
 
Last edited:
Upvote
180 (180 / 0)

psarhjinian

Ars Praefectus
3,457
Subscriptor++
The problem, really, is money. Google makes a lot of it, but there's the possibility that they might make less money than they did before, or that someone else might make money that Google would like to make. That "not making money you feel you should be making" results in some serious maladaptive behaviour.

We talk about LLMs being just fancy pattern-imitation systems, but behaviourism drives a lot of what people do, too, and we're seeing a form of it here: executives get rewarded when the stock price goes up and punished when it goes down, so executives, like an LLM, will follow the pattern of doing anything and everything that makes the stock price go up, regardless of whether it'll make the stock price go down six months from now, kill your brand, or cause the rise of fascism in the West.

It's like the grey-goo issue, but for MBAs instead of nanomachines.
 
Upvote
195 (196 / -1)

inomyabcs

Smack-Fu Master, in training
97
Subscriptor
Since a year (except for a leap year) has 365 days, three months would be (365 days/year) * (3 months) = 1095 days.
This might explain why Gemini is having problems with math'ing dates.

I was able to get Gemini to return 92 days, after it went out and used a code library. You can see the full chat here.
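For comparison, the date arithmetic a code library does is trivial to check; a minimal sketch, assuming a June 1 to September 1 window (the actual dates in the chat may differ):

```python
from datetime import date

# Three calendar months, June 1 to September 1 (assumed example window):
# 30 + 31 + 31 = 92 days -- not (365 days/year) * 3, which is three YEARS.
start = date(2024, 6, 1)
end = date(2024, 9, 1)
print((end - start).days)  # 92
```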
 
Upvote
46 (46 / 0)

Little-Zen

Ars Praefectus
3,123
Subscriptor
This all just highlights that no part of any of this has ever been an “AI” and continuing to present it this way is misleading at best.

It’s just parroting back whatever it has been fed, including nonsense joke posts, or making up its own nonsense, and Google (and everyone else) is too focused on chasing advertising dollars to admit how inaccurate it can be, and thus how dangerous it is to present it as authoritative.

When it can independently analyze a source and determine whether the information in it is correct, recognize a joke, and understand the context in which information is being given (and requested; it seems to have a problem with that for some queries), then maybe they can consider calling it intelligent.


When your “AI” is a statistical model of the frequency and probability of certain words following other certain words in response to some words, and it has no way of actually understanding what those words mean, you’ve got yourself a chat help bot. A big one, sure, but one that has just been trained on a bigger dataset than the usual “one company’s useless, outdated knowledge base articles.”
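To make that concrete, here’s a deliberately tiny sketch of that kind of statistics: a bigram next-word sampler. (Real LLMs are neural networks over tokens, not lookup tables, but the predict-the-next-word objective is the same idea.)

```python
import random
from collections import defaultdict, Counter

# Count which words follow which words in a toy corpus.
corpus = "the cat sat on the mat and the cat ate the cream".split()
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def next_word(word: str) -> str:
    # Frequency-weighted sampling -- no understanding anywhere in here.
    counts = follows[word]
    return random.choices(list(counts), weights=list(counts.values()))[0]

# "the" is followed by "cat" twice and "mat"/"cream" once each,
# so this prints "cat" about half the time.
print(next_word("the"))
```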
 
Upvote
134 (137 / -3)

Publius Enigma

Ars Praetorian
636
Subscriptor
This is a great example of:

  • Garbage in, garbage out. Even the LLM says it's from a Reddit post.
  • People having unrealistic expectations about LLMs. Perhaps this will convince everyone that they're parroting what they're fed and have no understanding or self-consciousness.
  • Google shooting itself in the foot. It's one thing to give a result like the Reddit suggestion as a link to the original post on Reddit. It's another thing entirely to put it in this overview, where it sounds like it's endorsed by Google.
Are the people with unrealistic expectations of LLMs to whom you refer actually the CEOs of large tech companies?
 
Upvote
116 (116 / 0)
As more and more of the training data is fake posts and web pages generated by AI, this is going to get a lot funnier until it's not, and adults in charge wake up.
The problem self-magnifies, too. As AI discourages and demonetizes human content creators, the corpus of training data becomes more and more AI-generated, as you said. The “internet grey goo” concept isn’t new, but I’d take it a step further and compare it to genetic diversity. We all know what happens to a population that lacks an injection of new genetics… so what happens when people stop making new original content and all that’s left is one big pile of stagnant content, statistically rearranged over and over?
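You can watch that stagnation in a crude simulation: fit a word distribution, sample from it, refit on the samples, repeat. Once sampling noise drops a rare word, it never comes back. (Toy numbers, and real model collapse is more subtle, but the direction is the same.)

```python
import random
from collections import Counter

vocab = [f"word{i}" for i in range(50)]
corpus = random.choices(vocab, k=200)  # generation 0: "human" content

for gen in range(10):
    dist = Counter(corpus)                      # "train" on current corpus
    words, weights = zip(*dist.items())
    corpus = random.choices(words, weights=weights, k=200)  # publish samples
    print(f"gen {gen}: {len(set(corpus))} distinct words")  # diversity shrinks
```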
 
Last edited:
Upvote
165 (165 / 0)

sailaway777

Wise, Aged Ars Veteran
113
Sundar Pichai just keeps knocking it out of the park. Google leadership grossly misunderstands the value proposition of Google Search. That's the problem with a lack of visionary, insightful leadership: you keep chasing whatever new shiny thing others are doing. While not an Apple fan, I keep being astounded at just how good they are at execution, focus, and long-term thinking. In comparison, Google just seems like a drunk, headless chicken staggering along.
 
Upvote
174 (176 / -2)

gruberduber

Smack-Fu Master, in training
90
I am of the opinion (especially in this situation, as a former firefighter myself) that fire code or any safety-critical use case is inappropriate for general AI searches/summaries. Possibly Retrieval Augmented Generation (RAG) built around your use-case domain would be accurate enough, but that's not what you are likely using. Your result validation is good, but for every person who does that, there are 10 who don't, or who aren't as diligent as you are. So please be careful about how you present this idea to others in your industry, so people don't get killed because of misuse or misunderstanding of this technology.
This is part of why AI is dangerous in my opinion. People currently say that it's fine so long as you verify the answer, but very few people will continue to do that long term if they don't spot immediate problems themselves.

Any safety measure that relies on a constantly attentive and educated human is not going to last - and always taking the effort to verify a result that is usually fine falls into that category.

I know someone who started using LLMs to get answers for their work contracts which they "always verify themselves first". But how many times does it have to be correct before you start wondering if verifying is really necessary?

This person says that they still always double-check the answer, but it actually took less than a week before they started to just kind of wave it past without checking at all. It's just how most people are.

In my experience, if there are no immediate errors with something, then as soon as the novelty wears off and people can't be bothered anymore, they just start trusting it. And most people are not very diligent in the first place.

If AI is only safe because the humans who use it are diligent and educated every time we use it then we're screwed.
 
Upvote
190 (191 / -1)

Dinosaurius

Ars Centurion
353
Subscriptor++
I am of the opinion (especially in this situation, as a former firefighter myself) that fire code or any safety-critical use case is inappropriate for general AI searches/summaries. Possibly Retrieval Augmented Generation (RAG) built around your use-case domain would be accurate enough, but that's not what you are likely using. Your result validation is good, but for every person who does that, there are 10 who don't, or who aren't as diligent as you are. So please be careful about how you present this idea to others in your industry, so people don't get killed because of misuse or misunderstanding of this technology.
Former firefighter as well. :) I use it solely as a starting point, e.g., "What NFPA code references lumens per square foot for strobes in an office setting," to get me to the right NFPA code and section. I've heard enough horror stories about AI-generated content that precisely zero of it is used in anything given to the client.
I am of the opinion that, right now, AI is handy for speeding up the start-up portion of the process, but at least in my line of work, the end result must still be 100% human generated, and verified.
 
Upvote
3 (24 / -21)

wourm

Smack-Fu Master, in training
90
I am not surprised, considering LLMs don't actually know anything. As for training with Reddit and other social media… garbage in, garbage out still holds true, it seems.
Given that they're using Reddit as a source, I'm surprised that we didn't get a bunch of dick jokes in with the blinker fluid suggestions.
...the only content left is just one big pile of stagnant content statistically rearranged over and over?
We are nearly there.

My anecdote is about an ad in the Solitaire game that I play. An ad for a "Fake Video Call" app shows up, and any touch of the screen starts the install of the app. It's happened to me twice so far and is extremely irritating. I immediately uninstall it before it launches, hoping it hasn't planted any malicious code on my phone. So I searched Google for "how to stop Fake Video Call app from being installed automatically". All of the results were about sources for fake video call apps. Who the hell wants to install a fake video call app? What exactly IS a fake video call app?
 
Upvote
88 (88 / 0)

gruberduber

Smack-Fu Master, in training
90
Former firefighter as well. :) I use it solely as a starting point, e.g., "What NFPA code references lumens per square foot for strobes in an office setting," to get me to the right NFPA code and section. I've heard enough horror stories about AI-generated content that precisely zero of it is used in anything given to the client.
I am of the opinion that, right now, AI is handy for speeding up the start-up portion of the process, but at least in my line of work, the end result must still be 100% human generated, and verified.
As I've basically just mentioned in another comment: I agree, but I don't think the usage you describe is really possible for most people, and that's the problem.

If you hand the average human a complete-looking report about something that appears accurate, and usually is, but then say "don't forget to completely redo it yourself, checking every fact thoroughly," they just aren't going to do that; not with the attention to detail they would have used if they had to do it all from scratch. Most people are going to rely on what these things tell them (or at least reduce their scepticism of it over time), whether they should or not.
 
Upvote
96 (97 / -1)