(Un)trustworthy AI

Raquel Oliveira
4 min read · Jun 17, 2024

--

The internet has always been notoriously good at making people do weird stuff. Remember that time hundreds of teenagers decided to eat Tide Pods? Or eat spoonfuls of cinnamon? The internet (and technology, in general) has been influencing people’s behavior ever since it first appeared.

So, why does AI make things so much worse? After all, people have been doing weird stuff long before AI was ever a thing.

Turns out, when people are exposed to decades of science fiction novels featuring super-intelligent machines, they start seeing artificial intelligence as having superhuman powers. And they’re not wrong; in some cases, technology does indeed exhibit superhuman capabilities.

Cars can outrun us, ATMs can count money faster and more accurately than we can, and cameras can see better in the dark. There is nothing new about our abilities being beaten by machines that we build; in fact, that is a large part of why we build machines in the first place.

The difference?

Humans have used tools and built machines for quite some time, but we haven’t typically attributed intelligence to them. We don’t look at our phones and think they have grown smarter when we install a new app, but as machines’ abilities grow, there is a temptation to see them as having actual intelligence.

As we begin attributing human traits, such as intelligence, to machines, we become increasingly inclined to treat and perceive them similarly to how we would another human being.

No technology has ever been perceived to be as intelligent as artificial intelligence, so it’s no wonder that people place a great deal of faith in it.

So much so that the phenomenon of overtrusting AI has been studied in the scientific community for some time now. ChatGPT has given false or incomplete information when answering patients’ queries about medication and about legal issues, and has made up sources when asked to justify its claims. These failures have been popularized as ‘AI hallucinations’.

But to say that AI hallucinates is a quaint notion when we really think about it. A hallucination, by definition, is a sensory perception that occurs without an external stimulus. Implying that AI hallucinates suggests it has some perception or understanding of the external world, which it does not. Models like ChatGPT are mathematically sophisticated but otherwise strangers to the truth.

A team of Scottish researchers recently published a brilliant (and very entertaining) paper explaining these issues in more depth:

The problem here isn’t that large language models hallucinate, lie, or misrepresent the world in some way. It’s that they are not designed to represent the world at all; instead, they are designed to convey convincing lines of text. So when they are provided with a database of some sort, they use this, in one way or another, to make their responses more convincing. But they are not in any real way attempting to convey or transmit the information in the database. As Chirag Shah and Emily Bender put it: “Nothing in the design of language models (whose training task is to predict words given context) is actually designed to handle arithmetic, temporal reasoning, etc. To the extent that they sometimes get the right answer to such questions is only because they happened to synthesize relevant strings out of what was in their training data. No reasoning is involved […] Similarly, language models are prone to making stuff up […] because they are not designed to express some underlying set of information in natural language; they are only manipulating the form of language” (Shah & Bender, 2022)

In other words, generative AI is designed to produce convincing and human-sounding snippets of text, a goal it typically achieves (or appears to achieve to the casual observer). This reinforces our expectation that it is accurate and knowledgeable and can therefore be relied upon. And each time it is right, we become more complacent and trusting, and less willing to give its outputs the scrutiny they deserve.
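To see how little “truth” is involved in that training task, consider a deliberately tiny sketch: a bigram model that only learns which word tends to follow which. This is a toy illustration of the general idea, not code from the paper or from any production system. Fed a handful of true sentences, it will happily recombine them into fluent, confident sentences that are false.

```python
import random
from collections import defaultdict

# Toy illustration: a bigram "language model" whose entire training task is
# predicting the next word from the previous one. (A real LLM is vastly more
# sophisticated, but the objective is the same in spirit.)
corpus = (
    "the capital of france is paris . "
    "the capital of spain is madrid . "
    "paris is a beautiful city . "
    "madrid is a beautiful city ."
).split()

# "Training": count which words follow which.
next_words = defaultdict(list)
for current, following in zip(corpus, corpus[1:]):
    next_words[current].append(following)

def generate(start: str, length: int = 8) -> str:
    """Produce fluent-looking text by sampling plausible next words, nothing more."""
    word, output = start, [start]
    for _ in range(length):
        candidates = next_words.get(word)
        if not candidates:
            break
        word = random.choice(candidates)
        output.append(word)
    return " ".join(output)

print(generate("the"))
# A possible output: "the capital of spain is paris ." -- grammatical, confident, wrong.
# Nothing in the model represents where capitals actually are; it only imitates word order.
```

Scaled up, the point is exactly the one the researchers make: the system is optimized to sound right, not to be right.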

This problem is compounded when you add embodiment to a technology. For example, in a study published in 2016 looking at whether people would follow erroneous directions from a robot in an emergency situation, the authors found that:

(…) all 26 participants followed the robot in the emergency, despite half observing the same robot perform poorly in a navigation guidance task just minutes before. (…) Even when the robot pointed to a dark room with no discernible exit the majority of people did not choose to safely exit the way they entered.

People do this because they are amazed by what AI can accomplish. And they have good cause to be. Arthur C. Clarke once wrote that “any sufficiently advanced technology is indistinguishable from magic,” and to many, ChatGPT looks exactly like that: magic. And who are we, then, to question an all-seeing eye?

And that question right there is the real risk.

The real risk of AI is not that it will hallucinate and tell you to eat at least one rock a day. The real risk is that people will become so confident in AI that they will actually do it.

--

Raquel Oliveira

Research scientist and enthusiast | Empowering people through technology