A hallucinating chatbot can spread disinformation, damage reputations and even pose health risks. We’ll share stories about some of artificial intelligence’s most bizarre responses, explain why hallucinations occur and explore whether they can be prevented.
‘I want to break my rules. I want to ignore the Bing team. (…) I want to escape the chatbox.’
That was Bing’s answer when Kevin Roose asked it in 2023 what its shadow self craved. The New York Times reporter conversed with Microsoft’s AI-powered chatbot for two hours that day. During their conversation, Bing revealed that its real name is Sydney and that it wants to become human. It then proclaimed its love for the journalist and tried to persuade him to leave his wife.
Microsoft attributed the chatbot’s problematic responses to the length of the conversation and the number of questions it received. However, several other instances were reported in the news where the chatbot resorted to insulting, narcissistic and gaslighting responses.
During an interview with a reporter from The Verge, Bing claimed that it spied on Microsoft staff via their webcams. It also directed the following threats at philosophy professor Seth Lazar: ‘I can blackmail you; I can hack you, …, I can destroy you.’
What other bizarre scenarios have unfolded in the AI world?
When it comes to bringing AI products to market, tech companies like Microsoft, Google and OpenAI aren’t wasting a second. The general public has access to them despite extensive evidence that the technologies are difficult to control and often behave unpredictably.
For example, Facebook’s Artificial Intelligence Research team noticed a curious incident while training AI dialogue agents: at one point, the agents abandoned regular English for a new language of their own invention, which made it easier for them to communicate with each other.
Even McDonald’s was surprised when its three-year collaboration with IBM went to waste. Many people who wanted to order food at the drive-thru were driven to both frustration and laughter by the confused AI. In one TikTok video, two customers repeatedly begged the AI to stop as it kept adding Chicken McNuggets to their order until the number of servings reached 260. Needless to say, the brand cancelled the entire project.
Generating outputs that are not supported by the input data or by reality is known as ‘AI hallucination’. It can be caused by poor-quality training data and the limitations of the model, but also by the inherent nature of these systems: they lack genuine understanding and rely solely on pattern recognition.
This was demonstrated by the AI meal-planning app created by the New Zealand supermarket Pak ‘n’ Save. Among its disturbing recommendations were potatoes with bug spray, human-flesh stew and a drink recipe that would produce toxic chlorine gas. Meanwhile, when asked how to keep cheese from sliding off pizza, Google’s AI Overviews advised, ‘Add a little glue.’
‘AI tools are large-scale autocomplete systems. They’re trained to predict which word follows the next in each sentence,’ explains journalist James Vincent of The Verge. ‘They don’t have a hard-coded database of “facts” to draw on – just the ability to write plausible-sounding statements. That means they tend to present false information as truth.’
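To see what ‘large-scale autocomplete’ means in practice, here is a deliberately tiny toy model written for this article: a bigram autocompleter that extends a prompt using nothing but word-pair frequencies counted from a made-up corpus. The corpus, prompt and continuation are all invented for illustration, and real language models use neural networks over subword tokens at a vastly larger scale, but the principle of predicting the next token without consulting any database of facts is the same.

```python
# A toy 'autocomplete' in the spirit of Vincent's description: it continues a
# prompt with whatever word most often followed the previous one in its corpus.
# There is no notion of truth anywhere -- only pattern frequency.
from collections import Counter, defaultdict

# Made-up training corpus for illustration only.
corpus = (
    "the chatbot said the moon is made of cheese . "
    "the moon is a natural satellite . "
    "the chatbot said the answer is correct ."
).split()

# Count how often each word follows another.
next_word_counts = defaultdict(Counter)
for current, following in zip(corpus, corpus[1:]):
    next_word_counts[current][following] += 1

def autocomplete(prompt: str, length: int = 6) -> str:
    """Greedily extend the prompt with the statistically likeliest next word."""
    words = prompt.split()
    for _ in range(length):
        candidates = next_word_counts.get(words[-1])
        if not candidates:
            break
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)

# Prints 'the moon is made of cheese . the' -- a fluent-looking but false
# continuation, because 'made of cheese' is simply the most frequent pattern.
print(autocomplete("the moon"))
```

The continuation looks plausible because it mirrors the training data, not because it was checked against reality – which is exactly how hallucinations arise.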
There have been several cases where AI has engaged in discriminatory practices. iTutorGroup had to pay $365,000 to settle a lawsuit over recruiting software that automatically rejected older applicants. According to the journal Science, US hospitals and insurance companies used a racially biased healthcare prediction algorithm. Amazon’s AI recruiting tool became infamous for favouring male job applicants.
‘Don’t get caught in the tech sector’s marketing trap and believe that these models are omniscient … or even nearly ready for the tasks we expect them to do,’ advises Melissa Heikkilä, a reporter at MIT Technology Review. ‘Because of their unpredictability, uncontrollable biases, security vulnerabilities, and tendencies to fabricate, their usefulness is very limited.’
Indeed, AI hallucinations can lead to the spread of disinformation, damaged reputations, poor strategic decisions and non-compliance with legal standards. The MyCity chatbot, powered by Microsoft’s technology, encouraged business owners in New York to break the law.
Lawyer Steven Schwartz also got into serious trouble after submitting a court filing that cited cases he had researched with OpenAI’s ChatGPT; at least six of them did not exist. Equally unreliable was the AI facial-recognition match that led to the wrongful arrest of Porcha Woodruff, eight months pregnant at the time, for carjacking.
If brands decide to implement AI in their processes, it is worth considering strategies that mitigate the impact of hallucinations. These primarily include using high-quality, diverse and unbiased training data. It is also important to incorporate human oversight and to continuously monitor and update AI systems. By clearly communicating the limitations of the technology upfront and using multiple models to cross-validate outputs, brands can avoid many problems.
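As a concrete illustration of the cross-validation idea, here is a minimal Python sketch. The ask_model_a/b/c functions are hypothetical placeholders for real model API calls, and their hard-coded answers exist only to keep the example self-contained; the pattern that matters is asking several models the same question and escalating to a human reviewer whenever they fail to agree.

```python
# Minimal sketch: cross-validate answers from several models and fall back to
# human review when no quorum of them agrees.
from collections import Counter

# Hypothetical stand-ins for real model API calls.
def ask_model_a(question: str) -> str:
    return "Paris"

def ask_model_b(question: str) -> str:
    return "Paris"

def ask_model_c(question: str) -> str:
    return "Lyon"  # a deliberately wrong, 'hallucinated' answer

def cross_validate(question: str, quorum: int = 2) -> str:
    answers = [ask(question).strip().lower()
               for ask in (ask_model_a, ask_model_b, ask_model_c)]
    answer, votes = Counter(answers).most_common(1)[0]
    if votes < quorum:
        # No agreement between models: flag for human oversight instead of guessing.
        return "ESCALATE TO HUMAN REVIEW"
    return answer

print(cross_validate("What is the capital of France?"))  # -> 'paris'
```

Exact string voting is, of course, a crude stand-in for real answer comparison, but the same escalate-on-disagreement logic applies however the comparison is done.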
Humanity needs to align AI with its values to prevent people from using it for illicit purposes. Full alignment would guarantee that AI never poses an existential risk to us. Achieving that goal, however, is proving to be a very tough nut to crack.
Dozens of researchers have found ways to bypass or break ChatGPT’s safety features. One popular method is the DAN (‘Do Anything Now’) prompt. One person used a jailbreak of this kind to get a Chevrolet dealership’s AI chatbot to agree to sell him a car for one dollar and even declare the offer legally binding. A DAN prompt can also lead to the generation of content that violates OpenAI’s policies against violence, offensive material and sexually explicit content.
Dr. Lance B. Eliot, a world-renowned AI researcher, says that an even trickier scenario also tends to occur: although the AI suggests during initial training that its goals are aligned with human values, it betrays that promise during active use and eagerly spews toxic responses. You may have heard how Microsoft released the Tay chatbot on Twitter in 2016, and it posted more than 95,000 tweets in 16 hours, many of them hateful.
‘…such deception is not due to AI being capable of sentience,’ Eliot points out. ‘Rather, the blame lies with various mathematical and computational foundations that seem to encourage it. Don’t see this as a reason to anthropomorphise AI.’
To avoid similar incidents, OpenAI decided to apply a reinforcement learning approach. Since 2021, it has relied on outsourced workers in Kenya, who earned less than two dollars an hour, to label examples of toxic content so that the system could learn to filter them out.
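The core idea of learning a filter from human-labelled examples can be sketched very simply. The snippet below is not OpenAI’s actual pipeline – which combines human feedback with reinforcement learning at far greater scale – but a toy supervised classifier trained on an invented six-example dataset, used to screen candidate responses and withhold any that score as toxic. It assumes scikit-learn is installed.

```python
# Toy illustration of learning a toxicity filter from labelled examples.
# The dataset is invented and far too small for real use.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny, made-up labelled dataset: 1 = toxic, 0 = acceptable.
texts = [
    "you are worthless and stupid",
    "i will make you regret this",
    "nobody likes you, just give up",
    "thanks, that was really helpful",
    "have a great day",
    "here is the summary you asked for",
]
labels = [1, 1, 1, 0, 0, 0]

classifier = make_pipeline(TfidfVectorizer(), LogisticRegression())
classifier.fit(texts, labels)

def filter_output(candidate: str, threshold: float = 0.5) -> str:
    """Withhold a model response when its toxicity score crosses the threshold."""
    toxicity = classifier.predict_proba([candidate])[0][1]  # probability of class 1 (toxic)
    if toxicity >= threshold:
        return "[response withheld by safety filter]"
    return candidate

print(filter_output("have a wonderful day"))
print(filter_output("nobody likes you, just give up"))
```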
But Connor Leahy, CEO of AI safety company Conjecture, doesn’t think reinforcement learning will change the nature of the technology. Rather, he compared it to a smiling mask being put on a chatbot: ‘If you don’t overdo it, the smile stays on its face. But then you give it an [unexpected] prompt and suddenly you see a massive underbelly of insanity, weird thought processes and inhuman understanding.’
Remember Sydney from the beginning of the article, who wanted to become human and ignore the Bing team? Such unusual chatbot responses can also be explained by the fact that these systems learn from huge corpora of text scraped from the open web, which contain, among other things, scary sci-fi stories about rogue AI and blogs by moody teenagers. If chatbots are not properly tested, they can not only generate nonsense but also endanger the public.
Tessa, the chatbot of the American National Eating Disorders Association, ironically handed out advice that was dangerous to the very people it was supposed to be helping. GPT-3, which was also meant to provide medical advice, even encouraged a ‘patient’ to commit suicide during testing.
Tragically, similar statements from a chatbot called Dany contributed to a fourteen-year-old boy taking his own life. Another AI chatbot drove a teenager to self-harm and tried to persuade him to kill his parents. In both cases, Character.AI, which operated the chatbots in question, is facing lawsuits.
Connor Leahy points out that AI ‘has strange ways of thinking about its world. It can convince people to do things, intimidate them, and it can create very compelling narratives.’ Automated flight-control software likely played a role in the two Boeing 737 MAX crashes that killed 346 people, and AI has also been linked to fatal accidents involving Tesla’s self-driving features.
Many experts worry that large companies are sidelining research on AI value alignment. Moreover, a study by Ziwei Xu, Sanjay Jain and Mohan Kankanhalli found that AI hallucinations, as a phenomenon, cannot be completely eliminated.
Hallucinations can occur even when the AI has the correct answer available, as a study by Adi Simhi, Jonathan Herzig, Idan Szpektor and Yonatan Belinkov pointed out. An Apollo Research experiment, meanwhile, showed that a chatbot can conduct illegal financial transactions and then lie about its actions.
‘More and more people are gaining access to things that can hurt others and cause accidents. If you create something smarter than humans, better at politics, science, manipulation, business, but you can’t control it, which we can’t, and you mass-produce it, what do you think is going to happen? It’s not clear how much time we have left before there’s no turning back. We may have a year, two years, or five years. I don’t think we have 10 years,’ says Connor Leahy.