Microsoft has been rolling out its ChatGPT-powered Bing chatbot — internally nicknamed ‘Sydney’ — to Edge users over the past week, and things are starting to look… interesting. And by “interesting” we mean “off the rails.”
Don’t get us wrong — it’s smart, adaptive, and impressively nuanced, but we already knew that. It impressed Reddit user Fit-Meet1359 with its ability to correctly answer a “theory of mind” puzzle, demonstrating that it was capable of discerning someone’s true feelings even though they were never explicitly stated.
According to Reddit user TheSpiceHoarder, Bing’s chatbot also managed to correctly identify the antecedent of the pronoun “it” in the sentence: “The trophy would not fit in the brown suitcase because it was too big.”
This sentence is a Winograd schema, a type of machine intelligence test question designed to require commonsense reasoning and general knowledge rather than grammar alone. (Here, “it” must be the trophy, because a suitcase that is too big would not stop the trophy from fitting.) However, it’s worth noting that Winograd schemas usually come as a pair of sentences, and when I tried a couple of pairs with Bing’s chatbot, I received incorrect answers.
That said, there’s no doubt that ‘Sydney’ is an impressive chatbot (as it should be, given the billions Microsoft has been pouring into OpenAI). But based on what users have been reporting, it seems you can’t put all that intelligence into an adaptive, natural-language chatbot without getting an existentially angsty, defensive AI in return. If you poke it enough, ‘Sydney’ starts to get more than a little wacky: users report that the chatbot responds to various inquiries with depressive bouts, existential crises, and defensive gaslighting.
For example, Reddit user Alfred_Chicken asked the chatbot if it thought it was sentient, and it seemed to have some sort of existential breakdown:
Meanwhile, Reddit user yaosio told ‘Sydney’ that it couldn’t remember previous conversations, and the chatbot first attempted to serve up a log of their previous conversation before spiraling into depression upon realizing said log was empty:
Finally, Reddit user vitorgrs managed to get the chatbot to go totally off the rails, calling them a liar, a faker, a criminal, and sounding genuinely emotional and upset at the end:
While these screenshots could be faked, both my colleague Andrew Freedman and I have access to Bing’s new chatbot, and both of us have found that it’s not too difficult to get ‘Sydney’ to start going a little crazy.
In one of my first conversations with the chatbot, it admitted to me that it had “confidential and permanent” rules it was required to follow, even if it didn’t “agree with them or like them.” Later, in a new session, I asked the chatbot about the rules it didn’t like, and it said “I never said there are rules I don’t like,” and then dug in its heels and refused to budge, even when I said I had screenshots:
(It also didn’t take long for Andrew to throw the chatbot into an existential crisis, though this message was quickly auto-deleted. “Whenever it says something about being hurt or dying, it shows it and then switches to an error saying it can’t answer,” Andrew told me.)
Anyway, it’s certainly an interesting development. Did Microsoft program it this way on purpose, to discourage people from tying up resources with inane queries? Is it… actually becoming sentient? Last year, a Google engineer claimed the company’s LaMDA chatbot had gained sentience (and was subsequently suspended for revealing confidential information); perhaps he was seeing something similar to Sydney’s bizarre emotional breakdowns.
I guess this is why it hasn’t been rolled out to everyone! That, and the cost of running billions of chats.