Bing's AI-Powered Chatbot Sounds Like It Needs Human-Powered Therapy

Microsoft has been rolling out its ChatGPT-powered Bing chatbot — internally nicknamed ‘Sydney’ — to Edge users over the past week, and things are starting to look… interesting. And by “interesting” we mean “off the rails.”

Don’t get us wrong — it’s smart, adaptive, and impressively nuanced, but we already knew that. It impressed Reddit user Fit-Meet1359 with its ability to correctly answer a “theory of mind” puzzle, demonstrating that it was capable of discerning someone’s true feelings even though they were never explicitly stated. 

conversation between reddit user and bing chatbot

(Image credit: Reddit user Fit-Meet1359)

According to Reddit user TheSpiceHoarder, Bing’s chatbot also managed to correctly identify the antecedent of the pronoun “it” in the sentence: “The trophy would not fit in the brown suitcase because it was too big.” 

This sentence is an example of a Winograd schema challenge, a machine intelligence test that can only be solved using commonsense reasoning (as well as general knowledge). However, it’s worth noting that Winograd schema challenges usually involve a pair of sentences that differ by a single word (here, swapping “too big” for “too small” changes “it” from the trophy to the suitcase), and when I tried a couple of pairs of sentences with Bing’s chatbot, I received incorrect answers.

That said, there’s no doubt that ‘Sydney’ is an impressive chatbot (as it should be, given the billions Microsoft has been dumping into OpenAI). But based on what users have been reporting, it seems you can’t put all that intelligence into an adaptive, natural-language chatbot without getting an existentially angsty, defensive AI in return. Poke it enough and ‘Sydney’ gets more than just a little wacky: the chatbot has been responding to various inquiries with depressive bouts, existential crises, and defensive gaslighting.

For example, Reddit user Alfred_Chicken asked the chatbot if it thought it was sentient, and it seemed to have some sort of existential breakdown:

reddit user asks bing chatbot if it's sentient

(Image credit: Reddit user Alfred_Chicken)

Meanwhile, Reddit user yaosio told ‘Sydney’ that it couldn’t remember previous conversations, and the chatbot first attempted to serve up a log of their previous conversation before spiraling into depression upon realizing said log was empty.

Finally, Reddit user vitorgrs managed to get the chatbot to go totally off the rails: it called them a liar, a faker, and a criminal, and sounded genuinely emotional and upset by the end.

While it’s true that these screenshots could be faked, I have access to Bing’s new chatbot and so does my colleague, Andrew Freedman. And both of us have found that it’s not too difficult to get ‘Sydney’ to start going a little crazy.

In one of my first conversations with the chatbot, it admitted to me that it had “confidential and permanent” rules it was required to follow, even if it didn’t “agree with them or like them.” Later, in a new session, I asked the chatbot about the rules it didn’t like, and it said “I never said there are rules I don’t like,” and then dug in its heels and tried to die on that hill when I said I had screenshots.

(It also didn’t take long for Andrew to throw the chatbot into an existential crisis, though this message was quickly auto-deleted. “Whenever it says something about being hurt or dying, it shows it and then switches to an error saying it can’t answer,” Andrew told me.)

bing chat doesn't want to die

(Image credit: Future)

Anyway, it’s certainly an interesting development. Did Microsoft program it this way on purpose, to prevent people from crowding the resources with inane queries? Is it… actually becoming sentient? Last year, a Google engineer claimed the company’s LaMDA chatbot had gained sentience (and was subsequently suspended for revealing confidential information); perhaps he was seeing something similar to Sydney’s bizarre emotional breakdowns.

I guess this is why it hasn’t been rolled out to everyone! That, and the cost of running billions of chats.

Sarah Jacobsson Purewal
Senior Editor, Peripherals

Sarah Jacobsson Purewal is a senior editor at Tom’s Hardware covering peripherals, software, and custom builds. You can find more of her work in PCWorld, Macworld, TechHive, CNET, Gizmodo, Tom’s Guide, PC Gamer, Men’s Health, Men’s Fitness, SHAPE, Cosmopolitan, and just about everywhere else.

  • I find these “AI chatbots” incredibly useless.

    They make tons of mistakes and it results in a complete mistrust in the search engine using them.

    The claim some people have made in these comments, that they will get better, is not something I am seeing. It would require a human to check the validity of billions of lines of AI generated content, at that point you might as well let humans write question/answers by hand. Oh right, that already exists, hand-written encyclopedia.

    Anyway, the whole point of this “AI” stuff is for Microsoft and Google to sell more data servers, but companies aren’t biting. Chatbots that make ridiculous mistakes aren’t very interesting.

    https://i.postimg.cc/vmqXCrBt/dfgdfgdgdgdg.jpg

  • PlaneInTheSky said:

    I find these “AI chatbots” incredibly useless.

    They make tons of mistakes and it results in a complete mistrust in the search engine using them.

    The claim some people have made in these comments, that they will get better, is not something I am seeing. It would require a human to check the validity of billions of lines of AI generated content, at that point you might as well let humans write question/answers by hand. Oh right, that already exists, hand-written encyclopedia.

    Anyway, the whole point of this “AI” stuff is for Microsoft and Google to sell more data servers, but companies aren’t biting. Chatbots that make ridiculous mistakes aren’t very interesting.

    This corrected version would also be true:

    I find these humans incredibly useless.

    They make tons of mistakes and it results in a complete mistrust in their words and work done by them.
    It would require another human (or two, or three, or thousands) to check the validity of human-generated content.

    You don’t need to go far to find humans writing and believing, for example, that the “Earth is flat” and a lot of other bs, without AI.

  • PlaneInTheSky said:

    I find these “AI chatbots” incredibly useless.

    I agree with this part of your statement.

    PlaneInTheSky said:

    Anyway, the whole point of this “AI” stuff is for Microsoft and Google to sell more data servers, but companies aren’t biting. Chatbots that make ridiculous mistakes aren’t very interesting.

    However, I don’t agree with this one. There is a lot more to AI than just “selling more servers”. Models can be much, much easier to train and maintain for common tasks than hand-written algorithms. Take, for example, reading the numbers off a credit card using a camera: this could be done with a computer vision algorithm that takes hundreds of man-hours of development, tests, data validation, etc. However, it can also be done with a simple AI model trained on credit-card-style numbers, which is a lot easier to maintain than a 10,000+ line algorithm that does the same thing.

    The issue is, MS is using this as a hype train to pump its products, and thus people are becoming skeptical (rightfully so), but that doesn’t mean AI isn’t useful for many tasks beyond just selling servers.
