Erasing Authors, Google and Bing’s AI Bots Endanger Open Web

With the massive growth of ChatGPT making headlines every day, Google and Microsoft have responded by showing off AI chatbots built into their search engines. It’s self-evident that AI is the future. But the future of what? 

At Tom’s Hardware, we’re all about pushing the bounds of technology to see what’s possible, so on a technical level, I’m impressed with how human the chatbot responses look. AI is a powerful tool that can be used to enhance human learning, productivity and fun. But both Google’s Bard bot and Microsoft’s “New Bing” chatbot are based on a faulty and dangerous premise: that readers don’t care where their information comes from and who stands behind it.

Built on information from human authors, both companies’ AI engines are being positioned as alternatives to the articles they learned from. The end result could be a more closed web with less free information and fewer experts to offer you good advice. 

Google demonstrated Bard both in a tweet and at a live-streamed event, where it provided not just factual answers but recommendations. The company was embarrassed when one of Bard’s answers turned out to be factually incorrect, but the problems with Bard and New Bing go way beyond inaccuracy.

Giving Advice, Without the Advisor

At the live event, Google SVP Prabhakar Raghavan said that Bard was a good choice to answer queries that he categorized as NORA (No One Right Answer). “For questions like those, we probably want to explore a diverse range of opinions or perspectives and be connected to the expanse of wisdom of the web,” he said. “New generative AI features will help us organize complex information and multiple viewpoints right in search results.” 

The “expanse of wisdom of the web” actually means that the bot is appropriating data from millions of articles written by humans who are neither credited nor compensated for their work. But you wouldn’t know any of that from viewing Bard’s output.

Raghavan showed Google answering “what are the best constellations to look for when stargazing?” Instead of surfacing the best articles from the web on this topic, the search engine displays its own mini-article with its own set of recommended constellations. There are no citations to tell you where these “multiple viewpoints” came from, and no authority to back up the assertions it makes.

(Image credit: Google)

Bard says “There are many constellations you can look for the next time you’re stargazing. Here are a few popular ones.” How do we know these are popular, and how do we know they’re the best constellations? On whose authority is the bot saying this? Apparently, we’re just supposed to trust it.

But trust is hard to come by. The company’s tweet showed an embarrassing exchange in which Bard gave a list of “new discoveries” from the James Webb Space Telescope (JWST), but one of its three bullet points wasn’t a new discovery at all. According to NASA, the first pictures of an exoplanet were actually taken in 2004, long before the JWST launched in 2021.

(Image credit: Google)

A lot of critics will justifiably be concerned about possible factual inaccuracies in chatbot results, but we can likely assume that, as the technology improves, it will get better at weeding out mistakes. The larger issue is that the bots are giving you advice that seems to come from nowhere – though it was obviously compiled by grabbing content from human writers whom Bard is not even crediting.

Devaluing Expertise, Authority and Trust

Ironically, Google’s initiative goes against the key criteria the company says it prioritizes in ranking organic searches. The company advises web publishers that it considers E-E-A-T (Experience, Expertise, Authoritativeness and Trustworthiness) when deciding which articles appear at the top of a results page and which at the bottom. It takes into account criteria such as the background of the author, whether that author describes relevant experiences and whether the publication has a reputation for being trustworthy. By definition, a bot can meet none of these criteria.

When there is no one right answer, that’s when human content matters the most. We want advice from a trustworthy source, someone with expertise in the topic and a set of professional opinions based on experience. 

If we know the author, we can then make judgments about the trustworthiness of the information. This is why, when I go to family gatherings, relatives approach me and ask questions like “what laptop should I buy?” They could just do a web search for that question (and maybe they should), but they know me and believe that I won’t steer them wrong.

Google hopes that, in a world where everyone has an opinion and spouts off on social media, expertise no longer matters to users. The company expects you to read whatever appears at the top of your screen, believe it as readily as you would anything else, and stay on Google’s page so you can keep viewing its ads, a source of revenue, uninterrupted. If you leave the search engine, you land on someone else’s website, where Google may or may not be in charge of serving the ads.

Bing Shows Sources But Buries Them

Bing’s new chatbot implementation is slightly better than Google’s in that it actually shows sources, but it buries them in tiny footnotes, some of which aren’t even visible unless you click a button to expand the answer.

(Image credit: Tom’s Hardware)

Even if the citations in Bing’s chatbot were more prominent, it is still offering advice that would be better coming from a person with expertise. In the example above, the bot puts together a three-course vegetarian dinner with a chocolate dessert. But how do I know that these are the best recipes, or that the dishes pair well together?

I’d sooner trust the top search result, which was written by a human who has cooked before and thought about how these dishes work together. To Bing’s credit, though, it puts its own content next to the organic search results rather than pushing them down.

(Image credit: Tom’s Hardware)

You might argue that the demo topics Google and Microsoft chose are highly subjective. Some people will think Orion is the best constellation to look at while others will pick Ursa Major. And that’s exactly why the author matters. 

We’re all guided by our biases and, rather than trying to hide them, the best authors embrace and disclose them. If you’ve read some of my laptop reviews, you know that touch typing is important to me. I’ve tried out hundreds of laptop keyboards in my career; I can tell you which ones feel mushy and which are tactile and snappy. And keyboard comfort will factor into my recommendations if you ask me which models are the best ones. If you don’t care about keyboard quality or like spongy keys, you might decide to give my ratings (or that aspect of them) less weight. 

We don’t know whose biases sit behind the bot’s lists of recipes, constellations or anything else. The data could even be drawn from commercial sources.

AI Bots Could Harm the Open Web

I’ll admit another bias. I’m a professional writer, and chatbots like those shown by Google and Bing are an existential threat to anyone who gets paid for their words. Most websites rely heavily on search as a source of traffic and, without those eyeballs, the business model of many publishers is broken. No traffic means no ads, no ecommerce clicks, no revenue and no jobs. 

Eventually, some publishers could be forced out of business. Others could retreat behind paywalls and still others could block Google and Bing from indexing their content. AI bots would run out of quality sources to scrape, making their advice less reliable. And readers would either have to pay more for quality content or settle for fewer voices. 
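
Opting out is mechanically simple. Here is a minimal sketch of what such a publisher’s robots.txt might look like (a hypothetical example; note that, as things stand, this would also remove the site from ordinary search results, since the same crawlers feed both the search index and the bots):

# Hypothetical robots.txt: block Google’s and Bing’s crawlers entirely.
# Caveat: this also drops the site from regular search listings.
User-agent: Googlebot
Disallow: /

User-agent: Bingbot
Disallow: /

That all-or-nothing trade-off is part of the problem: a publisher can’t currently refuse to feed the chatbot without also giving up the search traffic it depends on.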

At the same time, it’s clear that these bots are being trained by indexing the content of human writers, most of whom were paid by publishers for their work. So when the bot says Orion is a great constellation to view, that’s because at some point it visited a site like our sister site Space.com and gathered information about constellations. How the models were trained is a black box, so we don’t know which exact websites led to which factual assertions, though Bing gives us some idea with its citations.

Whether what Google and Bing are doing constitutes plagiarism or copyright infringement is open to interpretation and may be determined in court. Getty Images is currently suing Stability AI, the company behind the Stable Diffusion image generator, for using more than 12 million of its pictures to train the model. I can imagine a large publisher following suit.

A couple of years ago, Amazon was credibly accused of copying products from its own third-party sellers and then making its own Amazon-branded versions – which, unsurprisingly, came up higher in Amazon’s internal search. This sounds familiar.

You could argue that Google’s LaMDA (which powers Bard) and Bing’s OpenAI engine are just doing the same thing that a human author might do. They are reading primary sources and then summarizing them in their own words. But no human author has the kind of processing power or knowledge base that these AIs do.

You could also argue that publishers need a better source of traffic than Google – and we do – but most users have been trained to treat search as their first stop on the Internet. Back in 1997, people would go to Yahoo, browse through a directory of websites, find ones that matched their interests, bookmark them and come back again. This was a lot like the audience model for TV networks, magazines and newspapers, as most people had a few preferred sources of information that they relied on regularly.

But most Internet users have been trained to go to distribution platforms such as Google, Facebook and Twitter and navigate the web from there. You can like and trust a publisher, but you often get to that publisher’s website by finding it on Google or Bing, which you can reach directly from your browser’s address bar. It will be difficult for any publisher or even group of publishers to change this deeply ingrained behavior.

More importantly, if you are a consumer of information, you are being treated like a bot, expected to imbibe information without caring about where it came from or whether you can trust it. That’s not a good user experience. 

Note: As with all of our op-eds, the opinions expressed here belong to the writer alone and not Tom’s Hardware as a team. 
