Creating Sound Search Policies: Tough Calls From Google #SESSF Keynote Day Two

Posted in SES San Francisco

Welcome back to #SES San Francisco coverage, day two! Let’s begin our coverage today with this morning’s keynote address, Design Your Own Search Engine: Lessons from Tough Calls on Content at Google, presented by Patrick Thomas (@pthomas1620), policy specialist at Google, and Matt Cutts (@mattcutts), software engineering guru and head of the Webspam team at Google, who joined as a surprise guest.

With Mike Grehan (@mikegrehan), producer of the SES conference series, moderating, the keynote discussion started with an equally exhilarating and terrifying question: “What would you do if you were in charge of your own search engine?” In other words, how would you organize TRILLIONS of web pages online and decide what content to serve to searchers?

Cutts explained first that even if you’re the very best search engine in the world, you still need good policy guidelines or else you’ll be hurting – fast. And those policy guidelines are built on making many tough calls that come hand-in-hand with controversial search content. So let’s start talking about those difficult decisions and how you would handle them!

Consideration: When to get involved (or not)
Thomas and Cutts explained that Google receives requests all the time to get involved in “he-said, she-said” disagreements. For example, Cutts was recently sent a lawsuit that claimed Person A gave Person B a stake in his company with the understanding that Person B would help grow the business. Person B instead took money from Person A’s company, started a new, very similar business and somehow managed to direct search traffic away from the original company and to the new one. Person A was asking Google to help him take down company B, or at least help take it down in Google search.

It sounds like it would take a lot of time to figure out the “right” and “wrong” in this situation, and search engines need to make the call of whether it’s worth it, or even makes sense, to get involved.

Often, it’s not.

Google leans toward being as comprehensive as possible when it comes to getting the right results to users while still honoring laws. But there is always a grey area, especially when it comes to freedom of expression.

Consideration: What to suggest (and not)
Another example. When Googling “Bernie Madoff,” the first auto-complete suggestion is “Bernie Madoff Ponzi Scheme.” Most people would agree this is a fair result, knowing the background of the famous scandal that rocked the news world. But what if you met a new businessman you didn’t know, searched for his name and the first auto-complete suggestion was “His Name + Ponzi Scheme”? Is this a fair result for non-famous people? Considering it could ruin someone’s reputation, possibly not; then again, what if that person actually were running a Ponzi scheme? The suggestion could save many innocent people from some serious trouble.

In general, Google aims to give as much context as possible when displaying search results, but most people don’t understand that auto-complete suggestions are not decided on by Google; they pop up because those are the terms and phrases people are searching for most.

It’s important to note at this point that Google draws an important distinction between its core search results (the blue links) and its feature results, like auto-complete and the block of images that pops up when you’re doing a web search. For core search results, Google wants to reflect accurately what users are searching for. But for suggested feature results, Google has to be careful not to shock and offend anyone, so these take a limited amount of curation. For Google, this means removing the worst of the worst: sexually explicit and violent content, as well as hate speech.
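To make that distinction concrete, here is a minimal sketch of what “limited curation” might look like. This is not Google’s implementation; the blocked-term list, data fields and function names are invented purely for illustration:

```python
# Illustrative only: a toy model of "limited curation" for suggestion features.
# The blocked-term list, data fields and ranking logic are hypothetical, not Google's.

BLOCKED_TERMS = {"explicit-term", "violent-term", "hate-term"}  # placeholder categories

def rank_by_popularity(candidates):
    """Core principle: suggestions are ordered by how often users actually search them."""
    return sorted(candidates, key=lambda c: c["search_volume"], reverse=True)

def curated_suggestions(candidates, limit=10):
    """Popularity drives the ranking; curation only strips the worst of the worst."""
    ranked = rank_by_popularity(candidates)
    safe = [c for c in ranked
            if not any(term in c["query"].lower() for term in BLOCKED_TERMS)]
    return [c["query"] for c in safe[:limit]]

if __name__ == "__main__":
    demo = [
        {"query": "bernie madoff ponzi scheme", "search_volume": 90_000},
        {"query": "bernie madoff net worth", "search_volume": 40_000},
    ]
    print(curated_suggestions(demo))
    # ['bernie madoff ponzi scheme', 'bernie madoff net worth']
```

The point of the sketch is that the ranking itself stays driven by user behavior; the curation layer only subtracts a narrow class of suggestions, it never promotes anything.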

Consideration: Creating an editorial voice (or not)
Curating results is ultimately going to force a search engine to create an editorial voice. Cutts reminded the audience that there is no such thing as an objective reality, or objectively “correct” search results. For example, 100 different people who all have their own search engines will all come up with different top 10 results when searching “Barack Obama.” Every person will consider different information more or less important, and therefore, every search engine will naturally have its own editorial voice.

Part of creating this voice will come with deciding how to filter content.

Consideration: Filtering content (or not)
While Google prefers to limit its manual decisions or interventions when delivering search results, it does filter some things like pornography.

This raises important and interesting questions like, what is pornography? What about ambiguous queries that could be interpreted as completely innocent or very shocking? When do you serve racy content and when do you not? Also, what about content that might be sexually explicit to some people, but completely educational to others – like pictures of how to perform a breast exam? Or nude portraits?

Cutts told an interesting story about his first days at Google, when he was working on safe search. The first thing he did was search “sex” on AltaVista and only 25 results popped up! Clearly, AltaVista had a whitelist approach, but this can be risky. How do you choose those 25 results? And what about searches in other languages?

Consideration: What about violence?
Beyond pornography, violent content is another grey area in the content filtering spectrum. For example, there are many violent, gory images that are also very newsworthy. Where do you draw the line when filtering out these images?

While some people would argue that you don’t have to see violent images in order to grasp the news, Google errs on the side of freedom of expression and doesn’t filter violent images it deems newsworthy. Sometimes, however, Google won’t surface images when users perform a core search that might turn up violent content; accidentally showing users a bunch of violent images they didn’t want to see is probably not a good idea.

Bottom line, there are always going to be disagreements regarding how much content to filter, so really, you just have to make a call somewhere.


More Considerations: The Slippery Slope
Thomas went on to review a list of areas that search engines typically create filtering policies for. Which of the below areas would you choose to filter?

  • Viruses and malware
  • Spam
  • Personally identifiable information (credit card numbers)
  • Porn
  • Violent images
  • Hate content
  • Hacking instructions
  • Bomb-making instructions
  • Pro-anorexia and self-harm sites
  • Satanism and Wicca
  • Necrophilia
  • Content farms
  • Black Hat SEO farms

Cutts made an important point that it’s not as simple as choosing to filter out all “hate speech” and assuming the results will then be safe and comprehensive, because filtering out hate speech might also remove anti-hate sites.
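A toy example makes his point clear: a naive keyword filter can’t tell a hateful page from a page that documents or campaigns against hate. The pages, domains and keyword list below are invented for illustration; a real system would need far more context than string matching:

```python
# Toy example: why naive keyword filtering is a blunt instrument.
# The pages and keyword list are invented; real classifiers need far more context.

HATE_KEYWORDS = {"hate speech", "racial slur"}

pages = {
    "example-hate-forum.test": "a forum full of hate speech and racial slur content",
    "anti-hate-charity.test": "we document hate speech online and explain how to report a racial slur",
}

def naive_filter(pages):
    """Drop any page whose text mentions a blocked keyword."""
    return {url: text for url, text in pages.items()
            if not any(kw in text.lower() for kw in HATE_KEYWORDS)}

print(naive_filter(pages))  # {} -- both pages removed, including the anti-hate charity
```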

Search Policy: Some of Google’s Principles
After all those considerations, you’re likely wondering where exactly Google stands. So here are the principles Google uses to help form their search policies:

  • Our search results and features should be as comprehensive as possible, and we want to keep removals to a minimum.
  • We prefer algorithmic solutions over manual action.
  • We do want to help users avoid identity theft and fraud by removing certain types of sensitive information upon request, like bank numbers.
  • Don’t push shocking or offensive content to users if they haven’t asked for it.

AND, News From Webmaster Tools
Additionally, Matt Cutts made an exciting announcement regarding Webmaster Tools and backlinks. Before, when one would download recent backlinks from Webmaster Tools, the thousands of links were sorted alphabetically, and because of the volume, it was difficult to get an accurate picture of a site’s backlinks. Now, Webmaster Tools provides a much better sample, giving users randomly sampled links along with examples of links from different top-level domains. To learn more, head over to the Webmaster blog, or check it out yourself in Webmaster Tools, as the feature is now available.
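As a rough illustration of why sampling matters (the URLs and export limit below are hypothetical and have nothing to do with the actual Webmaster Tools export format): if you can only export the first chunk of an alphabetically sorted list, every linking domain later in the alphabet is invisible, while a random sample covers the whole link profile.

```python
import random

# Hypothetical backlink URLs; a real export would contain far more detail.
backlinks = [f"{domain}/page-{i}" for domain in
             ("aaa-directory.test", "news-site.test", "zzz-blog.test")
             for i in range(1000)]

EXPORT_LIMIT = 100  # pretend only the first 100 rows make it into the download

alphabetical_export = sorted(backlinks)[:EXPORT_LIMIT]   # only "aaa-directory" links survive
sampled_export = random.sample(backlinks, EXPORT_LIMIT)  # spread across all linking domains

def domains(links):
    return {link.split("/")[0] for link in links}

print(domains(alphabetical_export))  # {'aaa-directory.test'}
print(domains(sampled_export))       # typically all three domains
```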

That’s all for today’s keynote, stay tuned for more #SESSF 2013 coverage!
