Google

The screenshots below are from a video produced by Yelp.com and TripAdvisor.com. They show what Google local search results would look like if powered by local review sites. According to the study published at the website FocusOnTheUser.eu (FOTU), 23% of users prefer local search results powered by local review sites. What do you think?

Here is an example of FOTU's proposed local results for the query [hotel bilbao spain].
Yelp TripAdvisor EU proposed results

As you can see, local results powered by local review sites lack important details local searchers want. Because these results use relevancy algorithms instead of localized algorithms, the wrong kinds of businesses and businesses in the wrong location will appear frequently.

Here is another example from the FOTU website. In this case, it shows Google local results for [pediatrician nyc] powered by Zocdoc.com.
ZocDoc FocusOnTheUser.eu results

Clearly the results in the screenshots above are not better for local searchers than Google local search results. How did FocusOnTheUser.eu reach the opposite conclusion?

The study is based on results from previous studies. One study was debunked earlier this year by SearchEngineLand.com's Founding Editor Danny Sullivan. The other study found no significant increases when local results were set up like the ones in the FOTU video.

Usability experts and Google agree: when you know the answer to a user's query, it is best not to make users "click anywhere." That is the idea behind Google "instant" search results and why Google does not make users click local results when it "knows" what the user wants. When users mouse over local search results at Google, the local information users want is displayed on the right side of the page.

Google Local SERPS

It is impossible to compare Google local results without testing page functionality. FOTU did not address, test or measure Google local page functionality. The site claims to "preserve" a way for users to find information without having to click but provides no guidance for finding missing details. Even after clicking the links from the video, I could not find phone numbers or other important details. Instead of testing how Google local search results actually work, FOTU tested something else.

Instead of measuring success with user-focused metrics, FOTU measured success as increased click-through rates. The tools they used will tell you what users did but not what they intended to do. Users typically do not click away from information they intend to find, so clicks are not always the best measure of utility.

At 5:50 into the video, FocusOnTheUser.eu presents the results of its study. According to the video, the site proved local searchers prefer the proposed results over Google by 23%. These results are based on a "significant increase" in "click engagement," but where did those clicks go?

A number of clicks went to links created by the tool used for the study. These links do not exist in real world local search results.

Some clicks went to a cafe in a city 18 miles away. Other clicks went to a cafe clearly marked as being CLOSED. Still other clicks went to a duplicate of the first listing with different ratings and reviews. When you add all of these clicks together and add clicks caused by missing map pins you come up with almost 23%.

According to the site, ratings and reviews are especially important to users. Ratings and reviews are important, but only when they are for things users want to find. FOTU claims that having reviews and ratings exclusively provided by Google raises critical questions. That said, none of the data provided appears to reflect any clicks for ratings or reviews. In fact, most if not all of the clicks shown went to local business websites.

FOTU claims "Google promotes search results drawn from Google+ ahead of the more relevant ones you would get from using Google's organic search algorithm." I will talk about that more in a minute, but the FOTU widget proves that Google does not promote Google+ ahead of other results. The FOTU widget does, however, exclude all sites other than Google and local review websites.

FOTU wants Google to use standard relevancy algorithms for localized search results, but Google local search results are based primarily on "relevance, distance and prominence," not just relevance. FOTU provides a widget to demonstrate the differences between Google's search algorithm and the Google Maps algorithm used for local search queries. I used the widget for the screenshots below. As shown below, using standard algorithms for localized search queries is not in the best interest of users.

FOTU results

Yelp.com, TripAdvisor.com and other local review websites are not local businesses and do not have local business addresses. For that reason these sites probably should not rank ahead of true local businesses. The FOTU widget essentially excludes all local business websites from appearing in local search results.
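To make the difference between ranking on relevance alone and ranking on relevance, distance and prominence concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the `Listing` fields, the weights, the closeness formula and the two made-up cafes. It is not Google's algorithm, only a toy showing why a distant but highly "relevant" listing can outrank a nearby one when distance is ignored.

```python
import math
from dataclasses import dataclass


@dataclass
class Listing:
    name: str
    relevance: float   # 0..1 text match to the query (assumed to be precomputed)
    prominence: float  # 0..1 how well known the business is (assumed to be precomputed)
    lat: float
    lon: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    earth_radius_km = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def local_score(listing, user_lat, user_lon, w_rel=0.4, w_dist=0.4, w_prom=0.2):
    """Blend the three signals. The weights are hypothetical, not Google's."""
    distance = haversine_km(user_lat, user_lon, listing.lat, listing.lon)
    closeness = 1.0 / (1.0 + distance)  # 1.0 at the searcher's location, falls toward 0
    return w_rel * listing.relevance + w_dist * closeness + w_prom * listing.prominence


def rank_by_relevance(listings):
    """Standard relevance-only ordering."""
    return sorted(listings, key=lambda l: l.relevance, reverse=True)


def rank_local(listings, user_lat, user_lon):
    """Ordering that also accounts for distance and prominence."""
    return sorted(listings, key=lambda l: local_score(l, user_lat, user_lon), reverse=True)


# Hypothetical data: a searcher in central Bilbao and two made-up cafes.
USER_LAT, USER_LON = 43.263, -2.935
listings = [
    Listing("Far Cafe", relevance=0.95, prominence=0.5, lat=43.523, lon=-2.935),
    Listing("Near Cafe", relevance=0.80, prominence=0.5, lat=43.264, lon=-2.935),
]

print([l.name for l in rank_by_relevance(listings)])
print([l.name for l in rank_local(listings, USER_LAT, USER_LON)])
```

With these made-up numbers, the cafe roughly 29 km away ranks first on relevance alone, while the cafe around the corner ranks first once distance is part of the score.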

At the end of the day, these local sites want a "single conspicuous" link "directly to" their site from Google local results. It is important to remember that these are the same sites that blocked Google in the past and forced Google to invest billions of dollars in local search. Now they want Google to change things for them even though there is nothing to stop them from doing the same thing again.

Before governments force companies to change things based on allegations from potential competitors, I think it is important that an unbiased investigation be conducted.

 

According to Google, "everything is going Google+," but few search marketers truly understand what that means. Here are a few points to help bring everyone up to speed.

 

Google+ Sign In:

Even though keyword level data for signed in users is "Not Provided" in Google Analytics, Google's goal is to increase the number of signed in user searches.

According to a recent Google Jobs post:

Google+ Signed in users

"The mission of the search growth marketing team is to make that information universally accessible by enabling and educating users around the world to search on Google, search more often, and search while signed-in. Research and analysis has shown that putting Google search access points at the fingertips of users is an effective way of achieving these goals. And the more users that are signed in to Google, the better we can tailor their search results and create a unified experience across all of the Google products that they use."

When users are signed in, Google can better tailor search results and better target ads. Better ads and better search results increase Google's market share, not to mention ad revenue. Google+ is one of many programs intended to help increase the number of signed-in users.

Google+ Links:

In order to return relevant search results for human users based on what is important to human users, Google needs access to analyze content and links created by humans.

When Google and its "secret sauce" PageRank algorithm were originally developed, the web was a very different place than it is today. At that time, blogs, Tweets and Facebook did not exist. In the late 1990s, content and links tended to be created by humans and both were freely accessible to Google's crawlers. Back then important websites were "likely to receive more links from other websites." As a result, Google was able to leverage the "citation graph" of the internet to measure "importance" based on "people's subjective idea of importance."
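The idea of measuring importance from a "citation graph" can be sketched with the simplified, textbook form of PageRank. This is only an illustration under stated assumptions: the damping factor of 0.85, the fixed iteration count, and the tiny four-page graph are all chosen for the example, and Google's production ranking is certainly far more involved.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank.

    links maps each page to the list of pages it links to.
    Returns a dict of page -> score; the scores sum to 1.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score to the pages it links to, split evenly.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its score over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical graph: three pages link to "hub", and "hub" links back to "a".
example_links = {"a": ["hub"], "b": ["hub"], "c": ["hub"], "hub": ["a"]}
scores = pagerank(example_links)
for page, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(page, round(score, 3))
```

In this toy graph the page that receives the most links ends up with the highest score, and the page it links to inherits much of that importance. That is the behavior the paragraph above describes.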

Today, content and links tend to be created by software and not by humans. The best place to find high quality human made content and links today is deep within the password protected confines of social media websites. These issues are both problematic for Google because most social media sites prevent Google from accessing high quality content and links.

For all the skeptics, Google does appear to have billions of Facebook pages indexed. That being said, many of the Facebook pages that Google has indexed are duplicate content from Wikipedia, Facebook and other sources. In cases where Facebook pages are accessible to Google crawlers, outbound links are almost always password protected, nofollowed, disallowed via robots.txt or links to internal Facebook pages which cannot be crawled. As a result, Google is limited to extracting only external Facebook content and a few social media signals which can easily be spammed.

Google+ is like the internet used to be before social media websites existed and PageRank ruled the land. Google+ Ripples even provides a visual representation of impact-factor-like data similar to PageRank. PageRank or not, Google+ is a place where human-made content and links are accessible to Google. According to Google, Google+ represents the "unification of all of Google's services with a common social air." This "social air" makes Google+ a place where more important websites are still likely to receive more links than less important websites. Google+ is a new "citation graph" where Google can once again crawl human-crafted content and links to measure page importance based on people's subjective ideas about importance.

Google+ Spam Prevention:

Even if Google's crawlers could access the highest quality human crafted content and links on social media sites, fake content, reviews and unnatural link spam are of little value to Google. Without access to social media user account data, detecting these types of spam can be difficult.

According to anti-spam software experts, 40% of social media profiles are spam, and by 2014 as many as 15% of reviews on social media sites are expected to be fake. In order to help address these issues, on March 1, 2012 Google moved to a single unified privacy policy across all Google properties. With this new level of shared data, Google's Spam & Abuse Team (the same team that handles Gmail spam) has the most advanced systems in existence at its disposal to fight spam on Google+. Google+ has been designed to provide Google's Spam & Abuse Team with an almost endless selection of potential spam detection signals.

For example, and without going into too much detail, Google accounts that frequently send and receive Gmail, participate in Google+ Hangouts, watch YouTube videos and are associated with an Android phone that moves around town might be considered legitimate. On the other hand, if several accounts are associated with the same IP address and one is used to spam Blogger with duplicate blog posts authored by an associated account, each account could be considered untrustworthy.
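The kind of reasoning described above can be sketched as a simple scoring heuristic. To be clear, this is purely hypothetical: the `Account` fields, the point values and the shared-IP penalty are invented for illustration, and nothing here reflects the signals or weights Google actually uses.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Account:
    name: str
    ip: str
    uses_gmail: bool = False
    joins_hangouts: bool = False
    watches_youtube: bool = False
    has_mobile_device: bool = False
    posts_duplicate_content: bool = False


def trust_scores(accounts):
    """Return a dict of account name -> score; higher means more trustworthy."""
    scores = {}
    for account in accounts:
        # One point for each sign of ordinary, human-looking activity.
        positive = sum([
            account.uses_gmail,
            account.joins_hangouts,
            account.watches_youtube,
            account.has_mobile_device,
        ])
        # Hypothetical penalty for posting duplicate content.
        penalty = 4 if account.posts_duplicate_content else 0
        scores[account.name] = positive - penalty

    # Accounts that share an IP address with a duplicate-content poster lose points too.
    by_ip = defaultdict(list)
    for account in accounts:
        by_ip[account.ip].append(account)
    for group in by_ip.values():
        if len(group) > 1 and any(a.posts_duplicate_content for a in group):
            for account in group:
                if not account.posts_duplicate_content:
                    scores[account.name] -= 2
    return scores


# Hypothetical accounts; the IP addresses come from a documentation-only range.
accounts = [
    Account("alice", "203.0.113.10", uses_gmail=True, joins_hangouts=True,
            watches_youtube=True, has_mobile_device=True),
    Account("spam_poster", "203.0.113.99", posts_duplicate_content=True),
    Account("spam_author", "203.0.113.99"),
]
print(trust_scores(accounts))
```

A real system would weigh far more signals and would need to avoid punishing legitimate users who happen to share an IP address, such as people in the same office.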

It is difficult to say for sure which signals Google is currently using, but with Google+ the potential for future spam signals is nearly unlimited. Spam, ranking manipulation, impersonation, deceptive behavior, fake profiles and adding people to circles too aggressively are all violations of Google+ guidelines.

Google+ Identification:

In order for content to be authoritative and trustworthy, its source must be identifiable. At the same time, spammers usually set up multiple accounts using fictitious identities.

Google CEO and Co-Founder Larry Page has stated "It's really important to know the identity of people so you can share things and comment on things and improve the search ecosystem, you know, as you and as a real person. I think all those things are absolutely crucial. That is why we have worked so hard on Google+, on making it an important part of search."

Google+ was initially developed as an "identity service." The success of Google+ depends on users using their real name. Real names are entities and Google can use entity related data to infer additional information. This type of data can be especially helpful when it comes to returning better search results for queries where expertise is required, and for queries about a specific individual where multiple individuals have the same name.

According to Google, "The internet would be better if we knew you were a real person rather than a dog or a fake person. Some people are just evil and we should be able to ID them and rank them downward." In order to set up a Google+ Profile or Google+ Page for business, Google requires your "common name". In some cases, Google may require an image of the user's driver's license, proof of identification and/or references to verify a user's name as well as his/her identity. For an author's picture to appear in Google search results, Google requires authors to provide a "recognizable headshot" photo. Images like these not only help searchers recognize authors, they can also be used by Google facial recognition software in various ways to help fight spam.

For example, in the near future expect to see Google roll out Google+ custom URLs for a nominal fee, paid by credit card. Because credit card transactions are one method for verifying a user's identity, this approach allows Google to verify the identities of many users quickly and at scale.

Google believes that, "letting authors verify their name helps increase their credibility and trustworthiness in the eyes of their readers." In addition to name verification, Google+ provides tools for identity verification that Google can use to combat various forms of entity authentication fraud.

Google+ User Data:

Google can only collect personal information from users who are willing to provide personal information. According to a former Google employee, "Google could still put ads in front of more people than Facebook, but Facebook knows so much more about those people. Advertisers and publishers cherish this kind of personal information, so much so that they are willing to put the Facebook brand before their own."

Google+ allows Google to ask users for personal information that otherwise could not be collected. Without Google+, Google would have no reason to collect personal data like relationship status, employment, occupation, education or places lived. In addition to collecting direct user data, Google+ collects indirect user data from Google +1 buttons. Google +1 buttons have been widely adopted and are currently embedded within billions of webpages. According to Google, +1s provide contextual value when users are in the market for a particular product. It only stands to reason that +1s also allow Google to collect sentiment related data. Once collected, Google can translate this new gold mine of user data into increased ad revenue through targeted ads for signed in users.

As you can see, Google+ is far more than just another social network!

Search engines have focused on simply "matching keywords to queries" for years. This approach is slightly problematic, however, because it disassociates keyword meanings for multiple keyword queries. For example, search engines might interpret the query [Paris Hilton] (a proper noun and named entity) as simply a request for instances where the words "hilton" and "paris" appear within a page. Fortunately, with a large enough set of data, it is possible to make statistical inferences about the intent of a user's query. As a result, Google has relied on statistical inference for uncertain data queries like [Paris Hilton] and [b&b ab] (bed & breakfast in Alberta) for years.
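The [Paris Hilton] problem is easy to reproduce in a few lines. The sketch below contrasts plain keyword matching with a matcher that treats a known entity as a single unit. Note the substitution: instead of the statistical inference described above, it uses a hand-made set of known entities, which is an assumption made only to keep the example short. The function names and sample sentences are likewise invented.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9&]+", text.lower())


def keyword_match(query, document):
    """True if every query word appears somewhere in the document, in any order."""
    document_words = set(tokenize(document))
    return all(word in document_words for word in tokenize(query))


def entity_match(query, document, known_entities):
    """If the query is a known entity, require its words adjacent and in order.

    Falls back to plain keyword matching for queries that are not known entities.
    """
    query_words = tokenize(query)
    if " ".join(query_words) not in known_entities:
        return keyword_match(query, document)
    document_words = tokenize(document)
    width = len(query_words)
    return any(
        document_words[i:i + width] == query_words
        for i in range(len(document_words) - width + 1)
    )


# Hypothetical entity list and sample pages.
KNOWN_ENTITIES = {"paris hilton"}
hotel_page = "The Hilton hotel in Paris has rooms near the Louvre."
person_page = "Paris Hilton attended the premiere."

print(keyword_match("Paris Hilton", hotel_page))
print(entity_match("Paris Hilton", hotel_page, KNOWN_ENTITIES))
print(entity_match("Paris Hilton", person_page, KNOWN_ENTITIES))
```

Keyword matching accepts the hotel page because both words appear somewhere on it, while the entity-aware matcher accepts only the page that mentions the person.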

In 2010 Google purchased Metaweb Technologies, Inc., the company behind Freebase. Freebase was/is an "open, shared database of the world's knowledge". Before being acquired by Google, Metaweb was in the process of identifying millions of "entities and mapping out how they're related" via Freebase. In addition to entity mapping, Freebase also looks at what words other sites use to refer to entities. In May 2012 Google launched "Knowledge Graph," a "graph" which is built in part on Freebase. According to Google, Knowledge Graph can "understand real-world entities and their relationships to one another." Google hopes Knowledge Graph will improve search results and provide more immediate answers to users' questions in search results pages.

The concept behind Freebase and Google's use of graphed entities is pretty interesting, but I would like to know more about what is really going on under the hood of Google Knowledge Graph. Since Knowledge Graph launched, I have spent hours trying to break it, find bugs, discover issues and/or identify abnormalities. Remarkably, I must say, until last week I had found very little. Then, as they say, "it happened!" Last Thursday, while looking for a good example of Google Knowledge Graph results to use in a presentation, I got the search result below.

SERP for Matt Cutts

 

Suddenly it dawned on me: Matt did not go to UNC Law School!

 

Matt Cutts SERP

I clicked on "University of North Carolina School of Law" in Matt's Google Knowledge Graph result under his bio from Wikipedia, but it returned search results for another entity [university of north carolina at chapel hill]. From that result, I searched for [unc] and was returned this result.

Just to be sure what I was seeing was correct, I deleted all cookies, signed out of Google and restarted my browser. After refreshing all of my settings, I searched for [unc founded] and was returned this search result.

At that point, I realized that even UNC's founding date seemed off. I checked and, according to the University of North Carolina Planning Department, UNC was founded in 1793, not 1789. To be sure this was not the date UNC's Law School was founded, I checked the UNC School of Law website. According to the site, the first law professor did not arrive at UNC until 1845. Then I went back and checked Wikipedia's page for UNC, and it did not contain any of the text being displayed in Google's Knowledge Graph search results either.

With the suspected smoking gun already in hand, I went to Freebase.com and searched for [UNC]. You guessed it: Freebase.com's first result for [UNC] was exactly what had appeared in Knowledge Graph results, "University of North Carolina School of Law". It turns out Matt is not alone; all UNC graduates listed in Freebase.com are listed as UNC School of Law graduates even if they did not attend the UNC School of Law. At that point it was clear: Google Knowledge Graph "thinks" UNC and UNC's School of Law are the same entity because that is what Freebase.com is "telling" Google Knowledge Graph.
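One plausible way this kind of mix-up can arise is sketched below. It is a guess at the general mechanism, not a description of how Freebase or Knowledge Graph actually store or resolve entities: the dictionary layout, the ids and the two resolver functions are all hypothetical. The point is only that when two entities answer to the same short name, a resolver that takes the first match silently picks one of them.

```python
# Hypothetical entity records; the order mirrors the law school appearing
# first in the search results described above.
entities = [
    {
        "id": "unc_school_of_law",
        "name": "University of North Carolina School of Law",
        "aliases": {"unc"},
    },
    {
        "id": "unc_chapel_hill",
        "name": "University of North Carolina at Chapel Hill",
        "aliases": {"unc"},
    },
]


def resolve_first(alias, entities):
    """Return the first entity that answers to the alias, or None if there is none."""
    wanted = alias.lower()
    for entity in entities:
        if wanted in entity["aliases"]:
            return entity
    return None


def resolve_all(alias, entities):
    """Return every entity that answers to the alias, exposing any ambiguity."""
    wanted = alias.lower()
    return [entity for entity in entities if wanted in entity["aliases"]]


# First-match resolution attaches "UNC" to the law school without any warning.
print(resolve_first("UNC", entities)["name"])

# Listing all candidates shows that the alias is ambiguous and needs a second look.
print([entity["name"] for entity in resolve_all("UNC", entities)])
```

A data pipeline that checked for more than one candidate before linking a person's education to an entity would at least flag cases like this for review.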

Because Freebase data appears in Google Knowledge Graph search results and Google's main search results, this issue also means results for 100+ notable figures are potentially incorrect. For instance, according to Google Knowledge Graph results, US President James K Polk graduated from UNC's School of Law, but UNC's School of Law was founded when he was already in office.

Knowledge Graph Results for James K Polk

In addition to Matt Cutts and President Polk, search results for [Michael Jordan college] in Google's main search results are also incorrect due to this issue.

Knowledge Graph Results for Michael Jordan

Other UNC School of Law alumni according to Freebase and potentially Google Knowledge Graph, include Alge Crumpler, Lawrence Taylor, Andy Griffith, Rick Dees, Roger Mudd, Vince Carter, Jerry Stackhouse and even Thomas Layton, the former CEO of Metaweb.

This issue is potentially due at least in part to the fact that only a shell page for UNC (UNC being the parent University of UNC Law School) existed in Freebase.com until yesterday. To hopefully help improve the quality of Google Knowledge Graph results, I added an image, description, UNC's correct founding date and other information from UNC.edu to UNC's Freebase page yesterday.

With fingers crossed that Matt's wild and crazy UNC Law School days are not his best kept secret, that my site won't vanish from Google tomorrow and that the US Secret Service won't show up at my door, I removed "Law School" from both Matt's and President Polk's profiles in Freebase. As a result, Matt Cutts and President Polk are now the only non-Law School students / graduates in UNC's Freebase page. It will be interesting to see how long these changes take to appear in Google's Knowledge Graph search results.

Google Knowledge Graph is really interesting and seems to be working pretty well despite a few bugs. This is yet another edge case, but it is a situation you should know about. Instances where different entities have the same or similar names are problematic. Instances where multiple keywords are similar to multiple keyword entities are also problematic. Google may already be using Knowledge Graph data based on Freebase.com to determine whether or not content falls in or out of scope. For all of these reasons and others, it is important to keep an eye on Knowledge Graph results that relate to you. If you notice issues, click on "feedback" just below Knowledge Graph results on the right-hand side of Google search results pages.