Google

Earlier today Google launched www.wesolveforx.com. The site is reported to be the new website for Google's top-secret "X-Lab project", but one secret is already out.

The content seen by users at www.wesolveforx.com actually resides on an internet marketing agency's website, at http://www.thinkbelieveact.com/solveforx/. In other words, Google's new site www.wesolveforx.com is currently little more than a domain name. To make things worse, this content is indexed on the agency's website under both the www and non-www versions of the URL.
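For anyone who wants to check a list of indexed URLs for this kind of www/non-www duplication, here is a minimal sketch. The function names and the sample list are my own, and the rule for what counts as "the same page" (ignore the scheme, a leading "www." and a trailing slash) is an assumption, not how Google decides.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def duplicate_key(url):
    """Return a key that ignores the scheme, a leading 'www.' and a trailing slash."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/") or "/"
    return host + path


def group_duplicates(urls):
    """Group URLs that share a key and keep only groups with more than one URL."""
    groups = defaultdict(list)
    for url in urls:
        groups[duplicate_key(url)].append(url)
    return {key: found for key, found in groups.items() if len(found) > 1}


# Sample list for illustration only.
indexed = [
    "http://www.thinkbelieveact.com/solveforx/",
    "http://thinkbelieveact.com/solveforx/",
    "http://www.wesolveforx.com/",
]
duplicates = group_duplicates(indexed)
print(duplicates)
```

Only the two agency URLs end up in the same group, which is the duplication described above.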

Background information aside, what interests me is that my +1 for www.wesolveforx.com was actually credited to Google's agency website, www.thinkbelieveact.com, instead of where I intended. While this makes sense given the technical issue at hand, Google's agency does appear to be benefiting in some ways from this situation. To be fair, though, this situation may not have been easily avoidable, because adding the +1 button to pages causes Google to ignore disallow directives in robots.txt and meta noindex tags. For that reason, maybe some trade-offs had to be made; I'm not sure.

Either way, it seems like preventing +1 buttons from appearing in framed content might be a good idea?

UPDATE: Larry Page mentioned earlier today that the new site is now live. As of the time of this update, however, the pages on Google's agency site are also still live.

Hangar One MT View, CA


According to MV Voice, Google's founders (H211 LLC) plan to spend $32 million to clean up and restore a 200-foot-tall 1930s icon next to Google's headquarters in order to lease the structure.

If you've ever been on the 101 between San Francisco and San Jose, CA, or watched MythBusters, you've probably seen Hangar One at Moffett Airfield near Mountain View, CA. If not, it's massive: it covers 8 acres and could hold nearly 10 football fields. The hangar was designed by the Goodyear Zeppelin Corporation in the 1930s for airships but has since fallen into disrepair.

Over the holidays, Google rolled out a pretty major update to Webmaster Tools. This latest update provides much more detail in terms of data and reporting. So much, in fact, that some folks now seem confused about the difference between Google Webmaster Tools and Google Analytics. The big difference for SEO is that Google Webmaster Tools shows Google's own data for URLs in Google SERPs and doesn't track web pages the way Google Analytics does. In addition to that key difference in reporting, Google Webmaster Tools requires no installation. While it's difficult to say for sure, this update should force folks to abandon the ignorance-is-bliss mentality when it comes to analytics reporting once and for all.

BUT before diving in, here is a little background...

In 2005, with a little help from a Googler named Vanessa Fox, Google launched Google Sitemaps. This program has since evolved into what we know as Google Webmaster Central. About the same time, Google bought Urchin and shortly after made Google Analytics free to everyone. Back then, small to medium-sized sites that couldn't afford enterprise analytics relied primarily on ranking reports to measure search visibility.

Ranking reports are created with software that emulates users and sends automated queries to search engines. The software then records data about positioning within organic search results by specific keywords and URLs. Ranking reports don't provide bounce rates, but they do provide an important metric for measuring SEO ROI directly from Google SERPs. That being said, automated queries from ranking software are expensive for search engines to process and, as a result, they are a direct violation of search engine guidelines.

In 2010 Google introduced personalization features in organic search engine results. These personalized results are based on the user's history, previous query, IP address and other factors determined by Google. Over the past two years, Google's personalized search results have rendered ranking reporting software nearly useless.

Enter Analytics… Without accurate ranking reports, analytics may seem like a decent alternative tool for measuring SEO ROI by URL, but is that really the case? If analytics were enough, why did Google recently update Google Webmaster Tools? These are a couple of the questions that I hope to answer.

To start off, let's establish a few laws of the land...

Google Webmaster Tools Update Case Study: Redirects

Experiment: To compare 301 and 302 reporting accuracy between Google Analytics and Google Webmaster Tools

Hypothesis: Google Analytics incorrectly attributes traffic when 301 and/or 302 redirects are present.

Background: Google ranks pages by URL; for that reason, accurate reporting by specific URL is critical. In order for Google Analytics to record activity, a page must load and the Google Analytics JavaScript must execute. Google Analytics reports on pages, not URLs. While most "pages" have URLs, not all URLs result in page views. This is the case when 301 and/or 302 server-side redirected URLs appear in search results.

Procedure: For this comparison, I created apples.html and allowed it to be indexed by Google. I then created oranges.html and included a noindex meta tag to prevent indexing until the appropriate time. After apples.html ranked in Google's SERPs, it was 301 redirected to oranges.html and the results were recorded.

Result: According to Google Analytics, oranges.html is driving traffic from Google SERPs via "apples" related keywords. Google Webmaster Tools, on the other hand, reports each URL individually by keyword and notes the 301 redirect.

Conclusion: Google Analytics reports that oranges.html is indexed by Google and ranks in Google SERPs for apples.html related keywords. However, reporting that data to clients would be a lie. Oranges.html hasn't been crawled by Google and isn't actually indexed in Google SERPs. Secondly, until Google crawls and indexes the URL oranges.html, it is impossible to determine how or if it will rank in Google search results pages. In addition, this data becomes part of the historical record for both URLs and is calculated into bounce rates for URLs not shown in SERPs.

(Google's Caffeine has improved the situation for 301 redirects, as the time between discovery and indexing is reduced.)
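To make the attribution problem concrete, here is a minimal sketch of the apples.html / oranges.html case. It is a toy model, not how either tool is implemented: the redirect table, the function names and the click counts are all my own assumptions. The only idea it encodes is that a search-side report counts the URL that was clicked in SERPs, while page-based analytics counts the page that finally loads.

```python
from collections import Counter

# Hypothetical redirect table mirroring the experiment: source -> (status, target).
REDIRECTS = {"/apples.html": (301, "/oranges.html")}


def landing_page(url, redirects, max_hops=10):
    """Follow server-side redirects and return the URL where a page finally loads."""
    for _ in range(max_hops):
        if url not in redirects:
            return url
        _status, url = redirects[url]
    raise ValueError("redirect loop or chain longer than max_hops")


def build_reports(serp_clicks, redirects):
    """Return (report by clicked SERP URL, report by loaded page) for the same clicks."""
    by_serp_url = Counter(serp_clicks)
    by_loaded_page = Counter(landing_page(url, redirects) for url in serp_clicks)
    return by_serp_url, by_loaded_page


# Three made-up clicks on the URL that is actually ranking.
clicks = ["/apples.html"] * 3
search_report, analytics_report = build_reports(clicks, REDIRECTS)
print("by SERP URL:   ", dict(search_report))
print("by loaded page:", dict(analytics_report))
```

In this model every click lands in the oranges.html row of the page-based report, even though apples.html is the only URL that appeared in search results.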

Google Webmaster Tools Update Case Study: Complex redirects

Experiment: To compare differences in tracking via multiple redirects from SERPs ending on off-site pages.

Hypothesis: Multiple redirects ending off-site are invisible to Google Analytics because there is no page load.

Background: Google ranks pages by URL; for that reason, accurate reporting by URL is critical. In order for Google Analytics to record activity, a page must load and the Google Analytics JavaScript must execute. While most "pages" have URLs, not all URLs render pages. In most cases, 301 issues are resolved by engines over time; 302 issues, however, will remain. The same is the case for multiple redirects ending off-site.

(For those who aren't aware, this is one way spammers try to trick Google into "crediting" their site with hundreds, thousands and sometimes even hundreds of thousands of content pages that actually belong to someone else.)

Procedure: To test how Google Analytics handles multiple redirects, I created page1.html which 302 redirects to page2.html which 301 redirects to another-domain.com. Google indexes the content from another-domain.com but SERPs show it as residing at the URL page2.html.

Result: Despite these URLs being ranked in SERPs, Google Analytics has no data for them. Google Webmaster Tools reports the first two URLs and notes the redirects.

Conclusion: Google Webmaster Tools recognizes the existence of the URLs in question, while Google Analytics doesn't at all, and that is a major problem. For SEO reporting, these URLs are critical. The content is real and it's impacting users as well as Google.
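The chain from the procedure above can be sketched the same way. Again, this is a toy model under stated assumptions: example.com stands in for my test site, the redirect table is hard-coded rather than fetched, and "analytics sees it" simply means the page that finally loads is on the host where my tracking code lives.

```python
from urllib.parse import urlsplit

# Hypothetical host for the test site; the chain follows the procedure above.
SITE_HOST = "example.com"
REDIRECTS = {
    "http://example.com/page1.html": (302, "http://example.com/page2.html"),
    "http://example.com/page2.html": (301, "http://another-domain.com/"),
}


def trace(url, redirects, max_hops=10):
    """Return the hops as (url, status) pairs; the last hop is the page that loads."""
    hops = []
    while url in redirects:
        if len(hops) >= max_hops:
            raise ValueError("redirect loop or chain longer than max_hops")
        status, target = redirects[url]
        hops.append((url, status))
        url = target
    hops.append((url, 200))  # assumed to load normally once the redirects end
    return hops


def analytics_sees(hops, site_host):
    """On-site tracking code can only fire if the final page is on our own host."""
    final_url, _status = hops[-1]
    return urlsplit(final_url).netloc == site_host


hops = trace("http://example.com/page1.html", REDIRECTS)
for url, status in hops:
    print(status, url)
print("recorded by on-site analytics:", analytics_sees(hops, SITE_HOST))
```

Neither page1.html nor page2.html ever renders a page, so in this model there is nothing for on-site tracking code to record, even though both URLs exist as far as a search engine is concerned.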

Google Webmaster Tools Update Case Study: Installation

Experiment: To compare tracking without Google Analytics tracking code installed.

Hypothesis: Google Analytics won't track if tracking code is not installed properly on each page within site architectures supporting analytics.

Background: In order for Google Analytics to record data, it must be implemented correctly in each page and be able to communicate with Google. Legacy pages without the Google Analytics tracking code often rank in SERPs but go unnoticed because they're invisible to analytics. In addition to this issue, there are various other situations where untracked content appears in Google's index. Even when implemented properly, analytics tools are often prevented from reporting due to architectural problems.

Procedure: To test how Google Analytics works without proper installation, I set up an account but DID NOT implement the Google Analytics tracking code snippet into pages.

Result: Google Analytics reports that there has been no traffic and that the site has no pages, but Google Webmaster Tools reports as usual: impressions by keyword and by URL, CTR and other metrics.

Conclusion: In order to function properly, Google Analytics must be implemented in each and every page, in addition to being supported by the site architecture. Google Analytics requires extensive implementation in many cases, which is an extra obstacle for SEO. Google Webmaster Tools data comes direct from Google and requires no implementation, and verification is easy.
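One way to hunt for legacy pages like these is to scan page source for the tracking snippet. Below is a minimal sketch using only Python's standard library. It assumes the snippet can be recognized by the host name google-analytics.com appearing in a script tag, which will miss customized or proxied setups, and the class, function names and sample pages are hypothetical.

```python
from html.parser import HTMLParser

# Assumption: the tracker is detectable by this host name appearing in a script.
TRACKER_HINT = "google-analytics.com"


class TrackingCodeFinder(HTMLParser):
    """Set .found to True if a <script> src or inline script mentions the tracker."""

    def __init__(self):
        super().__init__()
        self.found = False
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
            if TRACKER_HINT in (dict(attrs).get("src") or ""):
                self.found = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        if self._in_script and TRACKER_HINT in data:
            self.found = True


def has_tracking_code(html):
    finder = TrackingCodeFinder()
    finder.feed(html)
    finder.close()
    return finder.found


def untracked_pages(pages):
    """pages maps URL -> HTML source; return the URLs with no tracking code found."""
    return sorted(url for url, html in pages.items() if not has_tracking_code(html))


# Made-up pages for illustration.
pages = {
    "/index.html": "<html><head><script src='http://www.google-analytics.com/ga.js'>"
                   "</script></head><body>Home</body></html>",
    "/legacy.html": "<html><head><title>Old page</title></head><body>Hi</body></html>",
}
print(untracked_pages(pages))
```

A check like this only proves the snippet is present in the source. It says nothing about whether the code actually executes, which is the architectural problem mentioned above.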

Google Webmaster Tools Update Case Study: Site Reliability

Experiment: To see how Google Analytics tracks pages when a website goes offline.

Hypothesis: Google Analytics will not track site outages.

Background: In order for Google Analytics to record data, it must be properly implemented, supported by the site's architecture and able to communicate back and forth with Google.

Procedure: To test how Google Analytics reports when a site goes offline, I turned off a website with Google Analytics installed.

Result: Google Analytics reports no visitors and/or other metrics but suggests nothing about the real cause. Google Webmaster Tools reports errors suggesting the site was down.

Conclusion: Google Analytics does not report site outages or outage error URLs whereas Google Webmaster Tools does. For SEO, site uptime is critical.
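Since analytics alone can't say why a period shows zero visits, the gap has to be filled with uptime data from somewhere else. The sketch below assumes you already have results from some external uptime probe (one HTTP status code per check, or None when the connection failed). The function name, the labels and the 5xx threshold are my own choices for illustration.

```python
def explain_zero_traffic(pageviews, probe_statuses):
    """Classify a reporting period from analytics pageviews and uptime probe results.

    probe_statuses holds one entry per probe: an HTTP status code, or None when
    the connection failed outright.
    """
    if pageviews > 0:
        return "traffic recorded"
    if not probe_statuses:
        return "unknown: no uptime data"
    # Treat connection failures and 5xx responses as "site down" for this sketch.
    failures = [s for s in probe_statuses if s is None or s >= 500]
    if len(failures) == len(probe_statuses):
        return "outage: every probe failed"
    if failures:
        return "partial outage: some probes failed"
    return "site was up: no visitors or tracking code not firing"


print(explain_zero_traffic(0, [None, None, 503]))
print(explain_zero_traffic(0, [200, 200, 200]))
```

The point is simply that "zero pageviews" maps to several very different causes, and only outside data can tell them apart.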

Final thoughts...

As illustrated above, analytics will report keywords for URLs that aren't indexed and won't report keywords for URLs that are indexed in SERPs. Analytics is unaware of redirected URLs, even those indexed by Google and seen by users worldwide. Analytics can't tell the difference between a brief lack of visitors and periods of site downtime. It's possible for analytics tracking code to fire without pages loading, and for pages to load without firing tracking code. Analytics doesn't know framed content is indexed, or about legacy pages without tracking, alternative text versions of Flash pages, how long pages take to load, and on, and on, and on....

In fairness, the tool is doing what it is designed to do; folks using it just don't understand the limitations. Oftentimes, they aren't aware that data is fragmented and/or missing, or that site architecture impacts reporting ability. Checking Google to see if SERPs jibe with reports never occurs for some reason.

I've been kvetching about these issues for years, to anyone and everyone who would listen. If you can't tell, few things F R U S T R A T E me more.

The case studies above represent just a few ways in which analytics data is skewed due to bad and/or missing data. Believe it or not, a substantial amount of analytics data is bogus. According to one Google Analytics Enterprise partner, 44% of pages with analytics have analytics errors. On average, analytics only tracks about 75% of traffic. Analytics is a weird beast: when something goes wrong, nothing happens in analytics, and sometimes it happens on invisible pages. :)

Bad data attacks like a virus from various sources, polluting reporting exponentially, silently and undetected over time. Sadly, very few folks, including most "analytics experts", have the experience or expertise to track down issues like these by hand. Until now, there has been no tool to report analytics not reporting. The recent Google Webmaster Tools update empowers webmasters by providing them with the best data available. This update exposes analytics issues. It also places the burden of proving data measurement accuracy back on the folks responsible for it.

Oh yeah, HAPPY NEW YEAR!

In case you missed it, Logitech finally released Revue for Google TV. While I'm really excited about the Revue launch and Google TV, I'm a little concerned by Google TV's website. I know, Google doesn't have the best "track record" when it comes to things like SEM, but Google.com/tv is especially bad.

The new Google TV minisite has two duplicate "homepages", one in Flash and one not in Flash. This kind of duplicate content acts to thin both keyword relevancy and PageRank. In addition to thinning, Google filters non-malicious duplicate content from SERPs. Neither of Google TV's homepages provides a properly formatted meta description. In fact, the meta description for the non-Flash version is empty, and that version has a different TITLE element than the Flash version. As far as content goes, neither of Google TV's homepages provides accurate "alternative" textual content for users without Flash and/or JavaScript. For tracking, they seem to be using a customized version of Google Analytics, possibly even with heat mapping functionality, but it fires before the onload event, which slows PageSpeed and degrades user experience.
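The TITLE and meta description problems are easy to check for mechanically. Here is a minimal sketch that compares the head of two duplicate pages. The two HTML strings are made up for illustration (they are not Google TV's actual markup), and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class HeadAudit(HTMLParser):
    """Collect the <title> text and the meta description from one HTML document."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None  # None means the tag is missing entirely
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = (attrs.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def audit(html):
    parser = HeadAudit()
    parser.feed(html)
    parser.close()
    return parser.title.strip(), parser.description


def compare(page_a, page_b):
    """Return a list of problems found across two supposedly duplicate pages."""
    (title_a, desc_a), (title_b, desc_b) = audit(page_a), audit(page_b)
    problems = []
    if title_a != title_b:
        problems.append("titles differ")
    for label, desc in (("first", desc_a), ("second", desc_b)):
        if not desc:
            problems.append(label + " page has no usable meta description")
    return problems


# Made-up markup standing in for a Flash and a non-Flash homepage.
flash_page = ("<html><head><title>Example TV</title>"
              "<meta name='description' content='TV meets web.'></head></html>")
plain_page = ("<html><head><title>Example TV - Home</title>"
              "<meta name='description' content=''></head></html>")
print(compare(flash_page, plain_page))
```

A missing description tag and an empty one are treated the same here, since neither gives a search engine anything to show in a snippet.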

Search while you type isn't a new concept. It has been around for years, but it doesn't really work for users. Predictive text, on the other hand, does work for users; it's really simple and extremely fast. Google's new Instant search results combine predictive analysis with instant results and a new scroll to search feature that automatically suggests predicted queries to users, all in real time. These predictions are based on years of data and billions of previous searches, but Google's results are the same. What makes Instant search radically different is speed and, most of all, the feedback it gives users.

For example, Instant search provides users with feedback about misspellings and provides suggested spellings for queries before users even search. This is no accident. Google knows users intend to spell queries correctly and has been working behind the scenes on spelling improvements for some time. Improvements like these have been integrated in and rolled out with Google Instant. Instant feedback about misspellings is great for users but may come as a shock to webmasters focused on tricking users with misspellings as their SEO strategy. Google Instant is no threat to ethical, white hat SEO efforts or unique quality content that users value. Bottom line, users "aren’t going to fundamentally change what they’re looking for."

In addition to improved spelling, Google has also recently improved how it handles proper nouns. Better handling of proper nouns helps Google extract more information, especially about named entities. Proper nouns and named entities often share a common trait: they're capitalized. Google already has entity-related patents and recently acquired Metaweb, a company that specializes in this field. Mapping multiple named entities to one "thing" increases the data captured about each entity as well as the whole. Improved understanding of named entities improves data about potentially vital pages and increases the quality of results, even in the absence of relevant keywords. All of these help Google Instant entice users to explore more of the space around their query. Combining better spelling with better quality and better targeted results decreases the percentage of "unique queries". These unique queries are difficult to monetize and often result in a poor experience.

The day before Google launched Instant search, CEO Eric Schmidt said, "Never underestimate the importance of fast!" When it comes to speed, Google Instant makes other sites, including Yahoo and Bing, seem much slower by comparison. Google Instant is so fast, in fact, that it increases the perceived latency of other sites. This factor could help increase Google's market share, especially if dedicated users start leaving other engines for Google. An increased focus on site performance is no doubt more critical than ever before.

A few other notes: Instant has Google's improved triggering for realtime queries, and that could tie directly into their "Social Layer" scheduled for release in Q4 2010. You'll find Google Squared technology in Instant results for queries like [inventor of airplane]. You may notice that the scroll to search feature actually pushes results down the page and that the footer search box is no more. It's possible that Google Instant's GUI emphasizes images, video and highly positioned AdWords ads (colored background) more, because they flash in and out of view in certain cases.

What does the near future hold for SEO, PPC and analytics? In coming months, expect to see more data missing from analytics and fragmented across Google Webmaster Tools and/or other sources. It's quite possible Google Instant thwarts automated queries on some levels and, for that reason, ranking reporting software may be even more inaccurate. AdWords impression data will be less accurate for testing and virtually worthless in terms of historical comparison.