Search Engine Optimization

While Google has condemned buying and selling links that pass PageRank, they've encouraged listing in paid directories like Yahoo for years. It seems that era may have come to an end earlier today. The following bullet points have been removed from Google's Webmaster Guidelines in the Webmaster Help Center*:

  • "Have other relevant sites link to yours."
  • "Submit your site to relevant directories such as the Open Directory Project and Yahoo!, as well as to other industry-specific expert sites."

Does this recent move reflect a renewed emphasis by Google on rooting out paid links that pass PageRank and/or low-quality links?

*As mentioned, the bullet points above have been removed from the US version of Google's Webmaster Help Center. Other versions may not yet reflect this change.
-----------

UPDATE: Hat tip to Barry Schwartz, who noticed John Honeck's post in Google Groups where Google's John Mueller comments on the change. Barry provides a full recap at SERoundtable.com and SearchEngineLand.com.

Google recently modified how they show results and, in doing so, virtually crippled at least one popular "rank checking software" package. As "JohnMu" pointed out, Google has always been clear about its stance on these kinds of tools. In fact, for as long as I can remember, Google's Webmaster Guidelines have clearly stated:

"Don't use unauthorized computer programs to submit pages, check rankings, etc. Such programs consume computing resources and violate our Terms of Service."

Bottom line: automated queries consume resources without any potential for generating revenue. It has been said that Google's "ultimate selection criterion is cost per query, expressed as the sum of capital expense (with depreciation) and operating costs (hosting, system administration, and repairs) divided by performance."

During Q4 2007, Google reported capital expenses of $678 million with operating costs of $1.43 billion. According to ComScore, 17.6 billion "core searches" were conducted on Google during the same period. Using Google's formula and financial data along with ComScore's estimates, it appears as though Google's average cost per "core search query" was nearly $0.12 during Q4 2007. Again, this is a rough estimate based on rounded totals, but personally I was a little surprised by the number.

If 1 million sites run ranking reports on 100 keywords 12 times per year at $0.12 per "core search query", it costs Google $144,000,000 a year. Over a ten-year period, that's more than a billion dollars. Given this data, it's easy to see why Google uses "algorithms and different techniques to block excessive automated queries and scraping, especially when someone is hitting Google quite hard." Matt Cutts suggests contacting Google's Business Development Team about permission to send automated queries to Google.
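For anyone who wants to check the arithmetic, here's a rough back-of-the-envelope sketch in Python. The inputs are the reported and estimated figures above; dividing total costs by total queries is a crude simplification of Google's formula, not the real thing:

    # Rough Q4 2007 cost-per-query estimate (figures cited above).
    capital_expense = 678_000_000     # reported capital expenses
    operating_costs = 1_430_000_000   # reported operating costs
    core_searches = 17_600_000_000    # ComScore's "core searches" estimate

    cost_per_query = (capital_expense + operating_costs) / core_searches
    print(f"Cost per core search query: ${cost_per_query:.4f}")  # ~$0.12

    # Hypothetical rank-checking load: 1M sites x 100 keywords x 12 runs/year.
    queries_per_year = 1_000_000 * 100 * 12
    annual_cost = queries_per_year * cost_per_query
    print(f"Annual cost to Google: ${annual_cost:,.0f}")      # ~$144 million
    print(f"Ten-year cost: ${annual_cost * 10:,.0f}")         # >$1 billion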

Now, I fully understand the importance of ranking reports when it comes to SEO clients. That said, there are folks out there abusing the system, running ranking reports on thousands of keywords daily.

A friend of mine recently emailed to ask how TinyURL impacts SEO. It's a good question, and one many folks can't answer, so I thought I'd blog my answer!

For anyone not familiar with TinyURL, in layman's terms it's a tool where users can enter long URLs to get a shortened version. TinyURLs are often used where long URLs might wrap and therefore break, such as in email or social media web applications like Twitter. In more technical terms, TinyURLs are short, dynamically created URLs that redirect users to the intended URL via a 301 redirect. Because TinyURLs "301", or permanently redirect, search engines should not index the TinyURL itself but should instead index, and pass PageRank to, the actual URL.
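If you want to see the 301 for yourself, here's a quick sketch using Python's standard library (the short URL path below is made up; substitute any real TinyURL):

    import http.client

    # Request a TinyURL without following the redirect so we can inspect it.
    conn = http.client.HTTPConnection("tinyurl.com")
    conn.request("HEAD", "/example")  # hypothetical short URL path
    response = conn.getresponse()

    print(response.status)                 # expect 301 (Moved Permanently)
    print(response.getheader("Location"))  # the destination URL that should
                                           # be indexed and receive PageRank
    conn.close()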

It is important to note that using TinyURLs for paid links that pass PageRank is a violation of Google's Webmaster Guidelines, and that sites like Twitter use nofollow techniques to prevent link spam.

On their own, TinyURLs can be search engine friendly from a technical perspective. At the same time, I wouldn't suggest replacing your site's navigation with TinyURLs, and I'd point out that tracking TinyURLs via analytics might be difficult.

By now you probably know Google indexes text content within Flash, thanks to Google's new algorithm for Flash. In case you missed it, Google recently updated their original announcement to include additional details about how they handle Flash files.

SWFObject - Google confirms that Googlebot did not execute JavaScript, such as the type used with SWFObject, as of the July 1st launch of the new algorithm.

SWFObject - Google confirms "now" rolling out an update that enables the execution of JavaScript in order to support sites using SWFObject and SWFObject 2.

According to Google, "If the Flash file is embedded in HTML (as many of the Flash files we find are), its content is associated with the parent URL and indexed as a single entity." I found this isn't the case using a variation of Google's own example. The following query finds the same content indexed at three URLs (2 SWF and 1 HTML):
http://www.google.com/search?q=%22NASA%27s+Hubble,+...

http://www.jpl.nasa.gov/multimedia/deep-impact/index.swf
http://www.nasa.gov/externalflash/deepimpact_flash/index.swf
http://www.jpl.nasa.gov/multimedia/deep-impact/index-flash.html

Additional:

Deep Linking - Google doesn't support deep linking. "In the case of Flash, the ability to deep link will require additional functionality in Flash with which we integrate."

Non-malicious duplicate content - Flash sites that serve "alternative" content in HTML might be detected as having duplicate content.

Googlebot, it seems, still ignores #anchors but will soon crawl SWFObject content. Given that Googlebot can, or will soon, crawl SWFObject sites, major reworks should be considered for "deep linking" sites where corresponding "alternative" HTML content pages contain the same Flash file and are accessible via multiple URLs.
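The #anchor issue isn't really a Flash quirk: a URL's fragment is handled entirely client-side and never sent to the server, so there's nothing for Googlebot to fetch. A tiny illustration with Python's standard library (the URL is hypothetical):

    from urllib.parse import urlparse

    # A hypothetical Flash "deep link" using a fragment identifier.
    parts = urlparse("http://www.example.com/gallery.html#section2")

    print(parts.path)      # /gallery.html -- what the server (and a crawler) sees
    print(parts.fragment)  # section2 -- resolved in the browser, never included
                           # in the HTTP request, so crawlers ignore it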

ActionScript - Google confirms indexing ActionScript 1, ActionScript 2 and ActionScript 3, while noting that it shouldn't expose ActionScript code to users.

External Text (XML) - Google confirms that content loaded dynamically into Flash from external resources isn't associated with the parent URL.

While this is a great development for Flash Developers moving forward, lots of education may be required.

JohnMu, aka Googler John Mueller, confirmed Google's use of Sitemaps on Sunday and suggested using only quality metadata in XML Sitemaps.

In his Google Groups post, John Mueller goes on to mention specifics as to how Google uses metadata in XML Sitemaps submitted via Google Webmaster Tools:

URL - According to Mueller, it's best to list only working URLs in XML Sitemaps, and only the correct version for canonical URLs. For canonical URLs, he suggests providing the "/" version rather than "index.html" in his example. He goes on to point out the importance of using the same URL found in the site's navigation and, when necessary, 301-redirecting to that same URL. The navigation issue is important, especially if something other than a crawler creates your Sitemap. Either way, it's worth testing to be sure your Sitemap URLs are identical to those in the user path (I've actually had near knock-down drag-outs over this issue). JohnMu suggests including only URLs to indexable content, like (X)HTML pages and other documents. In addition, he points out it's best to include only URLs webmasters want indexed.

Last modification date - In his post, Mueller points out the difficulty Google can have determining a "Last modification date" for dynamic sites. He suggests either using the correct time or none at all, and recommends a "Last modification date" over "Change frequency" unless webmasters can establish a consistent frequency.

Change frequency - As with "Last modification date", Mueller suggests omitting this metadata if an accurate value isn't available.

Priority - Mueller suggests not including "Priority" metadata in XML Sitemaps unless webmasters feel they can provide accurate data.

In summary, JohnMu suggests sitemaps.org XML files that contain only URLs intended for inclusion in Google's index, matching those found in the site's navigation, with "Date or change frequency" and "Priority" as optional metadata. A minimal sketch follows below.
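Putting Mueller's suggestions together, here's a minimal sketch that generates a Sitemap with Python's standard library. The domain, date and priority are hypothetical placeholders; include the optional fields only when you can provide accurate values:

    import xml.etree.ElementTree as ET

    # sitemaps.org namespace for a standard XML Sitemap.
    urlset = ET.Element("urlset",
                        xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")

    # One canonical, navigable URL we actually want indexed:
    # the "/" version, not "index.html".
    url = ET.SubElement(urlset, "url")
    ET.SubElement(url, "loc").text = "http://www.example.com/"
    ET.SubElement(url, "lastmod").text = "2008-08-10"  # only if it's the real date
    ET.SubElement(url, "priority").text = "1.0"        # optional; omit if guessing

    ET.ElementTree(urlset).write("sitemap.xml", encoding="utf-8",
                                 xml_declaration=True)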

UPDATE: JohnMu has posted additional information over at Search Engine Roundtable in response to Barry's post.

- beu