Google Sitemap Meta Data

June 15th, 2008
-- by Brian Ussery

JohnMu, aka Googler John Mueller, confirmed Google’s use of sitemaps on Sunday and suggested using only quality meta data in xml sitemaps.

In his Google Groups post, John Mueller goes on to give specifics on how Google uses meta data in xml sitemaps submitted via Google Webmaster Tools:

URL - According to Mueller, it’s best to list only working URLs in xml sitemaps, and only the canonical version of each URL. In his example, he suggests providing the “/” version and not “index.html”. He goes on to point out the importance of using the same URL found in the site’s navigation and, if necessary, using 301 redirects to that URL. The navigation issue is especially important if something other than a crawler creates your sitemap. Either way, it’s worth testing to be sure your Sitemap URLs are identical to those in the user path (I’ve actually had near knock-down, drag-out fights over this issue). JohnMu suggests only including URLs to indexable content like (X)HTML pages and other documents. In addition, he points out that it’s best to include only URLs webmasters want indexed.

Last modification date - In his post, Mueller points out how difficult it can be for Google to determine a “Last modification date” for dynamic sites. He suggests either using the correct time or none at all. John suggests using “Last modification date” rather than “Change frequency”, unless webmasters can establish a consistent frequency.

Change frequency - As with “Last modification date”, Mueller suggests leaving this value out if an accurate one isn’t available.

Priority - Mueller suggests not including “Priority” meta data in xml sitemaps unless webmasters feel they can provide accurate data.

In summary, JohnMu suggests sitemaps.org XML files that contain only URLs intended for inclusion in Google’s index, and only those found in the site’s navigation. He suggests “Date or change frequency” and “Priority” as optional meta data.
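
For anyone generating a sitemap with a script rather than a crawler, the advice above can be sketched roughly as follows. This is only an illustration in Python, not anything from Mueller’s post: the function names canonical_url and build_sitemap are made up for this example, the example.com URLs are placeholders, and the rule that “/index.html” maps to “/” is an assumption about how a given site is set up. The sketch lists each canonical URL once, adds a last modification date only when a real one is known, and leaves out change frequency and priority entirely.

```python
from datetime import date
from typing import Iterable, Optional, Tuple
from urllib.parse import urlsplit, urlunsplit
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def canonical_url(url: str) -> str:
    """Return the "/" version of a URL instead of the "index.html" version.

    Assumption: on this hypothetical site, ".../index.html" and ".../" serve
    the same page and the navigation links to the "/" form.
    """
    parts = urlsplit(url)
    path = parts.path or "/"
    if path.endswith("/index.html"):
        path = path[: -len("index.html")]
    # Fragments are dropped; they never identify a separate page to index.
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))


def build_sitemap(pages: Iterable[Tuple[str, Optional[date]]]) -> str:
    """Build sitemap XML from (url, last_modified) pairs.

    last_modified may be None; in that case no <lastmod> is written,
    following the "correct time or none at all" advice.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    seen = set()
    for url, last_modified in pages:
        loc = canonical_url(url)
        if loc in seen:
            continue  # list each canonical URL only once
        seen.add(loc)
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = loc
        if last_modified is not None:
            ET.SubElement(entry, "lastmod").text = last_modified.isoformat()
        # <changefreq> and <priority> are deliberately omitted.
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap([
        ("http://www.example.com/index.html", None),
        ("http://www.example.com/about/index.html", date(2008, 6, 1)),
    ]))
```

The script does not check that URLs actually work or that non-canonical versions 301 redirect; those still need to be tested against the live site.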

UPDATE: JohnMu has posted additional information over at Search Engine Roundtable in response to Barry’s post.

- beu

Google Doorway Pages New Definition

June 1st, 2008
-- by Brian Ussery

In case you missed it, Google has changed its definition of “Doorway pages”. It seems that “poor-quality pages” optimized for specific keywords and/or phrases are now considered “Doorway pages”. It’s worth pointing out that there is no mention of links in Google’s new definition. This seems to open a Pandora’s box of issues for webmasters.

What has changed:

1) Doorway pages are pages specifically made for search engines. Doorway pages contain many links - often several hundred - that are of little to no use to the visitor, and do not contain valuable content. HTML sitemaps are a valuable resource for your visitors, but ensure that these pages of links are easy for your visitors to navigate. If you have a number of links to include, consider organizing them into categories or into multiple pages. But in doing so, ensure that they are intended for visitors to navigate the sections of your site, and not simply for search engines.

has changed to

Doorway pages are typically large sets of poor-quality pages where each page is optimized for a specific keyword or phrase. In many cases, doorway pages are written to rank for a particular phrase and then funnel users to a single destination.

Whether deployed across many domains or established within one domain, doorway pages tend to frustrate users, and are in violation of our webmaster guidelines.

2) “Sites making use of these practices may be removed from the Google index, and will not appear in Google search results.”

has changed to

“Google may take action on doorway sites and other sites making use of these deceptive practice, including removing these sites from the Google index.”

Full Previous Version:
Doorway pages

Doorway pages are pages specifically made for search engines. Doorway pages contain many links - often several hundred - that are of little to no use to the visitor, and do not contain valuable content. HTML sitemaps are a valuable resource for your visitors, but ensure that these pages of links are easy for your visitors to navigate. If you have a number of links to include, consider organizing them into categories or into multiple pages. But in doing so, ensure that they are intended for visitors to navigate the sections of your site, and not simply for search engines.

Google’s aim is to give our users the most valuable and relevant search results. Therefore, we frown on practices that are designed to manipulate search engines and deceive users by directing them to sites other than the ones they selected and that provide content solely for the benefit of search engines. Sites making use of these practices may be removed from the Google index, and will not appear in Google search results.

If your site has been removed from our search results, review our webmaster guidelines for more information. Once you’ve made your changes and are confident that your site no longer violates our guidelines, submit your site for reconsideration.

If you’d like to discuss this with Google, or have ideas for how we can better communicate with you about it, please post in our Webmaster Help Group.

Current Version:
http://www.google.com/support/webmasters/bin/answer.py?answer=66355

- beu

Why Google WiFi Matters

May 22nd, 2008
-- by Brian Ussery

Google’s mission is to organize the world’s information and make it accessible to anyone. As it turns out, making the internet accessible is almost as difficult as organizing the information. For that reason, Google co-founder Larry Page is in Washington, DC talking with folks at New America. Google wants to make WiFi broadband available to everyone, not just in the US but around the world. To do this, Google has proposed using vacant TV channels as well as unused closed cellular networks. Before using either one, Google needs authorization.

Opening these already existing virtual “lines of communication” is still just one step in providing WiFi broadband to the world’s population. The next step involves placing hardware and infrastructure capable of supporting broadband WiFi even in remote locations. That may sound like a daunting task in theory, but it actually isn’t. In their recent white paper “On Geolocation”, Google concludes that none of the issues involved in creating such a network are “particularly challenging”.

Google already owns vast amounts of bandwidth in the form of unused, ultra high-speed fiber-optic or “dark cable”. This “dark cable” could easily be used to connect users to “Google ISP” via transmitters broadcasting WiFi. As far as the transmitters are concerned, Google has a number of options, ranging from boxes mounted on existing phone poles to vehicles and even airborne transmitters suspended from weather balloons. Under conditions where fiber-optic broadcast range exceeds transmitter capacity, the system switches to communicate via satellite. This type of network could be partially solar powered, easily made redundant and wouldn’t depend on infrastructure. In addition to “normal use”, this type of network could provide immediate and advanced point-to-point communications anywhere in the world during disasters. All this, assuming Google is allowed to use a few empty TV channels!

Why is this important to search marketers? In addition to making information available, the network will make ads available as well, in some areas of the world for the first time. By providing internet access, Google will be able to provide more relevant results based on the user’s exact geo-location.

- beu
