Monthly Archives: August 2010

Malcolm Coles pointed out a new feature in Google SERPs and posed some interesting questions last week. I don't think Google is treating "brand names" as site: operator queries; site: operator queries only return results from a single site. Either way, both Malcolm's and Matt's examples appear to be navigational and/or what are referred to as "named entity queries."

Queries provide numerous signals that engines can use for insight into user intent. They are for the most part either informational, navigational or transactional (action) in intent, but some queries fall into more than one category. These queries are often classed as named entities. The problem is that it's difficult to surmise intent from a single query that may have multiple interpretations. Google already holds related patents and recently purchased Metaweb, a company specializing in this field. One aspect I haven't seen mentioned elsewhere in plain English is that company names, product names, organization names, brand names and/or combinations thereof are named entities. Named entities are easy to extract online because they are often capitalized. If leveraged properly, they could provide a number of associative signals that are well worth considering.
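To make the capitalization point concrete, here's a toy sketch of my own (not anything Google has published) that pulls capitalized phrases out of text with a simple regular expression. A real extractor would also need dictionaries and context to separate sentence-initial capitals from actual brand and product names.

```python
import re

# Toy illustration: treat runs of capitalized words as candidate named entities.
# Real systems rely on dictionaries, context and statistical models, not just case.
CANDIDATE = re.compile(r"\b(?:[A-Z][\w&'-]*)(?:\s+[A-Z][\w&'-]*)*\b")

def candidate_entities(text):
    """Return capitalized phrases that might be company, product or brand names."""
    return [match.group(0) for match in CANDIDATE.finditer(text)]

print(candidate_entities(
    "Google recently purchased Metaweb, while Adobe filed a patent on Flash SEO."
))
# ['Google', 'Metaweb', 'Adobe', 'Flash SEO']
```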

All that said, I'm not sure that is what is happening today. When statistical probability significantly favors one site over all others in terms of user intent, it makes sense that engines would return multiple results for that site instead of just two. Google may have introduced named entity elements, or it may simply be handling navigational queries in a way that seems, well, more logical.

Adobe recently submitted a US patent application that relates to SEO for Flash / Flex, titled "EXPOSING RICH INTERNET APPLICATION CONTENT TO SEARCH ENGINES." Believe it or not, this patent application claims "shadowing" techniques like SWFObject and SOFA are at an "obvious disadvantage" for search. According to Adobe, shadowing textual content in rich Internet applications with textual content in (X)HTML results in duplication and other issues. For those not aware, duplicate content thins keyword relevancy and PageRank (Google's secret sauce), and shadowing requires a "duplication of effort" in producing "the actual rich Internet application as well as the shadow HTML." The patent application claims site management time is also increased because "changes in the rich Internet application must also be made to the shadow HTML, if that HTML code is to remain consistent with the rich Internet application."

To address these and other issues, Adobe's application proposes an invention that returns different content to users and search engines. According to the patent application, content will be "available through a rich Internet application to search engine queries" via a "translation module" that interfaces "between a Web crawler and a rich Internet application." It seems this application isn't intended to provide alternative textual "eye wash" for users, but instead descriptions of the state, content and identifying URLs that are "important to Web crawler and/or search engines." According to Adobe, the "translation module may comprise pseudo HTML page code providing a description of the state which omits description of aspects of the state which are not useful to Web crawler and/or search engine." According to the patent application, "cached pages" will reflect a poorly formatted and quite likely only partially human-readable page.
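As a rough sketch of the idea (my own simplification, not code from the patent application), a server-side component could inspect the requesting user agent and hand crawlers a plain pseudo-HTML description of an application state, while everyone else receives the page that bootstraps the rich Internet application. The user-agent tokens, URLs and markup below are placeholders; note that, just like the patent, the crawler and the user see different responses for the same URL.

```python
from wsgiref.simple_server import make_server

# Placeholder user-agent fragments for well-known crawlers (illustrative only).
CRAWLER_TOKENS = ("Googlebot", "bingbot", "Slurp")

# Hypothetical pseudo-HTML descriptions of application "states", keyed by identifying URL.
STATE_DESCRIPTIONS = {
    "/products/widget": "<html><head><title>Widget</title></head>"
                        "<body><h1>Widget</h1><p>Text the crawler should index.</p>"
                        "<a href='/products/gadget'>Gadget</a></body></html>",
}

# The page real users get: just a shell that loads the rich Internet application.
RIA_BOOTSTRAP = ("<html><body>"
                 "<object data='app.swf' type='application/x-shockwave-flash'></object>"
                 "</body></html>")

def translation_app(environ, start_response):
    """Serve a pseudo-HTML state description to crawlers, the RIA shell to users."""
    ua = environ.get("HTTP_USER_AGENT", "")
    path = environ.get("PATH_INFO", "/")
    if any(token in ua for token in CRAWLER_TOKENS):
        body = STATE_DESCRIPTIONS.get(path, "<html><body>Unknown state</body></html>")
    else:
        body = RIA_BOOTSTRAP
    body = body.encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8"),
                              ("Content-Length", str(len(body)))])
    return [body]

if __name__ == "__main__":
    make_server("", 8000, translation_app).serve_forever()
```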

According to the Site Performance feature in Google Webmaster Tools, your pages load reeeeeally slow, but other external tools or monitoring services tell a different story.

What should you believe?

First, it's important to understand the differences between these tools, the data they capture and how it's measured.

Page Speed evaluates the performance of a specific web page and individual elements in the browser. As a result, this type of testing may not accurately reflect latency experienced by users. Page Speed is for testing and improving speed for individual pages.

Tools like webpagetest.org and monitoring services often test latency for a specific URL at various times of day and locations around the world. As a result, these kinds of tests may not reflect latency as perceived by users in the region the site targets.

Google Webmaster Tools Site Performance data is collected from actual Google Toolbar users in the same geographic region as the site's target audience. This data can be measured in several ways, one being the time from when the user clicks on a link "until just before that document’s body.onload() handler is called." If, for example, a user clicks on a link, is redirected and then redirected again, those delays should be recorded and reflected in Google Webmaster Tools Site Performance data. These are the kinds of delays that impact users and Googlebot, and they are missing entirely from other tools, including analytics.
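As a rough illustration of why redirect hops matter, the sketch below (my own example, unrelated to how the Toolbar actually reports timings, and only approximating network latency rather than full render time) follows a redirect chain hop by hop and times each response. A single-URL test pointed at the final destination would miss everything before the last hop. The URL is a placeholder.

```python
import time
import urllib.error
import urllib.parse
import urllib.request

class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so each hop can be timed separately."""
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # a 3xx response will then surface as an HTTPError

OPENER = urllib.request.build_opener(NoRedirect)

def time_redirect_chain(url, max_hops=5):
    """Follow a redirect chain manually and report the latency of each hop."""
    total = 0.0
    for _ in range(max_hops):
        start = time.time()
        try:
            resp = OPENER.open(url)
            elapsed = time.time() - start
            total += elapsed
            print(f"{resp.status} {url} ({elapsed:.3f}s)")
            break  # final document reached
        except urllib.error.HTTPError as err:
            elapsed = time.time() - start
            total += elapsed
            location = err.headers.get("Location")
            if err.code in (301, 302, 303, 307, 308) and location:
                print(f"{err.code} {url} -> {location} ({elapsed:.3f}s)")
                url = urllib.parse.urljoin(url, location)
            else:
                raise
    print(f"total across all hops: {total:.3f}s")

time_redirect_chain("http://www.example.com/")  # placeholder URL
```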

Speed doesn't currently have a major impact on rankings, but slow pages deter users and hamper crawl efficiency. Crawl efficiency can be a major factor for pages with lower PageRank because "the number of pages Google crawls is roughly proportional to PageRank."