Adobe recently submitted a US patent application related to SEO for Flash / Flex, titled “EXPOSING RICH INTERNET APPLICATION CONTENT TO SEARCH ENGINES.” Believe it or not, the application claims that “shadowing” techniques like SWFObject and SOFA are at an “obvious disadvantage” for search. According to Adobe, shadowing textual content in rich Internet applications with textual content in (X)HTML results in duplication and other issues. For those not aware, duplicate content dilutes keyword relevancy and PageRank (Google’s secret sauce), and requires a “duplication of effort” in producing “the actual rich Internet application as well as the shadow HTML.” The application also claims site management time increases because “changes in the rich Internet application must also be made to the shadow HTML, if that HTML code is to remain consistent with the rich Internet application.”
To address these and other issues, Adobe’s application proposes an invention that returns different content to users and search engines. According to the application, content will be “available through a rich Internet application to search engine queries” via a “translation module” that interfaces “between a Web crawler and a rich Internet application.” It seems this invention isn’t intended to provide alternative textual “eye wash” for users, but rather descriptions of the state, content and identifying URLs that are “important to Web crawler and/or search engines.” According to Adobe, the “translation module may comprise pseudo HTML page code providing a description of the state which omits description of aspects of the state which are not useful to Web crawler and/or search engine.” Per the application, “cached pages” will reflect a poorly formatted and likely only partially human-readable page.
Adobe CEO Shantanu Narayen recently sat down with the Wall Street Journal to discuss Steve Jobs’s comments about Flash. I was shocked by what Narayen had to say and the spin was a little annoying.
According to Narayen, Adobe is “producing the world’s best content,” an interesting claim considering Flash supports only a dozen or so languages. With more than 200 languages spoken world-wide, I’m not sure how Adobe can claim world domination. Then again, Adobe also claims that “99% of Internet users have Flash,” even though that’s not what you’ll find in web analytics. The discrepancy is probably because Adobe’s survey only looks at PC users, and PCs often ship with Flash pre-installed. In addition, the survey covers only 4,500 users in 13 countries, with U.S. users represented nearly two to one. The methodology is available at http://www.adobe.com/products/player_census/methodology so please have a look for yourself before emailing.
According to Narayen, important Flash security issues are nothing more than a “smoke screen,” but according to SANS, “Adobe Flash has similar problems with the applications of its updates … there are four Flash vulnerabilities in our Top 30 list that date back as far as 2007.” I’m not sure how you can claim customers are important while not addressing issues like these.
Narayen even claims that there are no performance issues with Flash and that Flash works fine on mobile devices. Adobe’s recent report to the SEC indicates the opposite, “To the extent new releases of operating systems or other third-party products, platforms or devices, such as the Apple iPhone or iPad, make it more difficult for our products to perform, and our customers are persuaded to use alternative technologies, our business could be harmed.” My favorite part of the interview is where Narayen claims Flash isn’t 100% proprietary and goes on to either confuse software specifications with web standards or TOTALLY spin out of answering the question being asked. If you’ve ever waited for Flash to load or been told you needed Flash to view Flash content, you know there are performance issues and that Flash is proprietary.
Narayen himself actually said, “We’ve evaluated the SDK. We can now start to develop the Flash player ourselves…” referring to the iPhone SDK and Flash for iPhone, but now, for some reason, it’s all Apple’s fault? To be clear, I think Adobe is a great company; I’ve used their products on my Mac for over a decade now. That said, I commend Steve Jobs for having cojones, for not drinking Adobe’s Kool-Aid, for keeping them honest and for his long-overdue feedback.
While I commend Adobe for its recent efforts to help engines index textual content locked in Flash, I have issues with the new “SEO Technology Center.”
For example, in the following video one of Adobe’s Senior Technology Evangelists states that tv.adobe.com “…rises to the top of the heap in the Google…” for [Duanes World] thanks to “Ichabod,” Adobe’s new headless Flash player technology. According to the Evangelist, “Duane” could only be visible to Googlebot because Ichabod changed states in the Flash file, thereby exposing “Duane” as textual content. Unfortunately, this is not correct: the cached version of the page in Google’s SERP states “These terms only appear in links pointing to this page.”
As shown in the Google SERP, “Duane” appears to Googlebot only in links pointing at tv.adobe.com, not in the Flash file as the video claims. Using the advanced “site:” operator to search for [Duane] within tv.adobe.com shows a number of pages linking to AdobeTV with “Duane” as anchor text. Because these links use #anchors (fragment identifiers) in their URLs, which Googlebot ignores, Google “credits” the keyword relevancy to the root URL instead of the intended target URL.
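To see why the fragment never counts toward the target, Python’s standard library can illustrate how a fragment-ignoring client resolves such links. This is a minimal sketch; the AdobeTV-style URLs below are made-up examples, not the site’s actual structure.

```python
from urllib.parse import urldefrag

# Hypothetical AdobeTV-style links that address in-app states
# with fragment identifiers (#anchors).
links = [
    "http://tv.adobe.com/#/watch/duanes-world/episode-1",
    "http://tv.adobe.com/shows.html#duane",
]

for link in links:
    url, fragment = urldefrag(link)
    # The fragment is never sent to the server, so a crawler that
    # ignores fragments attributes the link (and its anchor text)
    # to `url` (here, the root or parent page), not the in-app state.
    print(f"{link} -> crawler sees {url}")
```

In other words, every link to a `#`-addressed state collapses to the same parent URL from the crawler’s point of view, which is exactly why the anchor-text relevancy accrues to the root.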
With on-page factors out of the way, it’s also worth noting that the search results page shown in Adobe’s video reflects the user’s prior search history: the user is logged into a Google Account and SearchWiki appears to be active. These personalization settings can all act to throw off the data in such an experiment.
This post isn’t intended to bash Adobe but rather to point out some critical errors in their research. Please don’t get me wrong, I’m a huge fan of Adobe and have been for years. I think they make great products and appreciate all of the hard work done by Adobe’s team of Evangelists. I understand that Adobe Evangelists are experts at Flash, but when it comes to SEO for Flash and interpreting Google SERPs, I wish people wouldn’t take their opinions blindly as fact.
For the handful of us with expertise in SEO for Flash, it’s a little awkward having to tell clients that Adobe’s information isn’t entirely correct. Either way, it would be nice to see more research as well as accurate and up-to-date information in Adobe’s SEO Technology Center. It would also be great to see some of these best practices implemented at tv.adobe.com.
Google “Universal” has placed increased emphasis on image results. Prior to Google Universal, users viewed search results one vertical at a time. Nowadays, users have options and can access results from across key Google verticals within their main search engine results pages. The Google internal video below illustrates in real time how one eye-tracking study participant migrates between verticals within universal SERPs.
This migration between vertical search results may explain Google’s introduction of Image Ads. In addition to ads, Google Images offers more options than ever before. Users can search for images by size or in a variety of content types including news, faces, clip art, line drawings and photo content categories. While a lot has changed when it comes to image optimization, users still enter text queries for Google to translate into image results. For that reason, linguistics is still critical when it comes to image optimization. Images are indexed in ways similar to text but have their own flavor of PageRank.
Before diving into the finer points of advanced image optimization, let’s see if the basics are still valid. The rule of thumb for basic image optimization is to provide as much descriptive information about images as possible, without “keyword stuffing,” which could cause your site to be perceived as spam. Focus on including images within relevant textual context, and be sure to provide hypertextual clues about the subject matter of the pages where images appear. With images, as with everything else in search, the key concept is relevancy: image results are based on textual queries.
Basic Image Optimization Best Practices:
- Informative filenames provide important signals about images and/or their subject matter. For that reason it’s best to incorporate descriptive wording into image file names (e.g., beach-dog.jpg instead of 1.jpg). (Hint: in some cases, image filenames may be used as the snippet in SERPs.)
- ALT attributes provide users and engines alike with textual information about image subject matter. Engines rely heavily on the structure present in hypertext, especially where images are concerned, so always incorporate short but descriptive ALT attributes. ALT attributes help engines determine the most relevant result for image-specific keyword queries.
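The two basics above can even be checked mechanically. Below is a minimal sketch of an audit using Python’s standard html.parser; the flag rules (empty ALT, purely numeric filename) are my own illustrative heuristics, not anything the engines publish.

```python
from html.parser import HTMLParser

class ImageAudit(HTMLParser):
    """Flag <img> tags that lack the basic optimization signals above."""

    def __init__(self):
        super().__init__()
        self.issues = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        a = dict(attrs)
        name = a.get("src", "").rsplit("/", 1)[-1]
        if not a.get("alt"):
            self.issues.append(f"{name}: missing or empty ALT")
        stem = name.rsplit(".", 1)[0]
        if stem.isdigit():  # e.g. 1.jpg carries no descriptive signal
            self.issues.append(f"{name}: non-descriptive filename")

audit = ImageAudit()
audit.feed('<img src="/img/1.jpg"><img src="/img/beach-dog.jpg" alt="Dog on a beach">')
print(audit.issues)  # ['1.jpg: missing or empty ALT', '1.jpg: non-descriptive filename']
```

Running something like this over a set of templates is a quick way to catch images that shipped without the basics.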
Basic Image Optimization Case Study Experiment:
To determine whether or not the images ranking for the query used in Google’s eye-tracking study above follow basic image optimization best practices.
To examine the hypertext structure related to images appearing in Google Universal search results for the query [how to tie a tie] and record observations related to basic image optimization.
Image A -
ALT = “How to tie a necktie video”
Image B -
URL = http://www.tieking.com.au/images/hw.gif
Image C -
URL = http://www.jitterbuzz.com/esquire/tietie_big.jpg
ALT = “Tying the Tie”
Two of the three images in this case study have descriptive file names in conjunction with descriptive ALT attributes. Beyond file names and ALT attributes, though, the results suggest that, as Google states, “other factors” come into play when optimizing images.
Advanced Image Optimization:
It’s important to note that Google uses crawl caching proxy techniques to make images available in other services and doesn’t index images directly; as a result, there is no need to include images in your XML Sitemap. Either way, quality images start with quality pages, and quality pages contain few errors and load quickly. To decrease page load time, focus first on template images that appear on every page (navigational images, logos and so on). When possible, consider converting static GIFs to PNGs. When using GIFs, make certain palette sizes are correct for the number of colors in the image. For JPEGs, use a lossless tool like Photoshop to remove unnecessary information from the file, unless that information is important for users and/or search (see Exif below). Always define image dimensions in (X)HTML at actual size rather than scaling, and be sure to include a favicon for branding, bookmarking and to avoid 404s. These steps help decrease load time and increase page quality.
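On the GIF palette point: the declared size of the global color table lives in the first 13 bytes of the file, so a quick sanity check is possible without decoding the image. Here’s a minimal sketch in Python; real files need fuller validation than this.

```python
import struct

def gif_palette_size(data: bytes) -> int:
    """Return the declared global color table size of a GIF.

    Minimal sketch: parses only the 13-byte header / logical screen
    descriptor, per the GIF89a layout (size = 2 ** (bits 0-2 + 1)).
    """
    if data[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF")
    width, height, packed = struct.unpack("<HHB", data[6:11])
    if not packed & 0x80:  # no global color table declared
        return 0
    return 2 ** ((packed & 0x07) + 1)

# A two-color image should declare a 2-entry palette, not 256:
header = b"GIF89a" + struct.pack("<HHB", 10, 10, 0x80) + b"\x00\x00"
print(gif_palette_size(header))  # -> 2
```

If a simple two-color graphic reports a 256-entry palette, re-exporting it with the correct palette size is an easy byte-count win.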
Place high-quality images high up and above the fold when possible. When necessary, create unique static detail pages linked via thumbnails. Don’t prevent other sites from using your images, even if it means losing bandwidth. Posting and tagging sample images on other sites can help get more eyes on your images, but be sure to include links back to your site; the objective isn’t to donate content to social media sites.
“Google analyzes the text on the page adjacent to the image, the image caption and dozens of other factors to determine the image content. Google also uses sophisticated algorithms to remove duplicates and ensure that the highest quality images are presented first in your results.”
Always provide textual content in close proximity to images; at this point it’s worth repeating that image accessibility is of paramount importance. TITLE attributes, captions and image titles in your pages also provide important clues for search engines. When it comes to search, more data is better data, so take the extra time to include as much as possible while avoiding techniques that could be detected as an attempt to spam the engines. A few more guidelines:
- Place images near or above relevant text, and always include descriptive captions.
- Don’t embed textual content within vector graphic formats other than .pdf, as engines can’t extract text from other image formats.
- Fresh images accessible in various sizes are a good idea, but define image width and height in hypertext as well as for users.
- Enable Google Image Labeler via “Enhanced Image Search” in Google Webmaster Tools.
- While PNG loads faster than GIF, certain circumstances may require the image metadata contained in JPEG.
- Consider a file structure that gives thumbnails, art, drawings, photos and other image types their own directories.
- Never mix adult images with images intended for general audiences.
The Future of Image Optimization:
In terms of the future of image optimization, all signs seem to point to Exif. The Exif file format, developed nearly 10 years ago, is one specification digital cameras use to store image metadata. When set properly, modern digital cameras record the date and time an image was captured in the Exif metadata associated with the file. In addition to date and time, cameras record metadata including the camera’s manufacturer, model, orientation, aperture setting, shutter speed, focal length, metering mode, ISO speed, a preview thumbnail and copyright information.
So, what does any of the information in Exif have to do with SEO for images you ask? Well, Exif can also be used to record information about where images were taken and whether or not they’ve been “photoshopped” for example. For images or universal queries related to news or specific geographic areas, Exif could easily be used as a quality signal. I asked Matt Cutts about Google’s use of Exif last year and his reply was “I’m not sure, personally. I could imagine that any stuff embedded in an image file might be used, though.” Currently both Panoramio and Picasa use Exif and I’d expect to see this trend rise as new GPS enabled devices enter the market.
For more great information, check out Peter Linsley’s latest post on the Google Webmaster Central Blog…
Thanks for all the great feedback regarding my recent post on Flash! I’ll be talking about this more on Wednesday during the “SEO Friendly Flash” session at SES Chicago, but I wanted to provide a little update to my original case studies.
Without going into detail, Google seems to be associating text content in Flash with the correct parent URL and indexing both as a single entity on an increasingly frequent basis. While I haven’t been able to get any kind of official confirmation from Google or Adobe, this just might be very big news for SEO for Flash. The implication: for the first time, from the perspective of search engines, metadata about hypertext structure is married to the text content inside Flash.