According to the White House, search engine optimization is a priority for US Government websites. Given the White House mandate and the staggering number of Americans who search for health-related information, you would think the new HealthCare.Gov website would be search engine friendly. Unfortunately, the NEW site is not search friendly, due in part to the OLD site not being cleaned up properly. Until issues with the new version of the site are resolved and the old version is cleaned up, users will continue to run into problems. Expert developers are usually focused on development rather than on technical search issues. As a result, those issues often go unnoticed and continue to frustrate users.
Technical SEO site assessment is difficult to teach in a public setting because of the risk of offending site owners. Since we all own HealthCare.Gov, offending someone is not a problem. As a result, I took a few minutes to check out the site and have documented a few critical issues below. Please note, the list of issues outlined here is by no means comprehensive and only took a few minutes to compile. Please feel free to post additional search-related issues in the comment section below. The objective of this post is to educate others and lend an extra set of eyes to the “A-Team”.
It is widely known that HealthCare.Gov has a number of potential security issues and several of these are search related.
Findings: Without going into detail for security reasons, it is currently possible to search and get results for “public and secure content” at HealthCare.Gov. Please note, this is an internal HealthCare.Gov IT issue, not a web search issue and has already been reported to HealthCare.gov.
Recommendation: Ensure access to content not intended for public consumption is password protected.
According to Google and Bing, websites should be tested with a text browser. Text browsers make it possible for webmasters to "see" sites more like search engine crawlers. This kind of testing will also reveal issues experienced by individuals with disabilities when accessing the site on an assistive device.
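To make the text browser idea concrete, here is a minimal sketch in Python that strips a page down to the text a text-only client would show. It is only an approximation of what a crawler does, and the class and function names (TextOnlyView, text_view) are my own, not part of any search engine tool.

```python
from html.parser import HTMLParser


class TextOnlyView(HTMLParser):
    """Collects the visible text of a page, roughly as a text browser would."""

    # Script and style bodies are never shown to a text-only reader.
    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag == "img":
            # Text browsers show the alt text in place of the image.
            alt = dict(attrs).get("alt")
            if alt:
                self.chunks.append(f"[{alt}]")

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def text_view(html):
    """Return the page text with markup, scripts and styles removed."""
    parser = TextOnlyView()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

If text_view returns little or nothing for a page that looks full in a regular browser, the content is probably being drawn by scripts, and both crawlers and assistive devices may struggle with it.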
When users search for [healthcare.gov], chances are they want to navigate to HealthCare.Gov, the US Government health insurance marketplace.
Findings: Currently, when users search for [healthcare.gov], Google returns the search results shown above. Clicking on the top result in the sitelink section takes searchers to finder.healthcare.gov, which “is not the Health Insurance Marketplace.”
Recommendation: Demote the Sitelink in question via webmaster tools.
When the same text content appears on different webpages, it is considered duplicate content by search engines. There is no penalty for duplicate content but it can thin certain ranking signals. As a result, search engines recommend that webmasters specify the preferred version of each page.
Findings: www.HealthCare.Gov includes the same content, as well as different combinations of content, from various versions of both the old and new website. This content is spread across Spa.HealthCare.Gov, www.HealthCare.Gov, Finder.HealthCare.Gov and LocalHelp.HealthCare.Gov, just to name a few. As a result, it is possible that searchers will arrive at an unintended subdomain and the site will appear not to work.
Recommendation: Use rel=canonical link elements to specify which version of each page is preferred, and return 410 HTTP responses for pages at additional subdomains.
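As a rough illustration of the canonical recommendation, the Python sketch below maps a URL on any subdomain to one preferred version and builds the matching link element. The choice of www.healthcare.gov as the preferred host, the use of https, the dropping of query strings, and the idea that the same path exists on the preferred host are all assumptions I made for the example, not facts about how the site is set up.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

# Assumption for illustration: the www host over https is the preferred version.
PREFERRED_SCHEME = "https"
PREFERRED_HOST = "www.healthcare.gov"


def canonical_url(url):
    """Map a URL on any subdomain to the preferred host, keeping only the path."""
    parts = urlsplit(url)
    # Query string and fragment are dropped here; that is a simplification.
    return urlunsplit((PREFERRED_SCHEME, PREFERRED_HOST, parts.path or "/", "", ""))


def canonical_link_tag(url):
    """Build the link element that belongs in the head of the duplicate page."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'
```

Calling canonical_link_tag("http://Finder.HealthCare.Gov/plans?utm=1") yields a tag pointing at https://www.healthcare.gov/plans, which is the kind of hint search engines ask for when several URLs carry the same content.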
Soft 404 Pages:
“Usually, when someone requests a page that doesn’t exist, a server will return a 404 (not found) error. This HTTP response code clearly tells both browsers and search engines that the page doesn’t exist. As a result, the content of the page (if any) won’t be crawled or indexed by search engines.” https://support.google.com/webmasters/answer/181708?hl=en
Findings: HealthCare.Gov errors do not redirect to a dedicated 404 landing page or return a 404 HTTP response. As a result, URLs for pages without content can be indexed by search engines once those URLs are posted online. In addition, versions of older pages like http://finder.healthcare.gov/404.html return a 302 HTTP response, which is a temporary redirect. As a result, site error pages will continue to be indexed and frustrate users.
Recommendation: Create a dedicated 404 page which returns a 404 HTTP response and redirect error requests to the dedicated 404 URL.
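One way to audit for this is to request a URL you know does not exist and look at the status code that comes back. The small Python sketch below labels that status code; the function name and the labels are my own shorthand, and the grouping is a simplification rather than any search engine's official rule set.

```python
def classify_missing_url_response(status):
    """Label the HTTP status a server returned for a URL known not to exist."""
    if status in (404, 410):
        # Browsers and crawlers are told clearly that the page does not exist.
        return "hard-404"
    if status in (301, 302, 303, 307, 308):
        # The error is hidden behind a redirect, as with the 302 noted above.
        return "redirect"
    if 200 <= status <= 299:
        # An error page served as if it were real content: a soft 404.
        return "soft-404"
    return "other"
```

Anything other than "hard-404" for a made-up URL is worth a closer look, because it means the server is not clearly telling crawlers that the page is missing.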
Development Platform Indexing:
Findings: The new HealthCare.Gov website appears to have been developed at the subdomain Test.HealthCare.Gov. This subdomain does not appear to have been password protected and, as a result, was crawled and indexed by search engines. Currently, hundreds of pages from this subdomain are indexed in search results. In order to help prevent searchers from going to the developer version of the site, Test.HealthCare.Gov now returns a 503. Disallowing via robots.txt or returning a 503 will not prevent pages from appearing in search results. The only way to prevent content from appearing in search results is to add the noindex meta tag or password protection.
Recommendation: To have this content removed from search results, return a 401 HTTP response.
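For readers who want to see what a 401 looks like in practice, here is a minimal Python WSGI sketch of a development site that refuses anonymous requests. The credentials, the realm name and the staging_app function are hypothetical and exist only for this example; I have no knowledge of how Test.HealthCare.Gov is actually served. The X-Robots-Tag header on the authenticated response is an extra safeguard I added, not something the recommendation above requires.

```python
import base64
import hmac

# Hypothetical credentials for illustration only; real ones belong in a secret store.
EXPECTED_TOKEN = base64.b64encode(b"dev:change-me")


def is_authorized(auth_header):
    """Check an HTTP Basic Authorization header against the expected token."""
    scheme, _, token = auth_header.partition(" ")
    supplied = token.strip().encode("utf-8")
    return scheme.lower() == "basic" and hmac.compare_digest(supplied, EXPECTED_TOKEN)


def staging_app(environ, start_response):
    """WSGI app: anonymous visitors and crawlers get a 401, developers get the page."""
    if not is_authorized(environ.get("HTTP_AUTHORIZATION", "")):
        body = b"Authentication required"
        start_response("401 Unauthorized", [
            ("WWW-Authenticate", 'Basic realm="staging"'),
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ])
        return [body]

    body = b"<h1>Staging build</h1>"
    start_response("200 OK", [
        ("Content-Type", "text/html; charset=utf-8"),
        # Extra safeguard: ask crawlers not to index this response.
        ("X-Robots-Tag", "noindex"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

The point of the sketch is simply that a crawler never receives the page content, so there is nothing for it to index.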
"A breadcrumb trail is a set of links (breadcrumbs) that can help a user understand and navigate your site's hierarchy." In order to understand information in a page, searchers need to know where they have landed in the site architecture.
Findings: When users arrive at the page above from search results, there is currently nothing to indicate where they are within the site architecture. For example, if a user arrives at the page above from search, there is nothing to indicate whether this information applies to business or individual health care plans.
Recommendation: Implement breadcrumb navigational elements in each page.
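A breadcrumb trail can often be derived from the URL path itself. The Python sketch below shows one simple way to do that; the example path /small-businesses/plans/ is hypothetical, and the sketch assumes the URL structure mirrors the site hierarchy, which may not hold for every page.

```python
from urllib.parse import urlsplit


def breadcrumb_trail(url, home_label="Home"):
    """Turn a URL path into a list of (label, href) breadcrumb pairs."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    trail = [(home_label, "/")]
    for i, segment in enumerate(segments):
        # "small-businesses" becomes "Small Businesses".
        label = segment.replace("-", " ").title()
        href = "/" + "/".join(segments[: i + 1]) + "/"
        trail.append((label, href))
    return trail


def render_trail(trail):
    """Render the trail as plain text, e.g. for a text-only view of the page."""
    return " > ".join(label for label, _ in trail)
```

For a page at /small-businesses/plans/, render_trail(breadcrumb_trail(url)) gives "Home > Small Businesses > Plans", which immediately tells a searcher that the information applies to business plans rather than individual ones.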