In late June 2008, both Google and Adobe announced that they had worked together to improve the Google spider’s ability to crawl Flash elements for content.

In the past, web designers faced challenges if they chose to develop a site in Flash because the content they included was not indexable by search engines. They needed to make extra effort to ensure that their content was also presented in another way that search engines could find.
(Official Google Blog: Google Learns to Crawl Flash)

This is a great step.  Flash is not accessible, but it is a great enhancer for pages, and Google being able to index it really can help.  You were waiting for it.  Here’s the catch: we have no idea whether content from a Flash crawl is valued as highly by the search engine algorithm as regular ol’ semantic HTML.  If you have dynamic content, Google either won’t index it or won’t associate it with the page it’s viewed on.  And there is no way to see how, or what, Google is indexing in your Flash file.

The best practice now is the same as it was before this announcement: don’t leave it up to Google.

The “extra effort” they’re talking about is using technologies like SWFObject to make your page smart enough to realize that a user (or a spider) cannot read the Flash, and to display a plain HTML version for them instead. This is an ideal solution for both accessibility and findability. Screen reader users get the same data your Flash provides.  Search engines will never know Flash was there, and will index your HTML.
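To make the idea concrete, here is a minimal sketch of that decision in TypeScript. It is not SWFObject's own code. The `FlashEmbedder` interface is a hypothetical stand-in modeled on two SWFObject 2 calls as I understand them (`hasFlashPlayerVersion` and `embedSWF`), and the `enhance` function, the sizes, and the element id are made up for illustration. Check the real signatures against the library's documentation before relying on them.

```typescript
// Hypothetical interface modeled on two SWFObject 2 calls. The exact
// signatures are an assumption, not copied from the library.
interface FlashEmbedder {
  hasFlashPlayerVersion(version: string): boolean;
  embedSWF(swfUrl: string, targetId: string, width: string,
           height: string, version: string): void;
}

type Rendered = "flash" | "html";

// The plain HTML inside the target element is the default. It is swapped
// out only when a script-capable client reports a suitable Flash player.
// Passing null stands in for a client where the script never runs.
function enhance(embedder: FlashEmbedder | null, swfUrl: string,
                 targetId: string, minVersion: string): Rendered {
  if (embedder === null) {
    return "html"; // no script: spiders and many screen readers land here
  }
  if (!embedder.hasFlashPlayerVersion(minVersion)) {
    return "html"; // script ran, but Flash is missing or too old
  }
  // Width and height are illustrative values only.
  embedder.embedSWF(swfUrl, targetId, "550", "400", minVersion);
  return "flash";
}
```

The point of the design is that the HTML is always in the page first. Flash is layered on top only for visitors who can use it, so nobody gets an empty box.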

One concern with this method is being blacklisted by Google for unethical SEO practices.  I mean, after all, you’re inserting content simply for search engine ranking, right?  Probably not.  Google’s policy on hidden content basically says that as long as the substitution has a redeeming value for accessibility, it’s kosher.  My reading of this official blog post is that it agrees.  Be honest: show this data for the benefit of your users, NOT only to target SEO.

Important things to remember about replacement:

  1. Googlebot cannot execute certain types of JS.  If you use JS to load Flash, Googlebot will not index the Flash.
  2. If your Flash file pulls content from an outside source (e.g., XML), that content isn’t indexed.
  3. sIFR is indexed as normal HTML.
  4. Show Googlebot the same thing as your users and you’re in the clear.  “Users” is open to some interpretation.
  5. If using replacement, use the exact same text and you’re still in the clear.  If using SWFObject, and you add more descriptive terms to the HTML version, you are violating Google’s guidelines.
  6. Sites get blacklisted every day.  It happens to real people, so don’t think you’re too small a fish.  It is a hard hole to climb out of.
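Point 5 is the easiest one to break by accident, so it can be worth checking mechanically. The sketch below is a hypothetical helper of my own, not anything Google publishes or uses. It assumes you can get the text from your Flash file as a plain string, strips tags from the HTML fallback with a deliberately naive regular expression, and lists any words that appear in the fallback but not in the Flash text.

```typescript
// Lowercase a string and split it into simple ASCII word tokens.
function words(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9']+/g) ?? [];
}

// Return the words found in the HTML fallback that are absent from the
// Flash text. An empty result means the fallback adds no extra terms.
// The tag stripping is naive and only meant for simple markup.
function extraFallbackWords(flashText: string, fallbackHtml: string): string[] {
  const flashWords = words(flashText);
  const fallbackWords = words(fallbackHtml.replace(/<[^>]*>/g, " "));
  return fallbackWords.filter((w) => flashWords.indexOf(w) === -1);
}
```

For example, comparing the Flash text "Fresh bread daily" against a fallback of `<p>Fresh bread daily, cheap best bakery</p>` flags "cheap", "best", and "bakery" as additions. A passing check only shows the wording matches. It says nothing about how Google will actually judge the page.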

Got all that?  Until Google shows what it is indexing and acknowledges that spiders can see Flash content delivered by SWFObject, these are my recommended best practices:

Flash should be used only as a support element on web pages to enhance the user experience.  It should not be a primary content delivery method.  Google can now index Flash more effectively, but we should not let it.  We don’t know what methods it uses, or whether Flash indexing is as effective as indexing semantic HTML.  Using a JavaScript replacement such as SWFObject gives you the luxury of controlling what Googlebot indexes.  This method has been approved by Google and has virtually no risk of blacklisting as long as your replacement is ethical and contains the same data as the Flash you are replacing.  The accessibility benefits are what make the method ethical, and they are an excellent byproduct of it.

References:

  • http://googlewebmastercentral.blogspot.com/2007/07/best-uses-of-flash.html
  • http://www.google.com/support/webmasters/bin/answer.py?answer=66353
  • http://googleblog.blogspot.com/2008/06/google-learns-to-crawl-flash.html
  • http://www.adobe.com/aboutadobe/pressroom/pressreleases/200806/070108AdobeRichMediaSearch.html
  • http://news.cnet.com/8301-13530_3-9748159-28.html
  • http://www.yourseoplan.com/google-flash.html
  • http://www.conversationmarketing.com/2008/07/google-indexing-flash-dont.htm