Duplicate Content & Multiple Site Issues

SES San Jose 2009 Review

The Google Booth

After a good sleep at the Hilton San Jose, I wandered to the registration booth, got my press badge, and was on my way to the first session: Duplicate Content & Multiple Site Issues.

This panel had representation from all three main search engines: Bing, Google and Yahoo. Surprisingly, though, the best presentations came from the two non-search-engine representatives: Shari Thurow, Founder & SEO Director of Omni Marketing Interactive, and Marty Weintraub, President of aimClear.

Shari covered a lot of ground and provided great information about why you want to get rid of duplicates and what technically qualifies as duplicate content. She also mentioned some testing her company has been doing on navigational usability.

Sasi Parthasarathy from Bing

The results indicated that the optimal number of links on a page is 100. She reviewed how engines detect duplicate content using filters based on boilerplate, linkage properties, server/hostname properties, shingles and, last but not least, the content itself. Overall, Shari stressed that you should not optimize for search engines but “optimize for people who use search engines.” I couldn’t agree more, since this is what the search engines themselves are optimizing for now and will keep optimizing for in the future.
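To make the shingles filter a little more concrete, here is a minimal sketch of the general idea: break each page's text into overlapping runs of words (shingles) and compare the sets. The function names, the 3-word shingle size and the use of Jaccard similarity are my own assumptions for illustration; none of the panelists described how their engine actually implements this.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word runs) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Text shorter than one shingle: treat the whole thing as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of overlap divided by size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity(text_a, text_b, k=3):
    """Score two texts from 0.0 (no shared shingles) to 1.0 (identical shingle sets)."""
    return jaccard(shingles(text_a, k), shingles(text_b, k))
```

A score close to 1.0 suggests two pages are near-duplicates. Where to draw the line is a judgment call, and a real engine would combine this with the other filters Shari listed.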

After Shari’s serious discussion, Marty Weintraub graced us with an energetic presentation in which he listed specific suggestions for fixing duplicate content issues, with examples he has given his clients in the past. Some included: removing duplication between secure (https) and non-secure (http) versions of a page, cleaning up URLs, and canonicalization.
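As a rough illustration of what that kind of URL cleanup can look like, here is a small sketch that maps several variants of the same page onto one canonical URL. The choice of https plus a www host as the preferred form, the list of index file names and the `TRACKING_PARAMS` set are all assumptions of mine, not something Marty showed; adapt them to whatever version of your site you actually want indexed.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical query parameters that change the URL but not the page content.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid"}

# Hypothetical default documents that duplicate the bare directory URL.
INDEX_FILES = ("index.html", "index.htm", "index.php")


def canonical_url(url):
    """Collapse common duplicate variants of a URL into one preferred form."""
    parts = urlsplit(url.strip())

    # One scheme and one host form: https and www (assumed preference).
    # Note: .hostname lowercases the host and drops any explicit port.
    host = parts.hostname or ""
    if host and not host.startswith("www."):
        host = "www." + host

    # Treat /index.html and friends as the directory itself.
    path = parts.path or "/"
    for index in INDEX_FILES:
        if path.endswith("/" + index):
            path = path[: -len(index)]
            break

    # Drop tracking parameters and sort the rest so ordering cannot create duplicates.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    query = urlencode(sorted(params))

    # The fragment is discarded because it never reaches the server.
    return urlunsplit(("https", host, path, query, ""))
```

The canonical form is the one you would then point your 301 redirects at, so that every variant resolves to a single address.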

The three search engine representatives echoed the themes of the two previous speakers. Yahoo’s Ivan Davtchev told attendees that one-third of the content online is duplicate.

Other key take-aways include:

  • Use Google Webmaster Tools to identify duplicate titles and start from there
  • If 10 copies of your content are out there, search engines will just pick whichever one the algorithm considers most authoritative
  • We all love Site Explorer and hope Microsoft doesn’t kill it
  • Duplicate content is acceptable; just don’t abuse it
  • Use the robots meta tag or 301 redirects to hide the duplicate parts of your site
  • Lots of filters determine duplicate content, not just shingles
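
If you want to run the duplicate-title check from the first take-away on your own pages, a minimal sketch could look like the one below. It assumes you already have the HTML of each page in a dictionary keyed by URL; the `TitleParser`, `extract_title` and `duplicate_titles` names are hypothetical, and this is not how Google's tool works internally, just one simple way to get a similar report.

```python
from collections import defaultdict
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text found inside <title> elements."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title_parts.append(data)


def extract_title(html):
    """Return the page title, lowercased and with whitespace collapsed."""
    parser = TitleParser()
    parser.feed(html)
    return " ".join("".join(parser.title_parts).split()).lower()


def duplicate_titles(pages):
    """Map each title shared by two or more pages to the sorted list of their URLs."""
    groups = defaultdict(list)
    for url, html in pages.items():
        groups[extract_title(html)].append(url)
    return {
        title: sorted(urls)
        for title, urls in groups.items()
        if title and len(urls) > 1
    }
```

Any title that comes back with more than one URL is a good place to start looking for pages that should be merged, redirected or given distinct titles.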