
How Duplicate Content Occurs
Whether you realize it or not, there’s probably duplicate content on your website. Some of the most common sources of duplicate content are URL parameters, session IDs, categories, archives, and “tags.” Even if you think your content is being published in a single location, it could show up at half a dozen different URLs depending on the settings of your content management system (CMS).
WordPress, for instance, is a prime example of how duplicate content is created, often without the user’s knowledge. When you publish a new post or page using the default WordPress settings, it appears at the post/page URL as well as on the archive, author and tag pages. Search engines will crawl each of these locations and find the same content published at every one. So, how will this affect your website’s search rankings?
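To make the URL problem concrete, here is a minimal Python sketch of how one page can be reached through several addresses. The parameter names in IGNORED_PARAMS and the example.com URLs are assumptions chosen for illustration; the parameters that are safe to ignore on a real site depend on how that site is built.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical parameters that change the URL but not the page content.
IGNORED_PARAMS = {"sessionid", "utm_source", "utm_medium", "ref"}

def normalize_url(url: str) -> str:
    """Drop content-neutral query parameters, sort the rest, trim the trailing slash."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    )
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme, parts.netloc.lower(), path, urlencode(kept), ""))

# Three addresses a crawler might find for the same (made-up) post.
urls = [
    "https://example.com/blog/my-post",
    "https://example.com/blog/my-post/?sessionid=abc123",
    "https://example.com/blog/my-post?utm_source=newsletter&ref=home",
]
unique_pages = {normalize_url(u) for u in urls}
print(len(urls), "URLs,", len(unique_pages), "distinct page(s)")
```

Running a crawl export of your own site through a check like this is one rough way to estimate how many of your indexed URLs are really the same page.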
Video: How does Google handle duplicate content?
In the video published above, Google’s Webspam team leader Matt Cutts discusses the topic of duplicate content and how it’s handled by the search engine giant.
According to Cutts, roughly 25-30% of the entire Internet consists of duplicate content; therefore, publishing small amounts shouldn’t have a negative impact on your site’s search rankings. Elements like legal jargon, terms and conditions, quotes, and boilerplate content are perfectly fine, assuming they’re used sparingly and don’t make up the bulk of your site’s content.
With that said, there are instances where an excessive amount of duplicate content can bring down a site’s search rankings. This is particularly true in cases involving scraped content (e.g. sites that automatically copy and publish content from other websites and sources). Cutts says that Google can and will take manual action against offenders.
Cutts also reveals Google’s method for handling duplicate content. Rather than listing multiple pages with the same content, Google attempts to identify and display only the original source.
Follow These Tips To Reduce Duplicate Content:
- On WordPress sites, use an SEO plugin that lets you set author, tag and archive pages to “noindex.”
- When moving content to a new location, use 301 redirects to guide search engines.
- Google recommends the use of rel=canonical tags when publishing duplicate content.
- You can set URL parameter handling in your Google Webmaster Tools account.
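The first three tips can be sketched in a few lines of Python. This is only an illustration of the signals involved, not how any particular plugin or CMS implements them: the page types in NOINDEX_PAGE_TYPES, the MOVED map and both function names are hypothetical.

```python
from html import escape
from typing import List, Optional, Tuple

# Hypothetical page types that repeat post content (author, tag and archive listings).
NOINDEX_PAGE_TYPES = {"author", "tag", "archive"}

# Hypothetical map of moved content: old path -> new path.
MOVED = {"/old-post": "/blog/new-post"}

def head_tags(page_type: str, canonical_url: str) -> List[str]:
    """Return the <head> tags that tell crawlers how to treat a page."""
    if page_type in NOINDEX_PAGE_TYPES:
        # Listing pages: ask search engines not to index them.
        return ['<meta name="robots" content="noindex">']
    # Everything else: point at the preferred URL for this content.
    return [f'<link rel="canonical" href="{escape(canonical_url, quote=True)}">']

def redirect_for(path: str) -> Optional[Tuple[int, str]]:
    """Return (status, location) for moved content, or None if the path has not moved."""
    new_path = MOVED.get(path)
    return (301, new_path) if new_path else None

print(head_tags("tag", "https://example.com/tag/seo"))
print(head_tags("post", "https://example.com/blog/my-post"))
print(redirect_for("/old-post"))
```

In practice an SEO plugin or your web server configuration emits these tags and redirects for you; the sketch only shows which signal applies to which kind of page.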
