Are you curious about exactly what the guidelines are when it comes to duplicate content and how you can keep from having your sites unfairly penalized by Google? While there are sites that attempt to “fool the search engines” using duplicated content, there are also plenty of businesses that have substantial content blocks either across URLs or across domains that are similar. This article will help you to build a strategy for managing similar content and pages to keep from being penalized for duplicate content.
When the internet was still in its early stages and Google wasn’t as advanced in its strategies for determining the value of websites, site owners could “fool” Google and the other search engines into thinking that their sites were more valuable and relevant than they really were. For example, many sites would contain lines of the same keyword repeated over and over in order to inflate keyword density. Beyond this, the site would have no real valuable content, just a bunch of AdWords ads or sales offers.
The hope was that the site would rank high in the search engines and that the high traffic would result in high sales for the site owners. In addition to the keyword stuffing described above, some sites would be built using the same article multiple times within the same domain, in an attempt to “fool” the search engines into thinking that the site contained more unique content than it really did.
It wasn’t long before the search engines caught onto this trickery and created the duplicate content rule. The purpose of this rule was to prevent site owners from using the same articles or posts to make it look like the site had more content than it really did. However, many people have come to believe that you can be penalized for using any duplicated content at all on your website. This is simply not true, and those who understand it have no need to exhaust their resources creating more unique articles and blog posts than necessary.
These days, hundreds of websites such as EzineArticles (a popular article submission site) encourage what is known as “content syndication.” With content syndication, you can republish an article from EzineArticles on your own site. This is not something they would likely offer if content duplication had a negative impact on their own website.
Most legitimate duplicate content occurs when a company creates the same content to describe a single product, and that product page ends up being accessible through several different URLs.
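For instance, the same product page might be reachable at several addresses (hypothetical URLs for illustration, reusing the demo.com domain from later in this article):

```
http://www.demo.com/shoes
http://www.demo.com/products?item=shoes
http://www.demo.com/shoes?sessionid=1234
```

All three addresses serve the same content, which is why a crawler can mistake an ordinary store layout for deliberate duplication.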
Other cases of duplicate content include printer-only or PDF versions of web pages, which contain the same content as the regular site but are formatted for printing. All of these can cause the Google spider to index the pages as containing duplicate content, which can hurt the site’s rankings or even cause the pages to be removed from the Google search results.
Thankfully, changes to the site’s configuration, known as “canonicalization,” can be used to tell the Google spiders which URLs are most important and can keep your site from being penalized for duplicate content. This also empowers you to tell Google how you would like your web pages to be indexed.
Parameter Handling is a method that informs Google about which URL parameters to ignore, which keeps its spiders from flagging parts of your site as duplicate content. For example, using Google Webmaster Tools you can suggest 15 parameters on your site that you would like the Google spiders to ignore; if you mark the parameter “products” as one of these ignored parameters, Google will treat URLs that differ only in that parameter as a single page.
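As an illustration, with a hypothetical “products” parameter marked as ignored, these two addresses would be collapsed into one entry in the index (hypothetical URLs):

```
http://www.demo.com/shoes
http://www.demo.com/shoes?products=summer-catalog
```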
This way, if you had another URL whose content was similar enough to the above URL to be considered duplicate, but which you considered more important, the Google spider would give that page priority in the search results and ignore the one containing the duplicate content.
Many companies choose to use different versions of their URL in order to create links back to their site. For example, if your main URL is http://www.demo.com, you might also use a non-www version of the URL such as http://demo.com. Using Google Webmaster Tools you can indicate which of these domains you would like to set as your preferred domain, which causes Google to crawl and index your site according to that domain.
Of course, it's important to remember that if you hadn’t originally chosen a preferred domain, it will take some time before Google begins to index your site the way you suggest. It’s also a good idea to use a 301 redirect to send traffic from your non-preferred domain to your preferred one.
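As a sketch, a 301 redirect from the non-www domain to the www version might look like this in an .htaccess file on an Apache server (this assumes mod_rewrite is enabled, and demo.com stands in for your own domain):

```apache
RewriteEngine On
# Match requests arriving on the bare, non-preferred domain
RewriteCond %{HTTP_HOST} ^demo\.com$ [NC]
# Permanently (301) redirect them to the same path on the www domain
RewriteRule ^(.*)$ http://www.demo.com/$1 [R=301,L]
```

A permanent (301) redirect, rather than a temporary (302) one, signals to search engines that the preferred domain is the one that should be indexed.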
If you have several versions of one page, all of which contain what might be considered duplicate content, you can indicate to the search spiders which of these pages is your primary page. For example, if you have the following two URLs…
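For illustration, assume these two hypothetical addresses serving the same content:

```
http://www.demo.com/shoes
http://www.demo.com/shoes?color=blue
```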
…and you wish for the first to be considered the primary page, you can include a piece of code in the head section of the second page:
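A minimal sketch of that code, using a hypothetical primary-page URL:

```html
<link rel="canonical" href="http://www.demo.com/shoes" />
```

The rel="canonical" link element goes inside the &lt;head&gt; of each duplicate page and points at the version you want indexed.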
This code tells the search engine spiders that the contents of the second page refer back to the contents of the first. The same method can be used for every additional page that contains the same information as the primary page.
Including your preferred pages in your sitemap indicates to Google which pages of your site should be considered the most important. While this isn't as reliable as the three methods above, it's still worth setting up a sitemap to make your site more crawlable and indexable for the Google search engine spiders.
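A bare-bones sitemap listing the preferred version of a page might look like this (hypothetical URL; the file is conventionally saved as sitemap.xml at the site root and follows the sitemaps.org protocol):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- List only the canonical version of each page, not its duplicates -->
  <url>
    <loc>http://www.demo.com/shoes</loc>
  </url>
</urlset>
```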
If you have been spending a lot of money creating hundreds of unique articles, or spinning articles to generate enough unique content for your article submissions, this might be one of the most important things you’ve read in a long time. Contrary to what many have come to believe, content duplication is something you can use in your content marketing strategy, and if you do it right, it won’t hurt your SEO. As the history above shows, the duplicate content rule was created to stop trickery, and there’s no need to be afraid of it as long as you’re doing things right.
If you have a high-quality article on your website and you want to publish it on an article marketing website or on someone else’s blog, you’re not going to hurt your website by doing so. Of course, if someone searches for that specific article, the search engines will present the version of the article they believe is most relevant. In these cases, the copy on your own site might not rank as high as the one you posted on an article marketing site or on someone else’s blog.
However, it’s very rare that people search for a specific article, and even if they do, a link back to your site within the article, along with your author name, will still promote your brand no matter where the reader finds it. What you want to avoid is publishing that same article multiple times on the same domain, which is the type of trickery that led to the duplicate content rules in the first place. So if you have a good article you want to send to article submission sites, there’s no reason to water it down by spinning it or paying someone to rewrite it.
This can save you a LOT of time and money which would have otherwise been spent creating unique versions of a single article.
One thing worth mentioning when it comes to content duplication is that you don’t want to plagiarize someone else’s work. If you find an article or a blog post that you want to publish on your site, you MUST give credit to the author of the content. It’s also a good idea to contact the creator if you’re going to use their content to help promote a product of yours. Doing these things will help ensure that you don’t hear from someone’s intellectual property attorney or have a plagiarism complaint filed against you.
It’s also a good idea to use a service like Copyscape, which lets you check whether your content has been published anywhere else on the internet. This is a good way to find out who is using your content and, especially if you’re using a freelance writer, whether the content you’re about to publish on your site is already in use somewhere else. As long as you keep an eye on this and make sure you’re not copying someone else, you have no need to worry about the duplicate content penalty.
If you have places on your site with duplicate content that absolutely cannot be rewritten, you can always block Google's spiders from accessing those areas using a robots.txt file or a noindex meta tag. This encourages the search engine spiders to index only the most important pages of your site and keeps you from competing with yourself for space in the Google search results. However, Google still suggests that the best method for keeping its spiders away from pages with duplicate content is to use the four canonicalization methods above.
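As a sketch, assuming the duplicate pages live under a hypothetical /print/ directory, the robots.txt entry would look like this:

```
User-agent: *
Disallow: /print/
```

Alternatively, a noindex directive can be placed in the head section of each individual duplicate page:

```html
<meta name="robots" content="noindex">
```

Note the difference: robots.txt stops spiders from crawling the pages at all, while the noindex meta tag lets them be crawled but keeps them out of the search results.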
Always remember that when it comes to content, your number one priority is providing value to your website visitors. If you stay focused on this and continue to educate yourself on the principles of SEO, you will do much better than your competitors. You can get started by signing up to receive your FREE seven-day mini-course which walks you step by step through the process of laying a strong foundation for SEO success. To get instant access to this mini-course and watch your site climb the search results, hurry and fill in the form below.