Duplicate content plays an increasingly important role in search engine optimization, and it was no coincidence that one of the biggest sessions at the SMX Advanced conference in Seattle was the one on duplicate content. Vanessa Fox's article on the Duplicate Content session at SMX Advanced has now finally appeared in German translation on the Google Webmaster Central blog. The central claim of the article is that pages with duplicate content are neither penalized nor moved into the supplemental index. It does concede, however, that a low PageRank can cause pages to slip into the supplemental index. So once again: duplicate content is not punished, but every webmaster should still try to avoid it as far as possible.
So what are the most common ways to produce unintentional duplicate content?
1. Generating many sub-pages with little unique content. A good example is the category pages of web directories that were created with few or no entries.
2. Using a CMS that allows the same content to be reached under multiple URLs. It is amazing how many CMSs make this possible.
3. Creating sub-pages that differ only in small text fragments. For example: identical texts about hotels in German cities, where only the city name changes.
4. Forgetting to set up a 301 redirect from http://domainname.com to http://www.domainname.com. The redirect avoids duplicate content and has the additional advantage that the PageRank of all incoming links is concentrated on one address.
5. Linking to the home page internally as www.deineDomain.de/index.html, /index.htm or /index.php. What deeper sense is there in linking the start page this way when the entire internet points to www.deineDomain.de? The content exists twice, and the internal flow of PageRank is split up.
6. Offering print or archive pages that are not blocked for spiders via robots.txt.
7. Using session ID parameters in the site's URLs. Every time the spider comes by, it thinks there are new sub-pages to crawl on the domain.
8. Using URL parameters for tracking. One of the most popular uses is passing a partner ID for affiliate programs. A search engine, however, sees a URL such as www.deineDomain.de?partner1234 as a duplicate of www.deineDomain.de. It would probably make more sense to work with cookies.
9. Putting websites online that ignore unknown URL parameters. If someone outside the site links to it incorrectly, even without intent, this can quickly lead to duplicate content.
10. Blogs are also known for producing duplicate content. Here you should definitely mark the RSS feed and trackback links as nofollow, or keep the robots out via robots.txt.
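Points 4, 5, 7 and 8 of the list all come down to several URLs leading to the same content. As a rough sketch of how such variants could be mapped onto one canonical address (for example before issuing a 301 redirect), the following Python function may help. The host names, the list of index files and the parameter names (`sid`, `sessionid`, `phpsessid`, `partner`) are assumptions for illustration and would have to be adapted to the actual site. It also assumes the partner ID is passed in `key=value` form, such as `?partner=1234`.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical values for illustration only; adapt them to your own site.
CANONICAL_HOST = "www.deinedomain.de"
BARE_HOST = "deinedomain.de"
INDEX_FILES = ("index.html", "index.htm", "index.php")
IGNORED_PARAMS = {"sid", "sessionid", "phpsessid", "partner"}


def canonical_url(url: str) -> str:
    """Map common duplicate URL variants onto one canonical URL."""
    parts = urlsplit(url)

    # Point 4: treat the bare domain and the www host as the same site.
    host = parts.netloc.lower()
    if host == BARE_HOST:
        host = CANONICAL_HOST

    # Point 5: /index.html, /index.htm and /index.php mean the directory itself.
    path = parts.path or "/"
    for name in INDEX_FILES:
        if path.endswith("/" + name):
            path = path[: -len(name)]
            break

    # Points 7 and 8: drop session and tracking parameters, keep the rest.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]

    # The fragment is dropped because it is never sent to the server.
    return urlunsplit((parts.scheme, host, path, urlencode(query), ""))
```

A web server or CMS could compare the requested URL with `canonical_url(requested)` and answer with a 301 redirect whenever the two differ. Whether the redirect is done in application code or in the server configuration is a design choice; the sketch only shows the mapping itself.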
As you can see, there are many ways to unintentionally produce duplicate content for the search engines, and the list I have drawn up above can surely be extended by a few more points. Everyone should be clear that the best practice for duplicate content is to avoid it from the beginning.
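For the print, archive, feed and trackback pages mentioned in the list, it can be worth checking that the robots.txt rules really block what they are supposed to block. The following is a minimal sketch using Python's standard `urllib.robotparser` module. The robots.txt content and the paths `/print/`, `/archive/`, `/feed/` and `/trackback/` are assumed examples, not rules taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Assumed example rules; the paths are hypothetical and depend on the CMS or blog.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
Disallow: /archive/
Disallow: /feed/
Disallow: /trackback/
"""


def blocked_for_spiders(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if the given robots.txt text disallows the URL for the agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.deinedomain.de/print/hotel-berlin.html",
        "http://www.deinedomain.de/hotel-berlin.html",
    ):
        print(url, "blocked:", blocked_for_spiders(ROBOTS_TXT, url))
```

Note that this only checks the rules as the standard library interprets them; individual search engines may treat extensions such as wildcards differently.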
How do search engines recognize duplicate content?
When search engines look for duplicate content, they first filter out everything on the page that is template-based: the header, footer and navigation that appear on every page of the same domain. This is taken as a given and has no negative impact. The remaining content, which is unique to the page, is then examined in detail and compared for uniqueness against the content of all pages the search engine has spidered. One known way in which search engines check for duplicate content is the sliding window method, in which a fixed number of characters on the page is checked for uniqueness at a time. Every webmaster is well advised to check their own content from time to time with a tool such as Copyscape. This also quickly reveals whether laboriously written texts are being used by other website operators ;-)
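To make the sliding window idea a little more concrete, here is a small Python sketch. It is not how any particular search engine works; it only illustrates the principle of moving a fixed-size window over the text and comparing the resulting fragments. The window size of 20 characters and the use of the Jaccard coefficient (shared fragments divided by all fragments) as the similarity measure are my own assumptions.

```python
def shingles(text: str, size: int = 20) -> set:
    """Collect every fragment of `size` characters seen by a sliding window."""
    # Normalize case and whitespace so trivial differences do not matter.
    normalized = " ".join(text.lower().split())
    if not normalized:
        return set()
    if len(normalized) <= size:
        return {normalized}
    return {normalized[i:i + size] for i in range(len(normalized) - size + 1)}


def similarity(a: str, b: str, size: int = 20) -> float:
    """Jaccard similarity of the two fragment sets, between 0.0 and 1.0."""
    set_a, set_b = shingles(a, size), shingles(b, size)
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


if __name__ == "__main__":
    template = (
        "Our hotel in {city} offers comfortable rooms, free breakfast "
        "and a quiet location close to the main station."
    )
    berlin = template.format(city="Berlin")
    munich = template.format(city="Munich")
    print(similarity(berlin, munich))
```

Two hotel texts that differ only in the city name, as in point 3 of the list, share most of their fragments and therefore score high, while unrelated texts score close to zero. A real system would first strip the template parts of the page, as described above, and would work on far larger amounts of text.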
In conclusion, every webmaster should aim to publish unique content on their website, because it is in the search engines' interest to present high-quality, unique pages to their users in the search results. You should also try not to have any links pointing to websites with duplicate content, since links to such pages can quickly backfire.
Finally, I do not want to keep from you another strategy for avoiding duplicate content, which Gerald has posted on his blog. Somewhat radical in its implementation, but absolutely worth reading for SEOs ;-)