Today we are going to talk about an ongoing issue that many of us in the search marketing world have been fighting with: duplicate content. Usually, duplicate content appears as a result of a website redesign, content updates or poor site design. Somehow you end up with more than one URL for a single page of content on your site. This not only dilutes your links, it also confuses the search engines, which could result in a low ranking for your site.
Although this is one of the most important ranking factors for your site, it seems that not a lot of people know that the problem can be overcome with a few onsite or server file changes.
www versus the non www:
First up is the www versus the non www. Search engines see http://www.site.com and http://site.com as different sites. So if you have both versions available and both get links from other websites, you are splitting the link juice to your homepage and reducing your site’s internal linking power. Pick one version and 301 redirect the other to it.
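As a rough illustration of that redirect rule, here is a minimal Python sketch. It assumes the www version is the preferred one; the names PREFERRED_HOST, ALTERNATE_HOSTS and www_redirect_target are made up for this example, and on a real site the rule would normally live in the web server configuration.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption for this sketch: the www host is the preferred version.
PREFERRED_HOST = "www.site.com"
ALTERNATE_HOSTS = {"site.com"}


def www_redirect_target(url):
    """Return the URL to 301 redirect to, or None if no redirect is needed.

    Ports are not handled in this sketch.
    """
    parts = urlsplit(url)
    if parts.hostname in ALTERNATE_HOSTS:
        # Keep scheme, path and query; only the host changes.
        return urlunsplit(
            (parts.scheme, PREFERRED_HOST, parts.path or "/", parts.query, parts.fragment)
        )
    return None


print(www_redirect_target("http://site.com/about?x=1"))
print(www_redirect_target("http://www.site.com/about?x=1"))
```

Whichever host you choose, the point is that only one of the two ever answers with the page itself.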
Default file extension:
The next issue pertains to default file names. Can your homepage also be reached at “site.com/index.php”, /index.html, /index.cfm etc.? If any of these URLs load without a 301 redirect to the root URL, then you more than likely have duplicated content on your site.
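A quick way to picture the fix is a function that maps a default-file URL to the folder URL it duplicates. This is only a sketch: the pattern covers the three extensions named above, and strip_default_file is a hypothetical helper name.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Matches a trailing /index.php, /index.html, /index.htm or /index.cfm.
DEFAULT_FILE = re.compile(r"/index\.(php|html?|cfm)$", re.IGNORECASE)


def strip_default_file(url):
    """Return the 301 target without the default file name, or None if there is none."""
    parts = urlsplit(url)
    new_path = DEFAULT_FILE.sub("/", parts.path)
    if new_path == parts.path:
        return None
    return urlunsplit((parts.scheme, parts.netloc, new_path, parts.query, parts.fragment))


print(strip_default_file("http://www.site.com/index.php"))
print(strip_default_file("http://www.site.com/blog/index.html?p=2"))
```

Remember to update your internal links as well, so they point at the folder URL rather than at the file name.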
HTTP and HTTPS versions:
Does your site have both HTTP and HTTPS versions? HTTPS versions of your site are mainly created for website security purposes and users, but if you have both, it is important that the search engines can only reach one of them. Otherwise every page exists at two URLs.
SEO URL rewrites:
Don’t forget to make sure that all your old URLs are redirected via 301 to the new URLs.
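To make this concrete, here is a small sketch of an old-to-new redirect map in Python. The URLs in REDIRECTS are invented for illustration, and resolve_redirect is a hypothetical helper. It follows chains so that every old URL is sent straight to its final destination in a single 301.

```python
# Hypothetical old-to-new URL map; replace with your own URLs.
REDIRECTS = {
    "/page.php?id=12": "/widgets/blue/",
    "/widgets/blue/": "/blue-widgets/",
    "/page.php?id=13": "/red-widgets/",
}


def resolve_redirect(path, redirects=REDIRECTS):
    """Return (301, final_url) for an old URL, or (200, path) if it has not moved."""
    seen = set()
    target = path
    while target in redirects:
        if target in seen:
            raise ValueError("redirect loop at " + target)
        seen.add(target)
        target = redirects[target]
    if target == path:
        return (200, path)
    return (301, target)


print(resolve_redirect("/page.php?id=12"))
print(resolve_redirect("/blue-widgets/"))
```

Keeping the map in one place also makes it easy to check for loops before you deploy it.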
Pagination:
Although an excellent way for your users to navigate your site, pagination can be confusing to the search spiders if improperly implemented, so be sure to apply canonical tags to help control the spider’s activity.
Article titles and old versus new URLs:
Occasionally, when you change the article title, the URL of the page may change as well. Make sure that all your old URLs are redirected to the new ones.
Printer friendly version of a webpage:
If you have a printer friendly version of your site, the search engines may see it as a duplicate. The best way to fix this is by including a rel canonical link tag within the HTML head section of the printer friendly page, pointing to the regular version.
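For reference, the tag itself is a single line. The Python sketch below just builds that line for a given URL. The function name canonical_link and the example URL are assumptions for illustration.

```python
from html import escape


def canonical_link(canonical_url):
    """Build the rel canonical link tag for the HTML head of the duplicate page."""
    # escape() keeps characters such as & and " from breaking the attribute.
    return '<link rel="canonical" href="{}">'.format(escape(canonical_url, quote=True))


# The printer friendly page points back at the regular article URL.
print(canonical_link("http://www.site.com/article"))
```

The printed line goes into the head of the printer friendly page, not the regular one.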
Campaign Landing Pages:
Sometimes webmasters will create several versions of a landing page to test conversions, but these pages can cause a duplicate content issue because they create several URLs for the same page.
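One possible way to handle this, assuming the test versions differ only by URL parameters, is to work out a single clean URL and use it as the canonical for every version. In the sketch below the "variant" parameter and the helper name canonical_landing_url are hypothetical, and utm_ parameters are treated as tracking only.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter that selects the test version of the page.
TEST_PARAMS = {"variant"}


def canonical_landing_url(url):
    """Drop test and utm_ tracking parameters so all versions share one URL."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TEST_PARAMS and not key.startswith("utm_")
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


print(canonical_landing_url("http://www.site.com/offer?variant=b&utm_source=mail"))
print(canonical_landing_url("http://www.site.com/offer?variant=a&plan=pro"))
```

If your test versions live at completely separate paths instead, this approach does not apply and you would need to name the canonical URL by hand.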
Of course, there are a large number of other ways that you could end up with duplicate content on your site, including incorrect internal navigation and different pages with the same meta data. The list goes on. The above are just some of the most common problems off the top of my head. Fixing them, if you happen to find them, will help you to provide your users with more relevant and unique content.
Now it’s your turn. Feel free to let me know if there is anything you would like to add to this list.