Google is all over duplicate content, and your site will suffer for it. But what exactly is duplicate content? The biggest problem is that Google doesn’t always tell you there is a problem. You can set up Google Alerts on your domains and addresses, but that’s no help for code and template duplication. So I will try to cover the most common ways pages get excluded from the main search results for duplicate content.
- Your site answers on both the plain http://website and the http://www.website hostname. Use a 301 redirect or a mod_rewrite rule to send one to the other. This canonical redirect helps Google understand that the www and non-www versions are NOT 2 separate pages. You see, URLs are like phone numbers: each one is treated as its own unique address. So Google thinks these 2 versions of your domain’s URLs are in fact different numbers, so to speak.
- Your site has no standardized handling for the entry page, or default document. This means the index/default/main page can be viewed live and independently of www.mysite.com. For example, /index.html is exactly the same page as your domain URL. You fix this with a 301 (permanent) redirect from the file name to the domain URL.
- If you use a template, the opportunity for content duplication increases dramatically, because the same site elements repeat across many or all pages. To combat this, when you add page-specific text, it must be ENOUGH text to make that page stand out as unique from the other templated pages.
- Change your meta tags and page titles for every page. This one is easy and highly important. Use the <head> elements to describe each page properly and you will be fine.
- If you are going to share files across domains, link to them instead of copying them. Yes, even your own stuff can be duplicate content.
- In Google Webmaster Tools it is advisable that you choose a preferred domain under the diagnostic tab, and by all means, while you are there, check the content analysis for duplicate information Google has found on your site.
- The #1 duplication issue I see among the stores we analyze is product description duplication. You MUST write great, unique product descriptions for your product to be successful. Using the manufacturer’s description creates content that is duplicated, in whole or in part, by every other distributor, and many times by the supplier as well. My thinking is that your site is then NOT the authority for this content, and Google will not display your products in search.
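- Your robots.txt WILL NOT block your page from being indexed. It only stops Google from crawling the page, and a blocked URL can still end up in the index if other pages link to it. You must use a noindex, nofollow robots meta tag or other means that specifically DISALLOW indexing of these pages, and the page has to stay crawlable so Google can actually see the tag.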
- In general, when adding textual content (product descriptions), paraphrase and add rich text. That’s really what Google wants. Unique pages perform well; duplicate pages rarely show up.
- Be extremely careful with content generators; most of the time the content is duplicate. Try paying a college student to write the copy for you. Good investment!
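To make the www/non-www and index-page items above a little more concrete, here is a rough Python sketch of the mapping a canonical redirect performs. Treat it as an illustration only: the function names, the www.example.com host, and the list of default document names are all made up for this example, and on a real store you would express the same rule in your server’s redirect or rewrite configuration.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed default-document names; use whatever your server really serves.
DEFAULT_DOCUMENTS = {"index.html", "index.htm", "index.php", "default.html", "main.html"}


def canonical_url(url, preferred_host="www.example.com"):
    """Return the single URL that all duplicate variants should 301 to."""
    parts = urlsplit(url)
    host = parts.netloc.lower()

    # Treat the bare domain and the www domain as the same site.
    bare = preferred_host[4:] if preferred_host.startswith("www.") else preferred_host
    if host in (bare, "www." + bare):
        host = preferred_host

    # Collapse /index.html (and friends) into the directory URL.
    path = parts.path or "/"
    head, _, last = path.rpartition("/")
    if last.lower() in DEFAULT_DOCUMENTS:
        path = head + "/"

    return urlunsplit((parts.scheme, host, path, parts.query, parts.fragment))


def needs_redirect(url, preferred_host="www.example.com"):
    """True when the requested URL is a duplicate of its canonical form."""
    return canonical_url(url, preferred_host) != url


if __name__ == "__main__":
    for requested in ("http://example.com/index.html", "http://www.example.com/shop/"):
        print(requested, "->", canonical_url(requested))
```

If `needs_redirect` returns True for a URL that your server answers with a normal 200 page, that URL is a candidate duplicate. Hosts with a port number are left untouched by this sketch.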
We theorize that 70% of your total page must be unique. Here are a few tools to help check it:
- This tool compares any 2 pages
- This tool generates a detailed canonical report
- This tool will detect duplicate text
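If you want a quick home-made check to go with those tools, the sketch below estimates how much of one page’s text is not shared with another page, using overlapping three-word phrases ("shingles"). This is my own simple stand-in, not how Google or any of the tools above measure duplication, and the function names and the three-word window are assumptions. Feed it the visible text of the pages, not the raw HTML.

```python
import re


def shingles(text, size=3):
    """Split text into a set of overlapping word sequences of the given size."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def unique_share(page_text, other_text, size=3):
    """Fraction of the page's shingles that do NOT appear in the other text.

    Returns 0.0 when the page is too short to produce any shingles.
    """
    own = shingles(page_text, size)
    if not own:
        return 0.0
    shared = own & shingles(other_text, size)
    return 1 - len(shared) / len(own)


if __name__ == "__main__":
    # Made-up example: same template sentence, different product copy.
    page = "free shipping on all orders red widget with steel frame"
    other = "free shipping on all orders blue gadget with plastic case"
    print(f"{unique_share(page, other):.0%} of the page is unique")
```

Comparing a product page against the manufacturer’s description, or against another page built from the same template, gives you a rough number to hold up against the 70% guess.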
How do I know I have duplicate content in Google? Well, even though we are signed up for their tools and participate in all their little programs, they don’t bother to go out of their way to inform us, even though most of the time website duplication is unintentional and sometimes the webmaster is actually the victim (scraping). So here are some helpful tools to monitor your store’s pages in Google.
- A Google search for site:yourdomain.com will give you the indexed pages, including all subdomains. If you track this number over time it can be helpful: if you see a drop, you might have had pages pulled from the main index. This is vague at best.
- This is the Google Cache Tool, and you are going to love this. It tells you how many pages in your domain are currently cached in Google.com, when the cached file was recorded, and the change in the cache, like +5 or -1 pages. A little more help is needed, since we still can’t identify which pages have issues, but at least we have a time frame to narrow it down if you log changes to the site.
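Since both of these checks only pay off if you write the numbers down, here is a small Python sketch for logging page counts and spotting drops. It assumes you record each count by hand (from a site: search or the cache tool); nothing here talks to Google, and the function name and sample figures are invented for illustration.

```python
from datetime import date


def find_drops(observations, min_drop=1):
    """Return (previous_day, day, change) for each fall in the page count.

    observations: iterable of (date, page_count) pairs, in any order.
    min_drop: smallest decrease worth reporting.
    """
    ordered = sorted(observations)
    drops = []
    for (prev_day, prev_count), (day, count) in zip(ordered, ordered[1:]):
        change = count - prev_count
        if change <= -min_drop:
            drops.append((prev_day, day, change))
    return drops


if __name__ == "__main__":
    # Made-up weekly readings.
    log = [
        (date(2024, 3, 1), 120),
        (date(2024, 3, 8), 125),
        (date(2024, 3, 15), 119),
    ]
    for start, end, change in find_drops(log):
        print(f"{change} pages between {start} and {end}")
```

Each reported window tells you which of your logged site changes to look at first.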
Lastly, what do you do if your store’s pages get removed? Think unique and get to work. Once you have stellar, content-rich, unique pages, create links to them to help Google find its love for them once again.
I leave you with this thought for the day: the process is strict and possibly ridiculous right now, but as with anything else, change inevitably brings about chaos.