Training GoogleBot

Increasing your crawl rate and crawl budget from Google is as simple as training GoogleBot. Crawl rate is simply how often GoogleBot visits your website to crawl it. Crawl budget is a much harder metric to define, but in its simplest terms it is the amount of data transfer Google allocates to crawling your site. These metrics are influenced by many factors, but the most important is the need to crawl: if you always have fresh content, Google needs to crawl more often.

Why would we seek to increase our crawl rate and crawl budget?

Very simply, a more frequent crawl rate and a larger crawl budget allow your pages to be indexed more quickly and your site to be crawled more deeply. While neither of these things helps you rank in a specific manner, they both contribute to your site’s trust and freshness scores, which are part of the algorithm. Additionally, much information has come out recently about the relationship between content freshness and PageRank…. So even those “old time” PR watchers have reason to take notice.

The fact is, aside from the time needed to create fresh, unique content for your website, this is the easiest thing you can do for your site’s SEO campaign. Creating content regularly is key for these metrics. Note that new pages have the best effect; freshening existing content is also important, but not as effective as a new page. Like the old saying goes, “If you’re not growing, you’re dying.”

A great example is a client who pays for content for his blog. He buys and receives 30 blog posts in a batch…. He would make the most effective use of this content by releasing the posts over a longer period of time, instead of publishing them all in one batch on his website. Again, this goes to creating the habit of crawling more frequently, thus training GoogleBot.
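To make the drip-release idea concrete, here is a minimal Python sketch that spreads a purchased batch of posts over a steady schedule. The post names, the two-day cadence, and the publish_post call are all hypothetical, just for illustration; they are not tied to any particular blogging platform.

from datetime import date, timedelta

def schedule_posts(posts, start=None, interval_days=2):
    """Spread a batch of posts over time instead of publishing them all at once.

    posts         -- list of post titles or IDs received in one batch
    start         -- date of the first release (defaults to today)
    interval_days -- days between releases; the steady cadence is the point
    """
    start = start or date.today()
    return [(start + timedelta(days=i * interval_days), post)
            for i, post in enumerate(posts)]

# Example: 30 purchased posts released every 2 days instead of in one dump.
batch = [f"Post {n}" for n in range(1, 31)]
for publish_date, post in schedule_posts(batch):
    # publish_post(post, publish_date) would be your CMS call (hypothetical).
    print(publish_date.isoformat(), post)

The exact interval matters less than the consistency: a post every couple of days gives GoogleBot a reason to come back on a predictable rhythm, which a one-day dump of 30 posts does not.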

Many ask me what the magic number is…. How many pages a week should I create?

I suspect that this number is a minimum of 1 to 5% of your total pages per week. Higher percentages are obviously more effective, but if you cannot keep it up on a regular schedule, then the effect is lost. You also do not want to create so much new content that Google has to sandbox you, or you start to see huge fluctuations in your indexing and rank; I would put the high end at around 10% for this purpose. Sometimes you will have more, such as when you launch a new section on your site or have a huge influx of seasonal products, and that’s okay…. Just remember to continue at your regular rate, forever.
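As a rough back-of-the-envelope check, here is a small Python sketch of that 1 to 5% weekly guideline, with 10% as the suggested ceiling. The site sizes at the bottom are made-up examples.

def weekly_new_page_range(total_pages, low=0.01, high=0.05, ceiling=0.10):
    """Suggested new pages per week as a share of total pages.

    low/high reflect the 1-5% guideline above; ceiling is the ~10% upper
    bound beyond which indexing and rankings may start to fluctuate.
    """
    return (round(total_pages * low),
            round(total_pages * high),
            round(total_pages * ceiling))

# Hypothetical site sizes, purely for illustration.
for pages in (200, 1_000, 10_000):
    lo, hi, cap = weekly_new_page_range(pages)
    print(f"{pages} pages: aim for {lo}-{hi} new pages/week (ceiling ~{cap})")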

How does Google know I have new pages?

Believe it or not, Google knows. Google finds new pages from links, from crawling your site, from your sitemap, and from social media platforms. There was an interesting answer to a sitemap question posted at Google Webmaster Help, where Googler John Mu responded as follows:

Google’s Sitemaps crawler usually reacts to the update frequency of your Sitemap files. If we find new content there every time we crawl, we may choose to crawl more frequently. If you can limit updates of the Sitemap files to daily (or whatever suits your sites best), that may help. Similarly, if you create a shared Sitemap file for these subdomains, that could help by limiting the number of requests we have to make for each subdomain — you
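To illustrate the advice in that answer, here is a minimal Python sketch that writes a sitemap in the standard sitemaps.org format and regenerates the file at most once per day, so the sitemap only looks “updated” when there is genuinely fresh content behind it. The URLs and the once-daily rule are assumptions for illustration, not a prescription from Google.

import os
from datetime import date
from xml.sax.saxutils import escape

def write_daily_sitemap(urls, path="sitemap.xml"):
    """Write a sitemaps.org-style sitemap, but at most once per calendar day.

    urls -- iterable of (loc, lastmod) pairs; lastmod is an ISO date string.
    """
    # Skip the rewrite if the file was already regenerated today, so its
    # update frequency stays limited to daily, as suggested above.
    if os.path.exists(path):
        if date.fromtimestamp(os.path.getmtime(path)) == date.today():
            return
    entries = "\n".join(
        "  <url>\n"
        f"    <loc>{escape(loc)}</loc>\n"
        f"    <lastmod>{lastmod}</lastmod>\n"
        "  </url>"
        for loc, lastmod in urls
    )
    xml = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{entries}\n</urlset>\n"
    )
    with open(path, "w", encoding="utf-8") as f:
        f.write(xml)

# Hypothetical pages; each lastmod should reflect when that page last changed.
pages = [
    ("https://www.example.com/", "2012-01-02"),
    ("https://www.example.com/blog/latest-post", date.today().isoformat()),
]
write_daily_sitemap(pages)

The design choice mirrors the quote: keep the lastmod dates honest for each page, and do not churn the sitemap file itself more often than your content actually changes.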