Search Engine Crawl Rates: Useful Wordpress Plugin
This week I have started using a Wordpress plugin called “Crawl rate Tracker”.
The plugin displays a graph of how often your website has been visited by the indexing spiders from Google, Live Search (Microsoft), Yahoo and Technorati. This is important since it is an indicator of how much trust particular pages attract from a search engine.
The graph below is from the Wardman Wire Blog (i.e., not the “Magazine style” front page, which is a separate Wordpress installation) for the few days since I installed the plugin. As you can see, the blog has had more than 1,000 search engine spider visits on some days - which makes clear why it is important to filter “robots” out of your traffic statistics. Also, the front page is being indexed up to 13 times a day by Google.
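For anyone who wants to do that filtering by hand, here is a minimal sketch of separating spider visits from human visits by looking at the user-agent string of each request. This is not how the plugin itself works (I have not checked its code). The function names are made up for illustration, and the signature substrings are my assumption about how the four spiders identify themselves, so check them against your own server logs.

```python
from collections import Counter

# Assumed user-agent substrings for the four spiders; verify against your logs.
SPIDER_SIGNATURES = {
    "Google": "googlebot",
    "Live Search": "msnbot",
    "Yahoo": "slurp",
    "Technorati": "technoratibot",
}


def identify_spider(user_agent):
    """Return the search engine name for a spider user agent, or None."""
    ua = user_agent.lower()
    for engine, signature in SPIDER_SIGNATURES.items():
        if signature in ua:
            return engine
    return None


def split_traffic(user_agents):
    """Count spider visits per engine, and all other visits separately."""
    spiders = Counter()
    humans = 0
    for ua in user_agents:
        engine = identify_spider(ua)
        if engine is None:
            humans += 1
        else:
            spiders[engine] += 1
    return spiders, humans
```

Feeding `split_traffic` the user-agent column from an access log gives a per-engine spider count plus a count of everything else, which is the figure you actually want in your traffic statistics.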
You can click through to the report for each page.
This is the one for my article “What can Technology bring to the Political Process next?” published in June 2007. I’d be interested to know why that article is attracting so much attention - but the web moves in mysterious ways.
My Notes on the plugin:
- It provides very useful information.
- It is one more reason why I think that self-hosted Wordpress is preferable to Blogger as a platform.
- It could possibly use a lot of CPU power (judging by the amount and frequency of data it generates), and so there may be some advantage in switching off the plugin when you are not using it actively.
You can download the plugin from Blogstorm (zip file), and read some more information here.
[tags]google spider, search engine, slurp, technorati spider, web robot, web spider, wordpress plugin, yahoo spider[/tags]


Unless you host your blog on your own PC it shouldn’t affect your CPU speed. I run my own blog, but it is on someone else’s servers, and they should be able to handle the workload easily.
It looks like an interesting plugin.
I was thinking of the load on web servers, especially for those where blogs are hosted on an account with multiple customers on one server.
Certain plugins which do a lot of database accesses are notorious resource hogs, such as Slimstat and Feed Wordpress. This looks to me to be the same since it records each page index. When I took Slimstat off my blog, just a few months of stats records constituted about 85% of the database size (I could have compressed and summarized it, but there was also a compatibility issue with Wordpress 2.5).
On this blog - assuming that the information is stored in the database - it is doing around 2500 accesses each day for the blog view, and another 800 or so for the magazine view. That may be doubled when Google does a deep scan.
That’s a lot. I don’t use wp-slimstat anywhere now for that reason.
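To put rough numbers on it, here is a back-of-envelope sketch. It assumes one database write per recorded spider visit, and that a deep scan doubles both the blog and magazine figures. Both are my assumptions rather than anything measured, and the function name is just for illustration.

```python
def daily_db_writes(blog_visits, magazine_visits, deep_scan=False):
    """Estimate daily writes, assuming one write per recorded spider visit."""
    total = blog_visits + magazine_visits
    # Assumption: a deep scan roughly doubles the whole day's spider traffic.
    return total * 2 if deep_scan else total


normal_day = daily_db_writes(2500, 800)
deep_scan_day = daily_db_writes(2500, 800, deep_scan=True)
print(normal_day, deep_scan_day)  # prints "3300 6600"
```

So that is roughly 3,300 extra writes on an ordinary day and perhaps 6,600 on a deep-scan day, all on top of normal visitor traffic.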
On other blogs - such as welsh-politics.co.uk - I have had the account hit CPU or memory limits when Google has been running a full index.
Hope that clarifies what I mean.
Thanks, it does clarify it.
@The Morningstar: Good - thanks for coming back.