Block Bad robots, spiders, crawlers and harvesters

Many examples across the internet use mod_rewrite to block bad bots, and we provide such an
example as well. But what can you do when mod_rewrite is not available? You can use the
SetEnvIfNoCase directive (from mod_setenvif) in combination with FilesMatch:

SetEnvIfNoCase user-agent  "^BlackWidow" bad_bot=1
SetEnvIfNoCase user-agent  "^Bot\ mailto:craftbot@yahoo.com" bad_bot=1
SetEnvIfNoCase user-agent  "^ChinaClaw" bad_bot=1
SetEnvIfNoCase user-agent  "^Custo" bad_bot=1
SetEnvIfNoCase user-agent  "^DISCo" bad_bot=1
SetEnvIfNoCase user-agent  "^Download\ Demon" bad_bot=1

<FilesMatch "(.*)">
Order Allow,Deny
Allow from all
Deny from env=bad_bot
</FilesMatch>

How does it work? If a string or regular expression matches the User-Agent HTTP header
(ignoring case), the server sets the bad_bot environment variable. The FilesMatch block then
tells the server to deny access (return a 403 Forbidden page) to every user or bot whose
User-Agent matched any of the patterns above.
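
The Order, Allow and Deny directives belong to the Apache 2.2 access-control syntax; on Apache 2.4 they are only available through mod_access_compat. As a rough sketch, assuming Apache 2.4 with mod_authz_core and the same bad_bot variable set by the SetEnvIfNoCase lines, an equivalent block might look like this:

```apache
# Sketch only: assumes Apache 2.4+ and that bad_bot was set
# by the SetEnvIfNoCase lines shown earlier.
<FilesMatch "(.*)">
    <RequireAll>
        # Allow everyone by default...
        Require all granted
        # ...except requests where the bad_bot variable is set.
        Require not env bad_bot
    </RequireAll>
</FilesMatch>
```

Which form applies depends on the Apache version and modules your host runs, so test the rule on a non-critical path first.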

And, of course, here is the mod_rewrite-based example:

RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^Widow [OR]
RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus
RewriteRule ^.* - [F,L]


What does it do? Each RewriteCond tests the User-Agent header against a string or regular
expression, and the [OR] flag chains the conditions so that a single match is enough. When
there is a match, the RewriteRule returns a Forbidden error page.
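
As a possible variation, the separate conditions can be folded into one pattern using alternation. The sketch below uses the same four user agents as the example above and adds the [NC] flag on the assumption that you want case-insensitive matching, as SetEnvIfNoCase provides in the first approach:

```apache
RewriteEngine On
# One condition with alternation replaces the chain of [OR] conditions.
# [NC] makes the match case-insensitive (an assumption, not in the original rule).
RewriteCond %{HTTP_USER_AGENT} ^(Widow|WWWOFFLE|Xaldon\ WebSpider|Zeus) [NC]
# [F] returns 403 Forbidden and already implies [L].
RewriteRule ^ - [F]
```

A single condition is easier to scan for a short list. For a long list of bots, one condition per line may be easier to maintain.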

 
