Block bad robots, spiders, crawlers and harvesters

There are lots of examples across the internet that use mod_rewrite, and we will provide such an
example as well. However, what do you do when mod_rewrite is not available? You can use the
SetEnvIfNoCase directive in combination with FilesMatch:
SetEnvIfNoCase User-Agent "^BlackWidow" bad_bot=1
SetEnvIfNoCase User-Agent "^Bot\ mailto:craftbot@yahoo.com" bad_bot=1
SetEnvIfNoCase User-Agent "^ChinaClaw" bad_bot=1
SetEnvIfNoCase User-Agent "^Custo" bad_bot=1
SetEnvIfNoCase User-Agent "^DISCo" bad_bot=1
SetEnvIfNoCase User-Agent "^Download\ Demon" bad_bot=1

<FilesMatch "(.*)">
    Order Allow,Deny
    Allow from all
    Deny from env=bad_bot
</FilesMatch>

How does it work? If the string or regular expression matches the User-Agent HTTP header, the
bad_bot environment variable is set. Then, inside the FilesMatch block, we tell the server to deny
access (show a Forbidden page) to every user/bot that matched any of the strings above.
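
Note that on Apache 2.4 and newer the Order/Allow/Deny directives are deprecated and only work
when mod_access_compat is loaded. Here is a minimal sketch of the same FilesMatch block using the
newer Require syntax, assuming the standard authorization modules (mod_authz_core and
mod_authz_host, which provides the env provider) are enabled:

<FilesMatch "(.*)">
    # Apache 2.4+ sketch: allow everyone except requests where bad_bot was set
    <RequireAll>
        Require all granted
        Require not env bad_bot
    </RequireAll>
</FilesMatch>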

And of course, here is the mod_rewrite based example:

RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^Widow [OR]
RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus
RewriteRule ^.* - [F,L]


What does it do? Each RewriteCond checks whether the User-Agent matches the given string or
regular expression. If any of the conditions matches, the RewriteRule returns a Forbidden error
page for the request.
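
If the list of bad bots grows long, the conditions can also be collapsed into a single
case-insensitive alternation. This is only a sketch of the same idea using the user agents from
the example above; adapt the list to your own needs:

RewriteEngine On
# One condition matching any of the listed user agents, case-insensitively ([NC])
RewriteCond %{HTTP_USER_AGENT} ^(Widow|WWWOFFLE|Xaldon\ WebSpider|Zeus) [NC]
# Return 403 Forbidden and stop processing further rules
RewriteRule ^.* - [F,L]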

 
