Blocking unwanted guests with NGINX

2011-04-04

Lately, my sites have been hammered by “unwanted behavior”. Because I cannot blindly trust my WordPress installs to be 100% bug-free, I like to block as much of this unwanted behavior as possible before any PHP code is executed on my server. Currently, I use the following NGINX techniques to block unwanted visitors.

  • User Agent
  • IP address
  • Referrer
  • Network subnet
  • GeoIP

Blocking by User Agent

I use this technique primarily to keep out robots that ignore the settings in my robots.txt. I also block user agents that have a funny vibe to them, and, funnily enough, libwww-perl.

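# ~ is a case-sensitive regex match; the dot in biz360.com is escaped so it only matches a literal dot
if ($http_user_agent ~ "Windows 95|Windows 98|biz360\.com|xpymep|TurnitinBot|sindice|Purebot|libwww-perl") {
  return 403; # return ends processing of the request, so no break is needed
}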

I chose to block libwww-perl because in the past few years, I have not seen any GET or POST by libwww-perl that did not try to exploit some sort of software vulnerability. The other agents only seem to scrape my content.
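
As the list of agents grows, a single regex inside an if becomes hard to read. One possible alternative is a map, sketched below. The variable name $blocked_agent is my own invention for this example, and the map block must live in the http {} context rather than in the server {} block.

```nginx
# http {} context: set $blocked_agent to 1 for unwanted user agents.
# $blocked_agent is a hypothetical variable name, chosen for this sketch.
map $http_user_agent $blocked_agent {
    default            0;
    "~Windows 95"      1;  # quoted because the pattern contains a space
    "~Windows 98"      1;
    ~biz360\.com       1;
    ~xpymep            1;
    ~TurnitinBot       1;
    ~sindice           1;
    ~Purebot           1;
    ~libwww-perl       1;  # only ever seen in exploit attempts
}

# server {} context: reject anything the map flagged.
if ($blocked_agent) {
    return 403;
}
```

Each agent gets its own line, which leaves room for a comment explaining why it was blocked.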

Blocking by IP address

I only block IP addresses when I come across odd referrers, or when spammy comments are posted from an IP. Blocking an IP address is quite easy, using a deny statement:

deny 85.17.26.68; # spammy comments - Leaseweb
deny 85.17.230.23; # spammy comments - Leaseweb
deny 173.234.11.105; # junk referrers
deny 173.234.31.9; # junk referrers - Ubiquityservers
deny 173.234.38.25; # spammy comments
deny 173.234.153.30; # junk referrers
deny 173.234.153.106; # spammy comments - Ubiquityservers
deny 173.234.175.68; # spammy comments
deny 190.152.223.27; # junk referrers
deny 195.191.54.90; # odd behaviour, Mozilla, doesn't fetch js/css. Ended up doing a POST, probably a spambot
deny 195.229.241.174; # spammy comments - United Arab Emirates
deny 210.212.194.60; # junk referrers + spammy comments

I like to keep track of my reasons for blocking an IP address by adding a comment after the deny statement. I often choose to remove an IP address from the blocklist if I haven’t encountered it in my logs for at least a month.

Blocking by referrer

It is also possible to block based on the referrer info that is sent in the HTTP headers.

if ($http_referer ~* (viagra|cialis|levitra) ) {
  return 403;
}

With the ~* operator I perform a case-insensitive match on the full referrer string. If viagra, cialis or levitra appears anywhere in the referrer, the request is rejected with a 403 Forbidden, whatever its method.

Blocking by network subnet

If blocking a single IP address is not enough because strange things are happening across the entire network subnet it belongs to, you can choose to block the whole subnet. Looking at the list of blocked IP addresses above, you will notice Ubiquityservers is well represented, but since it is a somewhat larger network (173.234.24.0/21) and the IP addresses don’t seem to have anything else in common, I chose not to block its entire subnet. Blocking a network subnet is also done with a deny statement, only instead of an IP address, you enter a CIDR netblock:

deny 69.28.58.0/24; # acting weirdly, might be robot without identity
deny 79.142.64.0/20; # weird behaviour from different hosts - Altushost INC
deny 80.67.0.0/20; # spammy comments
deny 88.214.193.0/24; # junk referrers, hosting company, not important
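
If the list of netblocks gets long, the geo module offers another way to express the same thing. The following is only a sketch of that alternative, not what I run myself. The variable name $blocked_net is made up for the example, and the geo block has to be placed in the http {} context.

```nginx
# http {} context: flag clients whose address falls in a listed netblock.
# $blocked_net is a hypothetical variable name, chosen for this sketch.
geo $blocked_net {
    default          0;
    69.28.58.0/24    1;  # acting weirdly, might be robot without identity
    79.142.64.0/20   1;  # weird behaviour from different hosts
    80.67.0.0/20     1;  # spammy comments
    88.214.193.0/24  1;  # junk referrers
}

# server {} context: reject flagged clients.
if ($blocked_net) {
    return 403;
}
```

Plain deny statements remain the simpler choice for a handful of entries. The geo variant mostly pays off when the same flag is needed in several places, for example in logging as well as blocking.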

Blocking based on GeoIP data

Finally, you can choose to block whole countries, based on GeoIP data provided by MaxMind. Your NGINX install needs to have GeoIP support enabled, which is done at compile time with the --with-http_geoip_module configure option. First, you need to tell NGINX where the GeoIP database is located on the filesystem. You can do this inside the http {} configuration block:

geoip_country /etc/nginx/GeoIP.dat;

Now you can tell NGINX which countries need to be blocked:

if ($geoip_country_code ~ (BR|CN|KR|RU) ) {
  return 403;
}

In this case, I block Brazil, China, South Korea and Russia. None of these countries belong to my target audience and they are the source of 90% of the unwanted behavior that targets my sites.
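
The country check can also be written as a map on $geoip_country_code, which some people find easier to extend than a regex. This is a sketch under the assumption that geoip_country is already configured as shown above. The variable name $blocked_country is hypothetical.

```nginx
# http {} context: translate the country code into a block flag.
# $blocked_country is a hypothetical variable name, chosen for this sketch.
map $geoip_country_code $blocked_country {
    default 0;
    BR      1;  # Brazil
    CN      1;  # China
    KR      1;  # South Korea
    RU      1;  # Russia
}

# server {} context: reject requests from flagged countries.
if ($blocked_country) {
    return 403;
}
```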

Blocking from a central configuration file

All of these configuration snippets, except for the geoip_country directive (which is only valid in the http {} block), can be contained in one single configuration file, which can be included inside the server {} configuration block of any number of virtualhosts. In my case, this file is /etc/nginx/block.conf. To include it from a virtualhost configuration, use the following line:

include /etc/nginx/block.conf;
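
To show how the pieces fit together, here is a rough sketch of the overall layout. The server_name and the abbreviated contents are placeholders for illustration, not a copy of my actual configuration.

```nginx
# Main configuration (excerpt)
http {
    # Must be set at http level, not inside block.conf
    geoip_country /etc/nginx/GeoIP.dat;

    server {
        server_name example.com;         # placeholder hostname
        include /etc/nginx/block.conf;   # shared blocking rules
        # ... rest of the virtualhost ...
    }
}

# /etc/nginx/block.conf would then hold the server-level rules, such as:
#   the if ($http_user_agent ...) check
#   the deny statements for addresses and netblocks
#   the if ($http_referer ...) check
#   the if ($geoip_country_code ...) check
```

After editing either file, running nginx -t before reloading catches syntax errors and directives placed in the wrong context.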

Finally

You now have a nice arsenal of blocking techniques to keep the script kiddies (and worse) at bay. Use them sparingly, and be careful not to block Google ;)
