Lately, my sites are being hammered by “unwanted behavior”. Because I cannot blindly trust my WordPress installs to be 100% bug-free, I like to block as much of this unwanted behavior as possible before any PHP code is executed on my server. Currently, I use the following NGINX techniques to block unwanted visitors.

  • User Agent
  • IP address
  • Referrer
  • Network subnet
  • GeoIP

Blocking by User Agent

I use this technique primarily to keep out robots that ignore the settings in my robots.txt. I also block user agents that have a funny vibe to them and, funnily enough, libwww-perl.

if ($http_user_agent ~ "Windows 95|Windows 98|xpymep|TurnitinBot|sindice|Purebot|libwww-perl") {
  return 403;
}

I chose to block libwww-perl because in the past few years, I have not seen any GET or POST by libwww-perl that did not try to exploit some sort of software vulnerability. The other agents only seem to scrape my content.
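
The same check can also be expressed with a map, which keeps one pattern per line instead of one long regular expression. This is only a sketch of an alternative, not the setup described above: the variable name $blocked_agent is an assumption chosen for illustration. The map has to live in the http {} block, while the if goes in a server {} block.

```nginx
# http {} context: flag unwanted user agents.
# $blocked_agent is a hypothetical variable name chosen for this sketch.
map $http_user_agent $blocked_agent {
    default          0;
    "~Windows 9[58]" 1;  # quoted because the pattern contains a space
    ~xpymep          1;
    ~TurnitinBot     1;
    ~sindice         1;
    ~Purebot         1;
    ~libwww-perl     1;
}

# server {} context: reject flagged requests before PHP is reached.
if ($blocked_agent) {
    return 403;
}
```

A leading ~ makes a map entry a case-sensitive regular expression, matching the behaviour of the ~ operator in the if version.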

Blocking by IP address

I only block IP addresses when I come across odd referrers, or when spammy comments are posted from an IP. Blocking an IP address is quite easy, using a deny statement:

deny; # spammy comments - Leaseweb
deny; # spammy comments - Leaseweb
deny; # junk referrers
deny; # junk referrers - Ubiquityservers
deny; # spammy comments
deny; # junk referrers
deny; # spammy comments - Ubiquityservers
deny; # spammy comments
deny; # junk referrers
deny; # odd behaviour, Mozilla, doesn't fetch js/css. Ended up doing a POST, probably a spambot
deny; # spammy comments - United Arab Emirates
deny; # junk referrers + spammy comments

I like to keep track of my reasons for blocking an IP address by adding a comment after the deny statement. I often remove an IP address from the blocklist if I haven’t encountered it in my logs for at least a month.
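
For reference, a deny statement takes the address as its argument, followed by a semicolon. The addresses in this sketch are placeholders from the reserved documentation ranges (RFC 5737), not entries from the actual blocklist:

```nginx
# Placeholder addresses from the RFC 5737 documentation ranges.
deny 192.0.2.15;    # spammy comments
deny 198.51.100.7;  # junk referrers
```

A client matching a deny statement gets a 403 response, the same status the if blocks in this post return.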

Blocking by referrer

It is also possible to block based on the referrer info that is sent in the HTTP headers.

if ($http_referer ~* (viagra|cialis|levitra) ) {
  return 403;
}

The ~* operator performs a case-insensitive match against the full referrer string. If viagra, cialis or levitra appears anywhere in the referrer, the request is answered with 403 Forbidden.

Blocking by network subnet

If blocking a single IP address is not enough because strange things are happening across the entire network subnet it belongs to, you can choose to block the whole subnet. Looking at the list of blocked IP addresses above, you will notice Ubiquityservers is well represented, but since it is a somewhat larger network and the IP addresses don’t seem to have anything else in common, I chose not to block its entire subnet. Blocking a network subnet is also done with a deny statement; instead of an IP address, you enter a CIDR netblock:

deny; # acting weirdly, might be robot without identity
deny; # weird behaviour from different hosts - Altushost INC
deny; # spammy comments
deny; # junk referrers, hosting company, not important
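
As a sketch of the syntax, a CIDR netblock is written as a base address followed by a slash and the prefix length. The netblocks below are reserved documentation ranges used purely as placeholders, not networks from the actual blocklist:

```nginx
# Placeholder netblocks from reserved documentation ranges.
deny 203.0.113.0/24;  # IPv4: covers 203.0.113.0 through 203.0.113.255
deny 2001:db8::/32;   # IPv6 netblocks use the same syntax
```

The shorter the prefix, the more addresses are blocked, so it pays to check how large a network really is before denying it.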

Blocking based on GeoIP data

Finally, you can choose to block whole countries, based on GeoIP data provided by MaxMind. Your NGINX install needs to have GeoIP support enabled, which is done at compile time with the --with-http_geoip_module configure option. First, you need to tell NGINX where the GeoIP database is located on the filesystem. You can do this inside the http {} configuration block:

geoip_country /etc/nginx/GeoIP.dat;

Now you can tell NGINX which countries need to be blocked:

if ($geoip_country_code ~ (BR|CN|KR|RU) ) {
  return 403;
}

In this case, I block Brazil, China, South Korea and Russia. None of these countries belong to my target audience, and they are the source of 90% of the unwanted behavior that targets my sites.
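
When the country list grows, a map can be easier to maintain than a regular expression. This is a sketch of that alternative, assuming the hypothetical variable name $blocked_country; it is not the configuration used above. The map belongs in the http {} block, next to geoip_country:

```nginx
# http {} context: flag requests from blocked countries.
# $blocked_country is a hypothetical variable name chosen for this sketch.
map $geoip_country_code $blocked_country {
    default 0;
    BR      1;
    CN      1;
    KR      1;
    RU      1;
}

# server {} context: reject flagged requests.
if ($blocked_country) {
    return 403;
}
```

Each entry is an exact two-letter country code, so adding or removing a country is a one-line change.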

Blocking from a central configuration file

All of these configuration snippets can be kept in a single configuration file, which can be included inside the server {} configuration block of any number of virtual hosts. In my case, this file is /etc/nginx/block.conf. To include it from a virtual host configuration, use the following line:

include /etc/nginx/block.conf;
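
To show where that line sits, here is a minimal sketch of a virtual host. The listen and server_name values are placeholders, not taken from a real setup:

```nginx
server {
    listen 80;
    server_name example.com;  # hypothetical virtual host

    # Shared blocking rules: the deny statements and if blocks.
    include /etc/nginx/block.conf;

    # ... rest of the virtual host configuration ...
}
```

Note that geoip_country is only valid in the http {} context, so that directive stays in the main configuration rather than in block.conf.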


You now have a nice arsenal of blocking techniques to keep the script kiddies (and worse) at bay. Use them sparingly, and be careful not to block Google ;)