Increased Traffic, Yeah Right

2016-02-15

glue-boy saw an uptick in traffic recently, which can mean only one thing: spam-bots are at it again. Recapping how to stay sane on an internet littered with bad actors.

Identifying the Problem

I am in the middle of migrating nprescott.com and its subdomains, and consequently in the middle of shaking out automated, or at least repeatable, ways to manage things on a new host. While looking at moving existing file uploads from one host to another, I noticed an unusual amount of activity coming from the internet pastebin that I host. Pastes expire after they go unaccessed for more than a week, so things tend to stay pretty quiet.

$ du -hs ./glues
48M     ./glues

$ ls -lk | awk '{ SUM += $5; N++ } END { print SUM/N }'
1.91168

So I'm looking at 48 megabytes of pasted content with an average upload size of under 2 kilobytes, which works out to thousands of uploads. This is a red flag because glue-boy is not a high-traffic site; thousands of uploads are a good sign of a bot ring.
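As a cross-check on those numbers, here is a minimal sketch that counts the pastes directly and averages their sizes in bytes rather than relying on how ls reports sizes. The function name paste_stats is made up for illustration, and it assumes ordinary file names without embedded newlines.

```shell
# paste_stats DIR: print the number of regular files under DIR
# and their average size in bytes.
paste_stats() {
    # wc -c prints "<bytes> <path>" per file, plus a "total" line
    # whenever it is handed more than one file; skip those totals.
    find "$1" -type f -exec wc -c {} + |
        awk '$2 != "total" { sum += $1; n++ }
             END { if (n) printf "%d files, %.1f bytes average\n", n, sum / n }'
}
```

Calling it as paste_stats ./glues gives the file count and the average in one line, which makes the "thousands of tiny uploads" pattern explicit.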

So I dig through Nginx's access logs, filtering down to just February 15th (today) and only the POST requests. I want to see the number of requests per IP address, and immediately a pattern becomes obvious.

$ awk '/15\/Feb.*POST/ { print $1 }' /var/log/nginx/access.log | sort | uniq -c | sort -rn | head -n20
  510 188.143.232.72
  150 188.143.232.40
  150 188.143.232.34
  150 188.143.232.22
  130 188.143.232.35
  130 188.143.232.26
  130 188.143.232.24
  130 188.143.232.21
  130 188.143.232.16
  130 188.143.232.15
  130 188.143.232.11
  130 188.143.232.10
  122 188.143.232.14
  120 188.143.232.37
  120 188.143.232.19
  119 188.143.232.13
  110 188.143.232.70
  110 188.143.232.62
  110 188.143.232.43
  110 188.143.232.41

It looks like 188.143.232.* is a botnet, especially when you realize the requests are coming in by the dozen per minute. A web search on any one of those addresses turns up results for IP blacklists, Project Honey Pot, and forum spam notifications.
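The per-minute rate can be checked with the same kind of awk filter. This is a rough sketch, assuming Nginx's default "combined" log format, where field 4 holds the timestamp as [15/Feb/2016:10:23:45 and field 6 holds the quoted request method. The function name posts_per_minute is hypothetical.

```shell
# posts_per_minute LOGFILE DAY: count POST requests per minute for one day.
# DAY is written the way the log writes it, e.g. 15/Feb/2016.
posts_per_minute() {
    awk -v day="$2" '
        # Field 4 starts with "[", so a matching day begins at position 2.
        $6 == "\"POST" && index($4, day) == 2 {
            # Characters 2-18 of field 4 are dd/Mon/yyyy:HH:MM.
            count[substr($4, 2, 17)]++
        }
        END { for (minute in count) print count[minute], minute }
    ' "$1" | sort -rn
}
```

Run as posts_per_minute /var/log/nginx/access.log 15/Feb/2016, the busiest minutes sort to the top.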

Filtering Bad Actors

It is a simple matter to drop the entire subnet from here. I still use iptables because I haven't migrated to something sleeker like ufw yet.

$ iptables -A INPUT -s 188.143.232.0/24 --jump DROP

Here I am appending a rule to the INPUT chain (basically any packet addressed to the local machine) matching the source 188.143.232.0/24 (where /24 is the prefix length equivalent to the subnet mask 255.255.255.0, the size of a traditional class C network) that instructs the kernel to drop any matching packets. Subnet masks are a useful, if slightly opaque, means to manage a kind of address space globbing with origins in the binary representation of IP addresses¹.
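To make the "globbing" a little less opaque, here is a small sketch of the comparison a /24 implies: turn both addresses into 32-bit integers, keep only the top 24 bits of each, and check that they agree. The helper names ip_to_int and in_subnet are invented for illustration, this is not how iptables itself is implemented, and it assumes plain dotted-quad IPv4 input without leading zeros.

```shell
# ip_to_int A.B.C.D: print the address as a single 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the dotted quad into $1 $2 $3 $4
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# in_subnet ADDRESS NETWORK PREFIX: succeed if ADDRESS is inside NETWORK/PREFIX.
in_subnet() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "$2")
    # A /24 mask is 24 one-bits followed by 8 zero-bits (255.255.255.0).
    mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}
```

For example, in_subnet 188.143.232.72 188.143.232.0 24 succeeds, while the same test against 188.143.233.1 fails because the third octet falls inside the masked bits.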

Immediately I can tail -f the access logs and see the fix working. Potential downsides include blocking legitimate traffic from that particular subnet; sorry to any legitimate users in St. Petersburg.

Follow Up

This was an easy problem, but solving it was highly dependent on my noticing and investigating. I haven't identified the best way to mitigate this kind of concentrated usage; I'm still weighing the benefits of something like Nginx's limit_req module as compared to simply writing my own report to flag activity. I'm not yet confident I should block every instance of such activity bursts.
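For the "write my own report" option, something along these lines might be a starting point. It is only a sketch: the function name flag_subnets and the idea of a fixed daily threshold are assumptions, it presumes the default "combined" log format, and it handles IPv4 client addresses only. It reports rather than blocks, which leaves the decision to a human.

```shell
# flag_subnets LOGFILE DAY THRESHOLD: list /24 subnets with more than
# THRESHOLD POST requests on DAY (written as in the log, e.g. 15/Feb/2016).
flag_subnets() {
    awk -v day="$2" -v limit="$3" '
        $6 == "\"POST" && index($4, day) == 2 {
            # Collapse each client address to its /24 network.
            split($1, octet, ".")
            hits[octet[1] "." octet[2] "." octet[3] ".0/24"]++
        }
        END {
            for (net in hits)
                if (hits[net] > limit) print hits[net], net
        }
    ' "$1" | sort -rn
}
```

Something like flag_subnets /var/log/nginx/access.log 15/Feb/2016 500 could run from cron and mail its output, so a burst gets noticed without anyone having to stumble over it.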

¹ IP Addressing and Subnetting for New Users