I've been trying my hand at designing a stupidly simple system for monitoring the state of a server. This has turned into an exploration of the design of existing systems, and stealing shamelessly from those designs.
For several boring reasons, I've narrowly defined system monitoring as the periodic collection of (at least) three key metrics: load average, disk usage, and memory usage. The intent isn't to make things too easy, but these are three useful, real-world values that give the project some relevance without burdening it with a lot of hyper-specific scenarios.
Inevitably, you may notice this is re-implementing some portion of the popular open-source project Munin. This is true. My first exposure to Munin was in a project attempting to integrate its rudimentary reporting API with an external system. Say what you will about Munin, it is widely successful because it deploys easily, and its plugin architecture ensures that nearly anybody can write something and get it working. The downside is that the project seems to have been designed without a real use case in mind for reporting information outside of Munin. I can only imagine this is why there have been so many "modern" alternatives in recent years.
I won't apologize for this level of not-invented-here, but I also won't claim my solution is an improvement on the functionality that exists in other projects. Doing it myself affords me two things: an interesting project to work on, and a chance to explore a few technologies that I've been working with lately.
I've been trying to get a grip on Python's Twisted, so I'm availing myself of the opportunity here to utilize some of the batteries-included nature of both it and Python. Taking a page from the design of Munin, I am aiming for a daemon to "wake up" every n minutes and collect load average, disk usage, and memory usage. Unlike Munin, which uses RRDTool behind the scenes, I intend to use SQLite for storage. Round robin databases have a few nice properties that I may miss, but for such a small prototype, I think they'd be overkill.
So far as actually collecting the desired measures of system performance, Python's standard library makes things pretty easy. I have the luxury of focusing only on Linux systems, specifically Debian-alike systems, which is reflected in a few specific paths to proc-files and their format.
The easiest of the three, the load average is a one-liner with Python.
def load_average(self):
    try:
        # os.getloadavg() returns the 1-, 5-, and 15-minute load averages.
        one_minute, five_minute, fifteen_minute = os.getloadavg()
        return ('cpu', five_minute)
    except Exception as e:
        logging.warning(f'An error occurred collecting load average: {e}')
This one will obviously be tailored to whichever partitions I'm interested in monitoring; I'm using only the root partition for the sake of example.
def partition_usage(self, partition):
    # statvfs reports block counts; multiply by the fragment size to get bytes.
    statvfs = os.statvfs(partition)
    total_bytes = float(statvfs.f_frsize * statvfs.f_blocks)
    free_bytes = float(statvfs.f_frsize * statvfs.f_bfree)
    # Percentage of the partition currently in use.
    return ((total_bytes - free_bytes) / total_bytes) * 100

def disk_usage(self):
    try:
        return [('disk_' + partition, self.partition_usage(partition))
                for partition in self.monitored_partitions]
    except Exception as e:
        logging.warning(f'An error occurred collecting disk usage: {e}')
def memory_usage(self):
    try:
        with open('/proc/meminfo', 'r') as meminfo:
            for line in meminfo:
                # Values in /proc/meminfo are reported in kB.
                if line.startswith('MemTotal:'):
                    total = float(line.split()[1])
                if line.startswith('MemAvailable:'):
                    available = float(line.split()[1])
        # Fraction of memory still available.
        free_memory = available / total
        return ('memory', free_memory)
    except Exception as e:
        logging.warning(f'An error occurred collecting memory usage: {e}')
The most system-specific of the lot, this one is reading and parsing meminfo, which looks like this:
$ cat /proc/meminfo
MemTotal: 504228 kB
MemFree: 383232 kB
MemAvailable: 453024 kB
Buffers: 11304 kB
Cached: 62556 kB
SwapCached: 0 kB
Active: 50100 kB
...
This is pretty much exactly how free works, which is easily checked with strace (if you look past some of the noise):
$ strace free
...
open("/proc/meminfo", O_RDONLY) = 3
lseek(3, 0, SEEK_SET) = 0
read(3, "MemTotal: 504228 kB\nMemF"..., 8191) = 1251
open("/usr/share/locale/locale.alias", O_RDONLY|O_CLOEXEC) = 4
fstat(4, {st_mode=S_IFREG|0644, st_size=2995, ...}) = 0
read(4, "# Locale name alias data base.\n#"..., 4096) = 2995
read(4, "", 4096) = 0
close(4) = 0
...
I haven't yet given too much thought to the best database schema. There seem to be two potential solutions:
Definitely the simpler of the two schemas, I think this one may suffer a bit in storage from the necessary duplication of keys in each entry. This, to me, feels more like the statsd approach of fire-and-forget, allowing for more flexibility at the cost of some overhead.
CREATE TABLE metrics (
    key TEXT,
    value FLOAT,
    timestamp DATETIME DEFAULT CURRENT_TIMESTAMP NOT NULL
);
This one is more in line with standard relational practices, but (I think) would require a look-up on each insert to get the relevant key for each metric; a sketch of that follows the schema below. I think this may eventually be the way to go.
CREATE TABLE systems (
    _id INTEGER PRIMARY KEY AUTOINCREMENT,
    system TEXT
);

CREATE TABLE metrics (
    system_id INTEGER,
    value FLOAT,
    timestamp DATETIME DEFAULT CURRENT_TIMESTAMP NOT NULL,
    FOREIGN KEY(system_id) REFERENCES systems(_id)
);
The final bit that's missing is the until-now elided SystemMonitor; the MetricsDatabase encapsulates the database writes:
from twisted.enterprise import adbapi
from twisted.internet.defer import inlineCallbacks


class MetricsDatabase:
    # adbapi runs the blocking sqlite3 calls in a thread pool for us.
    db_pool = adbapi.ConnectionPool('sqlite3', 'metrics.sqlite',
                                    check_same_thread=False)
    table = 'metrics'

    @inlineCallbacks
    def save_metric(self, key, value):
        yield self.db_pool.runOperation("INSERT INTO metrics (key, value) "
                                        "VALUES (?, ?)", (key, value))


class SystemMonitor:
    def __init__(self):
        self.db = MetricsDatabase()
        self.monitored_partitions = ['/']
Twisted's adbapi is a pleasantly unsurprising wrapper that feels a lot like the standard library's. I'm sure I'll want to refactor the __init__ method in the SystemMonitor in the future, for easier dependency injection/testing, but for now this is a working implementation. The remaining thing is to configure a system daemon using Twisted.
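One method I've still glossed over is write_system_metrics, which the TimerService below will invoke. Roughly, and this is only a sketch of my intent rather than the elided implementation, it just funnels each collector's reading into save_metric:

# A sketch of SystemMonitor.write_system_metrics (assumed, since the
# real method is elided above): gather every reading and hand each
# (key, value) pair to the database.
def write_system_metrics(self):
    readings = [self.load_average(), self.memory_usage()]
    readings.extend(self.disk_usage() or [])
    for reading in readings:
        if reading is None:
            # A collector hit an exception and already logged it.
            continue
        key, value = reading
        # save_metric returns a Deferred; for now it's fire-and-forget.
        self.db.save_metric(key, value)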
Probably the most surprising part of the process for me was finding how easy Twisted makes it to implement daemons. While Munin uses cron to manage the periodic collection of system values, I've opted for a more self-contained system using Twisted's TimerService. Immediately, the home-grown version offers greater flexibility in handling special cases than Munin does; in particular, the ability to modify the data collection rate. One unpleasant discovery with Munin is how deeply ingrained the 5-minute cron interval is in everything working as intended.
from twisted.application import service
from twisted.application.internet import TimerService
from system_monitor import SystemMonitor
INTERVAL_MINUTES = 5
application = service.Application("SystemMonitor")
services = service.IServiceCollection(application)
monitor = SystemMonitor()
timer = TimerService((INTERVAL_MINUTES * 60), monitor.write_system_metrics)
timer.setServiceParent(services)
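For a quick check before involving systemd, this file (saved as MonitoringService.py, matching the unit file below) can be run in the foreground with twistd:

$ twistd -ny MonitoringService.py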
Finally, to set up automatic restarts at the operating system level, I've written a basic systemd service file:
[Unit]
Description=System Monitor
[Service]
ExecStart=/path/to/venv/twistd --nodaemon --python=/path/to/MonitoringService.py
Restart=always
[Install]
WantedBy=multi-user.target
The --nodaemon flag is an idiosyncrasy of systemd's handling of processes: instead of manually backgrounding (or in this case not backgrounding) the process, systemd manages foreground processes behind the scenes and can track logging more easily with journalctl (useful for debugging the service file).
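For reference, with the unit saved as something like system-monitor.service (the name is my own placeholder), enabling the service and following its logs looks like this:

$ systemctl enable --now system-monitor.service
$ journalctl -u system-monitor.service -f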
For my own curiosity, I've attempted to quantify my estimated storage requirements as follows:
60        | minutes/hour          |
5         | minutes per sample    |
24        | hours/day             |
365       | days/year             |
105120    | samples per year      |
45        | bytes per row         |
4730400   | bytes per year        |
1024      | bytes per kilobyte    |
4619.5313 | kilobytes per year    |
1024      | kilobytes per megabyte|
4.5112610 | megabytes per year    |
I've based the field sizes off of this document, which claims SQLite stores floats as "reals", which are 8 bytes, and this post, which explains how SQLite manages text fields. It seems safe to assume all of the keys will fall within 1-byte lengths. In the methods above, the largest single key is 6 characters, so calculating with 10 bytes seems conservative enough. Dates are stored as ISO 8601 strings (23 characters), which puts the fields at roughly 10 + 8 + 23 = 41 bytes per row; 45 bytes leaves a little margin for per-row overhead.
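As a quick sanity check on the table above (the figures work out per metric key, since each collection interval writes one row per key):

# Reproduce the storage estimate above: one row every five minutes.
samples_per_year = (60 // 5) * 24 * 365   # 105,120
bytes_per_row = 45                        # ~10 key + 8 real + 23 timestamp + a little overhead
bytes_per_year = samples_per_year * bytes_per_row
print(bytes_per_year / 1024 / 1024)       # ~4.51 megabytes per year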
As is usually the case, now that I've gone through the trouble of writing it all up I am left with a real desire to change things pretty drastically. I think the entire design is a little too strongly influenced by Munin, which makes sense, since that is where I started. But ultimately, I think the design would be better if it took an approach closer to statsd and decoupled the sender from the receiver a bit further. Right now, the whole system relies on a kind of single worker that is managed on a timer; I think it may be more interesting if the timer approach could be supplemented with a system that may (or may not!) send values in without being invoked by the daemon directly.

That will have to be a project for a later time. I think the potential here will be in the consumption of this data, now that it exists. If we crib a bit further from Munin, this will mean some barebones visualizations, but I wonder if I can't do a bit better. Right now the only interface to the data is through sqlite:
sqlite> select * from metrics;
key value timestamp
---------- ----------------- -------------------
memory 0.862006869908057 2017-07-31 04:09:27
cpu 0.0 2017-07-31 04:09:27
disk_/ 18.6195640114525 2017-07-31 04:09:27
memory 0.860142633887844 2017-07-31 04:09:32
cpu 0.0 2017-07-31 04:09:32
disk_/ 18.6195640114525 2017-07-31 04:09:32
memory 0.860206097241724 2017-07-31 04:09:37
cpu 0.0 2017-07-31 04:09:37
disk_/ 18.6195640114525 2017-07-31 04:09:37