journald for centralized logging

2024-01-25

I have had an item on my to-do list for ages to investigate options for centralized logging. I'm especially interested in simple solutions rather than enterprise-grade approaches that require massive resources and dedicated maintenance. To that end I am trying out systemd-journal-remote to see how it integrates with an uncomplicated Linux server configuration.

In truth this has been motivated by how egregiously bad the experience of running and using bigger systems like Loki or DataDog has been. While I still strive for a minimum of computers required to run any given project I also see the value in centralized logging and backups. In that vein I think it makes sense to minimize the number of moving parts and winnow down the number of things that can go bump in the night.

systemd-journal-remote is composed of two parts that integrate with the systemd journal to 1) forward and 2) receive journal entries across machines. journald is, of course, built into most Linux distributions due to systemd so there isn't much required to get things running. On Fedora 39 the package systemd-journal-remote downloads 396 k including 3 dependent packages. I've written before about how much I like using few dependencies and this is a comfortable number for me.

Dependencies

From a newly provisioned Fedora image, the packages required when installing systemd-journal-remote are:

Because there are so few, I actually took the time to download the code for the one that probably has the most bearing on systemd-journal-remote (libmicrohttpd). It is a project that has been around for more than a decade, and the code seems understandable and has some amount of testing. Looking at it, I am not really worried about mystery-meat third-party dependencies. Compare this with Loki, which has over a hundred direct dependencies and several hundred indirect ones.

To be fair this approach obviously has a dependency on systemd, itself a large project. Consider though that any machine I am going to use is already using systemd so it isn't as though I'm adding new dependencies, I'll just be using the ones lying around.

Test Setup

I've provisioned two machines with a private network on 10.0.0.0/16 between them. Personally exciting for me was realizing that after just 6 years I finally have IPv6 available at home, so the machines did not require that I pay the $0.50 per month for an IPv4 address. The private network still happens to be IPv4 though.

The first server will be called alpha and the second bravo. alpha is responsible for generating logs and forwarding them to the log sink at bravo.

Server Configurations

bravo, log sink

# dnf install systemd-journal-remote

# systemctl edit systemd-journal-remote.service
...

# cat /etc/systemd/system/systemd-journal-remote.service.d/override.conf
[Service]
ExecStart= 
ExecStart=/usr/lib/systemd/systemd-journal-remote --listen-http=-3 --output=/var/log/journal/remote/

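Clear the existing ExecStart and change --listen-https to --listen-http. The -3 is a negated file descriptor number: file descriptor 3 is the first socket systemd hands to the service through socket activation (the number of passed sockets is announced in $LISTEN_FDS).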
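To make the -3 a little less mysterious, here is a minimal sketch of how a socket-activated process can work out which descriptors it was handed. It follows the convention described in sd_listen_fds(3) and is not the code systemd-journal-remote itself uses; the helper name inherited_fds is made up for this example.

```python
import os

# First descriptor handed over by systemd socket activation
# (SD_LISTEN_FDS_START in sd_listen_fds(3)).
SD_LISTEN_FDS_START = 3


def inherited_fds(environ=None, pid=None):
    """Return the file descriptor numbers passed via socket activation.

    Hypothetical helper for illustration: systemd sets LISTEN_PID to the
    PID of the activated process and LISTEN_FDS to the number of sockets,
    which are numbered consecutively starting at 3.
    """
    environ = os.environ if environ is None else environ
    pid = os.getpid() if pid is None else pid

    # The variables only apply to us if LISTEN_PID matches our own PID.
    if environ.get("LISTEN_PID") != str(pid):
        return []

    count = int(environ.get("LISTEN_FDS", "0"))
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))
```

With a single ListenStream in the socket unit this would return just descriptor 3, which is why --listen-http=-3 ends up pointing at the right socket.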

# mkdir /var/log/journal/remote

# systemctl edit systemd-journal-remote.socket
...

# cat /etc/systemd/system/systemd-journal-remote.socket.d/override.conf
[Socket]
ListenStream=
ListenStream=10.0.0.2:19532

Clear the ListenStream to configure just the private network IPv4 address. The default otherwise also listens on [::] for IPv6.

# systemctl enable --now systemd-journal-remote.socket

alpha, log source

# dnf install systemd-journal-remote

# cat /etc/systemd/journal-upload.conf.d/override.conf
[Upload]
URL=http://10.0.0.2:19532

# systemctl enable --now systemd-journal-upload.service

Believe it or not that actually concludes the configuration. At this point anything that lands in the journal of alpha is automatically forwarded to bravo.
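If nothing shows up on bravo, the first thing I would rule out is the network path. A minimal sketch, assuming the address and port from the configuration above; the function name sink_reachable is made up for this example and it only checks that the TCP port accepts connections.

```python
import socket


def sink_reachable(host="10.0.0.2", port=19532, timeout=3.0):
    """Return True if a TCP connection to the log sink can be opened.

    This only shows that something is listening on the port; it does not
    prove that journal entries are being accepted.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False
```

Running sink_reachable() from alpha should return True once the socket unit on bravo is enabled.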

Using It

Here are just a few things I've verified work without issue. It may not be an amazing demo but what I like is that it leverages tools I already know and doesn't require I dig through documentation trying to untangle yet another query language. There's no web UI but that also means... there's no hideously slow web UI that mysteriously times out every other query I try (not that I'm bitter).

From the log sink, view all logs from both machines: journalctl --merge. This merges journals so that you don't have to specify the remote journal directory.

From the log sink, view logs from just the remote machine: journalctl -m _HOSTNAME=alpha

Tail the logs from the remote machine: journalctl -m --follow _HOSTNAME=alpha

View the logs from the remote machine for a specific service (here the SSH service): journalctl -m --unit sshd _HOSTNAME=alpha

View today's logs from the remote machine for a specific service: journalctl -m -u sshd --since today _HOSTNAME=alpha
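These queries differ only by a few flags, so they are easy to wrap should I ever want to script them. A minimal sketch, assuming a hypothetical helper named journalctl_args; it only assembles the flags already shown above plus the _HOSTNAME field match.

```python
import subprocess


def journalctl_args(hostname=None, unit=None, since=None, follow=False):
    """Build a journalctl command line for querying merged journals.

    Hypothetical helper for illustration; it only assembles the flags
    used in the examples above.
    """
    args = ["journalctl", "--merge"]
    if follow:
        args.append("--follow")
    if unit:
        args += ["--unit", unit]
    if since:
        args += ["--since", since]
    if hostname:
        # Field matches such as _HOSTNAME=alpha go after the options.
        args.append(f"_HOSTNAME={hostname}")
    return args


def run_query(**kwargs):
    """Run a non-following query and return journalctl's output as text.

    Do not pass follow=True here: the command would never exit.
    """
    result = subprocess.run(
        journalctl_args(**kwargs), capture_output=True, text=True, check=True
    )
    return result.stdout
```

For example, journalctl_args(hostname="alpha", unit="sshd", since="today") builds the same command as the last query above, just with the long option names.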

More Cool Stuff

journald supports namespacing, where you can have a service log to a dedicated journal. This is achieved by starting an instance of the templated journald service, so something like:

systemctl start systemd-journald@myapp.service

Then the unit file for the service (myapp, above) gets a setting like:

LogNamespace=myapp

Here's a Python program for demonstration. It just produces logging like any usual application:

import datetime
import logging
import sys
import time

if __name__ == '__main__':
    logger = logging.getLogger()
    handler = logging.StreamHandler(sys.stdout)
    logger.setLevel(logging.INFO)
    formatter = logging.Formatter('%(asctime)s %(name)s %(levelname)s %(message)s')
    handler.setFormatter(formatter)
    logger.addHandler(handler)

    while True:
        logger.info(f"the time is: {datetime.datetime.now()}")
        time.sleep(5)

If this is /opt/myapp.py I might write this service file for it:

[Unit]
Description=my neat application

[Service]
LogNamespace=myapp
ExecStart=/usr/bin/python /opt/myapp.py

Once it is started the logs are dumped into their own dedicated journal, visible like this:

journalctl --namespace=myapp

That is fine but mostly works like another system of filtering logs locally. The neat part comes in when you use a namespace to restrict which logs get transferred. A little confusingly, this option doesn't seem to be exposed within the configuration file but is instead a command-line argument to systemd-journal-upload. This feature was only just added in version 254 (the release I'm on) so it is possible it'll get added to the configuration file at some point.

Until then, adding the namespace qualifier to the journal upload requires this override file:

[Service]
ExecStart=
ExecStart=/usr/lib/systemd/systemd-journal-upload --save-state --namespace=myapp

With that done, the only thing now getting shuffled over to the log sink is this (system services are no longer included):

Jan 25 03:30:30 alpha python[2247]: 2024-01-25 03:30:30,540 root INFO the time is: 2024-01-25 03:30:30.540640
Jan 25 03:30:35 alpha python[2247]: 2024-01-25 03:30:35,541 root INFO the time is: 2024-01-25 03:30:35.541050
Jan 25 03:30:40 alpha python[2247]: 2024-01-25 03:30:40,541 root INFO the time is: 2024-01-25 03:30:40.541431
Jan 25 03:30:45 alpha python[2247]: 2024-01-25 03:30:45,541 root INFO the time is: 2024-01-25 03:30:45.541839
Jan 25 03:30:50 alpha python[2247]: 2024-01-25 03:30:50,542 root INFO the time is: 2024-01-25 03:30:50.542208
Jan 25 03:30:55 alpha python[2247]: 2024-01-25 03:30:55,542 root INFO the time is: 2024-01-25 03:30:55.542544
Jan 25 03:31:00 alpha python[2247]: 2024-01-25 03:31:00,542 root INFO the time is: 2024-01-25 03:31:00.542891
Jan 25 03:31:05 alpha python[2247]: 2024-01-25 03:31:05,543 root INFO the time is: 2024-01-25 03:31:05.543229

One use case that comes to mind for this might be to limit which logs traverse the network to just critical services. I think it probably ends up being less sophisticated than a full-fledged syslog, but I also don't think I need anything too complicated.

I tried out a few ways of logging more sophisticated entries to the journal, so that even further filtering might be possible. There is the systemd-python library, which has a JournalHandler, or I might use syslog prefixes to embed priority information instead of just the textual INFO, WARNING, ERROR. That seems a little overwrought for anything I'll be doing but I'll keep it in mind for later.
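For reference, the syslog prefix approach can be sketched with nothing but the standard library. This assumes the service's output goes to the journal with the default SyslogLevelPrefix behaviour, where a leading "<N>" on a line is read as its priority (see sd-daemon(3)). The names PriorityPrefixFormatter and make_logger are made up for this example, and the level mapping is my own choice.

```python
import logging
import sys

# Map Python logging levels to syslog priorities (see sd-daemon(3)).
SYSLOG_PRIORITY = {
    logging.CRITICAL: 2,
    logging.ERROR: 3,
    logging.WARNING: 4,
    logging.INFO: 6,
    logging.DEBUG: 7,
}


class PriorityPrefixFormatter(logging.Formatter):
    """Prefix each record with "<N>" so journald can pick up its priority."""

    def format(self, record):
        # Unknown custom levels fall back to the "info" priority.
        priority = SYSLOG_PRIORITY.get(record.levelno, 6)
        return f"<{priority}>{super().format(record)}"


def make_logger(name="myapp", stream=sys.stdout):
    """Return a logger that writes priority-prefixed lines to the stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(PriorityPrefixFormatter("%(name)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

If that works as I expect, something like journalctl -m --priority=err _HOSTNAME=alpha on the sink would then narrow things down to errors only.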