
State of the Server

2024-02-18

It has been a few years and I've finally got around to revamping a few things on the server. Mostly these are notes in case I forget when I change something else in a couple years.

operating system

My record keeping on this front is a little spotty. Way back in 2017 I moved from DigitalOcean to Vultr because it was cheap ($2.50 a month); at the time I was on Debian 9. In 2019 I remarked on an uneventful migration to Debian 10. It seems like I didn't even make a note when the upgrade to Debian 11 happened; it was so uneventful that I don't even remember it. Approaching more recent history, I realized it had been something like 8 months since Debian 12 was released and figured it was time to do my routine maintenance. Delightfully, nothing of real interest happened. There were some package name changes that conflicted with those in the PPA I had configured for Nginx, but that didn't even slow me down.

In an era of "cattle not pets" I find something tremendously funny about keeping this absolutely bargain basement VPS running for years.

screenshot of the hosting provider dashboard indicating the VPS was created over 6 years ago

In an effort to ensure I had some documentation on how to configure things, I wrote out an Ansible playbook and checked it against a new machine. The whole thing is pretty dull and mostly consists of installing a few packages and then copying the handful of relevant configuration files. When I was sure things were working the way I expected, I deleted the new machine and patched up my tiny workhorse here.

the ansible playbook, not really interesting
- name: Prepare web server
  hosts: websandbox
  tasks:

   - name: update and upgrade apt (this is mostly for fresh installs)
     ansible.builtin.apt:
       update_cache: yes
       upgrade: yes

   - name: Install sandbox dependencies
     ansible.builtin.apt:
       name: firewalld,unattended-upgrades,haproxy,lighttpd,lighttpd-mod-deflate,certbot,sqlite3,libsqlite3-tcl,tcllib
       state: latest

   - name: permit traffic in default zone for ssh service
     ansible.posix.firewalld:
       service: ssh
       permanent: true
       state: enabled

   - name: permit traffic in default zone for http service
     ansible.posix.firewalld:
       service: http
       permanent: true
       state: enabled

   - name: permit traffic in default zone for https service
     ansible.posix.firewalld:
       service: https
       permanent: true
       state: enabled

   - name: copy lighttpd configuration
     ansible.builtin.copy:
       src: lighttpd.conf
       dest: /etc/lighttpd/lighttpd.conf

   - name: copy lighttpd socket
     ansible.builtin.copy:
       src: lighttpd.socket
       dest: /etc/systemd/system/

   - name: copy lighttpd service override
     ansible.builtin.copy:
       src: lighttpd/override.conf
       dest: /etc/systemd/system/lighttpd.service.d/

   - name: copy haproxy service override
     ansible.builtin.copy:
       src: haproxy/override.conf
       dest: /etc/systemd/system/haproxy.service.d/

   - name: copy haproxy configuration
     ansible.builtin.copy:
       src: haproxy.cfg
       dest: /etc/haproxy/haproxy.cfg

   - name: stop lighttpd to get off port 80
     ansible.builtin.systemd_service:
       name: lighttpd.service
       state: stopped

   - name: restart lighttpd socket
     ansible.builtin.systemd_service:
       enabled: true
       name: lighttpd.socket
       state: restarted

   - name: restart haproxy service
     ansible.builtin.systemd_service:
       enabled: true
       state: restarted
       daemon_reload: true
       name: haproxy

   - name: firewalld reload
     ansible.builtin.command: firewall-cmd --reload

web servers

I finally got around to dropping Nginx. We'll see if it sticks but more than a year ago I was thinking about it:

The last real impediment is Nginx, which does not support CGI... For a while now I've been chafing at a few minor things with Nginx, especially after learning more about HAProxy. I've had the idea that I might switch from Nginx as a reverse proxy and web server to HAProxy for proxying and a dedicated web server for serving HTML, it hasn't become a priority so I haven't done it though.

After backing up the old configurations I was nearly giddy to delete all the janky configuration files that have been piling up for years. I am sure there were oddities that had accumulated but I also know I could never quite tell what they were. For now I've decided to give lighttpd a chance, partly because it supports CGI and partly because it is supposed to be light on resources. The configuration for it ended up being about 50 lines long, which isn't too bad:

server.modules = (
        "mod_indexfile",
        "mod_access",
        "mod_alias",
        "mod_redirect",
        "mod_deflate",
        "mod_cgi",
)

server.tag                       = ""
server.document-root             = "/var/www/html"
server.errorlog                  = "/var/log/lighttpd/error.log"
server.systemd-socket-activation = "enable"

# this is a little ugly, I wonder if there is a better way? I thought
# BindPaths at the service level but hit issues with "socket still in
# use" - very mysterious
server.bind                      = "/var/lib/haproxy/run/lighttpd.sock"

server.feature-flags       += ("server.graceful-shutdown-timeout" => 5)
server.feature-flags       += ("server.graceful-restart-bg" => "enable")

server.http-parseopts = (
  "header-strict"           => "enable", # default
  "host-strict"             => "enable", # default
  "host-normalize"          => "enable", # default
  "url-normalize-unreserved"=> "enable", # recommended highly
  "url-normalize-required"  => "enable", # recommended
  "url-ctrls-reject"        => "enable", # recommended
  "url-path-2f-decode"      => "enable", # recommended highly (unless breaks app)
  "url-path-dotseg-remove"  => "enable", # recommended highly (unless breaks app)
)

include_shell "/usr/share/lighttpd/create-mime.conf.pl"

deflate.mimetypes = ( "text/html",
                      "text/plain",
                      "text/css",
                      "text/javascript",
                      "text/xml",
                      "application/atom+xml" )
deflate.allowed-encodings = ( "gzip", "deflate" )

index-file.names            = ( "index.html" )
url.access-deny             = ( "~" )

$HTTP["url"] =~ "^/idle.nprescott.com/cgi-bin/" {
        alias.url += ( "/idle.nprescott.com/cgi-bin/" => "/var/www/cgi-bin/" )
        cgi.assign = ( ""  => "" )
}

server.compat-module-load   = "disable"
server.modules += ( "mod_staticfile" )

Most of it is the default configuration on Debian. I added the CGI handling and turned on compression. The one tiny bit of fun comes in the addition of the search box to the archive page. The entire diff to the static site generator to enable it is this:

diff -r 39daf0954970 generator.tcl
--- a/generator.tcl     Sun Feb 18 00:05:36 2024 -0500
+++ b/generator.tcl     Sun Feb 18 00:05:44 2024 -0500
@@ -110,7 +110,13 @@
     global DIR FOOTER
     set fp [open $DIR/posts/archive.html w]
     puts $fp [make_header "Idle Cycles"]
-    puts $fp {    <h1>Post Archive</h1>
+    puts $fp {    <h1>Post Archive</h1>}
+    puts $fp {    <div>
+      <form action="/cgi-bin/search" method="POST" enctype="multipart/form-data">
+       <input type="text" name="terms" placeholder="your search terms...">
+       <input type="submit">
+      </form>
+    </div>
     <ul>}
     db eval {select path, date, title, path_slug from post order by date desc} {
        puts $fp [subst {      <li>

The full-text search (FTS) tables have been quietly working for ages. The new CGI program on the server is this (basically unchanged since I originally wrote about it):

#!/usr/bin/env tclsh
package require sqlite3
package require ncgi

sqlite3 db /var/www/data/posts.db -create false -readonly true
::ncgi::parse
set terms [::ncgi::value terms]
puts {Content-Type: text/html

<head>
  <meta charset="utf-8">
  <link rel="icon" href="data:,">
</head>
<h2>Search Results:</h2>
<ul>
}
if { [ catch {
    db eval {
       select post.title, path_slug, date
         from post_fts
         join post using(path)
        where post_fts.body match :terms
        order by date desc
    } {
        puts "<li><a href=\"/$path_slug\">$title</a></li>"
    }
} ] } { puts {<p>problem during search, consider checking
<a href="https://www.sqlite.org/fts5.html#full_text_query_syntax">the query syntax</a></p>} }
puts {</ul>}
db close

load balancer

Perhaps obviously, then, I also installed and configured HAProxy to front the web server and do TLS termination. I find HAProxy a little easier to administer than Nginx, though that may just be because it is doing less (no static file server!). I hadn't intended to immediately add much net-new functionality, but the addition of some CGI prompted me to throw in some basic rate limiting to guard against the run-of-the-mill bad bots that crawl the server endlessly. I may yet tweak the limits or expand the handling beyond just the CGI programs. This isn't quite the entire configuration; there's a necessary second frontend for HTTPS, but it is nearly copy-paste:

global
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy

defaults
    log     global
    mode    http
    option  httplog
    option  dontlognull
    timeout connect 5000
    timeout client  50000
    timeout server  50000

frontend http-in
    bind :::80
    acl is_blog hdr(host) -i idle.nprescott.com
    stick-table  type ipv6    size 1m    expire 120s    store http_req_rate(120s)
    http-request track-sc0 src
    http-request deny deny_status 429 if { path_beg /cgi-bin/ } { sc_http_req_rate(0) gt 15 }
    http-request set-path /idle.nprescott.com/%[path] if is_blog
    default_backend lighttpd-server

backend lighttpd-server
    # this is only a tiny bit tricky: it is inside the chroot and
    # defined in lighttpd.socket
    server s1 /run/lighttpd.sock
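
To make the order of those frontend rules concrete, here is a minimal Python sketch of what happens to a single request. It is only an illustration of the logic as I read it, not how HAProxy implements it: the class and function names are made up, the exact sliding window stands in for HAProxy's own rate counter, and "forward"/"deny" are just labels. The limits (120 seconds, more than 15 requests) and the host name come from the configuration above.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 120            # http_req_rate(120s)
MAX_REQUESTS = 15               # "sc_http_req_rate(0) gt 15"
BLOG_HOST = "idle.nprescott.com"


class Frontend:
    """Hypothetical stand-in for the http-in frontend."""

    def __init__(self):
        # source address -> timestamps of its recent requests
        # (plays the role of the stick-table)
        self.seen = defaultdict(deque)

    def request_rate(self, src, now):
        """Record one request from src; return how many fall inside the window."""
        stamps = self.seen[src]
        stamps.append(now)
        while stamps[0] <= now - WINDOW_SECONDS:
            stamps.popleft()
        return len(stamps)

    def handle(self, src, host, path, now):
        """Return ("deny", None) or ("forward", path_sent_to_backend)."""
        # track-sc0 src: every request is counted, not only the CGI ones
        rate = self.request_rate(src, now)
        # the deny rule runs before the rewrite, so it sees the original path
        if path.startswith("/cgi-bin/") and rate > MAX_REQUESTS:
            return "deny", None
        # set-path /idle.nprescott.com/%[path]; path already starts with "/"
        if host.lower() == BLOG_HOST:
            path = "/" + BLOG_HOST + "/" + path
        return "forward", path


fe = Frontend()
outcomes = [
    fe.handle("2001:db8::1", BLOG_HOST, "/cgi-bin/search", now=t)
    for t in range(20)
]
print([status for status, _ in outcomes])
```

Two things fall out of writing it down this way. Requests for static pages count toward the limit even though only the CGI paths are ever denied, and the rewritten path ends up with a doubled slash after the host name prefix.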

better security

Now that I've bothered to clean it all up I'll cop to the fact that for years I pointed a location directive for Nginx to my own home directory on the server. While this isn't terribly wrong, it did tend to get confusing when trying to back things up and move things around. It also limited my ability to opt into features like private home directories for system services. Since I was already in the weeds re-configuring a web server, I bit the bullet and moved things into a more typical /var/www/html-style document root. It is still early but it feels nice to know exactly what is public and where to find things.

I have configured the web server to run as a dynamic, nearly permission-less user inside of a private network. Partly this is for my own peace of mind as I open up some new functionality to the internet via CGI programs (I'm living in the past, having a blast). The end result should be an improved security posture: the load balancer is chrooted into a dedicated directory and forwards traffic to a web server with no network access, running as a user without the means to write to, or in many cases read from, the file system.

where to now?

I'll see you in another 2 to 5 years!

Okay, fine. I'm happy with this progress and the state of the system. I've cleaned up some cruft in the file system and turned off a few neglected servers and services that had been getting long in the tooth. I'm curious to see how things go with the new additions, and I have a few things in mind to try that the new functionality enables. Probably, though, things will just keep quietly working like they have for years. One lingering bit of work is how I sync this site with the server. I author it on my computer and, in the past, would just rsync the directory after running the static site generator. With the addition of full-text search I need to move a second file (the SQLite database) to a second location; maybe this will be the thing that finally gets me to consider a better process.
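
One possible shape for that process, sketched in Python under several assumptions: the SSH host alias `blog`, the local `public/` directory, and the local `posts.db` file are all hypothetical, and the site destination is only a guess based on the document root and the path prefix HAProxy adds. The database destination is the path the CGI program opens. By default the sketch only prints the rsync commands it would run.

```python
import subprocess


def sync_commands(site_dir, db_file, host,
                  site_dest="/var/www/html/idle.nprescott.com/",  # assumed layout
                  db_dest="/var/www/data/posts.db"):
    """Build the two rsync invocations: one for the site, one for the database."""
    # a trailing slash makes rsync copy the directory's contents,
    # not the directory itself
    site_src = site_dir.rstrip("/") + "/"
    return [
        ["rsync", "-az", site_src, f"{host}:{site_dest}"],
        ["rsync", "-az", db_file, f"{host}:{db_dest}"],
    ]


def deploy(site_dir, db_file, host, dry_run=True):
    """Print the commands, or run them in order and stop at the first failure."""
    for cmd in sync_commands(site_dir, db_file, host):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


# hypothetical local paths and host alias
deploy("public", "posts.db", "blog")
```

Keeping the command construction separate from the execution makes it easy to check what would be copied where before anything touches the server.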