While trying to document how I have been provisioning my own servers lately I realized what I really wanted was a minimal application to include in the example. I tend to get container fatigue with the amount of moving pieces to maintain with Docker, Podman, Kubernetes, etc. so I went looking for something simpler. What I found was Python's native zipapp.
My motivation is to demonstrate a real-enough application in the context of process management, system security, and network management. That alone feels broad enough that I don't want to bog things down with the complexities of an example application or any one particular application format. I could have reused a container or static binary but instead I settled on a WSGI application running under a real WSGI server (as opposed to something like the Flask or Django development servers). This is reasonably representative of a real workload and fits on a single screen.
I've written before about my order of preferences for deploying applications and I ranked copying Python's virtual environments around pretty low on the list of ideal candidates. Even worse though is the idea of making the deploy target do things like pip install my_application. It isn't really difficult to make a container image for this case but it is everything that follows that becomes annoying. Unless I'm going to recreate my issues with pip installing a package by requiring the deployment machine to build container images, I have to configure an account on a registry or work out how to copy images between the build and target machines myself.
My own experience makes me think whatever I write would quickly become out of date and I wouldn't learn much doing it again. Instead I'd like to learn something new about potential alternatives for deploying Python. What I will be deploying is the most basic WSGI server you can imagine: a 200 OK server in a file __main__.py:
import waitress

def app(environ, start_response):
    # Read the request body and echo it back with a 200 OK.
    content_length = environ.get('CONTENT_LENGTH', None)
    if content_length is not None:
        content_length = int(content_length)
    body = environ['wsgi.input'].read(content_length)
    content_length = str(len(body))
    start_response(
        '200 OK',
        [('Content-Length', content_length), ('Content-Type', 'text/plain')]
    )
    return [body]

if __name__ == '__main__':
    waitress.serve(app)
Here I am going to use waitress because I like how small the entire library is; it has no dependencies outside of the standard library and that kind of thing remains exciting to me. That file is the only file in a directory demo. In order to package the requirements (here just waitress) alongside it I can use the --target flag to pip:
$ python -m pip install waitress --target demo
The result is a directory tree that looks like this:
demo/
├── bin
│ └── waitress-serve
├── __main__.py
├── waitress
│ ├── adjustments.py
│ ├── buffers.py
│ ├── channel.py
│ ├── compat.py
│ ├── __init__.py
│ ├── __main__.py
│ ├── parser.py
│ ├── proxy_headers.py
│ ├── receiver.py
│ ├── rfc7230.py
│ ├── runner.py
│ ├── server.py
│ ├── task.py
│ ├── trigger.py
│ ├── utilities.py
│ └── wasyncore.py
└── waitress-2.1.2.dist-info
├── entry_points.txt
├── INSTALLER
├── LICENSE.txt
├── METADATA
├── RECORD
├── REQUESTED
├── top_level.txt
└── WHEEL
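At this point the demo can already be run straight from the directory: Python will execute a directory that contains a __main__.py, putting that directory at the front of sys.path, which is the same mechanism zipapp relies on.
$ python demo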
With a __main__.py entrypoint for the zipapp module and all of my (one) requirements prepared it is possible to bundle everything into a single zip file like this:
$ python -m zipapp --output wsgi-demo.pyz demo
The result is a zip file that the Python interpreter can execute, which includes the bundled dependencies in a format that requires no additional work to configure paths or environments. Of course, because the interpreter isn't bundled with the zip file there is the potential to write or bundle code that is incompatible with the deployment system's version of Python. Along with my preference toward the standard library I like using stable releases and simply testing for backwards compatibility. In this case I packaged the above on my laptop using Python 3.11 but tested it all the way back [1] to Python 3.7, which is going end-of-life in two days.
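As an aside, zipapp can also embed an interpreter line via its --python flag, in which case it marks the archive executable on POSIX systems. That is a convenience rather than anything the rest of this post relies on:
$ python -m zipapp --output wsgi-demo.pyz --python '/usr/bin/env python3' demo
$ ./wsgi-demo.pyz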
For my own example I will just use the deploy server's version of Python rather than a container for the sake of simplicity. I'm pleased with the flexibility here though; whether the goal is to provision a known-good target environment and rely on having an LTS release of the runtime, or to separate the application from the runtime environment in a container, the zipapp seems to mostly stay out of the way.
One of the motivations for investigating alternative deployment options has been how stilted it can feel to drop containers into my existing workflow. Many container technologies don't really play nicely with the host operating system's capabilities and instead want to dictate how things are done in the (probably justified) name of uniformity. For example, I like running services under dedicated user accounts with limited permissions. This is a capability built into most Linux distributions via systemd and I am comfortable with the work involved. Container runtimes provide similar functionality but it does not integrate in the same way and often relies instead on the container daemon having root-level access to achieve the same results. It is possible, especially with Podman, to use "rootless" containers but they do not integrate cleanly with the init system and I find them harder to manage as a result.
Here then is an opportunity for trying out my zipapp in a workflow that I prefer. I have written previously about how systemd-socket-proxy can be used as a shim to socket activate services that were not built with socket activation in mind. With this blindingly simple demo though it is possible to build in socket activation - I'm trying to demonstrate how I would like to do things after all.
First, I'll change my WSGI application so that rather than binding ports it receives a file descriptor from the process management system (systemd). This is possible in a number of different servers but I like how simple it is under waitress:
import socket

import waitress

def app(environ, start_response):
    # Read the request body and echo it back with a 200 OK.
    content_length = environ.get('CONTENT_LENGTH', None)
    if content_length is not None:
        content_length = int(content_length)
    body = environ['wsgi.input'].read(content_length)
    content_length = str(len(body))
    start_response(
        '200 OK',
        [('Content-Length', content_length), ('Content-Type', 'text/plain')]
    )
    return [body]

if __name__ == '__main__':
    # systemd passes activated sockets starting at file descriptor 3.
    SYSTEMD_FIRST_SOCKET_FD = 3
    sockets = [socket.fromfd(SYSTEMD_FIRST_SOCKET_FD, socket.AF_INET, socket.SOCK_STREAM)]
    waitress.serve(app, sockets=sockets)
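A small variation, purely a sketch on my part built on systemd's documented LISTEN_PID and LISTEN_FDS environment variables, would let the same __main__.py fall back to binding its own port when it is not socket activated:
import os
import socket

import waitress

def app(environ, start_response):
    # Placeholder handler; the echo app above works identically here.
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'OK']

if __name__ == '__main__':
    SYSTEMD_FIRST_SOCKET_FD = 3
    # systemd sets LISTEN_PID to the activated process's PID when it
    # passes sockets in; anything else means we were started directly.
    if os.environ.get('LISTEN_PID') == str(os.getpid()):
        sockets = [socket.fromfd(SYSTEMD_FIRST_SOCKET_FD, socket.AF_INET, socket.SOCK_STREAM)]
        waitress.serve(app, sockets=sockets)
    else:
        # Not socket activated: bind a development port ourselves.
        waitress.serve(app, listen='127.0.0.1:8080')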
Where before it was possible to execute the server directly, it now requires sockets to be prepared before starting. This is a boon for deployment but might be annoying for development. Not to worry though: the systemd developers planned for this case too, and just as systemd-socket-proxy allows for simulating socket activation, systemd-socket-activate does basically the reverse. It will let you bind a socket and then invoke your program as though it were called by the init system. It sounds more complicated than it is. Previously the zipapp was run like this:
$ python wsgi-demo.pyz
It is now run like this for socket activation:
$ systemd-socket-activate -l '127.0.0.1:8080' python wsgi-demo.pyz
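With that running in one terminal, a quick check from another terminal confirms the activated server echoes request bodies straight back:
$ curl --data 'hello' http://127.0.0.1:8080/
hello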
Of course that is only for development and testing. The deploy process will be to define a systemd socket (wsgi-demo.socket):
[Socket]
ListenStream=8080
[Install]
WantedBy=sockets.target
Which will have an associated service (wsgi-demo.service):
[Unit]
Requires=wsgi-demo.socket
After=wsgi-demo.socket
[Service]
DynamicUser=true
PrivateNetwork=yes
ExecStart=/usr/bin/python /opt/wsgi-demo.pyz
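With both unit files installed under /etc/systemd/system, the socket can be enabled and started in one step:
$ systemctl enable --now wsgi-demo.socket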
Now my WSGI server has exceedingly tight restrictions on how and what it can access. It has no network access, cannot write to the host system, etc. Additional capabilities can be added slowly and as required, which is a much better feeling than the one I get simply exposing ports out of a Docker container.
Perhaps more exciting though is how this lends itself to things like zero-downtime upgrades, or how much easier it becomes to scale up WSGI servers now that they are no longer managing their own ports. Some servers allow for a configurable number of workers but I have not yet seen one that allows the workers to be scaled up without a restart. With systemd units it is possible to template the service file and launch new services using the same port as needed. This generally requires the application be stateless but can better distribute work across multiple CPUs without requiring changes to the application. The necessary configuration looks like this:
socket (wsgi-demo@.socket):
[Unit]
Description=socket for wsgi-demo %i
[Socket]
ListenStream=8080
ReusePort=true
Service=wsgi-demo@%i.service
[Install]
WantedBy=sockets.target
service (wsgi-demo@.service):
[Unit]
Description=wsgi-demo server %i
Requires=wsgi-demo@%i.socket
[Service]
DynamicUser=true
PrivateNetwork=yes
ExecStart=/usr/bin/python /opt/wsgi-demo.pyz
With that much done it is possible to start four distinct sockets, which will start four Python processes (lazily, as connections arrive), by doing:
$ systemctl start wsgi-demo@{1..4}.socket
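Scaling back down means stopping both the socket and its already-running service, since stopping a socket unit does not stop the service it activated:
$ systemctl stop wsgi-demo@{3..4}.socket wsgi-demo@{3..4}.service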
The way I see it there are a few potential negatives with zipapps, one already discussed. The interpreter is not bundled and this invites the chance that the interpreter running the zipapp does not support some feature of the language used. The second is that zipapps cannot package C extensions. For most of my uses this does not matter. The linked page describes how to work around this but the level of effort is probably close to that of just using containers. The last hurdle I see is that the workflow is sufficiently different that it would probably be annoying to introduce in a team setting. I can easily imagine a less experienced person accidentally checking the entirety of the installed requirements into source control because of the way you can --target the local directory so easily. It seems the correct way to build would be with a temporary build directory that copies the application source and then installs requirements into it, rather than installing into your source directory directly. It is easy enough for me to imagine how to do this but it is one more thing to explain in a team setting.
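A minimal sketch of that build flow, with names matching the demo above:
$ BUILD_DIR="$(mktemp -d)"
$ cp demo/__main__.py "$BUILD_DIR"
$ python -m pip install waitress --target "$BUILD_DIR"
$ python -m zipapp --output wsgi-demo.pyz "$BUILD_DIR"
$ rm -rf "$BUILD_DIR"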
For my own uses I'm pleasantly surprised with how simple zipapps have made experimenting with different architecture choices, like comparing HAProxy round-robin load balancing against reused ports on a single systemd socket. I will probably use them in the future simply to avoid pulling in ever more dependencies.
[1] I said I would be avoiding containers for this post but they do have value in cases like this. Rather than futz with installing old versions of Python to run my zipapp it is possible to use Podman to pull old versions and point them at the zip file in a volume mount:
$ podman run -it --rm -p 8080:8080 -v $PWD:/demo:z python:3.7-slim python /demo/wsgi-demo.pyz