ab Testing
2016-05-06
Which is faster - SQLite or the file system? Performance testing Flask with ApacheBench in a (minimal) real-world scenario.
Overview
As per usual, I'm using glue-boy as a test bed for these kinds of experiments. I've written about it before, but in short: it's a simple pastebin, designed around the file system. From storage to access records, it's super simple and leverages a lot of little UNIX tools for reporting and cleanup.
I got to thinking that I should try porting it to SQLite and do a compare/contrast of any differences. I decided I'd combine my efforts with a recent interest in performance testing and a great fondness for charts.
This is strictly a comparison test of relative performance between the two implementations. The cited numbers are running realistic loads under the production WSGI server, but all of it is running locally. This is intentional, as I do not want to test the network or the computer I'm running it all on. The response times should not be taken as an indicator of real-world performance.
Changes to glue-boy
You can check out the full changes made to port the application to sqlite3 on GitHub. Things to watch for:
- how connections are closed after each relevant request
- how a "last accessed" field is updated each time a paste is read
I thought of several ways to optimize the current operations, as well as ways I might improve the SQLite implementation, but resisted the urge to change too much. The goal was a feature-for-feature port of the application with no perceivable changes, which I think I accomplished.
Test Setup
I ran each test instance individually and tried to monitor background activity
to give a clear picture of actual app performance rather than network lag or
background load. I ran the application with gunicorn
configured for 4 workers,
since that is how I run it in the wild, proxied behind Nginx.
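The gunicorn setup above can be captured in a small config file. A minimal sketch, assuming a `gunicorn.conf.py` next to the app; the file name and bind address are my assumptions, not taken from glue-boy's actual deployment:

```python
# gunicorn.conf.py -- sketch of the test configuration described above:
# four worker processes, bound to the local address ab will target.
workers = 4
bind = "127.0.0.1:8000"
```

Started with something like `gunicorn -c gunicorn.conf.py app:app`, with Nginx proxying in front in production.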
What `ab` Does and Doesn't Do
I used ApacheBench for my tests; it was already installed and I didn't see any deal-breakers relative to some of the alternatives. There are, however, things worth mentioning:
- `ab` can't do `localhost`; the network address needs to be `127.0.0.1`
- The URL has to contain a trailing slash (this was only slightly maddening)
- `POST` data must be base64/url-encoded1
To issue 1,000 `GET` requests, with 10 simulated concurrent users:

```shell
ab -n 1000 -c 10 http://127.0.0.1:8000/content/a8137a
```
To issue 1,000 `POST` requests, with 10 simulated concurrent users:

```shell
ab -n 1000 -c 10 -T "application/x-www-form-urlencoded" \
    -p post-body.txt http://127.0.0.1:8000/
```
It is probably worth mentioning that glue-boy has never seen anything near these levels, in total requests or concurrent users. The current system pretty much runs itself and has for over a year at this point.
GET Request
`ab` is pretty limited in what and how it tests; for the data below I was repeatedly re-requesting the same URL/paste content. I inserted about 200 lines of junk text from a log file, totaling approximately 9 kB.
I will admit some surprise at the results in favor of SQLite here. If you checked the code you will have noticed there are two separate SQL execution calls:
- one to update the requested record's "last accessed" time
- one to return the body of the record
And yet, SQLite is about twice as fast as the original implementation, which simply reads a text file containing the record from disk.
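The two-call read path above can be sketched in a few lines. A minimal illustration, assuming a hypothetical `pastes` table with `content` and `last_accessed` columns (not glue-boy's actual schema):

```python
import sqlite3
import time

def fetch_content(conn, paste_id):
    """The two-statement read: bump the access time, then return the body."""
    conn.execute(
        "UPDATE pastes SET last_accessed = ? WHERE id = ?",
        (int(time.time()), paste_id),
    )
    conn.commit()
    row = conn.execute(
        "SELECT content FROM pastes WHERE id = ?", (paste_id,)
    ).fetchone()
    return row[0] if row else None
```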
Percentage of the requests served within a certain time (ms)
Percentage | File System | SQLite |
---|---|---|
50% | 28 | 30 |
66% | 29 | 31 |
75% | 30 | 32 |
80% | 31 | 32 |
90% | 33 | 34 |
95% | 34 | 36 |
98% | 36 | 39 |
99% | 38 | 46 |
100% | 42 | 65 |
I borrowed heavily2 in creating the charts below, trying to really identify the meaningful information from the benchmark results.
I found the suggested scatterplot to be more informative, but non-intuitive at first. The thing to keep in mind is the x-axis "wall clock" measure is really only relevant for each individual "mode" (indicated by color).
Connection Times (ms)
 | File System | SQLite |
---|---|---|
min | 7 | 5 |
mean | 66 | 29 |
median | 65 | 28 |
max | 84 | 106 |
POST Request
I base64 encoded the same junk text file from the GET request, resulting in a file size of about 12.5 kB. The trick to matching `ab`'s file format was to specify the form "key" after encoding. In this case the HTML form contains a textarea called "content", so the file looks like:

```
content=LyoqDQogKiAgR2VuZXJpYyBIVFRQIERldmljZSB2MS4wLjIwMTYwNDAyDQogKg0KICogIFNv\dXJjZSBjb2RlIGNhbiBiZSBmb3VuZCBoZXJlOiBodHRwczovL2dpdGh1Yi5jb20vSlotU21hcnRUaGlu===
```
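Producing a file like that takes only a few lines. A sketch, assuming the junk text lives in a hypothetical `junk.txt`; base64's four-bytes-for-three encoding is also where the roughly one-third size growth comes from:

```python
import base64

def make_post_body(raw: bytes) -> bytes:
    """Base64-encode the payload and prefix the form key, yielding a
    body ab can send with -p/-T as shown above."""
    return b"content=" + base64.b64encode(raw)

if __name__ == "__main__":
    with open("junk.txt", "rb") as src:       # hypothetical input file
        body = make_post_body(src.read())
    with open("post-body.txt", "wb") as dst:  # the file passed to ab -p
        dst.write(body)
```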
I have to mention my surprise again here: after the GET requests turned out in favor of SQLite, I had thought for sure the POST requests would do the same. Instead, the file system did better than SQLite on writes; narrowly, but still better.
Percentage of the requests served within a certain time (ms)
Percentage | File System | SQLite |
---|---|---|
50% | 28 | 30 |
66% | 29 | 31 |
75% | 30 | 32 |
80% | 31 | 32 |
90% | 33 | 34 |
95% | 34 | 36 |
98% | 36 | 39 |
99% | 38 | 46 |
100% | 42 | 65 |
Connection Times (ms)
 | File System | SQLite |
---|---|---|
min | 3 | 14 |
mean | 28 | 31 |
median | 28 | 30 |
max | 42 | 65 |
Final Thoughts
I haven't yet decided how to proceed with this information. It is a drop-in replacement for the existing system, (marginally) faster on reads and nearly matching write speeds, and yet I can't help but feel some reluctance to toss out what has been working without issue for all this time.
The SQLite implementation is certainly more extensible than the plaintext system, with the ability to add database fields as necessary. SQLite also exhibits some curious spiking in the performance tests, though: while it routinely performs better, there are instances where it is dramatically worse. I haven't identified what causes this; probably write-locks.
1. Making the resulting file size about 1/3 larger in the process
2. Apache Bench and gnuplot: You're Probably Doing It Wrong