I wrote the other day about a series of design choices that I aim for when starting a new project. In doing so I lured myself into testing out a thought that has been rolling around my head for a while: what is the smallest Spring Boot application that can fit such an architecture?
I tend to use Python in my own prototypes because it is familiar and I can jump immediately into getting things done. I used it as an example for scaling out across templated systemd services in a nod towards scaling even slow technologies. I have been using Spring Boot more frequently though and I've been impressed with the quality of tooling available in the Java ecosystem. Spring has a reputation for being big (and sometimes slow) and as a consequence many examples look more like something out of Patterns of Enterprise Application Architecture than a Flask application you'd write in an afternoon. What I've been wondering though is whether that is intrinsic to the framework.
The short answer is no. The slightly longer answer is that many of the things I normally do manually in Python are instead sane defaults in Spring Boot, and it is actually very ergonomic to bang out a basic HTTP endpoint like the one I described in the prior post.
I previously wrote an endpoint that records a few parameters to a database for every HTTP request made. Here I'll make an endpoint that reads from that same database, querying some of that information back out:
package com.nprescott.singlefilespring;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.boot.jdbc.DataSourceBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.jdbc.core.simple.JdbcClient;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

import javax.sql.DataSource;

@SpringBootApplication
public class SingleFileSpringApplication {
    static void main(String[] args) {
        SpringApplication.run(SingleFileSpringApplication.class, args);
    }
}

@Configuration
class DataSourceConfig {
    @Bean
    public DataSource dataSource() {
        return DataSourceBuilder
                .create()
                .driverClassName("org.sqlite.JDBC")
                .url("""
                     jdbc:sqlite:requests.db\
                     ?journal_mode=WAL\
                     &foreign_keys=true\
                     &cache_size=10000\
                     &temp_store=MEMORY\
                     &synchronous=NORMAL\
                     &busy_timeout=10000""")
                .build();
    }
}

@RestController
class CountController {
    private final JdbcClient jdbcClient;

    public CountController(JdbcClient jdbcClient) {
        this.jdbcClient = jdbcClient;
    }

    @RequestMapping("/count")
    public Integer countRequests() {
        return jdbcClient.sql("select 1")
                .query(Integer.class)
                .single();
    }
}
That is as simple as I've been able to make it. Notably there are a few deficiencies, like how it doesn't query the requests table at all and instead queries just the value 1! I started here to get things working first before fleshing them out. Similarly, this server isn't socket activated, instead binding the default port of the embedded Tomcat process (8080). That being said, with a POM file listing out the necessary dependencies (sketched below), the whole thing runs with:

mvn spring-boot:run
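The POM itself isn't reproduced here; a minimal sketch of its dependency section might look like the following, assuming the usual spring-boot-starter-parent setup and leaving the SQLite driver version as a placeholder to fill in:

<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-jdbc</artifactId>
    </dependency>
    <dependency>
        <!-- provides org.sqlite.JDBC; substitute a current release for the property -->
        <groupId>org.xerial</groupId>
        <artifactId>sqlite-jdbc</artifactId>
        <version>${sqlite-jdbc.version}</version>
    </dependency>
</dependencies>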
I started out with just a static query so as to benchmark things incrementally; with no further tweaking or tuning of the JVM or framework, the above can do more than 20,000 requests per second.
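For anyone following along at home, any HTTP load generator will produce numbers of this shape; the specific tool and flags below are illustrative rather than a record of exactly what I ran:

$ wrk -t4 -c64 -d30s http://localhost:8080/count   # tool and flags are illustrative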
Knowing that it works, let's swiftly move on to actually querying the database that is lying around locally, populated with almost half a million rows after a number of load tests. The simplest query I can think of that actually looks at the table is also a bit pathological:

select count(*) from requests;
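The controller method is the only thing that changes, ending up exactly as it appears in the full listing at the end of this post:

@RequestMapping("/count")
public Integer countRequests() {
    return jdbcClient.sql("select count(*) from requests")
            .query(Integer.class)
            .single();
}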
This works, but because SQLite doesn't (as far as I know) maintain table statistics, each invocation produces a full table scan. If you examine the connection string I gave above, I'm setting a cache_size of nearly 40MB (10,000 pages at SQLite's default 4,096-byte page size). While I chose that specific value without much thought, it happens to be approximately twice the size of the database, which makes it an interesting point at which to gauge performance and tweak some knobs. Without a cache size configured, the above endpoint can perform full table scans and respond with the counted rows about 430 times a second. Setting the cache size as I have should serve essentially the entire database out of the page cache (that is, out of RAM), and performance shifts to 6,900 requests per second.
I expect it is unlikely that an entire database will fit within RAM, but it is an interesting data point when weighed against a full table scan of half a million rows. With such variability hinging on one configuration parameter and the specific query being run, I would put this in the realm of application design: if you're going to routinely scan a full table, perhaps you'd consider caching that value (a sketch of which follows).
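I haven't actually wired that caching in, but as a sketch of the idea only: Spring Boot falls back to a simple in-memory ConcurrentMap-backed cache when caching is enabled and no other provider is present, so the count can be memoized per instance. The cache name is arbitrary, and with no eviction configured the value goes stale as soon as new requests are recorded:

import org.springframework.cache.annotation.Cacheable;
import org.springframework.cache.annotation.EnableCaching;

@Configuration
@EnableCaching
class CacheConfig {
}

@RestController
class CountController {
    private final JdbcClient jdbcClient;

    public CountController(JdbcClient jdbcClient) {
        this.jdbcClient = jdbcClient;
    }

    // memoized after the first call: the full table scan happens once per
    // process rather than once per request
    @Cacheable("request-count")
    @RequestMapping("/count")
    public Integer countRequests() {
        return jdbcClient.sql("select count(*) from requests")
                .query(Integer.class)
                .single();
    }
}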
It is worth exploring some of the sane defaults I mentioned. In the example program I'm using DataSourceBuilder because it turned up in a search for configuring SQLite pragmas through the connection string. It turns out it helpfully defaults to using a connection pool (HikariCP), and as a result the pragmas are applied once per pooled connection rather than on every request. If instead you do the slightly naive thing and construct the data source directly, every request opens a fresh connection that has to be reconfigured. In my case that meant this alternative worked but was about 50% slower:
@Configuration
class DataSourceConfig {
    @Bean
    public DataSource dataSource() {
        SQLiteConfig config = new SQLiteConfig();
        config.setJournalMode(SQLiteConfig.JournalMode.WAL);
        config.setBusyTimeout(10_000);
        config.setSynchronous(SQLiteConfig.SynchronousMode.NORMAL);
        config.setTempStore(SQLiteConfig.TempStore.MEMORY);
        config.setCacheSize(10_000);

        SQLiteDataSource dataSource = new SQLiteDataSource(config);
        dataSource.setUrl("jdbc:sqlite:requests.db");
        return dataSource;
    }
}
As a matter of taste I slightly prefer the explicit, typed configuration builder, so I went rummaging around for how to use it while still benefiting from the connection pooling. Once you know what you're looking for it is easy to drop in:
@Configuration
class DataSourceConfig {
    @Bean
    public DataSource dataSource() {
        SQLiteConfig config = new SQLiteConfig();
        config.setJournalMode(SQLiteConfig.JournalMode.WAL);
        config.setBusyTimeout(10_000);
        config.setSynchronous(SQLiteConfig.SynchronousMode.NORMAL);
        config.setTempStore(SQLiteConfig.TempStore.MEMORY);
        config.setCacheSize(10_000); // slight gotcha with it being pages instead of kb

        HikariConfig hikariConfig = new HikariConfig();
        hikariConfig.setJdbcUrl("jdbc:sqlite:requests.db");
        hikariConfig.setDriverClassName("org.sqlite.JDBC");
        hikariConfig.setDataSourceProperties(config.toProperties());
        return new HikariDataSource(hikariConfig);
    }
}
I previously said:
Moving to faster technologies doesn't present too much of a challenge and with the architecture described here is achievable incrementally.

It seems only right then that I'd demonstrate as much.
I will admit, this one doesn't feel great. The JVM and Spring start times are not blazing fast, so it is unlikely that a service I'd write with them would be launched on request. Still, it is going to bother me if I simply can't figure it out. The short version of socket activation is roughly: the process supervisor (systemd) listens on a port, and when a connection arrives it hands a file descriptor off to the process that should handle it; the receiving process reads from that file descriptor rather than binding a port itself.
What I have managed to find is that Tomcat has some support for not binding a network port and instead receiving a channel on which to communicate, though all I've turned up are passing references to inetd (promising!) and inheritedChannel. Where I've run into some confusion is that the inherited channel appears to be fixed to file descriptor 0 and isn't configurable, while systemd hands sockets off starting at file descriptor 3.
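To make the constraint concrete: System.inheritedChannel() is the JDK API underneath all of this and, as far as I can tell, it only ever looks at descriptor 0. A tiny standalone probe (a diagnostic of my own, separate from the application) makes it easy to check what different launch methods actually hand over:

import java.nio.channels.Channel;

// prints what, if anything, the JVM inherited on stdin: launched normally it
// reports no inherited channel, launched inetd-style (e.g. under
// systemd-socket-activate --inetd) it reports the inherited channel's type
public class InheritedChannelProbe {
    public static void main(String[] args) throws Exception {
        Channel channel = System.inheritedChannel();
        System.out.println(channel == null
                ? "no inherited channel"
                : "inherited: " + channel.getClass().getName());
    }
}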
The closest I've got, then, is to do some exciting-looking configuration of the servlet connector:
@Configuration
class SystemdSocketActivationConfig {
    @Bean
    public WebServerFactoryCustomizer<TomcatServletWebServerFactory> useInheritedChannel() {
        return factory -> factory.addConnectorCustomizers(connector ->
                connector.setProperty("useInheritedChannel", "true")
        );
    }
}
Which, in the style of inetd, means the connector will use whatever socket it inherits on stdin, so the systemd socket can be configured to pass the socket as stdin and stdout rather than the usual descriptors from 3 onwards:
$ systemd-socket-activate -l 8080 --inetd -- mvn spring-boot:run
I'll admit a little hesitation in suggesting this too strongly yet. I think there are a few likely knobs to tune, like having the socket immediately launch the service without waiting for activity to reduce the impact of slow start times. More troubling is that I don't yet feel a wealth of confidence in my understanding of the implications of this servlet factory reconfiguration. Sure it works and performance is unchanged, but what am I trading by doing it this way?
The previous service used systemd's StateDirectory configuration to
ensure multiple instances referred to the same file on
disk. Adjusting the connection information here is another case
where some of the defaults of the framework kick in and things
actually feel like they get simpler. If the systemd service file is
this (StandardInput=socket is
the inetd
compatibility option):
[Unit]
Description=demo java server %i
Requires=demo@%i.socket
[Service]
DynamicUser=true
PrivateNetwork=yes
StateDirectory=wsgi-demo
StandardInput=socket
Environment="SPRING_DATASOURCE_URL=jdbc:sqlite:%S/wsgi-demo/requests.db?journal_mode=WAL&foreign_keys=true&cache_size=10000&temp_store=MEMORY&synchronous=NORMAL&busy_timeout=10000"
ExecStart=mvn spring-boot:run
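The demo@%i.socket unit referenced above isn't reproduced in this post; for orientation only, a minimal sketch might look like the following, hard-coding the port the Java backend uses below and glossing over how a real template would vary the port per instance:

[Unit]
Description=demo java server socket %i

[Socket]
# Accept=no (the default) hands over the listening socket itself, which is
# what StandardInput=socket plus Tomcat's inherited channel expect
ListenStream=8081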
Admittedly, it is a little hairy to put the full connection string into an environment variable, and I wouldn't normally use Maven to run a real application, but you'll forgive me a few shortcuts. Spring Boot's relaxed binding maps SPRING_DATASOURCE_URL onto the spring.datasource.url property, so the application itself actually simplifies through the use of Spring's @Value annotation:
@Configuration
class DataSourceConfig {
    @Bean
    public DataSource dataSource(@Value("${spring.datasource.url}") String url) {
        return DataSourceBuilder
                .create()
                .driverClassName("org.sqlite.JDBC")
                .url(url)
                .build();
    }
}
Assuming the above service is run from a systemd socket on an alternate port (8081) from the previous Python application's 8080, an HAProxy configuration to integrate the two and route between them can be as simple as:
global
    maxconn 2000
    log /dev/log local0
    user haproxy
    group haproxy
    stats socket /run/admin.sock user haproxy group haproxy mode 660 level admin

defaults
    timeout connect 10s
    timeout client 30s
    timeout server 30s
    log global
    mode http
    option httplog

frontend ingress
    bind *:80
    acl is_count path_beg /count
    use_backend java_backend if is_count
    use_backend python_backend

backend java_backend
    server app2 127.0.0.1:8081

backend python_backend
    server app1 127.0.0.1:8080
In practice I think it more likely I'd route on a more general path prefix, a subdomain, or a header, but they're all approximately as much work, so this will have to be demonstration enough.
I kept thinking I was going to hit some snag, but instead I've got a working Spring Boot application reading a SQLite database and serving HTTP requests, all in about a screenful of code. It is socket activated and integrated into systemd's service management without much fanfare. The performance is also much better than an analogous Python service, without any real work spent on tuning.
The one thing I haven't figured out yet is how to integrate systemd readiness notifications into the service management. I think that might be the last lingering piece before I've got a solid story around zero-downtime upgrades.
the final application in full
package com.nprescott.singlefilespring;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.boot.jdbc.DataSourceBuilder;
import org.springframework.boot.web.embedded.tomcat.TomcatServletWebServerFactory;
import org.springframework.boot.web.server.WebServerFactoryCustomizer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.jdbc.core.simple.JdbcClient;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import javax.sql.DataSource;
@SpringBootApplication
public class SingleFileSpringApplication {
static void main(String[] args) {
SpringApplication.run(SingleFileSpringApplication.class, args);
}
}
@Configuration
class DataSourceConfig {
@Bean
public DataSource dataSource(@Value("${spring.datasource.url}") String url) {
return DataSourceBuilder
.create()
.driverClassName("org.sqlite.JDBC")
.url(url)
.build();
}
}
@Configuration
class SystemdSocketActivationConfig {
@Bean
public WebServerFactoryCustomizer<TomcatServletWebServerFactory> useInheritedChannel() {
return factory -> factory.addConnectorCustomizers(connector ->
connector.setProperty("useInheritedChannel", "true")
);
}
}
@RestController
class CountController {
private final JdbcClient jdbcClient;
public CountController(JdbcClient jdbcClient) {
this.jdbcClient = jdbcClient;
}
@RequestMapping("/count")
public Integer countRequests() {
return jdbcClient.sql(
"select count(*) from requests"
)
.query(Integer.class)
.single();
}
}