Auto Formatters, a Retrospective

2020-04-02

A brief retrospective of code formatters and the value they (don't) provide.

The Retrospective

A few years ago I looked into auto-formatting code in a quest towards standardizing a codebase. At the time I concluded with:

While I am excited at the direction of things like yapf and the assorted *fmt tools, I can't help but think they won't entirely solve the problem without being "baked into" the language.

I thought that having to require others to auto-format their code would be enough of a hurdle to undermine the promise of standard formatting. With several years of hindsight and experience with a few different codebases and languages, I think I've entirely changed my mind on the subject.

The problem of style nitpicks during code review is not a consequence of languages or even formatting but instead a cultural problem. Tooling can't fix a people problem and is instead an attractive distraction from the real issue (the culture or intent of code reviews).

An Example

A particularly odious example of this kind of behavior has to be an exchange like the following:

int foo = procedure();
if (foo > someValue) {
  doSomething();
}
callFunc();

"This if-block needs whitespace around it for readability"

With or without an auto-formatter, there will always be room to nitpick style and otherwise exhaust conversation during a code review. The problem usually isn't any real conviction about code style; style is simply an easy target to bloviate about without any real inspection of the code.

Auto-formatters can't (and don't) fix this and, as such, miss the mark for me. I've also come around to the idea that formatting and style have a real place in code. While testing some matrix math in Common Lisp I wrote some code like this:

(let* ((a #2A((-5  2  6 -8)
              ( 1 -5  1  8)
              ( 7  7 -6 -7)
              ( 1 -3  7  4)))
...       
  (make-array '(4 4) :initial-contents `((,x  0  0  0)
                                         ( 0 ,y  0  0)
                                         ( 0  0 ,z  0)
                                         ( 0  0  0  1))))

In this case I've very intentionally formatted the 4x4 matrices in a way that is contrary to most CL formatting — this would be obvious to a person reading the code and very difficult to encode in a formatter. While there's no reason a person couldn't say "is this really the best way to enter the data?", the format conveys intention.

I've watched from a distance as projects like the Python library Black have gained traction with taglines like:

By using Black, you agree to cede control over minutiae of hand-formatting. In return, Black gives you speed, determinism, and freedom from pycodestyle nagging about formatting. You will save time and mental energy for more important matters.

That sounds good, or at least easy, but in my experience the promise simply hasn't borne scrutiny. Style is an aesthetic and thus an entirely human endeavor; ceding it to a machine just means no one is happy with it.
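
For illustration, the matrix case from earlier can be sketched in Python. This is a hypothetical example, not code from the original experiment: the names A and scaling are made up, and the values are copied from the Lisp snippet. Black normally normalizes whitespace inside brackets, which would collapse the column alignment. Its documented "# fmt: off" / "# fmt: on" comments fence off a region it leaves untouched, so a person still has to decide where hand-formatting carries meaning.

```python
# Hypothetical illustration; names and values are assumptions for this sketch.

# fmt: off
# Columns are hand-aligned so the matrix reads as a matrix.
A = [
    [-5,  2,  6, -8],
    [ 1, -5,  1,  8],
    [ 7,  7, -6, -7],
    [ 1, -3,  7,  4],
]
# fmt: on


def scaling(x, y, z):
    """Return a 4x4 scaling matrix as nested lists, one row per line."""
    return [
        [x, 0, 0, 0],
        [0, y, 0, 0],
        [0, 0, z, 0],
        [0, 0, 0, 1],
    ]
```

Without the two comments, a formatter would be free to rewrite the first row as [-5, 2, 6, -8] and lose the alignment of the columns.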

So What?

As I read more about the J programming language I have to contend with its alleged unreadability and consider the potential of literate programming. I can only surmise that there's no getting away from discussions of style, and that rather than deferring them to the machine, they should be conducted with more rigor, not less. I've had a real interest in the ideas of clean room programming since I read Allan Stavely's Toward Zero-Defect Programming. While I've never had an opportunity to try it, I do wonder whether it might surface (and conclude!) these sorts of discussions in the context of a truly rigorous review.

I've been unable (yet!) to trick, er, convince anyone to practice clean room programming with me, and I'm left to wonder whether that is a consequence of the tedium present in so much C-style programming. The APL family of languages might reduce a page of C code to a single line and thus elevate conversation from the exhaustiveness of a loop to the intent behind a function. Then again, I was pretty wrong about auto-formatters; maybe I'm just as wrong about the impact of languages.