2018-10-13

I've been doing a bit of reading this weekend including a chapter on the Monte Carlo method out of Data Analysis with Open Source Tools and making it about half-way into Practical Common Lisp. It occurred to me it might be fun to write a little program combining the two.

The Monty Hall problem is a well-known, albeit counter-intuitive, puzzle demonstrating a bit of probability theory. The premise is:

Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another door, say No. 3, which has a goat. He then says to you, "Do you want to pick door No. 2?" Is it to your advantage to switch your choice?

The trick is, the first time you choose, your chance of being correct
is ^{1}⁄_{3} (or, put otherwise,
^{2}⁄_{3} wrong). The claim is, switching your
choice gives you a ^{2}⁄_{3} chance of winning.

Broadly speaking, Monte Carlo methods are those which adhere to the
following pattern *"in order to solve problems that might be
deterministic in principle"*:

- Define a domain of possible inputs
- Generate inputs randomly from a probability distribution over the domain
- Perform a deterministic computation on the inputs
- Aggregate the results

There is, of course, a well-known explanation for the Monty Hall problem, but I'll be honest, seeing a proof for it is pretty dense. Performing a simulation is a nice way to prove it to yourself (as apparently was the case with Paul Erdős).

Having finished up with Touretzky's Common Lisp: A Gentle
Introduction to Symbolic
Computation, I'm
really enamored with the idea of "symbolic" computation. While
Janert's *Data Analysis* performs an analysis in Python, I've been won
over to the idea of mental friction imposed by forcing what should be
a symbolic "thing" into Python's data model and treating it as a
string. It's also just fun.

A first step is to write a helper function for performing a random
choice from a list of options (the function `nth`

takes an
index-postion from a sequence, you can imagine Python's `list[index]`

syntax):

```
(defun random-choice (options)
(nth (random (length options)) options))
```

From there, I've defined a `make-choice`

function, it locally defines
the "doors" and performs a random selection as the first choice,
returning the first choice and the list of remaining doors:

```
(defun make-choice ()
(let* ((doors '(prize goat goat))
(first-choice (random-choice doors)))
(if (equal first-choice 'prize)
(setf doors (list 'prize (random-choice (cdr doors)))
(setf doors (list 'prize first-choice)))
(list first-choice doors)))
```

I've modeled each of the three choices as a function, operating on two parameters the first selection and the remaining choices:

```
(defun remain (first-choice choices)
first-choice)
(defun blindly-choose (first-choice choices)
(random-choice choices))
(defun switch (first-choice choices)
(first (remove first-choice choices)))
```

`blindly-choose`

is the strategy of randomly picking between the first
pick and the remaining. `switch`

achieves it's namesake by removing
the first-round-pick from the list of remaining, and returns the
"first" (and only) option left. With those stategy functions, we can
define a test, which performs some number of trials repeatedly using a
given strategy:

```
(defun test-strategy (strategy num-trials)
(loop for trial from 1 to num-trials collect
(apply strategy (make-choice))))
```

The output of `test-strategy`

is a list of the results for a given
number of "games" played. If we wanted to try 10 games of choosing
blindly:

```
* (test-strategy 'blindly-choose 10)
(GOAT GOAT PRIZE PRIZE GOAT PRIZE GOAT GOAT PRIZE GOAT)
```

If we're following the steps outlined above, we've got a domain of inputs, random input for the first choice and the three strategies outline our deterministic computation. All that's left is to aggregate the results:

```
(defun aggregate (test-output)
(let ((wins (count 'prize test-output))
(losses (count-if-not #'(lambda (x) (equal x 'prize)) test-output)))
(list wins 'wins losses 'losses)))
```

I rather like this as a high-level operational view of what we're doing, the resulting call ends up looking like:

```
* (aggregate (test-strategy 'blindly-choose 100))
(49 WINS 51 LOSSES)
```

In order to do them all at once over a larger number of trials I can also write a short function to summarize each strategy:

```
(defun summarize (&optional (trials 100))
(let ((results (mapcar #'(lambda (s) (cons s (aggregate (test-strategy s trials))))
'(remain blindly-choose switch))))
(format t "~{~{~14a ~a ~a ~a ~a~}~%~}" results)))
* (summarize 1000)
REMAIN 322 WINS 678 LOSSES
BLINDLY-CHOOSE 500 WINS 500 LOSSES
SWITCH 662 WINS 338 LOSSES
```

Just like that, a computer simulation of one thousand takes on the
Monty Hall problem utilizing the three strategies available. Remaining
with the original choice results in winning
^{1}⁄_{3} of the time, blindly picking between
the two remaining in the second round gives you even odds, and
switching gives you a ^{2}⁄_{3} chance of
winning.

[nolan@nprescott.com] $> █