Bayes’ Theorem (part 2)#

Review of Bayes’ Theorem#

Last class we learned how to use Bayes’ theorem to perform a Bayesian update. We wrote out the diachronic version of Bayes’ theorem:

\[ P(H \,\vert\, D) = \frac{P(H)\,P(D \,\vert\, H)}{P(D)} \]

We named its key components:

  • \(P(H)\), the probability of the hypothesis before we see the data, called the prior probability

  • \(P(H \vert D)\), the probability of the hypothesis after we see the data, called the posterior probability

  • \(P(D \vert H)\), the probability of the data under the hypothesis, called the likelihood

  • \(P(D)\), the total probability of the data under all hypotheses (expanded below)
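This last term comes from the law of total probability: summing the products of prior and likelihood over every mutually exclusive hypothesis \(H_i\) gives

\[ P(D) = \sum_i P(H_i)\,P(D \,\vert\, H_i) \]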

We then used a Bayes table to compute the posterior probabilities of all hypotheses in the cookie problem:

import pandas as pd

table = pd.DataFrame(index=['first bowl', 'second bowl'])
table['prior'] = 1/2, 1/2  # the prior: both bowls are equally likely
table['likelihood'] = 30/40, 20/40  # the first bowl has 30 of 40 vanilla cookies, the second 20 of 40
table['unnormalized'] = table['prior'] * table['likelihood']  # compute the top of the Bayesian update
prob_data = table['unnormalized'].sum()  # compute the "normalizing constant"
table['posterior'] = table['unnormalized'] / prob_data  # calculate the posterior probability of each bowl
table
             prior  likelihood  unnormalized  posterior
first bowl     0.5        0.75         0.375        0.6
second bowl    0.5        0.50         0.250        0.4
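As a quick sanity check, we can carry out the same update by hand; the posterior for the first bowl is

\[ P(\text{first bowl} \,\vert\, \text{vanilla}) = \frac{0.5 \times 0.75}{0.5 \times 0.75 + 0.5 \times 0.50} = \frac{0.375}{0.625} = 0.6 \]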

Example Problems#

Being able to apply a Bayesian update is incredibly important, so today we’ll work through two more example problems as a class.

Example #1: The Dice Problem#

First, we’ll try something just a little bit harder than the cookie problem: an example with more than two hypotheses.

Suppose we have a box with a 6-sided die, an 8-sided die, and a 12-sided die. We choose one of the dice at random, roll it, and report that the outcome is a 1. What is the probability that we chose the 6-sided die?

Let’s start by thinking intuitively about this. Which die should be most likely?

The 6-sided die is most likely because it has the highest chance of producing a 1 (1/6).
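In symbols, the likelihoods line up the same way:

\[ P(1 \,\vert\, \text{6-sided}) = \frac{1}{6} \;>\; P(1 \,\vert\, \text{8-sided}) = \frac{1}{8} \;>\; P(1 \,\vert\, \text{12-sided}) = \frac{1}{12} \]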

Now let’s set up our Bayes table. What are the hypotheses and what are their prior probabilities?

table = pd.DataFrame(index=['6-sided', '8-sided', '12-sided'])
table['prior'] = 1/3, 1/3, 1/3
table
             prior
6-sided   0.333333
8-sided   0.333333
12-sided  0.333333

Next we need to compute the likelihood of the data under each hypothesis.

In other words, what is the probability of rolling a one, given each die?
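For a fair \(n\)-sided die every face is equally likely, so the likelihood of rolling a 1 is

\[ P(\text{roll a 1} \,\vert\, n\text{-sided die}) = \frac{1}{n} \]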

table['likelihood'] = 1/6, 1/8, 1/12
table
             prior  likelihood
6-sided   0.333333    0.166667
8-sided   0.333333    0.125000
12-sided  0.333333    0.083333

Now that we have the prior and likelihood for each hypothesis, what do we calculate next?

The “unnormalized posteriors”:

table['unnormalized'] = table['prior'] * table['likelihood']
table
             prior  likelihood  unnormalized
6-sided   0.333333    0.166667      0.055556
8-sided   0.333333    0.125000      0.041667
12-sided  0.333333    0.083333      0.027778

And now there is just one last step to calculate the posterior probabilities: normalization!

prob_data = table['unnormalized'].sum()
table['posterior'] = table['unnormalized'] / prob_data
table
             prior  likelihood  unnormalized  posterior
6-sided   0.333333    0.166667      0.055556   0.444444
8-sided   0.333333    0.125000      0.041667   0.333333
12-sided  0.333333    0.083333      0.027778   0.222222

And there we have it! The probability that we chose the 6-sided die given that we rolled a one is 4/9.
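Working the normalization out by hand, the equal priors of \(1/3\) cancel, leaving

\[ P(\text{6-sided} \,\vert\, \text{rolled a 1}) = \frac{1/6}{1/6 + 1/8 + 1/12} = \frac{4/24}{9/24} = \frac{4}{9} \]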

You may have noticed by now that every time we calculate the posterior from the prior and the likelihood, we perform the exact same steps. We can simplify things going forward by introducing an update function that fills in those columns of the table:

def update(table):
    """Fill in the unnormalized and posterior columns from the prior and likelihood."""
    table['unnormalized'] = table['prior'] * table['likelihood']
    prob_data = table['unnormalized'].sum()  # the normalizing constant P(D)
    table['posterior'] = table['unnormalized'] / prob_data
    return table

table = pd.DataFrame(index=['6-sided', '8-sided', '12-sided'])
table['prior'] = 1/3, 1/3, 1/3
table['likelihood'] = 1/6, 1/8, 1/12

update(table)
             prior  likelihood  unnormalized  posterior
6-sided   0.333333    0.166667      0.055556   0.444444
8-sided   0.333333    0.125000      0.041667   0.333333
12-sided  0.333333    0.083333      0.027778   0.222222

Example #2: The Monty Hall Problem#

Now let’s consider a famously unintuitive problem in probability, the Monty Hall Problem. Many of you may have heard of this one.

[Figure: Monty Hall opening one of the three doors (_images/Monty_open_door.svg)]

On his TV show “Let’s Make a Deal”, Monty Hall would present contestants with three doors. Behind one was a prize, and behind the other two were gag gifts such as goats. The goal is to pick the door with the prize. After you pick one of the three doors, Monty opens one of the other two doors to reveal a gag gift, and then asks whether you’d like to switch doors.

What do you think, should we switch doors or stick with our original choice? Or does it make no difference?

Most people will say the prize is now equally likely to be behind either of the two remaining doors, so it doesn’t matter.

But it turns out that’s wrong! You actually have a 2/3 chance of finding the prize if you switch doors.

Let’s see why using a Bayes table.

Each door starts with an equal prior probability of holding the prize:

table = pd.DataFrame(index=['Door 1', 'Door 2', 'Door 3'])
table['prior'] = 1/3, 1/3, 1/3
table
           prior
Door 1  0.333333
Door 2  0.333333
Door 3  0.333333

What is our data in this scenario? Without loss of generality, suppose we originally picked door 1. Now Monty opens a door (let’s say door 3, again without loss of generality) to reveal a gag prize. So what is the likelihood of the data under each hypothesis?

Hypothesis 1: The prize is behind door 1

In this case Monty chose between door 2 and door 3 at random, so he was equally likely to open either one, and the observation that he opened door 3 had probability 1/2.

Hypothesis 2: The prize is behind door 2

In this case Monty could not open door 1 (our pick) or door 2 (the prize), so he had to open door 3; the observation that he opened door 3 was guaranteed to happen.

Hypothesis 3: The prize is behind door 3

Monty could not have opened a door with the prize behind it, so the probability of seeing him open door 3 under this hypothesis is 0.

table['likelihood'] = 1/2, 1, 0
table
           prior  likelihood
Door 1  0.333333         0.5
Door 2  0.333333         1.0
Door 3  0.333333         0.0

And now let’s run our update function and see what the posterior probabilities are:

update(table)
           prior  likelihood  unnormalized  posterior
Door 1  0.333333         0.5      0.166667   0.333333
Door 2  0.333333         1.0      0.333333   0.666667
Door 3  0.333333         0.0      0.000000   0.000000

It turns out there is a 2/3 probability that the prize is behind door 2! We should switch doors.
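If you’d like an empirical check on this result, here is a minimal Monte Carlo sketch (the function name simulate_monty_hall and the trial count are our own choices, not part of the class code). It assumes Monty always opens a gag door, choosing at random when he has two options:

import random

def simulate_monty_hall(num_trials=100_000, switch=True):
    """Estimate the chance of winning the prize under a switch or stay strategy."""
    wins = 0
    for _ in range(num_trials):
        prize = random.randrange(3)  # door hiding the prize
        pick = random.randrange(3)   # contestant's initial pick
        # Monty opens a door that is neither the pick nor the prize,
        # choosing uniformly at random when both unpicked doors hide gag gifts
        opened = random.choice([d for d in range(3) if d != pick and d != prize])
        if switch:
            # move to the one door that is neither picked nor opened
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += (pick == prize)
    return wins / num_trials

print(simulate_monty_hall(switch=True))   # approximately 2/3
print(simulate_monty_hall(switch=False))  # approximately 1/3

Switching should win about two-thirds of the time, matching the posterior we computed with the Bayes table.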