Bayes’ Theorem (part 2)#
Review of Bayes’ Theorem#
Last class we learned about the concept of using Bayes’ theorem to perform a Bayesian update. We wrote out the diachronic version of Bayes’ theorem:

$$P(H \vert D) = \frac{P(H) \, P(D \vert H)}{P(D)}$$
In which we named the key components:
\(P(H)\), the probability of the hypothesis before we see the data, called the prior probability
\(P(H \vert D)\), the probability of the hypothesis after we see the data, called the posterior probability
\(P(D \vert H)\), the probability of the data under the hypothesis, called the likelihood
\(P(D)\), the total probability of the data under any hypothesis
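As a quick sanity check (not part of the original notes), we can plug the cookie-problem numbers straight into the formula above, computing the posterior for the hypothesis that the cookie came from the first bowl given that it is vanilla:

```python
# Hypothesis H: the cookie came from the first bowl; data D: the cookie is vanilla.
prior = 1/2                               # P(H): both bowls equally likely
likelihood = 30/40                        # P(D|H): 30 of the 40 cookies in bowl 1 are vanilla
prob_data = 1/2 * 30/40 + 1/2 * 20/40     # P(D): total probability of drawing vanilla
posterior = prior * likelihood / prob_data
print(posterior)  # 0.6
```

This matches the posterior for the first bowl in the Bayes table below.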
We then used a Bayes table to compute the posterior probabilities of all hypotheses in the cookie problem:
import pandas as pd

table = pd.DataFrame(index=['first bowl', 'second bowl'])
table['prior'] = 1/2, 1/2  # The prior is that both bowls are equally likely
table['likelihood'] = 30/40, 20/40  # 30 of 40 cookies in the first bowl are vanilla; 20 of 40 in the second
table['unnormalized'] = table['prior'] * table['likelihood']  # Compute the numerator of the Bayesian update
prob_data = table['unnormalized'].sum()  # Compute the "normalizing constant"
table['posterior'] = table['unnormalized'] / prob_data  # Calculate the posterior probabilities of both bowls
table
|  | prior | likelihood | unnormalized | posterior |
|---|---|---|---|---|
| first bowl | 0.5 | 0.75 | 0.375 | 0.6 |
| second bowl | 0.5 | 0.50 | 0.250 | 0.4 |
Example Problems#
Being able to apply a Bayesian update is incredibly important, so today we’ll work through two more example problems as a class.
Example #1: The Dice Problem#
First, we’ll try something just a little bit harder than the cookie problem: an example with more than two hypotheses.
Suppose we have a box with a 6-sided die, an 8-sided die, and a 12-sided die. We choose one of the dice at random, roll it, and report that the outcome is a 1. What is the probability that we chose the 6-sided die?
Let’s start by thinking intuitively about this. Which die should be most likely?
The 6-sided die is most likely because it has the highest chance of producing a 1 (1/6).
Now let’s set up our Bayes table. What are the hypotheses and what are their prior probabilities?
table = pd.DataFrame(index=['6-sided', '8-sided', '12-sided'])
table['prior'] = 1/3, 1/3, 1/3
table
|  | prior |
|---|---|
| 6-sided | 0.333333 |
| 8-sided | 0.333333 |
| 12-sided | 0.333333 |
Next we need to compute the likelihood of the data under each hypothesis.
In other words, what is the probability of rolling a one, given each die?
table['likelihood'] = 1/6, 1/8, 1/12
table
|  | prior | likelihood |
|---|---|---|
| 6-sided | 0.333333 | 0.166667 |
| 8-sided | 0.333333 | 0.125000 |
| 12-sided | 0.333333 | 0.083333 |
Now that we have the prior and likelihood for each hypothesis, what do we calculate next?
The “unnormalized posteriors”:
table['unnormalized'] = table['prior'] * table['likelihood']
table
|  | prior | likelihood | unnormalized |
|---|---|---|---|
| 6-sided | 0.333333 | 0.166667 | 0.055556 |
| 8-sided | 0.333333 | 0.125000 | 0.041667 |
| 12-sided | 0.333333 | 0.083333 | 0.027778 |
And now there is just one last step to calculate the posterior probabilities. Normalization!
prob_data = table['unnormalized'].sum()
table['posterior'] = table['unnormalized'] / prob_data
table
|  | prior | likelihood | unnormalized | posterior |
|---|---|---|---|---|
| 6-sided | 0.333333 | 0.166667 | 0.055556 | 0.444444 |
| 8-sided | 0.333333 | 0.125000 | 0.041667 | 0.333333 |
| 12-sided | 0.333333 | 0.083333 | 0.027778 | 0.222222 |
And there we have it! The probability that we chose the 6-sided die given that we rolled a one is 4/9.
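If you'd like to confirm the 4/9 exactly rather than from the rounded decimals, the same table arithmetic can be redone with Python's `fractions` module (a quick check, not part of the original notes):

```python
from fractions import Fraction

# Same priors and likelihoods as the Bayes table, kept as exact fractions
priors = [Fraction(1, 3)] * 3
likelihoods = [Fraction(1, 6), Fraction(1, 8), Fraction(1, 12)]

# Unnormalized posteriors, then divide by their sum (the normalizing constant)
unnorm = [p * l for p, l in zip(priors, likelihoods)]
posteriors = [u / sum(unnorm) for u in unnorm]
print(posteriors)  # [Fraction(4, 9), Fraction(1, 3), Fraction(2, 9)]
```

Note that the exact posteriors 4/9, 3/9, 2/9 are simply proportional to the likelihoods 1/6, 1/8, 1/12, because the priors were all equal.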
You may have noticed by now that every time we calculate the posterior from the prior and the likelihood we do the exact same steps to calculate the Bayesian update. We can simplify things going forward by introducing an update function to calculate those parts of the table:
def update(table):
    table['unnormalized'] = table['prior'] * table['likelihood']
    prob_data = table['unnormalized'].sum()
    table['posterior'] = table['unnormalized'] / prob_data
    return table
table = pd.DataFrame(index=['6-sided', '8-sided', '12-sided'])
table['prior'] = 1/3, 1/3, 1/3
table['likelihood'] = 1/6, 1/8, 1/12
update(table)
|  | prior | likelihood | unnormalized | posterior |
|---|---|---|---|---|
| 6-sided | 0.333333 | 0.166667 | 0.055556 | 0.444444 |
| 8-sided | 0.333333 | 0.125000 | 0.041667 | 0.333333 |
| 12-sided | 0.333333 | 0.083333 | 0.027778 | 0.222222 |
Example #2: The Monty Hall Problem#
Now let’s consider a famously unintuitive problem in probability, the Monty Hall Problem. Many of you may have heard of this one.
On his TV show “Let’s Make a Deal”, Monty Hall would present contestants with three doors. Behind one was a prize, and behind the other two were gag gifts such as goats. The goal is to pick the door with the prize. After you pick one of the three doors, Monty opens one of the other two doors to reveal a gag gift, and then asks if you’d like to switch doors.
What do you think, should we switch doors or stick with our original choice? Or does it make no difference?
Most people will say there’s now a 50/50 chance that either remaining door hides the prize, so it doesn’t matter.
But it turns out that’s wrong! You actually have a 2/3 chance of finding the prize if you switch doors.
Let’s see why using a Bayes table.
Each door starts with an equal prior probability of holding the prize:
table = pd.DataFrame(index=['Door 1', 'Door 2', 'Door 3'])
table['prior'] = 1/3, 1/3, 1/3
table
|  | prior |
|---|---|
| Door 1 | 0.333333 |
| Door 2 | 0.333333 |
| Door 3 | 0.333333 |
What is our data in this scenario? Without loss of generality, suppose we originally picked door 1. Now Monty opens a door (let’s say door 3, again without loss of generality) to reveal a gag prize. So what is the likelihood of the data under each hypothesis?
Hypothesis 1: The prize is behind door 1
In this case Monty chose between doors 2 and 3 at random, so he was equally likely to open either one, and the observation that he opened door 3 had a 50/50 chance of occurring.
Hypothesis 2: The prize is behind door 2
In this case Monty cannot open our door (door 1) or the prize door (door 2), so he must open door 3; the observation that he opened door 3 was guaranteed to happen.
Hypothesis 3: The prize is behind door 3
Monty could not have opened a door with the prize behind it, so the probability of seeing him open door 3 under this hypothesis is 0.
table['likelihood'] = 1/2, 1, 0
table
|  | prior | likelihood |
|---|---|---|
| Door 1 | 0.333333 | 0.5 |
| Door 2 | 0.333333 | 1.0 |
| Door 3 | 0.333333 | 0.0 |
And now let’s run our update function and see what the posterior probabilities are:
update(table)
|  | prior | likelihood | unnormalized | posterior |
|---|---|---|---|---|
| Door 1 | 0.333333 | 0.5 | 0.166667 | 0.333333 |
| Door 2 | 0.333333 | 1.0 | 0.333333 | 0.666667 |
| Door 3 | 0.333333 | 0.0 | 0.000000 | 0.000000 |
Turns out there is a 2/3 probability the prize is behind door 2! We should switch doors.
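If the Bayes table still feels unintuitive, we can sanity-check the result with a short Monte Carlo simulation (a sketch, not part of the original notes; the function name `simulate_monty` is made up for illustration):

```python
import random

def simulate_monty(switch, trials=100_000, seed=1):
    """Play many rounds where we always pick door 0 first (without loss of generality)."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        prize = rng.randrange(3)  # door hiding the prize
        pick = 0                  # our initial pick is door 1 (index 0)
        # Monty opens a door that is neither our pick nor the prize,
        # choosing at random when both remaining doors qualify
        opened = rng.choice([d for d in (1, 2) if d != prize])
        if switch:
            # Switch to the one door that is neither our pick nor the opened door
            pick = [d for d in range(3) if d not in (pick, opened)][0]
        wins += (pick == prize)
    return wins / trials

print(simulate_monty(switch=True))   # close to 2/3
print(simulate_monty(switch=False))  # close to 1/3
```

The simulated win rates land near 2/3 for switching and 1/3 for staying, matching the posterior probabilities in the Bayes table.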