This is my take on a classic probability paradox, the case of Sleeping Beauty. Along the way I’ll explain what I think is wrong with a classic rule of inference, Bayes’ Rule, and what needs to be done to fix it, but I hope you enjoy the paradox even if you don’t have a background in probability theory.
Here’s the scenario: Beauty volunteers for a lab experiment in the experimental philosophy department at her local community college. She goes into the lab on Sunday and the following procedure is explained to her:
1. Today we will give you a sleeping potion and flip a fair coin: Heads or Tails each come up half the time.
2. On Monday, we will awaken you and ask what probability you assign to the coin coming up Heads; then we’ll tell you the actual result of the coin flip. After that, we will give you another sleeping potion, which will also have the effect of erasing your memory of being awoken, asked, and told.
3. If the coin did indeed come up Heads, we will repeat step 2 on Tuesday and again awaken you, ask for the probability that the coin came up Heads, and put you back to sleep. (Of course, your answer would be “100%” if you could remember that you were previously awoken, but since you’ll have had your memory erased you will have no reason to give a different answer than on your first awakening.) If the coin came up Tails, we’ll let you sleep through Tuesday.
4. Finally, on Wednesday we will give you the antidote to the sleeping potion and you’ll wake up, having regained any missing memories.
The question is: Upon being awoken partway through the experiment, what answer should Beauty give for the probability of the coin flip being Heads?
The paradox is that there are two standard answers, and both seem to have watertight arguments for them:
- The classical position is that upon awakening, Beauty should assign a probability of 1/2 to the coin flip being Heads. Before downing that first sleeping potion, she knows the coin has equal chances of coming up Heads or Tails, and upon awakening, she has learned no new information that would change her estimate.
- But there’s a competing answer: that when she wakes Beauty should assign a probability of 2/3 to the outcome of Heads. After all, if the experiment is repeated 100 times, then she’ll be awoken about 150 times (50 from the 50 Tails, and 100 from the 50 Heads), and two-thirds of those awakenings will occur under the circumstances that the coin flip was Heads.
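(If you like to see the counting argument play out, here is a minimal simulation sketch of the repeated experiment as described in this post; the function name and trial count are just illustrative choices, not part of the argument.)

```python
import random

def heads_fraction_of_awakenings(trials=100_000):
    """Fraction of all awakenings that occur in Heads runs, when Heads
    means two awakenings (Monday and Tuesday) and Tails means one."""
    heads_awakenings = 0
    total_awakenings = 0
    for _ in range(trials):
        heads = random.random() < 0.5   # fair coin flip
        awakenings = 2 if heads else 1  # awakenings in this run of the experiment
        total_awakenings += awakenings
        if heads:
            heads_awakenings += awakenings
    return heads_awakenings / total_awakenings

print(heads_fraction_of_awakenings())  # hovers around 2/3
```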
Most of the debate between the “halfers” and the “thirders” is along the lines of “Here’s why my answer is right!” “No, you must be wrong, because here’s another reason why my answer’s right!” Well, I’m going to try to explain why I think the halfer argument is wrong, and how, once you fix it, it agrees with the thirder answer.
First let’s review ordinary probabilistic inference: how to update your probability estimates when you receive new information. For example, if I wake up and have no idea what day of the week it is, I should assign a probability of 5/7 to the proposition that it is a weekday, and 2/7 to it being a weekend day. But if I turn over and find I have woken up to our early alarm, which only rings Sunday through Friday, then I can eliminate Saturday as a possibility, and the probability that it is a weekday goes up to 5/6.
In general, a theorem called Bayes’ Rule lets you update probability given new evidence without having to break the situation down into every possibility: for each hypothesis you have (is it a weekday? the weekend?) you multiply its initial probability by the chances of you observing that evidence under that hypothesis, and then rescale all the results so the total sums to one. In our case, my early alarm ringing has a 100% probability of happening if it’s a weekday, but only a 50% probability of happening if it’s the weekend. So I update the probabilities as follows:
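Writing the update out for both hypotheses:

\[
\begin{aligned}
P(\text{weekday} \mid \text{alarm}) &\propto \tfrac{5}{7} \times 1 = \tfrac{5}{7},\\
P(\text{weekend} \mid \text{alarm}) &\propto \tfrac{2}{7} \times \tfrac{1}{2} = \tfrac{1}{7},
\end{aligned}
\]

and rescaling so the two numbers sum to one gives 5/6 for a weekday and 1/6 for the weekend, matching the direct count above.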
The classical “halfer” argument goes that since Beauty’s observations upon waking are the same regardless of the coin flip—in fact, she knows for certain what she will experience even before the coin is tossed—the probability she assigns to Heads must not have changed from its original 1/2.
I claim that when updating according to new evidence, it actually makes sense to multiply not just by the likelihoods of observing that evidence given your various hypotheses, but also by the number of people making that observation. Usually there’s the same number of observers under each hypothesis, so this isn’t a factor, but in our case, two of Beauty’s selves are around to experience each Heads coin flip but only one for each Tails flip:
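With that extra observer factor included, the update for Beauty looks something like this:

\[
\begin{aligned}
P(\text{Heads} \mid \text{awake}) &\propto \tfrac{1}{2} \times 1 \times 2 = 1,\\
P(\text{Tails} \mid \text{awake}) &\propto \tfrac{1}{2} \times 1 \times 1 = \tfrac{1}{2},
\end{aligned}
\]

where the middle factor is the (identical) likelihood of Beauty’s observations under each hypothesis and the last factor counts her awakened selves; rescaling gives 2/3 for Heads and 1/3 for Tails.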
This is the type of calculation that reduces correctly to the case of individual, equally likely possibilities: if Beauty views each waking as randomly chosen from all the days of her life (say there are N of them), then observing that she is awakening to the coin flip question happens twice as often under the Heads scenario as under Tails, so her probability of Heads should be twice that of Tails.
This picture suggests the following, conventional application of Bayes’ rule to get the same result, where again the “observation” is choosing a random day of Beauty’s life and finding that she is having the coin flip question posed to her:
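If Beauty’s life contains N days in total, that observation has likelihood 2/N under Heads and 1/N under Tails, so:

\[
\begin{aligned}
P(\text{Heads} \mid \text{asked today}) &\propto \tfrac{1}{2} \times \tfrac{2}{N} = \tfrac{1}{N},\\
P(\text{Tails} \mid \text{asked today}) &\propto \tfrac{1}{2} \times \tfrac{1}{N} = \tfrac{1}{2N},
\end{aligned}
\]

and rescaling again gives 2/3 for Heads and 1/3 for Tails, with the N dropping out.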
So there’s my resolution to the Sleeping Beauty paradox: in order to apply Bayes’ Rule consistently, it has to include a factor not just for the likelihood of observing the evidence, but also for the number of people making that observation. We saw in the class size paradox that the average class size experienced by professors is not the same as the one experienced by students: because larger classes have more students in them, students are more likely than professors to find themselves in a large class. In future blog posts I’ll share some more examples where this sort of probabilistic reasoning comes up in real life (in particular, in hearsay and advertising).
As far as I know, this is not a mainstream variant of Bayes’ Rule, and I would love to know whether there’s an axiomatic system that can deduce it in the same way that Bayes’ Rule can be proven from the usual axioms of probability theory. I’d especially like to know how to make sure, in cases where there’s also a way to use the conventional Bayes’ Rule to get the same answer, that I don’t accidentally double-count the observer factor.
Edit: I’ve found the website where I originally came across the Sleeping Beauty paradox: Some “Sleeping Beauty” postings. It looks like I swapped the roles of “Heads” and “Tails” compared to the usual version, but otherwise it’s the same. It was in fact the Andy and Bob variant that got me to think about likelihood needing an extra factor for the number of observers. Does anyone have any thoughts about that or any other paradox from the list?
Having just read the statement of the paradox and the two takes on it, I wonder: isn’t this just an instance of confused language usage? That is to say, both answers are correct, but they’re answers to two different questions. As to the original question, I think the ‘classical position’ would be correct, since the probability that a single coin toss comes up heads is always just 50%, and this is precisely what seems to be asked. On the other hand, if Sleeping Beauty’s task is to be right about the outcome of the toss as often as possible, she should always say ‘heads’, and then she’d be right about two-thirds of the time.
Now I’ll read the rest of your post, probably to discover that my comment is covered by your discussion.
Thanks for the comment! I’m curious what you think after reading the rest of the post.
I agree that the probability a fair coin toss comes up heads is 50% (by definition), but that doesn’t mean that the probability a fair coin toss came up heads is 50% if you have some information about it (say, that the result was heads, in which case the probability is 100%, but you could also learn partial information that raises the probability to some in-between value). So the question becomes: does the duplicating nature of the experiment somehow constitute a form of information Beauty gets, thereby changing her estimated probability?
(By the way, I see that you’re a Dutch speaker. Are you perhaps someone I got to know while I was living in the Netherlands?)
Your solution freely applies The Principle of Indifference to assign equal probabilities to events you cannot distinguish. That’s a reasonable Bayesian approach. What is your opinion on whether p(heads|awake) can be calculated without using The Principle of Indifference?
In my opinion, there are both “halfer” and “thirder” probability models that are consistent with the information in the problem. So p(heads|awake) cannot be “objectively” calculated.
It’s interesting that all attempts to demonstrate a practical difference between the “halfer” and the “thirder” positions by betting schemes fail if we assume both “halfers” and “thirders” know that they should calculate their betting strategies using p(heads) = 1/2 and the deterministic aspects of the experiment, without ever using p(heads|awake).
Thanks for commenting! As for the principle of indifference, I have the feeling that there’s very little we can say about “objective” probability without falling back on it. But I also think the principle is a reasonable formulation of what we mean when we say we don’t have any information about what state we’re in.
What really put me in the “thirder” camp was trying to work through the “Andy and Bob” paradox from this page. Do you have a good enough sense of the “halfer” position to know what that side’s resolution to the paradox would be?
I would love for you to elaborate on what you mean about betting schemes. To me, all that matters is when the bet takes place: if Beauty places separate bets on the coin flip at each waking, then she should use the 1/3 probability; if the bet takes place once per experiment, either before or after the waking(s), then she should use the 1/2 probability. The most confusing thing to me would be if Beauty is asked upon waking whether her future self (post-experiment) should take a bet that’s good from the 1/3-perspective but bad from the 1/2-perspective. On the one hand, her post-experiment self should have the 1/2-perspective and not take the bet, but her during-experiment self would have the 1/3-perspective; her very existence is enough information to make the future bet worthwhile. Is that the interestingness you were referring to?
Suppose whenever Sleeping Beauty is awakened in the experiment, she is asked: “What is the fair price for the bet ‘Bet owner gets $1 if the coin lands heads’?” The resolution of the bet is not as simple as the words used to describe the bet. In any betting situation, the event being bet upon is determined by how the bet is resolved, which includes any future consequences of buying the bet.
If Sleeping Beauty declares X to be the fair price for the bet, then she knows that if the coin lands tails she must be willing to pay price X for the bet both on Monday and Tuesday (because she is rational, therefore consistent). That consequence of the bet is not fully explained by phrasing the bet as if it is a one-time wager.
Sleeping Beauty can calculate the fair price of the bet(s) by computing the expected return from the bet as (1/2)(-X + 1) + (1/2)(-2X + 0) and setting that expression equal to 0. She obtains X = 1/3 as the fair price. This calculation uses P(heads) = 1/2 together with the deterministic course of the experiment that follows the coin toss, and it assumes she must state the same price X for a bet each time a bet with the same verbal description is offered. The calculation does not involve knowing P(heads | awakened).
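One way to sanity-check that price is a quick simulation (a minimal sketch, using this comment’s convention of one awakening on heads and two on tails; the function name and trial count are arbitrary):

```python
import random

def expected_return(price, trials=100_000):
    """Average per-experiment return from buying the bet
    'bet owner gets $1 if the coin lands heads' at every awakening,
    with one awakening on heads and two on tails."""
    total = 0.0
    for _ in range(trials):
        heads = random.random() < 0.5
        awakenings = 1 if heads else 2
        payoff_per_bet = 1.0 if heads else 0.0
        total += awakenings * (payoff_per_bet - price)
    return total / trials

print(expected_return(1/3))  # close to 0: $1/3 is the break-even price
print(expected_return(1/2))  # clearly negative: $1/2 would be an overpay
```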
Even if Sleeping Beauty were a “halfer”, she would make the same calculation. In the context of the problem, I think it is impossible to propose a wager to a rational Sleeping Beauty that detects whether she is a “halfer” or a “thirder” or agnostic about the value of p(heads | awakened). A rational Sleeping Beauty will establish the fair price of a bet according to known information before resorting to The Principle of Indifference.
I think I see what you mean now: calculating the expected value of such a gamble only uses that P(heads)=1/2, so “thirders” and “halfers” will never disagree.
How about this variant, though? Each time Sleeping Beauty is awakened mid-experiment, she is told that her future, post-experiment self will be offered the bet “Bet owner gets $1 if the coin landed heads” for the price of $0.40, and is asked whether her future self should take it. If I understand the halfer and thirder positions correctly, both groups would agree that future-Beauty would be acting rationally to take the bet, but disagree about whether she should: a halfer-Beauty would just side with future-Beauty, while a thirder-Beauty thinks her awakening is enough evidence of tails that future-Beauty shouldn’t take the bet. Does that sound right to you?
” but disagree about whether she should: a halfer-Beauty would just side with future-Beauty, while a thirder-Beauty thinks her awakening is enough evidence of tails that future-Beauty shouldn’t take the bet. Does that sound right to you?”
I don’t know what the prevailing opinion is, but I agree that if a future-Beauty only has the information about how the experiment is conducted then she should set $1/2 as the fair price for the bet “Bet owner gets $1 if the coin landed heads”.
Why would a thirder-Beauty think that future-Beauty should set a different fair price? I haven’t read any articles by “thirders” who make such a claim. In fact, the claim sounds like something an anti-thirder would accuse “thirders” of believing.
Suppose Sleeping Beauty is offered this bet each time she is awakened: “At the end of the experiment, the bet owner gets $1 if the coin landed heads and no matter how many times you agreed to pay price X for this bet during the experiment, you will receive at most $1 and only pay the price X once”. This is another bet whose fair price can be computed without computing p(heads|awakened). Rational “thirders” and “halfers” should agree on the fair price.
“Why would a thirder-Beauty think that future-Beauty should set a different fair price? I haven’t read any articles by ‘thirders’ who make such a claim. In fact, the claim sounds like something an anti-thirder would accuse ‘thirders’ of believing.”
I agree, it must be an incorrect form of thirdism for thirder-Beauty to say that future-Beauty should set a fair price of $1/3 instead of $1/2 for the bet “Bearer receives $1 if the coin flip was heads.” But I’m not sure where the error in reasoning is for thirder-Beauty to say that upon awakening, having her probability of heads updated to 1/3, she can continue to endorse her future self’s betting even money on heads.
[QUOTE]
But I’m not sure where the error in reasoning is for thirder-Beauty to say that upon awakening, having her probability of heads updated to 1/3, she can continue to endorse her future self’s betting even money on heads.
[/QUOTE]
That’s an interesting question. Surely some academic papers by “thirders” have answers to that dilemma, but I don’t know how they handle it.
How can we translate “endorse” into specific mathematical terms?
Can a thirder-beauty calculate the probability that future-Beauty will win a bet on heads? – or calculate future-Beauty’s expected loss on an even bet?
Sleeping Beauty’s fair price for the bet “Bet owner gets $1 if future Beauty wins her bet” would be $1/2, calculated independently of any estimate of P(heads | awakened), provided Sleeping Beauty was only required to make this bet once per experiment. If the price for the bet had to cover the possibility of making the bet twice if the coin landed tails, then a different fair price would be set, also not based on p(heads | awakened).
We can distinguish at least 4 species of “thirders”:
Thirder-1: p(heads| awakened) = 1/3 is a solution given by a plausible probability model, but there are other plausible probability models that give different answers.
Thirder-2: p(heads | awakened) = 1/3 is a solution based on using the information in the problem plus applying the Principle of Indifference, but there may be other possible solutions that can be obtained by applying the Principle of Indifference in a different manner.
Thirder-3: p(heads | awakened) = 1/3 is the solution based on using the information in the problem plus applying the Principle of Indifference. All correct applications of the Principle of Indifference give the same answer.
Thirder-4: p(heads| awakened) = 1/3 is the solution that can be computed by using only information given in the problem.
Thirders 3 and 4 are the ones who might answer a question with a calculation using p(heads| awakened) even when the question can be answered without using p(heads|awakened).
My thoughts on the Andy and Bob problem: The concept of establishing a single fair price for a bet assumes that both the purchaser of the bet and the person offering the bet have the same information. There is nothing paradoxical about people with different information setting different odds for a bet on an event.
We must resolve ambiguities in the statement of the problem. Are we supposed to be uncertain that the “Andy” referred to in the problem is the original Andy? What does the name “Andy” designate? Can it also designate a clone of Andy?
Assume “Andy” designates the original Andy. This leaves the question of whether Bob can distinguish the original Andy from his clone. Assume Bob cannot.
If we assume Andy knows he is the original Andy but Bob doesn’t know whether he is speaking with the original Andy or the clone then Andy and Bob don’t have the same information about the event “Andy was cloned”. (And Bob doesn’t know whether he is making a wager with the original Andy or not.)
To determine Bob’s fair price for a bet, we must resolve the ambiguity of whom he is betting with (Andy or the clone or some other person) and whether Bob must determine the fair price for a bet before examining the rooms.
Consider Bob’s fair price for a bet described as “Bet owner gets $1 if Andy was not cloned”. Consider Bob’s fair price for this bet before Bob examines any rooms. Assume a bet must always be offered to him in the first non-empty room he examines, by either Andy or his clone. If Bob pays price X for the bet(s), his expected return is:
(1/2)(-X + 1) + (1/2)( (1/2)(-X + 0) + (1/2)(-X + 0)).
Setting that expression equal to zero gives X = 1/2.
Consider Bob’s fair price for bets with the same verbal description when the bets are offered after he has examined the first room and found it occupied. P(Andy not cloned and first room examined is occupied) = (1/2)(1/2) = 1/4
P( first room examined is occupied) = (1/2)(1/2) + (1/2)(1) = 3/4
P(Andy not cloned | first room examined is occupied) = (1/4)/(3/4) = 1/3. So if Bob pays X for the bet, his expected return is -X + 1/3 and his fair price for the bet is X =1/3.
Consider Andy’s fair price for a bet described by the same phrase that was used above, “Bet owner gets $1 if Andy was not cloned”. Assume Andy knows he is the original Andy – or that we take the viewpoint of the reader of the problem and know that Bob is talking to the original Andy.
P(Andy was not cloned and Bob finds first room occupied and speaks to original Andy) = (1/2)(1/2) = 1/4
P(Bob finds first room occupied and speaks to original Andy) = (1/2)(1/2) + (1/2)(1)(1/2) = 2/4
P(Andy was not cloned | Bob finds the first room occupied and speaks to the original Andy) = (1/4)/(2/4) = 1/2.
Andy’s expected return on the bet is -X + 1/2 so his fair price is X =1/2.
I agree; there’s no paradox if Andy knows he’s the original but Bob does not: that’s extra information Andy has that Bob doesn’t, so we shouldn’t expect their fair prices for the bet to be the same.
But what if “Andy” doesn’t know whether he is the original or a clone? Then he and Bob have the same information, so it would be a paradox if they compute different fair prices for the same bet.
Now “Andy” calculates:
P(Andy not cloned and Bob finds me in the first room he looks in) = (1/2)(1/2) = 1/4.
P(Bob finds me in the first room he looks in) = (1/2)(1/2) + (1/2)(1/2) = 2/4.
(Whether or not the cloning happens, “Andy” should ascribe a probability of 1/2 to Bob finding him, this particular “Andy”, on his first try.)
Therefore P(Andy not cloned | Bob found me in the first room he looked in) = (1/4)/(1/2) = 1/2. So Andy’s fair price should still be 1/2—there’s the paradox!
“But what if ‘Andy’ doesn’t know whether he is the original or a clone? Then he and Bob have the same information…”
If the reader of the problem doesn’t know whether “Andy” refers to the original Andy or the clone Andy then statements such as “he and Bob have the same information” or “they compute…” are ambiguous.
“Now “Andy” calculates:
P(Andy not cloned and Bob finds me in the first room he looks in) = (1/2)(1/2) = 1/4.”
Apparently we are to assume that original-Andy and clone-Andy (if he exists) execute the same thought process. The being executing this thought process must consider the possibility that he is clone-Andy. If he is clone-Andy then p(Andy not cloned and Bob finds me in the first room he looks in) = 0, because “me” would not exist in that case.
So a calculation for P(Andy not cloned and Bob finds me in the first room) appears to need a given value for P(I am original-Andy). As far as I can see, that is not a well-defined probability since it is unclear whom “I” refers to.
“So a calculation for P(Andy not cloned and Bob finds me in the first room) appears to need a given value for P(I am original-Andy). As far as I can see, that is not a well-defined probability since it is unclear whom ‘I’ refers to.”
Does it help if I say “I” is whichever Andy is currently speaking with Bob? I’m imagining myself in the following situation:
I wake up after the procedure, I look around my room in the post-anesthesia care unit, and see a large numeral 2 painted on the wall. “Was I cloned,” I wonder, “and if so, am I the clone or the original? Hmm, maybe the second question is ill-defined, but the first question still makes sense. What is the probability that the other room, room 1, is occupied?”
Before I have a chance to try any calculations, my friend Bob opens the door.
“Oh, hello!” he says. “I wasn’t sure there’d be anyone in here. I knew you, or both of you, would be waking up around now, so I picked one of the two rooms at random (it happened to be room 2) and wandered in. I’m glad you’re here; if I’d picked an empty room I would be so sad I’d have just gone home. Anyway, I was wondering what you’d be willing to pay for this bet: ‘Pay the bearer of this bet $1 if the coin flip was heads.‘ Before opening the door, my fair price would have been $1/2. But now that my random door-opening revealed an occupied bed, I find it twice as likely that you were cloned and so the bet is only worth $1/3 to me.”
Should that be my fair price for the bet too? Or would Bob and I find it mutually agreeable if I paid him $0.40 for the bet?
My reasoning is the following, and I’d love to know if yours is different:
1. Since Bob and I have all the same information, my fair price for the bet should also be $1/3 (and so I should not pay him $0.40 for it). According to the fair-bet-price interpretation of probability, this means I should assign a probability of 1/3 to the statement “The coin flip was heads.”
2. The way I (Owen) try to make sense of this last probability is to say that when Andy (either Andy) awakens, he should take the mere fact of his existence as evidence in favor of his living in the world in which there are two people having that experience. In fact, he’s twice as likely to be in such a world, so the probability that he wasn’t cloned should go down to 1/3.
[QUOTE]
Does it help if I say “I” is whichever Andy is currently speaking with Bob? I’m imagining myself in the following situation:
[/QUOTE]
It only helps if you can state a probability space where probabilities of events involving a being called “I” are deduced from events in the probability space specified by the problem.
[QUOTE]
1. Since Bob and I have all the same information, my fair price for the bet should also be $1/3 (and so I should not pay him $0.40 for it). According to the fair-bet-price interpretation of probability, this means I should assign a probability of 1/3 to the statement “The coin flip was heads.”
[/QUOTE]
That is an ill-defined claim because “I” does not refer to a specific individual mentioned in the problem. The information in the problem tells us about two possible beings, original-Andy and clone-Andy. If you wish to relate “I” to the given information in the problem, you must specify how facts about original-Andy and clone-Andy have any bearing on “I”.
It is similar to the ambiguity in the Sleeping Beauty Problem, where the event “When Sleeping Beauty is awakened” is not an event described in the description of the experiment. In particular, the events B = (tails, Monday, awakened) and C = (tails, Tuesday, awakened) are not mutually exclusive events in the experiment and, in fact, p(B|C) = 1. So the problem says nothing definite about any probability distribution over a probability space where B and C are mutually exclusive events and only one of them at a time can be the situation “When Sleeping Beauty is awakened”.
We should agree on the probability space described by the problem. It will probably be tedious to write it out!
See if this is correct:
I’ll use a Boolean-algebra like notation where “*” denotes set intersection, “+” denotes set union and “~” denotes set complement.
Events:
H: Andy was not cloned
A1: original Andy is in room 1
(~A1: original Andy is in room 2)
C1: clone Andy is in room 1
C2: clone Andy is in room 2
E1: Room 1 is empty
E2: Room 2 is empty
B1: Bob visits room 1 first
(~B1: Bob visits room 2 first)
“I am in room 1 and Bob enters room 1 first” will mean (A1 + C1)*B1.
The probability space is partitioned into mutually exclusive sets ( “atoms” ) consisting of all possible combinations of intersections of the above 7 sets and their complements. Only 8 of the atoms have a non-zero probability. These are:
a1 = H*A1*~C1*~C2*~E1*E2*B1, p(a1) = 1/8
a2 = H*A1*~C1*~C2*~E1*E2*~B1, p(a2) = 1/8
a3 = H*~A1*~C1*~C2*E1*~E2*B1, p(a3) = 1/8
a4 = H*~A1*~C1*~C2*E1*~E2*~B1, p(a4) = 1/8
a6 = ~H*A1*~C1*C2*~E1*~E2*B1, p(a6) = 1/8
a7 = ~H*A1*~C1*C2*~E1*~E2*~B1, p(a7) = 1/8
a8 = ~H*~A1*C1*~C2*~E1*~E2*B1, p(a8) = 1/8
a9 = ~H*~A1*C1*~C2*~E1*~E2*~B1, p(a9) = 1/8
P( (A1 + C1)*B1) = P( A1*B1 + C1*B1) = P( a1+a6+a8) = 3/8
P( H*(A1+C1)*B1) = P(H*(A1*B1 + C1*B1)) = P(H*A1*B1 + H*C1*B1)
= P(a1) = 1/8
P(H | (A1 + C1)*B1) = (1/8)/(3/8) = 1/3
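As a brute-force check of that number, here is a quick simulation sketch of the setup as described above (the names are arbitrary):

```python
import random

def p_not_cloned_given_room1_and_bob(trials=200_000):
    """Estimate P(H | room 1 is occupied and Bob visits room 1 first),
    where H is the event 'Andy was not cloned'."""
    conditioning_count = 0
    h_count = 0
    for _ in range(trials):
        not_cloned = random.random() < 0.5        # the event H
        andy_room = random.choice([1, 2])         # original Andy's room
        occupied = {andy_room} if not_cloned else {1, 2}
        bob_first_room = random.choice([1, 2])
        if bob_first_room == 1 and 1 in occupied:  # the event (A1 + C1)*B1
            conditioning_count += 1
            if not_cloned:
                h_count += 1
    return h_count / conditioning_count

print(p_not_cloned_given_room1_and_bob())  # close to 1/3
```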
(You can use <blockquote> … </blockquote> for these fancy block quotes if you want.)
That makes sense; you are translating “I am in room 1 and Bob enters room 1” into “room 1 is occupied and Bob enters room 1,” and then you can use the reasoning you describe to deduce that the updated probability that Andy was cloned is 2/3.
For comparison, here’s a similar problem with only a touch of identity crisis:
You open your daily mail and find a letter saying that a fair coin was flipped: if it came up heads, the letter was sent to a single person chosen at random from the town directory, and if it came up tails, it was sent to two such people. The letter offers to sell you, for $0.40, a bet that pays $1 if the coin came up heads.
What can you deduce from the fact that you received one of these letters? For reasons similar to the Andy and Bob problem, this is not well defined: any recipient of this letter would be a “you” in the same situation. But any information that uniquely identifies you—your name, say—lets you rephrase the information you have as “Stephen Tashiro received one of these letters.” Then you can condition on that evidence, update your probability of heads to 1/3, and reject the offer.
(Sanity check: we can also see whether this offer is good by checking whether the scammer would make or lose money in the long run if all the letter recipients took the bait: since twice as many people come forward if the coin flip is tails as if it is heads, the scammer gets eighty cents for every tails and only has to pay sixty cents for every heads. Good news for them, so bad news for you!)
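In expected-value terms, and assuming the letter’s price is $0.40 (the figure the eighty- and sixty-cent numbers correspond to), the scammer clears

\[
\tfrac{1}{2}\,(2 \times \$0.40) + \tfrac{1}{2}\,(\$0.40 - \$1.00) = \$0.10
\]

per coin flip.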
In the Andy and Bob problem, the analogous identifying feature was not Andy’s name, since he would share it with any clones he has, but the room number he was in. That would let a room 1 awakener translate “I am awake” to “Room 1 is occupied,” and thus update his probability of cloning to 2/3, even before Bob comes in.
But what if he cannot see his room number, or any other feature of his room that he is sure is different from the other room’s? Is there not still a way for him to update the probability of cloning to 2/3? After all, if he learns that he is in room 1 or he learns that he is in room 2, he’ll be able to update his probability. But perhaps you reject the idea that “I am in room 1” is an event you can condition on if you are unsure whether someone else is having the same experience as you?
I do think this is the sort of problem that probability theory should be able to handle, in the sense that ordinary people often express uncertainty about what time it is, and are sometimes more uncertain than others in what should be a quantifiable way, even though being uncertain about the time is a case of being uncertain which of several difficult-to-distinguish people you are.
That’s useful to know. Is there some version of a … tag?
Taking the contents of the letter as true, it has the important information that it was sent to “randomly” selected people. The cultural understanding of “randomly selected” in that context is that each person in the town directory had the same probability of being selected in the case of selecting one person, and each pair of people in the town directory had the same probability of being selected if two were selected.
The solution to the Sleeping Beauty problem would be straightforward if the problem said “At a (single) randomly selected day in the experiment, if Sleeping Beauty is awakened on that day, she is asked for her estimate of the probability that the coin landed heads” (- and we assume Sleeping Beauty knows this happens on a single randomly selected day, even though she does not know what day it is when she is awakened.)
Instead of speaking of a randomly selected day, the Sleeping Beauty problem (Wikipedia version) says “Whenever Sleeping Beauty is awakened….”.
From the viewpoint of computing P(heads | person M received the letter) , does it matter whether the letter was sent to me or to some friend who tells me about receiving the letter?
The identification of “me” as a person who received the letter is relevant to who gets to make the wager, but I don’t see it as critical to computing the posterior probability that the coin landed heads.
m = Person M received the letter
Assume there are N persons in the city directory
P(m) = P(m|H) P(H) + P(m|~H) P(~H) = (1/N)(1/2) + (2/N)(1/2) = 3/(2N)
P(H*m) = P(m|H) P(H) = (1/N)(1/2) = 1/(2N)
P(H | m) = (1/(2N)) / (3/(2N)) = 1/3
There is a distinction between knowing:
1) “At least one person will receive the letter” (which, assuming the reliability of the post, we know before the letter is mailed)
versus
2) One particular person M did receive the letter.
P(H| At least one person will receive the letter) = 1/2
I assume Bob still visits and tells his tale.
Using the notation from a previous comment, interpret the event “I am awake in some room and Bob visits that room first” as (~E1*B1 + ~E2*~B1). Then
P(H | ~E1*B1 + ~E2*~B1) = P(H*(~E1*B1 + ~E2*~B1)) / P(~E1*B1 + ~E2*~B1)
= P(H*~E1*B1 + H*~E2*~B1) / P(~E1*B1 + ~E2*~B1)
= P(a1 + a4) / P(a1 + a4 + a6 + a7 + a8 + a9) = (2/8)/(6/8) = 1/3
I’ve glanced at some of the philosophical papers on the Sleeping Beauty problem and they mention concepts like “centered evidence”. I don’t yet appreciate the philosophical issues – I’m sure there’s something to them. In order to discuss those issues, we must consider ill-posed problems – otherwise we’re just doing standard probability theory.
It’s a little tricky to say, because under most circumstances, if some friend tells you that they received such a letter, you also know that all your other friends who might have told you didn’t. If you let f be the fraction of people who would tell you if they got a letter (and assume that this is independent of whether they were chosen to receive a letter), then the probability p = P(heads | you hear of only one letter) is p = 1/(3 - 2f), which varies smoothly from 1/3 for small f (the original problem) to 1 for f = 1 (you know for sure there was only one recipient). I don’t know if the fact that p = 1/2 when f = 1/2 is significant.
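Roughly where that formula comes from, assuming each recipient independently ends up reporting the letter to me with probability f and that N is large: under heads I hear of the single letter with probability f, while under tails I hear of exactly one of the two letters with probability 2f(1-f), so

\[
p = \frac{\tfrac{1}{2}\, f}{\tfrac{1}{2}\, f + \tfrac{1}{2} \cdot 2f(1-f)} = \frac{1}{1 + 2(1-f)} = \frac{1}{3-2f}.
\]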
Let’s assume now that he doesn’t visit. Would it be enough for the Andy in room 1 to know in advance that one of the otherwise-identical recovery rooms has a basket of oranges and the other has a bowl of bananas (assigned randomly by a separate fair coin toss), and find upon waking that his room contains oranges, for him to update his probability from 1/3 to 2/3? I don’t see a big difference between oranges and Bob. (Sorry, Bob!)
Assuming yes, would it be enough for the Andy in room 1 to know that at least one of the fruit, door handle design, or wallpaper color is different in the two rooms (similarly randomly assigned), but not know which? Assuming yes, would it be enough to know that the two rooms are distinct in some way? Even if he isn’t sure he’s observed the bit that’s distinct yet? There doesn’t seem to me to be a sharp line between Bob coming and Andy just waking up and looking around.
Good to know! I have read barely any of the literature, as you can probably tell. 🙂 It’s a bit overwhelming to get started when I’m not sure what the conversation-defining papers are. Any advice is welcome!
I agree with p = 1/(3 – 2f) if I include myself as one of the M = (f)(N) people and N is large.
We are pursuing the number-of-observers theme of the blog, but I don’t see where this would lead. We could ask the question of what would happen if each of the N people had probability f of telling us if they received the letter. Is that different than having M = (f)(N) people be completely reliable?
The semantics of who is an observer is tricky. There is only one “master observer”, namely the person who is computing the answer to a problem. In this case, the problem solver “observes” that there are other observers in the problem and observes the given properties of those observers.
Redefine the event B1 as:
B1 = bananas are put in room 1 and oranges are put in room 2
~B1 = oranges are put in room 1 and bananas are put in room 2
I interpret the question this way:
K = Some being is original-Andy or clone-Andy and woke up in a room containing bananas.
Find P(H|K).
K = (A1 + C1)*B1 + (~A1 + C2)*~B1
= A1*B1 + C1*B1 + ~A1*~B1 + C2*~B1
= a1 + a6 + a8 + a4 + a9 + a7
P(K) = 6/8
K*H = a1 + a4
P(K*H) = 2/8
P(H|K) = P(K*H)/P(K) = (2/8)/(6/8) = 1/3
If we define the event K as
K = Some being who is original-Andy or clone-Andy wakes up in room 1, which may or may not have bananas in it.
K = (A1 + C1)*(B1 + ~B1)
K = A1*B1 + A1*~B1 + C1*B1 + C1*~B1
= (a1 + a6) + (a2 + a7) + a8 + a9
p(K) = 6/8
K*H = a1 + a2
P(K*H) = 2/8
P(H|K) = (2/8)/(6/8) = 1/3
If we define the event K as
K = some being who is original-Andy or clone-Andy wakes up in one of the rooms, which may or may not have bananas in it
K = (A1 + C1)*(B1 + ~B1) + (~A1 + C2)*(B1 + ~B1)
K = A1*B1 + C1*B1 + A1*~B1 + C1*~B1 + ~A1*B1 + ~A1*~B1 + C2*B1 + C2*~B1
= (a1 + a6) + a8 + (a2 + a7) + a9 + (a3+a8) + (a4 + a9) + a6 + a7
P(K) = 1
K*H = a1 + a2 + a3 + a4
P(K*H) = 4/8
P(H|K) = (4/8)/1 = 1/2
If the calculations are correct, Bob seems to function only as a label that Andy sees that tells him he is in one particular room. That is surprising.
I haven’t read the philosophical literature thoroughly either.
One blog entry I like is the Markov chain argument for the “halfer” position, but it does introduce an assumption by relying on the Principle of Indifference.
http://rfcwalters.blogspot.com/2014/08/the-sleeping-beauty-problem-how.html
A philosophical paper I liked because it supported my view that offering Sleeping Beauty a bet on heads is not a “pure” bet. Case 2 on page 3 of https://philpapers.org/archive/YAMLSB/ However, the content of the paper is too philosophical for my taste.
What the philosophers need is some better mathematics. It wouldn’t settle the problem of interpreting verbal descriptions into specific mathematical statements, but it could clear up their attempts to mix in mathematical reasoning at random places in their arguments.
I see two areas that need some math.
The first is the question of defining precisely what it means for two probability problems to be “equivalent”. All sorts of arguments surrounding the Sleeping Beauty problem involve creating a supposedly “equivalent” problem to it. I suspect a proper mathematical definition would involve, in some way, defining isomorphisms and homomorphisms between probability spaces and problems. One suggestion is by member “andrewkirk”: https://www.physicsforums.com/threads/definition-of-equivalent-probability-problems.918552/
The second area is clarifying when the Principle Of Indifference may be applied and investigating its reliability. Usually symmetry arguments can be related in some way to group theory. However, when I read philosophical papers, people just declare things to be “indistinguishable”. The Sleeping Beauty problem is murky. Do we say the 3 situations (Heads, Monday, Awakened), (Tails, Monday, Awakened), (Tails, Tuesday, Awakened) are indistinguishable to SB? Or do we say Monday is indistinguishable from Tuesday? Or do we say a heads-experiment is indistinguishable from a tails-experiment?
It would be nice to define a mathematical format for probability problems that would exhibit the symmetries that allow the Principle Of Indifference to be applied – and try to prove that “equivalent” problems put in that format have the same numerical solution even if the Principle Of Indifference is applied to a different set of variables. Then the philosophers could debate how the verbal description of a problem should be translated into the format.
The question of whether the Principle of Indifference can be paradoxical is a big issue in applying probability theory. But I’ve never seen it discussed in a context where a precise definition was given that specified where the principle can be applied.
Thanks for the links! You’ve given me a lot to think about.
Yes, I can certainly imagine a lot of debate along the lines of “If we disregard the irrelevant details, these two problems are ‘the same’ and so have the same solution…” vs. “You can’t disregard those details; they’re exactly what make the two solutions different!” etc.
Andrewkirk’s suggestion certainly seems like a natural one (once you close it under transitivity) for isomorphisms of ordinary probability questions—I think I would rather work directly with isomorphisms of sigma-algebras and forget the underlying spaces, though that might give you the same thing—but it seems like questions about conditioning on “centered evidence” don’t fit so neatly into this framework. So maybe a prerequisite would be an attempt to formalize those questions in the first place.
Yes, that would be very helpful. For example, it’s hard to see how the three outcomes Monday-Heads, Monday-Tails, and Tuesday-Tails could be related by a group action, but surprisingly, a reader sent me a thirder argument along those lines! It’s a bit tangential to this discussion, but if you’re curious, drop me a line via the “contact” page and I’ll forward it to you.
In the previous post, I was asking about a math tag.
Let me check:
Yes: you can write “$ latex [math stuff here]$” without the space between “$” and “latex”.