
Posts Tagged ‘anthropic’

Unlike Robin Hanson, I am not surprised by who I am.   Sure, most things that exist are not alive, are not human and are not statisticians, but that doesn’t make it surprising that I am.  What I am is the only thing I could have been.

It’s true that Robin is smarter than most people, and most people don’t write a popular blog.  So should he be surprised that he is those things?  The only reason he noted those particular features is that those features already exist.  The question was generated by the result.  Everyone has things about them that are unusual.  Should we all be surprised?  For example, Brenda might be one of the few left-handed female plumbers in Texas.  Should she be surprised?  If everyone can point to something unique about themselves, then uniqueness itself shouldn’t surprise us.

Consider the t-shirt experiment:

20 t-shirts, each a unique color, are placed in a box.  You are blindfolded.  A shirt is randomly selected from the box and placed on you.  You then remove the blindfold.

Suppose you participate in the experiment, and after you remove the blindfold you observe that your t-shirt is blue.  Your reaction could be: “I’m surprised to be wearing a blue t-shirt.  Only 1 out of 20 shirts was blue.”  But of course, you could say the same thing no matter which t-shirt was selected.  There was a probability of 1 that a shirt that was unlike the other 19 would be selected.  We see the result and then start thinking about how unique that result is.
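
A tiny simulation makes this concrete (a minimal sketch; the color labels and the choice of “blue” as the pre-specified color are arbitrary assumptions for illustration): any particular color is a 1-in-20 event, but the event “the selected shirt has a color only 1 of the 20 shirts has” occurs on every run.

```python
import random

colors = [f"color_{i}" for i in range(20)]   # 20 shirts, each a unique (arbitrary) color
blue = colors[0]                             # pretend this one is "blue"

n_runs = 100_000
blue_count = 0
unique_count = 0
for _ in range(n_runs):
    selected = random.choice(colors)
    blue_count += (selected == blue)                  # a pre-specified color: rare
    unique_count += (colors.count(selected) == 1)     # "some 1-in-20 color": certain

print(blue_count / n_runs)     # ~0.05: any particular color is a 1-in-20 event
print(unique_count / n_runs)   # 1.0: but a "1-in-20" color is selected every time
```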

This kind of reasoning leads to bad inference, such as the self-indication assumption or the doomsday argument.  The wikipedia version of the doomsday argument is: “supposing the humans alive today are in a random place in the whole human history timeline, chances are we are about halfway through it.”  In other words, if there were a time-traveling stork that selected humans from all humans that will ever exist, and randomly placed them at various points in the human history timeline, then we are probably about halfway through human existence.  People then debate whether the doomsday conclusion is correct, but they do not challenge the assumption, even though we know it is wrong.  The doomsday argument can be rejected by simply noting that the assumption is bad (we are not in a random place in the human history timeline).

We shouldn’t be surprised that we exist, since we had to exist to notice that we exist and ask questions about our existence.  It would be more surprising if we noticed that we didn’t exist.

—-

Suppose 50% of people in a population have an asymptomatic form of cancer. None of them know if they have it. One of them is randomly selected and a diagnostic test is carried out (the result is not disclosed to them). If they don’t have cancer, they are woken up once. If they do have it, they are woken up 9 times (with an amnesia-inducing drug administered each time, blah blah blah). Each time they are woken up, they are asked their credence (subjective probability) for cancer.

Imagine we do this repeatedly, randomly selecting people from a population that has 50% cancer prevalence.

World A: Everyone uses thirder logic

Someone without cancer will say: “I’m 90% sure I have cancer”

Someone with cancer will say: “I’m 90% sure I have cancer.” “I’m 90% sure I have cancer.” “I’m 90% sure I have cancer.” “I’m 90% sure I have cancer.” “I’m 90% sure I have cancer.” “I’m 90% sure I have cancer.” “I’m 90% sure I have cancer.” “I’m 90% sure I have cancer.” “I’m 90% sure I have cancer.”

Notice, everyone says they are 90% sure they have cancer, even though only 50% of them actually do.

Sure, the people who have cancer say it more often, but does that matter? At an awakening (you can pick one), people with cancer and people without are saying the same thing.

World B: Everyone uses halfer logic

Someone without cancer will say: “I’m 50% sure I have cancer”

Someone with cancer will say: “I’m 50% sure I have cancer.” “I’m 50% sure I have cancer.” “I’m 50% sure I have cancer.” “I’m 50% sure I have cancer.” “I’m 50% sure I have cancer.” “I’m 50% sure I have cancer.” “I’m 50% sure I have cancer.” “I’m 50% sure I have cancer.” “I’m 50% sure I have cancer.”

Here, half of the people have cancer, and all of them say they are 50% sure they have cancer.
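
A small simulation reproduces both worlds (a minimal sketch; counting one statement per awakening is the only bookkeeping assumption): only about half of the selected people have cancer, yet about 90% of awakenings belong to people who do, because they are awakened nine times as often. In World A every one of those statements is “90%”; in World B every one is “50%”.

```python
import random

n_people = 100_000
have_cancer = [random.random() < 0.5 for _ in range(n_people)]   # 50% prevalence

# One awakening if healthy, nine awakenings if cancer; record the speaker's true status.
awakenings = []
for c in have_cancer:
    awakenings.extend([c] * (9 if c else 1))

print(sum(have_cancer) / len(have_cancer))   # ~0.50: fraction of people with cancer
print(sum(awakenings) / len(awakenings))     # ~0.90: fraction of awakenings where the speaker has cancer
```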

My question: which world contains the more rational people?

—-

When it comes to probability, you should trust probability laws over your intuition.  Many people got the Monty Hall problem wrong because their intuition was bad.  You can get the solution to that problem using probability laws that you learned in Stats 101 — it’s not a hard problem.  Similarly, there has been a lot of debate about the Sleeping Beauty problem.  Again, though, that’s because people are starting with their intuition instead of letting probability laws lead them to understanding.

The Sleeping Beauty Problem

Sleeping Beauty volunteers to undergo the following experiment. On Sunday she is given a drug that sends her to sleep. A fair coin is then tossed just once in the course of the experiment to determine which experimental procedure is undertaken. If the coin comes up heads, Beauty is awakened and interviewed on Monday, and then the experiment ends. If the coin comes up tails, she is awakened and interviewed on Monday, given a second dose of the sleeping drug, and awakened and interviewed again on Tuesday. The experiment then ends on Tuesday, without flipping the coin again. The sleeping drug induces a mild amnesia, so that she cannot remember any previous awakenings during the course of the experiment (if any). During the experiment, she has no access to anything that would give a clue as to the day of the week. However, she knows all the details of the experiment.

Each interview consists of one question, “What is your credence now for the proposition that our coin landed heads?”

Two popular solutions have been proposed: 1/3 and 1/2

The 1/3 solution

From wikipedia:

Suppose this experiment were repeated 1,000 times. We would expect to get 500 heads and 500 tails. So Beauty would be awoken 500 times after heads on Monday, 500 times after tails on Monday, and 500 times after tails on Tuesday. In other words, only in a third of the cases would heads precede her awakening. So the right answer for her to give is 1/3.

Yes, it’s true that only in a third of cases would heads precede her awakening.

Radford Neal (a statistician!) argues that 1/3 is the correct solution.

This [the 1/3] view can be reinforced by supposing that on each awakening Beauty is offered a bet in which she wins 2 dollars if the coin lands Tails and loses 3 dollars if it lands Heads. (We suppose that Beauty knows such a bet will always be offered.) Beauty would not accept this bet if she assigns probability 1/2 to Heads. If she assigns a probability of 1/3 to Heads, however, her expected gain is 2 × (2/3) − 3 × (1/3) = 1/3, so she will accept, and if the experiment is repeated many times, she will come out ahead.

Neal is correct (about the gambling problem).
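
A quick check of the betting claim (a sketch of the arithmetic only; the simulation below assumes Beauty accepts at every awakening): each run of the experiment is worth -$3 under heads and +$4 under tails, so always accepting averages about +$0.50 per run, which is why Neal's bettor comes out ahead.

```python
import random

n_experiments = 100_000
total = 0.0
for _ in range(n_experiments):
    if random.random() < 0.5:   # heads: one awakening, one losing bet
        total += -3.0
    else:                       # tails: two awakenings, two winning bets
        total += 2.0 + 2.0
print(total / n_experiments)    # ~ +0.5: accepting every offered bet is profitable
```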

These two arguments for the 1/3 solution appeal to intuition and make no obvious mathematical errors.   So why are they wrong?

Let’s first start with probability laws and show why the 1/2 solution is correct. Just like with the Monty Hall problem, once you understand the solution, the wrong answer will no longer appeal to your intuition.

The 1/2 solution

P(Beauty woken up at least once | heads) = P(Beauty woken up at least once | tails) = 1.  Because of the amnesia, all Beauty knows when she is woken up is that she has woken up at least once.  That event had the same probability of occurring under either coin outcome.  Thus, P(heads | Beauty woken up at least once) = 1/2.  You can use Bayes’ rule to see this if it’s unclear.

Here’s another way to look at it:

If it landed heads then Beauty is woken up on Monday with probability 1.

If it landed tails then Beauty is woken up on Monday and Tuesday.  From her perspective, these days are indistinguishable.  She doesn’t know if she was woken up the day before, and she doesn’t know if she’ll be woken up the next day.  Thus, we can view Monday and Tuesday as exchangeable here.

A probability tree can help with the intuition.  The tree (for an arbitrary wake-up day) branches first on the coin flip, heads or tails with probability 1/2 each; under heads the awakening is on Monday with probability 1, and under tails it is split evenly between Monday and Tuesday.

If Beauty was told the coin came up heads, then she’d know it was Monday.  If she was told the coin came up tails, then she’d think there is a 50% chance it’s Monday and a 50% chance it’s Tuesday.  Of course, when Beauty is woken up she is not told the result of the flip, but she can calculate the probability of each.

When she is woken up, she’s somewhere on the second set of branches.  We have the following joint probabilities: P(heads, Monday)=1/2; P(heads, not Monday)=0; P(tails, Monday)=1/4; P(tails, Tuesday)=1/4; P(tails, not Monday or Tuesday)=0.  Thus, P(heads)=1/2.
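
The same numbers can be written out directly (a minimal sketch using only the joint probabilities just listed): summing them gives P(heads)=1/2 at an awakening, and conditioning on tails splits Monday and Tuesday evenly, as described above.

```python
from fractions import Fraction

# Joint probabilities at an arbitrary wake-up day, as read off the probability tree
joint = {
    ("heads", "Monday"):  Fraction(1, 2),
    ("tails", "Monday"):  Fraction(1, 4),
    ("tails", "Tuesday"): Fraction(1, 4),
}

p_heads = sum(p for (coin, _), p in joint.items() if coin == "heads")
p_tails = 1 - p_heads
print(p_heads)                                  # 1/2
print(joint[("tails", "Monday")] / p_tails)     # 1/2 = P(Monday | tails)
print(joint[("tails", "Tuesday")] / p_tails)    # 1/2 = P(Tuesday | tails)
```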

Where the 1/3 arguments fail

The 1/3 argument says with heads there is 1 interview, with tails there are 2 interviews, and therefore the probability of heads is 1/3.  However, the argument would only hold if all three possible interviews (heads & Monday, tails & Monday, tails & Tuesday) were equally likely.  That’s not the case here (on a wake-up day, heads & Monday is more likely than tails & Monday, for example).

Neal’s argument fails because he changed the problem: “on each awakening Beauty is offered a bet in which she wins 2 dollars if the coin lands Tails and loses 3 dollars if it lands Heads.”  In this scenario, she would make the bet twice if tails came up and once if heads came up.  That has nothing to do with the probability of heads at a particular awakening.  The fact that she should take the bet doesn’t imply that heads is less likely; Beauty just knows that she’ll win the bet twice if tails landed.  We double count for tails.

Imagine I said “if you guess heads and you’re wrong nothing will happen, but if you guess tails and you’re wrong I’ll punch you in the stomach.”  In that case, you will probably guess heads.  That doesn’t mean your credence for heads is 1 — it just means I added a greater penalty to the other option.

Consider changing the problem to something more extreme.  Here, we start with heads having probability 0.99 and tails having probability 0.01.  If heads comes up we wake Beauty up once.  If tails, we wake her up 100 times.  Thirder logic would go like this: if we repeated the experiment 1000 times, we’d expect her to be woken up 990 times after heads on Monday, 10 times after tails on Monday (day 1), 10 times after tails on Tuesday (day 2), …, 10 times after tails on day 100.  In other words, in only about 50% of the cases would heads precede her awakening.  So the right answer for her to give is 1/2.

Of course, this would be absurd reasoning.  Beauty knows heads has a 99% chance initially.  But when she wakes up (which she was guaranteed to do regardless of whether heads or tails came up), she suddenly thinks they’re equally likely?  What if we made it even more extreme and woke her up even more times on tails?
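
The counting in this extreme variant can be checked directly (a sketch of the arithmetic from the 1000-repetition description above):

```python
n_reps = 1000

heads_awakenings = 990          # 990 runs land heads, one awakening each
tails_awakenings = 10 * 100     # 10 runs land tails, 100 awakenings each

thirder_ratio = heads_awakenings / (heads_awakenings + tails_awakenings)
print(thirder_ratio)   # ~0.497: counting awakenings drags the 0.99 prior down to ~1/2
```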

Implausible consequence of 1/2 solution?

Nick Bostrom presents the Extreme Sleeping Beauty problem:

This is like the original problem, except that here, if the coin falls tails, Beauty will be awakened on a million subsequent days. As before, she will be given an amnesia drug each time she is put to sleep that makes her forget any previous awakenings. When she awakes on Monday, what should be her credence in HEADS?

He argues:

The adherent of the 1/2 view will maintain that Beauty, upon awakening, should retain her credence of 1/2 in HEADS, but also that, upon being informed that it is Monday, she should become extremely confident in HEADS:
P+(HEADS) = 1,000,001/1,000,002

This consequence is itself quite implausible. It is, after all, rather gutsy to have credence 0.999999% in the proposition that an unobserved fair coin will fall heads.

It’s correct that, upon awakening on Monday (and not knowing it’s Monday), she should retain her credence of 1/2 in heads.

However, if she is informed it’s Monday, it’s unclear what she should conclude.  Why was she informed it was Monday?  Consider two alternatives.

Disclosure process 1: regardless of the result of the coin toss, on Monday she will be informed that it’s Monday (with probability 1)

Under disclosure process 1, her credence of heads on Monday is still 1/2.

Disclosure process 2: if heads she’ll be woken up and informed that it’s Monday.  If tails, she’ll be woken up on Monday and one million subsequent days, and only be told the specific day on one randomly selected day.

Under disclosure process 2, if she’s informed it’s Monday, her credence of heads is 1,000,001/1,000,002.  However, this is not implausible at all.  It’s correct.  This statement is misleading: “It is, after all, rather gutsy to have credence 0.999999% in the proposition that an unobserved fair coin will fall heads.”  Beauty isn’t predicting what will happen on the flip of a coin, she’s predicting what did happen after receiving strong evidence that it’s heads.
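
Bayes’ rule gives the 1,000,001/1,000,002 figure directly under disclosure process 2 (a minimal sketch; the only assumption is that, under tails, the disclosure day is chosen uniformly from the 1,000,001 awakenings, as described above):

```python
from fractions import Fraction

p_heads = Fraction(1, 2)
p_told_monday_given_heads = Fraction(1)               # heads: she is always told it's Monday
p_told_monday_given_tails = Fraction(1, 1_000_001)    # tails: one of 1,000,001 days is disclosed

posterior = (p_told_monday_given_heads * p_heads) / (
    p_told_monday_given_heads * p_heads + p_told_monday_given_tails * (1 - p_heads)
)
print(posterior)          # 1000001/1000002
print(float(posterior))   # ~0.999999
```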

ETA (5/9/2010 5:38AM)

If we want to replicate the situation 1000 times, we shouldn’t end up with 1500 observations.  The correct way to replicate the awakening decision is to use the probability tree I included above. You’d end up with expected cell counts of 500 (heads & Monday), 250 (tails & Monday), and 250 (tails & Tuesday), instead of 500, 500, 500.

Suppose at each awakening, we offer Beauty the following wager:  she’d lose $1.50 if heads but win $1 if tails.  She is asked for a decision on that wager at every awakening, but we only accept her last decision. Thus, if tails we’ll accept her Tuesday decision (but won’t tell her it’s Tuesday). If her credence of heads is 1/3 at each awakening, then she should take the bet. If her credence of heads is 1/2 at each awakening, she shouldn’t take the bet.  If we repeat the experiment many times, she’d be expected to lose money if she accepts the bet every time.
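
A quick check of this wager (a sketch of the expected-value arithmetic; the 0.4 cutoff below is derived here, not stated in the post): with only the last decision counting, accepting is favorable only if her credence in heads is below 0.4, so a credence of 1/3 says accept, a credence of 1/2 says decline, and since the coin really is fair, always accepting loses about $0.25 per experiment.

```python
import random

def expected_value(credence_heads):
    # One wager paid per experiment: -$1.50 if heads, +$1.00 if tails
    return -1.5 * credence_heads + 1.0 * (1 - credence_heads)

print(expected_value(1/3))   # ~ +0.17: accept
print(expected_value(1/2))   # -0.25: decline

# Long-run average if she accepts every time (the coin is actually fair):
n = 100_000
total = sum(-1.5 if random.random() < 0.5 else 1.0 for _ in range(n))
print(total / n)             # ~ -0.25 per experiment
```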

The problem with the logic that leads to the 1/3 solution is that it counts tails twice, but the question was about her credence at a single awakening (interview).

ETA (5/10/2010 10:18PM ET)

Recall the thirder argument:

Suppose this experiment were repeated 1,000 times. We would expect to get 500 heads and 500 tails. So Beauty would be awoken 500 times after heads on Monday, 500 times after tails on Monday, and 500 times after tails on Tuesday. In other words, only in a third of the cases would heads precede her awakening. So the right answer for her to give is 1/3.

Another way to look at it:  the denominator is not a sum of mutually exclusive events.  Typically we use counts to estimate probabilities as follows:  the numerator is the number of times the event of interest occurred, and the denominator is the number of times that event could have occurred.

For example, suppose Y can take values 1, 2 or 3 and follows a multinomial distribution with probabilities p1, p2 and p3=1-p1-p2, respectively.   If we generate n values of Y, we could estimate p1 by taking the ratio of #{Y=1}/(#{Y=1}+#{Y=2}+#{Y=3}). As n goes to infinity, the ratio will converge to p1.   Notice the events in the denominator are mutually exclusive and exhaustive.  The denominator is determined by n.

The thirder solution to the Sleeping Beauty problem has as its denominator sums of events that are not mutually exclusive.  The denominator is not determined by n.  For example, if we repeat it 1000 times, and we get 400 heads, our denominator would be 400+600+600=1600 (even though it was not possible to get 1600 heads!).  If we instead got 550 heads, our denominator would be 550+450+450=1450.  Our denominator is outcome dependent, where here the outcome is the occurrence of heads.  What does this ratio converge to as n goes to infinity?  I surely don’t know.  But I do know it’s not the posterior probability of heads.
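
The contrast can be made concrete with the counts above (a small sketch; the function name is just for illustration): the multinomial-style denominator is always n, while the thirder-style denominator moves with the number of heads.

```python
n = 1000

def thirder_denominator(n_heads):
    # each heads run contributes one awakening, each tails run contributes two
    return n_heads * 1 + (n - n_heads) * 2

print(thirder_denominator(400))   # 1600, as in the example above
print(thirder_denominator(550))   # 1450
# A multinomial-style estimate of P(heads) would instead always divide by n = 1000.
```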

explanation for the title of this post (link)

Long debate about this here

—-

Lack of new information

if I observe an event that had a probability of 1 of occurring, I have no new knowledge

That is:  If P(B)=1, then P(A|B)=P(A)

More generally:

if I observe an event that had the same probability of occurring for all hypotheses under consideration, then I have no new knowledge about those hypotheses

Suppose A can take values a1,…,aK. If P(B|A=a1)=…=P(B|A=aK), then P(A|B)=P(A).

The Sleeping Beauty Problem

Consider the Sleeping Beauty problem (quoting wikipedia):

Sleeping Beauty volunteers to undergo the following experiment. On Sunday she is given a drug that sends her to sleep. A fair coin is then tossed just once in the course of the experiment to determine which experimental procedure is undertaken. If the coin comes up heads, Beauty is awakened and interviewed on Monday, and then the experiment ends. If the coin comes up tails, she is awakened and interviewed on Monday, given a second dose of the sleeping drug, and awakened and interviewed again on Tuesday. The experiment then ends on Tuesday, without flipping the coin again. The sleeping drug induces a mild amnesia, so that she cannot remember any previous awakenings during the course of the experiment (if any). During the experiment, she has no access to anything that would give a clue as to the day of the week. However, she knows all the details of the experiment.

Each interview consists of one question, “What is your credence now for the proposition that our coin landed heads?”

When she is asked the interview question, she knows the details of the experiment, and that she has been woken up at least one time.

Let A represent the result of the coin flip (1=heads, 0=tails).

Let  B=1 if sleeping beauty has been woken up at least one time, and B=0 otherwise.

P(A=1)=1/2

P(B=1|A=1)=P(B=1|A=0)=1.  (regardless of whether heads or tails was selected, there was a probability of 1 that Beauty would be woken up, and would have no memory of whether she had been woken up in the past)

Thus, P(A=1|B=1)=P(A=1)=1/2.

So, the fact that she was woken up does not make it more likely that the flip came up tails.  The fact that she was woken up and asked a question provided her with no new information about heads or tails.

However…

Loss functions

Beauty doesn’t like being wrong.  If it landed heads she wants her guess to be p=1 (where p is her guess for the probability of heads).  The farther away from that optimal value she is the bigger the loss.

Let’s make that idea more concrete by adding the following twist to the problem:  suppose every time she is interviewed, she has to pay $|p-A|.

Beauty, being a rationalist, will want to minimize her expected loss.     If heads comes up, she will lose $|p-1| on Monday.  If tails comes up, she will lose $|p-0| on Monday and $|p-0| on Tuesday.  Thus, she will choose p to minimize (1/2)*|p-1|+(1/2)*(|p-0|+|p-0|).

The value of p that minimizes her expected loss is p=0.  That’s my surprise solution to the Sleeping Beauty problem – she should say she’s sure it’s tails.

If instead you use squared error loss, i.e., you minimize (1/2)*(p-1)^2 + (1/2)*2*(p-0)^2, then you get the popular p=1/3 solution.  But why squared error loss?
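
A grid search over p confirms both claims (a minimal sketch; the grid resolution is arbitrary): absolute error loss is minimized at p=0, squared error loss at p=1/3.

```python
# Expected total loss over the experiment, per the setup above:
# heads (prob 1/2): one interview with loss |p-1|; tails (prob 1/2): two interviews with loss |p-0| each.
def absolute_loss(p):
    return 0.5 * abs(p - 1) + 0.5 * 2 * abs(p - 0)

def squared_loss(p):
    return 0.5 * (p - 1) ** 2 + 0.5 * 2 * (p - 0) ** 2

grid = [i / 1000 for i in range(1001)]
print(min(grid, key=absolute_loss))   # 0.0
print(min(grid, key=squared_loss))    # 0.333 (i.e., ~1/3)
```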

Conclusion

If Beauty sticks to probability laws, and updates based on evidence, she should guess p=0.5.

But, if Beauty counts being wrong on Monday and Tuesday as twice as disturbing as being wrong just on Monday, then she should guess p=0.  (This isn’t really her credence for heads, just the value that she views as optimal, i.e., loss-minimizing.)

—-

In thinking about the self-indication assumption, let’s consider some experiments.

Experiment 1a

Suppose there are 1 million balls in an urn.  1 ball is orange and the rest are blue.

The algorithm goes like this:  flip a coin.  If heads, Large World wins and 999,999 balls will be randomly selected from the urn.  If tails, Small World wins and 1 ball will be drawn from the urn.

Once the ball(s) have been drawn, we are told whether the orange ball was drawn.

Prior probability of Large World: P(heads)=0.5

Posterior probability of Large World: P(heads|orange ball drawn)≈1 and P(heads|orange ball not drawn)≈0

So, knowledge about whether the orange ball was drawn tells us a great deal about what world we are in.
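
The posterior in Experiment 1a is just Bayes’ rule (a minimal sketch of that arithmetic; Experiment 2a works out to the same numbers with person #5,214 in place of the orange ball):

```python
from fractions import Fraction

p_heads = Fraction(1, 2)
p_orange_given_heads = Fraction(999_999, 1_000_000)   # 999,999 of the 1,000,000 balls are drawn
p_orange_given_tails = Fraction(1, 1_000_000)         # only 1 of the 1,000,000 balls is drawn

def posterior_heads(lik_heads, lik_tails):
    return (lik_heads * p_heads) / (lik_heads * p_heads + lik_tails * (1 - p_heads))

print(float(posterior_heads(p_orange_given_heads, p_orange_given_tails)))           # ~0.999999 (≈ 1)
print(float(posterior_heads(1 - p_orange_given_heads, 1 - p_orange_given_tails)))   # ~0.000001 (≈ 0)
```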

Experiment 1b

Suppose there are 1 million balls in an urn.  All of the balls are blue.

The algorithm goes like this:  flip a coin.  If heads, Large World wins and 999,999 balls will be randomly selected from the urn and then painted orange.  If tails, Small World wins and 1 ball will be drawn from the urn and then painted orange.

Once the ball(s) have been drawn and painted, we are told whether at least one ball was painted orange.

Prior probability:  P(heads)=0.5

Posterior probability:  P(heads|at least one blue ball painted orange)=P(heads)=0.5

Because regardless of the result of the coin flip at least one ball would be painted orange, knowing that at least one ball was painted orange tells us nothing about the result of the coin flip.  So in this experiment, the prior probability equals the posterior probability.

Experiment 2a

1,000,000 people are in a giant urn.  Each person is labeled with a number (number 1 through number 1,000,000).

A coin will be flipped.  If heads, Large World wins and 999,999 people will be randomly selected from the urn.  If tails, Small World wins and 1 person will be drawn from the urn.

Ahead of time, we label person #5,214 as special.  After the coin flip, and after the sample is selected, we are told whether special person #5,214 was selected.

Prior probability of Large World: P(heads)=0.5

Posterior probability of Large World: P(heads|person #5,214 selected)≈1 and P(heads|person #5,214 not selected)≈0

Experiment 2b

1,000,000 people are in a giant urn.  Each person is labeled with a number (number 1 through number 1,000,000).

A coin will be flipped.  If heads, Large World wins and 999,999 people will be randomly selected from the urn.  If tails, Small World wins and 1 person will be drawn from the urn.

After the coin flip, and after the sample is selected, we are told that person #X was selected (where X is an integer between 1 and 1,000,000).

Prior probability of Large World: P(heads)=0.5

Posterior probability of Large World: P(heads|person #X selected)=P(heads)=0.5

Regardless of whether the coin landed heads or tails, we knew we would be told about some person being selected.  So, the fact that we were told that someone was selected tells us nothing about which world we are in.
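
Writing Experiments 2a and 2b side by side as Bayes updates (a sketch; the helper function is just for illustration) shows where the difference lives: in 2a the report about pre-specified person #5,214 has very different likelihoods under the two worlds, while in 2b the report “person #X was selected” was certain either way, so the posterior stays at the prior.

```python
from fractions import Fraction

p_heads = Fraction(1, 2)

def posterior_heads(lik_heads, lik_tails):
    return (lik_heads * p_heads) / (lik_heads * p_heads + lik_tails * (1 - p_heads))

# Experiment 2a: told that pre-specified person #5,214 was selected
print(float(posterior_heads(Fraction(999_999, 1_000_000), Fraction(1, 1_000_000))))  # ~1

# Experiment 2b: told "person #X was selected", where X is whoever happened to be drawn
print(float(posterior_heads(Fraction(1), Fraction(1))))                              # 0.5: no update
```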

Self-indication assumption (SIA)

Recall that the SIA is

Given the fact that you exist, you should (other things equal) favor hypotheses according to which many observers exist over hypotheses on which few observers exist.

“Given the fact that you exist…”  Why me?  Because I was already selected.  I am that ball that was painted orange.  I am person #X.  I only became the special ball and the special number after I was selected.

The mistake of the SIA is that the data were generated by experiments like 1b and 2b, but are treated as if they came from 1a and 2a.

—-

update:  an even more detailed argument here
