The Prosecutor's Fallacy (2018) (ox.ac.uk)
108 points by YeGoblynQueenne on Oct 9, 2023 | hide | past | favorite | 65 comments


One example of this is people saying "you have a 1 in 10 million chance of getting attacked by a shark!" I thought about this swimming for an hour+ a day at a beach with a huge amount of sharks.

Like "1 in 10 million? What's the average number of hours per year per American spent full body swimming in shark infested waters? Not even a very high % of people who live in this area spend as much time doing that as I do. If I spend 500 hours per year doing that, my risk calculation has gotta be about 50,000 times worse than average."


But even more, some number of shark bites occur each year and for the people bitten, the probability of having been bitten is 1.

The main thing is that sure, "the probability of this highly incriminating series of events happening to you without you being guilty" may be very small. But the probability of "a highly incriminating and unlikely series of events happening somewhere, to someone" may be very high, nearly one.


>for the people bitten, the probability of having been bitten is 1

That's completely contrary to the usual use of probability. The probability of having been bitten for everyone else is then 0 and the concept becomes meaningless.



I find Bayes' theorem is a bit more intuitive when written out not for probabilities but for the odds ratios, i.e. the ratio of the probabilities of something being true and something being false. Quite often one of these two will be close to zero, meaning the odds ratio will be close to (the inverse of) the probability anyway. So, letting H be the hypothesis and E the evidence, Bayes' theorem looks like

  o(H|E) = o(H) P(E|H)/P(E|¬H)
Or in words, you update the odds ratio simply by multiplying it with the ratio of likelihoods of observing the evidence. Because odds ratios range through all positive numbers, you obviate the annoying normalization step you need with probabilities, which need to remain between 0 and 1.
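
In Python the whole update fits in one line (a sketch; `update_odds` is a made-up helper name, and the numbers are a 2-in-10,000 prevalence with a likelihood ratio of 98):

```python
def update_odds(prior_odds, p_e_given_h, p_e_given_not_h):
    # o(H|E) = o(H) * P(E|H) / P(E|not H)
    return prior_odds * (p_e_given_h / p_e_given_not_h)

prior = 2 / 9998                             # prevalence 2 in 10,000, as odds
posterior = update_odds(prior, 0.98, 0.01)   # likelihood ratio 98 -> ~0.0196
```

No normalization is needed until the very end, which is the point.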

Applied to the example in the article, a test with a false positive rate of 1% can only ever boost the odds ratio of your belief by a factor of 100 (and in the case of the article it's even only 98:1), so if a disease has a prevalence of about 2:10000, such a test can only bring it to about 2:100.


On the topic of finding Bayes' theorem intuitively, I can never remember it on its own, but starting from the joint probability P(A,B) for any arbitrary A and B always helps me:

    // A and B are arbitrary.
    P(A,B) = P(B,A)

    // These two make sense if I sound them out in words.
    P(A,B) = P(A) P(B|A)
    P(B,A) = P(B) P(A|B)

    // Combine to give Bayes' theorem.
    P(A) P(B|A) = P(B) P(A|B)
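
The derivation can be sanity-checked numerically on any joint distribution; here is a quick sketch with arbitrary numbers:

```python
# A small joint distribution over A, B in {0, 1} (numbers are arbitrary).
P = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}

P_A  = sum(p for (a, _), p in P.items() if a == 1)   # P(A)   = 0.7
P_B  = sum(p for (_, b), p in P.items() if b == 1)   # P(B)   = 0.6
P_AB = P[(1, 1)]                                     # P(A,B) = 0.4

P_B_given_A = P_AB / P_A
P_A_given_B = P_AB / P_B

# Both products recover the joint probability, giving Bayes' theorem.
lhs = P_A * P_B_given_A
rhs = P_B * P_A_given_B
```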


The advantage of the odds formulation is that you can compute posterior odds in your head, whereas computing the posterior probabilities via the standard form of the theorem (that you wrote down here) involves an unpleasant normalization factor.

E.g. for the example in https://en.wikipedia.org/wiki/Base_rate_fallacy#Low-incidenc...:

    prior odds of infection =   2 :  98
  × likelihood ratio        = 100 :   5
  = posterior odds          ≈ 200 : 500
and thus the probabilities (if you actually need them at this point) are ≈ (2/7, 5/7).
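
Carried out exactly (the 500 above rounds 98 × 5 = 490), the odds stay unnormalized until the end:

```python
prior      = (2, 98)     # infected : healthy
likelihood = (100, 5)    # P(+|infected) : P(+|healthy), scaled to integers

posterior = (prior[0] * likelihood[0], prior[1] * likelihood[1])  # (200, 490)

# Normalize only if you actually need probabilities:
total = posterior[0] + posterior[1]
probs = (posterior[0] / total, posterior[1] / total)   # ~ (0.29, 0.71)
```

which is the ≈ (2/7, 5/7) quoted above.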

See also 3Blue1Brown’s exposition: https://youtu.be/lG4VkPoG3ko


Exactly! The mathematical aspect of Bayes' Theorem is a simple algebraic manipulation of conditional probabilities.

Another way to intuitively justify your middle two equations is to visualise a Venn diagram with overlapping circles representing A and B, such that areas correspond to probabilities. Then P(B|A) is the area of the overlap - i.e. P(A,B) - divided by the area of A - i.e. P(A).


For me, taking the logarithm makes it even more intuitive somehow - the test becomes something like a modifier in an RPG-style game. In your example, the test gives +2 to the possibility of having a disease, a more powerful test could be +3, etc. Adding all the different +/- modifiers works just as well, and all this math is easy to do in one’s head.

Edit: also, obligatory 3blue1brown video https://youtu.be/lG4VkPoG3ko


Jaynes was a big proponent of taking the logarithm as well (as you probably know), referring to it as “measuring in decibels”. Unfortunately, I don’t actually understand what this log p/(1-p) gadget (“logit”?) actually does, mathematically speaking: it looks tantalizingly similar to an entropy of something, but I don’t think it is? Relatedly, I don’t really see how this would work for more than two outcomes—or why it shouldn’t.


The logit function is just a mapping from non-log prob space to log odds space. It’s the odds formula wrapped in a natural logarithm. In one way, it’s not “doing” much of anything. It’s not the journey, it’s the destination.

Why hang out in log odds space? Well, it’s a bigger space. It’s [-inf, inf], which is bigger than the [0, inf] of odds and the [0, 1] of prob. Odds space means you can multiply without normalizing. Log space means you multiply by adding. And being a natural log means you’re good to go when you start doing calculus and the like.

You can also cover a huge range of probabilities in a small range of numbers. -21 to 21 covers 1 in a billion to 1 - (1 in a billion).
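
A minimal sketch of the logit / inverse-logit pair (the function names are the conventional ones, not from any particular library):

```python
import math

def logit(p):
    # probability in (0, 1) -> log-odds in (-inf, inf)
    return math.log(p / (1 - p))

def inv_logit(x):
    # log-odds -> probability (the logistic function)
    return 1 / (1 + math.exp(-x))

lo, hi = logit(1e-9), logit(1 - 1e-9)   # ~ -20.7 and +20.7

# Updating on a likelihood ratio of 98 is just addition in log-odds space:
posterior = inv_logit(logit(0.0002) + math.log(98))   # ~ 0.019
```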


I usually prefer to write this as

  [P(H|E) : P(¬H|E)] = [P(H) : P(¬H)] × [P(E|H) : P(E|¬H)],
which is admittedly not notation you can find in a high-school textbook, but you still understand what I mean.

One advantage is that it makes it abundantly clear that the restriction to two hypotheses (one of which is H, the other is then inevitably ¬H) is completely incidental and you absolutely can use the analogous expression for more (e.g. the three doors in the Monty Hall paradox). Another, funnier one is that it suggests that when talking about “odds” you’re using projective coordinates instead of the usual L1-normalized ones (and you are! your “o” is then the affine coordinate of course).


> [P(H|E) : P(¬H|E)] = [P(H) : P(¬H)] × [P(E|H) : P(E|¬H)]

Hmm, you can do even better by formatting it as:

  p( H|E)   p( H)   p(E| H)
  ─────── = ───── * ───────
  p(¬H|E)   p(¬H)   p(E|¬H)
which is vertically symmetric about H/¬H, or possibly:

  ⎡p( H|E)⎤ = ⎡p( H)⎤ [*] ⎡p(E| H)⎤
  ⎣p(¬H|E)⎦   ⎣p(¬H)⎦     ⎣p(E|¬H)⎦
(where `[*]` is elementwise multiplication over a vector).


What I wanted to say is, the :-separated blocks are not fractions, they are two-component vectors up to a multiple (aka coordinates on [a part of] the projective real line, for those who like that sort of thing), and the “two” here is completely incidental.

For example, suppose that, in the Monty Hall paradox, you chose door one and observed Monty open door three. Then the posterior odds for the car being behind each door are calculated as

    prior        =  1  : 1 : 1
  × likelihoods  = 1/2 : 1 : 0
  = posterior    = 1/2 : 1 : 0
  =                 1  : 2 : 0
(Exercise: prove this formulation of the Bayes theorem for >2 hypotheses.)
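
A sketch of that calculation in Python (the likelihoods encode Monty's behaviour: with the car behind your door 1 he opens door 2 or 3 at random):

```python
prior       = [1, 1, 1]        # car behind door 1, 2, 3
likelihoods = [1/2, 1, 0]      # P(Monty opens door 3 | car behind door i)

posterior = [p * l for p, l in zip(prior, likelihoods)]   # [1/2, 1, 0]
total = sum(posterior)
probs = [x / total for x in posterior]                    # [1/3, 2/3, 0]
```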


> the :-separated blocks are not fractions, they are two-component vectors up to a multiple

Projective coordinates are fractions, or more precisely the ℤ⁺²-projective vector space[1] (over the positive integers) is ℚ⁺, the positive rational numbers[0]. You generally want to use ℝ instead of ℤ for obvious reasons, but that's not really "not fractions" in any useful sense.

But writing it as vectors does have the benefit of making the generalization from ℝ² to ℝⁿ (n≥2) somewhere between obvious and trivial, whereas fractions are at-least-implicitly specific to the ℝ² case.

0: give or take some issues with 0 and ∞ that we don't care about because 0 and 1 are not probabilities

1: I'm sure someone can out-pedant me on the terminology here, but the point is it's two-dimensional, which (2) is the fewest dimensions this works non-degenerately for (leaving one degree of freedom, versus zero for the ℝ¹ case, and negative one (aka doesn't work at all) for ℝ⁰)


> the ℤ⁺²-projective vector space

Semi-erratum: that's "(ℤ⁺)²" (aka "ℤ⁺×ℤ⁺"), not "ℤ⁽⁺²⁾".


> Therefore, the Prosecutors Fallacy is a subtle error that requires careful thought to understand, which makes teaching it all the more challenging. Visual explanations may be helpful.

Indeed. I've read and re-read the article. I don't see a clear explanation of the fallacy that fits into one sentence.


It's not quite a single sentence, but IMO the best example of the prosecutor's fallacy is the lottery: the chance of winning the Powerball is 300 million to 1, so clearly anybody who wins _must_ have cheated, since 300 million to 1 is practically impossible.

Obviously that's not true, people legitimately win the powerball all the time despite the odds. The issue with this logic is judging the odds from the position of a _particular_ person winning vs. the odds of _anybody_ winning. Your individual odds are 1 in 300 million, but since the powerball gets many 100s of millions of entries clearly it's likely that eventually someone will win.

Effectively, the prosecutor's fallacy is presenting a very low probability as though it indicates something couldn't have happened. A probability alone is useless without the size of the population that probability is picking from; "low probability" events like winning the lottery happen all the time when your population is large enough.
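
The arithmetic, with an illustrative (not official) number of entries:

```python
p_win   = 1 / 300_000_000    # a single ticket's chance
tickets = 500_000_000        # illustrative entry count, not real sales data

expected_winners = tickets * p_win             # ~ 1.7
p_someone_wins = 1 - (1 - p_win) ** tickets    # ~ 0.81
```

So "someone wins" is likelier than not, even though "you win" is near-impossible.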


I often take it to the ridiculous point and use a scenario with only one possible positive case. Like:

There is only one person in the world who is you. Let's say I had a facial recognition system that had a false positive rate of only 1 in a million. If I tried to find you by having the system scan through pictures of literally everyone on earth, how many people would I incorrectly identify as you?
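
Spelled out (using the hypothetical false positive rate above):

```python
fpr        = 1 / 1_000_000
population = 8_000_000_000

expected_false_matches = population * fpr          # ~ 8,000 wrong "yous"
p_really_you = 1 / (1 + expected_false_matches)    # ~ 0.000125 per match
```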


Great explanation.

I occasionally get to remind coworkers that at 100 requests per second, one in a million is about 8 times per day [1]. Usually mentioned in the context of handling, understanding, or planning for corner cases.

[1] 100rps*60*60*24/1E6=8.64


That's a good explanation.

The other thing to add is that when someone combs a data file for "unlikely events" and finds one event they consider unlikely happened to a given person, you have to consider the probability of any event that looks unlikely occurring at all, not just the probability of that particular unlikely event.


"False positives are rare" does not imply that a positive result is probably true. True positives might be even rarer than false positives.

The problem is false positive rates are usually expressed in terms of "what proportion of negative cases will be reported as positive?" which can't answer the question "what proportion of positive results will be wrong?". Answering the latter also depends on how many true positives there are, which may vary, or even be completely unknown. The false negative rate also needs to be factored in.
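
The quantity that actually answers "what proportion of positive results are wrong?" is the positive predictive value, which needs all three numbers. A sketch (the example rates are made up):

```python
def ppv(prevalence, sensitivity, fpr):
    # fraction of positive results that are true positives
    tp = prevalence * sensitivity      # true positives per person tested
    fp = (1 - prevalence) * fpr        # false positives per person tested
    return tp / (tp + fp)

low  = ppv(prevalence=0.0002, sensitivity=0.99, fpr=0.01)   # ~ 0.02
high = ppv(prevalence=0.2,    sensitivity=0.99, fpr=0.01)   # ~ 0.96
```

Same test, wildly different reliability, purely from the base rate.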


Yup. It's very similar to the problem of explaining the importance of precision and recall in information retrieval.


The prosecutor's fallacy is the mistake of confusing the probability of a piece of evidence given guilt with the probability of guilt given a piece of evidence.

In other words, it’s asking: “If Bob is guilty of this robbery, what are the chances we'd find his fingerprints at the crime scene?” When you should be asking: “Given that we've found Bob’s fingerprint at the crime scene, what’s the likelihood that he is guilty?"


> “If Bob is guilty of this robbery, what are the chances we'd find his fingerprints at the crime scene?”

It's more like saying "the chance of these fingerprints being mistaken for Bob's is 1 in 100,000. Therefore there's a 99,999 in 100,000 chance Bob is guilty of this robbery."

Does Bob have a motive? Means? A history of crime? Is there evidence he was in the area? Could he have been in the house for another reason?

And the biggie: did he only come under suspicion after police trawled through a database of 100000 fingerprints and matched the prints to Bob? Because you should expect a false positive in a database of that size.


You're right, mine misses the base rate part.


I tried several times to make it fit in one sentence, but short of a Dickensian abuse of semi-colons it's not really possible to fully explain in a sentence.

My best effort: If you have a very large imbalance in your population vs. your target group (e.g. everybody in NYC, vs the one person who robbed a bank, or the whole population vs. people with a rare disease), then seemingly strong evidence is much weaker than it appears.

Longer example:

Let's say we've matched the DNA of a bank robber to someone using a test with a 1-in-a-million false positive rate. That, on its own, still leaves over a 95% chance (23/24) that they are innocent: with 24 million people in the NYC area on a given day, we would expect 24 people to match.

The prosecutor's fallacy would be saying that there is instead only a 1-in-a-million chance that this person is innocent.
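
The back-of-the-envelope version (counting the one guilty matcher alongside the expected false matches):

```python
population = 24_000_000
fpr = 1 / 1_000_000

expected_false_matches = population * fpr   # 24 innocent people match

# Among the ~25 matching people, only one is guilty:
p_innocent_given_match = expected_false_matches / (1 + expected_false_matches)
# ~ 0.96
```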


Imagine I have a simple electrical circuit on main power. This circuit has only one outlet on it, which is connected only to one floor lamp. Imagine further that the light is currently on.

If I flip the switch off, what is the probability that the light turns off?

If the light goes off, what is the probability that I flipped the switch off?

Those probabilities are not identical.

---

Imagine that I have been administered a standard FDA-approved HIV test.

If I am HIV-negative, what is the likelihood that the test comes back positive? (I.e., what is the false positive rate?)

If the test comes back positive, what is the likelihood that I am HIV-negative?

Those probabilities are not identical.

---

Imagine that your client's DNA is tested against DNA found at a murder scene, comparing the short tandem repeats at 13 locations. There is no other evidence linking your client to the crime scene or to the murder.

If your client was not at the scene, what is the likelihood of all 13 loci being matches?

If all 13 loci are matches, what is the likelihood your client was not at the scene?

Those probabilities are not identical.


The Prosecutor's Fallacy happens when someone misinterprets the low odds of a specific innocent circumstance as sufficient to show that the accused's innocence is equally unlikely, because they have implicitly assumed that the odds of guilt are higher.

Basic example: take a deck of cards and throw them in the air so the cards land in a random arrangement. If someone looks at the arrangement and says, "the odds against these cards landing in this specific arrangement are trillions to one, therefore someone must have arranged them like this deliberately," that's the Prosecutor's Fallacy.

For a tragic example from real life, see the case of Sally Clark[1] who was wrongly convicted for the murder of her two baby sons. At trial, the only other possible explanation given for their deaths was Sudden Infant Death Syndrome (SIDS).

A paediatrician testified that the odds of a single baby dying of SIDS were 1 in 8,500, so the probability of two specific babies dying in this way was the square of those odds (1 in 72 million). He concluded that the high improbability of this specific innocent circumstance meant that murder was a more likely explanation.

There were several other problems with this testimony (not to mention the evidence presented), but this demonstrates the Prosecutor's Fallacy: the paediatrician misinterpreted the odds of two specific babies dying from SIDS as the odds of ever seeing SIDS claim a pair of babies. The odds he quoted are in the millions, but the number of two-child families is in the millions too; this means that such events will probably happen eventually, thanks to the Law of Truly Large Numbers[2].
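
The scale involved can be sketched quickly (the family count is an illustrative assumption, and the independence assumption is the flawed one from the trial):

```python
p_sids = 1 / 8_500

p_double = p_sids ** 2        # 1 in 72,250,000 under (flawed) independence

families = 15_000_000         # illustrative number of two-child families
expected_double_sids = families * p_double   # ~ 0.2 at any one time
# Over many years and many countries this accumulates, so a double SIDS
# death somewhere is expected - and shared risk factors push the true
# probability well above the independent estimate.
```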

[1] https://en.wikipedia.org/wiki/Sally_Clark [2] https://en.wikipedia.org/wiki/Law_of_truly_large_numbers


It's worse than that, because the 1 in 72 million calculation assumed that the contributing factors for SIDS are uniform across all children, when even then we knew absolutely that they were not. The UK, with a population of 67 million (and that's the current population, not the population when she was on trial), has far fewer than 72 million households with two children. If the probability of SIDS were distributed uniformly over all children, the existence of a household in the UK where two kids died of SIDS at any moment in time would seem less likely than not.

However, we knew even then that if one baby from a particular pair of parents and a particular environment dies of SIDS, the odds of a second baby dying from SIDS with the same parents & environment increase dramatically. So the odds of two babies in the same household dying from SIDS aren't nearly 1 in 72 million.

The Prosecutor's Fallacy just makes the magnitude of the statistical miscalculation even more absurd.

Sadly, everyone, including her own defense team, thought that the correct "expert witness" for this kind of testimony would be a medical expert, rather than a statistical expert.


The clear explanation: to compute the odds, you need to combine the distribution of positives and negatives with the false positive (or false negative) rate; a probability derived purely from a false positive rate presumes a uniform distribution of positives and negatives.


It's pretty simple, but in trying to demonstrate why, the author misses the point:

When an individual with something to gain asserts some probabilities, disregard that assertion.

Forget about medicine or criminal trials. Think of a sales professional - their assertions are commonly disregarded (rightly) as fluff. But put a white coat or nice suit on and people get deferential. A prosecutor in the courtroom is little different than a car salesman.


That is absolutely not the prosecutor's fallacy.

What you're describing is closer to "argument from authority".


I have to say that this is one of the most frustrating things about the institution of criminal law. Knowledge of Bayes’ theorem, and statistics more broadly, already exists within our culture. Yet this fallacy continues to be applied in court, leading to horrific miscarriages of justice.

Why can’t we just solve this problem? Educate judges in basic statistics, appoint court statisticians for more complex cases, and allow judges to hold people in contempt of court for trying to make these kinds of statistically fallacious arguments.


>Notice that at 0.02% prevalence the two conditional probabilities differ by 97% but at 20% prevalence the difference is only 3%. Therefore, the Prosecutor’s Fallacy is not an issue when the prevalence (or prior likelihood of guilt) is high, because the conditional probabilities are similar.

Whoa, I think the point makes sense but I'm not sure about the data used to demonstrate it. The probabilities at 20% prevalence were 1% and 4%. That's not a 3% difference, that's a 3 percentage point difference, which amounts to a 4-fold difference in probability. That's not similar at all.


Another way to illustrate the base rate fallacy is to get rid of any randomness for a moment and imagine a perfectly regular and deterministic toy universe:

For the disease test, suppose every patient has a unique ID, starting at #0 and counting up consecutively.

Because it's a toy universe, we can say that the actually infected patients are exactly the ones with IDs #0, #10000, #20000 etc.

The test will come back positive if either the patient is infected (i.e. false-negative rate is 0, P(positive|infected) is 1.0) or if the ID is one of #1, #1001, #2001, #3001, etc.

This results in a base rate of 1/10000 and a false positive rate of (almost) 1/1000.

If you now look at all the IDs for which the test will be positive, those will be: #0, #1, #1001, #2001, ..., #9001, #10000, #10001, #11001, #12001, ..., #20000, #20001, #21001, etc etc.

It's easy to see that for each true positive (IDs that end with 0), there are 10 false positives (IDs that end with 1) - so the probability that some specific ID from that list is a true positive, P(infected|positive), is 1/11 ≈ 0.09, i.e. it's more likely the patient is not infected than that they are. (It's still far more likely that they are infected than it would be for a random pick from the general population: roughly 1/11 vs 1/10000.)

Now consider a second population with a higher base rate: Now the infected patients are #0, #50, #100, #150 etc, i.e. the base rate is 1/50.

If you look at the IDs with positive tests again, you get #0, #1, #50, #100, #150, ..., #1000, #1001, #1050, ..., #2000, #2001 etc etc.

Now there are 20 true positives for each false positive, i.e. P(infected|positive) is 20/21 ≈ 0.95.

So the test suddenly got much more "reliable", even though the false-positive rate didn't change, only through a change in base rate.
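
The toy universe is small enough to count directly; doing so pins down the exact ratios (1 true positive per 10 false positives in the first population, 20 per 1 in the second):

```python
N = 1_000_000    # enough IDs for the patterns to repeat exactly

false_pos = lambda i: i % 1_000 == 1    # test errs on IDs ending in ...001

def p_infected_given_positive(infected):
    positives = [i for i in range(N) if infected(i) or false_pos(i)]
    return sum(infected(i) for i in positives) / len(positives)

p1 = p_infected_given_positive(lambda i: i % 10_000 == 0)   # 1/11  ~ 0.09
p2 = p_infected_given_positive(lambda i: i % 50 == 0)       # 20/21 ~ 0.95
```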


I wish these types of fallacies, related to ignoring base rates and equating likelihoods with posteriors, were more widely appreciated.

However, a big problem in practice — and a reason why these fallacies exist aside from simple ignorance or mistake — is that the correct number, the posterior, requires knowing the prior base rates of something, which is often unknown. In some settings, the base rates are very well-characterized, but in others you really have no idea. In a lot of those cases, knowing the prior is qualitatively similar to knowing the posterior, which you're trying to figure out, so all you're left with with any certainty is the likelihood.

The fallacy exists, it's important, but sometimes I think there's a bit more to it than simple ignorance. Sometimes there's no information on prior rates, or you don't really know which prior distribution something comes from, there's a mixture for example.


I've never actually heard it called the "prosecutor's fallacy" before, but it should be obvious why. The sort of thing often being looked for is something like "is planning an insurrection against the government," "is a terrorist," "is a murderer." We don't know the true base rates for any of these things, but we do know the base rate is very low. Almost nobody is a terrorist or a murderer.

Also, it becomes easier to understand some types of frustrating or often wrong processes when we take into account not just the rates of different error types, but their relative costs. A whole lot of criminals never see justice or get off on technicalities because the social cost in terms of destroying trust in the legal system is much higher for putting innocent people in prison than it is for failing to catch or failing to punish the guilty.

Why do we seem to go so overboard with cancer screenings when the false positive rate is so high? Because the worst that can happen is mostly annoyance, wasted time, and wasted money. The worst that can happen with false negatives is you die.

Why do our dragnets for terrorists seem to be so much more sensitive than dragnets for murder, rape, and property crime? Because even though the false positive rate here is even higher, the damage done by a successful terrorist attack is far greater. Why are FAANG hiring processes so jacked up? Because, right or wrong, the cost of hiring a bad engineer is perceived to be far higher than the cost of failing to hire a good one, especially when you get so many applicants that you're guaranteed to staff to the level you need virtually no matter how high a rate you reject at.


> the damage done by a successful terrorist attack is far greater.

Terrorist attacks are incredibly rare; murder, rape and (especially) property crime are commonplace, and don't rate even a column-inch in newspapers. (Once upon a time, kids, there was a job called "court reporter".)

How many people have you known who were victims of terrorist attacks? Right - zero. I don't know anyone who knows anyone who was the victim of a terrorist attack. How many people do you know who haven't been the victim of a personal crime, like assault, robbery or rape? Again, the expected answer is roughly zero.

Most successful applications of political violence (I hate the term "terrorism") result in just a handful of deaths/injuries. By "successful" I don't mean they achieved their objectives; I just mean the attack wasn't foiled before it happened.


> I've never actually heard it called the "prosecutor's fallacy" before, but it should be obvious why

The article goes into detail about this. If you look at the cases it links, some of them are pretty egregiously bad misuse of statistics to put people away. Sometimes while gaslighting the suspect at the same time.


I feel it could come down to not using statistics to infer a given conclusion.

For instance base probability accounted, if there was a 1 in a trillion chance someone was at the right place the right time, just straight assuming it couldn't happen by chance is still wrong. By definition that chance was not 0.

At some point a practical decision could be made to cut prosecution cost, but it should be understood that nothing was proven.


I think the legal system understands quite well that it doesn't prove anything (in the mathematical sense), which is why it has different standards of proof.

A 1 in 1 trillion chance would be considered both "beyond reasonable doubt" (enough for criminal matters) and satisfy the much weaker "balance of probabilities" usually applied to civil matters.

Of course there are plenty of examples to point to where people were convicted despite very reasonable doubt.


And if that even happens a lot in the world (e.g., many nurses looking at many patients every day), then that chance is quite likely to realize somewhere.


TLDR: I highly recommend HN readers skip this poorly-written article and go read "Example 2: Drunk drivers" on Wikipedia's article about the Base Rate Fallacy instead [1]. You'll probably learn it more and get better context.

[1] https://en.wikipedia.org/wiki/Base_rate_fallacy

Ok, I'm going to put on my harsh editor's hat. I'll abbreviate the Prosecutor's Fallacy as PF. Here are some glaring deficiencies:

1. The article does not get to the point quickly (or at all!); it doesn't define PF up front, nor ever.

2. The article doesn't properly situate the concept; it does not mention the closely related terms "base rate fallacy", "base rate neglect", and "base rate bias", or the complementary "defense attorney's fallacy".

3. Poor logical flow. One example: Saying the PF "involves" conditional probability without having defined it first, much less at all -- is frustrating to the reader.

4. Poor reasoning. One example: The claim that the PF "is most often associated with miscarriages of justice." isn't plausible nor defended. It should be rephrased to say "is often associated".

5. A lack of organizational cues. Most obviously, the article doesn't have any headings. It desperately needs them.

6. Various formatting problems. For one, the top quotation should be formatted as a blockquote.

Overall, the article would benefit from several more drafts. I'm pretty disappointed a professional organization would hit "publish" on this one. I'll try to find a way to share my feedback with the editor(s) and author.

In the meanwhile, I suggest HN people read this instead:

https://en.wikipedia.org/wiki/Base_rate_fallacy

Why? The concept is explained in a self-contained way, followed by self-contained examples. I really appreciate that approach.

Slight tangent: On a more happy note, I think a lot of software developers are better writers than they give themselves credit for. I don't think most software developers would make the same mistakes as this author. I'm not saying we're all great writers, but many of us do have a strong sense of logical flow and organization.


Isn't this just hypothesis testing? [1]

Zero hypothesis: N deaths happen by chance, given a known probability distribution of patient deaths.

Alternative: There was a different probability distribution in play (apparently facilitated by a specific nurse).

P-value: 1 in 342 million, really convincing.

So, is the fallacy that somebody calls the number "probability" rather than "p-value"? Or am I getting it wrong?

[1] https://en.wikipedia.org/wiki/Statistical_hypothesis_testing


No, the fallacy comes from choosing the sample before analysing the probability.

The probability a specific nurse could have such specific bad luck is very low, but there are of course many nurses, and each nurse treats many patients. What is the probability any nurse would have such bad luck, over a long period? How does that probability compare to the probability of murder, which is also estimable? Only either unlucky nurses or murderers end up in the dock - so the p-value really depends on the probability that the prosecutor faces an unlucky nurse versus a murderer.

A simpler comparison: a die with a thousand faces is quite unlikely to land on any particular face. When you roll it, it gives you a sample - is it more likely that the die is weighted to that face, or that the die is fair?
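
That sample-selection effect can be made quantitative: multiply the per-nurse probability across all the nurses who could have been flagged (the nurse count here is illustrative):

```python
p = 1 / 342_000_000    # quoted probability for one specific nurse
n_nurses = 300_000     # illustrative number of nurses under observation

# Chance that at least one nurse, somewhere, shows such a pattern:
p_somewhere = 1 - (1 - p) ** n_nurses    # ~ n_nurses * p ~ 0.0009
```

Even then the number stays small here, which is why the prior mix of unlucky nurses versus murderers - not the raw 1-in-342-million figure - is what the verdict should turn on.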


This fallacy reminds me of https://en.wikipedia.org/wiki/Testing_hypotheses_suggested_b... in particular. If you didn't have any reason to look for wrongdoing other than a statistical dataset, that same dataset is never sufficient to confirm the resulting suspicion.


I see. Physicists face this problem with the Large Hadron Collider, and many possible hypotheses explaining its results.

Yet, I think many, many nurses are needed to beat the 342 million.


No the issue is when prosecutors mix up Pr[evidence | innocence] with Pr[innocence | evidence]. It isn't correct to conclude from the former that someone is guilty.


I believe Pr[evidence | innocence] is p-value (or maybe one minus p-value, not sure). Statisticians use this routinely to test "innocence". It does not mean probability of innocence, but it means something.


Yeah, it means the probability of the evidence given the person is innocent. If they use it for anything else they're using it wrongly and they shouldn't.


This seems to me to be the equivalent of begging the question ('assuming the consequent', not 'requesting the question be asked').


I think there are many examples where no question-begging premises are involved. For example, in the somewhat canonical example of incorrectly inferring the presence of a disease from a test when the base rate is low [1], the premises are the positive test result, the false-positive and false-negative rates for the test, and whatever premises about statistical reasoning lead to the calculation of the incorrect probability.

[1] https://en.wikipedia.org/wiki/Base_rate_fallacy#Example_1:_D...


Reminds me of an old Discworld chestnut:

    Scientists have calculated that the chances of something so patently absurd actually existing are millions to one. But magicians have calculated that million-to-one chances crop up nine times out of ten.


>He mustn’t love me anymore, as it’s been 3 days and he hasn’t returned my call.

I understand the others, but what's the fallacy here? Assuming all return calls (evidence) must occur within X days?


It's the same as all the others. Consider the base rate for all possible reasons a person might not return a call:

- Has been injured or incapacitated

- Didn't see the call

- Didn't receive the call

- He actually did return the call, but you didn't notice or didn't receive it

- Has been swamped by other demands

- Simply forgot

- Lost his phone

- Has become generally depressed or despondent

- Fell out of love with the caller and is avoiding saying so

It doesn't seem likely that the rate for the last reason is high enough to outweigh all of the other possible reasons, and the evidence you have is equal evidence for all possible causes, not evidence specifically that he stopped loving you.


It's that penultimate one which I think is the most greatly underestimated possibility. I've had good friends 'ghost me', only for them to tell me after finally getting in contact that they were depressed. Frequently that can be triggered by health issues, so it's not necessarily true that you have to be incapacitated for an injury to stop you from responding to people normally.

I feel that my society lacks a sensitive, reliable way of communicating that one is depressed in a way that lets both parties 'off the hook' for being distant, yet reserving the possibility of continuing (and confirming the desire to continue) the relationship at a later point. Restarting old friendships has been really difficult in my experience, even when it would be mutually beneficial for both parties.


> not evidence specifically that he stopped loving you.

I think that last example is problematic. Did he ever start loving you? How would you know? How would he know? What is love, anyway? (~Charles Rex)

What evidence would support the belief that someone does or doesn't love you? Erich Fromm defined four types of love; but just because someone is your mother doesn't mean that their attitude is one of Motherly Love. They might not even like you.

Basically, "Love" is a field with no definitions, no evidence, and nothing even close to certainty. It's all just feels.


A nice uplifting list indeed. As to all save the last two, one might hope that eventually love would manifest as action. Emotions, logic, and time are complicated for people.


It’s likely he’s just busy


... and still, he doesn't find talking to her relaxing at the end of the day. Unless he is incapacitated or dead, definitely a huge red flag.


This fallacy always reminds me of the "Jonah" sequence in the "Master and Commander" movie; or, any of the witch tests used by Monty Python.


I always explain it this way:

"The odds of getting a Royal Flush in poker are 649,739 : 1. So, if you are playing someone and they present a Royal Flush, what are the odds they are cheating? What if the person you were playing was the Dalai Lama [or insert another person who has little reason to cheat]?"


Intellectually I understand this, but it's really really hard (for me) to consistently avoid this fallacy.


Conditional probabilities are what we actually operate with in the real world. They are all independent: in a chain of events, conditioning B on A makes the result independent of A. Like a sales funnel, each step can be multiplied because it is conditional on your having gotten to that point!



