Tuesday, August 22, 2017

The Birthday Problem

You are at a party chatting with the host when it is discovered that two of the guests have the same birthday.  "Wow, what are the odds of that?" asks your host.  People start throwing out answers:

"One in a million."
"One in 365."
"One in 365 squared."

"Even odds," is your response, whereupon everyone looks at you like you are nuts.  Intuitively, it seems a highly unlikely event.  After all, the likeliness that Cheryl and Louis would both be born on April 19 is remote (1/365 x 1/365)*.  Intuitively, we tend to substitute the probability of finding a pair with the probability of finding the pair which we found.

The same sort of response comes into play in parapsychology.  A correspondence is found between events (for example, a mediumship reading hits on a hummingbird tattoo, or a past life regression finds the painter of a hunchback) and the likelihood of both events occurring together is judged too low to be due to chance.  Some sort of anomalous information must be present to account for the match, right?

Let's go back to the question I asked in my previous blog post on evidence: “What would we expect in the absence of anomalous information?” It turns out we would expect to find remarkable correspondences, each of which has a very low probability of occurring by chance.

The Birthday Problem is a well-known puzzle which answers our host's question.  What is the probability that at least two people will share a birthday in a group of n people?  When there are 23 people (as your quick count at the party confirmed), the probability is about 50%. This answer seems counter-intuitive, as the probability of the match we found is very low.  It becomes less counter-intuitive when we realize that any two party members who happened to share any birthday would have been equally remarkable.  In a group of 23, there are 253 different pairs which can be formed when looking for a single match, which makes the task appear easier than we first realized (especially when we also drop the requirement that the matching birthdate is April 19).  The take-home message from the birthday problem is that the probability of finding a match is very different from the probability of the match you found.
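For anyone who wants to check the arithmetic, here is a minimal sketch in Python. It assumes birthdays are uniformly distributed over 365 days (a simplification) and the function name is my own choice for illustration.

```python
from math import comb


def prob_shared_birthday(n, days=365):
    """Probability that at least two of n people share a birthday.

    Assumes every day is equally likely and ignores leap years.
    """
    if n > days:
        return 1.0
    p_all_distinct = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


pairs = comb(23, 2)                  # number of distinct pairs among 23 guests
p_any_match = prob_shared_birthday(23)
p_this_match = (1 / 365) ** 2        # two named people both born on one named date

print(pairs)                         # 253
print(round(p_any_match, 3))         # 0.507
print(p_this_match)                  # roughly 7.5 in a million
```

The gap between the last two numbers is the whole point: the chance of some match among 23 guests is about one in two, while the chance of this particular match is a few in a million.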

When it comes to the issues discussed on parapsychology forums, including many of the anti-science issues like evolution denial, much use is made of the assumption that if events are extremely unlikely, then happenstance isn't a valid option. So we get Intelligent Design proponents who treat the (extremely low) probability that a mutation would produce a specific protein as evidence for an explanation involving God. Or we get Andrew Paquette publishing a paper demonstrating that it would be extremely unlikely for his dreams to correspond to events in real life (for the few dreams selected post hoc because they corresponded) unless there was anomalous information.

I'm not sure how to overcome this misunderstanding. Perhaps we can ask, “what other remarkable events could have come to our attention in the same way?” In the case of proteins and mutation, we are looking at a vast range of mutations which may undergo selection for a great variety of functions - not a single function selected post hoc. The emergence of a few dozen useful functions out of that environment seems less unexpected even in the absence of God. For a remarkable correspondence in one of Andrew Paquette’s “pre-cognitive” dreams, we are looking at tens of thousands of dreams in which there is potential for a correspondence to be found. That he was able to find a handful which were somewhat remarkable seems to be expected in the absence of anomalous information.
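The dream example can be put in rough numbers. This is only a sketch: the count of 10,000 dreams stands in for "tens of thousands", and the one-in-a-thousand chance that any single dream lines up with a later event is a purely hypothetical figure chosen for illustration, not something measured.

```python
# Hypothetical inputs, chosen only to illustrate the reasoning.
n_dreams = 10_000          # stand-in for "tens of thousands" of dreams
p_hit_per_dream = 1 / 1000 # ASSUMED chance that one dream matches a later event

# Expected number of chance matches across all dreams.
expected_hits = n_dreams * p_hit_per_dream

# Probability of at least one chance match, treating dreams as independent.
p_at_least_one = 1 - (1 - p_hit_per_dream) ** n_dreams

print(expected_hits)
print(p_at_least_one)
```

Under these made-up inputs, about ten matches are expected by chance alone and finding at least one is close to certain, even though each individual match is a one-in-a-thousand event.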

Remarkable stories which come to our attention after the fact can’t serve as evidence of the paranormal when this is also exactly what we'd expect to see in the absence of anomalous information. Yet from personal experience and surveys, the bulk of proponents will say that they believe because of their experience of a remarkable story (either their own or someone else’s). 

https://en.m.wikipedia.org/wiki/Birthday_problem

  * For the sake of forestalling another common probability error, the probability of an event is not one over the number of possible events unless the events are uniformly distributed, so the probability that someone was born on April 19 is not really 1/365.  Even though there are 365 days (excluding leap years) in a year, the distribution of births is not uniform.  The actual probability would have to be determined empirically, based on census data, and may be something like 1/378.


Thursday, August 10, 2017

Information that confirms an idea isn’t evidence.


Okay, this title seems counter-intuitive – how could information that supports an idea not be evidence for the idea? Sure, it may be “weak”, but it has to count at least a little, doesn’t it? Yet it turns out that, in most cases, data that supports an idea is more likely to be produced when the idea is false than when the idea is true.

It starts with the famous Ioannidis paper, “Why Most Published Research Findings Are False”. Table 4 outlines the positive-predictive values under a variety of different types of investigation (the positive-predictive value (PPV) is the proportion of all positive findings, true plus false, that are true-positives). Note that most of our discourse – people who claim to have had a weirdly accurate reading from a medium, conspiracy theories making the rounds of social media, experiences of horrific side-effects from vaccines/statins/<insert substance of choice here>, etc. – doesn’t even remotely reach the level of “exploratory study” with respect to rigor. But even with some element of ‘rigor’, a positive result from an exploratory study is still tens to hundreds of times more likely to be a false-positive than a true-positive. This means that the ability to “confirm” an idea (i.e. find information which supports the idea) says much, much more about how easy it is to find confirming information even when the idea is false, than it says about whether the idea is true.
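To see how a positive result can be overwhelmingly likely to be false, here is a small sketch of the basic PPV formula from that paper, PPV = (1 - β)R / ((1 - β)R + α), where R is the pre-study odds that an idea is true, 1 - β is the power and α is the false-positive rate. It leaves out the paper's bias term, and the input values below are illustrative choices of mine rather than a row copied from Table 4.

```python
def ppv(prior_odds, power, alpha=0.05):
    """Positive-predictive value without a bias term.

    prior_odds: pre-study odds R that a tested idea is true
    power:      probability of a positive result when the idea is true (1 - beta)
    alpha:      probability of a positive result when the idea is false
    """
    true_positives = power * prior_odds   # per unit of false ideas tested
    false_positives = alpha
    return true_positives / (true_positives + false_positives)


# Illustrative inputs: 1 true idea per 1000 false ones, 20% power.
value = ppv(prior_odds=1 / 1000, power=0.20)
false_per_true = (1 - value) / value

print(round(value, 4))        # 0.004
print(round(false_per_true))  # 250
```

With these assumed inputs, a positive result is about 250 times more likely to be a false-positive than a true-positive, before any bias is taken into account.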

We saw this in my prior post, where hummingbird statements, which are often used as proof that a particular medium’s reading depends on anomalous information, are also easily produced when the idea that mediums are tapping into anomalous information is false. This prevents us from being able to distinguish which of the great variety of contradictory and fantastical statements made about the afterlife may actually be true.

Unfortunately, confirmation bias tends to ensure that we spend our time looking for this confirming information, instead of looking for information that would help us distinguish between ideas that are true or false.

Another famous experiment, the Wason selection task, is a test of this bias. It asks you to turn over a card or two in order to test whether a rule about those cards is true. Fewer than 10% of the people taking the test (even intelligent university students) pick cards which adequately test the idea. Almost everyone picks the card that would confirm the idea, but few also pick the card that would tell you whether the idea is false. This leads us to think that we are building evidence for an idea, by finding more and more examples that confirm the idea, even when the idea is false.
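A short sketch makes the logic of the task concrete. It assumes the commonly cited version of the puzzle, which this post does not spell out: each card has a letter on one side and a number on the other, the visible faces are E, K, 4 and 7, and the rule is "if a card has a vowel on one side, it has an even number on the other". The function name is hypothetical.

```python
def can_falsify(visible_face):
    """Return True if turning this card over could reveal a rule violation.

    Assumed rule: if a card has a vowel on one side,
    it has an even number on the other.
    """
    if visible_face.isalpha():
        # A vowel might hide an odd number; a consonant cannot break the rule.
        return visible_face.upper() in "AEIOU"
    # An odd number might hide a vowel; an even number cannot break the rule.
    return int(visible_face) % 2 == 1


cards = ["E", "K", "4", "7"]
informative = [card for card in cards if can_falsify(card)]

print(informative)  # ['E', '7']
```

The 4 is the tempting confirming card, but whatever is on its back, the rule survives. Only the E and the 7 can show the rule to be false.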

When faced with evaluating whether something might be true, don’t look at the ‘evidence’ for the idea. Ask yourself, “what might I expect to see if this isn’t true?”  Perhaps what you’d expect to see if the idea isn’t true is pretty much the same as what you’d expect to see if the idea is true. If you spend even five minutes on Snopes looking at the plethora of false and unproven conspiracy theories out there, it becomes pretty obvious that no matter how sketchy the idea (Pizzagate anyone?), it’s pretty easy to build a case for it even when it’s false. The existence of ‘evidence’ may just tell you that ‘evidence’ is easy to produce, not that the conspiracy may be valid.