Monday, April 7, 2014

Shifting goalposts?

I commented on another blog about a study which had some similarities to the Sheldrake staring experiments.

http://thinkingdeeper.wordpress.com/2014/03/09/sheldrake-vs-ubc-the-same-experiment/

http://hct.ece.ubc.ca/publications/pdf/gauchou-rensink-cac2012.pdf

http://www.sheldrake.org/files/pdfs/papers/sensoryclues.pdf

While I was reading the UBC paper, I was aware that I felt less critical about it than I would have been if it had been a parapsychology paper.  Considering Dean Radin's criticisms from my previous blog post, and my criticisms of Radin's presentation of the blessed tea study, is it fair for me to be any less critical of the UBC paper (or alternatively, more critical of parapsychology papers)?  After all, like Sheldrake's and Radin's papers, it offered multiple ways to analyze the results, the findings were post hoc, and novel outcome measures were used.

Or were they?

An important design choice in the UBC paper is highlighted by contrasting it with Radin's paper.  Radin gave two groups of people tea which had been blessed or not, and measured change in mood and the subjects' belief that they were in the intervention group.  The authors of the UBC study asked people general knowledge questions explicitly and implicitly (through the use of a Ouija board), and measured accuracy and the subjects' belief that they were guessing at the answers.  In both cases, the significant finding was an interaction between the intervention and the belief condition.  Amongst those who believed they had received the blessed tea, those who actually received it improved more than those who did not.  Amongst those who believed they were guessing, answers given implicitly (via the Ouija board) were more accurate than answers given explicitly.
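To make the interaction idea concrete, here is a minimal sketch in Python of how an intervention-by-belief interaction might be tested.  All the numbers are invented for illustration; this is not the data or the analysis from either study.

# Toy illustration of an intervention-by-belief interaction.
# All numbers are invented; this is not the data from either study.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 50  # subjects per cell (assumed for illustration)

rows = []
for intervention in (0, 1):      # e.g. ordinary vs. "blessed" tea
    for belief in (0, 1):        # e.g. believes they got the intervention
        # The effect exists only among believers who got the intervention:
        effect = 0.5 if (intervention and belief) else 0.0
        for y in rng.normal(loc=effect, scale=1.0, size=n):
            rows.append({"intervention": intervention, "belief": belief, "y": y})

df = pd.DataFrame(rows)

# Two-way ANOVA; the intervention:belief term is the interaction.
model = smf.ols("y ~ C(intervention) * C(belief)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))

In a setup like this, neither factor alone tells the story; it is the interaction term that captures "the intervention mattered only for believers," which is the shape of the finding in both papers.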

Why are Radin's findings likely false, while the UBC study findings may be true?  The biggest difference is that Radin's findings are post hoc, while those in the UBC study were pre-planned.  Post hoc testing violates the assumptions underlying statistical significance testing, reducing the validity of the results: when many comparisons are possible and only the interesting ones are reported, the nominal false-positive rate no longer applies.
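A rough simulation shows why this matters.  The parameters below (group sizes, number of outcomes) are invented, not taken from either study; the point is only that reporting whichever of several comparisons looks best inflates the false-positive rate well above the nominal 5%.

# Simulate the effect of picking the best-looking comparison post hoc.
# Under the null (no real effect anywhere), testing k outcomes and
# reporting the smallest p-value yields far more "significant" results
# than alpha = 0.05 suggests.  Parameters are invented for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_sims, n_per_group, k_outcomes = 2000, 30, 10

false_positives = 0
for _ in range(n_sims):
    p_values = []
    for _ in range(k_outcomes):
        a = rng.normal(size=n_per_group)   # control group
        b = rng.normal(size=n_per_group)   # "intervention" group (no effect)
        p_values.append(stats.ttest_ind(a, b).pvalue)
    if min(p_values) < 0.05:               # report the best comparison
        false_positives += 1

print(f"False-positive rate: {false_positives / n_sims:.2f}")
# Roughly 1 - 0.95**10, or about 0.40, rather than 0.05.

With ten independent outcomes to choose from, "something significant" turns up in about 40% of purely null experiments, which is why a post hoc finding carries so much less evidential weight than a pre-planned one.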

How can we tell whether a finding is pre-planned vs. post hoc?  It is not sufficient for the researcher to state that a comparison was pre-planned.  And merely choosing to measure a number of different variables does not qualify as pre-planning.  So we can look at other factors, such as experimental manipulation, descriptions of the planning, and the analysis of the results.

The UBC group deliberately manipulated the belief condition by selecting questions which the subject identified as guesses.  These were identified as "guesses" independently of the accuracy of the answer and independently of their use in the Ouija board condition.  This kind of experimental manipulation can only be pre-planned.  There was no equivalent in Radin's study.  To be equivalent, Radin would also have needed to manipulate the belief condition (in this case, by manipulating what information was given to the subjects).  Unlike in the UBC study, "belief" was a dependent variable in Radin's study, so it wouldn't have been possible to form groups on the basis of "belief" prior to the drinking of the tea.

Another way to tell whether a comparison was pre-planned is to look at which comparisons were used in the sample size calculations (if reported).  The UBC study reports no sample size calculations.  Radin reports that the sample size was assumed to be adequate based on his intentional chocolate study.  In that study, mood level (not change in mood) on each day was compared between conditions, and "belief" was not a reported variable.  Had "belief" been a pre-planned condition in the tea study, it should have been accounted for, in some way, in the sample size assessments.
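For context, a pre-planned comparison is usually tied to a sample size (power) calculation along these lines.  This is a generic sketch with assumed numbers, not a reconstruction of either study's planning.

# Generic power calculation for a two-group comparison.
# The effect size and power target are assumptions for illustration,
# not values taken from either paper.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group = analysis.solve_power(
    effect_size=0.5,   # assumed standardized mean difference (Cohen's d)
    alpha=0.05,        # significance level
    power=0.8,         # desired probability of detecting a real effect
)
print(f"Required sample size per group: {n_per_group:.0f}")  # about 64

Note that a calculation like this targets a simple main effect; powering a study to detect an interaction of comparable size typically requires a substantially larger sample, which is another reason "belief" should have shown up in the planning if it were a pre-planned condition.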

Finally, a quick way to check whether a comparison was pre-planned is to look at whether all the subjects are included in the analysis, and whether the reasons for any exclusions are independent of the outcome.  In the UBC study, the analysis included 21 of the 27 subjects who participated.  Exclusions were based on a lack of success with the Ouija board (i.e., the planchette failing to move without conscious interference) and were unrelated to the outcome.  Radin included only 40% of the subjects (88/221) in his analysis, excluding more than half of the participants.  Thirty-two of the 221 were dropped for reasons unrelated to the outcome; the remaining 101 were dropped for reasons which were strongly related to the outcome.  It would be very unlikely for a researcher to pre-plan a comparison which would so dramatically violate the assumptions of significance testing.
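The danger of outcome-related exclusions can also be shown with a simulation.  Again, the parameters and the exclusion rule below are invented for illustration (a deliberately crude rule, not Radin's actual criteria): dropping subjects based on how they responded manufactures "significance" out of pure noise.

# Illustration of outcome-dependent exclusion under the null.
# Dropping the control subjects who scored highest and the treated
# subjects who scored lowest turns pure noise into a "significant"
# group difference.  Parameters and rule are invented for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 110  # per group, before exclusions (assumed)

treated = rng.normal(size=n)   # no real effect in either group
control = rng.normal(size=n)

# Outcome-related exclusion: keep the "responders" in the treated
# group and the "non-responders" in the control group.
treated_kept = np.sort(treated)[n // 2:]   # top half of treated
control_kept = np.sort(control)[:n // 2]   # bottom half of control

print("Before exclusions:", stats.ttest_ind(treated, control).pvalue)
print("After exclusions: ", stats.ttest_ind(treated_kept, control_kept).pvalue)

The first p-value behaves like any null result; the second is astronomically small, despite there being no effect at all.  Real exclusion criteria are rarely this blatant, but any rule correlated with the outcome pushes in the same direction.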

To be fair, there is a good chance that the UBC study results are also false.  The sample size was small and it was somewhat exploratory, even if it was well-designed in comparison to Radin's study.  It will be interesting to see whether the findings hold up under attempted replications.

Linda