Saturday, January 27, 2018

Was Bem dishonest?

This is an overly contentious title, but we now seem to have confirmation that Bem provided a false description of experiment 5 in "Feeling the Future". Dr. R. on the Replicability index blog has made Bem's data available for download. 
https://replicationindex.wordpress.com/2018/01/20/my-email-correspondence-with-daryl-j-bem-about-the-data-for-his-2011-article-feeling-the-future/

The data for experiment 5 consists of 100 subjects, but there is a clear change in conditions after the first 50 subjects: the number of trials each subject was exposed to is different. There is also a gap of about 4 weeks between the trials done on the first 50 subjects and those done on the last 50.

Bem states, in "Feeling the Future", that the preliminary results of Experiment 5 (and 6 and 7) were previously reported in 2003. That report is here - https://pdfs.semanticscholar.org/8033/f0406daadc956c18d847cb39afc1610b2e73.pdf. The condition change I mention above is consistent with the change in conditions between experiments 101 and 102 in that report.

The first experimental series (101) in that report consists of the following:
     34 women
     16 men
     negative/high arousal hit rate = 55.8%
     t-test(49) = 2.41
     p = 0.01 one-tailed

     "control" hit rate = 49.8%
     t-test of the difference(49) = 2.28
     p = 0.027 two-tailed.

If we use the data on experiment 5 which Dr. R. made available from "Feeling the Future," and perform the same analysis on the first 50 subjects, we get:
     34 women
     16 men
     negative/high arousal hit rate = 55.8%
     t-test(49) = 2.41
     p = 0.01 one-tailed

     control hit rate = 49.8%
     t-test of the difference(49) = 2.28
     p = 0.027 two-tailed
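
The p-values quoted in both lists can be checked from the t statistics and the 49 degrees of freedom alone, without the raw data. Here is a minimal sketch in Python using only the standard library. The function names t_pdf and t_upper_tail are my own, and the tail area is approximated by Simpson's rule rather than taken from a statistics package, so treat the output as approximate.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) by integrating the density over [0, |t|].

    Uses composite Simpson's rule; steps must be even.
    """
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3          # P(0 < T < |t|)
    tail = 0.5 - area             # P(T > |t|), by symmetry
    return tail if t >= 0 else 1.0 - tail

df = 49  # 50 subjects, so 49 degrees of freedom

# Hit rate against chance: t(49) = 2.41
p_one = t_upper_tail(2.41, df)
print("t = 2.41: one-tailed p =", round(p_one, 4),
      " two-tailed p =", round(2 * p_one, 4))

# Difference from the "control" trials: t(49) = 2.28
p_diff = t_upper_tail(2.28, df)
print("t = 2.28: two-tailed p =", round(2 * p_diff, 4))
```

With 49 degrees of freedom, t = 2.41 comes out at roughly 0.01 one-tailed (roughly 0.02 two-tailed), and t = 2.28 at roughly 0.027 two-tailed.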

It's pretty clear that both reports are talking about the same data. The description of this experiment from 2003 states:


"For the PH studies, the pictures were divided into six categories defined by crossing 3 levels of valence (negative, neutral, positive) with 2 levels of arousal (low, high)...

The first, Experiment 101, was designed to see if the PH procedure would yield a significant psi effect on any kind of target. Accordingly, the 6 kinds of picture pairs composed by crossing 3 levels of valence (negative, neutral, positive) with 2 levels of arousal (low, high) were equally represented across the 48 trials of the session, 8 of each kind...

The results were clear cut: Only the negative/high arousal pictures produced a significant psi effect...

After the fact, then, this experiment can be conceptualized as comprising 8 negative trials and 40 low-affect (“control”) trials."

But the description of this experiment, eight years later, in "Feeling the Future," states:


"This first retroactive habituation experiment comprised trials using either strongly arousing negative picture pairs or neutral control picture pairs;"

There is no mention that Bem started by looking for an effect on any kind of target, not just negative/high arousal pictures, or that further experiments were planned on the basis of those results. Nor is there any mention that the "neutral controls" were a post-hoc compilation of pictures with a variety of valence and arousal levels, some of which were not "neutral" or not "low arousal".

A key criticism of "Feeling the Future" is that the results likely do not represent a true effect if the reported experiments were cherry-picked from a larger pool of exploratory studies. Yet even in the recent email exchange with Dr. R., Bem states, "Nor did I discard failed experiments or make decisions on the basis of the results obtained." This is clearly false for at least one of the experiments.

In light of these findings, perhaps Dr. R. is right in asking for retraction of "Feeling the Future".
