Another study with the same goal of comparing the results from different research teams found similar disparities, though the graphs aren't quite as pretty.
What do you mean, all different? Most are exactly the same. The first 4 are a bit low and the last 3 a bit high, but the last 2 and the first also have extremely wide intervals, so they are irrelevant anyway. Everything else agrees, most at >99% confidence, with only slight differences in the absolute values.
9 of the teams reaching a different conclusion is a pretty large group. Nearly a third of the teams, using what I assume are legitimate methods, disagree with the findings of the other 20 teams.
Sure, not all teams disagree, but a lot do. So the issue is whether or not the current research paradigm correctly answers "subjective" questions such as these.
If we only look at those with p < 0.05 (green) and with a 95% confidence interval, then there are 17 teams left. And they all(!) agree with more than 95% confidence.
I'm no expert on statistics, but I know enough that repeated experiments should not yield wildly different results unless: 1) the phenomenon under observation is extremely subtle so results are getting lost in noise, 2) the experiments were performed incorrectly, or 3) the results aren't wildly divergent after all.
The whole point of statistics is to extract subtle signals from noise. If you're getting wildly different results, the problem is that you're under-powered.
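To make the under-powered point concrete, here is a rough simulation sketch. Everything in it is an assumption for illustration: the true effect size (0.2), the unit-variance normal noise, the sample sizes, and the function names are all made up and have nothing to do with the actual study's data. It just repeats the same two-group experiment many times and measures how much the estimated effect moves around from run to run.

```python
import random
import statistics


def estimate_effect(n, rng, true_effect=0.2):
    """Simulate one two-group experiment with n subjects per group.

    Both groups have unit-variance normal noise (an assumption);
    the return value is the estimated effect (difference in means).
    """
    control = [rng.gauss(0.0, 1.0) for _ in range(n)]
    treated = [rng.gauss(true_effect, 1.0) for _ in range(n)]
    return statistics.mean(treated) - statistics.mean(control)


def spread_of_estimates(n, repeats=300, seed=0):
    """Repeat the experiment and return the standard deviation of the estimates."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    estimates = [estimate_effect(n, rng) for _ in range(repeats)]
    return statistics.stdev(estimates)


# Hypothetical sample sizes: a small study versus a large one.
spread_small = spread_of_estimates(n=20)
spread_large = spread_of_estimates(n=2000)

print(f"run-to-run spread with n=20:   {spread_small:.3f}")
print(f"run-to-run spread with n=2000: {spread_large:.3f}")
```

Under these assumptions the spread should scale roughly like sqrt(2/n), so the small study's estimates scatter about ten times more widely than the large study's, even though the underlying effect is identical in every run.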
Thanks for taking the time to post these links. Just letting you know your efforts have benefited at least one person who's gonna enjoy reading this.
Scientists who fiddle around like this — just about all of them do, Simonsohn told me — aren’t usually committing fraud, nor are they intending to. They’re just falling prey to natural human biases that lead them to tip the scales and set up studies to produce false-positive results.
Since publishing novel results can garner a scientist rewards such as tenure and jobs, there’s ample incentive to p-hack.
I mean really, claiming they aren't committing fraud, yet in the very next paragraph demonstrating their motivation... to commit fraud.
Never mind the numerous cases of published papers being bunk. And that something like 80% of published science isn't reproducible... which is part of what publishing is meant to enable.
Why have 4 of the studies seemingly not used error bars at all‽ Like I get that different analyses will arrive at different results, but they should always have error bars, right?