Sunday 11 September 2011

Neuroscience Fails Stats 101?

According to a new paper, a full half of neuroscience papers that try to do a (very simple) statistical comparison are getting it wrong: Erroneous analyses of interactions in neuroscience: a problem of significance.

Here's the problem. Suppose you want to know whether a certain 'treatment' has an effect on a certain variable. The treatment could be a drug, an environmental change, a genetic variant, whatever. The target population could be animals, humans, brain cells, or anything else.

So you give the treatment to some targets and give a control treatment to others. You measure the outcome variable. You run a t-test to see whether the effect is large enough that it's unlikely to have happened by chance. You find that it is significant.

That's fine. Then you try a different treatment, and it doesn't cause a significant effect against the control. Does that mean the first treatment was more powerful than the second?

No. It just doesn't. The only way to find that out would be to compare the two treatments directly - and that would be very easy to do, because you have all the data to hand. If you just compare each treatment to the control separately, you might end up with this scenario:

Both treatments are very similar, but one (B) is slightly better, so it's significantly different from control while A isn't. But they're basically the same. It's probably just a fluke that B did slightly better than A. If you compared A and B directly, you'd find they were not significantly different.
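
To make that concrete, here's a toy simulation in Python - the numbers, sample sizes and scipy-based setup are my own invention, not from the paper. Two treatments have almost identical true effects, but one can clear the p < 0.05 bar against control while the other just misses, even though the direct comparison between them shows nothing:

```python
# Toy simulation (invented numbers): two near-identical treatments vs a control.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 20
control = rng.normal(0.0, 1.0, n)   # control group
treat_a = rng.normal(0.6, 1.0, n)   # treatment A: true effect 0.6
treat_b = rng.normal(0.7, 1.0, n)   # treatment B: true effect 0.7, barely bigger

_, p_a = stats.ttest_ind(treat_a, control)   # A vs control
_, p_b = stats.ttest_ind(treat_b, control)   # B vs control
_, p_ab = stats.ttest_ind(treat_b, treat_a)  # the comparison that actually matters

print(f"A vs control: p = {p_a:.3f}")   # may land just above 0.05
print(f"B vs control: p = {p_b:.3f}")   # may land just below 0.05
print(f"B vs A:       p = {p_ab:.3f}")  # typically nowhere near 0.05
```

Which of A or B happens to cross the threshold on any given run is largely noise; the only test that speaks to "B beats A" is the last one.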

An analogy: Passing a significance test is like winning a prize. You can only do it if you're much better than the average. But that doesn't mean you're much better than everyone who didn't win the prize, because some of them will have almost been good enough.

Usain Bolt is the fastest man in the world (when he's not false-starting himself out of races). Much faster than me. But he's not much faster than the second fastest man in the world.

Nieuwenhuis S, Forstmann BU, & Wagenmakers EJ (2011). Erroneous analyses of interactions in neuroscience: a problem of significance. Nature Neuroscience, 14(9), 1105-1107. PMID: 21878926

13 comments:

Stephen said...

This also applies to comparisons between the control and just one treatment in a pre- and post-test design. Basically, it's the difference between within-treatment differences and between-treatment differences.
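
A quick sketch of that point with invented numbers (the setup is mine, not Stephen's): testing each group's pre-to-post change separately is the error; the valid test compares the change scores between groups, which is equivalent to the group x time interaction.

```python
# Invented pre/post data: treatment and control groups both improve a bit.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 15
pre_t, post_t = rng.normal(10, 2, n), rng.normal(11.5, 2, n)  # treatment group
pre_c, post_c = rng.normal(10, 2, n), rng.normal(10.8, 2, n)  # control group

# The error: "treatment improved significantly, control didn't,
# therefore the treatment beats the control"
_, p_within_treat = stats.ttest_rel(post_t, pre_t)
_, p_within_ctrl = stats.ttest_rel(post_c, pre_c)

# The valid test: compare the improvements between groups directly
_, p_between = stats.ttest_ind(post_t - pre_t, post_c - pre_c)
print(p_within_treat, p_within_ctrl, p_between)
```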

Jake said...

Damn, this is a little disturbing. I have occasionally seen this error in non-neuroscience papers, but I've never gotten the impression that it was a particularly pervasive problem (try saying that last bit five times fast). I admittedly don't read many neuroscience papers, but I find it surprising that this error would be so much more common in those literatures. Why might this be the case?

jay uhdinger said...

Wow, that's surprising. Didn't think it would be so widespread!

drbrocktagon said...

Your analogy implies that you are non-significantly slower than the second fastest man in the world. Can you confirm?

Anonymous said...

From: http://www.foxnews.com/health/2011/09/09/study-clouds-picture-on-omega-3s-and-heart-health/

"Vedtofte said that men who ate more omega-3 fatty acid-rich foods also seemed to gain protection from heart disease, but that the statistical differences were small so the effect could be due to chance."

They mean effect size, right?

Pseudonymoniae said...

I've only really gotten into reading neuroscience papers consistently since entering grad school, but this doesn't surprise me in the slightest. I regularly comment to others in my lab about all the bad stats I run into, and not just in low-quality journals. Nature journals are probably the worst. Often the authors just report p-values without naming the test; other times they report "t-test" without mentioning ever having conducted an ANOVA or using any sort of correction for multiple comparisons. And for some bizarre reason, there doesn't appear to be any standard for reporting stats at the end of papers. One paper will report ANOVAs and post hoc tests for all but the most basic comparisons, while others literally don't report any stats at all. There was a Nature paper from a couple of weeks ago that I read which did exactly this. I'll see if I can find it.

Btw NS, that graph doesn't illustrate an interaction, because it only has one independent variable. An interaction occurs when the effect of one independent variable differs across the levels of another independent variable.

So for example, we could look at an interaction in the following study: divide our animals into those that carry a hemoglobin mutation and those that do not (independent variable #1), and also into those that are fed an iron supplement and those that are fed no supplement (second independent variable), and then compare them on a dependent measure (oxygenation of tissue x during exercise). When we run our 2x2 ANOVA, a significant interaction could indicate something like "only those animals which carried the mutation and were not given an iron supplement had insufficient oxygenation of tissue x". (Well, you would need post hoc tests to identify which group had the low level of oxygenation.)

Pseudonymoniae said...

...forgot to say: that graph can only show a main effect, i.e. it requires a one-way ANOVA.
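
For the curious, here's what that hypothetical 2x2 design could look like in practice - a hedged sketch in Python using statsmodels, with all data, numbers and variable names invented for illustration:

```python
# Invented data for the 2x2 design described above:
# mutation (carrier/wildtype) x diet (iron supplement/none),
# outcome = oxygenation of tissue x during exercise.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n = 10  # animals per cell
rows = []
for mutation in ("carrier", "wildtype"):
    for diet in ("iron", "none"):
        # only carriers *without* the supplement have low oxygenation
        true_mean = 80 if (mutation == "carrier" and diet == "none") else 95
        for y in rng.normal(true_mean, 5, n):
            rows.append({"mutation": mutation, "diet": diet, "oxygenation": y})
df = pd.DataFrame(rows)

# Two-way ANOVA: the C(mutation):C(diet) row tests the interaction
model = smf.ols("oxygenation ~ C(mutation) * C(diet)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))
```

A significant interaction term says the effect of the diet depends on genotype; post hoc tests would then pin down which cell is the odd one out.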

petrossa said...

Statistics is a bit like quantum states. Every number exists in any relation; it's just luck of the draw which calculation produces meaningful results.

Since a comparison to another calculation isn't meaningful, you are stuck with a statistic that either means something, a lot, or nothing, depending on the aspect of it you perceive.

It's more a virtual result than a real result.

fMRI is a great example of how you can make it do whatever you want.

mathii said...

The title of this article makes the point quite succinctly I think: "The Difference Between “Significant” and “Not Significant” is not Itself Statistically Significant"

http://www.stat.columbia.edu/~gelman/research/published/signif4.pdf

jamzo said...

pdf of full article posted at

http://www.ejwagenmakers.com/2011/NieuwenhuisEtAl2011.pdf

Anonymous said...

Statistics is Photoshop for data.

DC
