“A statistically significant departure from an assumed-to-be-true null hypothesis is by itself no proof of anything. Likewise, failure to achieve statistical significance at the .05 or other stipulated level is not proof that nothing of importance has been discovered.”
Well, his opening example of his disbelief in the Higgs discovery does not breed confidence in the basis of his article.
Clearly he has no understanding of The Standard Model or the basis for Statistical Physics, which leads to the inevitable use of statistics in the discovery process.
If he is so lacking in understanding here, and so “knee-jerk” in his opposition to the valid use of statistics, why believe anything that follows in his text?
A very unsatisfactory article indeed.
I got into stats while I was in healthcare. There’s a huge difference between things that are statistically significant and things that are clinically significant; in the same way, there’s a huge difference between measurable differences between groups and differences that matter between groups.
As WB noted, if you have two groups and measure anything, the groups are nearly certain to differ in the measured characteristic. If we took the AFC players and the NFC players and used their weights, they’d average slightly differently. But it wouldn’t mean anything on the field.
It’s likely that the difference in weight between the two groups would not be significant at the 95% confidence level, but it might. That test (I’ve forgotten the details) has to do with measuring whether the difference between the group averages is large compared with how much the individual members vary within each group.
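For anyone who wants the details of that kind of test, here is a minimal sketch of one common version, Welch's t statistic, which divides the difference in group means by a standard error built from the spread within each group. The function name `welch_t` and the player weights are invented for illustration; they are not real AFC or NFC data.

```python
import math
import statistics

def welch_t(a, b):
    """Welch's t statistic: difference in means over its standard error."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    # Sample variances capture how much individuals differ within each group.
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    standard_error = math.sqrt(var_a / len(a) + var_b / len(b))
    return (mean_a - mean_b) / standard_error

# Hypothetical weights in pounds, made up for this example only.
afc = [245, 250, 238, 260, 255, 242, 248, 251]
nfc = [243, 249, 240, 257, 252, 241, 246, 250]

t = welch_t(afc, nfc)
print(f"mean difference = {statistics.mean(afc) - statistics.mean(nfc):.3f} lb")
print(f"t statistic     = {t:.3f}")
```

With these made-up numbers the averages differ by a bit over a pound, but the t statistic comes out well under the rough cutoff of about 2 usually associated with the 95% level, because the players within each group vary far more than the groups differ from each other.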
I dispute the assertion that VIOXX increased heart attack rates. The FDA report showed that what happened was that those in the control group stopped having CVE (cardiovascular events) while those in the VIOXX group continued having CVE at the same rate that those in both groups had for the first 18 months of the study. The difference between the control group and the VIOXX group was caused by the sudden and unexpected cessation of CVE in the control group. The logical reaction to this result would have been to insist that every person on the planet be supplied with whatever brand of sugar was in the placebo. Either that or acknowledge that an aberration had spoiled the test.
Hypothesis testing is one of the great ideas of man. But it is not a litmus test. For example, if one measures the heights of 10-year-old children in various cities, one is likely to find a pair of cities with a statistically significant difference in means. The difference would likely be inconsequential. This problem shows up in statistically significant differences in medical outcomes. VIOXX, for example, raised the rate of heart attacks from 0.75% to 1.50% in a sample of patients (a relative risk of 2), a shocking result when the difference in rates was applied to the entire population using the medication. Such numbers should have shown up in public health statistics, but didn’t. This was but one more example of statistically significant results still being noise of some sort. The rule of thumb is that relative risk ratios have to be at least x to warrant concern. I put x at 10, others put it as low as 3.
But statistical significance is still a valuable concept.
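To make the arithmetic concrete, here is a rough sketch using the 0.75% and 1.50% rates quoted above. The relative risk comes straight from those two figures. The group sizes are hypothetical (the comment does not give them), and the p-value uses a simple two-proportion z-test with a normal approximation, which is an assumption of this sketch and not necessarily the method used in the actual study.

```python
import math
from statistics import NormalDist

def relative_risk(rate_treated, rate_control):
    """Ratio of the event rate in the treated group to the control group."""
    return rate_treated / rate_control

def two_proportion_p_value(p1, p2, n_per_group):
    """Two-sided p-value for a difference in proportions, equal group sizes."""
    pooled = (p1 + p2) / 2  # pooled rate when both groups have the same size
    standard_error = math.sqrt(2 * pooled * (1 - pooled) / n_per_group)
    z = (p1 - p2) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))

rate_control, rate_treated = 0.0075, 0.0150  # rates quoted in the comment

rr = relative_risk(rate_treated, rate_control)
print(f"relative risk = {rr:.2f}")

# Hypothetical group sizes: the same rates can fail or pass the .05 test.
for n in (500, 5000):
    p = two_proportion_p_value(rate_treated, rate_control, n)
    print(f"n per group = {n}: p = {p:.4f}")
```

The relative risk is 2 whatever the sample size, which is below both thresholds mentioned above (3 and 10). Whether the same pair of rates clears the .05 level depends on how many patients were enrolled, which is why significance and importance have to be judged separately.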