Power calculations for the Oregon Medicaid study

Kevin Drum:
Let’s do the math. In the Oregon study, 5.1 percent of the people in the control group had elevated GH [glycated hemoglobin, aka A1C, or colloquially, blood sugar] levels. Now let’s take a look at the treatment group. It started out with about 6,000 people who were offered Medicaid. Of that, 1,500 actually signed up. If you figure that 5.1 percent of them started out with elevated GH levels, that’s about 80 people. A 20 percent reduction would be 16 people.
So here’s the question: if the researchers ended up finding the result they hoped for (i.e., a reduction of 16 people with elevated GH levels), is there any chance that this result would be statistically significant? [...] The answer is almost certainly no. It’s just too small a number.
I plugged these numbers into Stata’s sample size calculation program (sampsi) to do a power calculation for the difference between two proportions. I found that the power, that is, the probability of rejecting the null hypothesis that Medicaid has no effect on GH levels when the true effect is a 20 percent reduction, is 0.35. With power that low, failing to reject the null is the likely outcome, and indeed the null cannot be rejected. We knew this from the paper, and, hence, all the hubbub. (Never mind that we also cannot reject a much larger effect. The authors cover this in their discussion.)
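As a rough cross-check on that figure, here is a minimal Python sketch of the same kind of calculation, using the normal approximation for a two-sided test of two proportions. The group sizes (6,000 controls, 1,500 Medicaid enrollees) are inferred from the numbers quoted above, the function name is made up for illustration, and no continuity correction is applied, whereas sampsi applies one by default as far as I know. So expect something in the neighborhood of 0.35, not an exact match.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p1, p2, n1, n2, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions.

    Normal approximation, no continuity correction (an assumption; this is
    not a reimplementation of Stata's sampsi).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Pooled standard error: applies when the null (no difference) is true.
    p_bar = (n1 * p1 + n2 * p2) / (n1 + n2)
    se_null = sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    # Unpooled standard error: applies when the assumed effect is real.
    se_alt = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # The far tail of the two-sided test is negligible here and is ignored.
    return z.cdf((abs(p1 - p2) - z_crit * se_null) / se_alt)


baseline = 0.051            # elevated GH rate in the control group
treated = baseline * 0.8    # hypothesized 20 percent reduction
n_control, n_treated = 6000, 1500   # inferred group sizes (assumption)

power = power_two_proportions(baseline, treated, n_control, n_treated)
print(f"approximate power: {power:.2f}")
```

The result comes out well below one half, which is the point: even if Medicaid really did cut the elevated GH rate by 20 percent, a study of this size would more likely than not miss it.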
Suppose we want a 0.95 probability of rejecting the null when the effect is real, i.e., power of 0.95. Assuming the same baseline 5.1% elevated GH rate and a 20% reduction under Medicaid, what sample size would we need to achieve that? Plugging and chugging, I get about 30,000 for the control group and 7,500 for the treatment (Medicaid) group. (I’ve fixed the Medicaid take-up rate at 25%, as found in the study.) This is a factor of five bigger than what the researchers had.
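The sample size question can be sketched the same way in Python: grow the treatment group, keep four controls per Medicaid enrollee (the 30,000 to 7,500 ratio above), and stop when the approximate power reaches the target. This reads the 0.95 target as statistical power at a two-sided 0.05 significance level, which is an assumption, as are the function names and the lack of a continuity correction. Without that correction the required sizes come out somewhat smaller than the 30,000 and 7,500 quoted above.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p1, p2, n1, n2, alpha=0.05):
    """Approximate power of a two-sided z-test (no continuity correction)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    p_bar = (n1 * p1 + n2 * p2) / (n1 + n2)
    se_null = sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))  # under the null
    se_alt = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)   # under the effect
    return z.cdf((abs(p1 - p2) - z_crit * se_null) / se_alt)


def smallest_treated_group(p1, p2, controls_per_treated, target_power):
    """Smallest treatment group size reaching the target power (linear search)."""
    n_treated = 2
    while power_two_proportions(
        p1, p2, controls_per_treated * n_treated, n_treated
    ) < target_power:
        n_treated += 1
    return n_treated


baseline = 0.051           # elevated GH rate in the control group
treated = baseline * 0.8   # hypothesized 20 percent reduction
ratio = 4                  # controls per enrollee, from 30,000 : 7,500

needed_treated = smallest_treated_group(baseline, treated, ratio, 0.95)
needed_control = ratio * needed_treated
power_at_quoted = power_two_proportions(baseline, treated, 30000, 7500)

print(f"treatment group needed: {needed_treated}")
print(f"control group needed:   {needed_control}")
print(f"power at 30,000 / 7,500: {power_at_quoted:.3f}")
```

Either way the conclusion holds: the study would have needed to be several times larger to reliably detect a 20 percent reduction.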
Now, caveats: I’m taking the baseline rate, 5.1%, from the study itself. But we know it is estimated with some imprecision…. The analysis in the study is not as simple as a straight comparison of two proportions…. It is always possible I’ve made an error…