Here’s the deal — statistical power is kinda important. It probably seems esoteric. It sounds weird. For some reason, maybe because I watched cartoons in the ’80s, when I see “statistical power” I conjure up this weird image of He-Man (of Masters of the Universe fame). I don’t know why.
Sometimes scientists will write things that sound kinda like playground taunts, or at least they conjure up those images in my head. For instance, Porter et al. (2022) said “the unusually large spread in the distribution of PFOS, PFOA, and PFHxS meant our study had relatively large power to detect a statistically significant association for these compounds.” Within the context of the paper, it sounds like the authors are saying that their study has more statistical power than papers that rely on data from the CDC’s National Health and Nutrition Examination Survey (NHANES) and its measurements of various PFAS chemicals.
Again — in my head, I’m picturing the paper’s authors’ data being more beefed up and thus more powerful than data from NHANES.
In general, more statistical power is a good thing. Power is the probability that a study detects an effect that’s actually there, so higher power means fewer false negatives; in an ideal case, a negative result from a high-powered study is more likely to be a true negative. But when authors don’t report their statistical power, and instead make relative statements, it’s impossible to judge which study is actually better — again, it kinda feels like a playground taunt.
The authors’ justification for saying that their study has more power is based on the sentence just before the one I quoted above, where they say, “the statistical power of a study to detect an association is increased by a large variation in the distribution of the independent variable of interest (Freund et al., 1980).” They are citing a textbook from 1980. Cool. Case closed, right?
Not quite so fast. Porter et al. are missing some key facts, and those key facts are what make their statement misleading, in my opinion. The statistical power to detect an association with an independent variable (which, in a regression, really means the power to detect a nonzero slope) does not depend only on the dispersion, variation, or standard deviation of that independent variable. Oh no, it depends on much, much more.
Power and the Single Independent Variable
Those of us who have been fortunate enough to take a regression class will probably remember that you can calculate statistical power for more than one thing: for the model as a whole (via the R-squared value) and for each individual independent variable.
Yes, it makes sense that having a large dispersion of data in the independent variable will increase the statistical power. The easiest way to see this is if we consider a simple linear regression. We have some dependent variable we care about, say amount of IgG antibody, and the concentration of PFOS (a type of PFAS chemical), our independent variable.
If you only have 2 concentrations of PFOS, say 1 µM and 1,000 µM, the model’s not going to be very good. You won’t have a good idea of what the shape of the line really should be. Sure, you can interpolate everything between 1 and 1,000 µM, but the slope may be extremely inaccurate. But if you had, say, 0.01 µM, 0.1 µM, 1 µM, 10 µM, 100 µM, and 1,000 µM — now you’ll get a better model. You’ll get a better sense of the slope.
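To see the dispersion effect in isolation, here’s a minimal simulation sketch in Python. All the numbers (slope, noise level, concentration designs) are invented for illustration; the two designs have the same sample size and the same noise, and differ only in how spread out the concentrations are on the log scale:

```python
import numpy as np

rng = np.random.default_rng(42)
true_intercept, true_slope, noise_sd = 10.0, -0.5, 2.0  # invented values

# Two hypothetical designs, each with 100 samples:
designs = {
    "narrow spread": np.log10(np.array([400, 600, 800, 1000] * 25)),
    "wide spread": np.log10(np.array([0.01, 0.1, 1, 10, 100, 1000] * 17)[:100]),
}

for name, x in designs.items():
    slopes = []
    for _ in range(5000):
        y = true_intercept + true_slope * x + rng.normal(0, noise_sd, x.size)
        slopes.append(np.polyfit(x, y, 1)[0])  # fitted slope of a simple regression
    print(f"{name}: SD of x = {x.std():.2f}, "
          f"empirical SE of slope = {np.std(slopes):.3f}")
```

Same n, same noise, and the wide design still pins down the slope far more precisely. That’s the grain of truth in the Freund et al. citation.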
This is what Porter et al. are talking about when they say, “the statistical power of a study to detect an association is increased by a large variation in the distribution of the independent variable of interest (Freund et al., 1980).” That “large variation” (a relative term that’s kinda worthless without knowing what it’s being compared to) refers to the dispersion of chemical concentrations in their study.
But, as James Hanley explains in his paper, and as is explained in other places as well, the statistical power associated with an independent variable is more complicated than the dispersion of that variable alone. The power depends on the standard error of the estimated slope for that variable, which is:
SE = RMSE × (1 / SD_x) × (1 / √n)
Note that the dispersion Porter et al. are talking about is SD_x, the standard deviation of the independent variable. But look at the other two terms: n and RMSE. The term n is the sample size. RMSE is the root mean square error: the square root of the average squared residual, where a residual is the difference between the observed y value (the IgG concentration in this example) and the value the regression predicts for it (so y − ŷ).
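Here’s a quick sketch of that formula in Python, with invented numbers, just to show that all three knobs matter:

```python
import numpy as np

def slope_se(rmse, sd_x, n):
    """SE of the fitted slope: RMSE x (1/SD_x) x (1/sqrt(n)).

    RMSE is the residual standard deviation; this ignores the small
    n vs. n-2 degrees-of-freedom correction for simplicity.
    """
    return rmse * (1.0 / sd_x) * (1.0 / np.sqrt(n))

# Purely hypothetical numbers:
print(slope_se(rmse=2.0, sd_x=1.5, n=100))  # ~0.133
print(slope_se(rmse=2.0, sd_x=3.0, n=100))  # ~0.067  doubling SD_x halves the SE...
print(slope_se(rmse=4.0, sd_x=3.0, n=100))  # ~0.133  ...but doubling RMSE undoes it
```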
That SE value then gets plugged into the standard power formula: the further the true slope sits from zero, measured in units of SE, the more likely the test is to reject the null. Et voilà!
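For the curious, here’s roughly what that plugging-in looks like. This is a sketch using the normal approximation to the t test, not the exact finite-sample calculation, and the slope and SE values are hypothetical:

```python
from scipy.stats import norm

def slope_power(true_slope, se, alpha=0.05):
    """Approximate power of a two-sided test of slope = 0,
    via the normal approximation to the t distribution."""
    z_crit = norm.ppf(1 - alpha / 2)
    shift = abs(true_slope) / se  # how many SEs the true slope sits from zero
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)

# Hypothetical: a true slope of 0.3 and the SE of ~0.133 from above
print(slope_power(0.3, 0.133))  # ~0.62
```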
So, Let’s Recap — What Did Porter et al. Do Wrong?
So here’s where Porter et al. went wrong. They are making a statement that their study has greater relative power than something else, and they leave the reader to guess that the something else is the NHANES data. Porter et al. say they have more statistical power solely because they have a larger dispersion in their chemical concentration data for each PFAS chemical.
But that assumes that NHANES, or whatever they’re comparing to, has the same RMSE and sample size that they have. What are the odds that any other study will have the same RMSE? Pretty low, right?
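To make that concrete, here’s a made-up comparison using the SE formula above. Every number is invented, but it shows how a study can win on dispersion and still lose on power:

```python
import numpy as np

def slope_se(rmse, sd_x, n):
    return rmse * (1.0 / sd_x) * (1.0 / np.sqrt(n))

# Study A: bigger dispersion in x, but a noisier outcome and fewer subjects.
se_a = slope_se(rmse=3.0, sd_x=2.0, n=50)   # ~0.21
# Study B: tighter x, but a cleaner outcome and more subjects.
se_b = slope_se(rmse=1.0, sd_x=1.0, n=200)  # ~0.07
print(se_a, se_b)  # A has the larger SD_x yet the larger SE, so less power
```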
So, in my opinion Porter et al. are making a very misleading statement; I’m actually inclined to call it a false statement.
Now, was it an intentionally false and misleading statement? I can’t answer that. I also don’t know what Freund et al. said in the 1980 edition of their textbook, but I do know what they said in their 2010 edition:
This expression shows that the variance of β1 increases with larger values of the population variance, and decreases with larger sample size and/or larger dispersion of the values of the independent variable.
Freund et al., Statistical Methods, Third Edition (Elsevier, 2010)
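For reference, the expression Freund et al. are describing is the standard variance of the estimated slope in simple linear regression, which ties their quote directly to the SE formula above:

```latex
\operatorname{Var}(\hat{\beta}_1)
  = \frac{\sigma^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}
  = \frac{\sigma^2}{n \cdot \mathrm{SD}_x^2}
```

Here σ² is the population error variance (the thing RMSE² estimates), n is the sample size, and SD_x is the (population-style) standard deviation of the independent variable. Take the square root and you recover SE = RMSE × (1/SD_x) × (1/√n). All three quantities move the variance, which is exactly what the quote says.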
And that, ladies and gentlemen, again reinforces the point: the statistical power associated with an independent variable in a regression analysis depends on more than just the dispersion of that independent variable.
Verdict:
In my opinion, Porter et al. made a false and misleading statement about the statistical power of their study. What they should have done is calculate the power of their study, report it, and compare it to the conventional benchmark, which is currently 80% power (and to be frank, post hoc power analyses usually produce inflated power values, but that’s a chat for another day).