London: Academic Press, 2022. — 212 p.
Wise Use of Null Hypothesis Tests: A Practitioner's Handbook gives readers the foundational knowledge needed to design and validate their research. It provides the conceptual background needed to understand the methodology, including how to determine a null hypothesis, how to choose the most appropriate test, t tests, common misconceptions, and research study design. Written by a neurobiologist for neuroscientists, the book helps readers understand null hypothesis tests so that they can produce better-quality research and publish validated, significant results.
Epigraph
About the author
What makes this book different?
This is not a mathematics book
Conventional books get it wrong
Ronald Fisher got it right, and his method is simple
What conventional books leave out
How to use this book
The conventional method is a flawed fusion
Three statisticians, two methods, and the mess that should be banned
Wise use and testing nulls that must be false
Null hypothesis testing in perspective
Notes
The point is to generalize beyond our results
Samples and populations
Real and hypothetical populations
Randomization
Know your population, and do not generalize beyond it
Notes
Null hypothesis testing explained
The effect of sampling error
The logic of testing a null hypothesis
We should know from the start that many null hypotheses cannot be correct
The traditional explanation of how to use p
What use of α accomplishes
The flawed hybrid in action
Criticisms of the flawed hybrid
We should test nulls in a way that answers the criticisms
How to use p and α
Mouse preference, done right this time
More p-values in action
What were the nulls and predictions?
What if p=?
A radical but wise way to use p
or ? p or P?
Notes
How often do we get it wrong?
Distributions around means
Distributions of test statistics
Null hypothesis testing explained with distributions
Type I errors explained
Probabilities before and after collecting data
The null’s precision explained
The awkward definition of p explained
Errors in direction
Power and errors in direction
Manipulating power to lower p-values
Increasing power with one-tailed tests
Power and why we should set α to or higher
Power, estimated effect size, and type M errors
How can we know a population’s distribution?
Notes
Important things to know about null hypothesis testing
Examples of null hypotheses in proper statistics books and what they really mean
Categories of null hypotheses?
What if it is important to accept the null?
Never do this
Null hypothesis testing as never explained before
Effect size: what is it and when is it important?
We should provide all results, even those not statistically “significant”
Notes
Common misconceptions
Null hypothesis testing is misunderstood by many
Statistical “significance” means a difference is large enough to be important—wrong!
p is the probability of a type I error—wrong!
If results are statistically “significant,” we should accept the alternative hypothesis that something other than the n
If results are not statistically “significant,” we should accept the null hypothesis—wrong!
Based on p we should either reject or fail to reject the null hypothesis—often wrong!
Null hypothesis testing is so flawed that we should use confidence intervals instead—wrong!
Power can be used to justify accepting the null hypothesis—wrong!
The null hypothesis is a statement of no difference—not always
The null hypothesis is that there will be no significant difference between the expected and observed values—very, ver
A null hypothesis should not be a negative statement—wrong!
Notes
The debate over null hypothesis testing and wise use as the solution
The debate over null hypothesis testing
Communicate to educate
Plan ahead
Test nulls when appropriate, not promiscuously
Strike the right balance between what is conventional and what is best
Think outside of the null hypothesis test
Encourage our audience to draw their own conclusions
Allow ourselves to draw our own conclusions
Strike the right balance when providing our results
Know the misconceptions and do not fall for them
Do not say that two groups “differ” or “do not differ”
Provide all results somehow
Other reformed methods of null hypothesis testing
Notes
Simple principles behind the mathematics and some essential concepts
Why different types of data require different types of tests
Simple principles behind the mathematics
Numerical data exhibit variation
Nominal data do not exhibit variation
How to tell the difference between nominal and numerical data
Simple principles behind the analysis of groups of measurements and discrete numerical data
Variance: a statistic of huge importance
Incorporating sample size and the difference between our prediction and our outcome
Drawing conclusions when we knew all along that the null must be false
Degrees of freedom explained
Other types of t tests
Analysis of variance and t tests have certain requirements
Do not test for equal variances unless …
Simple principles behind the analysis of counts of observations within categories
Counts of observations within categories
When the null hypothesis specifies the prediction
When there is only one degree of freedom
When the null hypothesis does not specify the prediction
Interpreting p when the null hypothesis cannot be correct
× Designs and other variations
The problem with chi-squared tests
The reasoning behind the mathematics
Rules for chi-squared tests
Notes
The two-sample t test and the importance of pooled variance
Comparing more than two groups to each other
If we have three or more samples, most say we cannot use two-sample t tests to compare them two samples at a time
Analysis of variance
The price we pay is power
Comparing every group to every other group
Comparing multiple groups to a single reference, like a control
Is all of this a load of rubbish?
Notes
Assessing the combined effects of multiple independent variables
Independent variables alone and in combination
No, we may not use multiple t tests
We have a statistical main effect: now what?
We have a statistical interaction: things to consider
We have a statistical interaction and we want to keep testing nulls
Which is more important, the main effect or the interaction?
Designs with more than two independent variables
Use of analysis of variance to reduce variation and increase power
Notes
Comparing slopes: analysis of covariance
Analysis of covariance
Use of analysis of covariance to reduce variation and increase power
More on the use of analysis of covariance to reduce variation and increase power
Use of analysis of covariance to limit the effects of a confound
Note
When data do not meet the requirements of t tests and analysis of variance
When do we need to take action?
Floor effects and the square root transformation
Floor and ceiling effects and the arcsine transformation
Not as simple as a floor or ceiling effect—the rank transformation
Making analysis of variance sensitive to differences in proportion—the logarithmic transformation
Nonparametric tests
Transforming data changes the question being asked
Notes
Reducing variation and increasing power by comparing subjects to themselves
The simple principle behind the mathematics
Repeated measures analysis of variance
Multiple comparisons tests on repeated measures
When subjects are not organisms
When repeated does not mean repeated over time
Pretest-posttest designs illustrate the danger of measures repeated over time
Repeated measures analysis of variance versus t tests
The problem with repeated measures
The requirement for sphericity
Correcting for a lack of sphericity
Multiple comparisons tests when there is a lack of sphericity
The multivariate alternative to correction
Notes
What do those error bars mean?
Confidence intervals
Testing null hypotheses in our heads
Plotting confidence intervals
Error bars and repeated measures
Plot comparative confidence intervals to make the overlap myth a reality
Notes
Appendix A Philosophical objections
Decades of bitter debate
We want to know when we are wrong, not how often
Setting α to does not mean that % of all null-based decisions are wrong
There are better ways to analyze and interpret data
The fallacy of affirming the consequent
Some say our method cannot be used to determine direction
The return of one-tailed tests
Kaiser’s absurd directional two-tailed tests
Invoking power to justify Kaiser’s directional two-tailed tests
Fisher did not follow Kaiser’s rules
Still not convinced?
Notes
Appendix B How Fisher used null hypothesis tests
Why follow my advice?
Fisher tested for direction
Others did too
Fisher believed α should vary according to the circumstances
Fisher came close to saying there should be no α at all
In practice, Fisher did not categorize outcomes
Fisher’s language answers many criticisms of null hypothesis testing
Except for Fisher’s use of “significant”
Fisher’s inconsistency explained
Fisher’s thinking expressed in one word
We have come a long way since Fisher, but the wrong way?
Notes
Appendix C The method attributed to Neyman and Pearson
Neyman and Pearson with Pearson
Neyman and Pearson without Pearson
An important limitation
Alternatives are always infinitely numerically precise
The method step-by-step
The method’s influence on the flawed hybrid
The method’s fate in the world of the flawed hybrid
Power spreads its wings
Neyman et al.’s method has no place in science
Notes
Back Cover