5 Matching Annotations
  1. Nov 2021
    1. In this report, we investigated performance of the omnibus test using simulated data. The hierarchical procedure is a widely used approach for comparing multiple (more than two) groups.[1] The omnibus test is intended to preserve type I errors by eliminating unnecessary post-hoc analyses under the null of no group difference. However, our simulation study shows that the hierarchical approach is not guaranteed to work all the time. The omnibus and post-hoc tests are not always in agreement. As our goal of comparing multiple groups is to find groups that have different means, a significant omnibus test gives a false alarm, if none of the post-hoc tests are significant. But, most important, we may also miss opportunities to detect group differences, if we have a non-significant omnibus test, since some or all post-hoc tests may still be significant in this case. Although we focus on the classic ANOVA model in this report, the same considerations and conclusions also apply to more complex models for comparing multiple groups, such as longitudinal data models [2]. Since for most models, post-hoc tests with significant levels adjusted to account for multiple testing do not have exactly the same type I error as the omnibus test as in the case of ANOVA, it is more difficult to evaluate performance of the hierarchical procedure. For example, the Bonferroni correction is generally conservative. Given our findings, it seems important to always perform pairwise group comparisons, regardless of the significance status of the omnibus test and report findings based on such group comparisons.

      Post hoc not significant when omnibus test is significant.
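
The disagreement described in the quoted passage can be explored with a small simulation. The sketch below is not the report's own code. It assumes normally distributed groups with a common standard deviation, uses `scipy.stats.f_oneway` for the omnibus F test, and runs Bonferroni-adjusted two-sample t tests on each pair (using only the two groups involved, not the pooled ANOVA error term). The function name `simulate_agreement`, the group means, sample size, and number of replications are all illustrative assumptions.

```python
import itertools

import numpy as np
from scipy import stats


def simulate_agreement(means, n_per_group=20, sd=1.0, n_sim=2000,
                       alpha=0.05, seed=0):
    """Cross-tabulate omnibus F-test and Bonferroni pairwise t-test decisions.

    All defaults are illustrative assumptions, not values from the report.
    """
    rng = np.random.default_rng(seed)
    pairs = list(itertools.combinations(range(len(means)), 2))
    alpha_pair = alpha / len(pairs)  # Bonferroni-adjusted level per pair
    counts = {
        "both_significant": 0,
        "omnibus_only": 0,   # significant F test, no significant pair
        "posthoc_only": 0,   # non-significant F test, some significant pair
        "neither": 0,
    }
    for _ in range(n_sim):
        groups = [rng.normal(mu, sd, n_per_group) for mu in means]
        omnibus_sig = stats.f_oneway(*groups).pvalue < alpha
        posthoc_sig = any(
            stats.ttest_ind(groups[i], groups[j]).pvalue < alpha_pair
            for i, j in pairs
        )
        if omnibus_sig and posthoc_sig:
            counts["both_significant"] += 1
        elif omnibus_sig:
            counts["omnibus_only"] += 1
        elif posthoc_sig:
            counts["posthoc_only"] += 1
        else:
            counts["neither"] += 1
    return counts


if __name__ == "__main__":
    # One group mean shifted by half a standard deviation (assumed scenario).
    print(simulate_agreement([0.0, 0.0, 0.5]))
```

The `omnibus_only` count corresponds to the case this note highlights (a significant omnibus test with no significant post-hoc test), and `posthoc_only` to the missed-detection case the passage calls more important.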

  2. Aug 2020
  3. May 2019
    1. Multiple comparisons: It is not good practice to test for significant differences among pairs of group means unless the ANOVA suggests some such differences exist. Nevertheless, I admit it is tempting to take another look at the comparison of G1 with G3 (ignoring the existence of G2 and perhaps assuming normality), but then you should use a Welch t test to account for the differences in sample variances, and you should not make claims about the result unless the P-value is as low as .01 or .02. Looking at that difference more carefully might prompt a subsequent experiment.

      Test for significance among pairs when the overall F test is not significant.
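
The Welch t test recommended in the quoted answer can be sketched as follows. The data for `g1` and `g3` are made-up placeholders, since the original measurements are not reproduced on this page, and `welch_t_test` is a hypothetical helper name. The code computes the Welch statistic and the Welch-Satterthwaite degrees of freedom by hand, then takes the two-sided p-value from `scipy.stats.t.sf`.

```python
import math
import statistics

from scipy import stats


def welch_t_test(a, b):
    """Two-sided Welch t test for unequal variances; returns (t, df, p)."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)  # sample variances
    se2 = va / na + vb / nb  # squared standard error of the mean difference
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    p = 2 * stats.t.sf(abs(t), df)
    return t, df, p


# Hypothetical measurements, for illustration only.
g1 = [4.1, 5.0, 4.7, 5.3, 4.4, 4.9, 5.1, 4.6]
g3 = [5.9, 7.4, 4.8, 8.1, 6.2, 5.5, 7.9, 6.6]

t, df, p = welch_t_test(g1, g3)
print(f"t = {t:.3f}, df = {df:.1f}, p = {p:.4f}")
# The quoted answer suggests a stricter threshold for this unplanned comparison.
print("Meets the .01 threshold:", p < 0.01)
```

Passing `equal_var=False` to `scipy.stats.ttest_ind` performs the same test in one call; the manual version is shown only to make the variance adjustment visible.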

  4. Jan 2016
    1. Nothing would have changed

      This is always a retrospective view. Always worry about the post hoc view that this represents. Perhaps the change was inevitable given the set of initial conditions the classroom represented at this point in time. Perhaps we need the hard rock problem that the status quo ante bellum represents. Perhaps it is just one narrative of many that are equally or more compelling.