3 Matching Annotations
  1. Jul 2018
    1. On 2017 Jan 24, Anthony Jorm commented:

      Thank you to the authors for providing the requested data. I would like to provide a further comment on the effect size for the primary outcome of their intervention, the Social Acceptance Scale. Using the pre-test and post-test means and standard deviations and the correlation between pre-test and post-test, they calculate a Cohen’s d of 0.186, which is close to Cohen’s definition of a ‘small’ effect size (d = 0.2). However, I believe this is not the appropriate method for calculating the effect size. Morris & DeShon Morris SB, 2002 have reviewed methods of calculating effect sizes from repeated measures designs. They distinguish between a ‘repeated measures effect size’ and an ‘independent groups effect size’. Koller & Stuart appear to have used the repeated measures effect size (equation 8 of Morris & DeShon). This is not wrong, but it is a different metric from that used in most meta-analyses. To allow comparison with published meta-analyses, it is necessary to use the independent groups effect size, which I calculate to be d = 0.14 (using equation 13 of Morris & DeShon). This effect size can be compared to the results of the meta-analysis of Corrigan et al. Corrigan PW, 2012, which reported pooled results from studies of stigma reduction programs with adolescents. The mean Cohen’s d scores for ‘behavioral intentions’ (which the Social Acceptance Scale aims to measure) were 0.302 for education programs, 0.457 for in-person contact programs and 0.172 for video contact programs. I would therefore conclude that the contact-based education program reported by Koller & Stuart has a ‘less than small’ effect, and that it is smaller than the effects seen in other contact-based and education programs for stigma reduction in adolescents.
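      The conversion between the two metrics follows from the pre-post correlation. Below is a minimal Python sketch (not from either party) using the means (24.56, 23.62), standard deviations (6.71, 6.93) and correlation (0.73) reported in the authors’ reply later in this thread; equation numbers refer to Morris & DeShon (2002), and small rounding differences from the published figures are expected.

      ```python
      import math

      # Summary statistics reported in the authors' reply in this thread
      m_pre, m_post = 24.56, 23.62   # pre-test and post-test means
      sd_pre, sd_post = 6.71, 6.93   # pre-test and post-test SDs
      r = 0.73                       # correlation between pre- and post-test scores

      # Repeated measures effect size (Morris & DeShon, eq. 8):
      # mean change divided by the SD of the change scores.
      sd_diff = math.sqrt(sd_pre**2 + sd_post**2 - 2 * r * sd_pre * sd_post)
      d_rm = (m_pre - m_post) / sd_diff
      print(f"repeated measures d = {d_rm:.3f}")   # ~0.187 (reported as 0.186)

      # Independent groups effect size via Morris & DeShon's eq. 13 conversion:
      # d_IG = d_RM * sqrt(2 * (1 - r)).
      d_ig = d_rm * math.sqrt(2 * (1 - r))
      print(f"independent groups d = {d_ig:.2f}")  # ~0.14, as stated above
      ```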


      This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY.


    2. On 2016 Nov 10, Heather Stuart commented:

      We would like to thank Professor Jorm for his careful consideration of our results and his comment. As requested, we have provided the following additional data analysis.

      1. Report means, standard deviations and Cohen’s d with 95% CI for the primary outcome. This will allow comparison with the results of the meta-analyses by Corrigan et al. Corrigan PW, 2012 and Griffiths et al. Griffiths KM, 2014. Professor Jorm’s questions raise the important issues of what constitutes a meaningful outcome when conducting anti-stigma research and how much of an effect is noteworthy (statistical significance aside). We discussed these issues at length when designing the evaluation protocol and, based on the book Analysis of Pretest-Posttest Designs (Bonate, 2000), we took the approach that scale scores are not helpful for guiding program improvements. Aggregated scale scores do not identify which specific areas require improvement, whereas individual survey items do. We also considered what would be a meaningful difference to program partners (who participated actively in this discussion) and settled on the 80% (A grade) threshold as a meaningful heuristic describing the outcome of an educational intervention. Thus, we deliberately did not use the entire scale score to calculate a difference of means. Our primary outcome was the adjusted odds ratio. When we convert the odds ratio to an effect size (Chinn, 2000), we get an effect size of 0.52, reflecting a moderate effect. The mean pre-test Social Acceptance score was 24.56 (SD 6.71, CI 24.34-24.75) and the mean post-test score was 23.62 (SD 6.93, CI 23.40-23.83). Using these values and the correlation between the two scores (0.73), the resulting Cohen’s d is 0.186, reflecting a small and statistically significant effect size. (A consolidated sketch of these calculations appears at the end of this reply.) It is important to point out that the mean differences reported here do not take into consideration the heterogeneity across programs, so they most likely underestimate the effect. This might explain why the effect size based on the odds ratio (which was corrected for heterogeneity) was higher than the unadjusted standardized mean effect. Whether using a standardized mean effect size or the adjusted odds ratio, the results suggest that contact-based education is a promising practice for reducing stigma in high school students.

      2. Data on the percentage of ‘positive outliers’ to compare with the ‘negative outliers’. Because we had some regression to the mean in our data, we used the negative outliers to rule out the hypothesis that the negative changes noted could be entirely explained by this statistical artefact. We defined negative outliers as scores below the 25th percentile minus 1.5 times the interquartile range. Outliers were 3.8% for the Stereotype Scale difference score and 2.8% for the Social Acceptance difference score, suggesting that some students actually got worse. We noted that males were more likely to be among the outliers. Our subsequent analysis of student characteristics showed that males who did not self-disclose a mental illness were less likely to achieve a passing score. This supported the idea that a small group of students may be reacting negatively to the intervention and becoming more stigmatized. While the odds ratio alone (or the standardized mean difference) could, as Professor Jorm indicates, mask some deterioration in a subset of students, our full analysis was designed to uncover this exact phenomenon. Professor Jorm has asked that we show the positive outliers. If we define a positive outlier as scores above the 75th percentile plus 1.5 times the interquartile range, then 1.9% were outliers on the Stereotype Scale difference score and 2.3% on the Social Acceptance difference score, suggesting that the intervention also resonated particularly well with a small group of students.

      Thus, while contact-based interventions appear to be generally effective (i.e. when using omnibus measures such as a standardized effect size or the adjusted odds ratio), our findings support the idea that effects are not uniform across all subsets of students (or, indeed, programs). Consequently, more nuanced approaches to anti-stigma interventions are needed, such as those that are sensitive to gender and personal disclosure, along with fidelity criteria to maximize program effects.

      3. Data on changes in ‘fail grades’, i.e. whether there was any increase in those with less than 50% non-stigmatizing responses. In response to Professor Jorm’s request for a reanalysis of students who failed, we defined a fail grade as giving a stigmatizing response to at least 6 of the 11 statements (54% of the questions). At pre-test, 32.8% of students ‘failed’ on the Stereotype Scale, dropping to 23.7% at post-test (a decrease of 9.1 percentage points). For the Social Acceptance Scale, 28.5% ‘failed’ at pre-test, dropping to 24.8% at post-test (a decrease of 3.7 percentage points). Using McNemar’s test, the changes on both the Stereotype Scale (χ²(1) = 148.7, p < .001) and the Social Acceptance Scale (χ²(1) = 28.4, p < .001) were statistically significant, lending further support to our conclusion that the interventions were generally effective.

      Bonate, P. L. (2000). Analysis of Pretest-Posttest Designs. CRC Press.

      Chinn, S. (2000). A simple method for converting an odds ratio to effect size for use in meta-analysis. Statistics in Medicine, 3127-3131.
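      For concreteness, the three calculations in this reply can be sketched in a few lines of Python. This is a minimal illustration, not the authors’ code: the odds ratio (2.57) comes from the paper and the fence definitions from the reply above, while the discordant-pair counts passed to the McNemar function are hypothetical placeholders, since the reply reports the test statistics but not the underlying paired counts.

      ```python
      import math

      # Point 1: converting an adjusted odds ratio to Cohen's d (Chinn, 2000).
      # d = ln(OR) * sqrt(3) / pi, since the logistic distribution has SD pi/sqrt(3).
      odds_ratio = 2.57
      d_from_or = math.log(odds_ratio) * math.sqrt(3) / math.pi
      print(f"d from OR: {d_from_or:.2f}")  # ~0.52, matching the reply

      # Point 2: Tukey-style fences for outliers on the difference scores.
      # Negative outliers fall below Q1 - 1.5*IQR; positive ones above Q3 + 1.5*IQR.
      def outlier_fences(q1, q3):
          iqr = q3 - q1
          return q1 - 1.5 * iqr, q3 + 1.5 * iqr

      # Point 3: McNemar's chi-squared (uncorrected) from discordant pairs, where
      # b = students failing at pre-test only and c = failing at post-test only.
      def mcnemar_chi2(b, c):
          return (b - c) ** 2 / (b + c)

      # Hypothetical counts for illustration only.
      print(f"chi2 = {mcnemar_chi2(150, 40):.1f}")  # 63.7 on these made-up counts
      ```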


      This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY.

    3. On 2016 Jul 20, Anthony Jorm commented:

      The authors of this study conclude that “contact-based education appears to be effective in improving students’ behavioural intentions towards people who have a mental illness”. However, it is not clear that the data on the primary outcome measure (the Social Acceptance Scale) support this conclusion. The authors measured change on this primary outcome in two ways. The first is a difference score calculated by subtracting post-test scores from pre-test scores. The second is a dichotomous grade score, with 80% non-stigmatizing responses defined as an ‘A grade’. With the difference scores, the authors do not report the means, standard deviations and an effect size measure (e.g. Cohen’s d) at pre-test and post-test, as is usually done. This makes it impossible to compare the effects to those reported in meta-analyses of the effects of stigma reduction interventions. Instead, they report the percentage of participants whose scores got worse, stayed the same or got better. It is notable that a greater percentage got worse (28.3%) than got better (19.8%), indicating that the overall effect may have been negative. The authors also report on the percentage of participants who got worse by 5 or more points (the ‘negative outliers’: 2.8%), but they do not report for comparison the percentage who got better by this amount. The dichotomous A grade scores do appear to show improvement overall, with an odds ratio of 2.57. However, this measure could mask simultaneous deterioration in the primary outcome in a subset of participants. This could be assessed by also reporting the equivalent of a ‘fail grade’. I request that the authors report the following to allow a full assessment of the effects of this intervention:

      1. Means, standard deviations and Cohen’s d with 95% CI for the primary outcome. This will allow comparison with the results of the meta-analyses by Corrigan et al. Corrigan PW, 2012 and Griffiths et al. Griffiths KM, 2014.

      2. Data on the percentage of ‘positive outliers’ to compare with the ‘negative outliers’.

      3. Data on changes in ‘fail grades’, i.e. whether there was any increase in those with less than 50% non-stigmatizing responses.
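      As a reading aid, here is a minimal sketch of the dichotomous grading scheme this comment refers to, assuming each of the 11 scale items (the count given in the authors’ reply above) is coded as stigmatizing or non-stigmatizing; the boolean encoding and function name are hypothetical, not the authors’ implementation.

      ```python
      # responses: 11 booleans, True where the student gave a non-stigmatizing answer
      def grade(responses):
          share = sum(responses) / len(responses)
          if share >= 0.80:
              return "A"     # 'A grade': at least 80% non-stigmatizing responses
          if share < 0.50:
              return "fail"  # 'fail grade': under 50% non-stigmatizing responses
          return "pass"

      print(grade([True] * 9 + [False] * 2))  # 9/11 ~ 82% -> 'A'
      ```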


      This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY.
