2 Matching Annotations
  1. Jul 2018
    1. On 2016 Aug 08, UFRJ Neurobiology and Reproducibility Journal Club commented:

The article by Schreiber et al. touches on an interesting subject: namely, the relation between actin remodeling and synaptic plasticity. However, some of the experiments presented raise concerns related to the analysis of nested data, which violates the assumption of independence between experimental units required by conventional statistical methods – a rather common problem in neuroscience papers (Aarts E, 2014).

On figure 3 (K-O), the legend indicates that the sample size refers to clusters (i.e. GFP-positive PURA-containing mRNP particles) observed in hippocampal cell cultures derived from transgenic and wild-type animals. However, there is no mention of how many different animals were used. If more than one cluster measured came from the same animal, the clusters do not constitute independent samples; thus, statistics based on clusters should not be used to evaluate hypotheses relating to the effect of genotype. It is unclear from the text whether this was the case. On figure 7 (F), it is stated that the 15/16 slices used were obtained from 9/10 animals per genotype; therefore, some of the animals contributed more than one slice, leading to the same problem of non-independence between units. Likewise, on figure 7 (A-B) and on figure 8 (E) the sample size is given in number of cells, and again it is unclear whether they came from different animals or not. Nested data in an experiment tend to make values obtained from the same animal/cell more similar to each other than values obtained from different animals. This causes the variability between units within experimental groups to be underestimated, and thus increases the type 1 error rate of statistical tests that assume independent observations (such as t tests and ANOVA), leading to more false-positive results. A simple correction for this problem would be to calculate a mean value for each animal, and then use animal-level statistics for each comparison. Alternatively, multi-level analysis can be used to separate the variances at the different levels (e.g. slice-level vs. animal-level variation) (Aarts E, 2014).
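The animal-level averaging correction described above can be sketched as follows. The data, group sizes, and effect sizes here are hypothetical, invented purely to illustrate the difference between pooling all clusters (wrong when clusters are nested within animals) and averaging per animal first:

```python
# Sketch of the animal-level averaging correction for nested data.
# All numbers below are simulated, hypothetical values, not the paper's data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def simulate_group(group_mean, n_animals=4, n_clusters=10):
    """Simulate clusters nested within animals: each animal gets its own
    offset, so clusters from the same animal are correlated."""
    animal_offsets = rng.normal(0.0, 1.0, n_animals)        # animal-level variation
    return [group_mean + off + rng.normal(0.0, 0.5, n_clusters)  # cluster-level noise
            for off in animal_offsets]

wt = simulate_group(0.0)  # wild-type animals
tg = simulate_group(0.0)  # transgenic animals (true genotype effect = 0)

# Naive analysis: pool all clusters and treat them as independent samples.
t_naive, p_naive = stats.ttest_ind(np.concatenate(wt), np.concatenate(tg))

# Corrected analysis: one mean per animal, then an animal-level t test.
t_corr, p_corr = stats.ttest_ind([a.mean() for a in wt],
                                 [a.mean() for a in tg])

print(f"naive (cluster-level) test:  n={2 * 4 * 10}, p={p_naive:.3f}")
print(f"corrected (animal-level):    n={2 * 4},  p={p_corr:.3f}")
```

The naive test claims 80 independent observations when there are really only 8 animals; the corrected test uses the animals as the experimental units, which is the appropriate level for a hypothesis about genotype.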

Another concern in this article is the analysis of subgroups. On figure 3 (F-O), the clusters were divided into 2 groups based on their mobility, and each of these groups was evaluated on 3 parameters: total distance moved, maximal velocity and maximal distance to origin. This amounts to 6 analyses in total, 2 of which yielded a statistically significant result. It is not clear from the methods whether dividing the clusters was an a priori decision, or whether the authors perceived the whole group as heterogeneous and therefore decided to divide it. Such a posteriori analysis of subgroups inevitably increases the number of comparisons performed, and thus the chance of obtaining false-positive results (Lagakos SW, 2006). In this case, the authors could have corrected the statistical significance thresholds to account for the number of comparisons generated by the subgroup analysis.
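One simple way to apply the threshold correction suggested above is a Bonferroni adjustment across the six subgroup comparisons (2 mobility subgroups x 3 movement parameters). The p-values below are hypothetical placeholders, not the paper's values; the sketch only shows the mechanics of the correction:

```python
# Bonferroni correction across six subgroup comparisons.
# The p-values are hypothetical, invented for this illustration.
p_values = [0.04, 0.21, 0.008, 0.63, 0.35, 0.047]
alpha = 0.05
n_comparisons = len(p_values)

# Bonferroni: a result is significant only if p < alpha / number of tests.
threshold = alpha / n_comparisons            # 0.05 / 6, roughly 0.0083
significant = [p < threshold for p in p_values]

print(f"corrected threshold: {threshold:.4f}")
print(f"significant after correction: {sum(significant)} of {n_comparisons}")
```

With these placeholder values, two results fall below the uncorrected 0.05 threshold but only one survives the correction, which is exactly the kind of change in conclusions that uncorrected subgroup analysis can hide.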


      This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY.
