  1. Last 7 days
  2. Oct 2019
    1. A Million Brains in the Cloud

      Klein and Ghosh published this research idea and opened it to review. For the MOOC activity "open peer review" we want you to read and annotate their proposal using this Hypothes.is layer. Please sign up to Hypothes.is and join the conversation! Simply highlight the text passage that you want to comment on, or create a new Page Note for general comments on the research proposal.

    1. The Politics of Sustainability and Development

      This reading is to help you better understand the role and importance of literature review. Literature review connects us to a bigger community of scientists who study the same research topic, and helps us build up, illustrate, and develop our theory (what is happening between the IV and the DV?) and research design (how one plans to answer the RQ).

    1. According to the McDonald's website, here are the ingredients in the French fries: Ingredients: Potatoes, Vegetable Oil (Canola Oil, Corn Oil, Soybean Oil, Hydrogenated Soybean Oil, Natural Beef Flavor [Wheat and Milk Derivatives]*), Dextrose, Sodium Acid Pyrophosphate (Maintain Color), Salt. *Natural beef flavor contains hydrolyzed wheat and hydrolyzed milk as starting ingredients. Here is the ingredient list for McDonald's hash browns, taken from their website: Ingredients: Potatoes, Vegetable Oil (Canola Oil, Soybean Oil, Hydrogenated Soybean Oil, Natural Beef Flavor [Wheat and Milk Derivatives]*), Salt, Corn Flour, Dehydrated Potato, Dextrose, Sodium Acid Pyrophosphate (Maintain Color), Extractives of Black Pepper. *Natural beef flavor contains hydrolyzed wheat and hydrolyzed milk as starting ingredients

      Elsewhere there’s the claim that McDonald's fries are cooked in some amount of beef fat during processing. This warrants further vetting.

  3. Sep 2019
    1. Transparent Review in Preprints will allow journals and peer review services to show peer reviews next to the version of the manuscript that was submitted and reviewed.

      A subtle but important point here is that when the manuscript is a preprint, there are two public-facing documents being tied together: the "published" article and the preprint. The review-as-annotation becomes the cross-member in that document association.

    1. “To me, this is symptomatic of a much larger problem of transparency within the company. Nobody is forthcoming with information that dramatically affects editorial,” Binkowski said. “One of those things was me not knowing if I was in trouble.”

      The firing of Snopes's managing editor reiterates previous concerns over an unhealthy working environment. This article is old enough that I'd like to see follow-ups.

    1. When I asked how many articles she’d written for the site, she came back with a “verified count” of 1,905. She told me how she came to that number: “By examining every Snopes.com HTML file on my computer, rereading every email David and I exchanged from 1997 until now, and in cases where doubt still existed, examining my research files. The task took a week, but I am satisfied I now have a fair list and that all lurking doubles (a result of David’s penchant for renaming files) have been excised.”

      Impressive. Her alleged painstaking data-hoarding makes me like her. I'm not sure what to think of David's ambiguity yet.

    1. I am writing this review for the Drummond and Sauer comment on Mathur and VanderWeele (2019). To note, I am familiar with the original meta-analyses considered (one of which I wrote), the Mathur and VanderWeele (henceforth MV2019) article, and I’ve read both Drummond and Sauer’s comment on MV2019 and Mathur’s review of Drummond and Sauer’s comment on MV2019 (hopefully that wasn’t confusing). On balance, I think Drummond and Sauer’s (henceforth DSComment) comment under review here is a very important contribution to this debate. I tended to find DSComment to be convincing and was comparatively less convinced by Mathur’s review or, indeed, MV2019. I hope my thoughts below are constructive.

      It’s worth noting that MV2019 suffered from several primary weaknesses. Namely:

      1. On one hand, it didn’t really tell us anything we didn’t already know, namely that near-zero effect sizes are common for meta-analyses in violent video game research.
      2. MV2019, aside from one brief statement as DSComment notes, neglected the well-known methodological issues that tend to spuriously increase effect sizes (unstandardized aggression measures, self-ratings of violent game content, identified QRPs in some studies such as the Singapore dataset, etc.) This resulted in a misuse of meta-analytic procedures.
      3. MV2019 naïvely interprets (as does Mathur’s review of DSComment) near-zero effect sizes as meaningful, despite numerous reasons not to do so given concerns of false positives.
      4. MV2019, for an ostensible compilation of meta-analyses, curiously neglect other meta-analyses, such as those by John Sherry or Furuyama-Kanamori & Doi (2016).

      At this juncture, publication bias, particularly for experimental studies, has been demonstrated pretty clearly (e.g. Hilgard et al., 2017). I have two comments here. MV2019 offered a novel and not well-tested alternative approach to bias (highlighted again by Mathur’s review); however, I did not find the arguments convincing, as this approach appears extrapolative and produces results that simply aren’t true. For instance, the argument that 100% of effect sizes in Anderson 2010 are above 0 is quickly falsified merely by looking at the reported effect sizes in the studies included, at least some of which are below .00. This would appear to clearly indicate some error in the procedure of MV2019.

      Further, we don't need statistics to speculate about publication bias in Anderson et al. (2010), as there are actual specific examples of published null studies missed by Anderson et al. (see Ferguson & Kilburn, 2010). Further, the publication of null studies in the years immediately following (e.g. von Salisch et al., 2011) indicates that Anderson's search for unpublished studies was clearly biased (indeed, I had unpublished data at that time but was not asked by Anderson and colleagues for it). So there's no need at all for speculation, given we have actual examples of missed studies and a fair number of them.

      It might help to highlight also that traditional publication bias techniques probably are only effective with small-sample experimental studies. For large-sample correlational/longitudinal studies, effect sizes tend to be a bit more homogeneous, hovering close to zero. In such studies the accumulation of p-values near .05 is unlikely given the power of such studies. Relatively simple QRPs can make p-values jump rapidly from non-significance to something well below .05. Thus, traditional publication bias procedures may return null results for this pool of studies, despite QRPs, and thus publication bias, having taken place.

      It might also help to note that meta-analyses with weak effects are very fragile to unreported null studies, which probably exist in greater numbers (particularly for large n studies) than would be indicated by publication bias techniques.
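The fragility described above can be illustrated with a minimal sketch. It assumes a fixed-effect, inverse-variance pooled estimate and the rough approximation that the variance of a standardized mean difference is about 4/n. Every number here (20 published studies with d = 0.08 and n = 200, plus 5 unreported null studies with n = 2000) is hypothetical and not taken from any of the meta-analyses under discussion.

```python
def pooled_effect(effects, variances):
    """Fixed-effect (inverse-variance weighted) pooled estimate."""
    weights = [1.0 / v for v in variances]
    return sum(w * e for w, e in zip(weights, effects)) / sum(weights)


def approx_variance(n):
    """Rough variance of a standardized mean difference d (assumption: ~4/n)."""
    return 4.0 / n


# Hypothetical published pool: 20 small studies, each reporting d = 0.08.
published_d = [0.08] * 20
published_var = [approx_variance(200)] * 20

# Hypothetical unreported pool: 5 large studies, each with d = 0.00.
missing_d = [0.0] * 5
missing_var = [approx_variance(2000)] * 5

before = pooled_effect(published_d, published_var)
after = pooled_effect(published_d + missing_d, published_var + missing_var)

print(f"pooled d, published only:        {before:.3f}")
print(f"pooled d, with unreported nulls: {after:.3f}")
```

Because large studies carry large weights, a handful of unreported large-n null results can pull a weak pooled estimate most of the way to zero under these assumptions.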

      I agree with Mathur’s comment about experiments not always offering the best evidence, given lack of generalizability to real-world aggression (indeed, that’s been a long-standing concern). However, it might help DSComment to note that, by this point, probably the pool of evidence least likely to find effects is longitudinal studies. I’ve got two preregistered longitudinal analyses of existing datasets myself (here I want to make clear that citing my work is by no means necessary for my positive evaluation of any revisions on DSComment), and there are other fine studies (such as Lobel et al., 2017; Breuer et al., 2015; Kuhn et al., 2018; von Salisch et al., 2011, etc.). The authors may also want to note Przybylski and Weinstein (2019), which offers an excellent example of a preregistered correlational study.

      Indeed, in a larger sense, as far as evidence goes, DSComment could highlight recent preregistered evidence from multiple sources (McCarthy et al., 2016; Hilgard et al., 2019; Przybylski & Weinstein, 2019; Ferguson & Wang, 2019, etc.). This would seem to be the most crucial evidence and, aside from one excellent correlational study (Ivory et al.), all of the preregistered results have been null. Even if we think the tiny effect sizes in existing metas provide evidence in support of hypotheses (and we shouldn’t), these preregistered studies suggest we shouldn’t trust even those tiny effects to be “true.”

      The weakest aspect of MV2019 was the decision to interpret near-zero effects as meaningful. Mathur argues that tiny effects can be important once spread over a population. However, this is merely speculation, and there’s no data to support it. It’s kind of a truthy thing scholars tend to say defensively when confronted by the possibility that effect sizes don’t support their hypotheses. By making this argument, Mathur invites an examination of population data where convincing evidence (Markey, Markey & French, 2015; Cunningham et al., 2016; Beerthuizen, Weijters & van der Laan, 2017) shows that violent game consumption is associated with reduced violence in society. Granted, some may express caution about looking at societal-level data, but here is where scholars can’t have it both ways: One can’t make claims about societal-level effects, and then not want to look at the societal data. Such arguments make unfalsifiable claims and are unscientific in nature.

      The other issue is that this line of argument makes effect sizes irrelevant. If we’re going to interpret effect sizes no matter how near to zero as hypothesis supportive, so long as they are “statistically significant” (which, given the power of meta-analyses, they almost always are), then we needn’t bother reporting effect sizes at all. We’re still basically slaves to NHST, just using effect sizes as a kind of fig leaf for the naked bias of how we interpret weak results.

      Also, that’s just not how effect sizes work. They can’t be sprinkled like pixie dust over a population to make them meaningful.

      As DSComment points out, effect sizes that are this small have high potential for Type 1 error. Funder and Ozer (2019) recently contributed to this discussion in a way I think was less than helpful (to be very clear, I respect Funder and Ozer greatly, but disagree with many of their comments on this specific issue). Yet, as they note, interpretation of tiny effects is based on such effects being “reliable”, a condition clearly not in evidence for violent game research given the now extensive literature on the systematic methodological flaws in that literature.

      In her comment Dr. Mathur dismisses the comparison with ESP research, but I disagree with (or dismiss?) this dismissal. The fact that effect sizes in meta-analyses for violent game research are identical to those for “magic” is exactly why we should be wary of interpreting such effect sizes as hypothesis supportive. Saying violent game effects are more plausible is irrelevant (and presumably the ESP people would disagree). However, the authors of DSComment might strengthen their argument by noting that some articles have begun examining nonsense outcomes within datasets. For example, in Ferguson and Wang (2019) we show that the (weak and in that case non-significant) effects for violent game playing are no different in predicting aggression than nonsense variables (indeed, the strongest effect was for the age at which one had moved to a new city). Orben and Przybylski (2019) do something similar and very effective with screen time. Point being, we have an expanding literature to suggest that the interpretation of such weak effects is likely to lead us to numerous false positive errors.

      The authors of DSComment might also note that MV2019 commit a fundamental error of meta-analysis, namely assuming that the “average effect size wins!” When effect sizes are heterogeneous (as Mathur appears to acknowledge unless I misunderstood) the pooled average effect size is not a meaningful estimator of the population effect size. That’s particularly true given GIGO (garbage in, garbage out). Where QRPs have been clearly demonstrated for some studies in this realm (see Przybylski & Weinstein, 2019 for some specific examples of documentation involving the Singapore dataset), the pooled average effect size, however it is calculated, is almost certainly a spuriously high estimate of true effects.

      DSComment could note that other issues such as citation bias are known to be associated with spuriously high effect sizes (Ferguson, 2015), another indication that researcher behaviors are likely pulling effect sizes above the actual population effect size.

      Overall, I don’t think MV2019 were very familiar with this field, and they appear unaware of the serious methodological errors endemic in much of the literature, which pull effect sizes spuriously high. In the end, they really didn’t say anything we didn’t already know (the effect sizes across metas tend to be near zero), and their interpretation of these near-zero effect sizes was incorrect.

      With that in mind, I do think DSComment is an important part of this debate and is well worth publishing. I hope my comments here are constructive.

      Signed, Chris Ferguson

    2. [This was a peer review for the journal "Meta-Psychology", and I am posting it via hypothes.is at the journal's suggestion.]

      I thank the authors for their response to our article. For full disclosure, I previously reviewed an earlier version of this manuscript. The present version of the manuscript shows improvement, but does not yet address several of my substantial concerns, each of which I believe should be thoroughly addressed if a revision is invited. My concerns are as follows:

      1.) The publication bias corrections still rely on incorrect statistical reasoning, and using more appropriate methods yields quite different conclusions.

      Regarding publication bias, the first analysis of the number of expected versus observed p-values between 0.01 and 0.05 that is presented on page 3 (i.e., “Thirty nine…should be approximately 4%”) cannot be interpreted as a test of publication bias, as described in my previous review. The p-values would only be uniformly distributed if the null were true for every study in the meta-analysis. If the null does not hold for every study in the meta-analysis, then we would of course expect more than 4% of the p-values to fall in [0.01, 0.05], even in the absence of any publication bias. I appreciate that the authors have attempted to address this by additionally assessing the excess of marginal p-values under two non-null distributions. However, these analyses are still not statistically valid in this context; they assume that every study in the meta-analysis has exactly the same effect size (i.e., that there is no heterogeneity), which is clearly not the case in the present meta-analyses. Effect heterogeneity can substantially affect the distribution and skewness of p-values in a meta-analysis (see Johnson & Yuan, 2007). To clarify the second footnote on page 3, I did not suggest this particular analysis in my previous review, but rather described why the analysis assuming uniformly distributed p-values does not serve as a test of publication bias.
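The point about the 4% benchmark can be checked with a minimal sketch under a stylized model that is an assumption here, not the analysis from either manuscript: each study yields a z-statistic with unit sampling variance, its true noncentrality is drawn from a normal distribution with mean `mean_z` and standard deviation `tau_z`, and p-values are two-sided. The values 2.0 and 1.0 are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()
Z_05 = STD_NORMAL.inv_cdf(1 - 0.05 / 2)  # |z| cutoff for two-sided p = 0.05
Z_01 = STD_NORMAL.inv_cdf(1 - 0.01 / 2)  # |z| cutoff for two-sided p = 0.01


def share_marginal(mean_z, tau_z):
    """P(0.01 <= two-sided p <= 0.05) when z ~ N(mean_z, 1 + tau_z**2).

    tau_z = 0 means every study shares one true effect (no heterogeneity).
    """
    dist = NormalDist(mean_z, (1.0 + tau_z ** 2) ** 0.5)
    upper_tail = dist.cdf(Z_01) - dist.cdf(Z_05)
    lower_tail = dist.cdf(-Z_05) - dist.cdf(-Z_01)
    return upper_tail + lower_tail


null_share = share_marginal(0.0, 0.0)           # null true in every study
homogeneous_share = share_marginal(2.0, 0.0)    # one shared non-null effect
heterogeneous_share = share_marginal(2.0, 1.0)  # same mean, effects vary

print(f"null everywhere:        {null_share:.3f}")
print(f"non-null, homogeneous:  {homogeneous_share:.3f}")
print(f"non-null, heterogeneous: {heterogeneous_share:.3f}")
```

Only the first case gives 4%. The other two differ both from 4% and from each other, so the expected share of marginal p-values depends on the assumed effect distribution even with no publication bias at all.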

      I would instead suggest conducting publication bias corrections using methods that accommodate heterogeneity and allow for a realistic distribution of effects across studies. We did so in the Supplement of our PPS piece (https://journals.sagepub.com/doi/suppl/10.1177/1745691619850104) using a maximum-likelihood selection model that accommodates normally-distributed, heterogeneous true effects and essentially models a discontinuous “jump” in the probability of publication at the alpha threshold of 0.05. These analyses did somewhat attenuate the meta-analyses’ pooled point estimates, but suggested similar conclusions to those presented in our main text. For example, the Anderson (2010) meta-analysis had a corrected point estimate among all studies of 0.14 [95% CI: 0.11, 0.16]. The discrepancy between our findings and Drummond & Sauer’s arises partly because the latter analysis focuses only on pooled point estimates arising from bias correction, not on the heterogeneous effect distribution, which is the very approach that we described as having led to the apparent “conflict” between the meta-analyses in the first place. Indeed, as we described in the Supplement, publication bias correction for the Anderson meta-analyses still yields an estimated 100%, 76%, and 10% of effect sizes above 0, 0.10, and 0.20 respectively. Again, this is because there is substantial heterogeneity. If a revision is invited, I would (still) want the present authors to carefully consider the issue of heterogeneity and its impact on scientific conclusions.
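To make the "percentage of effect sizes above a threshold" idea concrete, here is a minimal sketch assuming true effects are normally distributed across studies. The mean of 0.14 echoes the corrected point estimate quoted above, but the heterogeneity value `tau = 0.05` is purely hypothetical, so the output is illustrative and does not reproduce the 100%, 76%, and 10% figures from the Supplement, which come from the authors' own estimation.

```python
from statistics import NormalDist


def proportion_above(mean_effect, tau, threshold):
    """Share of true effects above `threshold` if effects ~ N(mean_effect, tau**2).

    Requires tau > 0; with tau = 0 there is no distribution to summarize.
    """
    return 1.0 - NormalDist(mean_effect, tau).cdf(threshold)


MEAN_EFFECT = 0.14  # pooled estimate quoted in the review
TAU = 0.05          # hypothetical between-study standard deviation

shares = {q: proportion_above(MEAN_EFFECT, TAU, q) for q in (0.0, 0.10, 0.20)}
for threshold, share in shares.items():
    print(f"share of true effects above {threshold:.2f}: {share:.1%}")
```

The same pooled mean is compatible with very different shares above a given threshold depending on `tau`, which is why a summary that reports only the pooled point estimate hides the heterogeneity at issue.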

      2.) Experimental studies do not always yield higher-quality evidence than observational studies.

      Additionally, the authors focus only on the subset of experimental studies in Hilgard’s analysis. Although I agree that “experimental studies are the best way to completely eliminate uncontrolled confounds”, it is not at all clear that experimental lab studies provide the overall strongest evidence regarding violent video games and aggression. Typical randomized studies in the video game literature consist, for example, of exposing subjects to violent video games for 30 minutes, then immediately having them complete a lab outcome measure operationalizing aggression as the amount of hot sauce a subject chooses to place on another subject’s food. It is unclear to what extent one-time exposures to video games and lab measures of “aggression” have predictive validity for real-world effects of naturalistic exposure to video games. In contrast, a well-conducted case-control study with appropriate confounding control and assessing violent video game exposure in subjects with demonstrated violent behavior versus those without might in fact provide stronger evidence for societally relevant causal effects (e.g., Rothman et al., 2008).

      3.) Effect sizes are inherently contextual.

      Regarding the interpretation of small effect sizes, we did indeed state several times in our paper that the effect sizes are “almost always quite small”. However, to universally dismiss effect sizes of less than d = 0.10 as less than “the smallest effect size of practical importance” is too hasty. Exposures, such as violent video games, that have very broad outreach can have substantial effects at the population level when aggregated across many individuals (VanderWeele et al., 2019). The authors are correct that small effect sizes are in general less robust to potential methodological biases than larger effect sizes, but to reiterate the actual claim we made in our manuscript: “Our claim is not that our re-analyses resolve these methodological problems but rather that widespread perceptions of conflict among the results of these meta-analyses—even when taken at face value without reconciling their substantial methodological differences—may in part be an artifact of statistical reporting practices in meta-analyses.” Additionally, the comparison to effect sizes for psychic phenomena does not strike me as particularly damning for the violent video game literature. The prior plausibility that psychic phenomena exist is extremely low, as the authors themselves describe, and it is surely much lower than the prior plausibility that video games might increase aggressive behavior. Extraordinary claims require extraordinary evidence, so any given effect size for psychic phenomena is much less credible than for video games.

      Signed, Maya B. Mathur, Department of Epidemiology, Harvard University

      References

      Johnson, Valen, and Ying Yuan. "Comments on ‘An exploratory test for an excess of significant findings’ by JPA Ioannidis and TA Trikalinos." Clinical Trials 4.3 (2007): 254.

      Rothman, K. J., Greenland, S., & Lash, T. L. (2008). Modern epidemiology (Vol. 3). Philadelphia: Wolters Kluwer Health/Lippincott Williams & Wilkins.

      VanderWeele, T. J., Mathur, M. B., & Chen, Y. (2019). Media portrayals and public health implications for suicide and other behaviors. JAMA Psychiatry.

    1. Introduction

      The introduction is a somewhat longer summary of the entire paper. This is where researchers describe and justify their research questions and briefly discuss what is to come. Typically, an introduction is about 500 to 1,000 words.

      Please identify and highlight a research question(s).
