    4. recipient of the Placide Nicod foundation

      something amiss here

    5. vaccine development
    6. overall mean of 8.7

      In such a skewed distribution, the mean does not have much of a meaning. The peak does, though, and the interpretation of it makes sense.
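
      A minimal sketch of this point, using made-up latency values (in years) with a long right tail. These numbers are hypothetical and are not taken from the manuscript's data; they only illustrate how far the mean can sit from the peak.

```python
from statistics import mean, median, mode

# Hypothetical latencies in years (NOT the paper's data): most values sit
# near zero, while a few very old references form a long right tail.
latencies = [0, 0, 0, 0, 1, 1, 2, 3, 15, 65]

# The mean is pulled up by the two large values; median and mode stay low.
print("mean  :", mean(latencies))
print("median:", median(latencies))
print("mode  :", mode(latencies))
```

      Reporting the peak (mode) or the median next to the mean would make the skew visible to the reader.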

    7. Table 2.

      The DOI given for Fehr and Perlman should be "10.1007/978-1-4939-2438-7_1", as correctly stated in https://zenodo.org/record/3901741/files/top_20_wiki_cited_doi_annotated_europmc.csv .

    8. WikiProject Medicine (WPM)

      WikiProject Medicine has been mentioned a number of times already, so introducing the acronym here is odd.

    9. as was during

      as was the case during

    10. month

      perhaps plural is intended here?

    11. B

      typo in "WikiHypelink"

    12. Figure S4).

      perhaps say something about the time lag between the two - spikes in pageviews often trigger spikes in editing, which then trigger activity around references

    13. COVID-19 “Wiki project”
    14. 2020

      add comma

    15. strives for an especially rigorous sourcing policy

      striving for a policy is an odd framing here, especially if that policy (MEDRS) already exists

    16. sources from

      something missing

    17. 204 references

      This number is not obvious from Fig. 2D., which would suggest something like 160. What is missing?

    18. publisher

      should be plural

    19. publisher

      should be plural

    20. D

      WHO is listed once by acronym, once by name

    21. being


    22. A

      In the figure legend, "# Citation" should be "# Citations"

    23. in function of

      should be "as a function of"; also for other figures

    24. s


    25. S3

      It is confusing that Fig. S3 is mentioned before Fig. S2.

    26. p-value < 10−15

      give precise p value

    27. productmoment

      add space

    28. there is low anti-correlation (−0.2) but highly significant


    29. Table S3

      looks like Table 3 is meant here

    30. stay both up to date

      check talk pages; consider that, especially in the early phases of the pandemic, peer-reviewed research was often not available on some specific issues, so the information would either have to be based on other sources (e.g. media reports) or not included. Such things are discussed on the talk pages of individual Wikipedia articles and of the relevant WikiProjects.

    31. peerreviewed

      add dash

    32. preferred

      perhaps the wrong word if it is a policy

    33. generally cites preprints more than it was found to on the topic of COVID-19

      A simple explanation for this is that Wikipedia has a lot of content on topics for which preprints are (or at least have traditionally been) more popular than in medicine and biology.

    34. by more 10 %

      this is weird grammar

    35. The later


    36. fraction of preprints

      the percentages given actually do not refer to the preprints, just the red slice does - this should be clarified.

    37. themsleves


    38. preprints

      arXiv has been around since 1991, and preprints had been around before arXiv

    39. percentag

      Note that it is not surprising that the open access percentage in the overall Wikipedia corpus is below that of the COVID-19 corpus: the overall corpus tends to include more references from before 2020, and the percentage of open access is growing over time.

    40. extensive

      that's probably not the best fit here

    41. scarped


    42. impact factor of over 42

      citation needed

    43. (Supplementary figure S1A)

      metadata about the supplementary files is scarce

    44. dump

      in the legend for Fig. S1C, "May" should be upper case

    45. S1B

      there seems to be some figure confusion here, along with a superfluous parenthesis

    46. Twitter and Facebook

      consider putting this in italics, similar to the journal names

    47. s


    48. Dr. Anthony Fauci

      Not sure what that degree does here - the Wikipedia article title does not contain it: https://en.wikipedia.org/wiki/Anthony_Fauci .

    49. Charles Prince of Whales

      Here, the comma is missing. It is present in the CSV used for visualization, which contributes to minor formatting problems.

    50. Together

      add comma

    51. affiliated with WikiProject COVID-19

      I think what is meant is https://en.wikipedia.org/wiki/Category:WikiProject_COVID-19_articles . A link, citation or footnote would be useful.

    52. community-created COVID-19 template

      there are actually several such templates, but I assume the one referred to here is the navigation template https://en.wikipedia.org/wiki/Template:COVID-19_pandemic . A link, citation or footnote would be useful.

    53. the Github repositories

      Please archive the repositories on Zenodo as well, e.g. as described at https://guides.github.com/activities/citable-code/ .

    54. 10.5281/zenodo.3901741

      Nothing wrong with citing this:

      Sobel, Jonathan, Benjakob, Omer, & Aviram, Rona. (2021). A meta analysis of Wikipedia's coronavirus sources during the COVID-19 pandemic (Version 0.1) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.3901741

      Alternatively, simply hyperlinking would be useful: http://doi.org/10.5281/zenodo.3901741 .

    55. retrieve any Wikipedia article and its content, both in the present - i.e article text, size, citation count and users - and in the past - i.e. timestamps, revision IDs and the text of earlier versions

      This functionality has high potential for being reused by others (including for replication, e.g. for running the numbers until May 2021), and perhaps even for being expanded on. Kudos to the authors for creating and sharing such a resource! It would be good to share a quick tutorial too (say, as a Jupyter notebook or RMarkdown file), with a complete workflow, from the MediaWiki API calls to the timeline and network visualization.
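
      As a starting point for such a tutorial, here is a rough sketch of the first step (a revision-history request to the MediaWiki Action API), written in Python rather than the authors' R. The helper names `revision_query_url` and `fetch_revisions` and the User-Agent string are my own placeholders, not part of the authors' tool chain.

```python
import json
import urllib.request
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"

def revision_query_url(title, limit=50):
    """Build a MediaWiki Action API URL for the revision history of one page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "ids|timestamp|user|size",  # revision ID, time, editor, byte size
        "rvlimit": limit,
        "format": "json",
        "formatversion": 2,
    }
    return f"{API_URL}?{urlencode(params)}"

def fetch_revisions(title, limit=50):
    """Fetch revision metadata (needs network access; not run in this sketch)."""
    request = urllib.request.Request(
        revision_query_url(title, limit),
        headers={"User-Agent": "review-example/0.1 (hypothetical contact)"},
    )
    with urllib.request.urlopen(request) as response:
        data = json.load(response)
    # With formatversion=2, "pages" is a list with one entry per requested title.
    return data["query"]["pages"][0].get("revisions", [])

url = revision_query_url("COVID-19 pandemic", limit=5)
print(url)
```

      A full tutorial would continue from here to the citation extraction and the timeline and network plots.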

    56. below

      The date of publication is often ambiguous (e.g. "online first" versus "PDF-only" versus full-text versus paper version, or actual versus scheduled publication date) or even incorrect. How should this be taken into account?

    57. in years

      a scale of years is not useful at the beginning of a pandemic

    58. such

      I suggest replacing the hashtag notation in the equation with something more formal.

      At this point, it is not clear whether "references" in equ. 1 refers to "non-DOI" references or to the "total of DOI references and non-DOI references". This information comes too late: see "Ranging from 1 to 0" in the "Scientific Score" section.

    59. A)

      Consider using a logarithmic axis for the citations.

    60. centralized nature

      A key thing here is probably the combination of decentralized activity with some central coordination (including through shared values, principles and workflows).

      In https://arxiv.org/abs/2006.08899 , this results in what they called "coherence of collaboration".

    61. using

      add "the"

    62. DOI

      should be plural

    63. release

      should be plural

    64. Moreover

      add comma

    65. Table 1.

      It is annoying to the reader that BioRxiv renders tables as bitmaps.

    66. citation count were also analysed to help gauge academic quality

      equating citation count with academic quality is problematic

    67. set of regular expressions

      What were the sources for these? Note that Wikidata also provides such regexes, e.g. at https://www.wikidata.org/w/index.php?title=Property:P957&oldid=1429624446#P1793 .
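
      For illustration, a minimal sketch of what one such expression could look like for DOIs. The pattern below follows the general shape of modern DOIs ("10.", a registrant code, a slash, a suffix); it is an assumption on my part, not the set of expressions the authors used, and it will miss unusual legacy DOIs.

```python
import re

# Illustrative DOI pattern (an assumption, not the authors' regex).
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[-._;()/:A-Z0-9]+", re.IGNORECASE)

def extract_dois(wikitext):
    """Return all DOI-like strings found in a piece of wikitext."""
    # Strip trailing punctuation that belongs to the sentence, not the DOI.
    return [match.rstrip(".,;") for match in DOI_PATTERN.findall(wikitext)]

sample = (
    "{{cite book |doi=10.1007/978-1-4939-2438-7_1 |title=Coronaviruses}} "
    "Data at doi:10.5281/zenodo.3901741."
)
print(extract_dois(sample))
```

      Documenting the source of each expression (and testing it against known identifiers, as above) would make the extraction step easier to verify.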

    68. count

      should be plural

    69. in the COVID-19 Wikipedia corpus

      add comma

    70. as well as the papers’ authors

      This still belongs to the "retrieved" clause. You probably did not retrieve "the authors, publishers [etc.]" but metadata about them, which may have unique and/ or ambiguous components.

    71. For the dump and the COVID sets, the latency was computed (to gauge how much time had passed from an article’s publication until it was cited on Wikipedia), and for all three sets we retrieved their articles’ scientific citations count (the number of times the paper was cited in scientific literature), their Altmetric score, as well as the papers’ authors, publishers, journal, source type (preprint server or peer-reviewed publication), open-access status (if relevant), title and keywords.

      This sentence is too long - please chop it up.

    72. for all three sets

      add comma

    73. Altmetric

      add link/ ref

    74. merited

      perhaps also mention that anyone can edit Wikipedia

    75. scientific citations count

      unclear how this was done. This is also relevant to Table 3.

    76. the latency was computed (to gauge how much time had passed from an article’s publication until it was cited on Wikipedia

      This way of defining the latency in the parenthesis is a bit odd. Perhaps cite ref. 8 here already, which is currently introduced later.

    77. three DOI sets

      refer to the visualization, as per the previous comment at https://hypothes.is/a/-hG2Gsf1Eeu-0ecpwAEkmA

    78. while “corpus” describes the body of Wikipedia articles, “sets” is used to describe the bibliographic information relating to academic papers (like DOIs).

      mention Fig. S1A, as it helps with understanding this. Alternatively, use a table

    79. Europmc

      EuroPMC or Europe PMC

    80. Wikipedia COVID-19 corpus - the dump from May 2020, the COVID-19 Corpus and the scientific sources from the Europmc COVID-19 search

      This is rather confusing to parse.

    81. the distribution Altmetrics score in Wikipedia COVID-19 corpus

      the distribution of Altmetrics scores in the Wikipedia COVID-19 corpus

    82. The resulting “COVID-19 corpus” comprised a total of 231 Wikipedia articles

      clarify how that number came about: explain the steps more clearly

    83. mwcite

      add link/ citation

    84. This set was

      Better: "The DOIs from this set were"

    85. retrieved

      add by when, to clarify the time horizon of the current study.

    86. COVID-19, SARS-CoV2, SARS-nCoV19 keywords

      Others have looked into search strings more comprehensively, e.g. https://doi.org/10.2196/23449

    87. EuroPMC

      the database is called "Europe PMC" or occasionally "PMC Europe". The R package EuroPMC accesses data from Europe PMC.

      Also, adding a link to or a reference about Europe PMC may be useful

    88. guaged


    89. WikipediR

      add link and/ or citation

    90. WikiCitationHistoRy

      Cite the software properly:

      • create an official release with a version number
      • archive a copy of the release on Zenodo
      • cite the DOI of the Zenodo archive, along with the version number
    91. WikiProject COVID-19

      Here, it would be useful to add a link: https://en.wikipedia.org/wiki/Wikipedia:WikiProject_COVID-19 .

      Disclosure: I am actively involved.

    92. Identifier

      this should be plural

    93. maintain high standards

      such standards have not been introduced to the manuscript yet, so this statement comes a bit out of the blue.

    94. surge in editing activity

      citation needed. The relationship between news and Wikipedia editing has been studied before, including in the context of the current pandemic - see e.g. https://arxiv.org/abs/2006.08899 and references therein.

    95. rigid sourcing policy

      The key policy here is https://en.wikipedia.org/wiki/WP:MEDRS . It is referred to in several indirect ways throughout the manuscript but gets only one explicit mention (towards the end) and no link or reference.

    96. scientificness in what we term an article’s Scientific Score

      this is problematic - see comments below

    97. coronavirus

      Perhaps clarify that the term is used in this text as essentially interchangeable with "COVID-19-related".

    98. after the pandemic broke out

      add comma

    99. add comma

    100. references

      the motivation for looking at references may not be obvious to some readers, e.g. since most news articles do not provide references.

    101. Wikipedia

      perhaps refer back to it as that “key tool for global public health promotion”, so as to make the sentence flow better.

    102. scientific topics like

      perhaps better: "emerging scientific topics like" - otherwise, it's not clear what the "like" refers to

    103. academic

      perhaps better say "preprint" here

    104. coronavirus

      this is ambiguous

    105. WHO

      Perhaps this acronym does not need to be explained in this context, but it is nevertheless good practice to explain them upon first usage.

    106. Wikipedia has over 130,000 different articles relating to health and medicine (1)

      Not sure which Wikipedia or Wikipedias are referred to here, or where that number comes from. The cited source is more specific:

      "Wikipedia had 155,805 medical articles across 255 natural languages at the end of 2013. A further 31 languages did not contain any medical articles per our methodology. Of the more than 155,000 articles, 29,072 (18.66%) were in English."

    107. on the COVID-19

      add "pandemic"

    108. of


    109. Abstract

      The PDF also has a set of keywords - not sure where to find these in the HTML version.

      They were "COVID-19 | Wikipedia | Infodemic | sources", and I think the latter is a bit too ambiguous to be useful.

    110. In future work, we hope the tools and methods developed here in regards to the first wave of the pandemic will be used to examine how these same articles fared over the entire span of 2020, as well as helping others use them for research into other topics on Wikipedia.

      I hope so too, and I might use the tool chain myself, so I would like to encourage the authors to share it more completely.

      This popup does not provide sufficient resolution to actually access the information contained in the file in any useful manner.

    112. we could not properly clean redundant entries (i.e “WHO”, “World Health Organisation”)

      these two strings would seem straightforward to map to each other, which would also enhance Fig. 2D
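
      A small sketch of such a mapping, assuming a hand-curated alias table. The table contents, the canonical spelling and the function name are my own illustration, not something taken from the authors' code.

```python
from collections import Counter

# Hypothetical alias table: lower-cased variants -> one canonical name.
ALIASES = {
    "who": "World Health Organization",
    "world health organisation": "World Health Organization",
    "world health organization": "World Health Organization",
}

def normalize_publisher(name):
    """Collapse whitespace and case, then map known aliases to one name."""
    key = " ".join(name.split()).casefold()
    return ALIASES.get(key, name.strip())

publishers = ["WHO", "World Health Organisation", "CDC", " who "]
counts = Counter(normalize_publisher(p) for p in publishers)
print(counts)
```

      Applied before plotting, this would merge the two WHO entries in Fig. 2D into one.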

    113. with published paper

      add "the"

    114. articles

      drop the s

    115. announced

      add link

    116. dual usage of established science and a community of volunteers

      don't forget the shared values and open infrastructure

    117. Wikipedia


    118. ntegrated into these articles in the near future

      There is also https://en.wikipedia.org/wiki/Wikipedia's_response_to_the_2019%E2%80%9320_coronavirus_pandemic , which may well end up incorporating some of the materials from the present manuscript.

    119. whether this dynamic changed as 2020 progressed

      If the workflows described here in the paper were shared more comprehensively, it should be relatively straightforward to rerun the analysis to include times after the period considered here.

    120. decrees


    121. ,


    122. decision by academic publications’ like Nature and Science to lift paywall and open public access

      This blurs the meaning of the term "open access", which does not include temporary lifts of paywalls.

    123. openaccess

      add dash

    124. rigorously implemented across thousands of articles

      perhaps clarify that most of those have never been "locked" in any way.

    125. special status and preference

      "prominence" is perhaps a better term in this context.

    126. (20).

      The URL in this link is broken.

    127. prevents anonymous editors


    128. MEDRS

      This is the first concrete mention of MEDRS, even though it was insinuated, alluded to or otherwise invoked on several occasions above.

      The policy should thus be introduced earlier, and with a link: http://en.wikipedia.org/wiki/WP:MEDRS .

    129. outbreak

      better: "course" or similar

    130. was


    131. Chinarelated

      add dash

    132. for


    133. supplementary data (3)

      Not clear what the revision ID refers to. Also, CSV has some formatting issues.

    134. we observed six prominent Wikipedia articles emerge in this network

      It's not clear how the prominence of these six (and their number) was determined - both from the enlarged part in Fig. 4 and from the inset of the full graph, other numbers could well be a reasonable choice.

    135. driving millions to the article and subsequent ones like those in our network.

      citation needed

    136. special banner located on the top of every single article in English

      The banner was about a message from the Wikimedia Foundation.

    137. placed on the English Wikipedia’s homepage

      There is a process by which Wikipedia articles pertaining to current news can be linked from the Wikipedia homepage, and this process came into play here.

    138. here

      link is missing

    139. is


    140. supplementary data (2))

      the weight column in the CSV is not explained

    141. indicating a decrease in scientificness over time.

      This needs polishing. While correct in terms of the definition of "scientificness" given in equ. 1, this phrase leaves too much room for misinterpretation.

    142. article


    143. article


    144. happen


    1. import pandas as pd

      Apparently, Hypothesis cannot annotate the In [8]: text, so I am using this import command to comment on the cell numbering.

      For sharing executed Jupyter notebooks, it is important to do at least one of the following:

      • keep a full version history of the states of each cell (e.g. via ProvBook)
      • Before sharing, clear all outputs, restart the kernel and rerun all cells

      Neither was done here, so it is theoretically possible that some previous state of one of the cells could influence the calculations shown in the notebook.

      I do not think this problem occurred here, but such occurrences are hard to review, so it is best to follow the practices outlined above.
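
      One way to check for this before sharing is sketched below, assuming the standard notebook JSON layout (a top-level "cells" list whose code cells carry an "execution_count"). The function names are my own; the check only confirms that the counts run 1, 2, 3, ... from top to bottom, which is what a fresh restart-and-run-all produces.

```python
import json

def execution_counts(notebook):
    """Collect the execution counts of all code cells, in document order."""
    return [
        cell.get("execution_count")
        for cell in notebook.get("cells", [])
        if cell.get("cell_type") == "code"
    ]

def ran_top_to_bottom(notebook):
    """True if the executed code cells are numbered 1..n in document order."""
    counts = [c for c in execution_counts(notebook) if c is not None]
    return counts == list(range(1, len(counts) + 1))

# Hypothetical notebook whose first code cell shows In [8], as in this case.
stale = {"cells": [
    {"cell_type": "code", "execution_count": 8, "source": "import pandas as pd"},
    {"cell_type": "markdown", "source": "Notes"},
    {"cell_type": "code", "execution_count": 9, "source": "df.head()"},
]}
print(ran_top_to_bottom(stale))

# For a file on disk: ran_top_to_bottom(json.load(open("analysis.ipynb")))
```

      The file name in the last comment is a placeholder.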

    2. 0 0.000011 0.000010 0.000011 owens_lake_T8-W_P1 1.0 1 0.000018 0.000017 0.000019 owens_lake_T8-W_P1 2.0 2 0.000005 0.000004 0.000005 owens_lake_T8-W_P1 3.0

      Compared to the version I ran, the sauter_diameter dataframe had a different order.

      It would probably be advisable to avoid that by defining the order in some way.
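
      For instance, an explicit sort before display would make the order reproducible. The sketch below uses the values from the quoted output, but the column names ("sample", "position", "sauter_diameter") are guesses on my part, since the notebook's actual names are not visible in the excerpt.

```python
import pandas as pd

# Rows deliberately shuffled; the column names are hypothetical.
sauter_diameter = pd.DataFrame({
    "sauter_diameter": [0.000005, 0.000011, 0.000018],
    "sample": ["owens_lake_T8-W_P1"] * 3,
    "position": [3.0, 1.0, 2.0],
})

# Sorting on explicit keys fixes the row order regardless of how the
# frame was assembled; reset_index gives a clean 0..n-1 index afterwards.
ordered = (
    sauter_diameter
    .sort_values(["sample", "position"])
    .reset_index(drop=True)
)
print(ordered)
```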

    1. Break. Scientists for Future is closed from 18 December up to and including 6 January. From 14 to 18 December, only part of the team is working. We wish everyone a calm and pleasant time over the holidays and all the best for 2020!

      Time for an update

    1. Policymakers in Guinea

      Given that the Case studies were broken down by country, I would have expected at least the first paragraph of this section to bind the Case studies together from a policy diffusion/ policy learning perspective (e.g. in terms of more or less strenuous conditions during outbreaks or 'peacetime'), before zooming back in to highlight certain aspects from the Case studies.

    2. Similarly

      Odd way of referring to a statement not in the previous sentence but in the sentence before that.

    3. s

      delete the "s"

    4. -

      Not sure what that dash is doing here.

    5. SOP

      I assume this stands for "Standard Operating Procedures", but it was not defined as such.

    6. broaden the definition of a ‘researcher’ to include a molecular biologist and basic science researcher, and to widen the scope of research ethics

      In order to adapt to new contexts, policy diffusion often triggers such semantic drift of key concepts.

      Would be great to see that linked to the policy learning framework.

    7. Ideally, international research collaboration that involves the sharing of biological materials and data should contribute to capacity building, which includes the capability of an ethics committee to support ethically sound arrangements that engender credibility and trust22.

      Perhaps some comments as to how the international guidelines cover local training and capacity building would be a useful addition.

    8. CIOMS Guidelines serve as a helpful reference in the drafting of a new regulation

      Good example of policy diffusion

    9. In international collaborations, an agreement may be imposed on local researchers with no possibility of negotiating favourable terms on confidentiality, intellectual property rights, return of results and benefit sharing.

      Clear example of equity being neglected

    10. appropriately drafted material transfer agreement (MTA), as there is currently no legal requirement to that effect

      Are there templates available for "appropriately drafted MTAs"?

    11. based on the recommendations and standards set out by international organisations like the World Medical Association and CIOMS

      Reference to policy diffusion

    12. Regarding recommended practices in international ethical policy documents, these are not sufficiently disseminated or internalized, hence gaps still exist in relation to best practices and critical aspects of data practices. To address this challenge, it is not only essential to disseminate and promote these policies, but to also adapt them to the contexts and situations where they are applicable through training and capacity building.

      Given that the article is framed as being about policy diffusion and using a policy learning framework, I would have expected more details here.

    13. Urgency Operation Center

      This seems to exist already, but still I could not find it.

    14. Coordination Cell Unit

      Does this exist already? I could not find it.

    15. greater integration of data, data security, and data sharing through the establishment of a searchable database.

      Would be great to connect these efforts with others who work on this from the data end, e.g. RDA as mentioned above.

      Also, the presentation at http://www.gfbr.global/wp-content/uploads/2018/12/PG4-Alpha-Ahmadou-Diallo.pptx states

      This data will be made available to the public and to scientific and humanitarian health communities to disseminate knowledge about the disease, support the expansion of research in West Africa, and improve patient care and future response to an outbreak.

      but the notion of public access is not clearly articulated in the present article.

    16. 12

      This is a nice conference report, but are there any slides or recordings available from the event as well?

    17. platform

      Does it have a name and online presence? The details provided here go beyond what's given in reference 13, but some more detail would still be useful, e.g. to connect the initiative to efforts directed at data management and curation more generally, for instance in the framework of the Research Data Alliance, https://www.rd-alliance.org/ .

    18. establishment of a post-Ebola crisis biological materials and data-sharing platform

      Very useful initiative!

    19. Key ethical goals of an integrated platform to access biological samples and related data have early-on in the discussions been identified as protection of human rights and transparency, equitable service delivery and reduction of the information gap within the scientific and medical communities.

      Odd word order

    20. The members of the Steering Committee sit for a three-year term, and is renewable once.

      The sentence is ungrammatical.

    21. apply

      Some more details would be useful here, perhaps in a dedicated Methods section.

    22. four theories

      What about at least mentioning the other three?

    23. Manjulika Vaz1*

      ORCID is missing

    24. meeting of the Global Forum in Stellenbosch, South Africa, on 13 and 14 November 2018

      I followed this remotely via Twitter (don't know whether other remote channels were available). My notes on this sit at https://github.com/Daniel-Mietchen/events/issues/508 .

    25. In three of these countries (i.e. Guinea, Argentina and India), the CIOMS Guidelines have had direct influence over their domestic governance policies on the subject. Its impact was greatest for Guinea and Argentina, whose governance policies had to be adapted in response to the Ebola virus epidemic in West Africa and the Zika virus epidemic in Latin America. In both countries, sharing of biological materials and related data with international organisations increased significantly to meet therapeutic and research needs during the outbreaks. International organisations have had a comparatively greater role in bringing about policy change in Guinea when compared with Argentina, mainly due to the fragility of the health system in Guinea in 2014. In contrast, policy in India and in Malawi occurred under less strenuous conditions. This may account for the relatively greater emphasis on control and limits to cross-border transferability in their policies when compared with those of Guinea and Argentina.

      I would have expected the Background section to set the stage for the case studies, not to sum up their results.

    26. 1

      Given that this document cites a number of non-persistent web resources, I have archived a copy of https://wellcomeopenresearch.org/articles/4-170/v1 at http://web.archive.org/web/20191224000829/https://wellcomeopenresearch.org/articles/4-170/v1 using the "Save outlinks" mode.

      Probably a good idea to do this routinely for all articles in the journal.

    27. https://apps.who.int/iris/bitstream/handle/10665/205944/WHO_HIS_SDS_2016.2_eng.pdf;jsessionid=A4CF65ABC4B7A3FF8C19502C4EF9905F?sequence=1.
    28. jsessionid=A4CF65ABC4B7A3FF8C19502C4EF9905F?sequence=1.

      Not sure what the purpose of quoting with session ID is here, as these are usually non-persistent and not usable by anyone other than the website operator.
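
      Such session parameters can be removed mechanically before citing. A minimal sketch follows, assuming the session ID appears as a ";jsessionid=..." path parameter as in the URL quoted above; the function name is my own.

```python
import re

def strip_session_id(url):
    """Remove a ';jsessionid=...' path parameter, keeping query and fragment."""
    return re.sub(r";jsessionid=[^?#]*", "", url, flags=re.IGNORECASE)

example = (
    "https://apps.who.int/iris/bitstream/handle/10665/205944/"
    "WHO_HIS_SDS_2016.2_eng.pdf;jsessionid=A4CF65ABC4B7A3FF8C19502C4EF9905F?sequence=1"
)
cleaned = strip_session_id(example)
print(cleaned)
```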

    29. explain as follows

      It could be made clearer that the following is quoted directly from the CIOMS document, and specifically the Governance section of the Commentary on Guideline 11.

    30. CIOMS Guidelines

      It could be made clearer that this refers to the document cited in Footnote 1.

    31. https://www.who.int/blueprint/what/research-development/guidance_for_managing_ethical_issues.pdf?ua=1.
    32. Available at: https://cioms.ch/wp-content/uploads/2017/01/WEB-CIOMS-EthicalGuidelines.pdf.

      This URL seems very unstable, so I archived the file at http://web.archive.org/web/20191223233751/https://cioms.ch/wp-content/uploads/2017/01/WEB-CIOMS-EthicalGuidelines.pdf .

      In general, it is good practice to provide not just links but also an archived version when citing a URL.

      Of course, it would be even better if policies themselves were FAIR (Findable, Accessible, Interoperable and Reusable), as discussed, for instance, in https://github.com/Daniel-Mietchen/events/blob/master/PIDapalooza-2018.md .

    33. Athula Sumathipala
    34. St John's Medical College and St John's Research Institute

      As far as I can tell, the two are (nowadays at least) separate entities, so why are they listed under the same affiliation?

    35. Alpha A. Diallo https://orcid.org/0000-0001-5149-24454*

      That ORCID profile has this paper as its only entry. Is this correct?

    36. Ana G. Palmero https://orcid.org/0000-0002-5781-90712*

      That ORCID profile has this paper as its only entry. Is this correct?

    37. OPP1151904

      It would be nice if that identifier led somewhere useful. A web search for it yielded https://doi.org/10.12688/wellcomeopenres.15442.1 , which is also included in the collection "GFBR: The ethics of data sharing and biobanking in health research" available via https://wellcomeopenresearch.org/collections/gfbr18 .

    1. These findings show that WDR60 mutations can cause skeletal ciliopathies and suggest a role for WDR60 in ciliogenesis.

      This is referenced in the Wikidata entry about WDR60 at https://www.wikidata.org/w/index.php?title=Q21124736&oldid=619612847#P682 , which states that WDR60 (Q21124736) is involved in the biological process (P682) of embryonic skeletal system morphogenesis (Q14886895).

    1. 10.

      I'm missing something about making policies themselves more FAIR, which we had included in an earlier version of these draft principles.

    1. I left a number of annotations here as part of my review of the paper.

      The review is accessible via https://doi.org/10.21956/wellcomeopenres.13272.r25804 , and the notes I took on the process can be found at https://github.com/Daniel-Mietchen/ideas/issues/494 .

    2. and


    3. The ethics committee will pass those requests it considered reasonable to the corresponding author for execution.

      That's not a long-term solution - what if someone comes along a few years down the line, or 20?

    4. the documents were coded

      looks like a missing "that"

    5. .tab

      i.e. tsv

    6. in the public domain

      not in the copyright sense