10,000 Matching Annotations
  1. Apr 2023
    1. In 2016, Microsoft launched a Twitter bot that was intended to learn to speak from other Twitter users and have conversations. Twitter users quickly started tweeting racist comments at Tay, which Tay learned from and started tweeting out within one day.

      Is there any way that a bot can go uncorrupted? Or does the code have to be edited manually by the people behind it so that it can continue on with new functions to avoid being corrupted?

    2. It seems nefarious that a bot like this was created without programming some sort of hate detection into its code. Is it hard to regulate bot hate speech? With the rise of AI, this seems like a prevalent conversation to be having.

    3. Most social media platforms provide an official way to connect a bot to their platform (called an Application Programming Interface, or API). This lets the social media platform track these registered bots and provide certain capabilities and limits to the bots (like a rate limit on how often the bot can post).

      Is that talking about the "claiming you are not a robot" part every time I try to log in to my account? Like "choosing the pictures below related to the traffic lights" or sending a verification code to your phone. Now I think it is a necessary part to protect the already fragile internet environment.
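      The rate limit mentioned in the quoted passage can be pictured with a small sketch. This is only an illustration under stated assumptions: the class name `PostRateLimiter`, the limit of a few posts per time window, and the sliding-window approach are all hypothetical, and real platforms define and enforce their own limits on their servers.

```python
import time
from collections import deque


class PostRateLimiter:
    """Hypothetical sliding-window limiter: at most max_posts per window_seconds."""

    def __init__(self, max_posts, window_seconds, clock=time.monotonic):
        self.max_posts = max_posts
        self.window_seconds = window_seconds
        self.clock = clock          # injectable so the limiter can be tested
        self.timestamps = deque()   # times of recently accepted posts

    def allow(self):
        """Return True and record the post if it fits within the limit."""
        now = self.clock()
        # Forget posts that have fallen out of the current window.
        while self.timestamps and now - self.timestamps[0] >= self.window_seconds:
            self.timestamps.popleft()
        if len(self.timestamps) < self.max_posts:
            self.timestamps.append(now)
            return True
        return False
```

      A bot would call `allow()` before each post and wait when it returns `False`. This is separate from the "are you a robot" checks at login, which try to detect unregistered automation rather than limit registered bots.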

    1. In a cooking recipe, the equivalent would be spaces, containers, bowls, or cups to hold ingredients. So you might place the ingredients on the counter in preparation for cooking. Or you might combine some ingredients in a mixing bowl, so the mixing bowl holds the combined ingredients through each step, like:

      As a person who has never written a single line of code before, I really appreciate that the textbook is written this way to give me a clear introduction to how coding works. It really helped illustrate the concepts for me.

    2. Pseudocode is intended to be easier to read and write.

      I feel like pseudocode is very helpful for those who are just starting to learn coding. The structure and language of actual code can be confusing at first, but it becomes easier once you understand the logic.

    3. Pseudocode is often used by programmers to plan how they want their programs to work, and once the programmer is somewhat confident in their pseudocode, they will then try to write it in actual programming language code.

      This connects to the analogy about language translation we learned about in class. By breaking language into pieces we can understand, it becomes easier to grasp as a concept. This has for sure helped my understanding.

    4. Pseudocode is intended to be easier to read and write. Pseudocode is often used by programmers to plan how they want their programs to work, and once the programmer is somewhat confident in their pseudocode, they will then try to write it in actual programming language code.

      That is true! I took CSE142 before, and pseudocode is a useful tool for programmers to plan how the code should work. It really is an easier version to read, and it helps people build a solid foundation in programming.

    5. - Whenever someone tags me in a post - like their post which has me tagged

      That's true. This kind of program is similar to what I learned in CSE 121. In Java, a sentence a human would say is divided into "system slices" that follow the rules of the programming language. Also, since the whole sentence is too long for the computer, it is split into several lines, which are easier for the computer to recognize and run.
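      The quoted pseudocode line ("whenever someone tags me in a post, like their post") can be turned into actual code roughly as follows. This is a sketch, not the textbook's program: the post format (dictionaries with `id` and `tags` keys) and the function name are assumptions made for illustration, and a real bot would call the platform's API to like a post instead of collecting ids.

```python
def like_posts_tagging_me(posts, my_handle):
    """Return the ids of posts that tag my_handle (a stand-in for liking them)."""
    liked = []
    for post in posts:                 # "whenever someone tags me in a post"
        if my_handle in post["tags"]:
            liked.append(post["id"])   # "like their post which has me tagged"
    return liked


# Hypothetical example data
example_posts = [
    {"id": 1, "tags": ["@alice", "@me"]},
    {"id": 2, "tags": ["@bob"]},
    {"id": 3, "tags": ["@me"]},
]
print(like_posts_tagging_me(example_posts, "@me"))
```

      Each pseudocode phrase maps onto one line: the loop covers "whenever", the `if` covers "tags me", and the body covers "like their post".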

    1. So, for example, if we made a form that someone needed to enter their address, we could assume everyone is in the United States and not have any country selection.

      What's interesting is that the internet is very US-centric, probably because the biggest internet companies were founded in the US. But online forms pigeonhole the rest of the world into US-centric limitations, for example addresses that assume City, State, Zip Code even if the user is from a country where states aren't a thing.

    1. Bots present a similar disconnect between intentions and actions. Bot programs are written by one or more people, potentially all with different intentions, and they are run by other people, or sometimes scheduled by people to be run by computers. This means we can analyze the ethics of the action of the bot, as well as the intentions of the various people involved, though those all might be disconnected.

      The ethical considerations surrounding bots are very complex, as they involve multiple layers of responsibility and intention. Analyzing the intentions of the humans who code a bot's program, or even of the bot's users, highlights the need for a comprehensive understanding of the ethical implications of automation.

    2. Bots present a similar disconnect between intentions and actions. Bot programs are written by one or more people, potentially all with different intentions, and they are run by other people, or sometimes scheduled by people to be run by computers.

      Bots themselves are programs made up of code with no self-awareness, but the people who create and write them often do so with different personal intentions. Even if the creators themselves are relatively impartial, it is still difficult to ensure that those people who use these bots do not have specific purposes. Therefore, it is necessary and useful to analyze the ethics of robots by analyzing the behaviors and intentions of those people involved.

    1. We will not consider those to be bots

      This is very interesting. I always thought that any automated response system was referred to as a bot account, so it is curious to learn that these are actually humans. In this case there is really no difference between being code and being a human paid to post these things.

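    1. Old Daniel looked back on his thirty-plus years of programming and found that, when modifying code, good code gives you a very pleasant feeling. You can easily find the place that needs to change, and the code there is so easy to understand that you can see at a glance what it does. You can make the change with confidence, sure that it will not introduce extra side effects. The code is so alive that it actively guides you to where you want to go and warmly welcomes you to look around, just as if you were in your own familiar home!

      Evaluation dimensions

      Common criteria

      1. Maintainability 🚩
        • A newcomer can quickly maintain it;
      2. Readability 🚩

        • Simply put, the impression the code makes in code review
        • Any fool can write code that a computer can understand. Good programmers write code that humans can understand.
      3. Extensibility 🚩

        • The code's ability to cope with future changes in requirements
      4. Flexibility
        • Broadly: easy to extend, easy to reuse, or easy to use
      5. Simplicity
        • Think deeply, act simply
      6. Reusability
        • Language features, principles, and techniques all serve this goal
      7. Testability
        • Unit tests are easy to write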
    1. // this field will contain additional metadata pertaining to success/failure. eg: for success it can be opened, clicked, etc.

      we can have this code comment next to the DeliveryTime parameter.

    1. pre

      By going back to these original “black box” schematics of the human mind, she shows why information was initially separated from meaning, why thought was translated as an electronic signal or code, and patterning or predictability became the reason for being of communication. (p. 25) Although all these changes were brought about by pragmatic decisions made toward abstracting a logical model of human thought, each had far-reaching cultural consequences as the emerging technologies spread through society.

    1. After struggling with this problem for a while and still being far from solving this issue, I realized that I was making too many requests to the website, which made me come up with the idea of saving all the pages I needed to scrape on my local computer. Next, I started sending requests to these local HTML files instead and kept adapting my code.

      I had a similar problem with this.
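      The approach described in the quoted passage (download each page once, then work from local copies) might look something like the sketch below. It is not the original author's code: the folder name `html_cache`, the function names, and the choice to name files by a hash of the URL are all assumptions for illustration.

```python
import hashlib
from pathlib import Path
from urllib.request import urlopen

CACHE_DIR = Path("html_cache")  # hypothetical local folder for saved pages


def fetch_from_web(url):
    """Download the raw bytes of a page (one real request to the website)."""
    with urlopen(url) as response:
        return response.read()


def get_page(url, cache_dir=CACHE_DIR, fetch=fetch_from_web):
    """Return the page for url, hitting the website only on the first call."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    # A hash of the URL gives a safe, unique file name.
    file_name = hashlib.sha256(url.encode("utf-8")).hexdigest() + ".html"
    cache_file = cache_dir / file_name
    if cache_file.exists():
        return cache_file.read_bytes()   # reuse the local copy
    html = fetch(url)
    cache_file.write_bytes(html)         # save for later runs
    return html
```

      With this in place, the parsing code can be rerun and adjusted as often as needed without sending further requests to the site.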

    1. . I define these domains of professional competence in Chapter 11 of Foundations, specifically on pages 341-351. I did this for you. You can use the description of the required “Domains of Professional Competence” for the pathology of an attachment-based model of “parental alienation” (i.e., attachment-trauma reenactment pathology mediated by narcissistic/borderline personality pathology) to establish the boundaries of professional competence required under Standard 2.01 (and 9.01) of the Ethical Principles of Psychologists and Code of Conduct of the American Psychological Association.
    1. I tried my best to persuade Apple to delay it, but I only got still-fairly-vague wording around it being likely to ship as it was.

      Huh? Why? Why even waste the time? Just go fix your code.

    2. Firstly my understanding of the purpose of specs was to preserve web compatibility - indeed the HTML Design Principles say Support Existing Content. For example when the new Array flatten method name was found to break websites, the spec was changed to rename it to flat so it didn't break things. That demonstrates how the spec reflects the reality of the web, rather than being a justification to break it. So my preferred solution here would be to update the spec to state that HTML canvas and OffscreenCanvas should support the same contexts. It avoids the web compatibility problem we faced (and possibly faced by others), and also seems more consistent anyway. Safari should then delay shipping OffscreenCanvas until it supported WebGL, and then all the affected web content keeps working.

      This is a huge reach.

      Although it's debatable whether having mismatched support is a good idea for a vendor, arguing that it breaks the commitment to compatibility is off. Construct broke not because something was removed, but because something was added and your code did not handle that well.

    1. How easy is it for users to accomplish basic tasks the first time they encounter the design?

      Data Camp, a programming application meant for users of all experience levels, is great at this specifically. It has recently expanded into a mobile app as well, which I think has only increased learnability. Allowing students to code on their phones opens up room for more learning, because users are comfortable with the tech involved.

    1. self-intersections can occur.

      What happens if this occurs? Currently the code detects this and stops in a controlled manner so that the user can repeat the analyses with a smaller value.

    1. l code-davinci-002 and our student model FlanT5 use different tokenizers, we address the tokenizer alignment problem by dynamic programming

      what ?
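      To give the question a rough answer: two tokenizers cut the same text into different pieces, so the token sequences have to be lined up before they can be compared. The quoted passage does not spell out the procedure, so the sketch below is NOT the authors' method. It is a generic minimum-edit-distance alignment, one common dynamic programming technique, and the function name and example tokens are hypothetical.

```python
def align_tokens(a, b):
    """Return (i, j) index pairs of identical tokens matched in a
    minimum-edit-distance alignment of token lists a and b."""
    n, m = len(a), len(b)
    # cost[i][j] = edit distance between a[:i] and b[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if a[i - 1] == b[j - 1] else 1
            cost[i][j] = min(
                cost[i - 1][j] + 1,        # skip a token of a
                cost[i][j - 1] + 1,        # skip a token of b
                cost[i - 1][j - 1] + sub,  # match or substitute
            )
    # Walk back from the end to recover which tokens were matched.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        sub = 0 if a[i - 1] == b[j - 1] else 1
        if cost[i][j] == cost[i - 1][j - 1] + sub:
            if sub == 0:
                pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif cost[i][j] == cost[i - 1][j] + 1:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs
```

      For example, aligning `["un", "believ", "able", "!"]` with `["unbeliev", "able", "!"]` pairs up the shared tokens `"able"` and `"!"` and leaves the differently split prefix unmatched.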

    1. time_stamp: (required) NX_DATE_TIME ISO 8601 formatted time code with local time zone offset to UTC information included when this results file was created.

      This is a bug in this appdef: paraprobe-toolbox==0.4 writes out start_time and end_time, as for all other NXapm_paraprobe_results_* appdefs. These timestamps are again ISO 8601 compliant.
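      For reference, an ISO 8601 timestamp "with local time zone offset to UTC", as the quoted field description requires, can be produced with Python's standard library as sketched below. The helper names are hypothetical, and this does not show how paraprobe-toolbox itself writes its timestamps.

```python
from datetime import datetime, timedelta, timezone


def iso8601_local_now():
    """Current local time as ISO 8601 text, including the UTC offset."""
    # astimezone() with no argument attaches the system's local time zone.
    return datetime.now().astimezone().isoformat(timespec="seconds")


def iso8601_with_offset(naive_dt, offset_hours):
    """Format a naive datetime as ISO 8601 with an explicit UTC offset."""
    tz = timezone(timedelta(hours=offset_hours))
    return naive_dt.replace(tzinfo=tz).isoformat(timespec="seconds")


print(iso8601_with_offset(datetime(2023, 4, 1, 12, 30, 0), 2))
# prints 2023-04-01T12:30:00+02:00
```

      The trailing `+02:00` is the offset to UTC; a timestamp without it is ambiguous about which time zone was meant.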

    1. Look for alternatives 6. Is there another tool/app that will do the same job that is more protective of your students’ privacy? For example, Pencil Code, a coding tool, does not collect or allow the sharing of any personally identifiable information. Tools funded by external sources (e.g., grants) may not collect personally identifiable data because they are not expecting a return on investment.

      This does not seem feasible, at least in my case. It is unfortunate, but there is so much to consider when choosing tools and resources. With the rise of the cheating tools in math, I am always looking for different question types that will steer students in the direction of doing their own work. It is also the case that while there are some websites that are rather comprehensive in their array of questions, I still find the need to rely on more than just one to cover the required standards as best I can.

      Now as far as how protective tools/apps are of student privacy I have not yet found any that can do the same job or do it as well. What is available that meets the privacy criteria is better served for review or intervention purposes rather than for a main assignment.

      For this reason, I hope that MOOC and OER continue to develop at a rapid pace in terms of math so that students have access to quality tools that are not built on the preying or collection of personally identifiable information.

    1. Put another way, you need a sidecar database. The data moat needs to be fast and queryable. This is a Search Problem!

      Ummm, can't you guys separate the codebase and have different instances of the LLM talk to one another, for example by having a frontend part and a backend part of the code base? Or with Express handling routing, auth, URL parsing and the like?

    2. In an ideal world, you’d just pass your entire code base in with each query. In fact, Jay Hack just tweeted a graph showing how the latest context window size in GPT-4 compares to some popular code bases:

      Oh shit popular code bases actually fit into GPT4. Imagine if we wrote nice and concise programs that followed the unix philosophy!?!?!?!?!

    3. You talk to LLMs by sending them an action or query, plus some relevant context. So for instance, if you want it to write a unit test for a function, then you need to pass along that whole function, along with any other relevant code (e.g. test-fixture code) so that it gets the test right.

      Can we tell ChatGPT to write Action Queries? This thing has agency hidden inside it, I can feel it. ChatGPT does feel like an AI from Neuromancer though, and not the cool ones.

    4. If you can build something as big as Amazon Web Services with a stack based on a simple service call, or whole social networks and customer service suites based on simple browser-to-browser communication, or a robust way of delivering and managing software based on a little process isolation code, then just imagine how big a thing you could build – bear with me here – if you had the goddamn Singularity as your starting point?

      Phrasing

    1. Specialty: Qualified Residential Treatment Program (QRTP) Specialty Code: 689 Enrollment Type: Facility. Each service location must complete a separate application. A QRTP cannot be enrolled as a Residential Child Care Facility (RCCF, provider type 52) at the same time. Required Attachments: Department of Human Services, Office of Early Childhood, Division of Early Care and Learning, Time Limited Child Care License indicating Service Type as: Qualified Residential Treatment Program. Accreditation by the Joint Commission, the Commission on Accreditation of Rehabilitation Facilities, or Council on Accreditation of Services for Families and Children. Attestation Form for Facilities Enrolling with Health First Colorado (RCCF & QRTP), which can be found on the Provider Forms web page under the Provider Enrollment & Update Forms drop-down, must be completed and attached for in state only. W9 (signed and dated within the last 6 months). Voided business check (no temporary checks or deposit slips) or bank letter (dated within the last 6 months). Malpractice/Liability insurance information must be entered in the application; however, proof of insurance is not a required attachment.

      QRTP licensed facilities Colorado

  2. localhost:3000 localhost:3000
    1. Hacking is a bit more problematic. It suggests a more free-form approach to programming, such as ignoring guidelines on how to structure code to be readable to others. Moreover, the term also encompasses the exploitation of computer technology for committing crimes. So it is probably best to avoid using it to refer to the practice of writing programs.

      So true! Don't hack.

  3. Mar 2023
    1. as a preview of what might happen with a full accounting of uncertainty. Code, data, and modified workbook are available.

      This seems to be done in the file sensitivity analysis.R, pulling parameters from the linked Gsheet

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC- 2023-01819

      Corresponding author(s): Gernot Längst and Harald Wodrich

      Full revision of the manuscript

      1. General Statements [optional]

      This section is optional. Insert here any general statements you wish to make about the goal of the study or about the reviews.

      2. Point-by-point description of the revisions

      Dear Reviewers, thank you very much for your appreciation of our study and your input. In this point-by-point response, amendments to our text are marked in blue colour.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The authors have addressed the nucleoprotein structure of human adenovirus during the very early stages of infection, and its relationship to onset of expression of viral genes, using a combination of RNA-seq, MNase-seq, ChIP-seq and single genome imaging. They show that in the virion and the newly-infecting DNA, protein VII is precisely positioned at specific sites on the viral DNA, with greater accessibility at early gene promoters compared to other regions. Nucleosomes containing H3.3 replace specific protein VII at distinct positions at the transcription start sites of genes, which are then acetylated. Association with histones and nucleosomes occurs prior to transcription. These studies confirm and greatly expand on results already in the literature, and also elucidate a novel role for protein VII in orchestrating positioning of nucleosomes prior to initiation of transcription.

      The authors provide excellent data in support of their conclusions and, in many instances, use alternative experiments (i.e. two different approaches) to support their claims. The details of methods are adequate (with small exceptions outlined below) and statistical methods appropriate.

      Minor comments:

      Line 561 "Protein VII molecules were exchanged for positioned nucleosomes at the +1 site of actively transcribed genes". This statement seems to suggest that the +1 position almost acts as a nucleating site, where replacement of a single, specific protein VII molecule at +1 is an initiating event, which then spreads from that site and into the rest of the gene. Data shown in Figure 6G and 6H shows that H3.3 appears to be found equally along the full length of E1A as early as 1 hr post infection (with no real "enhancement" at the +1 position), and that the overall levels simply increase over the next 4 hrs.

      As the reviewer pointed out, the histone ChIP-seq peaks are broader than the +1 nucleosome region, extending into the transcribed regions of the gene. This is expected, as the mean length of the immunoprecipitated DNA is about 400bp. Still, ChIP-seq peaks are in proximity to the transcription start site and overlap with the position of the +1 nucleosome. As we do not have the required resolution, we toned down our statement. The text now reads as follows: “Protein VII molecules were exchanged for nucleosomes downstream of the transcription start site, overlapping the +1 nucleosome site, of actively transcribed genes” (line 568 ff).

      Curiously, the authors chose not to use a wildtype virus for their studies - the virus contains a deletion in the E3 region. For clarity, I suggest that the authors should preferentially use an alternative designation for their virus rather than HAd-C5. Perhaps HAd-C5delE3 to differentiate this work from studies that truly use wildtype virus.

      As requested by the reviewer we have updated the nomenclature to HAd-C5dE3 throughout the text and the figures.

      The obvious limitation of the studies using the fluorescent TAF1-beta to label Ad genomes is that as protein VII is replaced by nucleosomes, the genomes would have declining detection by this method. Genomes devoid of protein VII would be "invisible".

      Our MNase data show that within the first 4h only a fraction of pVII is removed from the viral genome e.g. at early genes, while most of the genome remains bound by protein VII. This should provide enough binding sites for TAF1-beta to label Ad genomes without a significant drop in the signal. Furthermore, our recent work (PMID:29997215, Fig. 1D) compared the TAF1-beta labelling system with a second in vivo detection system (AnchOR3) that directly labels the viral DNA independently of protein VII in the same cells. This direct comparison of two technically non-related methods to detect individual incoming adenoviral genomes in living cells showed the equivalence of both methods, at least for the first hours of infection showing that partial removal of protein VII does not affect the fluorescent TAF1-beta staining.

      Line 275 "Interestingly, a central region of the viral genome (Late3) and a region between the E3 and E4 genes exhibited almost no peaks" for protein VII. The virus utilized in this study lacked at least part of the E3 region. Did this deletion "cause" this region to be devoid of protein VII? Is the same absence of protein VII peaks observed in a fully wildtype virus? Also, can the authors provide any speculation as to why the Late3 region also lacks protein VII?

      We confirm the reviewer's observation. The region marked as Late3 and the region between E3 and E4 are present in the genome and are, as the reviewer observed, not chromatinized in our analysis. At this point, we can only speculate. We have two hypotheses that are not mutually exclusive. First, both regions could be involved in the proper packaging of the viral genome into the capsid. Physical constraints during packaging may preclude this region from being packaged into pVII. Second, as we observed that pVII positioning correlates with distinct DNA sequence patterns (revised Fig. 4D and E, see response to reviewer 3 for details), it might be that the sequence composition at the pVII-depleted regions disfavours pVII assembly to keep those regions available for cellular factors that drive processes post genome delivery, such as transcription. Our time-resolved MNase analysis shows that indeed post genome delivery, this site in the Late3 region becomes protected (Fig. 5C), suggesting the binding of one or more cellular factors. As shown in Figure S6, we find conserved binding sites for several transcription factors at this MNase-protected site.

      Whether the regions devoid of chromatinization would shift in position, remain in place or be chromatinized in a wildtype virus has to be addressed in the future and cannot be answered at this point. To address the comment, we have expanded the discussion (line 620 ff).

      Line 569 "Reasons could be that the few genomes undergoing nucleosome assembly and active transcription produce the replication enzymes, whereas the bulk of genomes enters replication without activation as an elegant way to avoid repeated chromatinization." This argument may make sense in the context of a high MOI infection, but would certainly limit virus function during normal, pathogenic infection where the MOI is likely extremely low. Essentially, the authors' data predict that 80% of normal, low MOI infections don't progress to gene expression (at least during the first 4 hrs analyzed in this study).

      We follow the argument of the reviewer. The high MOI in our study was necessary to perform the combined ‘omics’ approach to arrive at meaningful data within reasonable sequencing depth. To have equivalence we also used high MOI for the imaging approach. A detailed analysis for the effect of low MOI as well as positioning effects (see reviewer comment below) on transcriptional activation is an important question and will be addressed in future studies that require different techniques in addition. To address this comment, we have updated the discussion to emphasize the importance of MOI and positioning effects (line 587 ff).

      Line 576 "This observation is in agreement with recent pVII-ChIP experiments showing transcription and replication independent pVII removal in early infection (Giberson et al., 2018; Komatsu and Nagata, 2012; Komatsu et al., 2011)." The authors can also state that histone and nucleosome deposition is also independent of transcription and replication, as has been alluded to in the same cited studies but proven more directly in this study.

      We have changed the text accordingly (line 576 and 598).

      Line 672 - the authors should be more definitive in the MOI that are used in all of their experiments. Line 672 states that an MOI of 3000 physical particles are applied per cell. There can be great variation between cell lines in how much virus binds to (and enters) a cell based on the surface levels of Ad receptors on different cell types. However, in general, 3000 is very high. Work by Wang et al. (PMID:24139403) showed that at an MOI of 200 or below most Ad will traffic correctly to the nucleus, whereas at an MOI above 200 there is a significant defect in Ad trafficking within the cell. How is this expected to affect all of the results in this study?

      We agree with this and the other reviewer that this is an important issue. The actual dose of virus that enters a given cell is dependent on the concentration of virus particles in the inoculum, the time and temperature at which this inoculum is in contact with the cells, and the cells' respective susceptibility to the virus. We applied an infection dose of 3000 physical particles per cell in a defined volume (1ml) at 37˚C for 30min followed by inoculum removal. We prefer this description because with these infection conditions, we find on average well below 100 virus particles that enter the cell (this is reflected, e.g., in the number of accumulating genomes shown in figure 2A). At the same time, this leaves enough viruses inside the cell to perform the different “omics” techniques applied in our study and to obtain meaningful results at reasonable sequencing depths. This experimental setting was carefully chosen in full awareness of the work by Wang et al., cited by the reviewer, to avoid, e.g., overloading the nuclear import rate. Thus, our experimental conditions do not exceed the “MOI of >200” that would affect nuclear import rates. The number (>200) in the Wang et al. study refers to the number of virus particles inside the cell; the infection condition used in the Wang study was an MOI of 30 bound to HeLa cells in the cold for 30min and warmed for 150min, which is significantly more virus than we have used in our study. We have expanded the information on the MOI used in the material and methods section to clarify this point (line 685 ff).

      Figure 5 is of low resolution and was difficult to read.

      We thank the reviewer for spotting it. It seems that the Figure quality was compromised during the PDF conversion. We updated the Figures and checked the resolution after PDF conversion.

      Figure S3 is missing a box from the top set of images indicating the region that is expanded in the detail picture.

      We updated Figure S3.

      While I realize it is supplemental data, the difference in quality between the agarose gels shown in Figure S4A and S5A is shocking.

      The nature of the experiments is very different and therefore the expected MNase digestion profiles on agarose gels look different. In Figure S5 viral particles were digested with MNase, resulting in a smeary decrease in DNA size. This looks very different from the regular MNase pattern of whole cells that is dominated by the regularly spaced nucleosomes in the heterochromatic regions of the genome. As pVII protects only about 70bp of DNA and its spacing is not as homogenous as the nucleosomal spacing, the pictures shown in Figure S5A were expected as they are.

      Figure S7 is of low resolution.

      We updated the Figures and checked the resolution after PDF conversion.

      Reviewer #1 (Significance (Required)):

      At least in the field of adenovirus research, this is a very important study. There has been considerable debate in the field regarding the timing and degree of protein VII removal and histone deposition, and the necessity of active transcription for these two events. The data provided in this manuscript clearly show that some protein VII is removed from early active genes and replaced by nucleosomes, and that these events occur prior to initiation of transcription. The authors speculate that the specific placement of protein VII, a protamine-like protein, on the Ad genome prescribes where nucleosomes are placed. This finding should be of interest to a broad general audience, as it provides novel information on chromatin assembly within mammalian cells. Key words for this reviewer: adenovirus research, HAdV nucleoprotein structure

      Reviewer #2 (Evidence, reproducibility and clarity (Required))

      The submitted manuscript presents a detailed and comprehensive analysis of the adenoviral nucleoprotein complexes as infection progresses, starting with the "adenosome" assembled with pVII, which is then progressively replaced with H3.3-containing nucleosomes. The submission presents a combination of in situ and populational analyses of the viral DNA accessibility and complexes through infection. In brief, the infecting viral genomes are assembled in some 250 adenosomes with pVII, which become progressively replaced as infection progresses with nucleosomes containing H3.3 and acetylated H3K17, starting at the active promoters of the E genes. Chromatin remodeling precedes transcription, and the accessibility differs for genes of different kinetic classes at different times after infection, although there is no correlation between accessibility and H3.3 or acetylation content. Only about 20% of the genomes become transcriptionally active, though, which somewhat complicates the analyses of the populational studies of accessibility and occupancy. Overall, the study is well conceived, performed and presented. A few issues deserve further analysis and discussion, as described below.

      Major issues.

      As figure 2 nicely shows, only about 20% of the intranuclear genomes become transcriptionally active. However, MNase and ChIP analyses cannot differentiate these genomes from the 80% that are transcriptionally inactive. The interpretation of the positioning of pVII (figure 4) or the changes in compaction of the adenoviral chromatin at different loci (figure 5) does not appear to consider this heterogeneity other than for a brief comment about the stringent MNase digestion in page 11. The authors favor a model in which the changes in compaction shown in figure 5, at mild MNase digestions, directly correlate with transcription of the respective genes. This could well be correct, and in fact the correlation may be underestimated as 80% of the genomes may not undergo any changes, but it may also be incorrect. The analyses presented cannot differentiate whether the changes in chromatin compaction occur in only a subset of genomes or in all the genomes, regardless of whether they are transcribed or not, or even only in the non-transcribed genomes (which appears extremely unlikely). This intrinsic limitation to the methods used (and I know of no better alternative) should be acknowledged and discussed for the benefit of the reader. This limitation also impacts the analyses of the lack of correlation between H3.3 and acetylated H3K27 occupancy and compaction.

      A discussion is amended and located starting from line 571 in the text. “The heterogeneity of 80% inactive genomes and 20% activated genomes complicates the analysis of the MNase-seq data. High MNase concentrations do not differentiate between both states, and we suggest that low MNase conditions capture the dynamic viral proportion, changing and preparing its genome for gene activation. The data nicely suggest such a scenario, but there is the caveat that we catch an effect of the mixed population that we cannot differentiate.”

      The analysis of the histone ChIP is discussed below.

      Perhaps out of necessity to reach the required sensitivity, a high multiplicity of infection was used (although the actual moi is not stated, there are about 25-30 pVII foci per nucleus). The presentation, analyses and discussion of the results should emphasize this context. For example, one would presume that at low moi, when only one genome enters each cell, the percentage of transcriptionally active genomes in a given cell will be either 0 or 100%, but the "system" becomes saturated as more and more genomes enter the nucleus at higher moi, resulting in only a subset of them being transcriptionally active. Along this line of reasoning, it is intriguing that the percentage of genomes estimated to be in nucleosomes at 4 hpi (14%) approaches the percentage of transcribed genomes.

      This issue was also raised by reviewer 1 (see detailed comment above). The reviewer is correct that we chose to use a higher MOI to reach the required sensitivity in our different “Omics” assays. The imaging approach was adapted to reach the morphological equivalence to fit this analysis. We agree that it would be interesting to also study the MOI effect on transcriptional activation (as well as positioning effects, see comment below) but this requires different approaches and will be addressed in a future study. To address this comment (and others in this review) we revised the text in the discussion to emphasize the importance of MOI and possible other effects such as positioning (line 587 ff).

      The changes in chromatin compaction presented in figure 5 are in some respects puzzling. The compaction of most of the late genes increases as infection progresses, at least for the first four hours, as the authors discuss. However, the L genes appear to be at least as accessible as the E ones at the early times, when only the E genes are transcribed to high levels. This appears counterintuitive, and may not be consistent with the main conclusion that increased accessibility to a given gene directly correlates with its transcriptional activity level. The data presented in Figure 5C deserve a more nuanced analysis and discussion, parsing out the changes in accessibility to each given gene at different times from the different accessibility to the different genes at any given time. The latter does not appear to support the main conclusion reached by the authors that accessibility to each individual gene correlates with its transcriptional level.

      We thank the reviewer for raising this point. When the viral genomes enter the nucleus, the viral chromatin structure is tightly condensed. Therefore, it is unlikely that the viral chromatin undergoes further compaction after nuclear entry. With our analysis, we expect to detect only decompaction of genomic sites relative to 0 hpi, when the virus has not yet entered the nucleus. At some sites, and particularly at the Late genes, the signal decreases, most likely due to normalization to sequencing depth and the variation in the number of viral genomes, but not due to changes in compaction. We realized that the negative accessibility scores we used in the study are misleading and give a false impression. Therefore, we changed the analysis so that negative values are not permitted and are converted to zeros.

      Additionally, we raised the temporal resolution of the analysis and compared the accessibility at all available timepoints against 0 hpi, as suggested by the reviewer. Now, we clearly observe that most accessibility changes are accomplished rapidly after nuclear import, already at 1 hpi, and do not change much thereafter, until 4 hpi. Regions of decompaction coincide with early expressed genes and occur before transcription, underscoring the conclusions made in the study. Nevertheless, while most genomic regions covering late genes do not show decompaction, we observed some local sites showing a high accessibility score. As transcription at those sites appears later in the life cycle of the virus, we can only speculate about their function, e.g. as enhancer elements.

      The Text and Figures were changed accordingly (line 347 ff).

      New legend:

      C) Profile illustrates HAd-C5dE3 genome coverage by low MNase-seq fragments. The average of two replicates is shown, except at timepoint 0 hpi where only one replicate was available. The accessibility score was calculated as the log(fold-change) between the indicated timepoint and 0 hpi. The score was assessed for each pVII peak (orange bars) and negative scores were set to 0. A new accessibility peak arising during infection in the Late3 region is marked by an asterisk. D) Boxplot showing the accessibility score distribution in each domain at each tested timepoint after infection.
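The score described in this legend can be illustrated with a minimal sketch. It is not the authors' pipeline: the log base (log2 here), the pseudocount that guards against division by zero, and the function name `accessibility_scores` are all assumptions made for illustration, and the input values are invented.

```python
import math


def accessibility_scores(coverage_t, coverage_0, pseudocount=1.0):
    """Per-peak log2 fold-change versus 0 hpi, with negative scores set to 0.

    coverage_t and coverage_0 hold one normalized coverage value per pVII peak.
    The log base and the pseudocount are assumptions, not taken from the source.
    """
    if len(coverage_t) != len(coverage_0):
        raise ValueError("both timepoints need one value per pVII peak")
    scores = []
    for c_t, c_0 in zip(coverage_t, coverage_0):
        log_fc = math.log2((c_t + pseudocount) / (c_0 + pseudocount))
        # Negative values are not permitted and are converted to zero.
        scores.append(max(0.0, log_fc))
    return scores


# Invented coverage values for three peaks at one timepoint and at 0 hpi.
example = accessibility_scores([7.0, 1.0, 3.0], [1.0, 3.0, 3.0])
print(example)  # [2.0, 0.0, 0.0]
```

Clamping at zero means the score can only report decompaction relative to 0 hpi, which matches the stated expectation that the incoming chromatin is not compacted further.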

      Minor comments

      The authors may wish to highlight in the discussion that the analyses are so far limited to a single adenovirus.

      We have taken up the suggestion of the reviewer and included it in the discussion part, starting at line 607:

      “The structural analysis is still limited to a single adenovirus genotype and it will be interesting to test whether these dynamic changes are conserved among other adenoviruses. Furthermore, reproducing such organization in adenoviral vectors could result in efficient and sustained transgene expression.”

      The y-axes in the transcriptome figures (figure 1 B, S2) could be presented in Log(2) scale, such that transcript levels at all times can be appreciated in the same graph (the earlier times are just not visible in a linear scale)

      As requested by the reviewer we changed the data to log2 scale. As there is no qualitative difference to the log10 scale, presented in the original version, we would like to keep the figure as it is. To highlight changes at early time points we generated the average expression of early genes in Fig1C.

      For the reviewer's information, we provide here the data plotted on a log2 scale.

      The (lack of) phenotype of the 24xMS2 binding site recombinant adenovirus used should be shown.

      We observed no difference in phenotype between the parental and the MS2 modified virus. We updated Figure S3 and included a gel analysis and specific infectivity data to show this absence of difference.

      The kymograph analyses presented in figure 3B appear to show that there are some sites of transcript accumulation which do not harbor viral genomes (i.e., green only tracks). Moreover, the interpretation of the TAF1beta-mCherry signal is complicated by the (fully expected) significant "background" signal. Although these results are consistent with those obtained by RNAscope/pVII staining, there appear to be intrinsic limitations to the system, which preclude reaching strong conclusions from it. These confirmatory analyses should probably be moved to the supplementary information section and removed from the main text and figures. The longer evaluation data mentioned as not shown in page 8 are critical to the conclusions and should be shown.

      Here we disagree with the reviewer and prefer to keep the data as main figure. All (immobile) transcript accumulation sites are identified by the kymograph analysis and coincide with a genome while free transcripts show a high mobility that is not picked up in the kymograph analysis. This is independently verified in the provided supplemental movies. Depending on the positioning of the genome inside the living cell, accumulating transcripts can appear adjacent to or on top of a genome. This explains the slight shift between RNA and DNA signal for some genomes in the merged image of the kymograph. This is expected as only fully transcribed transcripts and not nascent transcripts are marked by MS2 (the MS2 loops are positioned in the 3’UTR). Also, all genomes (transcribing and non-transcribing) can be identified in the kymograph above background level. To clarify the representation, we have added labels to the kymograph to show which signal is DNA and RNA and a merge respectively. We are convinced that this data set is in strong support of our study, as it is the only technique that permits the discrimination of transcribing and non-transcribing genomes in living cells at real time.

      As requested, we have also added two additional examples for a longer observation period (10min) into the supplemental data Fig. S3C.

      Although the plot of cleavage frequency presented in figure S5 is clear, it would be beneficial to the reader if the actual peaks were also presented to compare their distribution (if any) in gDNA and virus particle.

      In Figure S5 we wanted to test whether the regions lacking pVII peaks are resulting from the absence of pVII, protecting the DNA, and therefore being fully hydrolyzed by MNase, or whether this region is tightly packed by pVII thereby protecting DNA from MNase digestion. To test both possibilities we used a very limited MNase digestion approach, where even free DNA is not fully hydrolyzed, allowing the capture of DNA fragments. Therefore, the sequenced fragments comprise a mixture of protected and un-protected fragments. In this assay, the pVII protected fragments are not fully digested to the monomeric state, but a mix of mono-, di- and other multimers are present. As reflected by the fragment size distribution with the peak between 100-200 bp (Fig S5B), pVII dimers are predominantly enriched when compared to the high MNase digestion used to map pVII positions (compare to Fig4 B). Therefore, the peaks in the S5 data set have a low resolution and do not provide exact pVII positions (see below). Therefore, we would like to keep S5 as it is. We clarified this point in the text (line 279 ff)

      Legend:

      Fragment coverage plot of MNase digestions of gDNA (black) or Ad chromatin in virus particles (purple).

      The mRNA analyses of selected transcription factors provide little information, as there is no context, there is variability between experiments, and in most cases the changes appear modest. As these results are not critical to the conclusions or analyses, perhaps the authors may wish to remove them from the manuscript. Alternatively, more in-depth analyses would be required.

      We agree with the reviewer that more information for the reader is needed. Therefore, we performed a statistical analysis of expression changes between 0 hpi and 4 hpi of the shown transcription factors using DESeq2. We added the corresponding log2(fold-change) and p-values to the figure and adapted the text (line 471) accordingly.

      Legend:

      Gene expression changes of transcription factors over the infection time course. P-values and log2(fold-changes) from differential gene expression analysis between 4 hpi and 0 hpi using DESeq2 are indicated. ns = not significant

      It is unclear why the even distribution of H3.1-flag signal across the genome is considered indicative of no specific recruitment. The results presented are equally consistent with equal incorporation across the genome. Perhaps the authors have some additional information, such as an irrelevant antibody, input DNA, or the like, to support the conclusion. If so, that evidence should be presented and discussed. If not, the interpretation should be revisited. As an added complexity, endogenous H3.1 is normally expressed during S-phase. It is possible that Adenovirus infection may induce higher levels of expression of (untagged)endogenous H3.1, which would outcompete the tagged ectopically expressed histone. These analyses deserve a more nuanced and in-depth analysis.

      We have taken several measures in the study to address the concern of the reviewer. We consider timepoint 0 hpi as background control, as the viral genome has not entered the nucleus yet. Consistently, we observe very few reads mapped to the Ad genome regardless of antibody and construct used (Fig 6B). Additionally, all samples at 0 hpi cluster together in PCA (Fig 6C) and correlation analysis (Fig S7D).

      H3.1 Flag tagged samples show at later timepoints (1 - 4hpi) slightly higher percentages of mapped reads to Ad, but plateau already at 1 hpi (Fig 6B) and cluster together in PCA (Fig 6C) and correlation analysis (Fig S7D) with 0 hpi samples. The low background signal starting at 1 hpi for H3.1 might arise due to the change of Ad genome location to the nucleus.

      Even though the number of Ad mapped reads at later timepoints was low in H3.1 Flag tagged samples, it could still be that they accumulate at a few sites on the Ad genome, indicating a specific deposition. We tested this by plotting the signal across the whole Ad genome (Fig S7E) and zooming into the data (compare scale of H3.3 and H3.1 plot), but we could not detect any reproducible local enrichments. To allow the reader a better comparison between the levels of H3.1 and H3.3 incorporation, we now show both on the same scale (Fig 6D and Fig S7D), clearly showing that we cannot detect H3.1 incorporation at Ad genomes in the first 4 hours of infection. The H3.1 signal corresponds to the background noise. We think it is very unlikely that endogenous H3.1 outcompetes the tagged H3.1, for two reasons:

      • The time scale for the cells to transition into S-phase and upregulate endogenous H3.1 would be only 1-2 hours in our time-series experiments and is therefore too short. To show this experimentally as well, we added an experiment for the reviewer that is not included in the manuscript. The Western blots below show that the protein amount of H3 does not increase in the first 4 hours of infection. Cells were infected and whole cell extracts were prepared at 4 hpi.
      • As most cells are not in S-phase in our experiments, the expression level of the H3.3 variant is higher than that of H3.1. With the Flag ChIPs we can clearly show that the tagged H3.3 is not outcompeted by endogenous H3.3. As there is a high sequence similarity between H3.3 and H3.1, it is very unlikely that they behave differently in that regard.

      It is highly unlikely that the somewhat higher H3K27ac signal observed in the H3.3 than in the H3.1 expressing cells may result from higher H3.3 occupancy in the viral genome, as speculated in page 13. The total levels of H3.3 are unlikely to be increased by the ectopically expressed one, and even if they were, it is not likely that the occupancy of the viral genome would be limited by the levels of H3.3. This speculation should be removed.

      We removed the speculation.

      Materials and methods are too concise. A longer more detailed version, as supplementary information, would be highly desirable.

      We extended the materials and methods part.

      Reviewer #2 (Significance (Required)):

      The major strengths of this manuscript lie in its comprehensiveness, using several in situ and populational approaches to address biologically critical questions regarding the regulation of viral replication by chromatin and epigenetics. Experiments appear very well designed and performed and are mostly clearly presented. The interpretation and discussion of the results may benefit from a more nuanced analysis of the issues posed by the existence of different populations of viral genomes in the cells infected at high moi, and of the accessibility across different genes at any given time versus the levels of transcription of the different genes, which appears not to be fully consistent with one of the main conclusions reached.

      This study makes a very significant contribution, describing the dynamic changes in the adenoviral nucleoprotein complexes at the early times of infection and providing a full description of both the adenosomes and the nucleosomes in more and less transcribed loci. The results are properly analyzed in the context of what is known about the regulation of viral gene transcription by chromatin dynamics in other systems, including similarities and differences. This study is likely to be of high interest to a wide audience, ranging from virologists to epigeneticists, to those working in gene therapy and vectored vaccines.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The manuscript "Adenoviral chromatin organization primes for early gene activation" combines RNA-seq, MNase-seq, ChIP-seq, and single genome and transcript imaging (immunofluorescence, RNA-scope, and live cell techniques) during early Adenovirus infection in vitro to characterise the spatiotemporal dynamics of viral chromatin organisation and association with gene transcription. The manuscript is an interesting read and the authors have combined multiple complementary techniques to make a substantial contribution to understanding the early events occurring after nuclear import of viral genomes. Adenoviruses are important causes of human and animal pathology, are a useful model of non-integrating extra-chromosomal DNA virus infection in mammalian cells, and are useful vectors for vaccination and the discoveries may influence gene therapy DNA vector design. The chromatin organisation in adenovirus infection is distinct from other DNA viruses, and is relatively poorly understood compared to, for example, SV40 or herpesviruses. The manuscript describes an early transition from purely viral chromatin with Adenovirus protein pVII packaging the virus in virions, to a viral-human hybrid chromatin pattern with apparently strategically positioned H3.3 nucleosomes and viral pVII "Adenosomes" in the early hours after nuclear import of the viral genome. The data show that packaged Adenoviruses are in a transcriptionally accessible form and gene expression occurs rapidly after infection. The combination of the MNase-seq data with ChIP-seq data is particularly interesting, demonstrating an average of ~238 adenosomes positioned by specific DNA code protecting 60-70bp of DNA, and that the genome is accessible at loci that also decondense on infection, with adenosomes being replaced by cellular H3.3 containing nucleosomes at distinct sites. In particular, they show that +1 H3K27 acetylated nucleosomes are acquired at the TSS of key early genes.

      The authors argue that their spatiotemporal data imply that this chromatin transition "primes" for early gene transcription. The manuscript is well written, uncovers important viral chromatin biology by combining multiple experimental techniques, and the data is generally very clearly presented. A few comments follow.

      Major concerns:

      • Abstract and Title: o the abstract and title suggest that, because the chromatin changes are observed coincidentally with or before transcriptional changes, these chromatin changes "prime" (title) or are "required" and play a "central role" (abstract) in early gene expression. The temporal relationship would be consistent with chromatin changes being required for transcriptional changes, but does not imply necessity. Experiments to demonstrate the necessity of these changes for early gene transcription are lacking, and I recommend amending the text or additional experiments to provide this evidence directly.

      We observe a clear timing of events, with chromatin opening, nucleosome assembly at the 5’ end of the gene followed by transcriptional activation, suggesting that these structural changes are essential for gene activation. Still, we cannot prove the direct dependency. Therefore we toned down the title of our manuscript and formulate the findings more conservatively.

      The title now reads: “Changes in adenoviral chromatin organization precede early gene activation”

      Results: o The IF data in Fig S1 is convincing, showing viral particles are accessible quickly in the nucleus. Although no statistics are provided for S1B and C, pVII foci appear at 0.5hpi and appear to mostly accumulate between 0.5hpi and 1hpi with further import between 1hpi and 4hpi. Can the authors be sure that a single pVII IF focus represents a single genome? If genomes tend to aggregate as they accumulate the number of foci per nucleus may not increase linearly with the number of genomes imported. Have the authors considered analysing the intensity of the individual pVII foci over the time points? A related question is whether the authors assume that all packaged virions contain intact complete viral genomes? Many viruses comprise some mixture of complete and incomplete packaged genomes, and the subsequent analyses determine the proportion of transcriptionally active copies with RNA-Scope to a single transcript E1A which lies at one end of the viral genome. Please comment explicitly on whether this is assumed and whether this assumption is realistic in light of known Adenovirus biology.

      We appreciate the reviewer's concern. Several studies in the adenovirus field have shown equivalence between protein VII nuclear foci and individual genomes, including our own (PMID: 26332038). Probably the most accurate study was performed by Daniel Engel's lab (PMID: 19406166), who used nuclear protein VII foci to titrate viral as well as vector genomes. In contrast, a different study from Patrick Hearing's lab (PMID: 21345950) showed that past 4 hpi, the number of nuclear protein VII foci gradually declines. Based on our experience and because our study is limited to 4 hpi, we are confident that protein VII foci accurately reflect individual viral genomes.

      Concerning genome packaging, adenovirus particles contain a single viral genome that is protected at each end by a covalently attached protein preventing its degradation. The packaging of adenoviruses is extremely efficient and only complete genomes are packaged into fully assembled particles. All viruses used in this study have been purified by double CsCl gradient purification. This density gradient based purification protocol removes all particles that are either empty or damaged or would contain partial genomes.

      o The RNA-Seq data in Fig 1 and Fig S2 and Table S1 demonstrates transcription of early genes is barely observable at 1hpi but is observable by 2hpi and is clearly much increased by 4hpi. Fig 2C, visualising pVII foci directly within single cells, suggests that approximately 80% of foci are observed by 1hpi and a further 20% between 1hpi and 2hpi and little thereafter. These data convincingly demonstrate that nuclear import is rapid, typically occurring in the first hour. The E1A RNA-Scope data in figure 2, visualising individual mRNA transcripts of E1A, is more sensitive than the bulk RNA-Seq, and shows transcripts at 1hpi with clearly discernible transcription by 2hpi (2A&D) which suggests that transcription occurs early, by 2hpi. Thus transcription lags nuclear genome import by approximately one hour by these methods. However, the conclusions of the subsequent analyses depend on the chromatin changes clearly preceding, rather than being approximately coincident with transcription, therefore transcription being evident by 2hpi is relevant as figure 6A and D suggest that the chromatin remodelling is subtle before 2hpi on the bulk sequencing analyses. The authors should comment on this given the importance to their argument.

      As stated by the reviewer, we observe a clear lag between nuclear import and transcriptional activation. We also observe the largest changes in nucleosome occupancy (ChIP-seq data) between 1 and 2 hpi (Fig 6A and D). Compared to 0 hpi, we observe the strongest increase of nucleosome occupancy between 1 hpi and 2 hpi (4-8 fold effect), whereas, depending on the area, a 2-3 fold increase in occupancy can be observed from 2 hpi to 4 hpi (Fig 6D). This is the effect one would expect if changes in chromatin structure precede gene activation. Furthermore, the timing of nucleosome assembly perfectly matches the increase of MNase accessibility at 1 hpi, supporting our conclusions.

      o The validation of the E1A probe specificity in Fig 2B looks convincing, but there are no data presented for multiple cells to reassure that this image is representative. The equivalent figure for 2D for the Ad5-GFP control would address this.

      We include a large field overview with multiple cells for virus and vector control as new supplemental figure S2B showing that the RNAscope detection of the E1A transcript is highly specific.

      o Figure 2E is presented as a colocalization analysis but appears to be a ratio of mRNA foci to pVII foci per cell. If this is an incorrect interpretation then some clarification in the figure legend would be helpful. If this interpretation of these data is correct, then it is not truly a colocalization analysis, as a single genome may give rise to multiple transcripts and so a ratio

      We apologize that this figure was not clear. The data are based on real colocalizations and represent the number of pVII dots positive for E1A normalized with the total number of nuclear pVII. We have clarified the figure legend accordingly.
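The distinction between the two quantities discussed here can be sketched in a few lines. This is an illustration under assumptions, not the authors' image-analysis code: the function names and the per-cell counts are invented, and real data would come from spot detection in the imaging channels.

```python
def colocalized_fraction(n_pvii_with_e1a, n_pvii_total):
    """Fraction of nuclear pVII foci that carry an E1A signal (colocalization)."""
    if n_pvii_total <= 0:
        raise ValueError("need at least one nuclear pVII focus")
    if not 0 <= n_pvii_with_e1a <= n_pvii_total:
        raise ValueError("positive foci cannot exceed total foci")
    return n_pvii_with_e1a / n_pvii_total


def foci_ratio(n_e1a_foci, n_pvii_total):
    """Plain ratio of E1A foci to pVII foci; it can exceed 1 per cell."""
    if n_pvii_total <= 0:
        raise ValueError("need at least one nuclear pVII focus")
    return n_e1a_foci / n_pvii_total


# Invented counts for one nucleus: 25 pVII foci, 5 of them E1A-positive,
# and 40 E1A transcript foci in total.
print(colocalized_fraction(5, 25))  # 0.2
print(foci_ratio(40, 25))           # 1.6
```

Because one genome can give rise to many transcripts, only the first quantity is bounded by 1 and reads directly as the share of transcribing genomes.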

      o The live cell imaging experiments are elegant and convincing, but the agreement in Fig 3D of the % colocalization in MS2-BP data with the RNA-scope data is potentially misleading for the reasons outlined in the prior comment. Is the data in Fig 2E the same as the data in the right hand panel of Fig 3D? If so, please comment on the n discrepancy (n=30 in 2E vs n=22 in 3D). The observation that 20% of genomes are transcriptionally active, via bursting or otherwise, is interesting, and would be consistent with the Suomalainen et al reference. The authors discuss two hypotheses to explain these findings: transcriptional bursting or a subset (~20%) of genomes being transcriptionally active. This is interesting and raises the question as to why this may occur. Assuming all imported genomes are intact (previous comment), it appears from the presented images that the foci at the radial periphery of the nucleus may be more frequently transcriptionally active, despite the nuclear periphery being enriched for heterochromatin. The authors might consider analysing the radial position of their TAF1B-mCherry genomes (active and inactive), as this might support position effect variegation rather than bursting as an explanation, and they appear to already have the data to perform such analyses.

      o In the presented images (Fig 3A and Fig S3) it appears a higher proportion of genomes than 20% appear to be transcriptionally active, particularly in the low MOI experiment. The authors may wish to comment on this and quantify whether the proportion of transcribing genomes was affected by the input MOI.

      This and the previous comment concerning the influence of MOI, transcriptional bursting and the positioning effect of the genome on the transcriptional activity have also been in part raised above. As stated in our response to reviewer 1 we have used a high MOI in our experiments to have equivalence between all experimental approaches. We agree with the reviewers that all aspects (dose, bursting and positioning) merit a detailed investigation, which we plan in future studies. To be consistent and comparable in our comprehensive approach we decided to not include such studies here as they would address a different question. Nevertheless, to address this (and the above) comments we now mention positioning effects in the results (line 214) and enlarged the discussion (line 587 ff) where we especially raised awareness that such pertinent questions can be addressed with the tools presented in our study.

      We also decided to visually separate the comparison of MS2 and RNAscope data to avoid misleading the reader. Furthermore, the RNAscope data have been replaced. The RNAscope data are indeed from Fig. 2. The difference in n was due to our mistake of showing two differently normalized data sets. Data were either normalized using the total amount of nuclear protein VII (Fig. 2E) or the total amount of nuclear E1A signals (Fig 3D), which, due to the more heterogeneous signal, did not include all cells. In the updated version both figures display data normalized by the total amount of nuclear protein VII.

      o Fig 4C suggests that there is a large GC preference (or bias) in the pVII occupied regions. The authors may wish to comment on this and present a track with Adenovirus GC composition in Fig 4D.

      We thank the reviewer for raising this point. As suggested by the reviewer, we analysed the GC content under pVII peaks and in the linker DNA. Indeed, pVII occupied regions have a significantly higher GC content, indicating that pVII preferentially positions at GC rich regions. We included this analysis as an additional Figure 4E (line 302 f).

      Legend:

      Boxplot showing GC content of pVII occupied (pVII) or free (linker) regions. Two biological replicates are shown side by side and the p-value of a Student's t-test of the corresponding pairs is indicated above.
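A comparison of this kind can be sketched as follows. The sketch rests on assumptions: it uses an unpaired, equal-variance Student's t statistic (the legend does not say which variant was used), the helper names are hypothetical, and the sequences and GC values are invented. Turning the statistic into a p-value needs a t-distribution, which a statistics library would supply.

```python
import math
from statistics import mean, variance


def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def student_t(sample_a, sample_b):
    """Two-sample Student's t statistic with pooled (equal) variance."""
    n_a, n_b = len(sample_a), len(sample_b)
    pooled = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / (n_a + n_b - 2)
    return (mean(sample_a) - mean(sample_b)) / math.sqrt(pooled * (1 / n_a + 1 / n_b))


# Invented GC fractions for pVII-occupied and linker regions.
pvii_gc = [0.6, 0.7, 0.8]
linker_gc = [0.4, 0.5, 0.6]
print(gc_fraction("GGCCAT"))
print(student_t(pvii_gc, linker_gc))
```

In practice `gc_fraction` would be applied to the genome sequence under each pVII peak and each linker interval, and the two resulting lists compared per replicate.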

      o Figure 6 presents convincing data showing H3.3. nucleosome positioning and acetylation at E1A and the data is nicely presented showing these changes occur early being observable by 2 and 4 hpi. Again, these changes are not convincingly prior to early gene activation but are certainly occurring early, and may occur prior to early gene activation at the level of individual foci, however, this is not demonstrated definitively.

      This question belongs to the same context addressed by the reviewer above. Please refer to the answer given above.

      Minor comments:

      Introduction: o Paragraph 1 - Introduction for DNA viruses in general, but the authors appear to be talking about Adenoviruses specifically, "little is known about the structural organization of the genome" and "nuclear viral genomes could undergo different parallel fates", arguably these statements are not accurate for other DNA viruses (e.g. Epstein Barr Virus) suggest amending the wording for clarity.

      The manuscript text was updated as suggested.

      o paragraph 2 - Why do the authors say that Adenoviruses are prototypic DNA viruses?

      We removed the term prototypic.

      o Paragraph 3 - A recent study is referenced but multiple references are given.

      The references were updated.

      o "Protein VII stays associated with the viral genome imported to the nucleus, while pV dissociates from the viral DNA following ubiquitylation (Puntener et al., 2011). The fate of the μ-peptide is not known". - The reference suggests that pV dissociates on entry to the cytoplasm and during capsid disassembly at the nuclear pore. I find this sentence confusing as it doesn't make it clear that pV is lost before nuclear entry which is important for interpreting the data.

      We clarified this in the manuscript text.

      Results:

      o Figure 5 is almost unreadable due to low resolution.

      We updated the Figures and checked the resolution after PDF conversion.

      o Reference to Fig 4C in text comes after Fig4D.

      The order of Figure panels was changed accordingly.

      Reviewer #3 (Significance (Required)):

      The manuscript is well written, uncovers important viral chromatin biology by combining multiple experimental techniques, and the data is generally very clearly presented

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The manuscript "Adenoviral chromatin organization primes for early gene activation" combines RNA-seq, MNase-seq, ChIP-seq, and single genome and transcript imaging (immunofluorescence, RNA-scope, and live cell techniques) during early Adenovirus infection in vitro to characterise the spatiotemporal dynamics of viral chromatin organisation and association with gene transcription. The manuscript is an interesting read and the authors have combined multiple complementary techniques to make a substantial contribution to understanding the early events occurring after nuclear import of viral genomes.

      Adenoviruses are important causes of human and animal pathology, are a useful model of non-integrating extra-chromosomal DNA virus infection in mammalian cells, and are useful vectors for vaccination and the discoveries may influence gene therapy DNA vector design. The chromatin organisation in adenovirus infection is distinct from other DNA viruses, and is relatively poorly understood compared to, for example, SV40 or herpesviruses. The manuscript describes an early transition from purely viral chromatin with Adenovirus protein pVII packaging the virus in virions, to a viral-human hybrid chromatin pattern with apparently strategically positioned H3.3 nucleosomes and viral pVII "Adenosomes" in the early hours after nuclear import of the viral genome. The data show that packaged Adenoviruses are in a transcriptionally accessible form and that gene expression occurs rapidly after infection. The combination of the MNase-seq data with the ChIP-seq data is particularly interesting, demonstrating an average of ~238 adenosomes positioned by specific DNA code protecting 60-70bp of DNA, and showing that the genome is accessible at loci that also decondense on infection, with adenosomes being replaced by cellular H3.3-containing nucleosomes at distinct sites. In particular, they show that +1 H3K27 acetylated nucleosomes are acquired at the TSS of key early genes.

      The authors argue that their spatiotemporal data imply that this chromatin transition "primes" for early gene transcription. The manuscript is well written, uncovers important viral chromatin biology by combining multiple experimental techniques, and the data is generally very clearly presented. A few comments follow.

      Major concerns:

      • Abstract and Title:
        • The abstract and title suggest that, because the chromatin changes are observed coincident with or before transcriptional changes, these chromatin changes "prime" (title) or are "required" and play a "central role" (abstract) in early gene expression. The temporal relationship would be consistent with chromatin changes being required for transcriptional changes, but does not imply necessity. Experiments to demonstrate the necessity of these changes for early gene transcription are lacking, and I recommend amending the text or adding experiments to provide this evidence directly.
      • Results:
        • The IF data in Fig S1 is convincing, showing viral particles are accessible quickly in the nucleus. Although no statistics are provided for S1B and C, pVII foci appear at 0.5hpi and appear to mostly accumulate between 0.5hpi and 1hpi with further import between 1hpi and 4hpi. Can the authors be sure that a single pVII IF focus represents a single genome? If genomes tend to aggregate as they accumulate the number of foci per nucleus may not increase linearly with the number of genomes imported. Have the authors considered analysing the intensity of the individual pVII foci over the time points? A related question is whether the authors assume that all packaged virions contain intact complete viral genomes? Many viruses comprise some mixture of complete and incomplete packaged genomes, and the subsequent analyses determine the proportion of transcriptionally active copies with RNA-Scope to a single transcript E1A which lies at one end of the viral genome. Please comment explicitly on whether this is assumed and whether this assumption is realistic in light of known Adenovirus biology.
        • The RNA-Seq data in Fig 1 and Fig S2 and Table S1 demonstrates transcription of early genes is barely observable at 1hpi but is observable by 2hpi and is clearly much increased by 4hpi. Fig 2C, visualising pVII foci directly within single cells, suggests that approximately 80% of foci are observed by 1hpi and a further 20% between 1hpi and 2hpi and little thereafter. These data convincingly demonstrate that nuclear import is rapid, typically occurring in the first hour. The E1A RNA-Scope data in figure 2, visualising individual mRNA transcripts of E1A, is more sensitive than the bulk RNA-Seq, and shows transcripts at 1hpi with clearly discernible transcription by 2hpi (2A&D) which suggests that transcription occurs early, by 2hpi. Thus transcription lags nuclear genome import by approximately one hour by these methods. However, the conclusions of the subsequent analyses depend on the chromatin changes clearly preceding, rather than being approximately coincident with transcription, therefore transcription being evident by 2hpi is relevant as figure 6A and D suggest that the chromatin remodelling is subtle before 2hpi on the bulk sequencing analyses. The authors should comment on this given the importance to their argument.
        • The validation of the E1A probe specificity in Fig 2B looks convincing, but there are no data presented for multiple cells to reassure that this image is representative. The equivalent figure for 2D for the Ad5-GFP control would address this.
        • Figure 2E is presented as a colocalization analysis but appears to be a ratio of mRNA foci to pVII foci per cell. If this is an incorrect interpretation then some clarification in the figure legend would be helpful. If this interpretation of these data is correct, then it is not truly a colocalization analysis, as a single genome may give rise to multiple transcripts and so a ratio <1 could be expected even if all pVII foci were transcribing genomes. In addition, the text suggests that because there is no statistically significant difference between 2hpi and 4hpi a plateau has been reached. However, the difference between 1h and 2h is barely significant (p=0.04) and the mean is increased between 2h and 4h, albeit non-significantly, so the plateau is not convincingly demonstrated. If the authors wish to perform a colocalization analysis rather than a ratio, they might assign each transcript to the nearest pVII IF focus within the nucleus and count the proportion of pVII foci with any transcripts assigned over time. Alternatively, they can amend the description of this analysis in the figure and text.
        • The live cell imaging experiments are elegant and convincing, but the agreement in Fig 3D of the % colocalization in MS2-BP data with the RNA-scope data is potentially misleading for the reasons outlined in the prior comment. Is the data in Fig 2E the same as the data in the right hand panel of Fig 3D? If so, please comment on the n discrepancy (n=30 in 2E vs n=22 in 3D). The observation that 20% of genomes are transcriptionally active, via bursting or otherwise, is interesting, and would be consistent with the Suomalainen et al reference. The authors discuss two hypotheses to explain these findings: transcriptional bursting or a subset (~20%) of genomes being transcriptionally active. This is interesting and raises the question of why this may occur. Assuming all imported genomes are intact (previous comment), it appears from the presented images that the foci at the radial periphery of the nucleus may be more frequently transcriptionally active, despite the nuclear periphery being enriched for heterochromatin. The authors might consider analysing the radial position of their TAF1B-mCherry genomes (active and inactive) as this might support position effect variegation rather than bursting as an explanation and they appear to already have the data to perform such analyses.
        • In the presented images (Fig 3A and Fig S3) it appears a higher proportion of genomes than 20% appear to be transcriptionally active, particularly in the low MOI experiment. The authors may wish to comment on this and quantify whether the proportion of transcribing genomes was affected by the input MOI.
        • Fig 4C suggests that there is a large GC preference (or bias) in the pVII occupied regions. The authors may wish to comment on this and present a track with Adenovirus GC composition in Fig 4D.
        • Figure 6 presents convincing data showing H3.3 nucleosome positioning and acetylation at E1A, and the data is nicely presented, showing that these changes occur early, being observable by 2 and 4 hpi. Again, these changes are not convincingly prior to early gene activation but are certainly occurring early, and may occur prior to early gene activation at the level of individual foci; however, this is not demonstrated definitively.
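The nearest-focus assignment suggested in the Figure 2E comment above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `fraction_active_foci`, the optional `max_dist` cutoff, and all coordinates are hypothetical and do not come from the manuscript or the review.

```python
import math

def fraction_active_foci(foci, transcripts, max_dist=None):
    """Assign each transcript to its nearest focus and return the
    fraction of foci that received at least one transcript."""
    if not foci:
        return 0.0
    active = set()
    for t in transcripts:
        # Index of the closest focus by Euclidean distance
        idx = min(range(len(foci)), key=lambda i: math.dist(foci[i], t))
        # Optional cutoff (an assumption): ignore transcripts far from any focus
        if max_dist is None or math.dist(foci[idx], t) <= max_dist:
            active.add(idx)
    return len(active) / len(foci)

# Hypothetical coordinates (micrometres) for a single nucleus
foci = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0), (2.0, 8.0), (8.0, 8.0)]
transcripts = [(0.2, 0.1), (0.1, -0.3), (9.8, 0.4)]

print(fraction_active_foci(foci, transcripts))  # 0.4
print(len(transcripts) / len(foci))             # 0.6, the simple foci ratio
```

In this toy nucleus two transcripts sit next to the same focus, so the proportion of foci with any assigned transcript (0.4) differs from the transcript-to-focus ratio (0.6), which is the distinction the comment draws.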

      Minor comments:

      • Introduction:
        • Paragraph 1 - This is an introduction for DNA viruses in general, but the authors appear to be talking about Adenoviruses specifically: "little is known about the structural organization of the genome" and "nuclear viral genomes could undergo different parallel fates". Arguably these statements are not accurate for other DNA viruses (e.g. Epstein Barr Virus); I suggest amending the wording for clarity.
        • Paragraph 2 - Why do the authors say that Adenoviruses are prototypic DNA viruses?
        • Paragraph 3 - A recent study is referenced but multiple references are given.
        • "Protein VII stays associated with the viral genome imported to the nucleus, while pV dissociates from the viral DNA following ubiquitylation (Puntener et al., 2011). The fate of the μ-peptide is not known". - The reference suggests that pV dissociates on entry to the cytoplasm and during capsid disassembly at the nuclear pore. I find this sentence confusing as it doesn't make it clear that pV is lost before nuclear entry which is important for interpreting the data.
      • Results:
        • Figure 5 is almost unreadable due to low resolution.
        • Reference to Fig 4C in text comes after Fig4D.

      Significance

      The manuscript is well written, uncovers important viral chromatin biology by combining multiple experimental techniques, and the data is generally very clearly presented

    1. the completions and chat completions endpoint can be used for virtually any task including content or code generation, summarization, expansion, conversation, creative writing, style transfer, and more.

      Advantages in layman's terms

    1. RxJS helps developers author declarative code for handling side effects and asynchronous actions with continuous data streams and subscriptions.

      Useful for managing the complexities of subscriptions and data stream processing.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):


      Summary:

      In this manuscript, Roberts et al. hypothesised that the 5:2 diet (a popular form of intermittent fasting, IF, a dietary strategy that is thought to increase adult hippocampal neurogenesis, AHN) would enhance AHN in a ghrelin-dependent manner. To do this, the Authors used immunohistochemistry to quantify new adult-born neurons and new neural stem cells in the hippocampal dentate gyrus of adolescent and adult wild-type mice and mice lacking the ghrelin receptor, following six weeks on a 5:2 diet. They report an age-related decline in neurogenic processes and identify a novel role for ghrelin-receptor in regulating the formation of new adult-born neural stem cells in an age-dependent manner. However, the 5:2 diet did not affect new neuron or neural stem cell formation in the dentate gyrus, nor did it alter performance on a spatial learning and memory task. They conclude that the 5:2 diet used in their study does not increase AHN or improve associated spatial memory function.

      Major comments:

      One criticism might be the fact that many aspects are addressed at the same time. For instance, the role of ghrelin with respect to testing the DR effects on AHN is not fully clear. Although the link between ghrelin, CR and AHN is explained by citing several previous studies, it is difficult to identify the main focus of the study. Maybe this is due to the fact that the Authors analyse and comment throughout the paper on the different experimental approaches used by different Authors to study the effect of DR on AHN. This is not bad in principle, since I think the Authors have a deep knowledge of this complex matter, but all this makes it difficult to follow the flow of the rationale in the manuscript.

      We appreciate the reviewer’s critique regarding the rationale of the studies presented in the manuscript.

      The role of ghrelin in the regulation of AHN by dietary interventions such as CR and IF is a major interest of our lab and is the main focus of the study. We, and others, have shown that ghrelin mediates the beneficial effects of CR on AHN. It is often assumed that ghrelin will elicit similar effects in other DR paradigms. We selected the 5:2 diet since it is widely practiced by humans, but it has not been well tested experimentally.

      We sought to empirically test how the neurogenic response to 5:2 differed between mice with functional and impaired ghrelin signaling.

      Given that plasma ghrelin levels and AHN are reduced during ageing, we also wanted to determine if 5:2 diet could slow or even prevent neurogenic decline in ageing mice.

      We will re-write the manuscript to ensure that our primary aim is clearly presented. We will also reanalyze the data, with genotype and 5:2 diet as key variables. To help maintain focus, the variable of age will be analyzed separately. This amendment will, we hope, help the reader follow the narrative of our manuscript.

      Another major point: the Discussion is too long. The Authors analyse all the possible reasons why different studies obtained different results concerning the effectiveness of DR in stimulating adult neurogenesis. Thus, the Discussion reads more like a review article dealing with different methods/experimental approaches to evaluate DR effects. We know that sometimes different results are due to different experimental approaches, yet, when an effect is strong and clear, it occurs in different situations. Thus, I think that the Authors must be less shy in expressing their conclusions, also reducing the methodological considerations. It is also well known that sometimes different results can be due to a study not being well performed, or to biases from the Authors.

      In our discussion, we felt that it was particularly important to be as rigorous as possible in contextualizing our findings with other published data, whilst highlighting methodological differences. Our aim was to be as precise as possible when comparing findings across studies, however, this resulted in the narrative drifting from the key objectives of our study – namely, to determine the effect of 5:2 diet on neurogenesis and whether or not ghrelin-signalling regulated the process. We will amend the text of the discussion to ensure that the key points of our study are only compared and contrasted with relevant studies in the field. We thank the reviewer for their candid comment.


      Minor comments:

      • This sentence: "There is an age-related decline in adult hippocampal neurogenesis" cannot be put in the HIGHLIGHTS, since it is a well-known aspect of adult hippocampal neurogenesis.

      The reviewer is correct to state this. Our study replicates this interesting age-related phenomenon. However, we will remove it from the ‘Highlights’ section.

      • Images in Figure 5 are not good quality.

      We apologise for this oversight. We will review each figure and panel to ensure that high-resolution images, that are appropriately annotated, are used throughout the manuscript.

      • In general, there are not a lot of images referring to microscopic/confocal photographs across the entire manuscript.

      We structured the manuscript with a limited number of figures and associated microscope captured panels, with the aim of presenting representative images to illustrate the nature and quality of the IHC protocols. However, we will amend the figures for the revised manuscript to provide representative microscopy images, with each group included and clearly annotated.

      • The last sentence of the Discussion "These findings suggest that distinct DR regimens differentially regulate neurogenesis in the adult hippocampus and that further studies are required to identify optimal protocols to support cognition during ageing" is meaningless in the context of the study, and in contrast with the main results. Honestly, my impression is that the Authors do not want to disappoint the conclusions of the previous studies; an alternative is that other Reviewers asked for this previously.

      We do not believe that this statement is contradictory to our findings, as distinct DR paradigms do appear to regulate AHN in different ways. However, we agree that we can be more explicit with regards to our own study findings and will prioritize the conclusions of our study over those of the entire field during revision.

      Reviewer #1 (Significance (Required)):

      value the significance of publishing studies that will advance the field.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):


      In this manuscript, Roberts et al. investigate the effect of the 5:2 diet on adult hippocampal neurogenesis (AHN) in mice via the ghrelin receptor. Many studies have reported benefits of dietary restriction (DR) on the brain that include increasing neurogenesis and enhancing cognitive function. However, neither the mechanisms underlying the effects of the 5:2 diet, nor potential benefits on the brain, are well understood. The authors hypothesize that the 5:2 diet enhances AHN and cognitive function via ghrelin-receptor signaling. To test this, they placed adolescent and adult ghrelin receptor knockout or wild type mice on either the 5:2 or ad libitum (AL) diet for 6 weeks, followed by spatial memory testing using an object in place (OIP) task. The authors also assessed changes in AHN via IHC using multiple markers for cell proliferation and neural stem cells. The authors observed a decrease in AHN due to age (from adolescent to adult), but not due to diet or ghrelin-receptor signaling. While loss of the ghrelin-receptor impaired spatial memory, the 5:2 diet did not affect cognitive function. The authors conclude that the 5:2 diet does not enhance AHN or spatial memory.

      We thank the reviewer for this summary. We note that there was a significant reduction in new neurones (BrdU+/NeuN+ cells) in GHS-R null animals, regardless of age or diet (3-way ANOVA of age, genotype and diet, sexes pooled: Genotype P = 0.0290). These data suggest that the loss of ghrelin receptor signalling does impair AHN. However, we will re-analyse our data in light of reviewer 1's comments to remove ‘age’ as a variable. The new analyses and associated discussion will be presented in our revised manuscript.

      The authors use a 5:2 diet but fail to provide a basic characterization of this dietary intervention. For example, was the food intake assessed? In addition to the time restriction of the feeding, does this intervention also represent an overall caloric restriction or not? According to the provided results, the 5:2 diet does not appear to regulate adult hippocampal neurogenesis contrary to the authors' original hypothesis. Did the authors measure the effects of the 5:2 diet on any other organ system? Do they have any evidence that the intervention itself resulted in any well documented benefits in other cell types? Such data would provide a critical positive control for their intervention.

      This is an important point raised by the reviewer. Currently, we carefully quantified weight change across the duration of the study. However, we do not know whether the 5:2 diet reduced overall food intake or whether it impacted the timing of feeding events. To overcome this limitation, we will now test what impact the 5:2 dietary regime has on food intake and the timing of feeding. This study will allow us to correlate any changes with 5:2 diet. In addition, we have collected tibiae to quantify skeletal growth and have collected both liver and plasma (end point) samples which will be used to assess changes in the GH-IGF-1 axis. These additional studies will allow us to characterise the effects of the 5:2 paradigm on key indicators of physiological growth. These new data will be incorporated into the revised manuscript.

      Based on the effects of ghrelin in other dietary interventions, the authors speculate that the effect of the 5:2 diet is similarly mediated through ghrelin. However, the authors do not provide any basic characterization of ghrelin signaling to warrant this strong focus on the GHS-R mice. While the GHS-R mice display changes in NSC homeostasis and neurogenesis, none of these effects appear to be modified by the 5:2 diet. Thus, the inclusion of the GHS-R mice does not seem warranted and detracts from the main 5:2 diet focus of the manuscript.

      The role of ghrelin signalling via its receptor, GHSR, is a central tenet of our hypothesis. The loxTB-GHS-R null mouse is a well validated model of impaired ghrelin signalling, in which insertion of a transcriptional blocking cassette prevents expression of the ghrelin receptor (Zigman et al. 2005, JCI). We have previously shown that this mouse model is insensitive to calorie restriction (CR) mediated stimulation of AHN, in contrast to WT mice (Hornsby et al. 2016), justifying its suitability as a model for assessing the role of ghrelin signalling in response to DR interventions, such as the 5:2 paradigm. Whilst our findings do not support a role for ghrelin signalling in the context of the 5:2 diet studied, we did follow the scientific method to empirically test the stated hypothesis. While critiques of experimental design are welcome, the removal of these data may perpetuate publication bias in favour of positive outcomes and is something we wish to avoid.

      Neurogenesis is highly sensitive to stress. The 5:2 diet may be associated with stress which could counteract any benefits on neurogenesis in this experimental paradigm. Did the authors assess any measures of stress in their cohorts? Were the mice group housed or single housed?

      We thank the reviewer for raising this point. We have open-field recordings that will now be analysed to assess general locomotor activity, anxiety and exploration behaviour. Additionally, we will assess levels of the stress hormone, ACTH, in end point plasma samples. These datasets will be incorporated into the revised manuscript.

      The authors state that the 5:2 diet led to a greater reduction in body weight (31%) in adolescent males compared to other groups. However, it appears that the cohorts were not evenly balanced and the adolescent 5:2 male mice started out with a significantly higher starting weight (Supplementary Figure 1). The difference in starting weight at such a young age is significantly confounding the conclusion that the 5:2 diet is more effective at limiting weight gain specifically in this group.

      We thank the reviewer for highlighting this limitation. In the revision we will re-focus our discussion around the Δ Body weight repeated measures data, which compares the daily body weight of each group to its baseline value - thereby normalising any intergroup differences in starting weight. Furthermore, we will restructure figures 1 and S1 so that figure 1 presents only the repeated measure Δ Body weight data, while data for body weight both at baseline and on the final day of the study will be presented in figure S1.
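As a rough illustration of the baseline normalisation described above, the sketch below subtracts each animal's first measurement from its later ones and then averages per group. It assumes a simple absolute difference in grams; the authors may instead use a percentage change, and the function names and weights here are hypothetical.

```python
def delta_body_weight(daily_weights):
    """Return each day's weight change relative to the first (baseline) value."""
    baseline = daily_weights[0]
    return [w - baseline for w in daily_weights]

def group_mean_delta(animals):
    """Average the per-animal deltas day by day for one group."""
    deltas = [delta_body_weight(a) for a in animals]
    return [sum(day) / len(day) for day in zip(*deltas)]

# Hypothetical daily weights in grams; the groups differ only in starting weight
heavier_start = [[22.0, 22.5, 23.0], [24.0, 24.5, 25.0]]
lighter_start = [[18.0, 18.5, 19.0], [20.0, 20.5, 21.0]]

print(group_mean_delta(heavier_start))  # [0.0, 0.5, 1.0]
print(group_mean_delta(lighter_start))  # [0.0, 0.5, 1.0]
```

Both toy groups give the same trajectory once expressed relative to baseline, which is how a difference in starting weight drops out of the comparison.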

      The authors count NSCs as Sox2+S100b- cells. However, the representative S100b staining does not look very convincing. Instead, it would be more appropriate to count Sox2+GFAP+ cells with a single vertical GFAP+ projection. Alternatively, the authors could also count Nestin-positive cells. Additionally, the authors label BrdU+ Sox2+ S100B- cells as "new NSCs". However, it appears that the BrdU labeling was performed approximately 6 weeks before the tissue was collected (Figure 1A). Thus, these BrdU-positive NSCs most likely represent label retaining/quiescent NSCs that divided during the labeling 6 weeks prior but have not proliferated since. As such, the term "new NSC" is misleading and would suggest an NSC that was actively dividing at the time of tissue collection.

      We apologise for presenting low-resolution images – these will be replaced by high-resolution images in the revised manuscript. In this study we have quantified the actively dividing BrdU+/Sox2+/S100B- cells that represent type II NSCs (rather than GFAP+ or Nestin+ type I NSCs) that have incorporated BrdU within the time period of the 6-week intervention. We appreciate the reviewer’s comments concerning the “new NSCs” terminology. We agree that we should be more specific in clarifying that the NSCs identified are those labelled during the 1st week of the 6-week intervention. We will amend this throughout the revised manuscript by re-naming these cells as 6-week old NSCs.

      Overall, this manuscript lacks a clear focus and narrative. Due to a lack of an effect by the 5:2 diet on hippocampal neurogenesis, the authors mostly highlight already well-known effects of aging and ghrelin/GHS-R on neurogenesis. Moreover, the authors repeatedly use age-related decline and morbidities as a rationale for their study. However, they assess the effects of the 5:2 diet on neurogenesis only in adolescent and young mature but not aged mice.

      To provide greater clarity, and in accordance with reviewer 1’s comments, we will amend the text throughout to provide a focus on the data obtained. The objective of the changes will be to reinforce the original study narrative. In relation to the use of the term ‘age-related decline’ or ‘age-related changes’, we think that these are appropriate to our study. Physiological ageing doesn’t begin at a specific point of chronological time, but is a process that is continuously ongoing. Indeed, our data are in agreement with previous studies reporting an age-related reduction in AHN at 6 months of age (e.g. Kuhn et al. 1996).

      Minor Points

      The authors combine the data from both male and female mice for most bar graphs. While this does not appear to matter for neurogenesis or behavioral readouts, there are very significant sexually dimorphic differences with respect to body size and weight. As such, male and female mice in Figure 1D,F should not be plotted in the same bar graph.

      We agree that sexual dimorphism exists with respect to body size and weight. We used distinct male and female symbols for each individual animal on these bar graphs, but do agree with the reviewer that sexual dimorphic differences should be emphasized. To achieve this, we will include additional supplementary graphs presenting the sex differences in starting weight, final weight, and weight change versus starting weight.

      The Figure legends are very brief and should be expanded to include basic information of the experimental design, statistical analyses etc.

      We thank the reviewer for this comment. We will provide specific experimental details in the revised figure legends.

      Many figures include a representative image. However, it is often unclear if that is a representative image of a WT or mutant mouse, or a 5:2 or control group (Figure 2A, 3A, 4A, 5A).

      We structured the manuscript with a limited number of figures and associated microscope captured panels, with the aim of presenting representative images to illustrate the nature and quality of the IHC protocols. However, we will amend the figures for the revised manuscript to provide representative microscopy images, with each group included and clearly annotated.

      It would be helpful to provide representative images of DCX-positive cells in Figure 3A-F. Additionally, the authors should include a more extensive description of how this quantification was performed in the method section.

      We will revise the manuscript to provide representative high-resolution Dcx+ images displaying cells of each category. The method will also be revised to include a detailed description of how the quantification and classification was performed.

      The authors state "the hippocampal rostro-caudal axis (also known as the dorsoventral[] axis)". However, the rostral-caudal and dorsal-ventral axes are usually considered perpendicular to one another.

      We agree that the dorso-ventral and rostral-caudal axes are anatomically distinct. The terms are often used interchangeably in the literature, which can lead to misinterpretations (e.g. the caudal portion of dorsal hippocampus is often mislabelled as ventral hippocampus). To avoid ambiguity, mislabelling or misidentification, we will include a supplementary figure detailing our anatomical definitions of the rostral and caudal poles of the hippocampus, alongside representative images and the bregma coordinates.

      Reviewer #2 (Significance (Required)):


      Understanding the mechanisms of a popular form of intermittent fasting (5:2 diet) that is not well understood is an interesting topic. Moreover, examining the effect of this form of intermittent fasting on the brain is timely. Notwithstanding, while the authors use multiple markers to validate the effect of the 5:2 diet on adult hippocampal neurogenesis, concerns regarding experimental design, validation, and data analysis weaken the conclusions being drawn.

      We thank reviewer 2 for this significance statement. We will revise the manuscript, as mentioned above, to clarify the experimental design, improve presentation of the data, and re-focus the narrative of the primary aims of the study.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):


      Summary


      In this study, Roberts and colleagues used a specific paradigm of intermittent fasting, the 5:2 diet, meaning 5 days ad libitum food and 2 non-consecutive days of fasting. They exposed adolescent and adult wild-type mice and ghrelin receptor knockout mice (GHS-R-/-) for 6 weeks to this paradigm, followed by 1 week ad libitum food. They further used the "object in place task" (OIP) to assess spatial memory performance. At the end of the dietary regime, the authors quantified newborn neurons and neural stem cells (NSCs) by immunohistochemistry. Roberts et al. show that the 5:2 diet does not change the proliferation of cells in the hippocampus, but report an increased number of immature neurons (based on DCX) in all the mice exposed to the 5:2 diet. This change however did not result in an increased number of mature adult-born neurons, as assessed by a BrdU birthdating paradigm. The authors further show diet-independent effects of the ghrelin receptor knockout, leading to fewer adult-born neurons, but more NSCs in the adolescent mice and a lower performance in the OIP task.

      Major comments:

      The main conclusion of this study is that a specific type of intermittent fasting (5:2 diet) has no effects on NSC proliferation and neurogenesis. As there are several studies showing beneficial effects of intermittent fasting on adult neurogenesis, while other studies found no effects, it is important to better understand the effects of such a dietary paradigm.

      The experimental approaches used in this manuscript are mostly well explained, but it is overall rather difficult to follow the results part, as the authors always show the 4 experimental groups together (adolescent vs adult and wt vs GHS-R-/-). They highlight the main effects comparing all the groups, which most of the time is the factor "age". Age is a well-known and thus not surprising negative influencer of adult neurogenesis. Instead of focusing on the main tested factor, namely the difference in diet, the authors show example images of the two age classes (adolescent vs adult), which does not underline the major point they are making. Most of the time, they do not provide a post hoc analysis, so it is difficult to judge if the results with a significant main effect would be significant in a direct 1 to 1 comparison of the corresponding groups. The authors point out themselves that previous rodent studies did not use such a 5:2 feeding pattern, so having diet, age and genotype as factors at the same time makes the assessment of the diet effect more difficult.

      The manuscript would improve if the authors restructure their data to compare first the diet groups (adolescent wt AL vs 5:2 and in a separate comparison adult wt AL vs 5:2) and only in a later part of the results check if the Ghrelin receptor plays a role or not in this paradigm.

      We thank the reviewer for these comments. In line with comments from the other reviewers we will re-formulate the presentation of our datasets. We will remove ‘age’ as a key variable as age related changes are to be expected. For the revision, we will separate the adolescent and adult mouse data sets, plotting individual graphs for both. This should provide a clearer focus on 5:2 responses in both assessed genotypes.

      This re-configuration will impact the data being analysed and, therefore, the statistical analysis presented. In our original manuscript, post hoc analyses were performed; however, only significant post hoc comparisons were highlighted (e.g. Figure 5), and non-significant post hoc comparisons were not presented. In the Methods section of the revised manuscript, we will clarify that post hoc differences are reported when they are observed.

      During our study design, we decided to assess diet and genotype in parallel - as part of the same analysis. This seemed to us to be the most appropriate statistical method, so that we assessed dietary responses in both WT and GHS-R null mice.

      As this 5:2 is a very specific paradigm, it is furthermore difficult to compare these results to other studies and the conclusions are only valid for this specific pattern and timing of the intervention (6 weeks). It remains unclear why the authors have not first tried to establish a study with wildtype mice and a similar duration as in previous studies observing beneficial effects of intermittent fasting on neurogenesis. That way, it would have been possible to state whether the 5:2 diet per se does not increase neurogenesis or whether the 6 weeks of exposure were just too short.

      The reviewer raises this relevant point which we considered during the study design period. Given that we had previously reported significant modulation of AHN with a relatively short period of 30% CR (14 days followed by 14 days AL refeeding (Hornsby et al.2016)), we predicted that a 6 week course on the 5:2 paradigm (totalling 12 days of complete food restriction over the 6 week period) would provide a similar dietary challenge. The fact that we did not observe similar changes in AHN with this 5:2 paradigm is notable.

      The graphical representation of the data could also be improved. A few examples are listed below:

      1.) Figure 1 B and C, the same symbol and colours are used for the adolescent and adult animals, which makes the graphs hard to read. One colour and symbol per group throughout the manuscript would be better.

      We thank the reviewer for this comment. We will amend the presentation of the graphs throughout the manuscript to ensure that they are easier to interpret.

      2.) The authors found no differences in the total number of Ki67 positive cells in the DG. However, Ki67 staining does not allow to conclude the type of cell which is proliferating. It would thus strengthen the findings if this analysis was combined with different markers, such as Sox2, GFAP and DCX.

      Double labelling of Ki67 positive cells would allow for further insight into the identity of distinct proliferating cell populations. However, quantifying Ki67 immunopositive cells within the sub-granular zone of the GCL, as a single marker, is commonly used in studies of AHN. Given that studies of intermittent fasting, calorie restriction and treatment with exogenous acyl-ghrelin report no effect on NPC cell division, we decided not to pursue this line of inquiry.

      3.) In Figure 3, the authors say that the diet increases the number of DCX in adolescent and adult mice, which is not clear when looking at the graph in 3B. Are there any significant differences when directly comparing the corresponding groups, for instance the WT AL vs the WT 5:2? It is further not clear how the authors distinguished the different types of DCX morphology-wise. The quantification in C and D would need to be illustrated by example images. Furthermore, the colour-code used in these graphs is not explained and remains unclear.

      While the 3 way ANOVA does yield a significant overall effect for diet, we agree that it is indeed difficult to see a difference on the graph, although the mean values of the adolescent 5:2 animals are more prominent than the AL counterparts. Mean +/- SEM will be provided in the supplementary section of the revised manuscript. Furthermore, we will clarify the method used to identify distinct DCX+ morphologies, include representative high-resolution images of each DCX+ cell category, and amend the colour coding to avoid misinterpretation.

      4.) In Figure 5, the authors show that the number of new NSCs is significantly increased in the adolescent GHS-R-/- mice, independent of the diet, but this increase does not persist in the adult mice. They conclude that "the removal of GHS-R has a detrimental effect on the regulation of new NSC number..."; this claim is not substantiated and needs to be reformulated. As the GHS-R-/- mice have a transcriptional blockage of Ghsr since the start of its expression, would such an effect on NSC regulation not result in an overall difference in brain development, as ghrelin is also important during embryonic development?

      This is an interesting point. However, we disagree that the statement "the removal of GHS-R has a detrimental effect on the regulation of new NSC number..." is unsubstantiated, since it does not exclude any developmental deficits in these mice that may account for the differences observed. Nonetheless, we will rephrase the sentence to clarify our intended point and remove any ambiguity.

      5.) In Figure 6, the authors assess spatial memory performance with a single behavioral test, the OIP. As these kinds of tests are influenced by the animal's motivation to explore, its anxiety levels, physical parameters (movement) etc., the interpretation of such a test without any additional measured parameters can be problematic. The authors claim that the loss of GHS-R expression impairs spatial memory performance. As only the discrimination ratio was calculated, it is not possible to see if there is an overall difference in exploration time between genotypes. This would be good additional information to display.

      We thank the reviewer for this insight. We have open-field recordings that will now be analysed to assess general locomotor activity, anxiety and exploration behaviour. These data, alongside exploratory time of the mice during the OIP task will be incorporated into the revised manuscript.

      Besides these points listed above, the methods are presented in such a way that they can be reproduced. The experiments contained 10-15 mice per group, which is a large enough group to perform statistical analyses. As mentioned above, the statistical analysis over all 4 groups with p-values for the main effects should be followed by post hoc multiple comparison tests to allow the direct comparison of the corresponding groups.

      Reviewer #3 (Significance (Required)):

      In recent years, growing evidence has suggested that IF might have positive effects on health in general and also on neurogenesis. However, a few recent studies report no effects on neurogenesis, using different IF paradigms. This study adds further evidence that not all IF paradigms influence neurogenesis and shows that more work needs to be done to better understand when and how IF can have beneficial effects. This is an important finding for the neurogenesis field, but the results are only valid for the specific paradigm used here, which limits its significance. The reporting of such negative findings is however still important, as it shows that IF is not just a universal way to increase neurogenesis. In the end, such findings might have the potential to bring the field together to come up with a more standardized dietary intervention paradigm, which would be robust enough to give similar results across laboratories and mouse strains, and would allow testing the effect of genetic mutations on dietary influences on neurogenesis.

      We thank the reviewer for their insightful and thorough feedback.

      1. Description of the revisions that have already been incorporated in the transferred manuscript

      Please insert a point-by-point reply describing the revisions that were already carried out and included in the transferred manuscript. If no revisions have been carried out yet, please leave this section empty.

      The manuscript has not been revised at this stage.

      2. Description of analyses that authors prefer not to carry out

      Please include a point-by-point response explaining why some of the requested data or additional analyses might not be necessary or cannot be provided within the scope of a revision. This can be due to time or resource limitations or in case of disagreement about the necessity of such additional data given the scope of the study. Please leave empty if not applicable.


      We have included in our replies to the reviewers a description of the amendments that we will make to our manuscript. Two requested revisions stand out as either unnecessary or beyond what can be provided within the scope of a revision.

      The first was the request to perform the 5:2 study in older mice. This is an interesting suggestion; however, the expense and time needed to maintain mice into old age (e.g. >18 months) cannot be provided within the scope of our revision. In addition, given that we report no effect of the 5:2 paradigm on AHN in adolescent (7 week old) and adult (7 month old) mice, there is less justification for such a study in older mice.

      The second request, that we disagree with, was to remove data relating to the GHS-R null mice (see reviewer 2, point 2). The role of ghrelin signalling via its receptor, GHS-R, is a central tenet of our hypothesis. Whilst our findings do not support a role for ghrelin signalling in the context of the 5:2 diet studied, we followed the scientific method to empirically test the stated hypothesis. While critiques of experimental design are welcome, the removal of such data may perpetuate publication bias in favour of positive outcomes and is something we wish to avoid.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      In this study, Roberts and colleagues used a specific paradigm of intermittent fasting, the 5:2 diet, meaning 5 days ad libitum food and 2 non-consecutive days of fasting. They exposed adolescent and adult wildtype mice and ghrelin receptor knockout mice (GHS-R-/-) for 6 weeks to this paradigm, followed by 1 week ad libitum food. They further used the "object in place task" (OIP) to assess spatial memory performance. At the end of the dietary regime, the authors quantified newborn neurons and neural stem cells (NSCs) by immunohistochemistry. Roberts et al. show that the 5:2 diet does not change the proliferation of cells in the hippocampus, but report an increased number of immature neurons (based on DCX) in all the mice exposed to the 5:2 diet. This change, however, did not result in an increased number of mature adult-born neurons, as assessed by a BrdU-birthdating paradigm. The authors further show diet-independent effects of the ghrelin receptor knockout, leading to fewer adult-born neurons, but more NSCs in the adolescent mice and a lower performance in the OIP task.

      Major comments:

      The main conclusion of this study is that a specific type of intermittent fasting (5:2 diet) has no effects on NSC proliferation and neurogenesis. As there are several studies showing beneficial effects of intermittent fasting on adult neurogenesis, while other studies found no effects, it is important to better understand the effects of such a dietary paradigm.

      The experimental approaches used in this manuscript are mostly well explained, but it is overall rather difficult to follow the results part, as the authors always show the 4 experimental groups together (adolescent vs adult and wt vs GHS-R-/-). They highlight the main effects comparing all the groups, which most of the time is the factor "age". Age is a well-known and thus not surprising negative influencer of adult neurogenesis. Instead of focusing on the main tested factor, namely the difference in diet, the authors show example images of the two age classes (adolescent vs adult), which does not underpin the major point they are making. Most of the time, they do not provide a post hoc analysis, so it is difficult to judge if the results with a significant main effect would be significant in a direct 1 to 1 comparison of the corresponding groups. The authors point out themselves that previous rodent studies did not use such a 5:2 feeding pattern, so having diet, age and genotype as factors at the same time makes the assessment of the diet effect more difficult. The manuscript would improve if the authors restructure their data to compare first the diet groups (adolescent wt AL vs 5:2 and in a separate comparison adult wt AL vs 5:2) and only in a later part of the results check if the Ghrelin receptor plays a role or not in this paradigm.

      As this 5:2 is a very specific paradigm, it is furthermore difficult to compare these results to other studies and the conclusions are only valid for this specific pattern and timing of the intervention (6 weeks). It remains unclear why the authors have not first tried to establish a study with wildtype mice and a similar duration as in previous studies observing beneficial effects of intermittent fasting on neurogenesis. That way, it would have been possible to state whether the 5:2 diet per se does not increase neurogenesis or whether the 6 weeks of exposure were just too short.

      The graphical representation of the data could also be improved. Below are a few examples listed:

      1. Figure 1 B and C, the same symbol and colours are used for the adolescent and adult animals, which makes the graphs hard to read. One colour and symbol per group throughout the manuscript would be better.
      2. The authors found no differences in the total number of Ki67 positive cells in the DG. However, Ki67 staining does not allow to conclude the type of cell which is proliferating. It would thus strengthen the findings if this analysis was combined with different markers, such as Sox2, GFAP and DCX.
      3. In Figure 3, the authors say that the diet increases the number of DCX in adolescent and adult mice, which is not clear when looking at the graph in 3B. Are there any significant differences when directly comparing the corresponding groups, for instance the WT AL vs the WT 5:2? It is further not clear how the authors distinguished the different types of DCX morphology-wise. The quantification in C and D would need to be illustrated by example images. Furthermore, the colour-code used in these graphs is not explained and remains unclear.
      4. In Figure 5, the authors show that the number of new NSCs is significantly increased in the adolescent GHS-R-/- mice, independent of the diet, but this increase does not persist in the adult mice. They conclude that "the removal of GHS-R has a detrimental effect on the regulation of new NSC number..."; this claim is not substantiated and needs to be reformulated. As the GHS-R-/- mice have a transcriptional blockage of Ghsr since the start of its expression, would such an effect on NSC regulation not result in an overall difference in brain development, as ghrelin is also important during embryonic development?
      5. In Figure 6, the authors assess spatial memory performance with a single behavioral test, the OIP. As these kinds of tests are influenced by the animal's motivation to explore, its anxiety levels, physical parameters (movement) etc., the interpretation of such a test without any additional measured parameters can be problematic. The authors claim that the loss of GHS-R expression impairs spatial memory performance. As only the discrimination ratio was calculated, it is not possible to see if there is an overall difference in exploration time between genotypes. This would be good additional information to display.

      Besides these points listed above, the methods are presented in such a way that they can be reproduced. The experiments contained 10-15 mice per group, which is a large enough group to perform statistical analyses. As mentioned above, the statistical analysis over all 4 groups with p-values for the main effects should be followed by post hoc multiple comparison tests to allow the direct comparison of the corresponding groups.

      Minor comments:

      The authors should provide more information in the figure legends and always show representative images of the parameters analyzed. Some of the images are also of low resolution and should be replaced with higher resolution images (for instance Fig. 5A). The significant P values of the multiple comparison between groups should be added into the figures.

      Significance

      In recent years, growing evidence has suggested that IF might have positive effects on health in general and also on neurogenesis. However, a few recent studies report no effects on neurogenesis, using different IF paradigms. This study adds further evidence that not all IF paradigms influence neurogenesis and shows that more work needs to be done to better understand when and how IF can have beneficial effects. This is an important finding for the neurogenesis field, but the results are only valid for the specific paradigm used here, which limits its significance. The reporting of such negative findings is however still important, as it shows that IF is not just a universal way to increase neurogenesis. In the end, such findings might have the potential to bring the field together to come up with a more standardized dietary intervention paradigm, which would be robust enough to give similar results across laboratories and mouse strains, and would allow testing the effect of genetic mutations on dietary influences on neurogenesis.

    1. Indeed, although the employee works at a distance from the company, they remain subject to the legal limits imposed by the French Labour Code (Code du travail): 10 hours per day, 48 hours per week at most, 44 hours on average over twelve weeks, and a minimum daily rest period of 11 hours

      A reminder of the basic legislation

    1. However, the models were able to fully or mostly complete many relevant subtasks. Given only the ability to write and run code, models appear to understand how to use this to browse the internet, get humans to do things for them, and carry out long-term plans – even if they cannot yet execute on this reliably

      Woah!

    1. Since this book is also about ethics, we should mention that the first thing these women were asked to program on the ENIAC was some calculations to help build thermonuclear bombs. How do you think they might have felt about being asked to do this? The building of those bombs involved many scientists and other professionals along the way, several of whom were not on board with the idea of what their calculations were being used for. This has raised questions about moral responsibility: were the women made complicit in whatever moral wrongs may have come about using calculations they performed using the ENIAC?

      This note, together with the pictures and the description of the history of computer languages and of the code the women entered, added to my understanding and prompted a series of thoughts. It made me aware of the tremendous advances in computer languages and technology today.

    1. When someone wants a computer to perform a task (that hasn’t already been programmed), a human programmer will act as a translator to translate that task into a programming language. Next, a compiler (or interpreter) program will translate the programming language code into the binary code that the computer runs. In this set-up, the programming language acts as an intermediate language the way that French did in my earlier analogy.

      Reading this I think the analogy is very accurate and concise. I fully understand the meaning of the existence of computer language and how computers receive human instructions.

    1. map & and_then

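      Both are used to further process a Result value.

      Both apply a closure to the value inside Ok and leave an Err value untouched.

      The difference lies in how the closure's return value is handled:
      * map automatically wraps it in Result::Ok;
      * and_then does not, so the closure must itself return a Result.

      code:
      * result.and_then(|value| Ok(value + 2)): the closure returns a Result, which and_then passes through as is.
      * result.map(|value| value + 2): the closure returns the plain inner value, which map wraps in Ok.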
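
      The distinction above can be sketched in Rust as follows. This is a minimal illustration, not part of the annotated article: the function names `add_two_with_map` and `add_two_with_and_then` are hypothetical, and `checked_add` is used only as an example of a step that can itself fail, which is the case `and_then` is meant for.

```rust
// Hypothetical helper: add 2 using `map`.
// The closure returns a plain i32; `map` wraps it back into `Ok`.
fn add_two_with_map(result: Result<i32, String>) -> Result<i32, String> {
    result.map(|value| value + 2)
}

// Hypothetical helper: add 2 using `and_then`.
// The closure must return a `Result` itself; here the addition is
// overflow-checked, so the closure can turn an `Ok` into an `Err`.
fn add_two_with_and_then(result: Result<i32, String>) -> Result<i32, String> {
    result.and_then(|value| {
        value
            .checked_add(2)
            .ok_or_else(|| String::from("overflow"))
    })
}

fn main() {
    let ok: Result<i32, String> = Ok(40);
    let err: Result<i32, String> = Err(String::from("bad input"));

    // Both methods transform the value inside `Ok`.
    println!("{:?}", add_two_with_map(ok.clone()));
    println!("{:?}", add_two_with_and_then(ok));

    // Both methods leave an existing `Err` untouched.
    println!("{:?}", add_two_with_map(err.clone()));
    println!("{:?}", add_two_with_and_then(err));

    // Only `and_then` lets the closure report a new failure.
    println!("{:?}", add_two_with_and_then(Ok(i32::MAX)));
}
```

      In both cases the overall expression has type Result; what differs is whether the closure returns the inner value (map) or a Result (and_then).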

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Thank you for the rapid and favorable reviews of our manuscript entitled “Long-Read Genome Assembly and Gene Model Annotations for the Rodent Malaria Parasite Plasmodium yoelii 17XNL.” We particularly appreciated that both reviewers had substantial, detailed expertise with the sequencing and assembly of Plasmodium genomes, and valued their questions and suggestions to ensure high rigor of our work. We have addressed all of the reviewers’ comments in the revised manuscript, and have provided a point-by-point response to each below.

      Response to Reviewers

      Note: Point-by-point responses are provided in italics below each reviewer comment. Line numbers referenced in our responses refer to their final line position in the Track Changes version of the manuscript.


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The manuscript entitled "Long-Read Genome Assembly and Gene Model annotation for the Rodent Malaria Parasite P. yoelii 17XNL" is a well-written manuscript providing updates and important observations about the genome assembly and annotation of this specific non-lethal isolate. The group overall did a great job showing how newer technologies such as long-read DNA and direct RNA sequencing can be applied to generate top-quality genomes to be used as a reference for the community. Here are some comments about the work presented:

      Response: Thank you for your positive feedback and suggestions on how to clarify these findings. We have improved the revised manuscript based on your feedback and suggestions below.

      Major comments: - The authors added several results to the Methods section, making the text repetitive, since the same information is also presented in the Results section. Please revise the Methods section to remove results from it.

      Response: We agree and have streamlined both the Results and Methods sections to remove redundancy in these descriptions.

      • Some methods are also redundant in the Results section. For example, in lines 141-142, the group describes which DNA extraction kit they used (this is correctly mentioned again in the Methods section).

      Response: We agree and have removed minutiae such as these from the Results section. These details remain in the Methods section to ensure reproducibility.

      • Although it is important, the group added a lot of information about method comparison between base call accuracy and sequencing methods. I agree that having this information in the supplemental material is great, but I would be careful not to focus too much on it, since most of the observations are already well known by the community, and would focus more on the biological relevance of what is being generated with the newly updated genome.

      Response: The advances in base calling algorithms do make substantial improvements to the Nanopore reads. We have only included a short description of this in the main manuscript and feel this is an appropriate amount of context for the typical reader. Those that love these details and want to dig further can find this content in our supplemental information.

      • The group did a great job generating two versions of the genome, and an updated gene annotation set using long-read sequencing. But the major question is, how about alternative splicing? They mention the use of it (line 350) but I don't see any result about how many alternative transcripts were observed, and if they were differentially detected in different life stages of the sets used for the RNA sequencing. This is a very important result to add, since it is one of the key pieces of information that long-read RNA sequencing brings to genome annotation.

      Response: We have now expanded this description in the manuscript to note that 866 genes are predicted to have multiple transcript isoforms (Lines 240-241). Moreover, we have now generated a Supplemental Table 4 that lists these isoforms in the revised manuscript. As we have not conducted further validation of this large number of transcript isoforms, we have left the description at this level.

      • Same observation as above for potential long ncRNAs.

      Response: We agree that lncRNAs are a fascinating aspect of the biology of the parasite, but a proper analysis of this class of RNA is far outside of the scope of this current study. Automatic identification approaches with Nanopore data will likely yield high numbers of false positives, which require manual curation for rigorous annotation. We hope others can use these data to accelerate such studies as well.

      • From what I understand, the HiFi run was able to generate a gapless genome assembly and the ONT run was not. What was the final coverage for each? From my experience with P. falciparum genomes, ONT even with the rapid kit was able to generate chromosome-level assemblies if the coverage was >100x (but again, this is not a rule). Add those valuable observations about the depth so the reader can check whether other variables should be considered in the comparison.

      Response: This is a particularly interesting aspect of not only our datasets, but of other Plasmodium genomes as well. This issue occurs at least in part due to the presence of many repeated elements in the subtelomeric regions. It is important to note that these repeated elements do not resolve into a single haplotype in an assembly due to conflicting information, not due to lack of coverage. For instance, regions may differ by only a few nucleotides that each have significant read support. We are particularly interested in a recent preprint that concludes that P. falciparum harbors extrachromosomal plasmids with these var sequences present (doi.org/10.1101/2023.02.02.526885). If this observation is supported via peer review, this interpretation could also begin to explain our results with P. yoelii 17XNL as well.

      • Also be sure that the structural comparisons between the genomes are not the ones used after running ragtag.py. If so, there is a high chance of structural bias in the scaffolded contigs.

      Response: We apologize for the confusion. We did not use ragtag for the PacBio assembly, and all structural and variant comparisons were done using the PacBio assembly. However, we did use ragtag for the Nanopore assembly that is included in this study as an additional resource to our community. These data were not used for variant calling though.

      • How did Prokka differ from Braker2 for the mitochondrial/apicoplast (MIT/API) annotation? This needs to be very well described, since Prokka is made for prokaryotic organisms and not for eukaryotic ones, and Braker2 uses a custom-built dataset for training, which I believe contains known information about MIT/API for Plasmodium species.

      Response: We first applied Braker2 to the organellar genomes and identified only 6 genes in the apicoplast genome and only 2 genes in the mitochondrial genome. Due to their prokaryotic origin, we then tested if Prokka could alleviate this issue. To do so, we applied Prokka to the 17X reference genome and found that it detected all of its annotated organellar genes. Therefore, we also applied Prokka to our Py17XNL genome to annotate the genes found on the apicoplast and mitochondrial genomes. As a final validation check, the gene annotations on these two organellar genomes are effectively identical between 17X and 17XNL. This is consistent with the sequencing results and assemblies that show that the apicoplast genome is identical and the mitochondrial genome differs in a single, notable deletion in 17XNL.

      • Figure 5B, what is the peak observed in the mitochondria? What genes? Repeats?

      Response: What appears to be an inward pointed trough actually reflects the deletion of bases in 17XNL compared to the 17X assembly. We have clarified this in the manuscript on Lines 296-297 and in the legend of Figure 5.

      Minor comments: - For Oxford nanopore sequencing using the ligation kit, did the group check for potential chimeric reads generated by the protocol?

      Response: We did. We used the adapter trimming software, Porechop, to identify and bin chimeric reads that were eliminated from the dataset. This method is described in the Makefile associated with the manuscript.

      • Check if all species are italicized (for example, line 187 P. yoelii is not)

      Response: We have italicized this instance of P. yoelii and have reviewed the document to search for any other words that should be italicized.

      • In methods add the parameters for minimap2 for the direct RNA alignment

      Response: We would encourage readers to view our Makefile that has all of the commands and parameters used for the bioinformatic work reported here.

      • For variant calling, I would use a minimum of 10x coverage to make a variant call instead of 5x. Although the calls look well reproducible between all checks, I would be careful mainly with the single bp deletions with such a low threshold.

      Response: Read counts for the called variants were generally greater than 20. Moreover, we took these validations a step further and manually curated these variants using the data from multiple sequencing platforms used in this study to ensure high rigor in making these variant calls. We have further clarified this in the revised manuscript.

      • In some parts of the methods, the authors mentioned slight modifications in some protocols (for example, lines 443 and 454). Although these are well described, could you highlight in the text what the modifications were? This will help many other researchers understand why those modifications were needed.

      Response: We have clarified these modifications in the revised manuscript. In short, these modifications consisted of: 1) For the HMW gDNA prep kit, an agitation speed of 1500 rpm was used as opposed to the recommended 2000 rpm due to limitations of our instruments. 2) A slow end over end mixing by hand was preferred over using a vertical rotating mixer as yield was consistently greater with this change. 3) For the RNeasy kit, the lysate was passed through a 20-gauge needle for homogenization of the sample. Instead of an on-column DNaseI treatment, the RNA was treated with DNaseI off of the column to promote complete DNA digestion. 4) A second elution from the RNeasy column was performed in order to improve yield.

      • As mentioned in the major comments, the data analysis methods section needs rework to remove results from the text.

      Response: We have revised the manuscript accordingly.

      • The group mentioned that small contigs not mapping to Py17X were discarded. What are those? Repeats? Contamination?

      Response: These contigs were of mouse origin, as P. yoelii was grown in Swiss Webster mice in this work. We have clarified this in the revised manuscript on Lines 183-184.

      Reviewer #1 (Significance (Required)):

      This work generated a strong method and resource for a better genome assessment of P. yoelii for the community. As I mentioned in my comments, some more details about the findings, such as alternative splicing and lncRNAs, may strengthen the publication even more. I know that comparative analysis between Py17X and XNL is not in the scope here, but more information about it, such as a synteny plot, would be great for the community to understand that they can rely on this new reference genome. I've been working with eukaryotic and prokaryotic genomes for more than a decade and I have a lot of experience with all the methods presented. I believe that potentially the depth generated for the ONT data may be one of the factors for not reaching the chromosomal level of this isolate, since HiFi was. The group did a great job on the method description, and I believe that the community will be very happy to incorporate this genome as one of the references for this organism.

      Response: We are thrilled that you value the data and the rigor of our approaches. We also believed that a direct comparison between 17X and 17XNL strains is critical. Because of this, we provided details of this comparison in Figures 5 and 6, as well as in supplemental files. Because our colleagues often use these strains interchangeably, it is important for our community to know what differences are present between the parental 17X and the cloned 17XNL line. While substantial identity exists between the 17X and 17XNL strains, there are many variants between them, including many that affect genes that are known to have essential functions for the parasite. For this reason and more, we believe the true 17XNL genome assembly will be a preferred reference once it is fully integrated into PlasmoDB.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The paper has three distinct parts: 1. Assembly of the P. yoelii yoelii 17XNL; 2. Annotation of the genome and adding UTR regions; 3. Comparing the sequence of 17XNL with 17X.

      Assembly: The authors present a novel assembly for the P. yoelii yoelii 17XNL genome. They used two different approaches, comparing Oxford Nanopore (ONT) long reads + Illumina DNA with PacBio HiFi. Neither approach generated a telomere-to-telomere assembly, so sequences from the 17X reference were used to fill in the missing sequence.

      Response: Please also see the comment from Reviewer 1 and our response. The presence of many repeated elements in the subtelomeric regions leads to the challenges noted here about a telomere-to-telomere assembly, as well. The presence of these elements means that the sequences do not resolve into a single haplotype in an assembly due to conflicting information, not due to lack of coverage. Because of this, we have chosen to harmonize the selected haplotype at these subtelomeric regions with that of 17X, while still acknowledging and providing the complex data associated with the subtelomeric regions.

      Annotation: Next, they generated long reads (ONT) and Illumina RNA-Seq to improve the annotation. Although their annotation is not better than the current P. yoelii 17X reference genome in PlasmoDB, they could predict the UTR regions and alternative splice sites due to the 3' capturing approach and long reads. Having the UTRs annotated and potentially having alternative splice sites is useful for the field.

      Response: We agree that the additional gene model annotations for both UTRs and alternative transcript isoforms are a valuable resource to our community. We are working with PlasmoDB currently to make these data readily accessible.

      17XNL - 17X comparison: The authors compared the 17XNL with the 17X reference. Both genomes were done with PacBio, and it should be noted that P. yoelii has a GC content of probably ~23% with several homopolymer tracts. Further, the 17XNL genotype was obtained from a 17X culture, so the genomes are expected to be very similar, as the authors noted in the introduction. The authors found ~2000 differences; some are in genes, but many are indels, which very well could be sequencing errors. Finally, the authors claim that this genome could become relevant for the community as a new reference to perform analysis. As their genome is so similar to 17X, they have to show that their annotation is at least as good as the current 17X reference genome (manually curated) and that the differences are not due to sequence errors in 17X or 17XNL.

      Response: As we describe below, we have taken multiple steps to inspect the quality of the 17X genome assembly (it is very robust), to call variants between strains, and to validate them using our data across multiple sequencing platforms and via manual curation. Because of this, we view these as true variants between the 17X and 17XNL genomes.

      Major comments: Overall I struggle to see the need for a "NEW" P. yoelii reference. It would be good to state how similar these genomes are - they are basically identical. As the 17XNL is curated manually, it would have made more sense to me to start from that one and then generate the UTR annotation and include splice sites. This could be easily loaded into an alternative Web-apollo track and then merged to the current annotation to be useful to the community.

      Response: We chose to generate a new reference assembly for 17XNL because the current one is from 2002, remains in >5000 contigs, has gene identifiers that do not align with other current Plasmodium gene models (e.g., PY00204 vs. PY17X_0502200), and historically has had problematic gene models attributed to individual genes. This clean start ensures that users can know the provenance of the underlying data that created the genome assembly and gene models.

      I wonder if many of the differences the authors found between 17X and the 17XNL reference are true. The authors are correct that some differences between 17X and 17XNL are true. However, I could not find any evidence of genome polishing with tools like Pilon or ICORN to correct sequencing errors, so I wonder if these differences are sequencing errors.

      Response: The PacBio-based assembly received no error correction or polishing. It should be noted that all variants that were called automatically were also manually verified using data from multiple sequencing platforms generated in this study. Moreover, for coding sequences, we imposed a threshold that 80% of all reads at the variant’s location needed to support the variant in order to be considered true. Through these strict thresholds, we eliminated many potential variants that only had support from one sequencing platform. We highlight several variants that were confirmed through multiple datasets in Table 2.
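
      The read-support rule described in this response can be illustrated with a short sketch. This is not the authors' code: the `VariantCall` type, the `passes_support` function, and the example read counts are hypothetical, and the `min_depth` argument is only an illustrative stand-in for the coverage floor the reviewer raised. The one value taken from the text is the 80% supporting-read fraction.

```python
from dataclasses import dataclass


@dataclass
class VariantCall:
    """Hypothetical container for one candidate variant."""
    position: int      # coordinate of the variant on its contig
    alt_reads: int     # reads supporting the variant allele
    total_reads: int   # all reads covering the position


def passes_support(call: VariantCall, min_fraction: float = 0.8, min_depth: int = 1) -> bool:
    """Keep a call only if enough reads cover it and enough of them support it."""
    if call.total_reads == 0 or call.total_reads < min_depth:
        return False
    return call.alt_reads / call.total_reads >= min_fraction


# Made-up example calls, not data from the study.
calls = [
    VariantCall(position=1042, alt_reads=18, total_reads=20),  # 90% support
    VariantCall(position=2310, alt_reads=9, total_reads=20),   # 45% support
    VariantCall(position=5120, alt_reads=4, total_reads=4),    # full support, low depth
]

kept = [c.position for c in calls if passes_support(c, min_depth=5)]
print(kept)
```

      In this sketch the second call is dropped for insufficient support and the third for insufficient depth, which shows how the two criteria act independently.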

      Did the authors look into the reads of the NCBI - GCA_900002385.2 - assembly? Maybe they could use the underlying Illumina reads if theirs don't have enough coverage. Also, the differences between 17X and 17XNL could be that the reference is wrong. How many pseudo genes did they obtain? Are there more or less than in the current reference?

      To confirm the calls, could you also map the 17XNL reads against the 17X reference and see if they are still true? At the same time, map the 17X Illumina reads to see if the reference is correct at this state. When looking at the alignments, it can be seen that many differences are in low complexity/repetitive regions.

      Response: We analyzed both their raw and assembled data to compare them with our results, and we determined that the 17X data and assembly were robust and that these differences likely reflect true variance between the strains. The 17X reference has 57 pseudogenes that are annotated as pir, fam-a/c, or others. Overall, there were 1057 pir genes annotated in the 17X genome, whereas we annotated 1048 for our Py17XNL genome. There were 302 fam-a/b genes annotated in the 17X genome, whereas we annotated 301 for our Py17XNL genome. As noted above, we confirmed variant calls using data from multiple sequencing platforms in this study as well as through manual curation.

      The authors sequenced their genome with a PacBio HiFi run and also ONT + DNA-Seq, but why did they not get 16 chromosomes out? For example, the current P. yoelii reference was assembled directly into far fewer pieces than theirs [P. chabaudi assembles into 16 pieces]. Could it be a different read depth, or is it the fragment length? Could the authors please comment on that? Also, if there were contigs, why did they fill the sequence with 17X sequence rather than keeping gaps? So in the end, their sequence is a hybrid of 17X and 17XNL, right?

      Response: Please see our responses above to both Reviewer 1 and 2 regarding the heterogeneity of the subtelomeric regions that indicate that a single haplotype is not readily called. This is not due to insufficient read depth, but rather we believe it reflects something fascinating about Plasmodium genomes in these regions. A recent preprint (doi.org/10.1101/2023.02.02.526885) provides one possible interpretation for this observation.

      Why do you think you had less coverage of CCS reads around the telomere ends? Do you think it is a systematic issue of PacBio HiFi? Did you see any evidence of Illumina or ONT reads - or could it be that the telomere ends dropped off during culturing?

      Response: See our response above about the challenging nature of the subtelomeric regions of Plasmodium genomes. As above, this is not an issue of coverage per se, but rather of heterogeneous related sequences that are not readily resolved into a single haplotype. In order to minimize the risk of sequencing a genome of a mixture of heterogeneous parasites, we sequenced “Pass 0” parasites received directly from BEI Resources to ensure this genome reflects the established P. yoelii 17XNL clone.

      I realised that the authors used a lot of primary tools. I wonder why they chose that path, as there are several tools to do automatic finishing for long read assemblies: Assemblosis, ARAMIS, MpGAP or ILRA. Especially the last one focuses on Plasmodium genomes. Please comment.

      Response: We initially started our bioinformatic analyses using established tools such as these. Specifically, we first tried Companion and ILRA, but the results were not superior to those we achieved with the workflow we describe in this manuscript, which also provided greater parameter control.

      Also, for the annotation, could it not be better to transfer the manually curated genome annotation with Liftoff or RATT? All these tools are widely used in the generation of reference genomes in the parasitology field. I annotated their sequence with Companion, and although their gene models are good and some of the Companion calls might need improvement, overall, the Companion results look more exact to me.

      Response: Companion was the original tool we used for the generation of gene models. While we found that it performed excellently for a pre-packaged software platform, we found it to be insufficiently customizable, and the results were not sufficiently accurate in our assessment. Additionally, lifting over information always raises the risk of imposing a different perspective on what is truly present. We believe that a high-quality, de novo assembly is always preferable, and therefore chose this workflow.

      The code is very well organised, and it was easy to follow. Are you planning to put it on a GitHub repository?

      Response: We appreciate this recognition. We believe clear reporting of the bioinformatics work is critical for rigor and reproducibility. Yes, all of this will also be provided in GitHub to benefit the wider community.

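      For the annotation in the attachment, there were two files. I had a look at them and they were quite different. As 17X and this genome are basically identical (<2k variants), would it not be better to transfer the genes from the 17X genome and then add the UTR (see comment before)? The 17X is manually curated. Table 1 and figure 4 show that it is far better. I doubt that the community would use this genome if the annotation is not lifted over. There are two gff files in the supplemental. Which one is better?

      Response: The two gff files represent either a Nanopore-only or a hybrid Nanopore+Illumina-based model. The latter produced a more comprehensive annotation of gene models, which is what we have proceeded with. However, we provided both in case end users find value in the Nanopore-only annotation, which has a 3’ bias due to the mechanism of how sequencing occurs via this approach.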

      We have found meaningful variations in genome sequence that potentially impact biological function (see Discussion). Therefore, we maintain that these genomes are not basically identical and are useful to the malaria research community for these reasons and more.

      It is excellent that the genome is submitted to NCBI. Why are there 18k proteins? Are these the alternative spliced forms?

      Response: We are not certain how this interpretation might have arisen, as we have only reported 7047 potential transcript isoforms to NCBI based upon our data.

      Minor: The current Py 17X genome in PlasmoDB is a PacBio assembly (https://plasmodb.org/plasmo/app/record/dataset/TMPTX_pyoeyoelii17X), but not part of the 2014 paper. It was submitted to NCBI later than the paper the authors cite. Also, the current P. berghei PacBio genome is from Fougère et al. PLoS Pathog 2016;12(11):e1005917.

      Response: We have now made a detailed note about the Py17X PacBio dataset in our revised manuscript on Lines 186-187. Mentions of the current P. berghei genome assembly had already cited the Fougère et al. publication.

      I tried to open the supplemental tables, but they were all in pdf rather than excel and split over several pages. Two had missing information, e.g. UTR per gene. From the name of the tables, I had an idea of what they should contain, but for a re-submission, it would be good to have them in the correct format.

      Response: We agree that provision of the PDFs of the supplemental files is not the ideal way to review these analyses. The complete data was also provided in the Excel files provided to Review Commons. We will ensure that the affiliate journal receives the Excel files for completion’s sake.

      To me, the beginning of the results reads a bit like an introduction (the part which sequencing technology to use)

      Response: We agree, and as noted to Reviewer 1 above, we have streamlined this section of the revised manuscript.

      Could you add to the tables: sequence coverage of the three technologies, how many contigs you had before ordering the contigs, and the number of pseudogenes in the annotation?

      Response: This information is now provided in Supplemental Table 3 in the revised manuscript.

      I struggle with the section header line 229-230 that the new sequence is more complete as it is a hybrid assembly with 17X. Alternatively, please explain how the consensus was built.

      Response: We agree and have revised this section header for accuracy.

      The authors correctly state that ONT is great, lines 333ff, but why does it not generate telomere-to-telomere chromosomes in this case? Please discuss.

      Response: Please see our response to this above for remarks made by both Reviewer 1 and 2. We have also added clarifying text in our revised manuscript discussing why this may have occurred.

      Reviewer #2 (Significance (Required)):

      General assessment: As mentioned above, I struggle to see this as a strong leap for the malaria community to use this genome, as it is so similar to the current 17X genome, which is manually curated in PlasmoDB.

      Response: We agree that it is important to know how similar the genomes of 17X and the cloned 17XNL strain are. It is perhaps even more important to know what the key differences are as well. In this study, we have asked and answered these questions, and identified 2000+ variants between the strains. We have manually curated several of the variants that impact the expression of essential/important genes, and found that biologically meaningful differences exist (see Discussion). Finally, we have also provided additional information on the gene models of 17XNL, including an experimental definition of UTRs and transcript isoforms. Together, we hold that these data will not only match those currently available for 17X, but will exceed them. We are currently working with PlasmoDB to make these data readily accessible to our community.

      Advance: The authors should make the comparison of ONT and PacBio HiFi clearer and discuss why the technologies still don't generate telomere-to-telomere sequences. From the biological side, none of the found differences were related to the different phenotype between 17X and 17XNL.

      Response: We have provided these comparisons and all related data to the reader in this manuscript, as well as through public depositories. Please see above for our responses as to why a true telomere-to-telomere assembly is challenging with Plasmodium parasites, and for a recent preprint that might provide an explanation for this. Finally, the phenotypic differences between 17X and 17XNL are variable, which might reflect differences in individual parasite stocks, as has been historically seen in the spontaneous development of lethality in multiple laboratories. While we do not find that any particular genetic difference correlates with a specific phenotype, these data, generated using the cloned 17XNL parasite available from BEI, provide a robust reference with a defined parasite stock.

      Audience: I do agree that adding the UTR sequence will be useful for those working with P. yoelii as a model, or who want to do comparative UTR analysis across species.

      Response: We agree that this additional gene model information will be valuable. We are working with PlasmoDB to make this information readily available and are already integrating it into our ongoing studies.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      The paper has three distinct parts,

      1. Assembly of the P. yoelii yoelii 17XNL
      2. Annotation of the genome and adding UTR regions
      3. Comparing the sequence of 17XNL with 17X.

      Assembly: The authors present a novel assembly for the P. yoelii yoelii 17XNL genome. They used two different approaches, comparing Oxford Nanopore (ONT) long reads + Illumina DNA with PacBio HiFi. Neither approach generated a telomere-to-telomere assembly, so sequences from the 17X reference were used to fill in the missing sequence.

      Annotation: Next, they generated long reads (ONT) and Illumina RNA-Seq to improve the annotation. Although their annotation is not better than the current P. yoelii 17X reference genome in PlasmoDB, they could predict the UTR regions and alternative splice sites due to the 3' capturing approach and long reads. Having the UTRs annotated and potentially having alternative splice sites is useful for the field.

      17XNL - 17X comparison: The authors compared the 17XNL with the 17X reference. Both genomes were done with PacBio, and it should be noted that P. yoelii has a GC content of probably ~23% with several homopolymer tracts. Further, the 17XNL genotype was obtained from a 17X culture, so the genomes are expected to be very similar, as the authors noted in the introduction. The authors found ~2000 differences; some are in genes, but many are indels, which very well could be sequencing errors.

      Finally, the authors claim that this genome could become relevant for the community as a new reference to perform analysis. As their genome is so similar to 17X, they have to show that their annotation is at least as good as the current 17X reference genome (manually curated) and that the differences are not due to sequence errors in 17X or 17XNL.

      Major comments

      Overall I struggle to see the need for a "NEW" P. yoelii reference. It would be good to state how similar these genomes are - they are basically identical. As the 17XNL is curated manually, it would have made more sense to me to start from that one and then generate the UTR annotation and include splice sites. This could be easily loaded into an alternative Web-apollo track and then merged to the current annotation to be useful to the community.

      I wonder if many of the differences the authors found between 17X and the 17XNL reference are true. The authors are correct that some differences between 17X and 17XNL are true. However, I could not find any evidence of genome polishing with tools like Pilon or ICORN to correct sequencing errors, so I wonder if these differences are sequencing errors. Did the authors look into the reads of the NCBI - GCA_900002385.2 - assembly? Maybe they could use the underlying Illumina reads if theirs don't have enough coverage. Also, the differences between 17X and 17XNL could be that the reference is wrong. How many pseudo genes did they obtain? Are there more or less than in the current reference?

      To confirm the calls, could you also map the 17XNL reads against the 17X reference and see if they are still true? At the same time, map the 17X Illumina reads to see if the reference is correct at this state. When looking at the alignments, it can be seen that many differences are in low complexity/repetitive regions.

      The authors sequenced their genome with a PacBio HiFi run and also ONT + DNA-Seq, but why did they not get 16 chromosomes out? For example, the current P. yoelii reference was assembled directly into far fewer pieces than theirs [P. chabaudi assembles into 16 pieces]. Could it be a different read depth, or is it the fragment length? Could the authors please comment on that? Also, if there were contigs, why did they fill the sequence with 17X sequence rather than keeping gaps? So in the end, their sequence is a hybrid of 17X and 17XNL, right?

      Why do you think you had less coverage of CCS reads around the telomere ends? Do you think it is a systematic issue of PacBio HiFi? Did you see any evidence of Illumina or ONT reads - or could it be that the telomere ends dropped off during culturing?

      I realised that the authors used a lot of primary tools. I wonder why they chose that path, as there are several tools to do automatic finishing for long read assemblies: Assemblosis, ARAMIS, MpGAP or ILRA. Especially the last one focuses on Plasmodium genomes. Please comment.

      Also, for the annotation, could it not be better to transfer the manually curated genome annotation with Liftoff or RATT? All these tools are widely used in the generation of reference genomes in the parasitology field. I annotated their sequence with Companion, and although their gene models are good and some of the Companion calls might need improvement, overall, the Companion results look more exact to me.

      The code is very well organised, and it was easy to follow. Are you planning to put it on a GitHub repository? For the annotation in the attachment, there were two files. I had a look at them and they were quite different.

      As 17X and this genome are basically identical (<2k variants), would it not be better to transfer the genes from the 17X genome and then add the UTR (see comment before)? The 17X is manually curated. Table 1 and figure 4 show that it is far better. I doubt that the community would use this genome, if the annotation is not lifted over.

      There are two gff files in the supplemental. Which one is better? It is excellent that the genome is submitted to NCBI. Why are there 18k proteins? Are these the alternative spliced forms?

      Minor

      The current Py 17X genome in PlasmoDB is a PacBio assembly (https://plasmodb.org/plasmo/app/record/dataset/TMPTX_pyoeyoelii17X), but not part of the 2014 paper. It was submitted to NCBI later than the paper the authors cite. Also, the current P. berghei PacBio genome is from Fougère et al. PLoS Pathog 2016;12(11):e1005917.

      I tried to open the supplemental tables, but they were all in pdf rather than excel and split over several pages. Two had missing information, e.g. UTR per gene. From the name of the tables, I had an idea of what they should contain, but for a re-submission, it would be good to have them in the correct format.

      To me, the beginning of the results reads a bit like an introduction (the part which sequencing technology to use).

      Could you add to the tables: sequence coverage of the three technologies, how many contigs you had before ordering the contigs, and the number of pseudogenes in the annotation?

      I struggle with the section header line 229-230 that the new sequence is more complete as it is a hybrid assembly with 17X. Alternatively, please explain how the consensus was built.

      The authors correctly state that ONT is great, lines 333ff, but why does it not generate telomere-to-telomere chromosomes in this case? Please discuss.

      Significance

      General assessment: As mentioned above, I struggle to see this as a strong leap for the malaria community to use this genome, as it is so similar to the current 17X genome, which is manually curated in PlasmoDB.

      Advance: The authors should make the comparison of ONT and PacBio HiFi clearer and discuss why the technologies still don't generate telomere-to-telomere sequences. From the biological side, none of the found differences were related to the different phenotype between 17X and 17XNL.

      Audience: I do agree that adding the UTR sequence will be useful for those working with P. yoelii as a model, or who want to do comparative UTR analysis across species.

    1. Telework was introduced into the French Labour Code (Code du travail) at article 1222-9 by the law of 23 March 2012 (article 46 of the so-called Warsmann law defines telework). This law provides for measures to protect data and preserve privacy. The National Interprofessional Agreement (Accord National Interprofessionnel) of 19 July 2005 gives, in its first article, the following definition of telework: "Telework is a form of organising and/or performing work, using information technologies, within the framework of an employment contract, in which work that could also have been performed on the employer's premises is carried out away from those premises on a regular basis."

      The author again sets out a legal framework, mentioning on the one hand the preservation of privacy and on the other the legal definition of telework. Epistemic register

    1. Author Response

      Reviewer #1 (Public Review):

      Bustion and colleagues outline the creation and testing of an in silico method to query gut microbiome databases for genes encoding enzymes predicted to catalyze a reaction of interest, which is provided by the user. Strengths of the tool include attempts to examine nearly 9,000 MetaCyc reactions in a pre-calculated fashion and to rank-order enzymes based on their likelihood of catalyzing a reaction. Substrates, products, and even cofactors, if known, are employed to strengthen the power of the search algorithm, which also employs a hidden Markov model to improve the selection of putative hit enzymes. The authors outline high success rates with examples presented and compare those results with other extant methods, which are reported to perform in a less robust manner. Weaknesses include lack of evidence of success on a more difficult "real world" example. However, the tool outlined is a clear advance over existing methods and will be useful to explore the diversity of chemical transformations performed by commensal microbiota.

      We thank Reviewer 1 for their positive feedback and constructive summary. We agree that a real-world example would add confidence to our findings. We previously demonstrated SIMMER’s utility using published datasets. To expand upon these findings, we added another evaluation on an external dataset (Artacho et al., 2020) and performed new experiments to test SIMMER predictions for methotrexate metabolism into DAMPA and glutamate, a reaction known to be performed by the human microbiome but for which human gut strains and specific gut enzymes were not previously known. Both the new external dataset and our experimental findings validate SIMMER’s predictions of bacteria capable of metabolizing methotrexate, the mainline therapeutic for rheumatoid arthritis patients.

      Reviewer #2 (Public Review):

      This work provides a new computational tool for the systematic characterization of biotransformation reactions in the human gut microbiome: given a biotransformation reaction of interest, it predicts a list of candidate bacterial species, enzymes, and EC identifiers putatively capable of performing the queried reaction. The method is innovative and clearly presented.

      The pipeline, which relies on both chemical and protein similarity algorithms, is in principle applicable to any biotransformation reaction that can be formulated as linked substrates and products (possibly including co-factors). This contrasts with other approaches that, for example, rely only on smaller databases and solely on substrates and chemical similarity. Moreover, SIMMER outperformed two other recently developed methods, against which it was benchmarked for its prediction accuracy when tested on a control test set derived from literature.

      The work interestingly focuses on predicting bacterial enzymes responsible for drug biotransformation, therefore showcasing its potential as a hypothesis generator for characterizing and validating novel bacterial enzymes in vitro.

      The authors correctly describe the relevance of an accurate input (in terms of reaction completeness, including cofactors and reaction products) as paramount for the quality of the prediction.

      The conclusions of this paper are mostly well supported by data, but some aspects of performance evaluation and its generality might benefit from additional elaborations and clarifications.

      1) Great emphasis has been dedicated to the prediction performance of SIMMER over a positive control set derived from the available literature. However, a more extensive description and analysis of false positive results are needed to better understand the possible impact of the (potentially many) false positive predictions listed for each reaction.

      We agree that our analysis would benefit from an assessment of false positives. Unfortunately, current literature usually reports which reactions an enzyme is capable, rather than incapable, of performing. For this reason, we took a conservative approach and decided to define all reactions preceding that which yielded a positive control enzyme sequence as false positives. This is now described above in Essential Revisions Response 1.3.
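
      The conservative false-positive definition in this response can be sketched as follows. This is only an illustration of the counting rule as worded here, not SIMMER's implementation: the function name `false_positives_before_hit`, the reaction identifiers, and the choice to return `None` when no positive control appears are all assumptions.

```python
from typing import Optional, Sequence, Set


def false_positives_before_hit(
    ranked_reactions: Sequence[str],
    positive_reactions: Set[str],
) -> Optional[int]:
    """Count ranked reactions that precede the first one yielding a positive control.

    Every reaction ranked above the first positive is treated as a false
    positive. Returns None if no ranked reaction is a positive control,
    because the count is undefined under this rule in that case.
    """
    for rank, reaction in enumerate(ranked_reactions):
        if reaction in positive_reactions:
            return rank  # number of reactions ranked ahead of the first hit
    return None


# Hypothetical ranking for one query, most similar reaction first.
ranking = ["rxn_A", "rxn_B", "rxn_C", "rxn_D"]
positives = {"rxn_C"}

print(false_positives_before_hit(ranking, positives))
```

      Under this rule, reactions ranked after the first positive control are not scored at all, which is part of why the authors call the approach conservative.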

      2) The authors imply that the current method is superior to two other methods based on accuracy. However, a more extensive description of the benchmarking results would strengthen these benchmarking efforts.

      We have addressed this concern in Essential Revisions Response 3.

      3) The authors only showcase SIMMER in the context of drug metabolism but claim its applicability to be general enough to also describe other biotransformation in the human gut microbiota. Although in principle believable, the authors could improve the credibility and generalizability of their method by demonstrating another use case, e.g., food compounds, for which extensive metagenomic and metabolomic data are already available from previous gut microbiome studies.

      We agree that assessments of SIMMER’s predictions on food metabolism would improve the generalizability of the method. We have edited the text to focus on drug metabolism, as we believe SIMMER’s application to food metabolism merits a more thorough, future investigation.

      4) Showcasing experimental in vitro validation of SIMMER predicted enzyme(s) could greatly strengthen the relevance of this work.

      We have addressed this in Essential Revisions Response 2.

      5) Throughout the text and the title, a more careful and precise phrasing of the tool's scope (characterization of microbiome-encoded enzymatic reactions and not the identification of novel chemical transformations) would improve the reader's understanding of the work.

      We agree, and have reworded many key phrases in the text, including the title.

      Reviewer #3 (Public Review):

      This manuscript presents a new tool, SIMMER, to predict bacterial enzyme-mediated transformations of compounds, an important and incompletely understood aspect of microbiome drug metabolism. The authors compare their resource to existing resources that allow users to generate hypotheses related to compound toxicity and putative routes of compound metabolism. The authors identify the key innovations of their resource as including full chemical representations of reactions and a novel method to predict an enzyme's EC number (a description of function) from its reaction.

      Strengths

      Generating user-friendly tools to explore existing knowledge of bacterial enzymes and their reactions is important.

      SIMMER is a novel resource where the user provides the substrates and products as input and receives a list of potential microbiome enzymes as output.

      SIMMER includes a novel EC predictor based on reaction rather than based on sequence.

      Weaknesses

      Validation claims are not well supported by the results.

      We have extensively edited the manuscript to better describe our previous computational validations, and we have added new analyses to further evaluate SIMMER. We added an additional validation on an external dataset, an in vitro experimental assessment of SIMMER’s predictions for methotrexate metabolism, two new reactions to the positive control analysis, a false positive rate, and additional comparisons to the two competing methods.

      Need for the user to know both the substrate and the product for a reaction of interest limits the utility of the resource.

      We agree that this is a limitation for the user, but as we show in our Results, relying on substrates alone does not yield appropriate representations of reactions and therefore does not allow for accurate predictions of responsible species/strains and enzymes (i.e., finding True Positives, and confirming associations from previously collected data). We agree that tools requiring only substrates are convenient, but our results show that they are less helpful in finding appropriate metabolism and enzyme predictions. Many studies of biotransformation in the human gut identify the product information or product structure via HPLC, LC-MS, and NMR techniques. In cases where such data was not gathered, or not gathered with enough structural resolution, researchers can use tools such as Biotransformer to make product template predictions before inputting a query to SIMMER. This recommendation is included in the present manuscript’s lines 376–391:

      In instances when DrugBug and MicrobeFDT did make predictions, they suffered from low accuracy (Table 1), which we hypothesized was due to both methods’ reliance on substrate rather than reaction chemistry. Biotransformations involve the relationship between substrate(s), cofactor(s), and an enzyme to yield a particular product(s). As one substrate can exhibit affinity for multiple enzymes, resulting in multiple unique products, sole employment of substrates in a chemical fingerprint does not achieve the resolution necessary to make relevant predictions. To test if SIMMER’s better performance could be attributed to including cofactors and products, we modified our code to run with a chemical representation that includes only the substrate of each positive control reaction. Enzyme prediction accuracy dropped from 88% down to 33%, and EC prediction accuracy dropped from 93% down to 48% (Table 1—source data), supporting the hypothesis that SIMMER’s better performance when compared to DrugBug and MicrobeFDT is due in large part to our using chemical representations that include the full reaction. These results are in line with our previous demonstration that SIMMER clusters enzymatic reaction chemistry only when a full reaction is employed (Figure 2, Figure 2—figure supplement 4).

      Reliance on homology transfer annotation to predict enzyme function; this approach has important, microbiome-relevant, limitations.

      Please refer to our separate Common_Questions.pdf document, Common question 1: Are EC codes sufficient to select enzyme orthologs within an overall class?

    1. Attach packages

      I know the code chunks are helpful (to me at least), but in a report, I would hide them. Any reader interested in the code can read it in your Rmd file. Only highlight code chunks that are meaningful to the audience.

    1. # AF_PFS: Current asthma among adults aged greater than or equal to 18 years (percentile)
       # DF_PFS: Diagnosed diabetes among adults aged greater than or equal to 18 years (percentile)
       # HDF_PFS: Coronary heart disease among adults aged greater than or equal to 18 years (percentile)
       # LLEF_PFS: Low life expectancy (percentile)
       # P200_I_PFS: Percent of individuals below 200% Federal Poverty Line, imputed and adjusted (percentile)
       # LSTF_COUNT: Land surface temperature data points (counts) within the census tract
       # LSTF_MIN: Land surface temperature minimum within the census tract in degrees fahrenheit (F)
       # LSTF_MAX: Land surface temperature maximum within the census tract in degrees fahrenheit (F)
       # LSTF_MEAN: Land surface temperature mean within the census tract in degrees fahrenheit (F)
       # LSTF_STD: Land surface temperature standard deviation within the census tract in degrees fahrenheit (F)

      This is very helpful, but readers of the map may not pay attention to the code comments. As a matter of fact, you may want to hide the code snippet when presenting your product to a non-technical audience. Consider renaming your columns to something expressive but concise.

    1. HTTP is an extensible protocol and 422 is registered in IANA, which makes it a standard status code. So nothing stops you from using 422 in your application. And since June 2022, 422 is defined in the RFC 9110, which is the document that currently defines the semantics of the HTTP protocol:
    1. 15.5.21. 422 Unprocessable Content The 422 (Unprocessable Content) status code indicates that the server understands the content type of the request content (hence a 415 (Unsupported Media Type) status code is inappropriate), and the syntax of the request content is correct, but it was unable to process the contained instructions. For example, this status code can be sent if an XML request content contains well-formed (i.e., syntactically correct), but semantically erroneous XML instructions.
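
The distinction drawn in the quoted RFC text (415 for an unsupported content type, a syntax error for malformed content, 422 for well-formed but semantically wrong content) can be illustrated with a minimal sketch. The `handle_request` function, the `quantity` field, and the choice of 400 and 201 for the other branches are assumptions made for illustration; they come from neither the RFC nor the quoted answer.

```python
import json


def handle_request(content_type: str, body: str) -> int:
    """Return an HTTP status code for a hypothetical 'create order' request."""
    if content_type != "application/json":
        # The server does not understand the content type at all.
        return 415
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # The content type is understood, but the syntax is broken.
        return 400
    # Syntax is correct; now check whether the instructions make sense.
    quantity = payload.get("quantity") if isinstance(payload, dict) else None
    if not isinstance(quantity, int) or isinstance(quantity, bool) or quantity <= 0:
        # Well-formed JSON, but semantically erroneous: 422.
        return 422
    return 201
```

With this sketch, a body such as `{"quantity": -1}` parses correctly but is rejected with 422, whereas a truncated body such as `{"quantity":` never reaches the semantic check.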
    1. “Traditional visual environments visualize the code. They visualize static structure. But that's not what we need to understand. We need to understand what the code is doing. Visualize data, not code. Dynamic behavior, not static structure.” http://worrydream.com/#!/LearnableProgramming

      The point is that, because of the homomorphism, code can be seen as data and vice versa. The same techniques used to visualize one can be used on the other, as we have in fact experienced several times in the Grafoscopio community through data narratives.

    2. Well, the best approximation we have is spreadsheets, actually. That's why they're super popular. Put the data in front, marginalize the code. Code is either "hidden" in cells where you get a live[1] preview of the data or in modules others have built but you can modify[1] if you want to. Now, how do we take this approach to the next level, that's a problem on the scale of figuring out human genetic engineering or fixing climate change :-) [1] Most of the time. Reply from relaxing: First let’s figure out how to scale spreadsheets to the complexity of running a medium size business.

      Spreadsheets, with this inversion between data and code, seem like a good way to popularize programming, although, as noted here, they tend not to scale and to lead to code that is hard to understand.

      How could the advantages of spreadsheets be brought to interactive notebooks? Perhaps Coda can be a good source of inspiration for future work in Grafoscopio, via specialized snippets, without going through the incidental complexity of the web, and implemented directly in Block as a desktop application that can be exported to the web.

    1. Example showing batch size influencing convergence speed.

      this is a bit confusing with the placement here... I'd move it above the corresponding graph and code block

    1. in general, we can conclude that the use of momentum significantly speeds up convergence.

      If the one with momentum only sometimes outperforms the one without, you can't really make this claim for the general case. It does with your code and some of the cases you tested, but that doesn't necessarily mean it does on a general basis

    1. Author Response

      Reviewer #1 (Public Review):

      In this paper, the authors present a method for discovering response properties of neurons, which often have complex relationships with other experimentally measured variables, like stimuli and animal behaviors. To find these relationships, the authors fit neural data with artificial neural networks, which are chosen to have an architecture that is tractable and interpretable. To interpret the results, they examine the first- and second-order approximations of the fitted artificial neural network models. They apply their method profitably to two datasets.

      The strength of this paper is in the problem it is attempting to solve: it is important for the field to develop more useful ways to analyze and understand the massive neural datasets collected with modern imaging techniques.

      The weaknesses of this paper lie in its claims (1) to be model-free and (2) to distinguish the method from prior methods for systems identification, including spike-triggered averaging and covariance (or rather their continuous response equivalents). On the first claim, the systems identification methods are arguably a substantially more model-free approach. On the second claim, this reviewer would require more evidence that the presented approach is substantially different from, or an improvement on, systems identification methods in common use applied directly to the data.

      We thank the reviewer for carefully engaging with the manuscript and believe that our revisions address these points of critique both through novel analysis and through clarifications.

      First claim: We fully agree that systems identification approaches are in theory truly model-free while MINE imposes constraints through the chosen architecture. However, our new analysis comparing MINE to direct fitting of the kernels of a Volterra expansion highlights that this is not really the case in practice. In order to obtain good fits, the model-free-ness has to be substantially reduced by imposing constraints on the degrees of freedom. We quantify this reduction in Figure S3 and directly compare it to the effective degrees of freedom of the CNN. Reducing degrees of freedom is also a theme that can be found throughout the literature on systems identification, especially when the analysis does not involve Gaussian white noise as input stimuli. We therefore stand by our claim that MINE is “essentially model-free” in the sense that it does not rely on defining a model a priori, much like systems identification. We also clarify our choice of calling the method “model-free” in the introduction, where we state: “While the architecture and hyper-parameters of the CNN used by MINE do impose constraints on which relationships can be modeled, we consider the convolutional network ‘model-free’ because it does not make any explicit assumptions about the underlying probability distributions or functional forms of the data.”

      Second claim: We believe that our new analysis for the comparison with the Volterra expansion approach of systems identification addresses this point. By directly fitting Volterra kernels instead of relying on spike-triggered analysis we put the comparison on a more equal footing than our previous STA/STC exposition. We can show that while the methods are equivalent for Gaussian white noise stimuli, MINE is superior for highly correlated input stimuli. We show that imposing constraints on the regression used to identify the Volterra kernels can overcome this gap to a large extent, but MINE still produces a model that has higher predictive power and MINE also does more than extracting receptive fields. We are also not entirely sure to what extent Wiener/Volterra analysis has been applied to calcium imaging data. While there is a vast body of literature on systems identification, there is little evidence that it has been widely applied to data in which both inputs and outputs are highly correlated across time, such as calcium imaging experiments using naturalistic stimuli. While this doesn’t have to mean anything in and of itself it might point to the fact that this analysis is not easily accessible and requires ample tuning. These are precisely two problems that MINE aims to overcome. We now more explicitly state in the manuscript that we believe this accessibility to be one of the core strengths of MINE.

      Reviewer #2 (Public Review):

      This paper describes a relatively unbiased and sensitive method for identifying the contributions of different behavioral parameters to neural activity. Their approach addresses, in an elegant way, several difficulties that arise in modeling of neuronal responses in population imaging data, namely variations in temporal filtering and latency, the effects of calcium indicator kinetics, interactions between different variables, and non-linear computations. Typical approaches to solving these problems require the introduction of prior knowledge or assumptions that bias the output, or involve a trade-off between model complexity and interpretability. The authors fit individual neurons' responses using neural network models that allow for complex non-linear relationships between behavioral variables and outputs, but combine this with analysis, based on Taylor series approximations of the network function, that gives insight into how different variables are contributing to the model.

      The authors have thoroughly validated their method using simulated data as well as showing its applicability to example state of the art data sets from mouse and zebrafish. They provide evidence that it can outperform current approaches based on linear regression for the identification of neurons carrying behaviorally relevant signals. They also demonstrate use cases showing how their approach can be used to classify neurons based on computational features. They have provided Python code for the implementation and have explained the methods well, so it will be easy for other groups to replicate their work. The method could be applied productively to many types of experiments in behavioral and systems neuroscience across different model systems. Overall, the paper is clearly written and the experiments are well designed and analysed, and represent a useful contribution to the neuroscience field.

      We thank the reviewer for their favorable assessment of our work.

      Reviewer #3 (Public Review):

      In the current study, the authors present a novel and original approach (termed MINE) to analyze neuronal recordings in terms of task features. The method proposed combines the interpretability of regressor-based methods with the flexibility of convolutional neural networks and the aim is to provide an unbiased, "model-free" approach to this very important problem.

      In my opinion, the authors succeed in most of these aspects. They use three datasets: an artificially-generated one that provides a ground-truth, a published dataset from wide-scale cortical mouse recordings and a novel one that studies thermosensation in larval zebrafish. MINE compares favorably in all three cases.

      I believe that the paper would mostly benefit from an increased effort in clear exposition of the Taylor expansion approach, which is at the core of the method. The methods section describes the mathematics, but I wonder whether it would be possible to illustrate or schematize this in a main Figure, e.g. as an addition to Figure 1 or as a new figure. Around line 185, the manuscript reads: "We therefore perform local Taylor expansions of the network at different experimental timepoints. In other words, we differentiate the network's learned transfer function that transforms predictors into neural activity."

      It would help to explicitly state with respect to what the derivative is being computed (i.e. time) and maybe a diagram (which I had to draw to understand the paper) in which a neuronal activity trace is shown and from time t onwards a prediction is computed using terms in the Taylor expansion would be very instructive (showing on an actual trace how disregarding certain terms changes the prediction and hence the conclusions about the actual dependence of the trace on the behavioral features). The formulation in terms of Jacobians and Hessians can then be restricted to the Methods section and the paper will be easier to read for a wider audience.

      We agree with the reviewer that readability is key. We hope that our re-write and re-organization of the manuscript makes it easier to follow. We now start with a unified description of complexity and non-linearity both derived from a Taylor decomposition around the data-average. We use this section (starting Line 91) to lay out the logic of the Taylor expansion and explicitly state that the derivatives describe the expected change in output given any change in predictors. We did not want to remove the math entirely from the paper, simply because we found it hard to explain the concept entirely without it. We have provided an annotation to the formula parts in the new Figure 2 and a small schematic to illustrate the pointwise expansion of the Taylor metric in the new Figure 4.
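
For orientation, the logic of a Taylor expansion around the data average, as described in this response, can be written generically as below. This is the standard second-order expansion and not necessarily the manuscript's own formula; the symbols are assumptions introduced here for illustration: $f$ is the fitted network mapping predictors to predicted activity, $x$ is the predictor input, $\bar{x}$ is the data average, and $J$ and $H$ are the first and second derivatives (gradient and Hessian) of $f$ with respect to the predictors.

```latex
f(x) \approx f(\bar{x})
      + J(\bar{x})\,(x - \bar{x})
      + \tfrac{1}{2}\,(x - \bar{x})^{\top} H(\bar{x})\,(x - \bar{x})
```

Read this way, the first-order term describes the expected change in output for a change in predictors, and truncating the series after the first- or second-order term gives a linear or quadratic approximation of the network.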

      The method is presented as a "model-free" approach (title and introduction). I think it would help to discuss this with some precision. The Taylor expansion approach does imply certain beliefs on the structure of the data (which are well founded in most cases). Do the authors agree that MINE would encapsulate any regression model where both linear and interaction terms are allowed to include an arbitrary non-linearity (in the case of the interaction terms, different non-linearities for both variables)? If this is the case, maybe an explicit statement would allow the reader to quickly identify the versatility of MINE.

      We are now attempting to make the statement of model-free more precise through quantifications in our rewritten section on deriving receptive fields. We now provide an explanation in the introduction for why we believe that “model-free” is justified. We state: “While the architecture and hyper-parameters of the CNN used by MINE do impose constraints on which relationships can be modeled, we consider the convolutional network ‘model-free’ because it does not make any explicit assumptions about the underlying probability distributions or functional forms of the data.”

      In principle, MINE can accommodate higher-order interactions as well (say of the form x*y*z or x*y^2) and it certainly has flexibility in applying nonlinear transformations. However, we did not find a satisfying way to quantify the space of possible models MINE can represent exactly and therefore do not feel comfortable making a precise statement about this.

      I find the section relating to non-linearities interesting, but was slightly disappointed to find that the authors do not propose a single method. In Figure 3E, the authors show that a logistic regression model that combines the curvature and NLC approaches outperforms either, but the model is not described in any sort of detail. I appreciate the attempt made by the authors to apply this to the zebrafish imaging dataset in Figure 7, but it was still unclear to me how non-linearities and complexity are related.

      We fully agree with the reviewer. We have now merged non-linearity and complexity determination. We hope that this a) simplifies the paper and b) creates a metric that likely generalizes better and in which specific values are more interpretable. In brief, we now define both the nonlinearity and complexity based on truncations of the Taylor expansion around the data average. This new result section (Lines 90-142) also gives us a chance to (hopefully) better introduce the Taylor expansion approach.

    1. Plain text diagrams - the best diagrams as code tools. augmentedmind.de, https://www.augmentedmind.de › plai... (20 Dec 2020). Create plain text diagrams using markup, which is converted to images. ... class is represented by a box, and relations are drawn as arrows.

    1. Using pex in combination with S3 for storing the pex files, we built a system where the fast path avoids the overhead of building and launching Docker images. Our system works like this: when you commit code to GitHub, the GitHub action either does a full build or a fast build depending on if your dependencies have changed since the previous deploy. We keep track of the set of dependencies specified in setup.py and requirements.txt. For a full build, we build your project dependencies into a deps.pex file and your code into a source.pex file. Both are uploaded to Dagster Cloud. For a fast build we only build and upload the source.pex file. In Dagster Cloud, we may reuse an existing container or provision a new container as the code server. We download the deps.pex and source.pex files onto this code server and use them to run your code in an isolated environment.

      Fast vs full deployments
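
The fast-versus-full decision described in the highlighted passage can be sketched roughly as follows. This is not Dagster's implementation: hashing the file contents is only one assumed way to "keep track of" the dependencies, and the function names `dependency_fingerprint` and `choose_build` are hypothetical. Only the file names `setup.py`, `requirements.txt`, `deps.pex` and `source.pex` come from the passage.

```python
import hashlib
from pathlib import Path
from typing import List, Optional

# Files that declare dependencies, as named in the passage.
DEPENDENCY_FILES = ("setup.py", "requirements.txt")


def dependency_fingerprint(project_dir: Path) -> str:
    """Hash the contents of the dependency files found in project_dir."""
    digest = hashlib.sha256()
    for name in DEPENDENCY_FILES:
        path = project_dir / name
        digest.update(name.encode())
        digest.update(b"\0")
        if path.exists():
            digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def choose_build(project_dir: Path, previous_fingerprint: Optional[str]) -> List[str]:
    """Return the pex files to build for this deploy."""
    if dependency_fingerprint(project_dir) != previous_fingerprint:
        # Dependencies changed (or no previous deploy): full build.
        return ["deps.pex", "source.pex"]
    # Dependencies unchanged: fast build, source only.
    return ["source.pex"]
```

The first deploy has no stored fingerprint, so it always takes the full path; later deploys take the fast path until one of the dependency files changes.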

    1. Separate code from data Keep data immutable Represent data with generic data structures

      b/c the highlight formatting is terrible, I'll repeat it below

      1. Separate code from data
      2. Keep data immutable
      3. Represent data with generic data structures
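
A minimal sketch of how the three principles above might look together in Python. The record fields and the function names `full_name` and `with_books` are made up for illustration and do not come from the annotated text.

```python
from types import MappingProxyType

# Principle 3: data is a generic mapping, not a custom class.
# Principle 2: MappingProxyType gives a read-only view, so it cannot be mutated in place.
author = MappingProxyType({"first_name": "Ada", "last_name": "Example", "books": 2})


# Principle 1: code lives in plain functions that receive data as arguments.
def full_name(person):
    return f"{person['first_name']} {person['last_name']}"


def with_books(person, books):
    """Return a new read-only mapping; the original is left untouched."""
    return MappingProxyType({**person, "books": books})


updated = with_books(author, 3)
```

Here `updated` carries the new value while `author` still holds the old one, and an attempt such as `author["books"] = 3` raises a `TypeError`.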
    1. Let us assume you want to set up a custom rule wherein you want to route all payments coming from a particular referrer code (custom identifier) to a particular payment gateway (for example, Paytm).

      Route all payments coming from a particular referrer code to a particular payment gateway, such as Paytm, using a custom identifier.
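
The rule described here can be pictured with a small, generic sketch. It does not use any payment provider's real API: `RoutingRule`, `select_gateway`, the `referrer_code` field name and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RoutingRule:
    referrer_code: str  # custom identifier attached to the payment
    gateway: str        # gateway that should process matching payments


def select_gateway(payment: dict, rules: List[RoutingRule], default: str) -> str:
    """Return the gateway of the first rule matching the payment's referrer code."""
    for rule in rules:
        if payment.get("referrer_code") == rule.referrer_code:
            return rule.gateway
    # No rule matched: fall back to the default gateway.
    return default
```

A payment carrying the matching referrer code is sent to the configured gateway, and every other payment falls through to the default.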

  4. www.scielo.org.za
    1. In this way, the social order in past Africa and even up to today, in rural African settings, is preserved without the need for the presence of law enforcement agencies such as the police and the prisons.

      Although I am a very spiritually inclined man, I don't hold African spirituality as my belief. However, in this section you can again see the contrast between Africans and Western people in maintaining law and order. We are not a people who operate by a written code; Africans have a heightened spiritual view, in contrast to a Western view that is scientific in its thinking. The same frame of mind also governs how law and order is maintained: people tend to fear their parents, ancestors and gods. This also shows us that, as Africans, we have a totally different way of advocating and establishing methods for law and order.

    1. Abstract

      Reviewer 1: Milton Pividori

      In this manuscript, the authors analyzed different characteristics that are potentially related to the expression of human genes under IFN-a stimulation. A classification model is built to predict ISG (genes that are upregulated following IFN-a stimulation) from the human fibroblast cell. The model also performs feature selection, and the authors used different test sets (on different types of IFN) to validate their model. The authors provide a web server that implemented this machine learning model. I liked the introduction, the background and motivation were clear. However, the Results section was a bit hard to follow, in particular the implementation of the machine learning models, with different classifiers applied inconsistently across distinct features sets. At the beginning of this section, the authors perform extensive manual feature analyses across different feature types (related to alternative splicing, duplication, and mutation) to build a refined dataset. These analyses basically correlate each individual feature with the expression of genes in the presence of IFN-a. I have several concerns here, related mainly to the correlation between features, that I describe below. General comments: * Regarding reproducibility, the authors provide a Github repository with source code, the model trained and data. From the documentation and notes in the manuscript (lines 1015-1023), looks like this can only be run on mac OS, which makes it very hard for me to test (I'm a Linux user). I recommend the authors to read and follow the article "Reproducibility standards for machine learning in the life sciences" (https://doi.org/10.1038/s41592-021-01256-7). Having, for instance, a Docker image to download and run your analyses would be fantastic. * The authors perform a comprehensive analysis of features that differentiate different gene classes. 
I wonder why didn't they use first a machine learning model to automatically find these important features, and then try to analyze which features were selected (instead of the other way around as done in the study). I think there is perhaps too much manual feature engineering in the previous steps of training an ML model. * Related to the previous point, in my comments below one of my concerns is about feature correlation. The authors compare individual features regarding their ability to separate different gene classes (ISG vs background vs non-ISG). But one can imagine that some features are highly correlated. Some features might not be useful to separate gene classes from a single-feature analysis (as the authors do at the beginning), but they could be useful in combination with other features. Unless I'm missing an important point, I would leave the machine learning model to learn this and then analyze each feature individually after the model identifies them. * Authors are concerned that including too many features in the support vector machine (SVM) model would complicate the prediction task. To remedy this, they manually select the features according to, in my opinion, a more subjective criterion. Why didn't the authors use a feature selection algorithm here? I know that they propose a model including feature selection, but I guess I don't understand well all the previous manual feature analyses. Using a known feature selection method here would provide a more data-driven approach to improve classification, in addition to their manual expert curation (which is also valid). * They run several classification models, but not consistently across the same set of features. For example, only SVM is run across genetic, parametric, all features, etc, but not the other models. Why is that? * The manuscript would really benefit from a figure with the main steps of the analyses performed, models tested, datasets employed, etc. 
It's hard to get the big picture as it is now. Results/Evolutionary characteristics of ISGs: Paragraph between lines 131-148: * I think the window size used (mentioned in the text) should be added to the Figure 2 caption * What's the vertical dashed line? In the text, you say that those at the left of this line are IRGs, but I don't understand the meaning of that vertical line (-0.9 log fold change). This explanation, which I didn't see, should be added to the figure caption also. * From the text, I understand that in the subfigures in Figure 2 you have IRGs, non-ISGs and ISGs. Would it be possible, or meaningful for the reader, to add an extra vertical line to separate them? Results/Differences in the coding region of the canonical transcripts: Paragraph between lines 193-208: * If GC-content is underrepresented in ISGs more than non-ISGs, the ApT and TpA should be expected to be more enriched in ISGs, right? Sounds like a redundant analysis. I would expect these two sequencederived features to be correlated. If this is the case, maybe it would be better to highlight other features instead of a correlated/expected one? * Figure 4: here the authors divided the parametric set of features into four categories and compared their representations among ISGs, non-ISGs and background genes. The figure shows p-values of the tests on the y-axis, and the four categories of features on the x-axis. I think it's important to run a negative control: could you please run these tests again, say, 100 times, with gene IDs/names shuffled, and check whether some of these results also appear in these null simulations? Maybe you can keep the same figure, but remove those also found in the null simulations. Paragraph between lines 209-227: * Is it possible that the comparison of codons frequencies (third category of features) is correlated with previous findings (like GC content or ApT/TpA enrichment)? If so, would it be possible that maybe the analysis is also expected or redundant? 
For example, in ISGs there is an underrepresentation of GC-content, and you also found that in ISGs there is an underrepresentation of "CAG" codons. I might be missing something, but aren't these expected to be correlated? Results / Differences in the protein sequence: Paragraph between lines 302-323: * Figure 6: I would suggest adding the same negative control suggested before. Results / Differences in network profiles * I think it's important to define all eight features used in the network analyses (closeness, betweenness, etc.), otherwise it's hard to follow what comes next. Results / Features highly associated with the level of IFN stimulations * Figures 9 and 10: it would be good to add the sign of the correlation in the figure, in addition to mentioning it in the caption (as it is now). Results / Difference in feature representation of interferon-repressed genes and genes with low levels of expression * Given the unique patterns or differences between non-ISG class and IRG class, wouldn't it be better to perform different analyses excluding IRG genes? The authors also acknowledge these risks in lines 539-541. Results / Implementation with machine learning framework * It was hard for me to understand the workflow in this section: you used different machine learning models applied to distinct feature sets, for example. Why don't you apply the same set of models to the same set of features? I think this section needs an initial paragraph with a global description of what you are trying to do. * For example, I don't think I understand very well the concept of "disruptive feature". What does it mean? * Table 3: I don't understand the threshold selection here. I guess you refer to classification or decision threshold from a model that outputs a probability of a gene to be ISG or non-ISG.
First, I think there should be a line separating each performance measure to clearly show those that are "Threshold-dependent" and "Threshold-independent". * I also understand that, during cross-validation, you selected for each model/feature set combination, the threshold that maximized the MCC (this is explained in Table 3 as a footnote, but it should be more explicitly mentioned in the text). * Table 3: What is the "Optimum" set of features? Why is this "Optimum set" only used with SVM? * How does the "AUC-driven subtractive iteration algorithm (ASI)" compare with other feature selection algorithms? * Table 5: you mention this in the text, but it would be good to have an extra column indicating which datasets were used for training and which are for testing. * Figure 13: it would be good to have the AUROC in the figure, not only the curves. Web-server: * I think, in general, that the web application needs to be more intuitive and have more documentation. For example, the main interface says "Predict your human genes of interest", what does that mean? What does it predict?
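
To make the threshold question about Table 3 concrete, here is a minimal sketch of how a decision threshold that maximizes the Matthews correlation coefficient (MCC) could be chosen from predicted probabilities. It is an illustration only, not the authors' procedure: the function names (`mcc`, `confusion`, `best_threshold`) and the toy scores are assumptions, and in practice the search would be run inside each cross-validation fold.

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def confusion(scores, labels, threshold):
    """Count TP, FP, TN, FN when predicting positive for score >= threshold."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif not predicted_positive and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def best_threshold(scores, labels):
    """Return the observed score that maximizes MCC when used as a threshold.

    Ties are resolved in favour of the lowest candidate threshold.
    """
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: mcc(*confusion(scores, labels, t)))


# Hypothetical toy data: predicted ISG probabilities and true labels (1 = ISG).
toy_scores = [0.2, 0.3, 0.7, 0.9]
toy_labels = [0, 0, 1, 1]
chosen = best_threshold(toy_scores, toy_labels)
```

Accuracy, sensitivity, specificity and MCC all change with the chosen threshold, whereas AUROC does not, which is the distinction the "Threshold-dependent" versus "Threshold-independent" split in the table would make visible.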

      Reviewer2: Muthukumaran Venkatachalapathy

      First of all, this manuscript is well-written after a thorough research investigation. I enjoyed reading about interferons, interferon-stimulated genes (ISGs), mechanisms and signalling pathways. In the introduction, the authors have highlighted the different methods (including other bioinformatics databases) available to identify ISGs and their potential pitfalls. This unmet need is addressed using in silico approaches, which were used to distinguish interferon-stimulated genes from non-stimulated ones in human fibroblast cells. Here, the authors have applied a combination of expression data and sequential/compositional features and designed a machine learning model for the prediction of ISGs from non-ISGs. Apart from features like duplication, alternative splicing, mutation and presence of multiple ORFs, the authors extracted various sequential features and found them to correlate well with ISG prediction. For example, ISGs are prone to GC depletion and a significant difference in the codon usage among ISGs was found. In that context, the authors claim that ISGs are evolutionarily less conserved, and that codon usage features, genetic composition features, proteomic composition features and sequence patterns (especially SLNPs and SLAAPs) are optimal parameters that can cumulatively help in differentiating ISGs from non-ISGs. When it comes to building a machine learning model, the authors faced challenges due to similarities between ISGs and IRGs. They experimented with different algorithms for model building, including decision trees and random forests, and found decent results with a support vector machine. Limitation: Model prediction accuracy was close to 70% for type I and III IFN, and it performed below par when predicting ISGs activated by the type II IFN system. There is scope to improve the model prediction accuracy and extend its usage to type II IFN systems. 
If the authors could briefly add a few points on how to improve the model accuracy and also highlight the application/impact of this work in their discussion, that would help scientists from other backgrounds to engage with this manuscript. Relevance: I believe there are inherent attributes (genetic, compositional, expression) of ISGs which may facilitate or even elevate their expression after IFN stimulation. On the other hand, I think these properties may also be leveraged by viruses to escape or evolve away from the IFN-mediated antiviral response. This study is relevant during the ongoing pandemic: this bioinformatics tool can help design better drug targets and may indirectly aid in developing novel antiviral compounds. I recommend this work for publication without any changes.

    1. For example, when the user is logging in and we get back an OTP_REQUIRED error code, we can prompt the user for their TOTP using a friendly UI. But if we receive the OTP_INVALID error code, we can display an error message instead.

      .
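
The branching described in the quoted passage can be sketched as follows. This is a hedged illustration, not any particular library's API: only the `OTP_REQUIRED` and `OTP_INVALID` codes come from the passage, while `LoginResult`, `next_ui_step`, and the returned step names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoginResult:
    """Hypothetical shape of a login response: success flag plus an error code."""
    ok: bool
    error_code: Optional[str] = None


def next_ui_step(result: LoginResult) -> str:
    """Map a login result to the UI step the client should show next."""
    if result.ok:
        return "signed_in"
    if result.error_code == "OTP_REQUIRED":
        # The password was accepted but a one-time code is still needed.
        return "prompt_for_totp"
    if result.error_code == "OTP_INVALID":
        # A code was submitted but rejected, so show an error message instead.
        return "show_otp_error"
    # Any other (assumed) error code falls back to a generic message.
    return "show_generic_error"
```

Because the server returns a stable code rather than free text, the client can branch on it without parsing human-readable messages.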

    1. SearchLocationFilter Data Field Configuration – EFM-5586 (EFSP)

      Configurable feature. This is not likely to be used in Indiana as long as Qwest is around. Will be useful in any state that has a single CMS (not sure which ones that would be).

      This affects primarily the search widget.

      We need to implement this in the code fetcher. See Bug 8219 - Add new columns and tables for codes to support EFM 2022.1.2.

    2. File Types and Filing Components – EFM-5366 (EFSP)

      Configurable feature. Not sure if this will be used in Indiana. STAGE is currently configured for PDF only.

      filing_component_code.allowed_file_types

      We need to implement this in the code fetcher. See Bug 8219 - Add new columns and tables for codes to support EFM 2022.1.2.

    3. Required Service Contacts – EFM-4974 (EFSP)

      Configurable. Not sure if this will be implemented in Indiana. Seems like Indiana would be keenly interested, but this is not currently configured in STAGE.

      We need to implement this in the code fetcher. See Bug 8219 - Add new columns and tables for codes to support EFM 2022.1.2.

    4. Refund Reason Codes – EFM-4817 (EFSP)

      Core functionality (ECF5).

      We need to implement famde_refund_reason_code as a new table in the code fetcher. See Bug 8219 - Add new columns and tables for codes to support EFM 2022.1.2.

    5. Waiver Types – EFM-4821 (EFSP)

      Configurable. Not sure if this will be implemented in Indiana.

      We need to implement this in the code fetcher. See Bug 8219 - Add new columns and tables for codes to support EFM 2022.1.2.

    1. Reviewer #2 (Public Review):

      In this paper, the authors propose a system for annotating and curating scientific publications in the context of interspecies host-pathogen interactions. This system, called PHI-Canto (the Pathogen-Host Interaction Community Annotation Tool), is an extension of an existing tool (called Canto). In addition, they present the development of new concepts, controlled vocabularies, and an ontology for annotating relevant aspects in this domain, called PHIPO (Pathogen-Host Interaction Phenotype Ontology).

      The approach has been empirically validated by annotating ten publications. The application's source code is available, as well as the associated ontologies and vocabularies and an example of the data resulting from the annotation process.

    2. Author Response:

      Reviewer #1 (Public Review):

      This study presents a resource aiming to unify language and rules used in the literature to describe, curate and assess biology experiments, published or not. Focusing on host-pathogen interactions, the work presents a new ontology and controlled vocabulary, as well as rules to describe 'metagenotypes', a term coined for the joint description of interacting host-pathogen genotypes. 'PHI-Canto' extends a previous resource by also enabling using UniProtKB IDs to curate proteins. Among other important by-products, PHI-Canto could contribute to damping proliferating names and acronyms for genes, processes, and interactions; a chronic annoyance in the biosciences.

      The tool does give the impression that, with sufficient time and usage, it could become a rich and robust resource. Just addressing the Uniprot IDs issue is a nice move.

      We thank the reviewer for their positive comments and acknowledgement of the importance of using unified language in literature curation. We are pleased to see that our effort to improve interoperability and use existing resources has been recognized. We are also pleased that this reviewer recognizes the additional benefits of choosing to use UniProtKB accession numbers. 

      Reviewer #2 (Public Review):

      In this paper, the authors propose a system for annotating and curating scientific publications in the context of interspecies host-pathogen interactions. This system, called PHI-Canto (the Pathogen-Host Interaction Community Annotation Tool), is an extension of an existing tool (called Canto). In addition, they present the development of new concepts, controlled vocabularies, and an ontology for annotating relevant aspects in this domain, called PHIPO (Pathogen-Host Interaction Phenotype Ontology).

      The approach has been empirically validated by annotating ten publications. The application's source code is available, as well as the associated ontologies and vocabularies and an example of the data resulting from the annotation process.

      We thank the reviewer for their positive comments on our framework for curating interspecies interactions literature. We are pleased that the reviewer has recognized that the source code, associated ontologies and curated data are freely available for others to use. We are delighted that the reviewer found the curation of ten trial publications in PHI-Canto informative and benefited from the worked curation examples.

      Reviewer #3 (Public Review):

      In this work, the authors have built a framework for the annotation of interactions between species. The framework includes ontologies, methodologies, and an annotation tool called PHI-Canto. The framework makes use of multiple existing ontologies that are in wide use in the biocuration community. In addition, the authors have built their own project-specific controlled vocabularies and ontologies for the capture of pathogen-host interaction phenotypes (PHIPO), diseases (PHIDO), and environmental conditions (PHI-ECO). Their work builds on and extends methods that have been developed within the Gene Ontology Consortium and model organism databases. The tool PHI-Canto is an extension of the tool Canto developed by PomBase for curation. The authors used this framework to annotate pathogen-host interactions within the Pathogen-Host Interactions Database.

      Strengths: The manuscript is well-written and includes significant detail regarding curation policies/methods and the use of the actual PHI-Canto tool. The appendices are very detailed and provide useful illustrations of the annotation practices and tool interface. The work has built upon and extended well-established standards and methods that have proven their utility over many years of use in the biocuration community. The authors have rigorously tested their framework with the curation of a variety of publications providing a diverse assortment of annotation challenges. The concept of a "metagenotype" is important and providing such a structured system for the capture of this information is useful. All of the materials produced by the work are completely freely available for use by the wider community.

      Weaknesses: There are some areas of the manuscript and appendices which are a bit confusing and could be improved. The authors have developed their own set of disease terms (PHIDO) but do not comment on why existing disease terminologies (such as Mondo or DO) were not used or if the PHIDO terms relate to those other vocabularies. There is no discussion of the possible use of a graph representation for the capture of this complex information (which is being done in many settings including the Gene Ontology with GO Causal Activity Models (GO-CAMs)) or why such a structure was not used. Although the abstract talks about the use of the framework within the PHI database as a test case for broader use regarding interspecies interactions, there is no mention of extending the use of the tool to other species interaction communities beyond pathogen-host interactions.

      We thank the reviewer for their detailed response. We are pleased that the reviewer found the manuscript to be well-written and informative with useful examples. We thank the reviewer for their helpful suggestions to improve the appendices and manuscript text.

      We would like to clarify that PHIDO is not intended to compete with existing disease ontologies: it is instead being used as a placeholder, until the time when its terms can be replaced with terms from existing disease ontologies. PHIDO was an expedient solution, in the sense that it provided the fastest way for us to test the process of curating diseases with PHI-Canto. This is because we only had to convert the existing list of disease names already in PHI-base into a controlled vocabulary, thus removing the need to wait for maintainers of other ontologies to add terms for us (as reported in Urban et al., 2022).

      Additionally, we were required to use terms from PHIDO due to the lack of representation for plant and animal diseases in existing ontologies or vocabularies. Plant disease, in particular, is very underrepresented, with the ontologies we surveyed having either inappropriate semantics (e.g. the Plant Trait Ontology focusing on traits related to disease, rather than the diseases themselves) or still being in development (e.g. the Plant Stress Ontology). The majority of source ontologies used by MONDO are human-centric, and DO is exclusively for human disease, yet human disease represents only part of the focus of PHI-base (~35%). Furthermore, our choice of vocabularies is limited by the fact that Canto currently only supports ontologies in OBO format (for historical reasons).

      We have begun the process of harmonizing disease names in PHI-base with terms from existing disease ontologies – such as MONDO, DO, and the National Cancer Institute Thesaurus – with the ultimate aim of using terms from those ontologies in curation, instead of terms from PHIDO. As general vocabularies for animal and plant disease emerge or are identified, we will extend this procedure to those diseases.

      With regards to a graph representation of the data, we are aware of the examples the reviewer described, and we agree that this type of representation could be preferable. However, our data model is currently constrained by the developers of Canto, who use a relational data model and currently have no plans to implement a graph data model or a graph representation. We acknowledge that query languages like GraphQL can provide a graph-based interface to an existing relational data model, but we believe this would require a significant technological investment. For PHI-base, we plan to enable a graph representation of the data by integrating with existing knowledge graph tools, such as KnetMiner (www.knetminer.com;doi.org/10.1111/pbi.13583), which will provide graph-based queries on PHI-base (albeit only on select species for which knowledge graphs will be provided, i.e. Arabidopsis, rice, wheat, eight plant and human infecting fungal ascomycete pathogens, and two non-pathogenic yeast species). We will also use KnetMiner integration to embed subgraphs of the complete knowledge graph into the gene-centric pages on the PHI-base 5 website.

      We acknowledge the lack of discussion about extending the tool for broader interspecies interactions. These examples may have been omitted from a previous draft due to journal word count limits. Possible future uses of the PHI-Canto schema could include insect–plant interactions (both beneficial and detrimental), endosymbiotic relationships such as mycorrhiza–plant rhizosphere interactions, nodulating bacteria–plant rhizosphere interactions, fungi–fungi interactions, plant–plant interactions or bacteria–insect interactions, and non-pathogenic relationships in natural environments, such as bulk soil, rhizosphere, phyllosphere, air, freshwater, estuarine water or seawater, and tissues or organs (e.g. the gut, lungs, and skin of humans, birds, or other animals). The schema could also be extended to situations where phenotype relations to genes or genotypes have been established for predator–prey relationships, or where there is competition in herbivore–herbivore, predator–predator, or prey–prey relationships in the air, on land or in the water. Customizing Canto to use other ontologies and controlled vocabularies is as simple as editing a configuration file within the source code.

    1. one that continually slips in and out of the white, the Catholic, the Mexican, the indigenous, the instincts.

      To me, that slippery slope is code-switching, and many use it to their advantage OR society decides where one resides.

    1. only the best of intentions

      It seems important to remember this is often not malicious, but rather an honest attempt to help students cope and "code-switch" to survive in an academic environment that is slow to change.

    1. Browser-based interfaces are slow, clumsy, and require you to be online just to use them.

      No they don't.

      This conflates the runs-in-a-browser? property with the depends-on-mobile-code? property.

    1. Author Response:

      What is novel here is that we calculated the time-varying retinal motion patterns generated during the gait cycle using a 3D reconstruction of the terrain. This allows calculation of the actual statistics of retinal motion experienced by walkers over a broad range of normal experience. We certainly do not mean to claim that stabilizing gaze is novel, and agree that the general patterns follow directly from the geometry as worked out very elegantly by Koenderink and others.  We spend time describing the terrain-linked gaze behavior because it is essential for understanding the paper. We do not claim that the basic saccade/stabilize/saccade behavior is novel and now make this clearer.

      The other novel aspect is that the motion patterns vary with gaze location which in turn varies with terrain in a way that depends on behavioral goals. So while some aspects of the general patterns are not unexpected, the quantitative values depend on the statistics of the behavior.  The actual statistics require these in situ measurements, and this has not previously been done, as stated in the abstract.

      The measured statistics provide a well-defined set of hypotheses about the pattern of direction and speed tuning across the visual field in humans. Points of comparison in the existing literature are hard to find because the stimuli have not been closely matched to actual retinal flow patterns, and the statistics will vary with the species in question. However, recent advances allow for neurophysiological measurements and eye tracking during experiments with head-fixed running, head-free, and freely moving animals. These emerging paradigms will allow the study of retinal optic flow processing in contexts that do not require simulated locomotion. While the exact relation between the retinal motion statistics we have measured and the response properties of motion-sensitive cells remains unresolved, the emerging tools in neurophysiology and computation make similar approaches with different species more feasible.

      A more detailed description of the methods including the photogrammetry and the reference frames for the measurements has been added primarily to the Methods section.

      Reviewer #1 (Public Review):

      Much experimental work on understanding how the visual system processes optic flow during navigation has involved the use of artificial visual stimuli that do not recapitulate the complexity of optic flow patterns generated by actual walking through a natural environment. The paper by Muller and colleagues aims to carefully document "retinal" optic flow patterns generated by human participants walking a straight path in real terrains that differ in "smoothness". By doing so, they gain unique insights into an aspect of natural behavior that should move the field forward and allow for the development of new, more principled, computational models that may better explain the visual processing taking place during walking in humans.

      Strengths:

      Appropriate, state-of-the-art technology was used to obtain a simultaneous assessment of eye movements, head movements, and gait, together with an analysis of the scene, so as to estimate retinal motion maps across the central 90 deg of the visual field. This allowed the team to show that walkers stabilize gaze, causing low velocities to be concentrated around the fovea and faster velocities at the visual periphery (albeit more the periphery of the camera used than the actual visual field). The study concluded that the pattern of optic flow observed around the visual field was most likely related to the translation of the eye and body in space, and the rotations and counter-rotations this entailed to maintain stability. The authors were able to specify what aspects of the retinal motion flow pattern were impacted by terrain roughness, and why (concentration of gaze closer to the body, to control foot placement), and to differentiate this from the impact of lateral eye movements. They were also able to identify generalizable aspects of the pattern of retinal flow across terrains by subsampling identical behaviors in different conditions.

      Weaknesses:

      While the study has much to commend, it could benefit from additional methodological information about the computations performed to generate the data shown. In addition, an estimation of inter-individual variability, and the role of sex, age, and optical correction would increase our understanding of factors that could impact these results, thus providing a clearer estimate of how generalizable they are outside the confines of the present experiments.

      Properties of gait depend on the passive dynamics of the body and factors such as leg length and subject specific cost functions which are influenced by image quality and therefore by optical correction. In this experiment all subjects were normal acuity or corrected to normal (with no information regarding their uncorrected vision). This is now noted in the Methods. The goal of the present work was to calculate average statistics over a range of observers and conditions in order to constrain the experience-dependent properties one might see in neurophysiology. We have added between-subjects error bars to Figure 2 and added gaze angle distributions as a function of terrain for individual observers in the Supplementary materials. Figure 4 b and d now show standard errors across subjects. Individual subject plots are shown in the Supplementary materials. For Figure 2, most variability between subjects occurs in the Flat and Bark terrains where one might expect individual choices of energetic costs versus speed and stability etc might come into play. This is supported by our subsequent unpublished work on factors influencing foothold choice. We have also found that leg length determines path choices and thus will influence the retinal motion. Differences between observers are now noted in the text. These individual subject differences should indicate the range of variability that might be expected in the underlying neural properties and perhaps in behavioral sensitivity. Because of the size of our dataset (n=11) it is not feasible to make comparisons of sex or age. There were equal numbers of males and females and age ranged from 24 to 54. Now noted in the Methods section.

      Reviewer #2 (Public Review):

      The goal of this study was to provide in situ measurements of how combined eye and body movements interact with real 3D environments to shape the statistics of retinal motion signals. To achieve this, they had human walkers navigate different natural terrains while they measured information about eyes, body, and the 3D environment. They found average flow fields that resemble the Gibsonian view of optic flow, an asymmetry between upper and lower visual fields, low velocities at the fovea, a compression of directions near the horizontal meridian, and a preponderance of vertical directions modulated by lateral gaze positions.

      Strengths of the work include the methodological rigor with which the measurements were obtained. The 3D capture and motion capture systems, which have been tested and published before, are state-of-the-art. In addition, the authors used computer vision to reconstruct the 3D terrain structure from the recorded video.

      Together this setup makes for an exciting rig that should enable state-of-the-art measurements of eye and body movements during locomotion. The results are presented clearly and convincingly and reveal a number of interesting statistical properties (summarized above) that are a direct result of human walking behavior.

      A weakness of the article concerns tying the behavioral results and statistical descriptions to insights about neural organization. Although the authors relate their findings about the statistics of retinal motion to previous literature, the implications of their findings for neural organization remain somewhat speculative and inconclusive. An efficient coding theory of visual motion would indeed suggest that some of the statistics of retinal motion patterns should be reflected in the tuning of neural populations in the visual cortex, but, as is, the present findings could not be convincingly tied to known findings about the neural code of vision. Thus, the behavioral results remain strong, but the link to neural organization principles appears somewhat weak.

      We agree, but we think that strengthening the neural links requires future studies. As mentioned above, it is very difficult to relate the measured statistics to existing neurophysiological literature and we have tried to make this clearer in the Discussion (p14, 15, 16). This is because the stimuli chosen are typically arbitrary and not chosen to be realistic examples of patterns consistent with natural motion across a ground plane. Other stimuli are simply inconsistent with self-motion together with gaze stabilization (eg not zero velocity at the fovea). It has also been technically difficult to map cell properties across the visual field. We have made the comparisons we thought were useful. The point of the paper is to provide a hypothesis about the pattern of direction and speed tuning across the visual field. So the challenge for neurophysiology is to show how the observed cell properties vary across the visual field. Note also that the motion patterns will be influenced by the body motion of the animal in question, and because of this we are now collaborating with a group who are attempting to record from monkey MT/MST during locomotion while tracking eyes and body. Similarly we are training neural networks to learn the patterns generated by human gait to develop more specific hypotheses about receptive field properties.

      Reviewer #3 (Public Review):

      Gaze-stabilizing motor coordination and the resulting patterns of retinal image flow are computed from empirically recorded eye movement and motion capture data. These patterns are assessed in terms of the information that would be potentially useful for guiding locomotion that the retinal signals actually yield. (As opposed to the "ecological" information in the optic array, defined as independent of a particular sensor and sampling strategy).

      While the question posed is fundamental, and the concept of the methodology shows promise, there are some methodological details to resolve. Also, some terminological ambiguities remain, which are the legacy of the field not having settled on a standardized meaning for several technical terms that would be consistent across laboratory setups and field experiments.

      Technical limits and potential error sources should be discussed more. Additional ideas about how to extend/scale up the approach to tasks with more complex scenes, higher speed or other additional task demands and what that might reveal beyond the present results could be discussed.

      This issue is addressed in more detail in the Discussion, second paragraph, and also the second last paragraph.

    2. Reviewer #2 (Public Review):

      The goal of this study was to provide in situ measurements of how combined eye and body movements interact with real 3D environments to shape the statistics of retinal motion signals. To achieve this, they had human walkers navigate different natural terrains while they measured information about eyes, body, and the 3D environment. They found average flow fields that resemble the Gibsonian view of optic flow, an asymmetry between upper and lower visual fields, low velocities at the fovea, a compression of directions near the horizontal meridian, and a preponderance of vertical directions modulated by lateral gaze positions.

      Strengths of the work include the methodological rigor with which the measurements were obtained. The 3D capture and motion capture systems, which have been tested and published before, are state-of-the-art. In addition, the authors used computer vision to reconstruct the 3D terrain structure from the recorded video. Together this setup makes for an exciting rig that should enable state-of-the-art measurements of eye and body movements during locomotion. The results are presented clearly and convincingly and reveal a number of interesting statistical properties (summarized above) that are a direct result of human walking behavior.

      A weakness of the article concerns tying the behavioral results and statistical descriptions to insights about neural organization. Although the authors relate their findings about the statistics of retinal motion to previous literature, the implications of their findings for neural organization remain somewhat speculative and inconclusive. An efficient coding theory of visual motion would indeed suggest that some of the statistics of retinal motion patterns should be reflected in the tuning of neural populations in the visual cortex, but, as is, the present findings could not be convincingly tied to known findings about the neural code of vision. Thus, the behavioral results remain strong, but the link to neural organization principles appears somewhat weak.

    1. Author Response

      Reviewer #1 (Public Review):

      This work presents a unification model (of sorts) for explaining how the flow of evidence through networks can be controlled during decision-making. The authors combine two general frameworks previously used as neural models of cortical decision-making, dynamic normalization (which implements value encoding via firing activity) and recurrent network models (which capture winner-take-all selection processes), into a unified model called the local disinhibition-based decision model (LDDM). The simple motif of the LDDM allows for the disinhibition of excitatory cells that represent the engagement of individual actions, which happens through a recurrent inhibitory loop (i.e., a leaky competing accumulator). The authors show that the LDDM works well at explaining both decision dynamics and the properties of cortical cells during perceptual decision-making tasks.

      All in all, I thought this was an interesting study with an ambitious goal. But like any good study, there are some open issues worth noting and correcting.

      MAJOR CONCERNS

      1. Big picture

      This was a comprehensive and extremely well-vetted set of theoretical experiments. However, the scope and complexity also made the take-home message hard to discern. The abstract and most of the introduction focus on the framing of LDDM as a hybrid of dynamic normalization models (DNM) and recurrent network models (RNMs). This is sold as a unification of value normalization and selection into a novel unified framework. Then the focus shifts to the role of disinhibition in decision-making. Then in the Discussion, the goal is stated as determining whether the LDDM generates persistent activity and whether this activity differs from that of RNMs. As a reader, it seems like the paper jumps between two high-level goals: 1) the unification of DNM and RNM architectures, and 2) the role of disinhibition. This constant changing makes it hard to focus as the reader goes on. So what is the big picture goal specifically?

      Also, the framing of value normalization and WTA as a novel computational goal is a bit odd as this is a major focus of the field of reinforcement learning (both abstractly at the computational level and more concretely in models of the circuits that regulate it). I know that the authors do not think they are the first to unify value judgements with selection criteria. The writing just comes across that way and should be clarified.

      We thank the Reviewer for their thoughtful consideration of the overall framing of the big picture goals of the paper. Upon reflection, we agree that the paper really centers on the importance of incorporating disinhibition into computational circuit-based models of decision-making. Thus, we have significantly revised the Introduction and Discussion to focus on the theoretical and empirical importance of incorporating disinhibition into computational models of decision-making, and use the integration of value normalization and WTA selection as an example of how disinhibition increases the richness of circuit decision models. Please see the response to recommendations below for more detail on the changes.

      2. Link to other models

      The LDDM is described as a novel unification of value normalization and winner-take-all (WTA) selection, combining value processing and selection. While the authors do an excellent job of referencing a significant chunk of the decision neuroscience literature (160 references!) the motif they end up designing has a highly similar structure to a well-known neural circuit linked to decision-making: the cortico-basal ganglia pathways. Extensive work over the past 20+ years has highlighted how cortical-basal ganglia loops work via disinhibition of cortical decision units in a similar way as the LDDM (see the work by Michael Frank, Wei Wei, Jonathan Rubin, Fred Hamker, Rafal Bogacz, and many others). It was surprising to not see this link brought up in the paper as most of the framing was on the possibility of the LDDM representing cortical motifs, yet as far as I know, there does not exist evidence for such architectures in the cortex, but there is in these cortical-basal ganglia systems.

      We thank the Reviewer for the suggestion to link the LDDM to disinhibition in cortico-basal ganglia (CBG) models; this is indeed an important body of empirical and computational work that we overlooked in the original manuscript. We have now added text to the Discussion to highlight the link between LDDM and these CBG disinhibition models, focusing on how they are conceptually similar and how they differ. Please see our response to recommendations below for a more detailed discussion of the revisions.

      3. Model evaluations

      The authors do a great job of extensively probing the LDDM under different conditions and against some empirical data. However, most of the time there is no "control" model or current state-of-the-art model that the LDDM is being compared against. In a few of the simulation experiments, the LDDM is compared against the DNM and RNM alone, so as to show how the two components of the LDDM motif compare against the holistic model itself. But this component model comparison is inconsistently used across simulation experiments.

      Also, it is worth asking whether the DNM and RNM are appropriate comparison models to vet the LDDM against for two reasons. First, these are the components of the full LDDM. So these tests show us how the two underlying architectural systems that go into LDDM perform independently, but not necessarily how the LDDM compares against other architectures without these features. Second, as pointed out in my previous comment, the LDDM is a more complex model, with more parameters, than either the DNM or RNM. The field of decision neuroscience is awash in competing decision models (including probabilistic attractor models, non-recurrent integrators, etc.). If we really want to understand the utility of the LDDM, it would be good to know how it performs against similarly complex models, as opposed to its two underlying component models.

      We greatly appreciate the Reviewer’s comments on the point of model comparison, which points out that our original manuscript failed to clearly convey a very important difference between the LDDM and the existing RNM(s). In the revision, we now make it clearer that the fundamental difference between the LDDM and the RNMs is the architecture of disinhibition (see the revised Introduction, especially p. 8 lines 164-168). The LDDM is not simply a combination of the DNM model with RNM architecture (a point we may have mistakenly conveyed in the original manuscript): the introduction of disinhibition separates LDDM inhibition into option-selective subpopulations, as opposed to the single pooled inhibition of RNM models. Given this fact, the LDDM predicts unique selective inhibition dynamics shown in recent optogenetic and calcium imaging results, a finding inconsistent with the common-pooled and non-selective inhibition assumed in the existing RNMs and many of their variants. Thus, we believe that a comparison between the LDDM and the RNM, which share a similar level of complexity and number of parameters, is important.

      We also appreciated the Reviewer’s concern about testing the LDDM against alternative models. In order to better connect to the existing literature, we now compare the LDDM to another standard circuit model of decision-making - the leaky competing accumulator (LCA) model. The LCA is a circuit model that captures many of the aspects of perceptual decision-making seen in the mathematical drift diffusion model (DDM), but with a construction that allows for fitting to behavioral data and comparison of underlying unit activities. Please see our response to recommendations below for further detail.

      4. Comparison to physiological data

      I quite enjoyed the comparisons of the excitatory cell activity to empirical data from the Shadlen lab experiments. However, these were largely qualitative in nature. In conjunction with my prior point on the models that the LDDM is being compared against, it would be ideal to have a direct measure of model fits that can be used to compare the performance of different competing "control" models. These measures would have to account for differences in model complexity (e.g., AIC or BIC), but such an analysis would help the reader understand the utility of the LDDM in connecting with empirical data much better.
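
      For reference, the complexity-penalized measures named in this comment have standard closed forms: AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L, where k is the number of free parameters, n the number of observations, and ln L the maximized log-likelihood. The following is a minimal Python sketch of that bookkeeping. The model names, log-likelihoods, parameter counts, and trial count are made-up placeholders for illustration, not values from the manuscript.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln L (lower is better)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical fits: (maximized log-likelihood, number of free parameters).
fits = {"model_A": (-1200.0, 7), "model_B": (-1195.0, 10)}
n_trials = 500  # hypothetical number of observations

for name, (log_lik, k) in fits.items():
    print(name, "AIC:", aic(log_lik, k), "BIC:", round(bic(log_lik, k, n_trials), 1))
```

      In this made-up case the two criteria disagree: AIC favors model_B, whereas BIC, which penalizes the three extra parameters more heavily at n = 500, favors model_A. Both measures require a likelihood for the data being compared, which is why they apply only when a model is actually fit to those data.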

      We agree with the Reviewer that a quantitative comparison of the match between model neural predictions and empirical neurophysiological data is important. First, we wish to clarify that the model neural predictions are simulated from models fit to the behavioral data (choice and RT), not from fits to the neural activity traces – a point we now clarify in the text. While directly fitting dynamic models (LDDM, RNM, or LCA) to the neurophysiological data is appealing, there are currently several obstacles to this approach. The first problem is the complexity of the dynamic neural traces. Despite the long history of the random-dot motion paradigm, detailed features of the dynamics are still not understood. For example, the stereotyped initial dip after stimulus onset may reflect a reset of the network state to improve signal-to-noise ratio (Conen and Padoa-Schioppa, 2015) or simply reflect a surround suppression-like lateral inhibition in visual processing. A second problem is that the primary difference between the models is the activity of inhibitory (and disinhibitory) neurons, which are typically not recorded in neurophysiological experiments; thus, there is a lack of empirical data to which to fit the models. In the revision, we clarified that the model fitting to the Roitman & Shadlen data is for behavioral data only, and model unit activity traces are derived from models fit to behavioral data.

      That being said, we agree that a quantitative comparison of model activity predictions is helpful. Because the models are fit not to the neural data but to the behavioral data, rather than using likelihood-based measures like AIC and BIC we used a simple RMSE measure to compare the match between predicted and neural activity patterns (revised Fig. 6E, Fig 6-S4E, Fig 6-S5E). Please see response to recommendations below for details.
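
      As an illustration of the kind of measure described here, the root-mean-square error between a predicted and a recorded activity trace is sqrt(mean((predicted - observed)^2)). The sketch below is a minimal Python version under stated assumptions: both traces are sampled at the same time points, and the numbers are invented for the example. How the actual traces were aligned, binned, or normalized is not specified in this response, and the function name `rmse` is ours.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between two equally sampled traces."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("traces must be non-empty and of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical firing-rate traces (arbitrary units), same time bins.
observed = [10.0, 14.0, 20.0, 28.0]
model_a = [11.0, 15.0, 19.0, 27.0]  # off by 1 in every bin
model_b = [10.0, 12.0, 16.0, 22.0]  # error grows over time

print("model_a RMSE:", rmse(model_a, observed))
print("model_b RMSE:", rmse(model_b, observed))
```

      Unlike AIC or BIC, RMSE carries no penalty for model complexity, so it describes how closely a predicted trace matches the data rather than ranking models in a likelihood sense.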

      Reviewer #2 (Public Review):

      The aim of this article was to create a biologically plausible model of decision-making that can both represent a choice's value and reproduce winner-take-all ramping behavior that determines the choice, two fundamental components of value-based decision-making. Both of these aspects have been studied and modeled independently but empirical studies have found that single neurons can switch between both of the aspects (i.e., from representing value to winner-take-all ramping behavior) in ways that are not well described by current biologically plausible models of decision making.

      The current article provides a thorough investigation of a new model (the local disinhibition decision model; LDDM) that has the goal of combining value representations and winner-takes-all ramping dynamics related to choice. Their model uses biologically plausible disinhibition to control the levels of inhibition in a local network of simulated neurons. Through a careful series of simulation experiments, they demonstrate that their network can first represent the value of different options, then switch to winner-takes-all ramping dynamics when a choice needs to be made. They further demonstrate that their single model reproduces key components of value-based and winner-takes-all dynamics found in both neural and behavioral data. They additionally conduct simulation studies to demonstrate that recurrent excitatory properties in their network produce value-persistence behavior that could be related to memory. They end by conducting a careful simulation study of the influence of GABA agonists that provide clear and testable predictions of their proposed role of inhibition in the neural processes that underlie decision-making. This last piece is especially important as it provides a clear set of predictions and experiments to help support or falsify their model.

      There are overall many strengths to this paper. As the authors note, current network models do not explain both value-based and ramping-like decision-making properties. Their thorough simulation studies and their validation against empirical neural and behavioral data will be of strong interest to neuroscientists and psychologists interested in value-based decision-making. The simulations related to persistence and the GABA-agonist experiments they propose also provide very clear guidelines for future research that would help advance the field of decision-making research.

      Although the methods and model were generally clear, there was a fair amount of emphasis on the role of recurrence in the LDDM, but very little evidence that recurrence was important or necessary for any of the empirical data examined. The authors do demonstrate the importance of recurrence in some of their simulation studies (particularly in their studies of persistence), but these would need to be compared against empirical data to be validated. Nevertheless, the model and thorough simulation investigations will likely help develop more precise theories of value-based decision-making.

      We appreciate the Reviewer’s thoughtful comments. These comments - especially about anatomic recurrence and its relationship to the parameter 𝛼 - inspired us to think more about what makes the current circuit unique relative to others, especially the implications related to the parameters 𝛼 (i.e., self-excitation) and 𝛽 (i.e., local disinhibition). Recurrence is required to drive winner-take-all competition in the standard RNM of decision-making. However, we show here with both analytical and numerical approaches that recurrence helps WTA competition but is not necessary in our model. Instead, the key feature of the LDDM is to utilize disinhibition in conjunction with lateral inhibition to realize winner-take-all competition. That leads to many predictions that distinguish the current model from the existing models, such as selective inhibition and flexible control of dynamics.

      In response to the Reviewer’s points and after careful consideration of the differential equations, we realized that in our model fitting, the 𝛼 parameter fitting to zero does not necessarily mean recurrence should be zero. The 𝛼 parameter shares a lot of similarity to the baseline gain control (parameter BG in our revision), and thus is unidentifiable in the current dataset. In the interest of parsimony, we did not include the parameter BG in the original manuscript, but now include it because it reveals the difficulty of interpreting fit 𝛼 values as simply the level of recurrence.

      Overall, disinhibition (𝛽) in the LDDM is required for WTA activity while recurrence (𝛼) can contribute but is not necessary; however, 𝛼 is theoretically important for generating persistent activity, with the caveat that in the current framework there is an unclear relationship between fit 𝛼 and recurrence. Regardless, we agree that the contribution of 𝛼 to the LDDM framework is worth further testing and examining with future empirical data.

      Reviewer #3 (Public Review):

      Shen et al. attempt to reconcile two distinct features of neural responses in frontoparietal areas during perceptual and value-guided decision-making into a single biologically realistic circuit model. First, previous work has demonstrated that value coding in the parietal cortex is relative (dependent on the value of all available choice options) and that this feature can be explained by divisive normalization, implemented using adaptive gain control in a recurrently connected circuit model (Louie et al, 2011). Second, a wealth of previous studies on perceptual decision-making (Gold & Shadlen 2007) have provided strong evidence that competitive winner-take-all dynamics implemented through recurrent dynamics characterized by mutual inhibition (Wang 2008) can account for categorical choice coding. The authors propose a circuit model whose key feature is the flexible gating of 'disinhibition', which captures both types of computation - divisive normalization and winner-take-all competition. The model is qualitatively able to explain the 'early' transients in parietal neural responses, which show signatures of divisive normalization indicating a relative value code, persistent activity during delay periods, and 'late' accumulation-to-bound type categorical responses prior to the report of choice/action onset.

      The attempt to integrate these two sets of findings by a unified circuit model is certainly interesting and would be useful to those who seek a tighter link between biologically realistic recurrent neural network models and neural recordings. I also appreciate the effort undertaken by the authors in using analytical tools to gain an understanding of the underlying dynamical mechanism of the proposed model. However, I have two major concerns. First, the manuscript in its current form lacks sufficient clarity, specifically in how some of the key parameters of the model are supposed to be interpreted (see point 1 below). Second, the authors overlook important previous work that is closely related to the ideas that are being presented in this paper (see point 2 below).

      1) The behavior of the proposed model is critically dependent on a single parameter 'beta' whose value, the authors claim, controls the switch from value-coding to choice-coding. However, the precise definition/interpretation of 'beta' seems inconsistent in different parts of the text. I elaborate on this issue in sub-points (1a-b) below:

      1a). For instance, in the equations of the main text (Equations 1-3), 'beta' is used to denote the coupling from the excitatory units (R) to the disinhibitory units (D) in Equations 1-3. However, in the main figures (Fig 2) and in the methods (Equation 5-8), 'beta' is instead used to refer to the coupling between the disinhibitory (D) and the inhibitory gain control units (G). Based on my reading of the text (and the predominant definition used by the authors themselves in the main figures and the methods), it seems that 'beta' should be the coupling between the D and G units.

      1b). A more general and critical issue is the failure to clearly specify whether this coupling of D-G units (parameterized by 'beta') should be interpreted as a 'functional' one, or an 'anatomical' one. A straightforward interpretation of the model equations (Equations 5-8) suggests that 'beta' is the synaptic weight (anatomical coupling) between the D and G units/populations. However, significant portions of the text seem to indicate otherwise (i.e., a 'functional' coupling). I elaborate on this in subpoints (i-iii) below:

      (1b-i). One of the main claims of the paper is that the value of 'beta' is under 'external' top-down control (Figure 2 caption, lines 124-126). When 'beta' equals zero, the model is consistent with the previous DNM model (dynamic normalization, Louie et al 2011), but for moderate/large non-zero values of 'beta', the network exhibits WTA dynamics. If 'beta' is indeed the anatomical coupling between D and G (as suggested by the equations of the model), then, are we to interpret that the synaptic weight between D-G is changed by the top-down control signal within a trial? My understanding of the text suggests that this is not in fact the case. Instead, the authors seem to want to convey that top-down input "functionally" gates the activity of D units. When the top-down control signal is "off", the disinhibitory units (D) are "effectively absent" (i.e., their activity is clamped at zero as in the schematic in Fig 2B), and therefore do not drive the G units. This would in turn be equivalent to there being no "anatomical coupling" between D and G. However, when the top-down signal is "on", D units have non-zero activity (schematic in Fig 2B), and therefore drive the G units, ultimately resulting in WTA-like dynamics.

      (1b-ii). Therefore, it seems like when the authors say that beta equals zero during the value coding phase they are almost certainly referring to a functional coupling from D to G, or else it would be inconsistent with their other claim that the proposed model flexibly reconfigures dynamics only through a single top-down input but without a change to the circuit architecture (reiterated in lines 398-399, 442-444, 544-546, 557-558, 579-590). However, such a 'functional' definition of 'beta' would seem inconsistent with how it should actually be interpreted based on the model equations, and also somewhat misleading considering the claim that the proposed network is a biologically realistic circuit model.

      (1b-iii). The only way to reconcile the results with an 'anatomical' interpretation of 'beta' is if there is a way to clamp the values of the 'D' units to zero when the top-down control signal is 'off'. Considering that the D units also integrate feed-forward inputs from the excitatory R units (Fig 2, Equations 1-3 or 5-8), this can be achieved either via a non-linearity, or if the top-down control input multiplicatively gates the synapse (consistent with the argument made in lines 115-116 and 585-586 that this top-down control signal is 'neuromodulatory' in nature). Neither of these two scenarios seems to be consistent with the basic definition of the model (Equations 1-3), which therefore confirms my suspicion that the interpretation of 'beta' being used in the text is more consistent with a 'functional' coupling from D to G.
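
      The second option named in (1b-iii), multiplicative gating of the R-to-D input, can be made concrete with a toy leaky unit. The sketch below is an illustration under stated assumptions and is not the manuscript's Equations 1-3, which are not reproduced in this review. It assumes a single D unit obeying tau * dD/dt = -D + gate * beta * R, a constant input R, and forward-Euler integration; the names `gate`, `tau`, `dt`, and all numerical values are hypothetical.

```python
def simulate_d_unit(r_input, beta, gate, tau=0.1, dt=0.001, steps=2000):
    """Forward-Euler integration of tau * dD/dt = -D + gate * beta * R.

    `gate` stands in for a top-down signal that multiplicatively scales
    the R-to-D input (0 = off, 1 = on). Returns D after `steps` updates.
    """
    d = 0.0
    for _ in range(steps):
        d += (dt / tau) * (-d + gate * beta * r_input)
    return d


# Hypothetical values: same weight beta, gate switched off versus on.
d_off = simulate_d_unit(r_input=10.0, beta=0.5, gate=0.0)
d_on = simulate_d_unit(r_input=10.0, beta=0.5, gate=1.0)
print("gate off:", d_off)
print("gate on: ", round(d_on, 3))
```

      In this toy case D stays at zero while the gate is off and settles near gate * beta * R once it is on, with the weight `beta` itself unchanged. That is the sense in which a gated input would be functionally equivalent to setting the coupling to zero.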

      We thank the Reviewer for pointing out this confusion. We apologize that the original illustrations (Fig. 2A) and the differential equations in Methods (Eqs. 5-8) did not convey our ideas well. 𝛽 is intended to reference the coupling from R to D, not a change in the weights between D and G units. We realize there was some confusion on this part due to inconsistency between our original figures, text, and supplementary material.

      Given the lack of clarity in the previous version as well as the Reviewer’s questions, we now emphasize that 𝛽 represents a functional coupling between the R and D neurons. The biological assumption of the disinhibitory architecture is built based on recent findings that VIP neurons in the cortex always inhibit other neighboring inhibitory cells, such as SST and PV neurons, and consequently disinhibit the neighboring primary neurons (e.g., Fu et al., 2014; Karnani et al., 2014, 2016). We did not see evidence in the literature of fast-changing (anatomic) connections between VIP and SST/PV. However, there is evidence that the responsiveness of VIP neurons to excitatory neurons can be modulated by changing the concentrations of neuromodulators, such as acetylcholine and serotonin (Prönneke et al., 2020). While the stereotype of neuromodulator action is slow dynamics, recent findings show that, for example, basal forebrain cholinergic neurons respond to reward and punishment with surprising speed and precision (18 ± 3 ms) (Hangya et al., 2015) to modulate arousal, attention, and learning in the neocortex. Given the large number of studies that identify long-term projections and neuromodulatory inputs to VIP neurons (e.g., Pfeffer et al., 2013; Pi et al., 2013; Alitto & Dan, 2013; Tremblay et al., 2016), we believe it is more plausible to assume that the connection weights between R and D in our case are quickly modulated within a trial.

      To clarify this issue in the revised manuscript, we made the following corrections:

      1. We repositioned the 𝛽 parameter in Fig. 2A onto the connection from R to D, to align with the description of 𝛽 modulating R to D in the main text.

      2. We modified the differential equations 5-8 (now numbered as Eqs. 28-32) in Methods (p. 61) to include the disinhibitory unit D as an independent control from the inhibitory unit I, in order to be consistent with the disinhibitory D units in LDDM. This change makes only minor differences in the model predictions (please see dynamics simulated after the change in Fig. 2-figure supplement 1B).

      3. We updated the neural circuit motif in Fig. 2-figure supplement 1A accordingly.

      2) The main contribution of the manuscript is to integrate the characteristics of the dynamic normalization model (Louie et al, 2011) and the winner-take-all behavior of recurrent circuit models that employ mutual inhibition (Wang, 2008), into a circuit motif that can flexibly switch between these two computations. The main ingredient for achieving this seems to be the dynamical 'gating' of the disinhibition, which produces a switch in the dynamics, from point-attractor-like 'stable' dynamics during value coding to saddle-point-like 'unstable' dynamics during categorical choice coding. While the specific use of disinhibition to switch between these two computations is new, the authors fail to cite previous work that has explored similar ideas that are closely related to the results being presented in their study. It would be very useful if the authors can elaborate on the relationship between their work and some of these previous studies. I elaborate on this point in (a-b) below:

      2a) While the authors may be correct in claiming that RNM models based on mutual inhibition are incapable of relative value coding, it has already been shown previously that RNM models characterized by mutual inhibition can be flexibly reconfigured to produce dynamical regimes other than those that just support WTA competition (Machens, Romo & Brody, 2005). Similar to the behavior of the proposed model (Fig 9), the model by Machens and colleagues can flexibly switch between point-attractor dynamics (during stimulus encoding), line-attractor dynamics (during working memory), and saddle-point dynamics (during categorical choice) depending on the task epoch. It achieves this via a flexible reconfiguration of the external inputs to the RNM. Therefore, the authors should acknowledge that the mechanism they propose may just be one of many potential ways in which a single circuit motif is reconfigured to produce different task dynamics. This also brings into question their claim that the type of persistent activity produced by the model is "novel", which I don't believe it is (see Machens et al 2005 for the same line-attractor-based mechanism for working memory)

      We thank the Reviewer for pointing out the conceptual similarities between the LDDM and the Machens Romo Brody model, and now include a discussion of the link between the two early in the revised Discussion (p. 38, lines 826-837). Please see response to recommendations below for a more detailed discussion of this point.

      2b) The authors also fail to cite or describe their work in relation to previous work that has used disinhibition-based circuit motifs to achieve all 3 proposed functions of their model - (i) divisive normalization (Litwin-Kumar et al, 2016), (ii) flexible gating/decision making (Yang et al, 2016), and (iii) working memory maintenance (Kim & Sejnowski, 2021).

      The Reviewer notes several relevant papers, and we have now discussed them and their relationship to the LDDM in a revised Discussion section (pp. 35-36). Please see response to recommendations below for more details.

    2. Reviewer #3 (Public Review):

      Shen et al. attempt to reconcile two distinct features of neural responses in frontoparietal areas during perceptual and value-guided decision-making into a single biologically realistic circuit model. First, previous work has demonstrated that value coding in the parietal cortex is relative (dependent on the value of all available choice options) and that this feature can be explained by divisive normalization, implemented using adaptive gain control in a recurrently connected circuit model (Louie et al, 2011). Second, a wealth of previous studies on perceptual decision-making (Gold & Shadlen 2007) have provided strong evidence that competitive winner-take-all dynamics implemented through recurrent dynamics characterized by mutual inhibition (Wang 2008) can account for categorical choice coding. The authors propose a circuit model whose key feature is the flexible gating of 'disinhibition', which captures both types of computation - divisive normalization and winner-take-all competition. The model is qualitatively able to explain the 'early' transients in parietal neural responses, which show signatures of divisive normalization indicating a relative value code, persistent activity during delay periods, and 'late' accumulation-to-bound type categorical responses prior to the report of choice/action onset.

      The attempt to integrate these two sets of findings by a unified circuit model is certainly interesting and would be useful to those who seek a tighter link between biologically realistic recurrent neural network models and neural recordings. I also appreciate the effort undertaken by the authors in using analytical tools to gain an understanding of the underlying dynamical mechanism of the proposed model. However, I have two major concerns. First, the manuscript in its current form lacks sufficient clarity, specifically in how some of the key parameters of the model are supposed to be interpreted (see point 1 below). Second, the authors overlook important previous work that is closely related to the ideas that are being presented in this paper (see point 2 below).

      1) The behavior of the proposed model is critically dependent on a single parameter 'beta' whose value, the authors claim, controls the switch from value-coding to choice-coding. However, the precise definition/interpretation of 'beta' seems inconsistent in different parts of the text. I elaborate on this issue in sub-points (1a-b) below:

      1a). For instance, in the equations of the main text (Equations 1-3), 'beta' is used to denote the coupling from the excitatory units (R) to the disinhibitory units (D) in Equations 1-3. However, in the main figures (Fig 2) and in the methods (Equation 5-8), 'beta' is instead used to refer to the coupling between the disinhibitory (D) and the inhibitory gain control units (G). Based on my reading of the text (and the predominant definition used by the authors themselves in the main figures and the methods), it seems that 'beta' should be the coupling between the D and G units.

      1b). A more general and critical issue is the failure to clearly specify whether this coupling of D-G units (parameterized by 'beta') should be interpreted as a 'functional' one, or an 'anatomical' one. A straightforward interpretation of the model equations (Equations 5-8) suggests that 'beta' is the synaptic weight (anatomical coupling) between the D and G units/populations. However, significant portions of the text seem to indicate otherwise (i.e., a 'functional' coupling). I elaborate on this in subpoints (i-iii) below:

      (1b-i). One of the main claims of the paper is that the value of 'beta' is under 'external' top-down control (Figure 2 caption, lines 124-126). When 'beta' equals zero, the model is consistent with the previous DNM model (dynamic normalization, Louie et al 2011), but for moderate/large non-zero values of 'beta', the network exhibits WTA dynamics. If 'beta' is indeed the anatomical coupling between D and G (as suggested by the equations of the model), then, are we to interpret that the synaptic weight between D-G is changed by the top-down control signal within a trial? My understanding of the text suggests that this is not in fact the case. Instead, the authors seem to want to convey that top-down input "functionally" gates the activity of D units. When the top-down control signal is "off", the disinhibitory units (D) are "effectively absent" (i.e., their activity is clamped at zero as in the schematic in Fig 2B), and therefore do not drive the G units. This would in turn be equivalent to there being no "anatomical coupling" between D and G. However, when the top-down signal is "on", D units have non-zero activity (schematic in Fig 2B), and therefore drive the G units, ultimately resulting in WTA-like dynamics.

      (1b-ii). Therefore, it seems like when the authors say that beta equals zero during the value coding phase they are almost certainly referring to a functional coupling from D to G, or else it would be inconsistent with their other claim that the proposed model flexibly reconfigures dynamics only through a single top-down input but without a change to the circuit architecture (reiterated in lines 398-399, 442-444, 544-546, 557-558, 579-590). However, such a 'functional' definition of 'beta' would seem inconsistent with how it should actually be interpreted based on the model equations, and also somewhat misleading considering the claim that the proposed network is a biologically realistic circuit model.

      (1b-iii). The only way to reconcile the results with an 'anatomical' interpretation of 'beta' is if there is a way to clamp the values of the 'D' units to zero when the top-down control signal is 'off'. Considering that the D units also integrate feed-forward inputs from the excitatory R units (Fig 2, Equations 1-3 or 5-8), this can be achieved either via a non-linearity, or if the top-down control input multiplicatively gates the synapse (consistent with the argument made in lines 115-116 and 585-586 that this top-down control signal is 'neuromodulatory' in nature). Neither of these two scenarios seems to be consistent with the basic definition of the model (Equations 1-3), which therefore confirms my suspicion that the interpretation of 'beta' being used in the text is more consistent with a 'functional' coupling from D to G.

      2) The main contribution of the manuscript is to integrate the characteristics of the dynamic normalization model (Louie et al, 2011) and the winner-take-all behavior of recurrent circuit models that employ mutual inhibition (Wang, 2008), into a circuit motif that can flexibly switch between these two computations. The main ingredient for achieving this seems to be the dynamical 'gating' of the disinhibition, which produces a switch in the dynamics, from point-attractor-like 'stable' dynamics during value coding to saddle-point-like 'unstable' dynamics during categorical choice coding. While the specific use of disinhibition to switch between these two computations is new, the authors fail to cite previous work that has explored similar ideas that are closely related to the results being presented in their study. It would be very useful if the authors can elaborate on the relationship between their work and some of these previous studies. I elaborate on this point in (a-b) below:

      2a) While the authors may be correct in claiming that RNM models based on mutual inhibition are incapable of relative value coding, it has already been shown previously that RNM models characterized by mutual inhibition can be flexibly reconfigured to produce dynamical regimes other than those that just support WTA competition (Machens, Romo & Brody, 2005). Similar to the behavior of the proposed model (Fig 9), the model by Machens and colleagues can flexibly switch between point-attractor dynamics (during stimulus encoding), line-attractor dynamics (during working memory), and saddle-point dynamics (during categorical choice) depending on the task epoch. It achieves this via a flexible reconfiguration of the external inputs to the RNM. Therefore, the authors should acknowledge that the mechanism they propose may just be one of many potential ways in which a single circuit motif is reconfigured to produce different task dynamics. This also brings into question their claim that the type of persistent activity produced by the model is "novel", which I don't believe it is (see Machens et al 2005 for the same line-attractor-based mechanism for working memory)

      2b) The authors also fail to cite or describe their work in relation to previous work that has used disinhibition-based circuit motifs to achieve all 3 proposed functions of their model - (i) divisive normalization (Litwin-Kumar et al, 2016), (ii) flexible gating/decision making (Yang et al, 2016), and (iii) working memory maintenance (Kim & Sejnowski, 2021).

    1. micahchoo committed Mar 21, 2023

      Welcome to the <span class="npf_color_monica">Fever Dream of the Folder Sorter</span>

      <small>click here and highlights will appear on the page outside the sidebar</small>

      The Context -

      I found a script that did something close to what I wanted to do. So I decided to ask ChatGPT to modify the code to add the features that I wanted. This was an arduous 24-hour process.

      However, in the course of that, a new idea started in my head. I want to use Github's Commits interface as the stage to perform the story of those 24 hours.

      Instructions

      1. You can read the main chapter by expanding the <figure data-orig-height="10" data-orig-width="20"></figure>button.

      2. The code is used both as a setting for the story and as illustration. You can see the code and comments by clicking on the button that looks like this <figure data-orig-height="10" data-orig-width="20"></figure>

      Additional Notes

      I added

      * Hypothesis' Via as the narrator
      * Github's different features for branching and commenting as the narrative spine

      <small>The code conflicts in each new revision, the way the code is presented and the comments are all elements of the story.</small> <small>The split view might be better to view the code with to compare from one version to another</small>

      Read what this code was supposed to do

    1. Update Sort Files in Folders.py

      Click the three dots for the story of this commit. If you click on the <> button at the end of the line, you can see how the code looked for this story.

    1. As others pointed out, OATH's claims of "open source" have little meaning when compared to other authentication protocols such as SAML. When you include the entire Liberty Alliance specifications as well as the Web Services Initiative protocols and methods (as devised by Microsoft and IBM) there's nary a proprietary bit of code involved. Actually, there's no code involved at all. Protocols are, by their very nature, open. If you can't read the protocol specification then you can't very well implement it, can you?
    1. Put that TS code in a file your app imports, for example, in remix.env.d.ts, and now the type of name will be the expected one.

          declare module "@remix-run/server-runtime" {
            export interface AppLoadContext {
              name: string;
            }
          }

    1. Thanks to a collaboration with Bloomberg, Babel now supports transforming the "Records and Tuples" stage 2 proposal. The Babel plugin transforms records and tuples syntax using the global Record and Tuple functions:
      Records and Tuples support (#12145)
      <table><tbody><tr><th>Input</th><th>Output</th></tr><tr><td><pre><code class="language-js">let data = #{
  name: "Babel",
  ids: #[1, 2, 3]
};</code></pre></td><td><pre><code class="language-js">let data = Record({
  name: "Babel",
  ids: Tuple(1, 2, 3),
});</code></pre></td></tr></tbody></table>
    1. One of the most interesting features of ARC Browser is the "boosts", which are little snippets of code to overwrite website functions or design if you know a little CSS and JavaScript.

      How is this different from Greasemonkey or Stylus extension?

    1. <table><tbody>
      <tr><th>Status</th><th>Description</th></tr>
      <tr><td>HIT</td><td>The resource was found in Cloudflare’s cache.</td></tr>
      <tr><td>MISS</td><td>The resource was not found in Cloudflare’s cache and was served from the origin web server.</td></tr>
      <tr><td>NONE/UNKNOWN</td><td>Cloudflare generated a response that denotes the asset is not eligible for caching. This may have happened because:
      <ul>
      <li>A Worker generated a response without sending any subrequests. In this case, the response did not come from cache, so the cache status will be none/unknown.</li>
      <li>A Worker request made a subrequest (fetch). In this case, the subrequest will be logged with a cache status, while the main request will be logged with none/unknown status (the main request did not hit cache, since Workers sits in front of cache).</li>
      <li>A Firewall rule was triggered to block a request. The response will come from the edge network before it hits cache. Since there is no cache status, Cloudflare will log as none/unknown.</li>
      <li>A redirect page rule caused the edge network to respond with a redirect to another asset/URL. This redirect response happens before the request reaches cache, so the cache status is none/unknown.</li>
      </ul></td></tr>
      <tr><td>EXPIRED</td><td>The resource was found in Cloudflare’s cache but was expired and served from the origin web server.</td></tr>
      <tr><td>STALE</td><td>The resource was served from Cloudflare’s cache but was expired. Cloudflare could not contact the origin to retrieve an updated resource.</td></tr>
      <tr><td>BYPASS</td><td>The origin server instructed Cloudflare to bypass cache via a Cache-Control header set to no-cache, private, or max-age=0 even though Cloudflare originally preferred to cache the asset. BYPASS is returned when enabling Origin Cache-Control. Cloudflare also sets BYPASS when your origin web server sends cookies in the response header.</td></tr>
      <tr><td>REVALIDATED</td><td>The resource is served from Cloudflare’s cache but is stale. The resource was revalidated by either an If-Modified-Since header or an If-None-Match header.</td></tr>
      <tr><td>UPDATING</td><td>The resource was served from Cloudflare’s cache and was expired, but the origin web server is updating the resource. UPDATING is typically only seen for very popular cached resources.</td></tr>
      <tr><td>DYNAMIC</td><td>Cloudflare does not consider the asset eligible to cache and your Cloudflare settings do not explicitly instruct Cloudflare to cache the asset. Instead, the asset was requested from the origin web server. Use Page Rules to implement custom caching options.</td></tr>
      </tbody></table>
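      The statuses in the table can be grouped by where the response body came from. Below is a minimal Python sketch of that grouping. It assumes the status arrives in the `CF-Cache-Status` response header (the header name is not part of the quoted table), and the function and set names are hypothetical. Treating BYPASS as "origin" is my reading of the table, not something it states outright.

```python
# Statuses from the table, grouped by where the body was served from.
SERVED_FROM_CACHE = {"HIT", "STALE", "REVALIDATED", "UPDATING"}
SERVED_FROM_ORIGIN = {"MISS", "EXPIRED", "BYPASS", "DYNAMIC"}


def response_source(headers: dict) -> str:
    """Return "cache", "origin", or "edge-or-unknown" for a response.

    `headers` is a plain dict of response headers. The lookup is
    case-insensitive because HTTP header names are case-insensitive.
    """
    status = ""
    for name, value in headers.items():
        if name.lower() == "cf-cache-status":  # assumed header name
            status = value.strip().upper()
    if status in SERVED_FROM_CACHE:
        return "cache"
    if status in SERVED_FROM_ORIGIN:
        return "origin"
    # NONE/UNKNOWN, a missing header, or a status not listed in the table.
    return "edge-or-unknown"
```

      A call such as `response_source({"CF-Cache-Status": "HIT"})` falls in the "cache" group, while a response with no such header falls through to "edge-or-unknown".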
    1. likely the new people learning to code and yelling about the new shiny libraries they found

      Bart asked me about what it is that I think causes NPM to be so bad, generally (or something like that), and I responded with the one-word answer "insecurity".

      I think "striving for acceptance" is a better, more diplomatic way to put it.

    1. Another awesome article. I didn't really understand abilities before this aside from knowing they were some sort of algebraic effect implementation (which I also didn't understand).

      The exceptions example and the note about dynamic scoping helped me tie it together mentally. It's like exceptions except it's two way, not one way. It bubbles up to some point, is handled, and then the calling code resumes.

    2. When you call handle to associate an ability with a function, you can pass it some initial state. The handler then passes this state into the implementation functions along with any parameters passed by the client code. When the implementation function returns, it does so by calling handle again, which means it can pass in updated state.

      So.. it gets some setup and teardown data?
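      One way I can picture the state threading is with Python generators. This is only an analogy, not how Unison implements abilities: the client "requests" an operation by yielding, the handler answers and resumes it, and the handler carries its own state from one request to the next. All names here (`counter_client`, `handle`, `next_id`) are made up for the illustration.

```python
def counter_client():
    """Client code: asks the handler for two fresh ids and returns them."""
    first = yield ("next_id",)   # suspend, wait for the handler's reply
    second = yield ("next_id",)  # resumed here with the first reply
    return [first, second]


def handle(client, state):
    """Run `client`, answering each request and threading `state` through.

    Returns the client's result together with the final state.
    """
    try:
        request = next(client)  # run the client up to its first request
        while True:
            if request[0] == "next_id":
                reply, state = state, state + 1  # answer, then update state
            else:
                raise ValueError(f"unknown request: {request!r}")
            request = client.send(reply)  # resume the client with the reply
    except StopIteration as done:
        return done.value, state


result, final_state = handle(counter_client(), 10)
```

      So the initial state is less "setup and teardown data" and more a value the handler owns and updates between requests. The client never sees it directly.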

    3. Dynamic Scoping Modern programming languages implement global and/or module scope and/or lexical scope. A name defined globally is available everywhere in the code. A name given module scope is only directly accessible within that module (and may be available outside if qualified with the module name). A name defined with lexical scope is available inside the current lexical block and (typically) the blocks it encloses. All three of these are statically defined: the meaning of a variable name can be determined at compilation time. In the past, languages such as Perl also offered dynamic scope. This looks a little like lexical scope, except the names defined in a block are available not just in that block but also in all the functions invoked by that block, and functions invoked below them, and so on. The scope is only determined at runtime: the name exists for the duration of the block that defines it, and it exists in all functions executed during that time. As you can imagine, this was both powerful and widely abused: it’s hard to know just what a name means when its definition depends on the execution flow. This is one reason we don’t often see dynamic scoping in current languages. Unison’s abilities are a form of dynamic scoping. However, they overcome many of the issues with previous kinds of dynamic scoping because they are fully type safe. You cannot accidentally use a name injected from a higher context, and you always know where every name comes from.

      This is really fascinating! In some ways this makes me think of React's context which enables passing data deeply down a component tree.
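      To make the dynamic scoping idea concrete for myself, here is a small Python sketch that emulates it with the standard library's `contextvars`. This is an emulation under my own assumptions, not Unison's mechanism. The names `request_id`, `binding`, `inner`, `outer`, and `demo` are hypothetical.

```python
from contextlib import contextmanager
from contextvars import ContextVar

# A name whose value depends on the call stack at runtime, not on where
# the reading function is written.
request_id: ContextVar[str] = ContextVar("request_id", default="global")


@contextmanager
def binding(var, value):
    """Bind `var` to `value` for the duration of the with-block only."""
    token = var.set(value)
    try:
        yield
    finally:
        var.reset(token)  # restore the previous binding on exit


def inner():
    # Nothing is passed in: the value comes from whichever block is
    # currently executing further up the call chain.
    return request_id.get()


def outer():
    return inner()


def demo():
    seen = [outer()]              # no binding active
    with binding(request_id, "request-42"):
        seen.append(outer())      # binding visible two calls down
    seen.append(outer())          # binding gone once the block exits
    return seen
```

      Calling `demo()` shows `inner` seeing a different value depending on which block invoked it. That is the runtime-determined behaviour the passage describes, and it resembles how React context reaches deep into a component tree.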

    1. Imagine coming back to that code two years later and expecting it to just run. Why wouldn’t it? Nothing has changed.

      It's notable that while this is true for internal system concerns, it may not necessarily be true for external system concerns. It does avoid the whole left-pad situation, though.

    2. If you’ve come across Smalltalk, this is quite similar to its idea of an image

      I'm not familiar with the concept of an image in Smalltalk.

      Here's what Wikipedia has to say:

      Many Smalltalk systems, however, do not differentiate between program data (objects) and code (classes). In fact, classes are objects. Thus, most Smalltalk systems store the entire program state (including both Class and non-Class objects) in an image file. The image can then be loaded by the Smalltalk virtual machine to restore a Smalltalk-like system to a prior state.

      https://hyp.is/JSSGksUWEe2szz-6MaOv-w/en.wikipedia.org/wiki/Smalltalk

    1. The EU Renewable Energy Directive and Irish legislation permit the use of GOs. Though using GOs is legal and overseen by the energy regulator in Ireland, ASAI found claims companies were supplying 100 per cent renewable power were misleading to customers. The advertising watchdog said six sections of its advertiser's code of conduct had been broken including one section that says "advertisers should not exploit the credulity, inexperience or lack of knowledge of consumers".

      Just because it follows the law doesn't mean that it's not misleading.

    1. Reviewer #3 (Public Review):

      The authors describe a machine learning method for classifying the geographic origin of a Salmonella enterica isolate based on its whole-genome sequencing data. This is done at a continent, region, and country level, and the method is shown to be robust to phylogenetic diversity, temporal trends, and possibly some amount of mislabelling (but please see the first concern below). The authors demonstrate that their pipeline produces results in 5 minutes or less, which makes it applicable to many public health microbiology settings.

      Some clear strengths of the paper include:

      - the use of a hierarchical classification method, which ensures that only those samples that can be unambiguously classified as belonging to a specific region can get assigned to a sub-region within that region (e.g. continent to country)
      - leveraging the UKHSA dataset going back nearly a decade, and containing a comprehensive record of all clinically detected Salmonella enterica infections, which mitigates potential biases and ensures a maximal geographic coverage
      - making all the data (microreact) and the source code (GitHub) public, which facilitates replication as well as enables other researchers and public health microbiologists to use the trained models directly on their own data
      - the use of unitigs as the basis for prediction, which are more informative than K-mers yet more straightforward to identify than SNPs or gene alleles.
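      The hierarchical idea in the first strength can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the authors' pipeline. The confidence threshold, the `predict` callback, and the probabilities in `TOY_PROBS` are all hypothetical. A sample descends to the next level only while the top prediction at the current level is confident enough.

```python
def classify_hierarchically(predict, threshold=0.75):
    """Descend continent -> region -> country while predictions are confident.

    `predict(path)` returns a dict of label -> probability for the children
    of `path` (an empty dict when there is no finer level). The sample keeps
    the coarser label as soon as the best child falls below `threshold`.
    """
    path = ()
    while True:
        probs = predict(path)
        if not probs:
            break  # reached the finest level
        label, p = max(probs.items(), key=lambda kv: kv[1])
        if p < threshold:
            break  # ambiguous at this level: stop at the parent
        path += (label,)
    return path


# Hypothetical per-level probabilities for a single isolate.
TOY_PROBS = {
    (): {"Europe": 0.90, "Asia": 0.10},
    ("Europe",): {"Southern Europe": 0.80, "Eastern Europe": 0.20},
    ("Europe", "Southern Europe"): {"Spain": 0.55, "Italy": 0.45},
}


def toy_predict(path):
    return TOY_PROBS.get(path, {})


assigned = classify_hierarchically(toy_predict)
```

      With these made-up numbers the isolate is assigned to a continent and a region but not to a country, because the country-level prediction is too close to call.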

      There are several methodological concerns that should ideally be addressed:

      - in addition to the more complex situation of a tourist visiting country A and consuming food from country B, it would be good to rule out a simpler one of the tourist visiting both countries on the same trip (including via a stopover at an airport); the authors should elaborate on the plausibility of missing data on such multi-country trips and their frequency based on the available travel data
      - similarly, there appears to be an underlying assumption that the UK is never at the origin of a Salmonella enterica infection in the dataset selected; the authors should explain why that is a reasonable assumption for this dataset
      - the increase of infection incidence during the summer months might be at least partly attributable to a greater number of trips abroad during that period - if the authors have corrected their data for this, they should explicitly say so
      - lastly, in discussing the outbreak due to Polish eggs, it should be possible to check explicitly what fraction of the training data may have originated from this outbreak to see if this is sufficient to explain the observed poor prediction

      Overall, this is a paper representing a substantial body of work and combining algorithmic advances with practical utility given the rapid turnaround time. It is likely to be generalisable to other pathogens of public health importance and to become integrated into standard protocols for outbreak origin tracing.

    1. The latter skill has become known as “prompt engineering”: the technique of framing one’s instructions in terms most clearly understood by the system, so it returns the results that most closely match expectations – or perhaps exceed them. Tech commentators were quick to predict that prompt engineering would become a sought-after and well remunerated job description in a “no code” future, where the most powerful way of interacting with intelligent systems would be through the medium of human language. No longer would we need to know how to draw, or how to write computer code: we would simply whisper our desires to the machine and it would do the rest. The limits on AI’s creations would be the limits of our own imaginations.

      Not only is "prompt engineering" seen as a source of future employment, it may also be the attack vector of choice against these systems.

    1. so 11 11 11 00:27:51 are the codes in which we could realize that is the path towards enlightenment and recognize the whole consciousness in the three stages of the universe that's 00:28:05 why when we see this number 11 11 11 is the code that makes us know that we are ready to put all the patterns together and to work for the divine

      wow that was fast with the fractal trinities leading up to 11 11 11

    1. Author Response

      Reviewer #2 (Public Review):

      In this manuscript, the authors use an embedding of human olfactory perceptual data within a graph neural network (which they term principal odor map, or POM). This embedding is a better predictor of a diverse set of olfactory neural and behavior data than methods that use chemical features as a starting point to create embeddings. The embedding is also seen to be better for comparison of pairwise similarities (distances of various sorts) - the claim is that proximity of pairs of odors in the POM is predictive of their similarity in neural data from olfactory receptor neurons.

      A major strength of the paper is the conceptualization of the problem. The authors have previously described a graph neural net (GNN) to predict verbal odor descriptors from molecular features (here, a 2019 preprint is cited, but a newer related one in 2022 describing the POM is not cited). They now use the embedding created by that GNN to predict similarities in large and diverse datasets in olfactory neuroscience (which the authors have curated from published work). They show that predictions from POM are better than just generic chemical features. The authors also present an interesting hypothesis that the underlying latent structure discovered by the GNN relates to metabolic pathway proximity, which they claim accounts for the success in the prediction of a wide range of data (insect sensory neuron responses to human behavior). In addition to the creativity of the project, the technical aspects are sound and thorough.

      There are some questions about the ideas, and the size of the effects observed.

      1) The authors frame the manuscript by invoking an analogy to other senses, and how natural statistics affect what's represented (and how similarity is defined). However, in vision or audition, the part of the world that different animals "look at" can be very different (different wavelengths, different textures and spatial frequencies, etc). It is still unresolved why any given animal has the particular range of reception it has. Each animal is presumably adapted for its ecological niche, which can have different salient sensory features. In vision, different animals pick different sound bandwidths or EM spectra. Therefore, it is puzzling to think that all animals will somehow treat chemicals the same way.

      Our assumption (an assumption of the broader interpretation, not of the analyses themselves) that all terrestrial animals have a correlated odor environment is certainly only true for some values of “correlated”. One could imagine, for example, that some animals are able to exploit food energy sources that humans cannot (for example, plants with high cellulose content), and that they might therefore be adapted to smell metabolic signatures of such plants, whereas humans would not be so adapted. This seems quite reasonable and there are probably many such examples. In future work they might be used to test the theory directly: representations might be more likely to differ across species on tasks when the relevant ecological niches are non-overlapping. We have updated the discussion to propose such future tests. However, it is also apparent that the odor environment overall is nonetheless highly correlated across species. Recent work (Mayhew et al, PNAS) showed that nearly all molecules that pass simple mass transport requirements (that should apply to all mammals, at the least) are likely to have an odor to humans, so it seems unlikely that the “olfactory blind spots” are intrinsically large.

      2) The performance index could be made clearer, and perhaps raw numbers shown before showing the differences from the benchmark (Mordred molecular descriptor). For example, can we get a sense of how much variance in the data does it explain, what percent of the hold-out tests does it fit well, etc.?

      The performance index in Figure 1 is required to compare across different types of tasks, which are in turn dictated by the nature of the data (e.g. continuous vs categorical). Regression tasks yield an R² value and categorical tasks yield an AUROC. We normalized and placed these on a single scale in order to show all of the tasks clearly together. We have added a table to the shared code (from link in Methods section, go to predictive_performance/data/dataset_performance_index_raw.csv) that shows the original (non-normalized) values, for both the POM and the benchmark(s) across multiple seeds and various metrics with the model hyper-parameters that generate the best performance.

      3) The "fitting" and predictions are in line with how ML is used for classification and regression in lots of applications. The end result is a better fit (prediction), but it's not actually clear whether there are any fundamental regularities or orders identified. The metabolic angle is very intriguing, but it looks like Mordred descriptor does a very good job as well (extended figure 5 [now Figure 2-figure supplement 5]). Is it possible to show the relation between metabolic distance and Mordred distance in Figure 2c? In fact, even there, cFP distance looks very well correlated with metabolic distance (we are talking about r= 0.9 vs r = 0.8). This could simply be due to a slightly nonlinear mapping between chemical similarity and perceptual similarity (which was used to get POM distance).

      We show additional “showdown” comparisons between metabolic distance, POM distance, and alternative distance metrics in the new Figure 2-figure supplement 3 and Figure 2-figure supplement 4. Indeed, the Mordred descriptors perform well; after all, metabolic reactants and products must be at least somewhat structurally related. But POM (derived only from human perceptual data) outperforms it significantly. Visual inspection of Figure 2c also reveals that the dispersion of structural distances (at a given metabolic distance) is just much higher than the dispersion of POM distances. This won’t change if one uses a non-linear curve fit, as it is a property of the data itself.

      It’s also worth noting that while r=0.8 and r=0.9 might seem close, in terms of variance unexplained (1 - r²) they are approximately two-fold different. Reducing the unexplained variance by half seems like a meaningful difference. Alternatively, if one simulates scatter plots with correlation r=0.8 vs r=0.9, it is apparent that the latter is simply a much tighter relationship.
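      The arithmetic behind the "approximately two-fold" statement is easy to check. The short Python sketch below only evaluates 1 - r² for the two correlations quoted in the text. The helper name `unexplained_variance` is mine, not from the paper or its shared code.

```python
def unexplained_variance(r: float) -> float:
    """Fraction of variance left unexplained by a linear fit with correlation r."""
    return 1.0 - r * r


u_08 = unexplained_variance(0.8)  # 1 - 0.64
u_09 = unexplained_variance(0.9)  # 1 - 0.81
ratio = u_08 / u_09               # how many times more variance is unexplained at r=0.8
```

      The unexplained fractions come out at about 0.36 and 0.19, a ratio of roughly 1.9, which matches the "approximately two-fold" wording.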

      4) How frequent are such examples shown in Fig 2d? Pentenal and pentenol are actually very similar in many ways, and it may be that Tanimoto distance is not a great descriptor of chemical similarity. cFP edit distance is quite small, just like metabolic distance. The thiol example on the right is much better. Also, even in Fig 2C POM vs metabolic distance, the lowest metabolic distances have large variations in the POM values - so there too, metabolic reactions that create very different molecules in 1 step can vary widely in POM distance as well.

      We agree that Tanimoto distance is not perfect. We were unable to find a measure of structural distance that agreed with human intuitions about “structural distance” in all cases; indeed that intuition is often generated by an understanding of odor/flavor characteristics of function in metabolic networks, which would beg the question! To answer the question about the frequency of examples like the ones shown in Figure 2d, we created a new density map (Figure 2-figure supplement 4) showing the number of one-step metabolite pairs for a given range of POM vs cFP edit/Tanimoto distance. We found >25 pairs of metabolites in the same “small POM distance” and “large structural distance” quadrant from which we found the original examples shown in Figure 2d.

      5) A major worry is that Mordred descriptors are doing fine, and POM offers only a small improvement (but statistically significant of course). Another way to ask this question is this: if you plot pairwise correlation/distance of pairs of odors from POM against that for Mordred, how correlated does this look? My suspicion is that it will be highly correlated.

      It will look highly correlated (as shown in the new Figure 2-figure supplement 3). The reason is that metabolic reactions cannot make arbitrary transformations to molecules (the reactants must have some structural relationship to the products) or similarly that olfactory receptors (in any species) cannot have arbitrary tuning – at the end of the day receptors mostly bind to similar-looking classes of molecules. As stated above, we believe that the improvement here is not just statistically significant but meaningful – a 2-fold drop in unexplained variance is large – and that it is important to identify principles by which the nervous system can be tuned, above and beyond the physical constraints imposed by basic rules of chemistry.

      Also, the metabolic distances that we constructed from available data are themselves noisy, since not all metabolic pathways and the compounds that compose them are known, which places an upper bound on the correlation that we could have obtained. Despite that, we still found a correlation of r>0.9.

      6) The co-occurrence in mixtures and close POM distance may arise from the way the embedding was done - with perceptual descriptors used as a key variable. Humans may just classify molecules that occur in a mixture as similar just from experiencing them together. Can the authors show that these same molecules in Fig 4d,e have very similar representations in neural data from insects or mice?

      We have added a new Figure 4-figure supplement 1 to show this. One constraint is that the neural datasets must contain molecules that are also in the natural substance datasets used in Figure 4. In all cases where the data is sufficient to be powered to test the hypothesis (i.e. more than five co-occurring pairs of molecules in essential oil), we observe an effect in the predicted direction.

    2. Reviewer #1 (Public Review):

      This study builds an odorant organization map as estimated by a neural network trained on several odor perceptual classification databases. The authors come up with an attractive hypothesis about the link of odor perception to metabolic connectedness, as opposed to a range of other ways of classifying odorant compounds. There are several interesting implications of this, which the authors touch upon, but could perhaps frame as specific predictions.

      The authors clearly have generated a powerful methodology, a useful classifying network, and a well-organized database. The study would be much stronger if the methodology were more thoroughly explained, with open code and data availability as expected for a computational study, and as a resource for further research on the topic.

      It would also be valuable to place the current findings in the context of considerable earlier work that has sought to map odor perception and place it in the context of structural and chemical features.

    1. In an early blog on Recurrent Neural Networks (still my favorite for its insight into what's really going on), the author found individual neurons that evolved/optimized themselves into being gatekeepers for certain high-level features in the text: this section. After training an RNN to generate C code, one neuron became sensitive to the length of the line, another turned on inside quotation marks and was off outside, another was only on for the predicate of if statements, another for comments or quoted text, and one more for the depth of nested brackets/indentation. Most of the neurons were not easily interpretable, but presumably, combinations of them controlled combinations of high-level features. Could a set of booleans controlling things like "if quotation," "if comment," "if predicate" plus many other conditions be considered an internal representation of C code? If I were to write an algorithm for generating C code, it almost certainly would include variables that controlled these things. The way I look at it, the biggest difference between machine learning and hand-written code is the development process. Hand-written code is like craftsmanship, like building a chair from wood, while machine learning is like farming: putting a seed in the ground, controlling the environment—humidity, temperature, hyperparameters, training datasets—and waiting. Design is good for some products and agriculture is good for others. Agriculture is a particularly good way to make very complex things with loose constraints on how it works: I would not want to design a tree, but when nature grows a tree, I don't care if it has three branches on the left and two on the right or vice-versa. I'm glad that we now have two ways of making software, craftsmanship and farming. It's good to have more tools.

      He basically likens it to evolution without being aware of it. I love how complex and intricately the functions of these neurons emerge, really like evolution.
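
      The hand-written alternative mentioned in the quoted passage, explicit variables for "inside quotation marks", bracket depth, and line length, can be sketched as a tiny character scanner. This is only an illustration: the function name `scanC` and the state fields are hypothetical, and the scanner deliberately ignores escapes, character literals, and comments.

      ```js
      // Hypothetical hand-written counterpart to the "gatekeeper" neurons:
      // explicit state variables updated one character at a time.
      // Simplification (assumption): no escapes, char literals, or comments.
      function scanC(source) {
        const state = { inQuote: false, depth: 0, lineLength: 0 };
        const trace = [];
        for (const ch of source) {
          if (ch === '"') {
            state.inQuote = !state.inQuote; // toggles at every double quote
          } else if (!state.inQuote && ch === '{') {
            state.depth += 1; // braces only count outside quotes
          } else if (!state.inQuote && ch === '}') {
            state.depth -= 1;
          }
          // Line length resets at each newline.
          state.lineLength = ch === '\n' ? 0 : state.lineLength + 1;
          trace.push({ ch, ...state }); // snapshot of the state after this char
        }
        return trace;
      }

      const trace = scanC('if (x) { puts("{"); }');
      ```

      Each entry of `trace` plays the role of one time step of the network: the brace inside the string literal leaves `depth` unchanged because `inQuote` is on at that point.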

    1. LogisticRegression:

      good code and docstrings. you could also comment individual lines, but that's more of a stylistic preference than anything else, the docstrings were very informative

    1. In 1958, when the book was first published, homosexuality was a taboo subject; when they referenced it, writers and filmmakers therefore used discreet codes. Foucault’s theory of the ‘repressive hypothesis’ explains how homosexuality was silenced due to its controversial nature, as at the time ‘homosexuality was annexed to mental illness.’ Capote never confirmed the exact sexual orientation of any of his characters, but littered his novella with subtle connotations alluding to homosexuality. In the book it is implied that bartender Joe Bell is gay. Joe is fascinated with ice hockey and also loves soap operas and the theatre; this juxtaposition of stereotypically female and male interests implies he is pretending to be interested in masculine pastimes in an attempt to veil his true sexual orientation. Furthering this impression, Capote feminizes Joe by describing him as arranging flowers with ‘matronly care’. However, in the film adaptation, Axelrod discarded the character completely. Likewise, Holly is suggested to be a lesbian, as she mentions her former lesbian roommate and indirectly expresses a sexual interest in other women. Capote later stated ‘its well-known fact that most prostitutes are lesbians- at least 80 percent of them, in any case.’ However, this notion was completely ruled out for the screenplay, which presented her as a straight woman. The narrator, referred to as Paul Varjak in the film, is also hinted to be homosexual. In the novella Holly calls him a ‘Maude’, a term used by the gay community in the 1950s and 60s for a male prostitute. The character has no previous dating history, and is not sexually attracted to Holly. Once again this character was drastically altered for the movie, in which he appears as a very masculine, heterosexual character who is possessive of Holly. 
Axelrod was pressured to make these major alterations by the backward societal views of the time; after all, the characters had to appeal to this audience. The Hays Code, a motion picture production code adopted in 1930 that ‘recognised producers responsibility to the public’, further forced these changes to sexual orientation: as homosexuality was not accepted, the explicit inclusion of gay characters would have been seen as offensive.

      can be used for gei intro

    1. If you haven’t built something useful with your library yet, it is unlikely anyone else will. Code reuse isn’t a good excuse to avoid duplicating code, and writing reusable code inside your project is often a form of preemptive optimization.

      LOL I remember I wrote a wrapper for HTML5 Canvas and never used it, there is some serious wisdom here

    2. Never rewrite your code from scratch, ever!.

      Society should rewrite its operating systems from scratch every generation, too bad our Hardware is not standardized well enough.... well not yet anyways...

    1. One of the major benefits of E-Commerce Search is its ability, through the UI, to handle dictionaries, sorting and ranking, and defining fields and facets to be returned. Previously, making these changes would require a developer to modify the code. However, E-Commerce Search empowers search managers to make these adjustments without relying on a developer.

      Also, to get feedback early on in the process with diagnostics / context preview, fix mistakes before they go live with publications, and get help with weak searches with the dashboard.

    1. Now when App is rendered and a request is initiated to get the LazyLoad code, the fallback Loading is rendered. When this request completes, React will then render LazyLoad.

      ```js
      const App = () => {
        return (
          <Suspense fallback="Loading">
            <LazyLoad />
          </Suspense>
        );
      };
      ```

    2. Now, App and LazyLoad are in separate code chunks. A request is sent to fetch the LazyLoad code chunk only when App is rendered. When the request completes, React will then render LazyLoad. You can verify this by looking at the JavaScript requests in the network inspector.

      ```js
      import React, { lazy } from 'react';

      const LazyLoad = lazy(() => import('./LazyLoad'));

      const App = () => {
        return <LazyLoad />;
      };
      ```
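
      The "request is sent only when needed" behaviour described in the highlight can be illustrated without React. The sketch below is not React's implementation of `lazy`; `lazyOnce` is a hypothetical helper, and the loader is a stand-in for `() => import('./LazyLoad')` that simply counts how often it runs.

      ```js
      // Hypothetical helper (not React's code): defer a loader until first use
      // and cache the resulting promise so the chunk is requested only once.
      function lazyOnce(loader) {
        let pending = null; // nothing has been requested yet
        return function load() {
          if (pending === null) {
            pending = Promise.resolve(loader()); // first call triggers the load
          }
          return pending; // later calls reuse the same promise
        };
      }

      // Stand-in for `() => import('./LazyLoad')`; counts simulated requests.
      let requests = 0;
      const loadLazyLoad = lazyOnce(() => {
        requests += 1;
        return { default: 'LazyLoad component' };
      });
      ```

      Defining `loadLazyLoad` performs no request; only the first call does, and repeated calls share one promise, which mirrors why the extra chunk shows up in the network inspector only once.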

    1. Reviewer #2 (Public Review):

      In this manuscript, Roberts et al. present XTABLE, a tool to integrate, visualise and extract new insights from published datasets in the field of preinvasive lung cancer lesions. This approach is critical and to be highly commended; whilst the Cancer Genome Atlas provided many insights into cancer biology it was the development of accessible visualisation tools such as cbioportal that democratised this knowledge and allowed researchers around the world to interrogate their genes and pathways of interest. XTABLE is trying to do this in the preinvasive space and should certainly be commended as such. We are also very impressed by the transparency of the approach; it is quite simple to download and run XTABLE from their Gitlab account, in which all data acquisition and analysis code can be easily interrogated.

      We would however strongly advocate deploying XTABLE to a web-accessible server so that researchers without experience in R and git can utilise it. We found it a little buggy running locally and cannot be sure whether this is due to our setup or the code itself. Some issues clearly need development; Progeny analysis brings up a warning "Not working for GSE109743 on the server and not sure why". GSEA analysis does not seem to work at all, raising an error "Length information for genome hg38 and gene ID ensGene is not available". In such relatively complex software, some such errors can be overlooked, as long as the authors have a clear process for responding to them, for example using Gitlab issue reporting. Some acknowledgement that this is an ongoing development would be helpful.

      The authors discuss some very important differences between the datasets in the text. Most notably they differ in endpoints and in the presence of laser capture. We would advocate including some warning text within the XTABLE application to explain these. For example, the "persistent/progressive" endpoint used in Beane et al (next biopsy is the same or higher grade) is not the same as the "progressive" endpoint in Teixeira et al (next biopsy is cancer); samples defined as "persistent/progressive" may never progress to cancer. This may not be immediately obvious to a user of XTABLE who wishes to compare progressive and regressive lesions. Similarly, the use of laser capture is important; the authors state that not using laser capture has the advantage of capturing microenvironment signals, but differentiating between intra-lesional and stromal signals is important, as shown in the Mascaux and Pennycuick papers. The authors cannot do much about the different study designs, but as the goal is to make these data more accessible, we think some brief description of these issues within the app would help to prevent non-expert users from drawing incorrect conclusions.

      The authors themselves illustrate this clearly in their analysis of CIN signatures in progression potential. They observe that there is a much clearer progressive/regressive signal in GSE108124 compared to GSE114489 and GSE109743. This does not seem at all surprising, since the first study used a much stricter definition of progression - these samples are all about to become cancer whereas "progressive" samples in GSE109743 may never become cancer - and are much enriched for CIN signals due to laser capture. Their discussion states "CIN scores as a predictor of progression might be limited to microdissected samples and CIS lesions"; you cannot really claim this when "progression" in the two cohorts has such a different meaning. To their credit, the authors do explain these issues but they really should be clearly spelled out within the app.

      We are not sure we agree with their analysis of CDK4/Cyclin-D1 and E2F expression in early lesions. The authors claim these are inhibited by CDKN2A and therefore are markers of CDKN2A loss of function. But these genes are markers of proliferation and can be driven by a range of proliferative processes. Histologically, low-grade metaplasias and dysplasias all represent proliferative epithelium when compared to normal control, but most never become cancer. It is too much of a leap to say that these are influenced by CDKN2A because that gene is inactivated in LUSC; do the authors have any evidence that this gene is altered at the genomic level in low-grade lesions?

      Overall this tool is an important step forwards in the field. Whilst we are a little unconvinced by some of their biological interpretations, and the tool itself has a few bugs, this effort to make complex data more accessible will be greatly enabling for researchers and so should be commended. In the future, we would like to see additional molecular data integrated into this app, for example, the whole genome and methylation data mentioned in line 153. However, we think this is an excellent start to combining these datasets.

    2. Author Response:

      Reviewer #1 (Public Review):

      Roberts et al have developed a tool called "XTABLE" for the analysis of publicly available transcriptomic datasets of premalignant lesions (PML) of lung squamous cell carcinoma (LUSC). Detection of PMLs has clinical implications and can aid in the prevention of deaths by LUSC. Hence efforts such as this will be of benefit to the scientific community in better understanding the biology of PMLs.

      The authors have curated four studies that have profiled the transcriptomes of PMLs at different stages. While three of them are microarray-based studies, one study has profiled the transcriptome with RNA-seq. XTABLE fetches these datasets and performs analysis in an R shiny app (a graphical user interface). The tool has multiple functionalities to cover a wide range of transcriptomic analyses, including differential expression, signature identification, and immune cell type deconvolution.

      The authors have also included three chromosomal instability (CIN) signatures from literature based on gene expression profiles. They showed one of the CIN signatures as a good predictor of progression. However, this signature performed well only in one study. The authors have further utilised the tool XTABLE to identify the signalling pathways in LUSC important for its developmental stages. They found the activation of squamous differentiation and PI3K/Akt pathways to play a role in the transition from low to high-grade PMLs.

      The authors have developed user-friendly software to analyse publicly available gene expression data from premalignant lesions of lung cancer. This would help researchers to quickly analyse the data and improve our understanding of such lesions. This would pave the way to improve early detection of PMLs to prevent lung cancer.

      Strengths:

      1. XTABLE is a nicely packaged application that can be used by researchers with very little computational knowledge.
      2. The tool is easy to download and execute. The documentation is extensive both in the article and on the GitLab page.
      3. The tool is user-friendly, and the tabs are intuitively designed for successive steps of analysis of the transcriptome data.
      4. The authors have properly elaborated on the biological interest in investigating PMLs and their clinical significance.

      Weaknesses:

      The article is focused on the development and the utility of the tool XTABLE. While the tool is nicely developed, the need for a tool focussing only on the investigation of PMLs is not justified. Several shiny apps and online tools exist to perform transcriptomic analysis of published datasets. To list a few examples - i) http://ge-lab.org/idep/ ; ii) http://www.uusmb.unam.mx/ideamex/ ; iii) RNfuzzyApp (Haering et al., 2021); iv) DEGenR (https://doi.org/10.5281/zenodo.4815134); v) TCC-GUI (Su et al., 2019). While some of these are specific to RNA-seq, there are plenty of such shiny apps to perform both RNA-seq and microarray data analysis. Any of these tools could also be used easily for the analysis of the four curated datasets presented in this article. The authors could have elaborated on the availability of other tools for such analysis and provided an explanation of the necessity of XTABLE. Since 3 of the 4 datasets they curated are from microarray technology, another good example of a user-friendly tool is NCBI GEO2R. This is integrated with the NCBI GEO database, and the user doesn't need to download the data or run any tools. iDEP-READS (http://bioinformatics.sdstate.edu/reads/) provide an online user-friendly tool to download and analyse data from publicly available datasets. Another such example is GEO2Enrichr (https://maayanlab.cloud/g2e/). These tools have been designed for non-bioinformatic researchers that don't involve downloading datasets or installing/running other tools.

      Two of these tools (IDEP and TCC-GUI) were covered in a literature review of 20 Shiny apps performed two years ago, before work on XTABLE started. Three of the suggested tools (IDEP, RNFuzzyApp, TCC-GUI) process only RNA-seq datasets. IDEAMEX also appears to be for RNA-seq data only and is severely limited in its downstream analysis capabilities. DEGenR appears to handle microarray datasets and features an option to retrieve data directly from GEO. However, it appears to be based on GEO2R (with additional downstream analyses): it automatically log-transforms already log-transformed data and, unlike GEO2R, offers no option to skip the log-transformation. A refreshed literature search focusing on microarray datasets highlighted three additional tools: iGEAK, which hasn’t been updated in three years and seems to have compatibility issues on new Windows and Mac machines; sMAP, an upcoming Shiny app for microarray data published in bioRxiv on 29 May 2022; and MAAP, which has the same issue of log-transforming already log-transformed data. iDEP-READS does not list the datasets used in XTABLE. GEO2Enrichr appears to require the counts table and experimental design in one file, performs a “characteristic direction” DEG test and outputs enriched pathways. These apps require not just downloading the datasets but also reformatting and renaming expression data files and creating additional files to set up the DEG analysis, which is not practical for the number of samples we have (122, 63, 33, 448), even if these apps handled microarray data. XTABLE also incorporates AUC metrics, which are appropriate given the number of samples in each dataset, and tools known for adequately controlling FDR, which is not seen in other apps, as well as an emphasis on individual gene results and interrogation.

      A new paragraph in the discussion section (lines 361-370) addresses the potential use of existing applications instead of XTABLE.
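
      The double log-transformation problem raised in the response above can be guarded against with a simple range check before transforming. The sketch below is an assumption-laden illustration, not code from XTABLE, GEO2R, or any of the apps named: the function names and the threshold of 100 on the 99th percentile are hypothetical choices, based on the observation that log2-scale expression values rarely exceed a few tens.

      ```js
      // Hypothetical guard against log-transforming already log-transformed data.
      // Linear-interpolated quantile of an ascending-sorted numeric array.
      function quantile(sorted, q) {
        const pos = (sorted.length - 1) * q;
        const lo = Math.floor(pos);
        const hi = Math.ceil(pos);
        return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo);
      }

      // Assumption: a 99th percentile at or below `threshold` means the values
      // are already on a log scale.
      function looksLogTransformed(values, threshold = 100) {
        const sorted = values.filter(Number.isFinite).sort((a, b) => a - b);
        return quantile(sorted, 0.99) <= threshold;
      }

      // Apply log2 only when the data look like raw intensities.
      // Note: raw values must be positive for log2 to be meaningful.
      function log2IfNeeded(values) {
        return looksLogTransformed(values) ? values : values.map((v) => Math.log2(v));
      }
      ```

      For example, `log2IfNeeded([16, 1024, 65536])` transforms the values, whereas `log2IfNeeded([4, 8, 12])` returns them unchanged. A real pipeline would also want an explicit user override, which is the option the response notes is missing in some tools.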

      Secondly, XTABLE doesn't provide a solution to integrate the four datasets incorporated in the tool. One can only analyse one dataset at a time with XTABLE. The differences in terms of methodology and study design within these four datasets have been elaborated on in the article. However, attempts to integrate them were lacking.

      We repeatedly considered different strategies of integrating the analysis of the four datasets and we always reached the conclusion that it was hardly going to offer any advantage, or that it might be counterproductive.

      Integration can occur at multiple levels. One possibility is to carry out the same analysis (e.g. expression of a given gene in two groups of samples) in all datasets. Since the design and methodologies of the four studies differ substantially (different stages, different definitions of progression status, etc), a unique stratification for all datasets is not possible. Moreover, interrogating the four datasets simultaneously would slow the analysis without offering any significant advantage. Another possibility is the integration of results in the same output, for instance, a single chart with the expression of a given gene in multiple subgroups of the four datasets. We think that, due to the differences in design, the results from each cohort should be kept separate and then compared with a similar analysis from other datasets. Scientifically, this is the best way to proceed as it avoids confusion.

      Nevertheless, XTABLE allows the export of data for further analysis. The user can use this option to integrate data using other applications or statistical packages.

      We do understand the attractiveness of integrating the four datasets, and we seriously considered it. But there is a fine balance between user-friendliness, flexibility, and scientific rigour. We think that XTABLE achieves this balance. Increasing integration of datasets might lead to errors and wrong conclusions due to biological and methodological differences between studies. We believe that comparing analyses obtained independently from the four cohorts is the most sensible way to proceed.

      We propose to discuss these aspects accordingly.

      The integrative analysis of two or more datasets has been discussed in a new paragraph (382-391)

      The tool also lacks the flexibility for users to add more datasets. This would be helpful when there are more datasets of PMLs available publicly.

      This was also a permanent topic for discussion while designing XTABLE. Creating a tool that could be used to analyse other cohorts of precancerous lesions, while maintaining the ease of use was certainly a challenge. We had to adapt XTABLE to the characteristics of each one of the four databases: specific stratification criteria, different nomenclatures for the different sample types, etc. Designing a shiny app that can be adapted to other present or future datasets without the need of changing the code is simply not practical.

      The flexibility that these other Shiny apps incorporate to analyse any RNA-seq dataset requires the contrasts used for the differentially expressed gene analysis be manually defined. IDEP requires an experimental design file where sample names in the counts file must match exactly the sample names in this experimental design file and pre-processing visualisation is limited to the first 100 samples. RNFuzzyApp is similar but we could not format the experimental design file in a way that did not result in the app crashing upon upload. TCC-GUI requires all the sample names to be renamed to the contrast group with the addition of the replicate number. Apps that allow datasets to be uploaded do not have a practical or easy way to set up the DEG analysis of more than a couple dozen samples.

      Future versions of XTABLE can be updated to include additional curated PML datasets that would enhance hypothesis generation upon request. Importantly, the code is freely available and can be modified by other scientists to add their cohorts of interest, although we agree that a high level of expertise in coding will be needed. We propose to add these considerations to the text.

      The possibilities of expansion of XTABLE to new databases are discussed in lines 392-398

      Understanding the biology of PML progression would require a multi-omics approach. XTABLE analyses transcriptome data and lacks integration of other omics data. The authors mention the availability of data from whole exome, methylation, etc from the four studies they have selected. However, apart from the CIN scores, they haven't integrated any of the other layers of omics data available.

      Only one dataset (GSE108104) contains whole-exome sequencing and methylation data. We considered that a multi-omics approach in XTABLE would result in an overcomplicated application. As far as early detection and biomarker discovery is concerned, transcriptomic data is the most interesting parameter.

      Also discussed in lines 382-391

      Lastly, the authors could have elaborated on the limitations of the tool and their analysis in the discussion.

      We propose to raise these limitations accordingly in the discussion.

      See above.

      Reviewer #2 (Public Review):

      In this manuscript, Roberts et al. present XTABLE, a tool to integrate, visualise and extract new insights from published datasets in the field of preinvasive lung cancer lesions. This approach is critical and to be highly commended; whilst the Cancer Genome Atlas provided many insights into cancer biology it was the development of accessible visualisation tools such as cbioportal that democratised this knowledge and allowed researchers around the world to interrogate their genes and pathways of interest. XTABLE is trying to do this in the preinvasive space and should certainly be commended as such. We are also very impressed by the transparency of the approach; it is quite simple to download and run XTABLE from their Gitlab account, in which all data acquisition and analysis code can be easily interrogated.

      We would however strongly advocate deploying XTABLE to a web-accessible server so that researchers without experience in R and git can utilise it. We found it a little buggy running locally and cannot be sure whether this is due to our setup or the code itself. Some issues clearly need development; Progeny analysis brings up a warning "Not working for GSE109743 on the server and not sure why". GSEA analysis does not seem to work at all, raising an error "Length information for genome hg38 and gene ID ensGene is not available". In such relatively complex software, some such errors can be overlooked, as long as the authors have a clear process for responding to them, for example using Gitlab issue reporting. Some acknowledgement that this is an ongoing development would be helpful.

      We thank the reviewer for these comments. We will inspect the code to address those warnings, implement a system for issue reporting, and add the acknowledgements suggested by the reviewer. Regarding the deployment of XTABLE to a web-accessible server, this could present a challenge in the long term, as computing resources would need to be allocated for years, with the economic cost that this involves.

      The code has been inspected to remove the warning and errors pointed out by the reviewer.

      The authors discuss some very important differences between the datasets in the text. Most notably they differ in endpoints and in the presence of laser capture. We would advocate including some warning text within the XTABLE application to explain these. For example, the "persistent/progressive" endpoint used in Beane et al (next biopsy is the same or higher grade) is not the same as the "progressive" endpoint in Teixeira et al (next biopsy is cancer); samples defined as "persistent/progressive" may never progress to cancer. This may not be immediately obvious to a user of XTABLE who wishes to compare progressive and regressive lesions. Similarly, the use of laser capture is important; the authors state that not using laser capture has the advantage of capturing microenvironment signals, but differentiating between intra-lesional and stromal signals is important, as shown in the Mascaux and Pennycuick papers. The authors cannot do much about the different study designs, but as the goal is to make these data more accessible, we think some brief description of these issues within the app would help to prevent non-expert users from drawing incorrect conclusions.

      The authors themselves illustrate this clearly in their analysis of CIN signatures in progression potential. They observe that there is a much clearer progressive/regressive signal in GSE108124 compared to GSE114489 and GSE109743. This does not seem at all surprising, since the first study used a much stricter definition of progression - these samples are all about to become cancer whereas "progressive" samples in GSE109743 may never become cancer - and are much enriched for CIN signals due to laser capture. Their discussion states "CIN scores as a predictor of progression might be limited to microdissected samples and CIS lesions"; you cannot really claim this when "progression" in the two cohorts has such a different meaning. To their credit, the authors do explain these issues but they really should be clearly spelled out within the app.

      This is a very good point. We will add warning text about the differences between studies regarding the definition of progression potential and the differences in sample processing (LCM or not) so that the user is permanently aware of the differences between cohorts.

      A new tab (Dataset) has been added with a table of the methodologies used in each study and the differences in progression status definitions. Additionally, we emphasized these differences in the main text of the manuscript (lines 296-300 and 403-409).

      We are not sure we agree with their analysis of CDK4/Cyclin-D1 and E2F expression in early lesions. The authors claim these are inhibited by CDKN2A and therefore are markers of CDKN2A loss of function. But these genes are markers of proliferation and can be driven by a range of proliferative processes. Histologically, low-grade metaplasias and dysplasias all represent proliferative epithelium when compared to normal control, but most never become cancer. It is too much of a leap to say that these are influenced by CDKN2A because that gene is inactivated in LUSC; do the authors have any evidence that this gene is altered at the genomic level in low-grade lesions?

      We are grateful for this comment. There is currently no evidence that CDKN2A mutations occur in low-grade lesions and therefore we cannot argue that the CDK4/Cyclin-D1 and E2F expression signatures are the result of CDKN2A inactivation in low-grade lesions. We propose to modify the text to introduce these caveats to our conclusion and make our interpretations more accurate.

      We have modified the discussion (lines 443-454) to address the interpretation of our results regarding the connection between CDKN2A inactivation and the CDK4/cyclin-D1 and E2F signatures. We now focus our conclusions on the pathway itself and we mention Cyclin-D1 and CDKN2A alterations as a potential modulator of the changes in the pathway, but leaving the discussion open to other drivers.

      Overall this tool is an important step forwards in the field. Whilst we are a little unconvinced by some of their biological interpretations, and the tool itself has a few bugs, this effort to make complex data more accessible will be greatly enabling for researchers and so should be commended. In the future, we would like to see additional molecular data integrated into this app, for example, the whole genome and methylation data mentioned in line 153. However, we think this is an excellent start to combining these datasets.

    1. Vagrancy Law Section 2. Be it further enacted, that all freedmen, free Negroes, and mulattoes in this state over the age of eighteen years found on the second Monday in January 1866, or thereafter, with no lawful employment or business, or found unlawfully assembling themselves together either in the day or nighttime, and all white persons so assembling with freedmen, free Negroes, or mulattoes, or usually associating with freedmen, free Negroes, or mulattoes on terms of equality, or living in adultery or fornication with a freedwoman, free Negro, or mulatto, shall be deemed vagrants; and, on conviction thereof, shall be fined in the sum of not exceeding, in the case of a freedman, free Negro, or mulatto, $50, and a white man, $200, and imprisoned at the discretion of the court, the free Negro not exceeding ten days, and the white man not exceeding six months…. Section 7. Be it further enacted, that if any freedman, free Negro, or mulatto shall fail or refuse to pay any tax levied according to the provisions of the 6th Section of this act, it shall be prima facie evidence of vagrancy, and it shall be the duty of the sheriff to arrest such freedman, free Negro, or mulatto, or such person refusing or neglecting to pay such tax, and proceed at once to hire, for the shortest time, such delinquent taxpayer to anyone who will pay the said tax, with accruing costs, giving preference to the employer, if there be one. Section 8. Be it further enacted, that any person feeling himself or herself aggrieved by the judgment of any justice of the peace, mayor, or alderman in cases arising under this act may, within five days, appeal to the next term of the county court of the proper county, upon giving bond and security in a sum not less than $25 nor more than $150, conditioned to appear and prosecute said appeal, and abide by the judgment of the county court, and said appeal shall be tried de novo in the county court, and the decision of said court shall be final. 
Civil Rights of Freedmen Section 1. Be it enacted by the legislature of the state of Mississippi, that all freedmen, free Negroes, and mulattoes may sue and be sued, implead and be impleaded in all the courts of law and equity of this state, and may acquire personal property and choses in action, by descent or purchase, and may dispose of the same in the same manner and to the same extent that white persons may: Provided, that the provisions of this section shall not be construed as to allow any freedman, free Negro, or mulatto to rent or lease any lands or tenements, except in incorporated towns or cities, in which places the corporate authorities shall control the same…. Section 7. Be it further enacted, that every civil officer shall, and every person may, arrest and carry back to his or her legal employer any freedman, free Negro, or mulatto who shall have quit the service of his or her employer before the expiration of his or her term of service without good cause, and said officer and person shall be entitled to receive for arresting and carrying back every deserting employee aforesaid the sum of $5, and 10 cents per mile from the place of arrest to the place of delivery, and the same shall be paid by the employer, and held as a setoff for so much against the wages of said deserting employee: Provided, that said arrested party, after being so returned, may appeal to a justice of the peace or member of the board of police of the county, who, on notice to the alleged employer, shall try summarily whether said appellant is legally employed by the alleged employer and his good cause to quit said employer; either party shall have the right of appeal to the county court, pending which the alleged deserter shall be remanded to the alleged employer or otherwise disposed of as shall be right and just, and the decision of the county court shall be final. Penal Code Section 1. 
Be it enacted by the legislature of the state of Mississippi, that no freedman, free Negro, or mulatto not in the military service of the United States government, and not licensed so to do by the board of police of his or her county, shall keep or carry firearms of any kind, or any ammunition, dirk, or Bowie knife; and, on conviction thereof in the county court, shall be punished by fine, not exceeding $10, and pay the costs of such proceedings, and all such arms or ammunition shall be forfeited to the informer; and it shall be the duty of every civil and military officer to arrest any freedman, free Negro, or mulatto found with any such arms or ammunition, and cause him or her to be committed for trial in default of bail… Section 4. Be it further enacted, that all the penal and criminal laws now in force in this state defining offenses and prescribing the mode of punishment for crimes and misdemeanors committed by slaves, free Negroes, or mulattoes be and the same are hereby reenacted and declared to be in full force and effect against freedmen, free Negroes, and mulattoes, except so far as the mode and manner of trial and punishment have been changed or altered by law…. Section 5. Be it further enacted, that if any freedman, free Negro, or mulatto convicted of any of the misdemeanors provided against in this act shall fail or refuse, for the space of five days after conviction, to pay the fine and costs imposed, such person shall be hired out by the sheriff or other officer, at public outcry, to any white person who will pay said fine and all costs and take such convict for the shortest time.

      All of the above provisions of the Black Codes strip African Americans of their rights, which is a serious violation of the philosophy of the Declaration of Independence and tramples on the dignity of the Constitution.

    1. Then Gawain said angrily, “Why talk on thus? Thou dost threaten too long. I hope thy heart misgives thee.”

      Sir Gawain is known for his chivalry. As Gawain reflects, he becomes angry with himself for being deceived by the Green Knight. Sir Gawain demonstrates heroism "by the geometric figure on [his] shield" and the "moral imperfection of his actions." Gawain therefore feels ashamed of his failure to uphold the chivalric code. Even though he was brave in facing the Green Knight, Sir Gawain might have "failed" his chivalric code; but he showed "Christian humility" by withstanding the temptations and honoring the Lord.

      Farrell, Thomas J. “LIFE AND ART, CHIVALRY AND GEOMETRY IN SIR GAWAIN AND THE GREEN KNIGHT.” Arthurian Interpretations, vol. 2, no. 2, 1988, pp. 17–33. JSTOR, http://www.jstor.org/stable/27868637. Accessed 10 Mar. 2023.

    2. “Now take ye this penance, and it shall be for your amendment.” Much mirth thereof did Sir Gawain make. Then they questioned that prince courteously of whence he came; and he told them that he was of the court of Arthur, who is the rich royal King of the Round Table, and that it was Gawain himself who was within their walls, and would keep Christmas with them, as the chance had fallen out.

      In this part of the story, Gawain arrives at the Green Knight's castle to fulfill his part of the bargain of meeting up with him a year after their original fight. The Green Knight's wife tests Gawain's loyalty by seducing Gawain and giving him a green girdle as a keepsake. He accepts because he thinks it will protect him from the Green Knight's axe. In this particular line, the Green Knight is telling Gawain to take on the penance of wearing the green girdle as a sign of his shame and a reminder of his failure to uphold the code of chivalry. Gawain is relieved that he hasn't been severely punished due to his own foolishness.

      Kibin. (2023). The concept of the sacrament of penance in sir gawain and the green knight. http://www.kibin.com/essay-examples/the-concept-of-the-sacrament-of-penance-in-sir-gawain-and-the-green-knight-L4nfHwhq

    3. Then Arthur beheld this adventurer before his high daïs, and knightly he greeted him, for fearful was he never. “Sir,” he said, “thou art welcome to this place–lord of this hall am I, and men call me Arthur. Light thee down, and tarry awhile, and what thy will is, that shall we learn after.

      Arthur does not know the Green Knight's intentions, yet he still invites him to join him and his knights in their hall. This can be seen as an act of chivalry and shows how Arthur still follows the rules of chivalry as a king and a knight. It could also be a way to gain the upper hand in case the Green Knight did mean harm. According to an article on the Arthurian Code of Chivalry, one should never turn his back on an enemy. While King Arthur may be treating the knight as a guest, he is not unaware of the Green Knight's potential.

      Howells, Caleb. What is the arthurian code of chivalry? MythBank. Nov. 25, 2020. https://mythbank.com/what-is-the-arthurian-code-of-chivalry/

    4. she pressed the girdle on him and prayed him to take it, and he granted her prayer, and she gave it him with good will, and besought him for her sake never to reveal it but to hide it loyally from her lord; and the knight agreed that never should any man know it, save they two alone.

      The green girdle can be seen as a symbol of Lady Bertilak's love for Gawain, which is why she made him promise to hide it from everyone, even Lord Bertilak. However, in Gawain's case it could be seen as a symbol of his desperation to survive. Gawain hides the existence of the girdle from Lord Bertilak because it ensures his survival, even though the two agreed to exchange anything they received during the day and night. It shows that Gawain was so desperate to survive that he would throw away his own code of honor.

      Shmoop Editorial Team. “Sir Gawain and the Green Knight Symbolism, Imagery, Allegory.” Shmoop, Shmoop University, 11 Nov. 2008, https://www.shmoop.com/study-guides/literature/sir-gawain-green-knight/analysis/symbolism-imagery-allegory#:~:text=Symbolism%3A%20The%20Green%20Girdle&text=For%20Gawain%2C%20then%2C%20the%20green,of%20his%20code%20of%20honor.

    5. I beseech ye, my lord, let this venture be mine.

      When the Green Knight called the knights' loyalty into question, Gawain saw Arthur's humiliation and felt compelled to defend him. Sheri Ann Strite explains that "Gawain is free to choose his next act, and through his choice he reveals with which set of values he is aligned" (p. 5). Gawain saw this as a chance to prove his morality to Arthur. Gawain's bravery and loyalty to Arthur are demonstrated by his acceptance of the challenge when no one else would. Furthermore, his devotion to Arthur exemplifies the code of chivalry.

    6. “I will take no gift, lady, at this time. I have none to give, and none will I take.”

      Gawain is refusing a gift from the lady, as he has nothing to give in return and is unwilling to take something without offering something of equal value in exchange. He is demonstrating his sense of honor and respect, as well as his adherence to the code of chivalry. He also implies that he is not a materialistic person, and that he values loyalty and honor above material possessions.

      Macy, John. "Summary and Analysis: Lines 1690–1996; Stanzas 68–79." CliffsNotes, Houghton Mifflin Harcourt, www.cliffsnotes.com/literature/s/sir-gawain-and-the-green-knight/summary-and-analysis/lines-169082111996-stanzas-68821179. Accessed 8 May 2021.

    7. Round Table

      The Round Table is significant to King Arthur and his knights, and it is said to represent "the order of knighthood and the code to which the knights committed themselves" (Lupack). The Round Table is significant to this portion of the text because Arthur will not eat until he has either heard a strange tale or until a stranger approaches and wishes to joust with a member of the Table. When a stranger jousts with a member of the Table, Arthur has his nightly entertainment so he can feast, and the stranger risks their life hoping to come out on top of the fight with one of Arthur's chosen knights. By mentioning the Round Table in this portion of the story, the text lets the reader see how significant the symbolism of the Table is, and how important his knights are to King Arthur.

      Lupack, Alan. “The Round Table.” The Round Table | Robbins Library Digital Projects, 1 Jan. 1889, https://d.lib.rochester.edu/camelot/theme/round-table.

    8. Therefore I ask in this court but a Christmas jest, for that it is Yule-tide, and New Year, and there are here many fain for sport. If any one in this hall holds himself so hardy, 4 so bold both of blood and brain, as to dare strike me one stroke for another, I will give him as a gift this axe, which is heavy enough, in sooth, to handle as he may list, and I will abide the first blow, unarmed as I sit.

      The Green Knight proposes that someone play a "game" with him that involves a contract. These lines in the poem explore the themes of chivalry, as the contract alludes to bravery and probity since it deals with an agreement. He fools Gawain into this contract by disclosing that he will grant him the initial strike, allowing him to slice his head off. Ironically, the Green Knight's characteristics of "Trickery and deceit" are in opposition to the traditional attributes of chivalry (Maldonado 13). This deceit arises because the Green Knight fails to tell Gawain that he will still be alive after Gawain strikes him. The Green Knight essentially disguises his true identity and "feigns ignorance about Gawain and his quest" (Maldonado 13). In other words, the Green Knight fails to inform Gawain of his immortal-like characteristics, considering he survives decapitation, and through his failure to speak the truth, he breaks the chivalric code.

      Maldonado, Joshua David, "The Game at the Green Chapel: A Game-Oriented Perspective on Chivalry in Sir Gawain and the Green Knight" (2020). Senior Projects Spring 2020. 122. https://digitalcommons.bard.edu/senproj_s2020/122

    1. there’s the bootstrapping problem: depending on the framework you’re using, you might need to install Conda and the framework driver before you can get anything going. A Docker image would come prepackaged with both, in addition to your code and its dependencies. So even if your framework supports Conda directly, you might want to use Docker anyway.
      • Matches by Object property
      ```js
      const { matches } = require('z')

      const person = { name: 'Maria' }
      matches(person)(
        (x = { name: 'John' }) => console.log('John you are not welcome!'),
        (x) => console.log(`Hey ${x.name}, you are welcome!`)
      )
      ```

      • Matches by type or instances
      ```js
      const { matches } = require('z')

      const result = matches(1)(
        (x = 2) => 'number 2 is the best!!!',
        (x = Number) => `number ${x} is not that good`,
        (x = Date) => 'blaa.. dates are awful!'
      )

      console.log(result) // output: number 1 is not that good
      ```

      • Matches Array content
      ```js
      const { matches } = require('z')

      matches([1, 2, 3, 4, 5])(
        (a, b, c, tail) => 'a = 1, b = 2, c = 3, tail = [4, 5]'
      )

      matches([1, 2])(
        (a, tail) => 'a = 1, tail = [2]'
      )

      matches([1])(
        (a, b, tail) => 'Will not match here',
        (a = 2, tail = []) => 'Will not match here',
        (a = 1, tail) => 'Will match here, tail = []'
      )
      ```

      • Powerful recursive code which will remove sequential repeated items from Array
      ```js
      const { matches } = require('z')

      const compress = (numbers) => {
        return matches(numbers)(
          (x, y, xs) => x === y
            ? compress([x].concat(xs))
            : [x].concat(compress([y].concat(xs))),
          (x, xs) => x // stopping condition
        )
      }

      compress([1, 1, 2, 3, 4, 4, 4]) //output: [1, 2, 3, 4]
      ```
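
      For comparison, the same recursion can be sketched in plain JavaScript without the z library, which may make it easier to see what the pattern match is doing. This is only an illustrative rewrite, not code from the library: the name `compressPlain` is made up here, and unlike the snippet above the base case returns an array rather than a bare element.

      ```javascript
      // Plain-JavaScript sketch of the recursion above (hypothetical helper name).
      function compressPlain(numbers) {
        // Base case: zero or one element cannot contain a sequential repeat.
        if (numbers.length <= 1) return numbers.slice()

        // Destructure the first two items and the rest, like (x, y, xs) above.
        const [x, y, ...xs] = numbers

        return x === y
          ? compressPlain([x, ...xs])          // drop one copy of the repeated head
          : [x, ...compressPlain([y, ...xs])]  // keep x, continue from y
      }

      console.log(compressPlain([1, 1, 2, 3, 4, 4, 4])) // [ 1, 2, 3, 4 ]
      ```

      Only adjacent duplicates are collapsed, so an input such as `[1, 2, 1]` comes back unchanged.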

    1. International Currency Support You can receive payments in any of the supported international currencies from the Dashboard. Address Verification System If you are accepting international payments, you can use Curlec's Address Verification System (AVS). AVS verifies if a customer's billing address (postal code and the billing street address) matches the billing address on file with the card issuer. Based on the response from the issuer, Curlec will accept or cancel the transaction. This helps in the prevention of fraud in international payments. Know more about Address Verific

      hide these

    1. We've come up with a rule that helps us here: a change that updates node_modules may not touch any other code in the codebase.

      This makes it sound like a hack/workaround, but to want to do otherwise is to want to do something that is already on its face wrong. So there's no issue.
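
      One way such a rule could be checked mechanically is sketched below. It assumes the check receives a list of repository-relative paths touched by a single change; the function name and the input format are assumptions made for illustration, not anything described in the quoted post.

      ```javascript
      // Hypothetical check: a change is acceptable if it touches only files
      // under node_modules, or no files under node_modules at all.
      function respectsVendorRule(changedPaths) {
        // A path counts as vendored if any of its segments is "node_modules".
        const isVendored = (p) => p.split('/').includes('node_modules')

        const vendoredCount = changedPaths.filter(isVendored).length

        // Mixed changes (some vendored, some not) violate the rule.
        return vendoredCount === 0 || vendoredCount === changedPaths.length
      }

      console.log(respectsVendorRule(['node_modules/z/index.js']))              // true
      console.log(respectsVendorRule(['src/app.js', 'node_modules/z/index.js'])) // false
      ```

      Whether files such as `package.json` or a lockfile should count as "other code" is a policy decision this sketch does not settle; as written, they would.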

    1. 1) How far in advance must one be summoned to a disciplinary board hearing?   The pupil and their legal representative, if the pupil is a minor, must be summoned by registered letter, or by a letter delivered by hand against signature, at least 5 days before the disciplinary board meets. This 5-day period is laid down in Article D 511–31 of the Code de l’éducation.   If the pupil and their parents (if the pupil is a minor) were not given the minimum 5-day period, the procedure is irregular because it constitutes a violation of the rights of the defence.
    1. eLife assessment

      In this study, neurons were recorded and combined across the parahippocampal area while rats performed a memory-guided spatial navigation task. Sophisticated analytical tools were used to provide convincing evidence that neuronal populations in these areas show behavior-related changes that might indicate the encoding of errors by the system. The valuable results suggest that rate remapping is a likely mechanism to support changes in representations that support memory-guided behavior in these regions, most interestingly in neurons that code head direction.

    2. Reviewer #3 (Public Review):

      The authors set out to explore how neurons in the rodent parahippocampal area code for environmental and behavioral variables in a complex goal-directed task. The task required animals to learn the association between a cue and a spatial response and to use this information to guide behavior flexibly on a trial-by-trial basis. The authors then used a series of sophisticated analytical techniques to examine how neurons in this area encode spatial location, task-relevant cues, and correct vs. incorrect responding. While these questions have been addressed in studies of hippocampal place cells, these questions have not been addressed in these upstream parahippocampal areas.

      Strengths:

      1) The study presents data from ensembles of simultaneously recorded neurons in the parahippocampal region. The authors use a sophisticated method for ensuring they are not recording from the same neurons in multiple sessions and yet still report impressive sample sizes.

      2) The use of the complex behavioral task guards against stereotyped behavior as rats need to continually pay attention to the relevant cue to guide behavior. The task is also quite difficult ensuring rats do not reach a ceiling level of performance which allows the authors to examine correct and incorrect trials and how spatial representations differ between them.

      3) The authors take the unusual approach of not pre-processing the data to group neurons into categories based on the type of spatial information that they represent. This guards against preconceived assumptions as to how certain populations of neurons encode information.

      4) The sophisticated analytical tools used throughout the manuscript allow the authors to examine spatial representations relative to a series of models of information processing.

      5) The most interesting finding is that neurons in this region respond to situations where rewards are not received by increasing their firing rates. This error or mismatch signal is most commonly associated with regions of the basal ganglia and so this finding will be of particular interest to the field.

      Weaknesses:

      1) The histological verification of electrode position is poor and while this is acknowledged by the authors it does limit the ability to interpret these data. Recent advances have enabled researchers to look at very specific classes of neurons within traditionally defined anatomical regions and examine their interactions with well-defined targets in other parts of the brain. The lack of specificity here means that the authors have had to group MEC, PaS, and PrS into a functional group; the parahippocampus. Their primary aim is then to examine these neurons as a functional group. Given that we know that neurons in these areas differ in significant ways, there is not a strong argument for doing this.

      2) The analytical/statistical tools used are very impressive but beyond the understanding of many readers. This limits the reader's ability to understand these data in reference to the rest of the literature. There are lots of places where this applies but I will describe one specific example. As noted above the authors use a complex method to examine whether neurons are recorded on multiple consecutive occasions. This is commendable as many studies in the field do not address this issue at all and it can have a major impact as analyses of multiple samples of the same neurons are often treated as if they were independent. However, there is no illustration of the outputs of this method. It would be good to see some examples of recordings that this method classifies as clearly different across days and those which are not. Some reference to previously used methods would also help the reader understand how this new method relates to those used previously.

      3) The effects reported are often subtle, especially at the level of the single neuron. Examples in the figures do not support the interpretations from the population-level analysis very convincingly.

      The authors largely achieve their aims with an interesting behavioral task that rats perform well but not too well. This allows them to examine memory on a trial-by-trial basis and have sufficient numbers of error trials to examine how spatial representations support memory-guided behavior. They report ensemble recordings from the parahippocampus which allows them to make conclusions about information processing within this region. This aim is relatively weak though given that this collection of areas would not usually be grouped together and treated as a single unitary area. They largely achieve their aim of examining the mechanisms underlying how these neurons code task-relevant factors such as spatial location, cue, and presence of reward. The mismatch or error-induced rate remapping will be a particularly interesting target for future research. It is also likely that the analytical tools used in this study could be used in future studies.

    1. Review coordinated by Life Science Editors Foundation

      Reviewed by: Dr. Angela Andersen

      Potential Conflicts of Interest: None

      Background:
      * mRNAs in polarized cells often have distinct spatial localization patterns that enable localized protein production.
      * In non-polarized cells, mRNAs encoding membrane and secretory proteins are predominantly translated on the endoplasmic reticulum (ER), some mRNAs are enriched on the mitochondrial surface, and some mRNAs are bound to the RNA-binding protein (RBP) TIS11B at the surface of the rough ER in "TIS granules".
      * The translation of specific mRNAs in TIS granules allows assembly of protein complexes that cannot be established when the mRNAs are translated on the ER but outside of TIS granules (physiological relevance).
      * The canonical rough ER (CRER) is distinct from the TIS granule ER (TGER), and both are distinct from the cytosol.

      Questions:
      * Do mRNAs that encode non-membrane proteins differentially localize to the ER or the cytosol (in steady state)?
      * Does the amount of protein synthesis differ depending on the subcytoplasmic location of an mRNA?

      Summary:
      * A third of mRNAs that encode non-membrane proteins have a biased localization to TGER or CRER, indicating that the ER membrane is a general site of translation for both membrane and non-membrane proteins.
      * 52% of mRNAs that encode non-membrane proteins have a biased mRNA transcript localization pattern towards a single cytoplasmic compartment: the TGER, CRER or cytosol.
      * The localization at the TGER or CRER was largely controlled by a combinatorial code of AU-RBPs at the 3'UTR. TIS11B promotes mRNA localization to TGER and TIA1/L1 to CRER.
      * LARP4B bound to the 3'UTR promotes cytosolic localization.
      * The location of translation has an effect on protein levels that is independent of the RBPs/3'UTR: redirecting cytosolic mRNAs to the rough ER membrane increased their steady-state protein levels by two-fold, indicating that the ER environment promotes protein expression.
      * Compartment-enriched mRNAs differed in their mRNA production and degradation rates, as well as functional classes and levels of their encoded proteins. Therefore the cytoplasm is partitioned into different functional and regulatory compartments that are not enclosed by membranes.
      * Low-abundance proteins are translated in the TGER region. mRNAs encoding zinc finger proteins and transcription factors were substantially enriched at the TGER. These gene classes are usually expressed at lower levels than others. This localization may regulate protein complex assembly (membrane proteins that are translated in the TGER domain establish protein complexes that cannot be formed when the proteins are translated on the CRER). The TGER may ensure that low-abundance mRNAs are effectively translated into low-abundance proteins.
      * mRNAs that are the most stable and encode the most highly expressed proteins are enriched on the CRER and include helicases, cytoskeleton-bound proteins, and chromatin regulators, overturning the idea that most non-membrane protein-encoding mRNAs are translated in the cytosol.
      * mRNAs overrepresented in the cytosol had the highest production and degradation rates and were enriched in proteins involved in mRNA processing and translation factors, whose abundance levels require tight control.

      Advance: Evidence for functional compartmentalization of non-membrane mRNA protein expression in the cytosol vs ER. In steady state, general localization of mRNAs to the ER promotes high protein levels.

      Significance: Engineered 3'UTR sequences could potentially boost protein expression by localizing mRNAs to the ER in experimental settings, for vaccines etc.

      Remaining questions/points:
      * How does the rough ER stimulate protein expression?
      * Does the mRNA localization affect complex formation and/or function of non-membrane proteins?
      * Does this occur in cells other than HEK293T?
      * Is this regulated?

    1. Instant Refund Fee: We charge a small fee to process instant refunds. If Instant Refund for a payment that is more than 6 months old is not supported, an error message is displayed on the Refund Payment pop-up. You will encounter the following error code and error message in the API response: Error Response { "error": { "code": "BAD_REQ

      hide

    1. Address Verification System If you are accepting international payments, you can use Curlec's Address Verification System (AVS). AVS verifies if a customer's billing address (postal code and the billing street address) matches the billing address on file with the card issuer. Based on the response from the issuer, Curlec will accept or cancel the transaction. This helps in the prevention of fraud in international payments. Know more about Address Verification System.

      remove this