10,000 Matching Annotations
  1. Aug 2024
    1. About the data

      Can we give some links here to data and code? Even a setting to 'turn on the code' or some such? I like to do this for transparency and other reasons we've discussed.

      OK, we do link the GitHub repo -- maybe add a note about that, and about how to find the code that produces this blog, for people who want to see/check it?

    1. Many repositories of public code are available under strict open-source licenses that control the use of the code in downstream applications. And indeed, open-source licenses such as GPL may make it more difficult to use tools like Copilot to generate code for many purposes.

      Based on the Washington Post article, it seems as if AI software programmers don't care about copyright and licenses.

    2. This code has been written by millions of programmers around the world, and many of those programmers are not happy. They argue that GitHub should not get to benefit from their hard work in this way, especially not to develop a technology that could put their jobs at risk.

      This seems to be a problem beyond Copilot. I have seen this outcry regarding every popular AI tool.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this paper, the researchers aimed to address whether bees causally understand string-pulling through a series of experiments. I first briefly summarize what they did:

      - In experiment 1, the researchers trained bees without string and then presented them with flowers in the test phase that either had connected or disconnected strings, to determine what their preference was without any training. Bees did not show any preference.

      - In experiment 2, bees were trained to have experience with string and then tested on their choice between connected vs. disconnected string.

      - experiment 3 was similar except that instead of having one option which was an attached string broken in the middle, the string was completely disconnected from the flower.

      - In experiment 4, bees were trained on green strings and tested on white strings to determine if they generalize across color.

      - In experiment 5, bees were trained on blue strings and tested on white strings.

      - In experiment 6, bees were trained where black tape covered the area between the string and the flower (i.e. so they would not be able to see/ learn whether it was connected or disconnected).

      - In experiments 2-6, bees chose the connected string in the test phase.

      - In experiment 7, bees were trained as in experiment 3 and then tested where the string was either disconnected or coiled i.e. still being 'functional' but appearing different.

      - In experiment 8, bees were trained as before and then tested on a string that was in a different coiled orientation, either connected or disconnected.

      - In experiments 7 and 8 the bees showed no preference.

      Strengths:

      I appreciate the amount of work that has gone into this study and think it contains a nice, thorough set of experiments. I enjoyed reading the paper and felt that overall it was well-written and clear. I think experiment 1 shows that bees do not have an untrained understanding of the function of the string in this context. The rest of the experiments indicate that with training, bees have a preference for unbroken over broken string and likely use visual cues learned during training to make this choice. They also show that as in other contexts, bees readily generalize across different colors.

      Weaknesses:

      (1) I think there are 2 key pieces of information that can be taken from the test phase - the bees' first choice and then their behavior across the whole test. I think the first choice is critical in terms of what the bee has learned from the training phase - then their behavior from this point is informed by the feedback they obtain during the test phase. I think both pieces of information are worth considering, but their behavior across the entire test phase is giving different information than their first choice, and this distinction could be made more explicit. In addition, while the bees' first choice is reported, no statistics are presented for their preferences.

      We agree with the reviewer that the first choice is critical in terms of what the bumblebees have learned from the training phase. We analyzed the bees’ first choices in Table 1 and added the test videos. Both the connected and disconnected strings were glued to the floor along their entire length, so the bees were unable to move either string, which prevented learning during the tests. We added the data for each bee’s choices in the Supplementary table.

      (2) It seemed to me that the bees might not only be using visual feedback but also motor feedback. This would not explain their behavior in the first test choice, but could explain some of their subsequent behavior. For example, bees might learn during training that there is some friction/weight associated with pulling the string, but in cases where the string is separated from the flower, this would presumably feel different to the bee in terms of the physical feedback it is receiving. I'd be interested to see some of these test videos (perhaps these could be shared as supplementary material, in addition to the training videos already uploaded), to see what the bees' behavior looks like after they attempt to pull a disconnected string.

      We added supplementary videos of the testing phase. As noted in General Methods, both connected and disconnected strings were glued to the floor to prevent the air flow generated by flying bumblebees’ wings from changing the position of the string during the testing phase. The bees were unable to move either the connected or disconnected strings during the tests, and only attempted to pull them. Therefore, a difference in friction/weight between pulling the two strings cannot be a factor in the test.

      (3) I think the statistics section needs to be made clearer (more in private comments).

      We changed the statistical analysis section as suggested by the reviewer.

      (4) I think the paper would be made stronger by considering the natural context in which the bee performs this behavior. Bees manipulate flowers in all kinds of contexts and scrabble with their legs to achieve nectar rewards. Rather than thinking that it is pulling a string, my guess would be that the bee learns that a particular motor pattern within their usual foraging repertoire (scrabbling with legs), leads to a reward. I don't think this makes the behavior any less interesting - in fact, I think considering the behavior through an ecological lens can help make better sense of it.

      Here we respectfully disagree. The solving of a Rubik’s cube by humans could be said to be a version of the finger movements naturally required to open nuts or remove ticks from fur, but this is somewhat beside the point: it’s not the motor sequences that are of interest, but the cognition involved. A general approach in work on animal intelligence and cognition is to deliberately choose paradigms that are outside the animals’ daily routines; this is what we have done here, in asking whether there is means-end comprehension in bee problem solving. Like comparable studies on this question in other animals, the experiments are designed to probe this question, not one of ecological validity.

      Reviewer #2 (Public Review):

      Summary:

      The authors wanted to see if bumblebees could succeed in the string-pulling paradigm with broken strings. They found that bumblebees can learn to pull strings and that they have a preference to pull on intact strings vs broken ones. The authors conclude that bumblebees use image matching to complete the string-pulling task.

      Strengths:

      The study has an excellent experimental design and contributes to our understanding of what information bumblebees use to solve a string-pulling task.

      Weaknesses:

      Overall, I think the manuscript is good, but it is missing some context. Why do bumblebees rely on image matching rather than causal reasoning? Could it have something to do with their ecology? And how is the task relevant for bumblebees in the wild? Does the test translate to any real-life situations? Is pulling a natural behaviour that bees do? Does image matching have adaptive significance?

      We appreciate the valuable comment from the reviewer. Our explanation, which we have now added to the manuscript, is as follows:

      “Different flower species offer varying profitability in terms of nectar and pollen to bumblebees; they need to make careful choices and learn to use floral cues to predict rewards (Chittka, 2017). Bumblebees can easily learn visual patterns and shapes of flowers (Meyer-Rochow, 2019); they can detect stimuli and discriminate between differently coloured stimuli when presented as briefly as 25 ms (Nityananda et al., 2014). In contrast, causal reasoning involves understanding and responding to causal relationships. Bumblebees might favor, or be limited to, a visual approach, likely due to the efficiency and simplicity of processing visual cues to solve the string-pulling task.”

      As above, it is worth noting that our work is not designed as an ecological study, but as one about the question of whether causal reasoning can explain how bees solve a string-pulling puzzle. We have a cognitive focus, in line with comparable studies on other animals. We deliberately chose a paradigm that is to some extent outside of the daily challenges of the animal.

      Reviewer #3 (Public Review):

      Summary:

      This paper presents bees with varying levels of experience with a choice task where bees have to choose to pull either a connected or unconnected string, each attached to a yellow flower containing sugar water. Bees without experience of string pulling did not choose the connected string above chance (experiment 1), but with experience of horizontal string pulling (as in the right-hand panel of Figure 4) bees did choose the connected string above chance (experiments 2-3), even when the string colour changed between training and test (experiments 4-5). Bees that were not provided with perceptual-motor feedback (i.e they could not observe that each pull of the string moved the flower) during training still learned to string pull and then chose the connected string option above chance (experiment 6). Bees with normal experience of string pulling then failed to discriminate between connected and unconnected strings when the strings were coiled or looped, rather than presented straight (experiments 7-8).

      Weaknesses:

      The authors have only provided video of some of the conditions where the bees succeeded. In general, I think a video explaining each condition and then showing a clip of a typical performance would make it much easier to follow the study designs for scholars. Videos of the conditions bees failed at would be highly useful in order to compare different hypotheses for how the bees are solving this problem. I also think it is highly important to code the videos for switching behaviours. When solving the connected vs unconnected string tasks, when bees were observed pulling the unconnected string, did they quickly switch to the other string? Or did they continue to pull the wrong string? This would help discriminate the use of perceptual-motor feedback from other hypotheses.

      We added the test videos as suggested by the reviewer, and we added the data for each bee's choice. However, both connected and disconnected strings were glued to the floor, and therefore perceptual-motor feedback was equal and irrelevant between the choices during the test.

      The experiments are also not described well, for my below comments I have assumed that different groups of bees were tested for experiments 1-8, and that experiment 6 was run as described in line 331, where bees were given string-pulling training without perceptual feedback rather than how it is described in Figure 4B, which describes bees as receiving string pulling training with feedback.

      We have now added figures for Experiments 6 and 7 to Figure 1B, and we state that different groups of bees were tested in Experiments 1-9.

      The authors suggest the bees' performance is best explained by what they term 'image matching'. However, experiment 6 does not seem to support this without assuming retroactive image matching after the problem is solved. The logic of experiment 6 is described as "This was to ensure that the bees could not see the familiar "lollipop shape" while pulling strings....If the bees prefer to pull the connected strings, this would indicate that bees memorize the arrangement of strings-connected flowers in this task." I disagree with this second sentence, removing perceptual feedback during training would prevent bees memorising the lollipop shape, because, while solving the task, they don't actually see a string connected to a yellow flower, due to the black barrier. At the end of the task, the string is now behind the bee, so unless the bee is turning around and encoding this object retrospectively as the image to match, it seems hard to imagine how the bee learns the lollipop shape.

      We agree with the reviewer that while solving the task in the last step of training, the bees don't actually see a string connected to a yellow flower, due to the black barrier. Since the full shape is only visible after the pulling is completed, this requires the bee to “check back” on the entire display after feeding and, in effect, conclude “this is the shape that I need to be looking for later”.

      Another possibility is that the bumblebees remember the image of the “lollipop shape” from the first step of training, in which the “lollipop shape” was presented to them directly.

      We added the experiment suggested by the reviewer. The result showed that when a green table was placed behind the string to obscure the “lollipop shape” at every point during the training phase, the bees were unable to identify the connected string. This result further supports the conclusion that bumblebees learn to choose the connected string through image matching.

      Despite this, the authors go on to describe image matching as one of their main findings. For this claim, I would suggest the authors run another experiment, identical to experiment 6 but with a black panel behind the bee, such that the string the bee pulls behind itself disappears from view. There is now no image to match at any point from the bee's perspective so it should now fail the connectivity task.

      Strengths:

      Despite these issues, this is a fascinating dataset. Experiments 1 and 2 show that the bees are not learning to discriminate between connected and unconnected stimuli rapidly in the first trials of the test. Instead, it is clear that experience in string pulling is needed to discriminate between connected and unconnected strings. What aspect of this experience is important? Experiment 6 suggests it is not image matching (when no image is provided during problem-solving, but only afterward, bees still attend to string connectivity) and casts doubt on perceptual-motor feedback (unless from the bee's perspective, they do actually get feedback that pulling the string moves the flower, video is needed here). Experiments 7 and 8 rule out means-end understanding because if the bees are capable of imagining the effect of their actions on the string and then planning out their actions (as hypotheses such as insight, means-end understanding and string connectivity suggest), they should solve these tasks. If the authors can compare the bees' performance in a more detailed way to other species, and run the experiment suggested, this will be a highly exciting paper.

      We appreciate the valuable comment from the reviewer. We compared the bees' performance to other species, and conducted the experiment as suggested by the reviewer.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Smaller comments:

      Line 64: is the word 'simple' needed here? It could also be explained by more complex forms of associative learning, no?

      We deleted “simple”.

      Methods:

      Line 230: was it checked that this was high-contrast for the bees?

      We added the relevant reference in the revised manuscript.

      Line 240: how much sucrose solution was present in the flowers?

      Each flower contained 25 microliters of sucrose solution. We added this information to the revised manuscript.

      Line 266: check grammar.

      We corrected the grammar as follows: “During tests, both strings were glued to the floor of the arena to prevent the air flow generated by flying bumblebees’ wings from changing the position of the string.”

      Statistical analysis:

      - What does it mean that "Bees identity and colony were analyzed with likelihood ratio tests"?

      Bee identity and colony were set as random variables. We changed the analysis methods in the revised manuscript, and the results of all the experiments did not change.

      - Line 359: do you mean proportion rather than percentage?

      We mean the percentage.

      - "the number of total choices as weights" - this should be explained further. This is the number of choices that each bee made? What was the variation and mean of this number? If bees varied a lot in this metric, it might make more sense to analyze their first choice (as I see you've done) and their first 10 choices or something like that - for consistency.

      This refers to the total number of choices made by each bumblebee. We added the mean and standard error of each bee’s number of choices in Table 1. Some bees pulled the string fewer than 10 times; we chose to include all choices made by each bee.

      - More generally I think the first test is more informative than the subsequent choices, since every choice after their first could be affected by feedback they are getting in that test phase. Or rather, they are telling you different things.

      All the bees were tested only once; however, you might be referring to the first choice. We used a chi-square test to analyze the bumblebees’ first choices in the test. It is worth noting that both connected and disconnected strings were glued to the floor. The bees were unable to move either the connected or disconnected strings during the tests, and only attempted to pull them. Therefore, the feedback from pulling either the connected or disconnected strings is the same.

      - Line 362: I think I know what you mean, but this should be re-phrased because the "number of" sounds more appropriate for a Poisson distribution. I think what you are testing is whether each individual bee chose the connected or the disconnected string - i.e. a 0 or 1 response for each bee?

      We agree with the reviewer that the response is whether each bee chose the connected or the disconnected string (i.e. a 0 or 1 response for each bee), not a count. We clarified this as: “The total number of the choices made by each bee was set as weights.”

      - Line 364-365: here and elsewhere, every time you mention a model, make it clear what the dependent and independent variables are. i.e. for the mixed model, the 'bee' is the random factor? Or also the colony that the bee came from? Were these nested etc?

      We clarified this in the revised manuscript. Bee identity and colony are the random factors in the mixed model.

      - Line 368: "Latency to the first choice of each bee was recorded" - why? What were the hypotheses/ predictions here?

      The latency to the first choice was recorded to see whether the bumblebees were familiar with the test pattern. A shorter latency might indicate that the bumblebees were more familiar with the pattern.

      - Line 371: "Multiple comparisons among experiments were.." - do you mean 'within' experiments? It seems that treatments should not be compared between different experiments.

      We mean multiple comparisons among different experiments; we clarify this in the revised manuscript.
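
      To illustrate what using "the total number of choices as weights" means, here is a minimal sketch with made-up counts (not data from the study; all variable names are hypothetical). It shows that weighting each bee's proportion of connected-string choices by its number of choices gives the pooled proportion, so bees that made few choices count for less than they would in an unweighted mean. It deliberately ignores the random effects for bee identity and colony that the authors' mixed model includes.

```python
# Hypothetical (connected choices, total choices) per bee; NOT data from the study.
bees = [(8, 10), (3, 4), (12, 20), (1, 2)]

# Per-bee proportion of choices for the connected string, and per-bee weights.
proportions = [c / n for c, n in bees]
weights = [n for _, n in bees]

# Unweighted mean treats every bee equally, however few choices it made.
unweighted_mean = sum(proportions) / len(proportions)

# Weighted mean uses each bee's total number of choices as its weight.
weighted_mean = sum(p * w for p, w in zip(proportions, weights)) / sum(weights)

# Pooled proportion: all connected choices over all choices.
pooled = sum(c for c, _ in bees) / sum(weights)

print(f"unweighted mean proportion: {unweighted_mean:.3f}")
print(f"weighted mean proportion:   {weighted_mean:.3f}")
print(f"pooled proportion:          {pooled:.3f}")
```

      In this toy case the weighted mean and the pooled proportion coincide, while the unweighted mean differs slightly because the two bees with few choices pull it toward their own values.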

      Results

      Experiment 1: From the methods, it sounded like you both analyzed the bees' first choice and their total no. of choices, but in the results section (and Figure 1) I only see the data for all choices combined here.

      In table 1 and in the text you report the number of bees that chose each option on their first choice, but there are no statistical results associated with these results. At the very least, a chi square or binomial test could be run.

      Line 138: "Interestingly, ten out of fifteen bees pulled the connected string in their first choice" - this is presented like it is a significant majority of bees, but a chi-square test of 10 vs 5 has a p-value = 0.1967

      We used the chi-square test to analyze the bees’ first choices. We also added the results of this analysis to Table 1.
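
      The reviewer's figure for 10 vs 5 can be checked by hand. As a minimal sketch, assuming a goodness-of-fit test with one degree of freedom against an even 7.5/7.5 split and no continuity correction (the function names are illustrative, not the authors' code), the Python below reproduces p ≈ 0.1967 and also computes the exact two-sided binomial alternative the reviewer mentions:

```python
import math

def chi_square_even_split(a, b):
    """Chi-square goodness-of-fit test of counts a vs b against a 50:50 split (df = 1)."""
    expected = (a + b) / 2
    stat = (a - expected) ** 2 / expected + (b - expected) ** 2 / expected
    # For df = 1 the chi-square survival function reduces to erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

def binomial_two_sided(successes, n):
    """Exact two-sided binomial test against p = 0.5.

    Sums the probabilities of all outcomes no more likely than the observed one.
    """
    probs = [math.comb(n, k) / 2 ** n for k in range(n + 1)]
    observed = probs[successes]
    return sum(p for p in probs if p <= observed * (1 + 1e-9))

stat, p_chi = chi_square_even_split(10, 5)
p_binom = binomial_two_sided(10, 15)
print(f"chi-square = {stat:.4f}, p = {p_chi:.4f}")
print(f"exact binomial p = {p_binom:.4f}")
```

      With only fifteen bees, the exact binomial test is arguably the safer choice, since the chi-square approximation becomes rough at small expected counts; both agree that a 10 vs 5 split is not a significant majority.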

      Line 143: "It makes sense because the bees could see the "lollipop shape" once they pulled it out from the table." - this feels more like interpretation (i.e. Discussion) rather than results.

      We moved the sentence to the discussion.

      Line 162: again this feels more like interpretation/ conjecture than results.

      We removed the sentence from the Results.

      Line 184: check grammar.

      We checked the grammar. We changed “task” to “tasks”.

      Figures

      I really appreciated the overview in Figure 5 - though I think this should be Figure 1? Even if the methods come later in eLife, I think it would be nice to have that cited earlier on (e.g. at the start of the results) to draw the reader's attention to it quickly, since it's so helpful. It also then makes the images at the bottom of what is currently Figure 1 make more sense. I also think that the authors could make it clearer in Figure 5 which strings are connected vs disconnected in the figure (even if it means exaggerating the distance more than it was in real life). I had to zoom in quite a bit to see which were connected vs. not. Alternatively, you could have an arrow to the string with the words "connected" "disconnected" the first time you draw it - and similar labels for the other string conditions.

      We appreciate the valuable comment from the reviewer. We changed Figure 5 to Figure 2, and Figure 4 to Figure 1. We cited the Figures at the start of the results. We also changed the gap distance between the disconnected strings. Additionally, we added arrows to indicate “connected” and “disconnected” strings in the Figure.

      Figure 1 - I think you could make it clearer that the bars refer to experiments (e.g. have an x-axis with this as a label). Also, check the grammar of the y-axis.

      We added the experiment numbers to the Figures. Additionally, we checked the grammar of the y-axis. We changed “percentages” to “percentage”.

      I also think it's really helpful to see the supplementary videos but I think it would be nice to see some examples of the test phase, and not just the training examples.

      We added Supplementary videos of the testing phase.

      Reviewer #2 (Recommendations For The Authors):

      Below are also some minor comments:

      L40: "approaches".

      We changed “approach” to “approaches”.

      L42: but likely mainly due to sampling bias of mammals and birds.

      We changed the sentence as follows: String pulling is one of the most extensively used approaches in comparative psychology to evaluate the understanding of causal relationships (Jacobs & Osvath, 2015), with most research focused on mammals and birds, where a food item is visible to the animal but accessible only by pulling on a string attached to the reward (Taylor, 2010; Range et al., 2012; Jacobs & Osvath, 2015; Wakonig et al., 2021).

      L64: remove "in this study"

      We removed “in this study”.

      L64: simple associative learning of what? Isn't your image matching associative too?

      We removed “simple”.

      L97: remove "a" before "connected".

      We removed “a” before “connected”.

      L136-138: but maybe they could still feel the weight of the flower when pulling?

      Because both strings were glued to the floor in the test phase, the feedback was the same and therefore irrelevant. This information is noted in the General Methods.

      L161: what are these numbers?

      We removed the latency in the revised manuscript.

      L167/ Table 1: I realise that the authors never tried slanted strings to check if bumblebees used proximity as a cue. Why?

      This was simply because we wanted to focus on whether bumblebees could recognize the connectivity of the string.

      Discussion: Why did you only control for colour of the string? What if you had used strings with different textures or smells? Unclear if the authors controlled for "bumblebee smell" on the strings, i.e., after a bee had used the string, was the string replaced by a new one or was the same one used multiple times?

      We used different colors to investigate featural generalization of the visual display of the string connected to the flower in this task. We controlled for color because it is a feature that bumblebees can easily distinguish.

      Both the flowers and the strings were used only once, to prevent the use of chemosensory cues. We clarify this in the revised manuscript.

      L182: since what?

      We deleted “since” in the revised manuscript.

      L182-188: might be worth mentioning that some crows and parrots known for complex cognition perform poorly on broken strings (e.g., https://doi.org/10.1098/rspb.2012.1998 ; https://doi.org/10.1163/1568539X-00003511 ; https://doi.org/10.1038/s41598-021-94879-x ) and Australian magpies use trial and error (https://doi.org/10.1007/s00265-023-03326-6).

      We added the following sentences as suggested by the reviewer: “It is worth noting that some crows and parrots known for complex cognition perform poorly on the broken string task without perceptual feedback or learning. For example, New Caledonian crows use perceptual feedback strategies to solve the broken string-pulling task, and no individual showed a significant preference for the connected string when perceptual feedback was restricted (Taylor et al., 2012). Some Australian magpies and African grey parrots can solve the broken string task, but they required a high number of trials, indicating that learning plays a crucial role in solving this task (Molina et al., 2019; Johnsson et al., 2023).”

      L193: maybe expand on this to put the task into a natural context?

      We added the following sentences as suggested by the reviewer:

      “Different flower species offer varying profitability in terms of nectar and pollen to bumblebees; they need to make careful choices and learn to use floral cues to predict rewards (Chittka, 2017). Bumblebees can easily learn visual patterns and shapes of flowers (Meyer-Rochow, 2019); they can detect stimuli and discriminate between differently coloured stimuli when presented as briefly as 25 ms (Nityananda et al., 2014). In contrast, causal reasoning involves understanding and responding to causal relationships. Bumblebees might favor, or be limited to, a visual approach, likely due to the efficiency and simplicity of processing visual cues to solve the string-pulling task.”

      L204: is causal understanding the same as means-end understanding?

      Means-end understanding is expressed as goal-directed behavior, which involves the deliberate and planned execution of a sequence of steps to achieve a goal. It includes some understanding of the causal relationship (Jacobs & Osvath, 2015; Ortiz et al., 2019).

      L235: this is a very big span of time. Why not control for motivation? Cognitive performance can vary significantly across the day (at least in humans).

      Bumblebee motivation is understood to be rather consistent, as those that were trained and tested came to the flight arena of their own volition and were foragers looking to fill their crop load each time to return it to the colony.

      L232: what is "(w/w)" ? This occurs throughout the manuscript.

      “w/w” represents the weight-to-weight percentage of sugar.

      L250: this sentence sounds odd. "containing in the central well.." ?? Perhaps rephrase? Unclear what central well refers to? Did the flowers have multiple wells?

      We rephrased the sentence as follows: “For each experiment, bumblebees were trained to retrieve a flower with an inverted Eppendorf cap at the center, containing 25 microliters of 50% sucrose solution, from underneath a transparent acrylic table.”

      L268: why euthanise?

      The reason for euthanizing the bees is that new foragers typically become active only after the current ones have been removed from the hive.

      L270: chemosensory cues answer my concern above. Maybe make it clear earlier.

      We moved this sentence earlier in the Results.

      L273: did different individuals use different pulling strategies? Do you have the data to analyse this? This has been done on birds and would offer a nice comparison.

      We analyzed the string-pulling strategies among different individuals, and provided Supplementary Table 1 to display the performances of each individual in different string-pulling experiments.

      L365: unclear why both models. Would be nice to see a GLM output table.

      The durations of pulling different kinds of strings were first tested with the Shapiro-Wilk test to assess data normality. Duration data that conformed to a normal distribution were compared using linear mixed-effects models (LMM), while data that deviated from normality were examined with generalized linear mixed models (GLMM). We added a GLM and GLMM output table in the revised manuscript.

      L377: should be a space between the "." and "This".

      We added a space between the “.” and “This”.

      L383-390: some commas and semicolons are in the wrong places.

      We carefully checked the commas and semicolons in this sentence.

      Reviewer #3 (Recommendations For The Authors):

      Minor comments

      Line 32: seems to be missing a word, suggest "the bumblebees' ability to distinguish".

      We added “the” in the revised manuscript.

      Line 47: it would be good to reference other scholars here, this is the central focus of all work in comparative psychology.

      We added the reference in the revised manuscript.

      Line 50-61: I think the string-pulling literature could be described in more detail here, with mention of perceptual-motor feedback loops as a competing hypothesis to means-end understanding (see Taylor et al 2010, 2012). It seems a stretch to suggest that "String-pulling studies have directly tested means-end comprehension in various species", when perceptual-motor feedback is a competing hypothesis that we have positive evidence for in several species.

      We mentioned perceptual-motor feedback in the introduction as follows:

      “Multiple mechanisms can be involved in the string-pulling task, including the proximity principle, perceptual feedback and means-end understanding (Taylor et al., 2012; Wasserman et al., 2013; Jacobs & Osvath, 2015; Wang et al., 2020). The principle of proximity refers to animals preferring to pull the reward that is closest to them (Jacobs & Osvath, 2015). Taylor et al. (2012) proposed that the success of New Caledonian crows in string-pulling tasks is based on a perceptual-motor feedback loop, where the reward gradually moves closer to the animal as they pull the strings. If the visual signal of the reward approaching is restricted, crows with no prior string-pulling experience are unable to solve the broken string task (Taylor et al., 2012).

      However, when a green table was placed behind the string to obscure the “lollipop” structure during the training, the bees could not see the “lollipop” during the initial training stage or after pulling the string from under the table. In this situation, the bees were unable to identify the connected string, further proving that bumblebees chose the connected string based on image matching.

      Line 68: suggest remove 'meticulously'.

      We removed “meticulously”.

      Line 99: This is an exciting finding, can the authors please provide a video of a bee solving this task on its first trial?

      We added videos in the supplementary materials.

      Line 133: perceptual-motor feedback loops should be introduced in the introduction.

      We introduced perceptual-motor feedback loops in the revised manuscript.

      Line 136: please clarify the prior experience of these bees, it is not clear from the text.

      We clarified the prior experience of these bees as follows: Bumblebees were initially attracted to feed on yellow artificial flowers, and then trained with transparent tables covered by black tape (S7 video) through a four-step process.

      Line 138: from the video it is not possible to see the bee's perspective of this occlusion. Do the authors have a video or image showing the feedback the bees received? I think this is highly important if they wish to argue that this condition prevents the use of both image matching and a perceptual-motor feedback loop.

      We prevented the use of image matching: in this condition, the bees were unable to see the flower moving towards them above the table during the training phase. However, the bees may still have received a visual image both after pulling the string out from under the table and in the initial stages of training.

      Line 147: please clarify what experience these bees had before this test.

      We added the prior experience of the bumblebees before training as follows: We therefore designed further experiments based on Taylor et al. (2012) to test this hypothesis. Bumblebees were first trained to feed on yellow artificial flowers, and then trained with the same procedure as Experiment 2, but the connected strings were coiled in the test.

      Line 155: This is a highly similar test to that used in Taylor et al 2012, have the authors seen this study?

      We mentioned the reference in the revised manuscript as follows: We therefore designed further experiments based on Taylor et al. (2012) to test this hypothesis.

      Line 183: This sentence needs rewriting "Since the vast majority of animals, including dogs 183 (Osthaus et al., 2005), cats (Whitt et al., 2009), western scrub-jays (Hofmann et al.,2016) and azure-winged magpies (Wang et al., 2019) are failing in such tasks spontaneously".

      We changed the sentence as suggested by the reviewer as follows: Some animals, including dogs (Osthaus et al., 2005), cats (Whitt et al., 2009), western scrub-jays (Hofmann et al., 2016) and azure-winged magpies (Wang et al., 2019), fail in such tasks spontaneously.

      Line 186: "complete comprehension of the functionality of strings is rare" I am not sure the evidence in the current literature supports any animal showing full understanding, can the authors explain how they reach this conclusion?

      We wished to say that few animal species could distinguish between connected and disconnected strings without trial and error learning. We revised the sentence as follows:

      It is worth noting that some crows and parrots known for complex cognition perform poorly on broken string task without perceptual feedback or learning. For example, New Caledonian crows use perceptual feedback strategies to solve broken string-pulling task, and no individual showed a significant preference for the connected string when perceptual feedback is restricted (Taylor et al., 2012). Some Australian magpies and African grey parrots can solve the broken string task, but it required a high number of trials, indicating that learning plays a crucial role in solving this task (Molina et al., 2019; Johnsson et al., 2023).

      Line 190: the authors need to clarify which part of their study provides positive evidence for this conclusion.

      We added the evidence for this conclusion as follows: Our findings suggest that bumblebees with experience of string pulling prefer the connected strings, but they failed to identify the interrupted strings when the string was coiled in the test.

      Line 265: was the far end of the string glued only?

      The entire string was glued to the floor, not just the far ends of the string.

    1. "Don't Download This Song" is the first single from "Weird Al" Yankovic's 12th studio album Straight Outta Lynwood. The song was released exclusively on August 21, 2006 as a digital download. It is a style parody of "We Are the World", "Voices That Care", "Hands Across America", "Heal the World" and other similar charity songs. The song "describes the perils of online music file-sharing" in a tongue-in-cheek manner.[1] To further the sarcasm, the song was freely available for streaming and to legally download in DRM-free MPEG file format at Weird Al's Myspace page, a standalone website,[2] as well as his YouTube channel.

       Background

       "Don't Download This Song" references several court cases related to the RIAA and copyright infringement of music. Among these are lawsuits against "a grandma" (presumably Gertrude Walton,[3] who was sued for copyright infringement six months after dying) and a "7-year-old girl" (presumably a reference to Tanya Andersen's daughter[4] sued at age 10 for alleged copyright infringements made at the age of 7), as well as Lars Ulrich's strong stance against copyright infringement of music in the days of Napster. The song also challenges the RIAA's claim that file sharing prevents the artists from profiting from their work, as the song argues that they are still very financially successful via their recording contracts: ("Don't take away money from artists just like me/How else can I afford another solid-gold Humvee, And diamond-studded swimming pools? These things don't grow on trees"). Mention is also made of Tommy Chong's time spent in prison.[5] Yankovic's own views on filesharing are less clear-cut:

       I have very mixed feelings about it. On one hand, I’m concerned that the rampant downloading of my copyright-protected material over the Internet is severely eating into my album sales and having a decidedly adverse effect on my career. On the other hand, I can get all the Metallica songs I want for FREE! WOW!!!!!
       ("Weird Al" Yankovic, "Ask Al" Q&As for May 2000)

       Yankovic's intention was to leave the listener with no clear understanding of Yankovic's own views on the matter, "all by design".[6]

       Music video

       (Image caption: The death scene from the music video.)

       The music video, animated by Bill Plympton, premiered August 22, 2006 on Yahoo! Music. It depicts the vision of the capture, trial, imprisonment, attempted execution, escape, and burning of a young boy who burns a CD on his computer.[7] The boy's death, where he stands on top of a tower just before it explodes, parodies the film White Heat, where Cody Jarrett, played by James Cagney, dies in a similar fashion. Various people, from policemen to criminals to even sharks and dogs, are then seen celebrating throughout the ending chorus. But at the end, it turns out the boy is just imagining what would happen if he downloaded the song, so he throws away the burned CD and goes back to playing his guitar. Throughout the song, the video coloring gradually changes from color to grayscale to dark grayscale to yellowed. On MTV's MTV Music site where this music video is available, they have censored the names of the file sharing programs in the song, such as LimeWire or KaZaA.[8] Weird Al explained that MTV contacted him and told him they would not air his video if the references to the filesharing programs were not in some way removed, so he "made the creative decision to bleep them out as obnoxiously as possible, so that there would be no mistake I was being censored."[9] The video was praised by the Annie Awards and was subsequently nominated for Best Animated Short Subject for its 34th ceremony, but was beat out by the Ice Age featurette, No Time for Nuts.

       See also

       List of singles by "Weird Al" Yankovic
       List of songs by "Weird Al" Yankovic

       References

       [1] Bill Plympton Studio. Archived November 16, 2006, at the Wayback Machine.
       [2] "Weird Al- Dont Download This Song". 2007-02-26. Archived from the original on 2007-02-26. Retrieved 2022-12-13.
       [3] "RIAA sues the dead". The Register. Retrieved 18 April 2018.
       [4] Beckerman, Ray (23 March 2007). "Recording Industry vs The People: RIAA Insists on Deposing Tanya Andersen's 10-year-old daughter". Retrieved 18 April 2018.
       [5] "kuro5hin.org". www.kuro5hin.org. Retrieved 18 April 2018.
       [6] Rabin, Nathan (2011-06-29). ""Weird Al" Yankovic". The A.V. Club. Retrieved 2011-06-29.
       [7] Premieres on Yahoo! Music. Archived 2006-08-21 at the Wayback Machine.
       [8] "MTV Bleeps File Sharing Software Out Of Music Videos". 30 October 2008. Retrieved 18 April 2018.
       [9] Cohen, Noam (2 November 2008). "Censorship, or What Really Weirds Out Weird Al". The New York Times. Retrieved 18 April 2018.

       External links

       alyankovicVEVO, "Weird Al Yankovic - Don't Download This Song", YouTube, October 2, 2009. The music video at Yankovic's official YouTube Vevo website.
       Plymptoons, DON'T DOWNLOAD THIS SONG - Weird Al Yankovic & Bill Plympton, YouTube. The music video at Bill Plympton's official YouTube website.
       Listen to the Song and Send E-Cards

      you heard weird al.dont pirate or download his song!!!

    1. That’s good. (Kisses her hand. She lowers her head.) Oh, I beg your pardon! (Rises) But a working machine must not play the piano, must not feel happy, must not do a whole lot of other things.

      This is a great example of the restrictions and guidelines coders/programmers put on their works. I wonder: if people allowed AI to feel emotions, or figured out a way to code it to, would it eventually go berserk on its own?

    1. As a concept, fascism tends to act as a “bridging metaphor” (Alexander 2003), that is, as a code word for evil, violence, and authoritarian behavior, whether it be political, cultural, or social. Definitions of fascism tend toward reductionism even when sophisticated scholars offer them.

      This could be unrelated, but I've recently seen videos about humans' hyperbolic linguistic tendencies. Will we really be able to disentangle "fascism" from general words for evil?

  2. parrt.cs.usfca.edu parrt.cs.usfca.edu
    1. “Why program by hand in 5 days what you can spend 5 years of your life automating?”, Keynote presentation at Code Generation conference 2011, Cambridge, England, June 2011. http://www.infoq.com/presentations/Automation-DSL

      The curse of ANTLR

      combined with the curse of LISP

      = MetaLISP

      .4 - inspirations

    1. “code of silence” regarding mental illness, especially when a family member has not “come out” regarding their situation.

      I think that is super common among parents to act like mental illness doesn't exist and that it's still problematic to talk about

    1. When a user asks Claude to generate content like code snippets, text documents, or website designs, these Artifacts appear in a dedicated window alongside their conversation. This creates a dynamic workspace where they can see, edit, and build upon Claude’s creations in real-time, seamlessly integrating AI-generated content into their projects and workflows.
    1. Welcome back and in this lesson I'm going to cover object versioning and MFA delete, two essential features of S3.

      These are two things I can almost guarantee will feature on the exam, and almost every major project I've been involved in has needed solid knowledge of both.

      So let's jump in and get started.

      Object versioning is something which is controlled at a bucket level.

      It starts off in a disabled state.

      You can optionally enable versioning on a disabled bucket, but once enabled you cannot disable it again.

      Just to be super clear, you can never switch bucket versioning back to disabled once it's been enabled.

      What you can do though is suspend it and if desired a suspended bucket can be re-enabled.

      It's really important for the exam to remember these state changes.

      So make a point of noting them down and when revising try to repeat until it sticks.

      So a bucket starts off as disabled, it can be enabled, an enabled bucket can be moved to suspended and then moved back to enabled.

      But the important one is that an enabled bucket can never be switched back to disabled.

      That is critical to understand for the exam.

      You'll see many trick questions which will test your knowledge on that point.
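
The state changes described above can be sketched as a tiny state machine. This is only an illustrative model of the transitions as the lesson states them, not an AWS API call; the names `ALLOWED` and `change_state` are hypothetical.

```python
# Toy model of the bucket versioning states described in the lesson.
# Assumption: only the transitions named in the lesson are allowed.
ALLOWED = {
    "Disabled": {"Enabled"},     # a new bucket can have versioning enabled
    "Enabled": {"Suspended"},    # enabled can only move to suspended
    "Suspended": {"Enabled"},    # suspended can be re-enabled
}


def change_state(current: str, target: str) -> str:
    """Return the new state, or raise ValueError if the move is not allowed."""
    if target not in ALLOWED.get(current, set()):
        raise ValueError(f"cannot move from {current} to {target}")
    return target


state = change_state("Disabled", "Enabled")
state = change_state(state, "Suspended")
state = change_state(state, "Enabled")
# change_state(state, "Disabled") would raise ValueError:
# an enabled bucket can never go back to disabled.
```

Note that "Disabled" never appears as a target in the table, which is exactly the exam point: once versioning has been enabled there is no route back.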

      Without versioning enabled on a bucket, each object is identified solely by the object key, its name, which is unique inside the bucket.

      If you modify an object, the original version of that object is replaced.

      Versioning lets you store multiple versions of an object within a bucket.

      Any operations which would modify an object generate a new version of that object and leave the original one in place.

      For example, let's say I have a bucket and inside the bucket is a picture of one of my cats, Winky.

      So the object is called Winky.JPEG.

      It's identified in the bucket by the key, essentially its name, and the key is unique.

      If I modify the Winky.JPEG object or delete it, those changes impact this object.

      Now there's an attribute of an object which I haven't introduced yet and that's the ID of the object.

      When versioning on a bucket is disabled, the IDs of the objects in that bucket are set to null.

      That's what versioning being off on a bucket means.

      All of the objects have an ID of null.

      Now if you upload or put a new object into a bucket with versioning enabled, then S3 allocates an ID to that object.

      In this case, 1111111.

      If any modifications are made to this object, so let's say somebody accidentally overwrites the Winky.JPEG object with a dog picture, but still calls it Winky.JPEG, S3 doesn't remove the original object.

      It allocates a new ID to the newer version and it retains the old version.

      The newest version of any object in a version-enabled bucket is known as the current version of that object.

      So in this case, the object called Winky.JPEG with an ID of 2222222, which is actually a dog picture, that is the current version of this object.

      Now if an object is accessed without explicitly indicating to S3 which version is required, then it's always the current version which will be returned.

      But you've always got the ability of requesting an object from S3 and providing the ID of a specific version to get that particular version back rather than the current version.

      So versions can be individually accessed by specifying the ID, and if you don't specify the ID, then it's assumed that you want to interact with the current version, the most recent version.

      Now versioning also impacts deletions.

      Let's say we've got these two different versions of Winky.JPEG stored in a version-enabled bucket.

      If we indicate to S3 that we want to delete the object and we don't give any specific version ID, then what S3 will do is create a new special version of that object known as a delete marker.

      Now the delete marker essentially is just a new version of that object, so S3 doesn't actually delete anything, but the delete marker makes it look deleted.

      In reality though, it's just hidden.

      The delete marker is a special version of an object which hides all previous versions of that object.

      But you can delete the delete marker which essentially undeletes the object, returning the current version to being active again, and all the previous versions of the object still exist, accessible using their unique version ID.

      Now even with versioning enabled, you can actually fully delete a version of an object, and that actually really deletes it.

      To do that, you just need to delete an object and specify the particular version ID that you want to remove.

      And if you are deleting a particular version of an object and the version that you're deleting is the most recent version, so the current version, then the next most recent version of that object then becomes the current version.
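
The behaviour described above (current versions, delete markers, and deleting a specific version) can be sketched with a small in-memory model. This is a toy illustration under the lesson's description, not the S3 API; the class `VersionedBucket`, its methods, and the simple counter-based version IDs are all hypothetical.

```python
import itertools

DELETE_MARKER = object()  # sentinel standing in for a delete marker


class VersionedBucket:
    """Toy in-memory model of a version-enabled bucket (not the S3 API)."""

    def __init__(self):
        self._versions = {}             # key -> list of (version_id, body), newest last
        self._ids = itertools.count(1)  # assumed ID scheme: "1", "2", ...

    def put(self, key, body):
        """Store a new version and return its ID; older versions are kept."""
        version_id = str(next(self._ids))
        self._versions.setdefault(key, []).append((version_id, body))
        return version_id

    def get(self, key, version_id=None):
        """Return the current version, or a specific one if an ID is given."""
        versions = self._versions.get(key, [])
        if version_id is None:
            if not versions or versions[-1][1] is DELETE_MARKER:
                raise KeyError(key)     # hidden by a delete marker, or absent
            return versions[-1][1]
        for vid, body in versions:
            if vid == version_id and body is not DELETE_MARKER:
                return body
        raise KeyError((key, version_id))

    def delete(self, key, version_id=None):
        """Without an ID, add a delete marker; with an ID, really remove it."""
        if version_id is None:
            return self.put(key, DELETE_MARKER)
        self._versions[key] = [
            v for v in self._versions.get(key, []) if v[0] != version_id
        ]
        return version_id


bucket = VersionedBucket()
cat = bucket.put("winky.jpeg", "cat picture")
dog = bucket.put("winky.jpeg", "dog picture")   # accidental overwrite
marker = bucket.delete("winky.jpeg")            # looks deleted, only hidden
bucket.delete("winky.jpeg", marker)             # removing the marker undeletes
bucket.delete("winky.jpeg", dog)                # really deletes the dog version
print(bucket.get("winky.jpeg"))                 # the cat picture is current again
```

Keeping every version in a per-key list, newest last, makes "the next most recent version becomes current" fall out naturally when the newest entry is removed.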

      Now some really important points that you need to be aware of about object versioning.

      I've mentioned this at the start of the lesson, it cannot be switched off, it can only be suspended.

      Now why that matters is that when versioning is enabled on a bucket, all the versions of that object stay in that bucket, and so you're consuming space for all of the different versions of an object.

      If you have one single object that's 5 gig in size, and you have five versions of that object, then that's 5 times 5 gig of space that you're consuming for that one single object and its multiple versions.

      And logically, you'll be billed for all of those versions of all of those objects inside an S3 bucket, and the only way that you can zero those costs out is to delete the bucket and then re-upload all those objects to a bucket without versioning enabled.

      That's why it's important that you can't disable versioning.

      You can only suspend it, and when you suspend it, it doesn't actually remove any of those old versions, so you're still billed for them.
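
As a quick worked version of the arithmetic above, the storage consumed by one object is the sum of the sizes of all of its versions. The helper name `billed_storage_gb` is hypothetical, and the sketch assumes every version is stored in full, as in the lesson's example.

```python
def billed_storage_gb(version_sizes_gb):
    """Total storage consumed by one object: every retained version counts."""
    return sum(version_sizes_gb)


# Five versions of a 5 GB object, as in the lesson's example.
print(billed_storage_gb([5, 5, 5, 5, 5]))  # 25
```

Suspending versioning does not change this total, because the existing versions stay in the bucket.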

      Now there's one other relevant feature of S3 which does make it to the exam all the time, and that's known as MFA delete.

      Now MFA delete is something that's enabled within the versioning configuration on a bucket.

      And when you enable MFA delete, it means that MFA is required to change bucket versioning state.

      So if you move from enabled to suspended or vice versa, you need this MFA to be able to do that, and also MFA is required to delete any versions of an object.

      So to fully delete any versions, you need this MFA token.

      Now the way that this works is that when you're performing API calls in order to change a bucket's versioning state or delete a particular version of an object, you need to provide the serial number of your MFA token as well as the code that it generates.

      You concatenate both of those together, and you pass that along with any API calls which delete versions or change the versioning state of a bucket.
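
As a rough sketch of what that looks like in practice, the helpers below only build the request parameters; they make no AWS calls. The dictionary keys follow the boto3 S3 client's `put_bucket_versioning` and `delete_object` parameters, where the `MFA` value is the device serial number, a space, and the current code. The function names, bucket name, serial number, and code are hypothetical placeholders.

```python
def mfa_value(serial_number: str, code: str) -> str:
    """Combine the MFA device serial number and its current code (space-separated)."""
    return f"{serial_number} {code}"


def suspend_versioning_request(bucket, serial_number, code):
    """Parameters for changing versioning state on a bucket with MFA delete enabled."""
    return {
        "Bucket": bucket,
        "MFA": mfa_value(serial_number, code),
        "VersioningConfiguration": {"Status": "Suspended", "MFADelete": "Enabled"},
    }


def delete_version_request(bucket, key, version_id, serial_number, code):
    """Parameters for fully deleting one specific version of an object."""
    return {
        "Bucket": bucket,
        "Key": key,
        "VersionId": version_id,
        "MFA": mfa_value(serial_number, code),
    }


# Hypothetical values, for illustration only.
serial = "arn:aws:iam::123456789012:mfa/example-user"
params = delete_version_request("example-bucket", "winky.jpeg", "2222222", serial, "123456")
print(params["MFA"])
```

With a boto3 S3 client named `s3` (an assumption, not shown here), these dictionaries could be passed as `s3.put_bucket_versioning(**params)` or `s3.delete_object(**params)`.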

      Okay, so that's all of the theory for object versioning inside S3.

      And at this point, that's everything I wanted to cover in this lesson.

      I'll go ahead and complete the video, and when you're ready, I'll look forward to you joining me in the next.

    1. Welcome back. In this demo lesson you're going to gain some practical experience of working with the versioning feature of S3.

      So to get started just make sure that you're logged in to the management account of the organization, so the general account, and then make sure that you've got the Northern Virginia region selected, so us-east-1.

      Now there is a link attached to this lesson which you need to click on and then extract.

      This is going to contain all of the files that you'll be using throughout this demo.

      So go ahead and click on that link, extract it, and it should create a folder called S3_Versioning.

      Once you've confirmed that you're logged in and have the right region selected, then go ahead and move to the S3 console.

      So you can get there either using the recently visited services, or you can type S3 into the Find Services box and click to move to the S3 console.

      Now to demonstrate versioning we're going to go ahead and create an S3 bucket, we're going to set it up for static website hosting, enable versioning, and then experiment with some objects and just observe how versioning changes the default behavior inside an S3 bucket.

      So go ahead and click on Create bucket.

      As long as the bucket name is unique, its specific name isn't important because we won't be using it with Route 53.

      So just give the bucket a name and make sure that it's something unique.

      So I'm going to use AC_Bucket_13337.

      You should pick something different than me and different from something that any other student would use.

      Once you've selected a unique bucket name, just scroll down and uncheck Block All Public Access.

      We're going to be using this as a static website hosting bucket, so this is fine.

      And we'll need to acknowledge that we understand the changes that we're making, so check this box, scroll down a little bit more, and then under bucket versioning we're going to click to enable versioning.

      Keep scrolling down and at the bottom click on Create bucket.

      Next, go inside the bucket, click on Properties, scroll all the way down to the bottom, and we need to enable static website hosting.

      So click on Edit, check the box to enable static website hosting.

      For hosting type, we'll set it to host a static website, and then for the index document, just type index.html, and then for the error document, type error.html.

      Once you've set both of those, you can scroll down to the bottom and click on Save Changes.
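
For anyone scripting this step instead of using the console, the same two settings can be expressed as a configuration structure. This sketch only builds the dictionary; the shape follows the `WebsiteConfiguration` parameter of the boto3 S3 client's `put_bucket_website` call, and the helper name `website_configuration` is hypothetical.

```python
def website_configuration(index_document="index.html", error_document="error.html"):
    """Static website hosting settings matching the console steps above."""
    return {
        "IndexDocument": {"Suffix": index_document},
        "ErrorDocument": {"Key": error_document},
    }


config = website_configuration()
print(config)
```

Assuming a boto3 S3 client named `s3`, this could be applied with `s3.put_bucket_website(Bucket="your-bucket-name", WebsiteConfiguration=config)`.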

      Now as you learned in the previous demo lesson, just enabling static website hosting isn't enough to allow access, we need to apply a bucket policy.

      So click on the permissions tab, scroll down, and under bucket policy click on Edit.

      Now inside the link attached to this lesson, which you should have downloaded and extracted, there should be a file called bucket_policy.json, which is an example bucket policy.

      So go ahead and open that file and copy the contents into your clipboard, move back to the console and paste it into the policy box, and we need to replace this example bucket placeholder with the ARN for this bucket.

      So copy the bucket ARN into your clipboard by clicking this icon.

      Because this ARN references objects in this bucket, and we know this because it's got forward slash star at the end, we need to replace only the first part of this placeholder ARN with the actual bucket ARN from the top.

      So select from the A all the way up to the T, so not including the forward slash and the star, and then paste in the bucket ARN that you copied onto your clipboard.

      Once you've done that, you can scroll down and then click on Save Changes.
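      The ARN substitution can be sketched in a few lines. The contents of bucket_policy.json are not shown in this lesson, so the policy below is an assumption: a typical public-read policy for a static website bucket. The function name and the example ARN are hypothetical.

```python
import json


def public_read_policy(bucket_arn: str) -> dict:
    """Build a public-read policy; the Resource is the bucket ARN plus '/*'."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublicRead",
                "Effect": "Allow",
                "Principal": "*",
                "Action": ["s3:GetObject"],
                # The trailing '/*' makes the statement apply to the
                # objects in the bucket, not to the bucket itself.
                "Resource": [bucket_arn + "/*"],
            }
        ],
    }


policy = public_read_policy("arn:aws:s3:::example-bucket-name")
print(json.dumps(policy, indent=2))
```

      This mirrors the manual step: only the bucket part of the placeholder is replaced, and the forward slash and star stay at the end.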

      Next, click on the objects tab, and we're going to upload some of the files that you downloaded from the link attached to this lesson.

      So click on Upload, and first we're going to add the files.

      So click on Add Files, then you'll need to go to the location where you downloaded and extracted the file that's attached to this lesson.

      And once you're there, go into the folder called S3_Versioning, and you'll see a folder called Website.

      Open that folder, select index.html and click on Open, and then click on Add Folder, and select the IMG folder that's also in that same location.

      So select that folder and then click on Upload.

      So this is going to upload an index.html object, and it's going to upload a folder called IMG which contains winky.jpeg.

      Once you've done that, scroll down to the bottom and just click on Upload.

      Now once the upload's completed, you can go ahead and click on Close, and what you'll see in the Objects dialog inside the bucket is index.html and then a folder called IMG.

      And as we know by now, S3 doesn't actually have folders; it uses prefixes. But if we go inside there, you'll see a single object called winky.jpeg.

      Now go back to the bucket, and what we're going to do is click on Properties, scroll down to the bottom, and then click on this icon to open our bucket in a new browser tab.

      All being well, you should see AnimalsForLife.org, Animal of the Week, and a picture of my one-eyed cat called winky.

      So this is using the same architecture as the previous demo lesson where you experienced static website hosting.

      What we're going to do now though is experiment with versions.

      So go back to the main S3 console, scroll to the top, and click on Objects.

      So because we've got versioning enabled on this bucket, as I talked about in the previous theory lesson, it means that every time you upload an object to this S3 bucket, it's assigned a version ID.

      And if you upload an object with the same name, then instead of overwriting that object, it just creates a new version of that object.

      Now with versioning enabled and using the default settings, we don't see all the individual versions, but we can elect to see them by toggling this Show Versions toggle.

      So go ahead and do that.

      Now you'll see that every object inside the S3 bucket has a particular version ID, and this is a unique code which represents this particular version of this particular object.

      So if we go inside the IMG folder, you'll see that we have the same for winky.jpeg.

      Toggle Show Versions to Disable, and you'll see that that version ID disappears.

      What I want you to do now is to click on the Upload button inside this IMG folder.

      So click on Upload, and then click on Add Files.

      Now inside this lesson's folder, so S3_Versioning, at the top level you've got a number of folders.

      You have Website, which is what you uploaded to this S3 bucket, and this IMG folder contains winky.jpeg.

      So this is a particular file, winky.jpeg, that contains the picture of winky my one-eyed cat.

      Now if you expand version 1 and version 2, you might be able to tell that version 1 is the same one-eyed cat, and we can expand that and see that it is actually winky.

      Inside version 2 we have an object with the same name, but if we expand this, this is not winky, this is a picture of truffles.

      So let's say that an administrator of this bucket makes a mistake and uploads this second version of winky.jpeg, which is not actually winky, it's actually truffles the cat.

      So let's say that we do this, so we select winky.jpeg from the version 2 folder, and we click on Open.

      Once we've selected that for upload, we scroll all the way down to the bottom and click on Upload.

      That might take a few seconds to complete the upload because these are relatively large image files, but once it's uploaded you can click on Close.

      So now we're still inside this IMG folder, and if we refresh, all we can see is one object, winky.jpeg.

      So with this default configuration of the user interface, it looks like we've overwritten a previous object with this new object.

      And if we go back to the tab which has got the static website open and hit refresh, you'll see that this image has indeed been replaced by the truffles image.

      So even though it's called winky.jpeg, this is clearly truffles.

      Now if we go back to the S3 console and enable the Show Versions toggle, we can see that we've got two different versions of this same object.

      We've got the original version at the bottom and a new version at the top.

      And note how both of these have different version IDs.

      Now what S3 does is it always picks the latest version whenever you use any operations which simply request that one object.

      So if we just request the object like we're doing with the static website hosting, then it will always pick the current or the latest version of this object.

      But we do still have access to the older versions because we have versioning enabled on this bucket.

      Nothing is ever truly deleted as long as we're operating on objects rather than on specific versions.
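      The behaviour just described can be modelled with a small in-memory class. This is a toy sketch, not the S3 API: the class name, the method names and the `v1`, `v2` style IDs are all hypothetical (real version IDs are opaque strings assigned by S3).

```python
import itertools


class VersionedKey:
    """Toy in-memory model of one object key in a versioning-enabled bucket."""

    _ids = itertools.count(1)  # shared counter used to mint fake version IDs

    def __init__(self) -> None:
        self.versions = []  # list of (version_id, body); newest is last

    def put(self, body: str) -> str:
        """Uploading under the same name appends a version; nothing is overwritten."""
        version_id = f"v{next(self._ids)}"
        self.versions.append((version_id, body))
        return version_id

    def get(self, version_id=None) -> str:
        """A plain request returns the latest version; older ones need their ID."""
        if version_id is None:
            return self.versions[-1][1]
        return dict(self.versions)[version_id]


key = VersionedKey()
first = key.put("picture of winky")
key.put("picture of truffles")
print(key.get())        # the latest version
print(key.get(first))   # the original version, still retrievable by its ID
```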

      So let's experiment with exactly what functionality this gives us.

      Go ahead and toggle Show Versions off.

      Once you've done that, select the winky.jpeg object and then click delete.

      You'll need to type or copy and paste delete into this delete objects box and then click on delete.

      Before we do that, note what it says at the top.

      Deleting the specified objects adds delete markers to them.

      If you need to undo the delete action, you can delete the delete markers.

      So let's explore what this means.

      Go ahead and click on delete objects.

      And once it's completed, click on close.

      Now look at how this appears at the moment. We're still in the IMG folder.

      And because we've got Show Versions set to off, it looks like we deleted the object.

      But this is not what's occurred because we've got versioning enabled.

      What's actually occurred is that this has added a new version of this object.

      But instead of an actual new version of the object, it's simply added a delete marker as that new version.

      So if we toggle Show Versions back to on, now what we see are the previous versions of winky.jpeg.

      So the original version at the bottom and the one that we replaced in the middle.

      And then at the top we have this delete marker.

      Now the delete marker is the thing which makes the object look deleted in the console UI when we have Show Versions set to off.

      So this is how S3 handles deletions when versioning is enabled.

      If you're interacting with an object and you delete that object, it doesn't actually delete the object.

      It simply adds a delete marker as the most recent version of that object.

      Now if we just select that delete marker and then click on delete, that has the effect of undeleting the object.

      Now it's important to highlight that because we're dealing with object versions, anything that we do is permanent.

      If you're operating with an object and you have versioning enabled on a bucket, if you overwrite it or delete it, all it's going to do is either add a new version or add a delete marker.

      When you're operating with versions, everything is permanent.

      So in this case we're going to be permanently deleting the delete marker.

      So you need to confirm that by typing or copying and pasting permanently delete into this box and then clicking on delete objects.

      What this is going to do is delete the delete marker.

      So if we click on close, now we're left with these two versions of winky.jpeg, so we've deleted the delete marker.

      If we toggle show versions to off, we can see that we now have our object back in the bucket.

      If we go back to the static website hosting tab and refresh, we can see though that it's still truffles.

      So this is a mistake.

      It's not actually winky in this particular image.

      So what we can do is go back to the S3 console, we can enable show versions.

      We know that the most recent version is actually truffles rather than winky.

      So what we can do is select this incorrect version, so the most recent version and select delete.

      Now again, we're working with an object version.

      So this is permanent.

      You need to make sure that this is what you intend.

      In our case it is.

      So you need to either type or copy and paste permanently delete into the box and click on delete objects.

      Now this is going to delete the most recent version of this object.

      What happens when you do that is it makes the next most recent version of that object the current or latest version.

      So now this is the original version of winky.jpeg, the one that we first uploaded to this bucket.

      So this is now the only version of this object.

      If we go back to the static website hosting tab and hit refresh, this time it loads the correct version of this image.

      So this is actually winky my one-eyed cat.

      So this is how you can interact with versioning in an S3 bucket.

      Whenever it's enabled, it means that whenever you upload an object with the same name, instead of overwriting, it simply creates a new version.

      Whenever you delete an object, it simply adds a delete marker.

      When you're operating with objects, it's always creating new versions or adding delete markers.

      But when you're working with particular versions rather than objects, any operations are permanent.

      So you can actually delete specific versions of an object permanently and you can delete delete markers to undelete that object.
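      The whole sequence from this demo can be replayed against a toy model. The sketch below is an illustration under stated assumptions, not the S3 API: a plain Python list stands in for the version stack of one key, entries are addressed by list position instead of real version IDs, and all function names are hypothetical.

```python
DELETE_MARKER = object()  # stands in for an S3 delete marker


def put(stack, body):
    """Upload: push a new version on top of the stack."""
    stack.append(body)


def delete_object(stack):
    """Delete without naming a version: push a delete marker, keep every version."""
    stack.append(DELETE_MARKER)


def delete_version(stack, index):
    """Delete one specific version (or marker): that entry is removed permanently."""
    del stack[index]


def current(stack):
    """What a plain request sees: the top entry, or None if it is a delete marker."""
    if not stack or stack[-1] is DELETE_MARKER:
        return None
    return stack[-1]


stack = []
put(stack, "winky")
put(stack, "truffles")        # the mistaken upload
delete_object(stack)          # the object now looks deleted
assert current(stack) is None
delete_version(stack, -1)     # remove the delete marker: the object is back
assert current(stack) == "truffles"
delete_version(stack, -1)     # permanently remove the truffles version
assert current(stack) == "winky"
```

      Operations on the object only ever push onto the stack, while operations on a specific entry remove it for good, which is the distinction this lesson keeps coming back to.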

      Now it's not possible to turn off versioning on a bucket.

      Once it's enabled on that bucket, you don't have the ability to disable it.

      You only have the ability to suspend it.

      Now when you suspend it, it stops new versions being created, but it does nothing about the existing versions.

      The only way to remove the additional costs for a version-enabled bucket is either to delete the bucket and then reload the objects to a new bucket, or go through the existing bucket and then manually purge any specific versions of objects which aren't required.

      So you need to be careful when you're enabling versioning on a bucket because it can cause additional costs.

      If you have a bucket where you're uploading objects over and over again, especially if they're large objects, then if you have versioning enabled, you can incur significantly higher costs than if you have a bucket which doesn't have versioning enabled.

      So that's something you need to keep in mind.

      If you enable versioning, you need to manage those versions of those objects inside the bucket.
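      A quick piece of arithmetic shows why. The sketch below counts only stored bytes, assumes every upload of the same key is the same size, and deliberately uses no price figures; the function name and the 5 MiB object size are hypothetical.

```python
def stored_bytes(object_size: int, uploads: int, versioning: bool) -> int:
    """Bytes retained after uploading the same key `uploads` times."""
    if uploads == 0:
        return 0
    # With versioning every upload is kept; without it only the last one is.
    return object_size * (uploads if versioning else 1)


size = 5 * 1024 * 1024  # a hypothetical 5 MiB image
print(stored_bytes(size, 100, versioning=False))  # 5242880
print(stored_bytes(size, 100, versioning=True))   # 524288000
```

      In this example the versioned bucket holds one hundred times as much data for the same single visible object.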

      With that being said, let's tidy up.

      So let's go back to the main S3 console, select the bucket, click on Empty, copy and paste or type "Permanently Delete" and click on Empty.

      When it's finished, click on Exit, and with the bucket still selected, click on Delete.

      Copy and paste or type the name of the bucket and confirm by clicking Delete bucket.

      That will put the account back in the same state as it was at the start of this demo lesson.

      Now at this point, that's everything that I want you to do in this demo lesson.

      You've gained some practical exposure with how to deal with object versions inside an S3 bucket.

      At this point, go ahead and complete this video, and when you're ready, I'll look forward to you joining me in the next lesson.

    1. In their study, Felt’s team used a computer algorithm to look through the underlying code of each application line by line and deduce a list of permissions that were necessary for the program to function.

      This doesn't clearly state the mechanism by which the study was done, which makes the study's methodology confusing.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The authors of the manuscript entitled "A conserved fungal Knr4/Smi1 protein is vital for maintaining cell wall integrity and host plant pathogenesis" used a weighted gene co-expression network to identify Fusarium graminearum genes highly expressed during early symptomless infection of wheat. Based on its sequence and previous studies, authors selected FgKnr4 from the early symptomless Fusarium modules. The characterization of knockout strains revealed a role in morphogenesis, growth, cell wall stress tolerance, and virulence in F. graminearum and the phylogenetically distant fungus Zymoseptoria tritici.

      The methods are properly described and the statistical analyses are reasonable, so reproducibility is possible. The RNA-seq dataset is already published and the authors provided a repository with the code used to create the co-expression network. However, I have the following questions:

      • Why were only the high-confidence maize transcripts used to map the reads, rather than the full genome as for Fusarium graminearum? I have never analyzed a plant transcriptome.
      • The regular output of DESeq are TPMs, how did the authors obtain the FPKM used in the analysis?
      • Do the authors have a southern blot to prove the location of the insertion and number of insertions in Zymoseptoria tritici mutant and complemented strains?
      • Boxplots and bar graphs should have the same format. In Figures 5 B and F and supplementary figure 6.3 the authors showed the distribution of samples but it is lacking in figure 3 B and all bar graphs.
      • Line 247 FGRAMPH1_0T23707 should be FGRAMPH1_01T23707

      Referees cross-commenting

      I agree with reviewer 1: the order in which the figures are called in the text is confusing. Regarding figures 5C-D, I am no expert in the field, therefore I can only say they look like they have not been edited.

      I agree with reviewer 1: data on DON mycotoxin production in infected tissues are needed to support the statement in lines 272-273.

      I agree with Reviewer 2, the criteria to exclude genes from the final selection list should be explained.

      Significance

      The study showed, once again, that a weighted gene co-expression network is a great method to identify new genes of interest regardless of the organism or condition even if not very popular in the fungal pathogen field yet. The study proved that functions identified in a WGCN module from a pathogen have their opposite in the host module. The authors go beyond the theory and demonstrate the effect of the highest expressed gene during the early symptomless stage of infection in maize and wheat fungal pathogens.

      Fungal pathogen, RNA-seq, metabolic models, metabolism, comparative genomics

    1. Reviewer #1 (Public Review):

      Summary:

      In this study, the authors re-analyzed Experiment 1 of a public dataset (Rademaker et al, 2019, Nature Neuroscience) which includes fMRI and behavioral data recorded while participants held an oriented grating in visual working memory (WM) and performed a delayed recall task at the end of an extended delay period. In that experiment, participants were pre-cued on each trial as to whether there would be a distracting visual stimulus presented during the delay period (filtered noise or randomly oriented grating). In this manuscript, the authors focused on identifying whether the neural code in the retinotopic cortex for remembered orientation was 'stable' over the delay period, such that the format of the code remained the same, or whether the code was dynamic, such that information was present, but encoded in an alternative format. They identify some time points - especially towards the beginning/end of the delay - where the multivariate activation pattern fails to generalize to other time points and interpret this as evidence for a dynamic code. Additionally, the authors compare the representational format of remembered orientation in the presence vs absence of a distracting stimulus, averaged over the delay period. This analysis suggested a 'rotation' of the representational subspace between distracting orientations and remembered orientations, which may help preserve simultaneous representations of both remembered and viewed stimuli.

      Strengths:

      (1) Direct comparisons of coding subspaces/manifolds between time points and task conditions is an innovative and useful approach for understanding how neural representations are transformed to support cognition.

      (2) Re-use of existing datasets substantially goes beyond the authors' previous findings by comparing the geometry of representational spaces between conditions and time points, and by looking explicitly for dynamic neural representations

      Weaknesses:

      (1) Only Experiment 1 of Rademaker et al (2019) is reanalyzed. The previous study included another experiment (Expt 2) using different types of distractors which did result in distractor-related costs to neural and behavioral measures of working memory. The Rademaker et al (2019) study uses these two results to conclude that neural WM representations are protected from distraction when distraction does not impact behavior, but conditions that do impact behavior also impact neural WM representations. Considering this previous result is critical for relating the present manuscript's results to the previous findings, it seems necessary to address Experiment 2's data in the present work.

      (2) Primary evidence for 'dynamic coding', especially in the early visual cortex, appears to be related to the transition between encoding/maintenance and maintenance/recall, but the delay period representations seem overall stable, consistent with previous findings

      (3) Dynamicism index used in Figure 1f quantifies the proportion of off-diagonal cells with significant differences in decoding performance from the diagonal cell. It's unclear why the proportion of time points is the best metric, rather than something like a change in decoding accuracy. This is addressed in the subsequent analysis considering coding subspaces, but the utility of the Figure 1f analysis remains weakly justified.

      (4) There is no report of how much total variance is explained by the two PCs defining the subspaces of interest in each condition, and timepoint. It could be the case that the first two principal components in one condition (e.g., sensory distractor) explain less variance than the first two principal components of another condition.

      (5) Converting a continuous decoding metric (angular error) to "% decoding accuracy" serves to obfuscate the units of the actual results. Decoding precision (e.g., sd of decoding error histogram) would be more interpretable and better related to both the previous study and behavioral measures of WM performance.

      (6) This report does not make use of behavioral performance data in the Rademaker et al (2019) dataset.

      (7) Given there were observed differences between individual retinotopic ROIs in the temporal cross-decoding analyses shown in Figure 1, the lack of data presented for the subspace analyses for the corresponding individual ROIs is a weakness

    2. Reviewer #2 (Public Review):

      Summary:

      In this work, Degutis and colleagues addressed an interesting issue related to the concurrent coding of sensory percepts and visual working memory contents in visual cortices. They used generalization analyses to test whether working memory representations change over time, diverge from sensory percepts, and vary across distraction conditions. Temporal generalization analysis demonstrated that off-diagonal decoding accuracies were lower than on-diagonal decoding accuracies, regardless of the presence of intervening distractions, implying that working memory representations can change over time. They further showed that the coding space for working memory contents showed subtle but statistically significant changes over time, potentially explaining the impaired off-diagonal decoding performance. The neural coding of sensory distractions instead remained largely stable. Generalization analyses between target and distractor codes showed overlaps but were not identical. Cross-condition decodings had lower accuracies compared to within-condition decodings. Finally, within-condition decoding revealed more reliable working memory representations in the condition with intervening random noises compared to cross-condition decoding using a trained classifier on data from the no-distraction condition, indicating a change in the VWM format between the noise distractor and no-distractor trials.

      Strengths:

      This paper demonstrates a clever use of generalization analysis to show changes in the neural codes of working memory contents across time and distraction conditions. It provides some insights into the differences between representations of working memory and sensory percepts, and how they can potentially coexist in overlapping brain regions.

      Weaknesses:

      (1) An alternative interpretation of the temporal dynamic pattern is that working memory representations become less reliable over time. As shown by the authors in Figure 1c and Figure 4a, the on-diagonal decoding accuracy generally decreased over time. This implies that the signal-to-noise ratio was decreasing over time. Classifiers trained with data of relatively higher SNR and lower SNR may rely on different features, leading to poor generalization performance. This issue should be addressed in the paper.

      (2) The paper tests against a strong version of stable coding, where neural spaces representing WM contents must remain identical over time. In this version, any changes in the neural space will be evidence of dynamic coding. As the paper acknowledges, there is already ample evidence arguing against this possibility. However, the evidence provided here (dynamic coding cluster, angle between coding spaces) is not as strong as what prior studies have shown for meaningful transformations in neural coding. For instance, the principal angle between coding spaces over time was smaller than 8 degrees, and around 7 degrees between sensory distractors and WM contents. This suggests that the coding space for WM was largely overlapping across time and with that for sensory distractors. Therefore, the major conclusion that working memory contents are dynamically coded is not well-supported by the presented results.

      (3) Relatedly, the main conclusions, such as "VWM code in several visual regions did not generalize well between different time points" and "VWM and feature-matching sensory distractors are encoded in separable coding spaces" are somewhat subjective given that cross-condition generalization analyses consistently showed above chance-level performance. These results could be interpreted as evidence of stable coding. The authors should use more objective descriptions, such as 'temporal generalization decoding showed reduced decoding accuracy in off-diagonals compared to on-diagonals.'

    1. Introduction

In the year 1914 the University Museum secured by purchase a large six column tablet nearly complete, carrying originally, according to the scribal note, 240 lines of text. The contents supply the South Babylonian version of the second book of the epic ša nagba imuru, “He who has seen all things,” commonly referred to as the Epic of Gilgamish. The tablet is said to have been found at Senkere, ancient Larsa near Warka, modern Arabic name for and vulgar descendant of the ancient name Uruk, the Biblical Erech mentioned in Genesis X. 10. This fact makes the new text the more interesting since the legend of Gilgamish is said to have originated at Erech and the hero in fact figures as one of the prehistoric Sumerian rulers of that ancient city. The dynastic list preserved on a Nippur tablet[1] mentions him as the fifth king of a legendary line of rulers at Erech, who succeeded the dynasty of Kish, a city in North Babylonia near the more famous but more recent city Babylon. The list at Erech contains the names of two well known Sumerian deities, Lugalbanda[2] and Tammuz. The reign of the former is given at 1,200 years and that of Tammuz at 100 years. Gilgamish ruled 126 years. We have to do here with a confusion of myth and history in which the real facts are disengaged only by conjecture. The prehistoric Sumerian dynasties were all transformed into the realm of myth and legend. Nevertheless these rulers, although appearing in the pretentious nomenclature as gods, appear to have been real historic personages.[3]

The name Gilgamish was originally written dGi-bil-aga-miš, and means “The fire god (Gibil) is a commander,” abbreviated to dGi-bil-ga-miš, and dGi(š)-bil-ga-miš, a form which by full labialization of b to u̯ was finally contracted to dGi-il-ga-miš.[4] Throughout the new text the name is written with the abbreviation dGi(š),[5] whereas the standard Assyrian text has consistently the writing dGIŠ-ṬU[6]-BAR. The latter method of writing the name is apparently cryptographic for dGiš-bar-aga-(miš); the fire god Gibil has also the title Giš-bar.

A fragment of the South Babylonian version of the tenth book was published in 1902, a text from the period of Hammurapi, which showed that the Babylonian epic differed very much from the Assyrian in diction, but not in content. The new tablet, which belongs to the same period, also differs radically from the diction of the Ninevite text in the few lines where they duplicate each other. The first line of the new tablet corresponds to Tablet I, Col. V 25 of the Assyrian text,[7] where Gilgamish begins to relate his dreams to his mother Ninsun.[8] The last line of Col. I corresponds to the Assyrian version Book I, Col. VI 29. From this point onward the new tablet takes up a hitherto unknown portion of the epic, henceforth to be assigned to the second book.[9]

At the end of Book I in the Assyrian text and at the end of Col. I of Book II in the new text, the situation in the legend is as follows. The harlot halts outside the city of Erech with the enamoured Enkidu, while she relates to him the two dreams of the king, Gilgamish. In these dreams which he has told to his mother he receives premonition concerning the advent of the satyr Enkidu, destined to join with him in the conquest of Elam. Now the harlot urges Enkidu to enter the beautiful city, to clothe himself like other men and to learn the ways of civilization. When he enters he sees someone, whose name is broken away, eating bread and drinking milk, but the beautiful barbarian understands not. The harlot commands him to eat and drink also: “It is the conformity of life, Of the conditions and fate of the Land.” He rapidly learns the customs of men, becomes a shepherd and a mighty hunter. At last he comes to the notice of Gilgamish himself, who is shocked by the newly acquired manner of Enkidu. “Oh harlot, take away the man,” says the lord of Erech. Once again the faithful woman instructs her heroic lover in the conventions of society, this time teaching him the importance of the family in Babylonian life, and obedience to the ruler. Now the people of Erech assemble about him admiring his godlike appearance. Gilgamish receives him and they dedicate their arms to heroic endeavor.

At this point the epic brings in a new and powerful motif, the renunciation of woman’s love in the presence of a great undertaking. Gilgamish is enamoured of the beautiful virgin goddess Išhara, and Enkidu, fearing the effeminate effects of his friend’s attachment, prevents him forcibly from entering a house. A terrific combat between these heroes ensues,[10] in which Enkidu conquers, and in a magnanimous speech he reminds Gilgamish of his higher destiny. In another unplaced fragment of the Assyrian text[11] Enkidu rejects his mistress also, apparently on his own initiative and for ascetic reasons. This fragment, heretofore assigned to the second book, probably belongs to Book III.

The tablet of the Assyrian version which carries the portion related on the new tablet has not been found. Man redeemed from barbarism is the major theme of Book II.

The newly recovered section of the epic contains two legends which supplied the glyptic artists of Sumer and Accad with subjects for seals. Obverse III 28–32 describes Enkidu the slayer of lions and panthers. Seals in all periods frequently represent Enkidu in combat with a lion. The struggle between the two heroes, where Enkidu strives to rescue his friend from the fatal charms of Išhara, is probably depicted on seals also. On one of the seals published by Ward, Seal Cylinders of Western Asia, No. 459, a nude female stands beside the struggling heroes.[12] This scene not improbably illustrates the effort of Enkidu to rescue his friend from the goddess. In fact the satyr stands between Gilgamish and Išhara(?) on the seal.

[1] Ni. 13981, published by Dr. Poebel in PBS. V, No. 2.
[2] The local Bêl of Erech and a bye-form of Enlil, the earth god. Here he is the consort of the mother goddess Ninsun.
[3] Tammuz is probably a real personage, although Dumu-zi, his original name, is certainly later than the title Ab-ú, probably the oldest epithet of this deity, see Tammuz and Ishtar, p. 8. Dumu-zi I take to have been originally the name of a prehistoric ruler of Erech, identified with the primitive deity Abu.
[4] See ibid., page 40.
[5] Also Meissner’s early Babylonian duplicate of Book X has invariably the same writing, see Dhorme, Choix de Textes Religieux, 298–303.
[6] Sign whose gunufied form is read aga.
[7] The standard text of the Assyrian version is by Professor Paul Haupt, Das Babylonische Nimrodepos, Leipzig, 1884.
[8] The name of the mother of Gilgamish has been erroneously read ri-mat ilatNin-lil, or Rimat-Bêlit, see Dhorme 202, 37; 204, 30, etc. But Dr. Poebel, who also copied this text, has shown that Nin-lil is an erroneous reading for Nin-sun. For Ninsun as mother of Gilgamish see SBP. 153 n. 19 and R.A., IX 113 III 2. Ri-mat ilatNin-sun should be rendered “The wild cow Ninsun.”
[9] The fragments which have been assigned to Book II in the British Museum collections by Haupt, Jensen, Dhorme and others belong to later tablets, probably III or IV.
[10] Rm. 289, latter part of Col. II (part of the Assyrian version) published in HAUPT, ibid., 81–4 preserves a defective text of this part of the epic. This tablet has been erroneously assigned to Book IV, but it appears to be Book III.
[11] K. 2589 and duplicate (unnumbered) in Haupt, ibid., 16–19.
[12] See also Ward, No. 199.

Transliteration

1 it-bi-e-ma iluGilgamiš šu-na-tam i-pa-aš-šar.
2 iz-za-kar-am^1 a-na um-mi-šu
3 um-mi i-na ša-a-at mu-ši-ti-i̭a
4 ša-am-ḫa-ku-ma at-ta-na-al-la-ak
5 i-na bi-ri-it id-da-tim
6 ib-ba-šu-nim-ma ka-ka-’a^2 ša-ma-i
7 ki-?-?-rum^3 ša a-nim im-ku-ut a-na ṣi-ri-i̭a
8 áš-ši-šu-ma ik-ta-bi-it^4 e-li-i̭a
9 ilam^5 iš-šu-ma nu-uš-ša-šu^6 u-ul el-ti-’i̭
10 ad-ki ma-tum pa-ḫi-ir^7 e-li-šu
11 id-lu-tum ú-na-ša-ku ši-pi-šu
12 ú-um-mi-id-ma pu-ti
13 i-mi- du i̭a-ti
14 aš-ši-a-šu-ma at-ba-la-áš-šu a-na ṣi-ri-ki
15 um-mi iluGilgamiš mu-u-da-a-at ka-la-ma
16 iz-za-kar-am a-na iluGilgamiš
17 mi-in-di iluGilgamish ša ki-ma ka-ti
18 i-na ṣi-ri i-wa-li-id-ma
19 ú-ra-ab-bi-šu ša-du-ú
20 ta-mar-šu-ma [sa(?)]-ap-ḫa-ta at-ta
21 id-lu-tum ú-na-ša-ku ši-pi-šu^8
22 te-iṭ-ṭi-ra-šu(?) … šu-ú-zu
23 ta-tar-ra-[’a]-šu a-na ṣi-[ri-i̭]a
24 [iš-(?)] ti-lam-ma^9 i-ta-mar ša-ni-tam
25 [šu-na-]ta i-ta-wa-a-am a-na um-mi-šu
26 [um-m]i a-ta-mar ša-ni-tam
27 [šu-na-ta a-ta]mar e-mi-a i-na zu-ki-im
28 [i-na?] Unuk-(ki) ri-bi-tim^10
29 ḫa-aṣ-ṣi-nu na-di-i-ma
30 e-li-šu pa-aḫ- ru
31 ḫa-aṣ-ṣi-nu-um-ma ša-ni bu-nu-šu
32 a-mur-šu-ma aḫ-ta-ta a-na-ku
33 a-ra-am-šu-ma ki-ma áš-ša-tim
34 a-ḫa-ap-pu-up el-šu
35 el-ki-šu-ma áš-ta-ka-an-šu
36 a-na a-ḫi-i̭a
37 um-mi iluGilgamish mu-da-at ka-la-ma
38 [iz-za-kar-am a-na iluGilgamish]
...................................

COL. II

1 aš-šum uš-[ta-] ma-ḫa-ru it-ti-ka.
2 iluGilgamish šu-na-tam i-pa-šar
3 iluEn-ki-[dû w]a?-ši-ib ma-ḫar ḫa-ri-im-tim
4 UR [ ]-ḫa-mu DI-?-al-lu-un
5 [ ] im-ta-ši a-šar i-wa-al-du
6 ûmê 6^11 ù 7 mu-ši- a-tim
7 iluEn-ki-dû te-bi- i-ma
8 ša-[am-ka-ta] ir- ḫi
9 ḫa-[ri-im-tu pa-a]-ša i-pu-ša-am-ma
10 iz-za-[kar-am] a-na iluEn-ki-dû^12
11 a-na-ṭal-ka dEn-ki-dû ki-ma ili ta-ba-áš-ši
12 am-mi-nim it-ti na-ma-áš-te-e^13
13 ta-at-ta-[na-al-]la -ak ṣi-ra-am
14 al-kam lu-ùr-di- ka
15 a-na libbi Uruk-(ki) ri-bi-tim
16 a-na biti [el-]lim mu-ša-bi ša A-nim
17 dEn-ki-dû ti-bi lu-ru-ka
18 a-na É-[an-n]a mu-ša-bi ša A-nim
19 a-šar [iluGilgamiš] it-[.........] ne-pi-ši-tim(?)
20 ù at-[   ]-di [   -] ma
21 ta-[   ] ra-ma-an- ka
22 al-ka ti-ba i-[na] ga-ag-ga-ri
23 ma-a-a?^14 -ak ri-i-im
24 iš-me a-wa-az-za im-ta-gár ga-ba-ša
25 mi-il-kum ša sinništi
26 im-ta-[ku]-ut a-na libbi-šu
27 iš-ḫu-uṭ li-ib-ša-am
28 iš-ti-nam [ú]-la-ab-bi-iš-šu
29 li-ib- [ša-am] ša-ni-a-am
30 ši-i it-ta-al-ba- áš
31 ṣa-ab-ta-at ga-az- zu
32 ki-ma ? i-ri-id-di-šu
33 a-na gu-up-ri ša ri-i-im
34 a-š[ar   ] tar-ba-ṣi-im
35 i-na [   ]-ḫu-ru ri-i̭a-ú^15
36 .............................
(About two lines broken away.)

COL. III

1 ši-iz-ba ša na-ma-áš-te-e
2 i-te-en- ni- iḳ
3 a-ka-lam iš-ku-nu ma-ḫar-šu
4 ip-te-iḳ-ma i-na -aṭ-ṭal^16
5 ù ip-pa-al-la- as
6 u-ul i-di dEn-ki- dû
7 aklam a-na a-ka-lim
8 šikaram a-na ša-te-e-im
9 la-a lum-mu- ud
10 ḫa-ri-im-lum pi-ša i-pu-ša-am- ma
11 iz-za-kar-am a-na iluEn-ki-dû
12 a-ku-ul ak-lam dEn-ki-dû
13 zi-ma-at ba-la-ṭi-im
14 bi-ši-ti ši-im-ti ma-ti
15 i-ku-ul a-ak-lam iluEn-ki-dû
16 a-di ši-bi-e-šu
17 šikaram iš-ti-a-am
18 7 aṣ-ṣa-am-mi-im^17
19 it-tap-šar kab-ta-tum i-na-an-gu
20 i-li-iṣ libba- šu- ma
21 pa-nu-šu [it-]ta(?)-bir -ru^18
22 ul-tap-pi-it [............]-i
23 šu-ḫu-ra-am pa-ga-ar-šu
24 ša-am-nam ip-ta-ša-áš-ma
25 a-we-li-iš i-mē
26 il-ba- áš li-ib-ša-am
27 ki-ma mu-ti i-ba-áš-ši
28 il-ki ka-ak-ka-šu
29 la-bi ú gi-ir- ri
30 iš-sa-ak-pu šab-[ši]-eš mu-ši-a-ti
31 ut- tap -pi-iš šib-ba-ri^19
32 la-bi uk-t[a ]-ši-id
33 it-ti immer na-ki-[e?] ra-bu-tum
34 iluEn-ki-dû ma-aṣ-ṣa-ar-šu-nu
35 a-we-lum wa-ru-um
36 iš-[te]-en id-lum
37 a-na[ ........ u]-za-ak-ki-ir
...........................
(About five lines broken away.)

REVERSE I

..............................
1 i-ip-pu-uš ul-ṣa-am
2 iš-ši-ma i-ni-i-šu
3 i-ta-mar a-we-lam
4 iz^20-za-kar-am a-na ḫarimti
5 ša-am-ka-at uk-ki-ši^21 a-we-lam
6 a-na mi-nim il-li-kam
7 zi-ki-ir-šu lu-uš-šu^22
8 ḫa-ri-im-tum iš-ta-si a-we-lam
9 i-ba-uš-šu-um-ma i-ta-mar-šu
10 e-di-il^23 e-eš-ta-ḫi-[ṭa-am]
11 mi-nu a-la-ku-zu na-aḫ-^24 [     -]ma
12 e pi-šu i-pu-ša-am-[ma]
13 iz-za-kar-am a-na iluEn-[ki-dû]
14 bi-ti-iš e-mu-tim [                ]
15 ši-ma-a-at ni-ši-i- ma
16 tu-ṣa^25-ar pa-a-ta-tim^26
17 a-na âli dup-šak-ki-i e ṣi-en
18 UG-AD-AD-LIL e-mi ṣa-a-a-ḫa-tim
19 a-na šarri Unuk-(ki) ri-bi-tim
20 pi-ti pu-uk epši^27 a-na ḫa-a-a-ri
21 a-na iluGilgamiš šarri ša Unuk-(ki) ri-bi-tim
22 pi-ti pu-uk epši^28
23 a-na ha-a-a-ri
24 áš-ša-at ši-ma-tim i-ra-aḫ-ḫi
25 šu-u pa-na-nu-um-ma
26 mu-uk wa-ar-ka-nu
27 i-na mi-il-ki ša ili ga-bi-ma
28 i-na bi-ti-iḳ a-pu-un-na-ti-šu^29
29 ši- ma- az- zum
30 a-na zi-ik-ri id-li-im
31 i-ri-ku pa-nu-šu

REVERSE II

............................................................
(About five lines broken away.)
1 i-il-la-ak- ..........
2 ù ša-am-ka-at[     ]ar-ki-šu
3 i- ru- ub-ma^30 a-na^31 libbi Uruk-(ki) ri-bi-tim
4 ip-ḫur um-ma-nu-um i-na ṣi-ri-šu
5 iz-zi-za-am-ma i-na zu-ki-im
6 ša Unuk-(ki) ri-bi-tim
7 pa-aḫ-ra-a-ma ni-šu
8 i-ta-mē-a i-na ṣi-ri-šu pi(?)-it-tam^32
9 a-na mi-[ni]^33 iluGilgamiš ma-ši-il
10 la-nam ša- pi- il
11 e-ṣi[   pu]-uk-ku-ul
12 i ? -ak-ta
13 i[-    -]di i-ši?
14 ši-iz-ba ša[na-ma-]áš-[te]-e
15 i-te- en- ni- iḳ
16 ka-i̭ā-na i-na [libbi] Uruk-(ki) kak-ki-a-tum^34
17 id-lu-tum u-te-el-li- lu
18 ša-ki-in ip-ša- nu^35
19 a-na idli ša i-tu-ru zi-mu-šu
20 a-na iluGilgamiš ki-ma i-li-im
21 ša-ki-iš-šum^36 me-iḫ-rum
22 a-na ilatIš-ḫa-ra ma-i̭ā-lum
23 na- [di]-i- ma
24 iluGilgamish id-[   ]na-an(?)...
25 i-na mu-ši in-ni-[    -]id
26 i-na-ak^37-ša-am- ma
27 it-ta-[    ]i-na zûki
28 ip-ta-ra-[ku   ]-ak-tām
29 ša iluGilgamish
30 ........... da-na(?) ni-iš-šu

COL. III

1 ur-(?)ḫa .....................
2 iluGilgamiš ................
3 i-na ṣi-ri ....................
[219] 4i-ḫa-an-ni-ib [pi-ir-ta-šu?] 5it-bi-ma ... 6a-na pa-ni- šu 7it-tam-ḫa-ru i-na ri-bi-tu ma-ti 8iluEn-ki-dû ba-ba-am ip-ta-ri-ik 9i-na ši-pi-šu 10iluGilgamiš e-ri-ba-am u-ul id-di-in 11iṣ-ṣa-ab-tu-ma ki-ma li-i-im 12i- lu- du38 13zi-ip-pa-am ’i-bu- tu 14i-ga-rum ir-tu-tū39 15iluGilgamiš ù iluEn-ki- dû 16iṣ-ṣa-ab-tu-ù- ma 17ki-ma li-i-im i-lu-du 18zi-ip-pa-am ’i-bu- tu 19i-ga-rum ir-tu-tū 20ik-mi-is-ma iluGilgamiš 21i-na ga-ga-ag-ga-ri ši-ip-šu 22ip-ši-iḫ40 uṣ-ṣa-šu- ma 23i-ni-’i i-ra-az-zu 24iš-tu i-ra-zu i-ni-ḫu41 25iluEn-ki-dû a-na ša-ši-im 26iz-za-kar-am a-na iluGilgamiš 27ki-ma iš-te-en-ma um-ma-ka 28ú- li- id- ka 29ri-im-tum ša zu- pu-ri 30ilat-Nin- sun- na 31ul-lu e-li mu-ti ri-eš-su [220] 32šar-ru-tam ša ni-ši 33i-ši-im-kum iluEn-lil duppu 2 kam-ma šu-tu-ur e-li … 4 šu-ši42 1 Here this late text includes both variants pašāru and zakāru. The earlier texts have only the one or the other. 2 For kakabê; b becomes u̯ and then is reduced to the breathing. 3 The variants have kima kiṣri; ki-[ma]?-rum is a possible reading. The standard Assyrian texts regard Enkidu as the subject. 4 Var. da-an 5 ŠAM-KAK = ilu, net. The variant has ultaprid ki-is-su-šu, “he shook his murderous weapon.” For kissu see ZA. 9,220,4 = CT. 12,14b 36, giš-kud = ki-is-su. 6 Var. nussu for nuš-šu = nušša-šu. The previous translations of this passage are erroneous. 7 This is to my knowledge the first occurence of the infinitive of this verb, paḫēru, not paḫāru. 8 Text ma? 9 ištanamma > ištilamma. 10 Cf. Code of Hammurapi IV 52 and Streck in Babyloniaca II 177. 11 Restored from Tab. I Col. IV 21. 12 Cf. Dhorme Choix de Textes Religieux 198, 33. 13 namaštû a late form which has followed the analogy of reštû in assuming the feminine t as part of the root. The long û is due to analogy with namaššû a Sumerian loan-word with nisbe ending. 14 Room for a small sign only, perhaps A; māi̭āk? For mâka, there, see BEHRENS, LSS. II page 1 and index. 15 Infinitive “to shepherd”; see also Poebel, PBS. 
V 106 I, ri-i̭a-ú, ri-te-i̭a-ú. 16 The text has clearly AD-RI. 17 Or azzammim? The word is probably an adverb; hardly a word for cup, mug (??). 18 it is uncertain and ta more likely than uš. One expects ittabriru. Cf. muttabrirru, CT. 17, 15, 2; littatabrar, EBELING, KTA. 69, 4. 19 For šapparu. Text and interpretation uncertain. uttappiš II² from tapāšu, Hebrew tāpaś, seize. 20 Text ta! 21 On ekēšu, drive away, see Zimmern, Shurpu, p. 56. Cf. uk-kiš Myhrman, PBS. I 14, 17; uk-ki-ši, King, Cr. App. V 55; etc., etc. 22 The Hebrew cognate of mašû, to forget, is našâ, Arabic nasijia, and occurs here in Babylonian for the first time. See also Brockelman, Vergleichende Grammatik 160 a. 23 Probably phonetic variant of edir. The preterite of edēru, to be in misery, has not been found. If this interpretation be correct the preterite edir is established. For the change r > l note also attalaḫ < attaraḫ, Harper, Letters 88, 10, bilku < birku, RA. 9, 77 II 13; uttakkalu < uttakkaru, Ebeling, KTA. 49 IV 10. 24 Also na-’-[     -]ma is possible. 25 The text cannot be correct since it has no intelligible sign. My reading is uncertain. 26 Text uncertain, kal-lu-tim is possible. 27 KAK-ši. 28 KAK-ši. 29 Literally nostrils. pitik apunnati-šu, work done in his presence(?). The meaning of the idiom is uncertain. 30 Text ZU! 31 Text has erroneous form. 32 Text PA-it-tam clearly! 33 Omitted by the scribe. 34 Sic! The plural of kakku, kakkîtu(?). 35 Cf. e-pi-ša-an-šu-nu libâru, “May they see their doings,” Maḳlu VII 17. 36 For šakin-šum. 37 On the verb nâku see the Babylonian Book of Proverbs § 27. 38 The verb la’āṭu, to pierce, devour, forms its preterite iluṭ; see VAB. IV 216, 1. The present tense which occurs here as iluṭ also. 39 Note BUL(tu-ku) = ratātu (falsely entered in Meissner, SAI. 7993), and irattutu in Zimmern, Shurpu, Index. 40 “For ipšaḫ.” 41 Sic! ḫu reduced to the breathing ’u; read i-ni-’u. 
42 The tablet is reckoned at forty lines in each column, Translation 1Gilgamish arose interpreting dreams, 2addressing his mother. 3“My mother! during my night 4I, having become lusty, wandered about 5in the midst of omens. 6And there came out stars in the heavens, 7Like a … of heaven he fell upon me. 8I bore him but he was too heavy for me. 9He bore a net but I was not able to bear it. 10I summoned the land to assemble unto him, 11that heroes might kiss his feet. 12He stood up before me1 13and they stood over against me. 14I lifted him and carried him away unto thee.” 15The mother of Gilgamish she that knows all things, 16said unto Gilgamish:— [212] 17“Truly oh Gilgamish he is 18born2 in the fields like thee. 19The mountains have reared him. 20Thou beholdest him and art distracted(?) 21Heroes kiss his feet. 22Thou shalt spare him…. 23Thou shalt lead him to me.” 24Again he dreamed and saw another dream 25and reported it unto his mother. 26“My mother, I have seen another 27[dream. I beheld] my likeness in the street. 28In Erech of the wide spaces3 29he hurled the axe, 30and they assembled about him. 31Another axe seemed his visage. 32I saw him and was astounded. 33I loved him as a woman, 34falling upon him in embrace. 35I took him and made him 36my brother.” 37The mother of Gilgamish she that knows all things 38[said unto Gilgamish:—] ................................... [213] COL. II 1that he may join with thee in endeavor.” 2(Thus) Gilgamish solves (his) dream. 3Enkidu sitting before the hierodule 4 5[   ] forgot where he was born. 6Six days and seven nights 7came forth Enkidu 8and cohabited with the courtesan. 9The hierodule opened her mouth 10speaking unto Enkidu. 11“I behold thee Enkidu; like a god thou art. 12Why with the animals 13wanderest thou on the plain? 14Come! I will lead thee 15into the midst of Erech of the wide places, 16even unto the holy house, dwelling place of Anu. 
17Oh Enkidu, arise, I will conduct thee 18unto Eanna dwelling place of Anu, 19where Gilgamish [oppresses] the souls of men(?) 20And as I ............ 21thou shalt ........ thyself. [214] 22Come thou, arise from the ground 23unto the place yonder (?) of the shepherd.” 24He heard her speak and accepted her words with favor. 25The advice of the woman 26fell upon his heart. 27She tore off one garment 28and clothed him with it. 29With a second garment 30she clothed herself. 31She clasped his hand, 32guiding him like .............. 33unto the mighty presence of the shepherd, 34unto the place of the ... of the sheepfolds. 35In ......... to shepherd 36............................. (About two lines broken away.) COL. III 1Milk of the cattle 2he drank. 3Food they placed before him. 4He broke bread4 5gazing and looking. 6But Enkidu understood not. 7Bread to eat, 8beer to drink, 9he had not been taught. [215] 10The hierodule opened her mouth 11and said unto Enkidu:— 12“Eat bread, oh Enkidu! 13It is the conformity of life, 14of the conditions and the fate of the land.” 15Enkidu ate bread, 16until he was satiated. 17Beer he drank 18seven times(?). 19His thoughts became unbounded and he shouted loudly. 20His heart became joyful, 21and his face glowed. 22He stroked................. 23the hair of the head.5 His body 24with oil he anointed. 25He became like a man. 26He attired himself with clothes 27even as does a husband. 28He seized his weapon, 29which the panther and lion 30fells in the night time cruelly. 31He captured the wild mountain goats. 32The panther he conquered. 33Among the great sheep for sacrifice 34Enkidu was their guard. 35A man, a leader, 36A hero. 37Unto .......... he elevated ........................... (About five lines broken away.) [216] REVERSE I .............................. 1And he made glad. 2He lifted up his eyes, 3and beheld the man, 4and said unto the hierodule:— 5“Oh harlot, take away the man. 6Wherefore did he come to me? 
7I would forget the memory of him.” 8The hierodule called unto the man 9and came unto him beholding him. 10She sorrowed and was astonished 11how his ways were ............ 12Behold she opened her mouth 13saying unto Enkidu:— 14“At home with a family [to dwell??] 15is the fate of mankind. 16Thou shouldest design boundaries(??) 17for a city. The trencher-basket put (upon thy head). 18.... ......an abode of comfort. [217] 19For the king of Erech of the wide places 20open, addressing thy speech as unto a husband. 21Unto Gilgamish king of Erech of the wide places 22open, addressing thy speech 23as unto a husband. 24He cohabits with the wife decreed for him, 25even he formerly. 26But henceforth 27in the counsel which god has spoken, 28in the work of his presence 29shall be his fate.” 30At the mention of the hero 31his face became pale. REVERSE II ............................................................ (About five lines broken away.) 1going ....................... 2and the harlot ..... after him. 3He entered into the midst of Erech of the wide places. 4The artisans gathered about him. 5And as he stood in the street 6of Erech of the wide places, 7the people assembled [218] 8disputing round about him:— 9“How is he become like Gilgamish suddenly? 10In form he is shorter. 11In ........ he is made powerful. 12 13 14Milk of the cattle 15he drank. 16Continually in the midst of Erech weapons 17the heroes purified. 18A project was instituted. 19Unto the hero whose countenance was turned away, 20unto Gilgamish like a god 21he became for him a fellow. 22For Išhara a couch 23was laid. 24Gilgamish ................... 25In the night he .............. 26embracing her in sleep. 27They ........ in the street 28halting at the ................ 29of Gilgamish. 30.......... mightily(?) COL. III 1A road(?) .................... 2Gilgamish ................... 3in the plain .................. [219] 4his hair growing thickly like the corn. 5He came forth ... 6into his presence. 
7They met in the wide park of the land. 8Enkidu held fast the door 9with his foot, 10and permitted not Gilgamish to enter. 11They grappled with each other 12goring like an ox. 13The threshold they destroyed. 14The wall they demolished. 15Gilgamish and Enkidu 16grappled with each other, 17goring like an ox. 18The threshold they destroyed. 19The wall they demolished. 20Gilgamish bowed 21to the ground at his feet 22and his javelin reposed. 23He turned back his breast. 24After he had turned back his breast, 25Enkidu unto that one 26spoke, even unto Gilgamish. 27“Even as one6 did thy mother 28bear thee, 29she the wild cow of the cattle stalls, 30Ninsunna, 31whose head she exalted more than a husband. [220] 32Royal power over the people 33Enlil has decreed for thee.” Second tablet. Written upon ... 240 (lines). [221] 1 Literally “he attained my front.” 2 IV¹ of walādu. 3 I.e., in the suburb of Erech. 4 patāḳu has apparently the same sense originally as batāḳu, although the one forms its preterite iptiḳ, and the other ibtuḳ. Cf. also maḫāṣu break, hammer and construct. 5 The passage is obscure. Here šuḫuru is taken as a loan-word from suģur = ḳimmatu, hair of the head. The infinitive II¹ of saḫāru is philologically possible. 6 I.e., an ordinary man. Index to Parts 2 and 3 A. Adab, city, 123, 23. addi, wailing, 117, 31; 137, 22; 161, 12. aḫu, brother, 212, 36. Aja, goddess, 198, 9. al (giš), al-gar (giš), a musical instrument, 187–191. See also No. 20 Rev. 7–12. al-bi, compound verb, 189 n. 6. In Ni. 8164 (unpublished) al-gar, al-gar-balag in list with (giš)-á-lá, also an instrument of music. alad, protecting genius, 154, 18. ameliš, like a man, 215, 25. Amurrû, god. Psalm to, 118; 119. angubba, sentinel, 180, 14. Anu, god. 116, 18:26 ff. 131, 8; 165, 9; 180, 20. Anunnaki, gods, 114, 17:21; 116, 25; 116 n. 7; 128, 13; 135, 31; 189, 21. Anunit, goddess, 158, 12; 166, 2. apunnatu, nostrils, pitiḳ, apunnāti, 217, 28. aṣṣammim (?), 215, 18. Arallû, 132, 26; 134, 7. 
arāmu, cover, 198 n. 2. arāḳu, be pale, Prt. iriku, 217, 31. arḫiš, quickly, 199, 28. Aruru, goddess. Lamentation to, 115. Sister of Enlil, 115, 2; 171, 29; 190, 25. Other references, 116, 13:15:18; 117, 34 f. Asarludug, god, 163, 8; 170, 4. Aš-im-ur, title of Moon-god, 136, 12. áš omitted, No. 19, 2. aš-me, disk, 133, 38. Ašširgi, god, No. 22, Rev. 7. Azagsud, goddess, 196, 30:33; 197, 38. B. Babbar, god, 116, 24; 139, 43; 147, 21; 148, 3; 152. Babylon, city, 158, 14; 160, 6; 163, 8; 166, 4:11. badara, see 200 n. 2. badarani, a weapon, 133, 36. balag, lyre, 138, 52. bansur, table; title of a goddess, 175, 3. Bau, goddess, 179, 2; 181, 30; 182, 32; 141, 7:10. bišîtu, condition, 215, 14. bi’u, cavern, 196, 29. bulukku, crab, 174, 5. burgul, engraver, 185, 8. C. Cutha, city. Center of the cult of Nergal, 167, 15. D. Dada, god, 192, 6. Dagan, West Semitic god, 149, 21. Damu, title of Tammuz, 176, 7. Deification of kings, 106–9; 127 n. 1. dêpu, shatter, 195 n. 16. [222] DI-BAL, ideogram in incantations, 194, 10. Dilbat, city, 167, 16. Dilmun, land and city, 112, 2:4. dimgul, dimdul, master workman, 150. dingir-gal-gal-e-ne, the great gods, the Anunnaki, 114, 21:125; 149, 19. dumu-anna, daughter of heaven, title of Bau, 179, 5; 181, 28; 184, 28. dumu-sag, title of Tašmet, 163, 12. Dungi, king of Ur, liturgy to, 136. dupšakku, trencher basket, 216, 17. Duranki, epithet for Nippur, 122, 18; 180, 11. E. E-anna, temple in Erech, 123, 30; 125; 148, 12; 213, 18. E-babbar, temple of the sun god, 152; 158, 11; 166, 1. Perhaps read E-barra. E-daranna, temple of Enki in Babylon, 169, 25; 170, 29. See BL. 133. edēlu = edēru, be gloomy, 216, 10. é-dub, house of learning, 117, 39. é-gal, palace, No. 19, Rev. 3; 115, 11; 131, 7; 134, 22; 158, 9. é-gig = ḳiṣṣu, 191, 11. E-ibe-Anu, temple in Dilbat, 167, 16. E-kinammaka, temple, 115, 10. E-kišibba, temple in Kish, 166, 13. E-kur, temple, 180, 12; 183, 23; 190, 7; 146, 9; 147, 17; 158, 8; 160, 4; 166, 17; 169, 23. 
Emaḫ, Ešmaḫ, ritual house of the water cult of Marduk, 163, 7; 115, 4. E-malga-sud, temple, 181, 24; 141, 3. E-meteg, daughter of Ninkasi, 144. E-mete-ursag, temple in Kish, 166, 13. E-namtila, temple, 160, 4; 169, 24. en-a-nu-un, en-á-nun, title of Innini and Gula, 173, 2. Enbilulu, title of Marduk, 170, 5. E-ninnû, temple, 181, 22. EN-ḪUL-tim-mu, 194 n. 2. EN-KA-KA, bêl dabābi, 194, 2. Enki, god. Hymn to, No. 20, 113, 7; 114, 10; 116, 21; 122, 7; 149, 16. Enkidu, satyr, 213, 3:7:10:11; 214, 6; 215, 11:12:15:34; 216, 13; 219, 8:15:25; 131, 11; 134, 16; 178, 13. Enlil, god. Liturgy to, 155–184. Regarded as god of light, 157, 1 ff. 158, 3 f. Other references, 114, 19; 115, 2; 116, 19; 131, 6; 136, 5; 139, 40; 149, 22; 146, 3:7:14; 189, 11:19; 220, 33. Enul, god, 149, 16. Enzu, god, 139, 41; 146, 3. epšānu, deeds, 218, 18. epû, be dark, I² itêpû, 196, 29. Erech, city, 125; 149, 13. Erech ribîtim, 212, 28; 213, 15; 217, 19:21; 217, 3:6. eri-azag, holy city, Isin, 141, 8. erida, title, 175, 1. Eridu, city, 113, 20; 136, 13. Erishkigal, goddess, 131, 10; 134, 11. eršagtugmal, penitential psalm, 118. E-sagila, temple, 152. E-sakudkalamma, temple, 166, 10; 169 n. 4. ešendili, a title, 177, 10. [223] eškar, fixed tax, 188, 9. eš-lal, a sacred place, 161, 14. E-temen-anki, temple, 169, 25. E-turkalamma, temple, 166, 14. Euphrates, river, 183, 12; 183, 20. E-zida, temple, 166, 12. Ezina, grain goddess, 174, 9. Ezira, reading of the divine name KA-DI, 177, 11. F. Fara, modern Arabic name for the site of Isin (?), 177 n. 4. G. GAB, baked bread, 200, 33. GAB-LAL, a cake made with honey, 195, 22; 200, 35. GAR-šunnu = epišan-šunu, 198, 13. gašan-gula, title of Ninâ, 119 n. 2. gepar, dark chamber, 123, 30 f., 148, 10; 161, 18. Gibil, god, 197, 3. gi-gál(giš),interlude, 151 n. 1; 182, 33. gigunna, 114, 23. Gilgamish, king of Erech, 207; 211, 1:115 f. 212, 17:37; 213, 2; 217, 21; 218, 9:20:24:29 and below 2; 219, 10;15:20:26. Derivation of name, 208. See also No. 16 Rev. 
II 15; 197, 42; 124 f. gilsa, a sacred relic, 132, 22. Girra, Irra, god, 174, 7; 177, 12. girru, lion, 215, 29. Girsu, city, 181, 23. Guanna, deity, No. 16 Rev. II 18. Guedin, province, 129, 28. Gunura, goddess of healing, 176, 6. gupru, mighty, 214, 33. Gutium, land, 120 ff. H. Hallab, city, 125; 141. ḫanābu, grow thickly, Prs. ibannib, 219, 4. ḫapāpu, embrace, 212, 34. ḫaṣṣinu, axe, 212, 29:31. ḫarbatu, waste place, 200, 39. Harsagkalamma, temple, 166, 14. Hubur, mythical river, 197, 42. ḫûlu, a bird, 199, 31. ḫûḳu, a bird, 199, 31. I. Ibi-Sin, king of Ur, 151 n. 2. ibsi, liturgical expression, 120, 5. Igigi, heaven spirits, 116 n. 6. IGI-NAGIN-NA, 194, 11. imib, weapon, 131, 8. mi-ib, ibid. n.3. imin, seven. Seven lands, 130, 35; seventh day, 134, 18. Immer, god, 177, 8. Indag, god, consort of Gula, 173, 3. Innini, goddess, 123. Liturgy to, 184; 123, 29. Consort of Shamash, 148, 4. Other references, 154, 21. iṣṣur šamê, unclean birds, 195 n. 10. Išhara, goddess, 218, 22. Isin, city, 122, 15; 176, 4. Ishme-Dagan, 178 ff. Son of Enlil, 181, 29; 182, 32. Liturgy to, 143. K. KA-DIB-BI, sibit pî, 194, 10. KAK-DIG, a weapon, 130, 4. kakkitu (?), weapon. Pl. kakkiatum, 218, 16. KAK-SIR, a weapon (?), 130, 4. [121] kalama, the Land, Sumer, 138, 25; 141, 5; 147, 22; 150, 4; 154, 17; 177, 9. kanami=kalama, land, 120, 8. KA-NE, a new ideograph, 153 n. 10. kasû, bind. I² liktisu, 198, 20. Kenurra, chapel of Ninlil, 114, 22; 123, 20; 160, 4; 166, 18; 166, 8; 169, 24. Keš, city, 115, 11; 123, 22. kešda-azag, a relic, 132, 27. ki, kin for gim = kima, 120, 6. KI-AG-MAL, râmu, 194 n. 4. Kidurkazal, daughter of Ninkasi, 145. ki-malla, to bend. tig-zu ki-ma-al-la nu-gí-gí, “Thy neck wearies not in bending,” 168, 2. [Correct the translation.] ki-in-gin, ki-en-gin, Sumer, 115, 24; 134, 19; 189, 17. KI-SAR, ḳaḳḳara tašabbiṭ, 199, 29. Kish, city, 129, 30; 166, 12. é kiš-(ki)-šú, so read, No. 5 Obv. 8. Kullab, city, 149, 14; 173, 1. kunin, gunin, reed basket, 150 n. 3. 
kurgal, “great mountain,” title of Sumer, 114, 11. Of Enlil, 114, 19; 182, 5. KURUN-NA, (amelu), 196, 34. KUŠ-KU-MAL, 194, 11. L. la’aṭu, gore. Prt. ilûdu, 219, 12:17. labu, panther, 215, 29:32. Lagash, city, 181, 23:26. Laḫama, goddess of Chaos, 113, 5. Laws, promulgated by Dungi, 138, 31. Libit-Ishtar, king, 141. libšu, garment, 214, 27:29; 215, 26. Ligirsig, a god, 113, 3. lilazag, epithet of a deified king, 141, 1. Lillaenna, goddess, 192, 5. limēnu, be evil. II¹ ulammenu-inni, 197, 7. Lugal-dīg, god, 197, 5. lu’ûtu, pollution, 195, 19. M. Magan, land, 112, 2:5. mai̭ālu, couch, 218, 22. malāšu, shear, 195, 20. Mamit, 200, 41. mandatu, form, 195, 21. mal-gar (gi), a musical instrument, 191, 10. mangu, disease, 195, 19. Marduk, god, 151. markasu, leader, 150. masû, seize, 195 n. 5. mašû, to forget, 216, 7. Me-azag, daughter of Ninkasi, 144. meḫru, fellow, 218, 21. Meḫuš, daughter of Ninkasi, 144. Meluḫḫa, land, 112, 6. Meslam, temple in Cutha, 167, 15. mesû, a tree, 159, 23. muk, now, but now, 217, 26. Mulgenna, Saturn, 137, 18. Mulmul, gods, 142. N. nâdu, water bottle, 198, 17. nadîtu, temple devotee, 188, 7. nagû, shout. Prs. inangu, 215, 19. nâku, embrace, 218, 26. namaštû, cattle, etc., 213, 12:17; 214, 1; 219, 14. Namtar, god, 197, 3; 132, 24. Nangt, goddess, 192, 7. [225] Nannar, god, 115, 12; 116, 23; 133, 38; 137, 11; 150, 2. Nergal, god, 131, 6. Nidaba, goddess, 191. ni-gál, cattle, 121, 6. nimir = ligir, 174, 4. ninda, linear measure, 133, 41. Ningal, goddess, No. 19, 5; 148, 3; 151, 3. Ningišzida, god, 133, 34. Nin-isinna, goddess, 122, 16; 191, 15. Ninkasi, goddess, 144. Ninki, goddess, 149, 16. Ninlil, goddess, 116, 20; 123, 20; 137, 12; 146, 14. Ninmada, daughter of Ninkasi, 144. Ninmaḫ, goddess, 116, 22. Ninmenna, epithet of Damgalnunna, 190, 27. Ninsun, goddess, 219, 30; 208 n. 6; 129; 131, 16 (?). Nintudri, goddess, 123, 26. Nintudra, 137, 16. Creatress of man and woman, 192. Ninul, goddess, 149, 16. Ninurašâ, god, 191, 12; 146, 12. 
Ninzuanna, goddess, 122, 13. Nippur, city, 112, 8; 122, 18:19; 160, 3; 169, 21; 180, 11; 149, 18; 158, 7; 165, 16. NI-SUR (amelu), 196, 35. Nudimmud, god, 199, 25. No. 20, 10. nugiganna, epithet of Innini, 185, 2. nûn apsi, unclean fish, 195 n. 11. Nunamnirri, god, 190, 28; 146, 13; 180, 10:13:17. nun-ùr, epithet of Amurrû, 119, 3. Nusiligga, daughter of Ninkasi, 144. Nusku, god, 146, 7; 163, 13. P. Pabilsag, god. Son and consort of Gula, 173 n. 3; 176, 5. A form of Tammuz. pananumma, formerly, 217, 25. Panunnaki, goddess, consort of Marduk, 163, 9. patāḳu, fashion, break, 214, 4. paturru, a weapon, 200, 37. Pleiades, 142. R. ratātu, demolish, 219, 19. Rimat ilatNinsun, 208 n. 6; 219, 29. Ruškišag, goddess, 132, 28. RU-TIG, an epithet, 141, 2. S. sa-bar; sa-sud-da, liturgical note, 182, 31. šabšiš, cruelly, 215, 30. Sagilla, temple, 158, 15. E-sagila, 160, 5; 166, 5; 166, 11. šaḫātu, be astounded, 216, 10. Arabic saḫiṭa. ṣai̭āḫatu, desire, comfort, 216, 18. šakāpu, fell. I² išsakpu, 215, 30. ṣalûtu, enmity, 199, 27. Šamaš, god, 197, 4:8; 198, 10:13; 199, 25:31. Šamaš-šum-ukin, king. Incantations for, 193–200; 199, 23. Samsuiluna, king, 151. SAR-DI-DA, a relic, 133, 37. Serpent adversary, 183, 21; 148, 12. Seven, sacred number. Seven gods, 196, 30. Ship, in legend, 113, 2. Silsirsir, a chapel. Sin, god. Hymn to, No. 19. sippu, threshold, 219, 13:18. [226] Sippar, city, 158, 10; 160, 5; 166, 19. sirgidda, long song, 140, 54. Siriš, daughter of Ninkasi, 144. Siriškaš, daughter of Ninkasi, 144. Siriškašgig, daughter of Ninkasi, 144. sirsagga, first melody, 117, 28; 139, 48. ŠU-AN = kat ili, 194, 12. See also ŠU-dINNINI, 194, 12. ŠU-NAM-ERIM-MA, 194, 13. ŠU-NAM-LU-GAL-LU, 194, 13. subura, earth, 175, 3. su-ud, sú-ud-ám, epithet of goddess of Šuruppak, 177, 10 and note 4. šuḫuru, hair (?), 215, 23. sukkal-zid, title of Nebo, 163, 10. Šulpae, god, No. 16 II 22. Sumer, land, 113, 21; 114, 11; 136, 2. sumugan, title of Girra, 177, 12 and note; 179, 3. T. 
Tablet of fates, 132 n. 3. Tammuz, ancient ruler, 208. Liturgy to, 191. Other references, 126; 208; 131, 20. tapāšu, seize, capture, II² uttappiš, 215, 31. temēru, cook, 196, 35. Tigris, river, 183, 12. Tummal, land, 190, 9; 191, 10. U. ud, spirit, word, 150, 1:4; 158, 16; 159, 17:24. ul-al-tar, 191 n. 6. ulinnu, girdle cord, 195, 20. Ulmaš, temple of Anunit, 158, 13; 166, 3. Ur, city, 134, 21; 137, 6. Lamentation for, 150. Other references, No. 19, 4:7:8:16:28: Rev. 5; 151, 3. Ur-azag, king of Isin (?), 140 n. 2. Ur-Engur, king of Ur, 126 ff. urinu, spear (?), 173, 3. ursaggal, epithet for Ninurašā, 165, 11. For Enbilulu, 170, 5. ušumgal, 117, 33. Z. zâbu, flow. li-zu-bu, 198, 16. Cf. gàm = za’ibu, miṭirtu, words for canal, SAI. 691–3. zag-sal, liturgical note, 103 f. No. 21 end. za-am, 138, 34; 139, 38; 140, 56. zênu, be enraged, II¹ uzinu-inni, 197, 6. ZI-TAR-RU-DA = nikis napišti, 194 n. 6. [124] Description of Tablets Number in this volume. 1 Museum number. 7771 Description. Dark brown unbaked tablet. Three columns. Lower edge slightly broken. Knobs at left upper and left lower corners to facilitate the holding of the tablet. H. 7 inches: W. 6½; T. 1½. Second tablet of the Epic of Gilgamish. [125] Autograph Plates Plate LXIII. Plate LXIV. Plate LXV. Plate LXVI. Plate LXVII. Plate LXVIII. Plate LXIX. Tablet of the Gilgamish Epic (Obverse) Plate LXX. Tablet of the Gilgamish Epic (Reverse) *** END OF THE PROJECT GUTENBERG EBOOK THE EPIC OF GILGAMISH ***

      Compared with other versions of the Babylonian text, this one focuses more on the time Gilgamesh spent with Enkidu and on how great a companion Enkidu was to him. Dreams were seen as divine in Mesopotamian culture, so it is interesting that Gilgamesh was able to foresee Enkidu's arrival. The dream shows that Gilgamesh was destined to find his equal, and that finding him was part of the journey that shaped his identity. Enkidu's gradual taming and his adoption of societal norms also show that outsiders can be assimilated, something many nations need in order to be successful and functional. Enkidu is one instance of an us-versus-them situation: at first he was a wild man, very different from "them," the people of Uruk, who were calm and controlled. His transition to becoming like the others was crucial if he wanted to be Gilgamesh's companion and a figure respected by others. It also points to the fact that people need to be like others, to some extent, in order to be liked and respected. Because the only prominent characters, Enkidu and Gilgamesh, are both male, the text suggests that women of this period were regarded as inferior to some extent and did not receive the same respect, since they were unable to showcase their own skills or talents. This may have affected the way the text is written, because a male perspective can contrast with that of women, who tend to be more caring and honest. In addition, other translations of the text say more about Gilgamesh's longing for immortality, so the lack of information on that aspect here creates a more biased view of Enkidu and alters the way Gilgamesh is viewed as well.
The low point of the text comes near the beginning, when the people complain about Gilgamesh's rule because he does not yet have the qualities of a good leader that he gains later on. The text reaches its high point when Gilgamesh sees Enkidu as his equal and embraces him as a companion, which allows him to be a much better leader and leaves the people of his land better off as a result. CC BY Ajey Sasimugunthan

    1. The Gilgamesh Epic is the most notable literary product of Babylonia as yet discovered in the mounds of Mesopotamia. It recounts the exploits and adventures of a favorite hero, and in its final form covers twelve tablets, each tablet consisting of six columns (three on the obverse and three on the reverse) of about 50 lines for each column, or a total of about 3600 lines. Of this total, however, barely more than one-half has been found among the remains of the great collection of cuneiform tablets gathered by King Ashurbanapal (668–626 B.C.) in his palace at Nineveh, and discovered by Layard in 18541 in the course of his excavations of the mound Kouyunjik (opposite Mosul). The fragments of the epic painfully gathered—chiefly by George Smith—from the circa 30,000 tablets and bits of tablets brought to the British Museum were published in model form by Professor Paul Haupt;2 and that edition still remains the primary source for our study of the Epic. [10] For the sake of convenience we may call the form of the Epic in the fragments from the library of Ashurbanapal the Assyrian version, though like most of the literary productions in the library it not only reverts to a Babylonian original, but represents a late copy of a much older original. The absence of any reference to Assyria in the fragments recovered justifies us in assuming that the Assyrian version received its present form in Babylonia, perhaps in Erech; though it is of course possible that some of the late features, particularly the elaboration of the teachings of the theologians or schoolmen in the eleventh and twelfth tablets, may have been produced at least in part under Assyrian influence. A definite indication that the Gilgamesh Epic reverts to a period earlier than Hammurabi (or Hammurawi)3 i.e., beyond 2000 B. C., was furnished by the publication of a text clearly belonging to the first Babylonian dynasty (of which Hammurabi was the sixth member) in CT. 
VI, 5; which text Zimmern4 recognized as a part of the tale of Atra-ḫasis, one of the names given to the survivor of the deluge, recounted on the eleventh tablet of the Gilgamesh Epic.5 This was confirmed by the discovery6 of a [11]fragment of the deluge story dated in the eleventh year of Ammisaduka, i.e., c. 1967 B.C. In this text, likewise, the name of the deluge hero appears as Atra-ḫasis (col. VIII, 4).7 But while these two tablets do not belong to the Gilgamesh Epic and merely introduce an episode which has also been incorporated into the Epic, Dr. Bruno Meissner in 1902 published a tablet, dating, as the writing and the internal evidence showed, from the Hammurabi period, which undoubtedly is a portion of what by way of distinction we may call an old Babylonian version.8 It was picked up by Dr. Meissner at a dealer’s shop in Bagdad and acquired for the Berlin Museum. The tablet consists of four columns (two on the obverse and two on the reverse) and deals with the hero’s wanderings in search of a cure from disease with which he has been smitten after the death of his companion Enkidu. The hero fears that the disease will be fatal and longs to escape death. It corresponds to a portion of Tablet X of the Assyrian version. Unfortunately, only the lower portion of the obverse and the upper of the reverse have been preserved (57 lines in all); and in default of a colophon we do not know the numeration of the tablet in this old Babylonian edition. Its chief value, apart from its furnishing a proof for the existence of the Epic as early as 2000 B. C., lies (a) in the writing Gish instead of Gish-gi(n)-mash in the Assyrian version, for the name of the hero, (b) in the writing En-ki-dũ—abbreviated from dũg—() “Enki is good” for En-ki-dú () in the Assyrian version,9 and (c) in the remarkable address of the maiden Sabitum, dwelling at the seaside, to whom Gilgamesh comes in the course of his wanderings. 
From the Assyrian version we know that the hero tells the maiden of his grief for his lost companion, and of his longing to escape the dire fate of Enkidu. In the old Babylonian fragment the answer of Sabitum is given in full, and the sad note that it strikes, showing how hopeless it is for man to try to escape death which is in store for all mankind, is as remarkable as is the philosophy of “eat, drink and be merry” which Sabitum imparts. The address indicates how early the tendency arose to attach to ancient tales the current religious teachings. [12] “Why, O Gish, does thou run about? The life that thou seekest, thou wilt not find. When the gods created mankind, Death they imposed on mankind; Life they kept in their power. Thou, O Gish, fill thy belly, Day and night do thou rejoice, Daily make a rejoicing! Day and night a renewal of jollification! Let thy clothes be clean, Wash thy head and pour water over thee! Care for the little one who takes hold of thy hand! Let the wife rejoice in thy bosom!” Such teachings, reminding us of the leading thought in the Biblical Book of Ecclesiastes,10 indicate the didactic character given to ancient tales that were of popular origin, but which were modified and elaborated under the influence of the schools which arose in connection with the Babylonian temples. The story itself belongs, therefore, to a still earlier period than the form it received in this old Babylonian version. The existence of this tendency at so early a date comes to us as a genuine surprise, and justifies the assumption that the attachment of a lesson to the deluge story in the Assyrian version, to wit, the limitation in attainment of immortality to those singled out by the gods as exceptions, dates likewise from the old Babylonian period. The same would apply to the twelfth tablet, which is almost entirely didactic, intended to illustrate the impossibility of learning anything of the fate of those who have passed out of this world. 
It also emphasizes the necessity of contenting oneself with the comfort that the care of the dead, by providing burial and food and drink offerings for them affords, as the only means of ensuring for them rest and freedom from the pangs of hunger and distress. However, it is of course possible that the twelfth tablet, which impresses one as a supplement to the adventures of Gilgamesh, ending with his return to Uruk (i.e., Erech) at the close of the eleventh tablet, may represent a later elaboration of the tendency to connect religious teachings with the exploits of a favorite hero. [13] We now have further evidence both of the extreme antiquity of the literary form of the Gilgamesh Epic and also of the disposition to make the Epic the medium of illustrating aspects of life and the destiny of mankind. The discovery by Dr. Arno Poebel of a Sumerian form of the tale of the descent of Ishtar to the lower world and her release11—apparently a nature myth to illustrate the change of season from summer to winter and back again to spring—enables us to pass beyond the Akkadian (or Semitic) form of tales current in the Euphrates Valley to the Sumerian form. Furthermore, we are indebted to Dr. Langdon for the identification of two Sumerian fragments in the Nippur Collection which deal with the adventures of Gilgamesh, one in Constantinople,12 the other in the collection of the University of Pennsylvania Museum.13 The former, of which only 25 lines are preserved (19 on the obverse and 6 on the reverse), appears to be a description of the weapons of Gilgamesh with which he arms himself for an encounter—presumably the encounter with Ḫumbaba or Ḫuwawa, the ruler of the cedar forest in the mountain.14 The latter deals with the building operations of Gilgamesh in the city of Erech. A text in Zimmern’s Sumerische Kultlieder aus altbabylonischer Zeit (Leipzig, 1913), No. 
196, appears likewise to be a fragment of the Sumerian version of the Gilgamesh Epic, bearing on the episode of Gilgamesh’s and Enkidu’s relations to the goddess Ishtar, covered in the sixth and seventh tablets of the Assyrian version.15 Until, however, further fragments shall have turned up, it would be hazardous to institute a comparison between the Sumerian and the Akkadian versions. All that can be said for the present is that there is every reason to believe in the existence of a literary form of the Epic in Sumerian which presumably antedated the Akkadian recension, [14]just as we have a Sumerian form of Ishtar’s descent into the nether world, and Sumerian versions of creation myths, as also of the Deluge tale.16 It does not follow, however, that the Akkadian versions of the Gilgamesh Epic are translations of the Sumerian, any more than that the Akkadian creation myths are translations of a Sumerian original. Indeed, in the case of the creation myths, the striking difference between the Sumerian and Akkadian views of creation17 points to the independent production of creation stories on the part of the Semitic settlers of the Euphrates Valley, though no doubt these were worked out in part under Sumerian literary influences. The same is probably true of Deluge tales, which would be given a distinctly Akkadian coloring in being reproduced and steadily elaborated by the Babylonian literati attached to the temples. The presumption is, therefore, in favor of an independent literary origin for the Semitic versions of the Gilgamesh Epic, though naturally with a duplication of the episodes, or at least of some of them, in the Sumerian narrative. Nor does the existence of a Sumerian form of the Epic necessarily prove that it originated with the Sumerians in their earliest home before they came to the Euphrates Valley. 
They may have adopted it after their conquest of southern Babylonia from the Semites who, there are now substantial grounds for believing, were the earlier settlers in the Euphrates Valley.18 We must distinguish, therefore, between the earliest literary form, which was undoubtedly Sumerian, and the origin of the episodes embodied in the Epic, including the chief actors, Gilgamesh and his companion Enkidu. It will be shown that one of the chief episodes, the encounter of the two heroes with a powerful guardian or ruler of a cedar forest, points to a western region, more specifically to Amurru, as the scene. The names of the two chief actors, moreover, appear to have been “Sumerianized” by an artificial process,19 and if this view turns out to be [15]correct, we would have a further ground for assuming the tale to have originated among the Akkadian settlers and to have been taken over from them by the Sumerians. New light on the earliest Babylonian version of the Epic, as well as on the Assyrian version, has been shed by the recovery of two substantial fragments of the form which the Epic had assumed in Babylonia in the Hammurabi period. The study of this important new material also enables us to advance the interpretation of the Epic and to perfect the analysis into its component parts. In the spring of 1914, the Museum of the University of Pennsylvania acquired by purchase a large tablet, the writing of which as well as the style and the manner of spelling verbal forms and substantives pointed distinctly to the time of the first Babylonian dynasty. The tablet was identified by Dr. Arno Poebel as part of the Gilgamesh Epic; and, as the colophon showed, it formed the second tablet of the series. 
He copied it with a view to publication, but the outbreak of the war which found him in Germany—his native country—prevented him from carrying out this intention.20 He, however, utilized some of its contents in his discussion of the historical or semi-historical traditions about Gilgamesh, as revealed by the important list of partly mythical and partly historical dynasties, found among the tablets of the Nippur collection, in which Gilgamesh occurs21 as a King of an Erech dynasty, whose father was Â, a priest of Kulab.22 The publication of the tablet was then undertaken by Dr. Stephen Langdon in monograph form under the title, “The Epic of Gilgamish.”23 In a preliminary article on the tablet in the Museum Journal, Vol. VIII, pages 29–38, Dr. Langdon took the tablet to be of the late [16]Persian period (i.e., between the sixth and third century B. C.), but his attention having been called to this error of some 1500 years, he corrected it in his introduction to his edition of the text, though he neglected to change some of his notes in which he still refers to the text as “late.”24 In addition to a copy of the text, accompanied by a good photograph, Dr. Langdon furnished a transliteration and translation with some notes and a brief introduction. The text is unfortunately badly copied, being full of errors; and the translation is likewise very defective. A careful collation with the original tablet was made with the assistance of Dr. Edward Chiera, and as a consequence we are in a position to offer to scholars a correct text. We beg to acknowledge our obligations to Dr. Gordon, the Director of the Museum of the University of Pennsylvania, for kindly placing the tablet at our disposal. Instead of republishing the text, I content myself with giving a full list of corrections in the appendix to this volume which will enable scholars to control our readings, and which will, I believe, justify the translation in the numerous passages in which it deviates from Dr. 
Langdon’s rendering. While credit should be given to Dr. Langdon for having made this important tablet accessible, the interests of science demand that attention be called to his failure to grasp the many important data furnished by the tablet, which escaped him because of his erroneous readings and faulty translations. The tablet, consisting of six columns (three on the obverse and three on the reverse), comprised, according to the colophon, 240 lines²⁵ and formed the second tablet of the series. Of the total, 204 lines are preserved in full or in part, and of the missing thirty-six quite a number can be restored, so that we have a fairly complete tablet. The most serious break occurs at the top of the reverse, where about eight lines are missing. In consequence of this the connection between the end of the obverse (where about five lines are missing) and the beginning of the reverse is obscured, though not to the extent of our entirely losing the thread of the narrative. [17] About the same time that the University of Pennsylvania Museum purchased this second tablet of the Gilgamesh Series, Yale University obtained a tablet from the same dealer, which turned out to be a continuation of the University of Pennsylvania tablet. That the two belong to the same edition of the Epic is shown by their agreement in the dark brown color of the clay, in the writing as well as in the size of the tablet, though the characters on the Yale tablet are somewhat cramped and in consequence more difficult to read. Both tablets consist of six columns, three on the obverse and three on the reverse. The measurements of both are about the same, the Pennsylvania tablet being estimated at about 7 inches high, as against 7 2/16 inches for the Yale tablet, while the width of both is 6½ inches. The Yale tablet is, however, more closely written and therefore has a larger number of lines than the Pennsylvania tablet.
The colophon to the Yale tablet is unfortunately missing, but from internal evidence it is quite certain that the Yale tablet follows immediately upon the Pennsylvania tablet and, therefore, may be set down as the third of the series. The obverse is very badly preserved, so that only a general view of its contents can be secured. The reverse contains serious gaps in the first and second columns. The scribe evidently had a copy before him which he tried to follow exactly, but finding that he could not get all of the copy before him in the six columns, he continued the last column on the edge. In this way we obtain for the sixth column 64 lines as against 45 for column IV, and 47 for column V, and a total of 292 lines for the six columns. Subtracting the 16 lines written on the edge leaves us 276 lines for our tablet as against 240 for its companion. The width of each column being the same on both tablets, the difference of 36 lines is made up by the closer writing. Both tablets have peculiar knobs at the sides, the purpose of which is evidently not to facilitate holding the tablet in one’s hand while writing or reading it, as Langdon assumed26 (it would be quite impracticable for this purpose), but simply to protect the tablet in its position on a shelf, where it would naturally be placed on the edge, just as we arrange books on a shelf. Finally be it noted that these two tablets of the old Babylonian version do not belong to the same edition as the Meissner tablet above described, for the latter consists [18]of two columns each on obverse and reverse, as against three columns each in the case of our two tablets. We thus have the interesting proof that as early as 2000 B.C. there were already several editions of the Epic. As to the provenance of our two tablets, there are no definite data, but it is likely that they were found by natives in the mounds at Warka, from which about the year 1913, many tablets came into the hands of dealers. 
It is likely that where two tablets of a series were found, others of the series were also dug up, and we may expect to find some further portions of this old Babylonian version turning up in the hands of other dealers or in museums. Coming to the contents of the two tablets, the Pennsylvania tablet deals with the meeting of the two heroes, Gilgamesh and Enkidu, their conflict, followed by their reconciliation, while the Yale tablet in continuation takes up the preparations for the encounter of the two heroes with the guardian of the cedar forest, Ḫumbaba—but probably pronounced Ḫubaba27—or, as the name appears in the old Babylonian version, Ḫuwawa. The two tablets correspond, therefore, to portions of Tablets I to V of the Assyrian version;28 but, as will be shown in detail further on, the number of completely parallel passages is not large, and the Assyrian version shows an independence of the old Babylonian version that is larger than we had reason to expect. In general, it may be said that the Assyrian version is more elaborate, which points to its having received its present form at a considerably later period than the old Babylonian version.29 On the other hand, we already find in the Babylonian version the tendency towards repetition, which is characteristic of Babylonian-Assyrian tales in general. Through the two Babylonian tablets we are enabled to fill out certain details [19]of the two episodes with which they deal: (1) the meeting of Gilgamesh and Enkidu, and (2) the encounter with Ḫuwawa; while their greatest value consists in the light that they throw on the gradual growth of the Epic until it reached its definite form in the text represented by the fragments in Ashurbanapal’s Library. Let us now take up the detailed analysis, first of the Pennsylvania tablet and then of the Yale tablet. The Pennsylvania tablet begins with two dreams recounted by Gilgamesh to his mother, which the latter interprets as presaging the coming of Enkidu to Erech. 
In the one, something like a heavy meteor falls from heaven upon Gilgamesh and almost crushes him. With the help of the heroes of Erech, Gilgamesh carries the heavy burden to his mother Ninsun. The burden, his mother explains, symbolizes some one who, like Gilgamesh, is born in the mountains, to whom all will pay homage and of whom Gilgamesh will become enamoured with a love as strong as that for a woman. In a second dream, Gilgamesh sees some one who is like him, who brandishes an axe, and with whom he falls in love. This personage, the mother explains, is again Enkidu. Langdon is of the opinion that these dreams are recounted to Enkidu by a woman with whom Enkidu cohabits for six days and seven nights and who weans Enkidu from association with animals. This, however, cannot be correct. The scene between Enkidu and the woman must have been recounted in detail in the first tablet, as in the Assyrian version,30 whereas here in the second tablet we have the continuation of the tale with Gilgamesh recounting his dreams directly to his mother. The story then continues with the description of the coming of Enkidu, conducted by the woman to the outskirts of Erech, where food is given him. The main feature of the incident is the conversion of Enkidu to civilized life. Enkidu, who hitherto had gone about naked, is clothed by the woman. Instead of sucking milk and drinking from a trough like an animal, food and strong drink are placed before him, and he is taught how to eat and drink in human fashion. In human fashion he also becomes drunk, and his “spree” is naïvely described: “His heart became glad and his face shone.”31 [20]Like an animal, Enkidu’s body had hitherto been covered with hair, which is now shaved off. He is anointed with oil, and clothed “like a man.” Enkidu becomes a shepherd, protecting the fold against wild beasts, and his exploit in dispatching lions is briefly told. 
At this point—the end of column 3 (on the obverse), i.e., line 117, and the beginning of column 4 (on the reverse), i.e., line 131—a gap of 13 lines—the tablet is obscure, but apparently the story of Enkidu’s gradual transformation from savagery to civilized life is continued, with stress upon his introduction to domestic ways with the wife chosen or decreed for him, and with work as part of his fate. All this has no connection with Gilgamesh, and it is evident that the tale of Enkidu was originally an independent tale to illustrate the evolution of man’s career and destiny, how through intercourse with a woman he awakens to the sense of human dignity, how he becomes accustomed to the ways of civilization, how he passes through the pastoral stage to higher walks of life, how the family is instituted, and how men come to be engaged in the labors associated with human activities. In order to connect this tale with the Gilgamesh story, the two heroes are brought together; the woman taking on herself, in addition to the rôle of civilizer, that of the medium through which Enkidu is brought to Gilgamesh. The woman leads Enkidu from the outskirts of Erech into the city itself, where the people on seeing him remark upon his likeness to Gilgamesh. He is the very counterpart of the latter, though somewhat smaller in stature. There follows the encounter between the two heroes in the streets of Erech, where they engage in a fierce combat. Gilgamesh is overcome by Enkidu and is enraged at being thrown to the ground. The tablet closes with the endeavor of Enkidu to pacify Gilgamesh. Enkidu declares that the mother of Gilgamesh has exalted her son above the ordinary mortal, and that Enlil himself has singled him out for royal prerogatives. After this, we may assume, the two heroes become friends and together proceed to carry out certain exploits, the first of which is an attack upon the mighty guardian of the cedar forest. 
This is the main episode in the Yale tablet, which, therefore, forms the third tablet of the old Babylonian version. In the first column of the obverse of the Yale tablet, which is badly preserved, it would appear that the elders of Erech (or perhaps the people) are endeavoring to dissuade Gilgamesh from making the [21]attempt to penetrate to the abode of Ḫuwawa. If this is correct, then the close of the first column may represent a conversation between these elders and the woman who accompanies Enkidu. It would be the elders who are represented as “reporting the speech to the woman,” which is presumably the determination of Gilgamesh to fight Ḫuwawa. The elders apparently desire Enkidu to accompany Gilgamesh in this perilous adventure, and with this in view appeal to the woman. In the second column after an obscure reference to the mother of Gilgamesh—perhaps appealing to the sun-god—we find Gilgamesh and Enkidu again face to face. From the reference to Enkidu’s eyes “filled with tears,” we may conclude that he is moved to pity at the thought of what will happen to Gilgamesh if he insists upon carrying out his purpose. Enkidu, also, tries to dissuade Gilgamesh. This appears to be the main purport of the dialogue between the two, which begins about the middle of the second column and extends to the end of the third column. Enkidu pleads that even his strength is insufficient, “My arms are lame, My strength has become weak.” (lines 88–89) Gilgamesh apparently asks for a description of the terrible tyrant who thus arouses the fear of Enkidu, and in reply Enkidu tells him how at one time, when he was roaming about with the cattle, he penetrated into the forest and heard the roar of Ḫuwawa which was like that of a deluge. The mouth of the tyrant emitted fire, and his breath was death. 
It is clear, as Professor Haupt has suggested,32 that Enkidu furnishes the description of a volcano in eruption, with its mighty roar, spitting forth fire and belching out a suffocating smoke. Gilgamesh is, however, undaunted and urges Enkidu to accompany him in the adventure. “I will go down to the forest,” says Gilgamesh, if the conjectural restoration of the line in question (l. 126) is correct. Enkidu replies by again drawing a lurid picture of what will happen “When we go (together) to the forest…….” This speech of Enkidu is continued on the reverse. In reply Gilgamesh emphasizes his reliance upon the good will of Shamash and reproaches Enkidu with cowardice. He declares himself superior to Enkidu’s warning, and in bold terms [22]says that he prefers to perish in the attempt to overcome Ḫuwawa rather than abandon it. “Wherever terror is to be faced, Thou, forsooth, art in fear of death. Thy prowess lacks strength. I will go before thee, Though thy mouth shouts to me: ‘thou art afraid to approach,’ If I fall, I will establish my name.” (lines 143–148) There follows an interesting description of the forging of the weapons for the two heroes in preparation for the encounter.33 The elders of Erech when they see these preparations are stricken with fear. They learn of Ḫuwawa’s threat to annihilate Gilgamesh if he dares to enter the cedar forest, and once more try to dissuade Gilgamesh from the undertaking. “Thou art young, O Gish, and thy heart carries thee away, Thou dost not know what thou proposest to do.” (lines 190–191) They try to frighten Gilgamesh by repeating the description of the terrible Ḫuwawa. Gilgamesh is still undaunted and prays to his patron deity Shamash, who apparently accords him a favorable “oracle” (têrtu). The two heroes arm themselves for the fray, and the elders of Erech, now reconciled to the perilous undertaking, counsel Gilgamesh to take provision along for the undertaking. 
They urge Gilgamesh to allow Enkidu to take the lead, for “He is acquainted with the way, he has trodden the road [to] the entrance of the forest.” (lines 252–253) The elders dismiss Gilgamesh with fervent wishes that Enkidu may track out the “closed path” for Gilgamesh, and commit him to the care of Lugalbanda—here perhaps an epithet of Shamash. They advise Gilgamesh to perform certain rites, to wash his feet in the stream of Ḫuwawa and to pour out a libation of water to Shamash. Enkidu follows in a speech likewise intended to encourage the hero; and with the actual beginning of the expedition against Ḫuwawa the tablet ends. The encounter itself, with the triumph of the two heroes, must have been described in the fourth tablet. [23] Now before taking up the significance of the additions to our knowledge of the Epic gained through these two tablets, it will be well to discuss the forms in which the names of the two heroes and of the ruler of the cedar forest occur in our tablets. As in the Meissner fragment, the chief hero is invariably designated as dGish in both the Pennsylvania and Yale tablets; and we may therefore conclude that this was the common form in the Hammurabi period, as against the writing dGish-gì(n)-mash34 in the Assyrian version. Similarly, as in the Meissner fragment, the second hero’s name is always written En-ki-dũ35 (abbreviated from dúg) as against En-ki-dú in the Assyrian version. Finally, we encounter in the Yale tablet for the first time the writing Ḫu-wa-wa as the name of the guardian of the cedar forest, as against Ḫum-ba-ba in the Assyrian version, though in the latter case, as we may now conclude from the Yale tablet, the name should rather be read Ḫu-ba-ba.36 The variation in the writing of the latter name is interesting as pointing to the aspirate pronunciation of the labial in both instances. 
The name would thus present a complete parallel to the Hebrew name Ḫowawa (or Ḫobab) who appears as the brother-in-law of Moses in the P document, Numbers 10, 29.37 Since the name also occurs, written precisely as in the Yale tablet, among the “Amoritic” names in the important lists published by Dr. Chiera,38 there can be no doubt that [24]Ḫuwawa or Ḫubaba is a West Semitic name. This important fact adds to the probability that the “cedar forest” in which Ḫuwawa dwells is none other than the Lebanon district, famed since early antiquity for its cedars. This explanation of the name Ḫuwawa disposes of suppositions hitherto brought forward for an Elamitic origin. Gressmann39 still favors such an origin, though realizing that the description of the cedar forest points to the Amanus or Lebanon range. In further confirmation of the West Semitic origin of the name, we have in Lucian, De Dea Syria, § 19, the name Kombabos40 (the guardian of Stratonika), which forms a perfect parallel to Ḫu(m)baba. Of the important bearings of this western character of the name Ḫuwawa on the interpretation and origin of the Gilgamesh Epic, suggesting that the episode of the encounter between the tyrant and the two heroes rests upon a tradition of an expedition against the West or Amurru land, we shall have more to say further on. The variation in the writing of the name Enkidu is likewise interesting. It is evident that the form in the old Babylonian version with the sign dũ (i.e., dúg) is the original, for it furnishes us with a suitable etymology “Enki is good.” The writing with dúg, pronounced dū, also shows that the sign dú as the third element in the form which the name has in the Assyrian version is to be read dú, and that former readings like Ea-bani must be definitely abandoned.41 The form with dú is clearly a phonetic writing of the Sumerian name, the sign dú being chosen to indicate the pronunciation (not the ideograph) of the third element dúg. 
This is confirmed by the writing En-gi-dú in the syllabary CT XVIII, 30, 10. The phonetic writing is, therefore, a warning against any endeavor to read the name by an Akkadian transliteration of the signs. This would not of itself prove that Enkidu is of Sumerian origin, for it might well be that the writing En-ki-dú is an endeavor to give a Sumerian aspect to a name that may have been foreign. The element dúg corresponds to the Semitic ṭâbu, “good,” and En-ki being originally a designation of a deity as the “lord of the land,” which would be the Sumerian [25] manner of indicating a Semitic Baal, it is not at all impossible that En-ki-dúg may be the “Sumerianized” form of a Semitic בַּעַל טוֹב, “Baal is good.” It will be recalled that in the third column of the Yale tablet, Enkidu speaks of himself in his earlier period while still living with cattle, as wandering into the cedar forest of Ḫuwawa, while in another passage (ll. 252–253) he is described as “acquainted with the way … to the entrance of the forest.” This would clearly point to the West as the original home of Enkidu. We are thus led once more to Amurru (taken as a general designation of the West) as playing an important role in the Gilgamesh Epic.⁴² If Gilgamesh’s expedition against Ḫuwawa of the Lebanon district recalls a Babylonian campaign against Amurru, Enkidu’s coming from his home, where, as we read repeatedly in the Assyrian version, “He ate herbs with the gazelles, Drank out of a trough with cattle,”⁴³ may rest on a tradition of an Amorite invasion of Babylonia. The fight between Gilgamesh and Enkidu would fit in with this tradition, while the subsequent reconciliation would be the form in which the tradition would represent the enforced union between the invaders and the older settlers.
Leaving this aside for the present, let us proceed to a consideration of the relationship of the form dGish, for the chief personage in the Epic in the old Babylonian version, to dGish-gi(n)-mash in the Assyrian version. Of the meaning of Gish there is fortunately no doubt. It is clearly the equivalent of the Akkadian zikaru, “man” (Brünnow No. 5707), or possibly rabû, “great” (Brünnow No. 5704). Among various equivalents, the preference is to be given to itlu, “hero.” The determinative for deity stamps the person so designated as deified, or as in part divine, and this is in accord with the express statement in the Assyrian version of the Gilgamesh Epic which describes the hero as “Two-thirds god and one-third human.”⁴⁴ [26] Gish is, therefore, the hero-god par excellence; and this shows that we are not dealing with a genuine proper name, but rather with a descriptive attribute. Proper names are not formed in this way, either in Sumerian or Akkadian. Now what relation does this form Gish bear to the sign group [cuneiform not reproduced] with which the name of the hero is invariably written in the Assyrian version, the form which was at first read dIz-tu-bar or dGish-du-bar by scholars, until Pinches found in a neo-Babylonian syllabary⁴⁵ the equation of it with Gi-il-ga-mesh? Pinches’ discovery pointed conclusively to the popular pronunciation of the hero’s name as Gilgamesh; and since Aelian (De natura Animalium XII, 2) mentions a Babylonian personage Gilgamos (though what he tells us of Gilgamos does not appear in our Epic, but seems to apply to Etana, another figure of Babylonian mythology), there seemed to be no further reason to question that the problem had been solved. Besides, in a later Syriac list of Babylonian kings found in the Scholia of Theodor bar Koni, the name גלמגום with a variant גמיגמוס occurs,⁴⁶ and it is evident that we have here again the Gi-il-ga-mesh, discovered by Pinches.
The existence of an old Babylonian hero Gilgamesh who was likewise a king is thus established, as well as his identification with the sign group [cuneiform not reproduced]. It is evident that we cannot read this name as Iz-tu-bar or Gish-du-bar, but that we must read the first sign as Gish and the third as Mash, while for the second we must assume a reading Gìn or Gi. This would give us Gish-gì(n)-mash which is clearly again (like En-ki-dú) not an etymological writing but a phonetic one, intended to convey an approach to the popular pronunciation. Gi-il-ga-mesh might well be merely a variant for Gish-ga-mesh, or vice versa, and this would come close to Gish-gi-mash. Now, when we have a name the pronunciation of which is not definite but approximate, and which is written in various ways, the probabilities are that the name is foreign. A foreign name might naturally be spelled in various ways. The [27] Epic in the Assyrian version clearly depicts dGish-gì(n)-mash as a conqueror of Erech, who forces the people into subjection, and whose autocratic rule leads the people of Erech to implore the goddess Aruru to create a rival to him who may withstand him. In response to this appeal dEnkidu is formed out of dust by Aruru and eventually brought to Erech.⁴⁷ Gish-gì(n)-mash or Gilgamesh is therefore in all probability a foreigner; and the simplest solution suggested by the existence of the two forms (1) Gish in the old Babylonian version and (2) Gish-gì(n)-mash in the Assyrian version, is to regard the former as an abbreviation, which seemed appropriate, because the short name conveyed the idea of the “hero” par excellence.
If Gish-gì(n)-mash is a foreign name, one would think in the first instance of Sumerian; but here we encounter a difficulty in the circumstance that outside of the Epic this conqueror and ruler of Erech appears in quite a different form, namely, as dGish-bil-ga-mesh, with dGish-gibil(or bìl)-ga-mesh and dGish-bil-ge-mesh as variants.48 In the remarkable list of partly mythological and partly historical dynasties, published by Poebel,49 the fifth member of the first dynasty of Erech appears as dGish-bil-ga-mesh; and similarly in an inscription of the days of Sin-gamil, dGish-bil-ga-mesh is mentioned as the builder of the wall of Erech.50 Moreover, in the several fragments of the Sumerian version of the Epic we have invariably the form dGish-bil-ga-mesh. It is evident, therefore, that this is the genuine form of the name in Sumerian and presumably, therefore, the oldest form. By way of further confirmation we have in the syllabary above referred to, CT, XVIII, 30, 6–8, three designations of our hero, viz: dGish-gibil(or bíl)-ga-mesh muḳ-tab-lu (“warrior”) a-lik pa-na (“leader”) All three designations are set down as the equivalent of the Sumerian Esigga imin i.e., “the seven-fold hero.” [28] Of the same general character is the equation in another syllabary:51 Esigga-tuk and its equivalent Gish-tuk = “the one who is a hero.” Furthermore, the name occurs frequently in “Temple” documents of the Ur dynasty in the form dGish-bil-ga-mesh52 with dGish-bil-gi(n)-mesh as a variant.53 In a list of deities (CT XXV, 28, K 7659) we likewise encounter dGish-gibil(or bíl)-ga-mesh, and lastly in a syllabary we have the equation54 dGish-gi-mas-[si?] = dGish-bil-[ga-mesh]. The variant Gish-gibil for Gish-bil may be disposed of readily, in view of the frequent confusion or interchange of the two signs Bil (Brünnow No. 4566) and Gibil or Bíl (Brünnow No. 4642) which has also the value Gi (Brünnow 4641), so that we might also read Gish-gi-ga-mesh. 
Both signs convey the idea of “fire,” “renew,” etc.; both revert to the picture of flames of fire, in the one case with a bowl (or some such object) above it, in the other the flames issuing apparently from a torch.55 The meaning of the name is not affected whether we read dGish-bil-ga-mesh or dGish-gibil(or bíl)-ga-mesh, for the middle element in the latter case being identical with the fire-god, written dBil-gi and to be pronounced in the inverted form as Gibil with -ga (or ge) as the phonetic complement; it is equivalent, therefore, to the writing bil-ga in the former case. Now Gish-gibil or Gish-bíl conveys the idea of abu, “father” (Brünnow No. 5713), just as Bil (Brünnow No. 4579) has this meaning, while Pa-gibil-(ga) or Pa-bíl-ga is abu abi, “grandfather.”56 This meaning may be derived from Gibil, as also from Bíl = išatu, “fire,” then eššu, “new,” then abu, “father,” as the renewer or creator. Gish with Bíl or Gibil would, therefore, be “the father-man” or “the father-hero,” [29]i.e., again the hero par excellence, the original hero, just as in Hebrew and Arabic ab is used in this way.57 The syllable ga being a phonetic complement, the element mesh is to be taken by itself and to be explained, as Poebel suggested, as “hero” (itlu, Brünnow No. 5967). We would thus obtain an entirely artificial combination, “man (or hero), father, hero,” which would simply convey in an emphatic manner the idea of the Ur-held, the original hero, the father of heroes as it were, practically the same idea, therefore, as the one conveyed by Gish alone, as the hero par excellence. Our investigation thus leads us to a substantial identity between Gish and the longer form Gish-bil(or bíl)-ga-mesh, and the former might, therefore, well be used as an abbreviation of the latter.
Both the shorter and the longer forms are descriptive epithets based on naive folk etymology, rather than personal names, just as the designations of our hero as muḳtablu, the “fighter,” or as âlik pâna, “the leader,” or as Esigga imin, “the seven-fold hero,” or Esigga tuk, “the one who is a hero,” are descriptive epithets, and as Atra-ḫasis, “the very wise one,” is such an epithet for the hero of the deluge story. The case is different with Gi-il-ga-mesh, or Gish-gì(n)-mash, which represent the popular and actual pronunciation of the name, or at least the approach to such pronunciation. Such forms, stripped as they are of all artificiality, impress one as genuine names. The conclusion to which we are thus led is that Gish-bil(or bíl)-ga-mesh is a play upon the genuine name, to convey to those to whom the real name, as that of a foreigner, would suggest no meaning, an interpretation fitting in with his character. In other words, Gish-bil-ga-mesh is a “Sumerianized” form of the name, introduced into the Sumerian version of the tale which became a folk-possession in the Euphrates Valley. Such plays upon names to suggest the character of an individual or some incident are familiar to us from the narratives in Genesis.58 They do not constitute genuine etymologies and are rarely of use in leading to a correct etymology. Reuben, e.g., certainly does not mean “Yahweh has seen my affliction,” which the mother is supposed to have exclaimed at [30]the birth (Genesis 29, 32), with a play upon ben and be’onyi, any more than Judah means “I praise Yahweh” (v. 35), though it does contain the divine name (Yehô) as an element. The play on the name may be close or remote, as long as it fulfills its function of suggesting an etymology that is complimentary or appropriate.
In this way, an artificial division and at the same time a distortion of a foreign name like Gilgamesh into several elements, Gish-bil-ga-mesh, is no more violent than, for example, the explanation of Issachar or rather Issaschar as “God has given my hire” (Genesis 30, 18) with a play upon the element sechar, and as though the name were to be divided into Yah (“God”) and sechar (“hire”); or the popular name of Alexander among the Arabs as Zu’l Karnaini, “the possessor of the two horns,” with a suggestion of his conquest of two hemispheres, or what not.59 The element Gil in Gilgamesh would be regarded as a contraction of Gish-bil or gi-bil, in order to furnish the meaning “father-hero,” or Gil might be looked upon as a variant for Gish, which would give us the “phonetic” form in the Assyrian version dGish-gi-mash,60 as well as such a variant writing dGish-gi-mas-(si). Now a name like Gilgamesh, upon which we may definitely settle as coming closest to the genuine form, certainly impresses one as foreign, i.e., it is neither Sumerian nor Akkadian; and we have already suggested that the circumstance that the hero of the Epic is portrayed as a conqueror of Erech, and a rather ruthless one at that, points to a tradition of an invasion of the Euphrates Valley as the background for the episode in the first tablet of the series. Now it is significant that many of the names in the “mythical” dynasties, as they appear in Poebel’s list,61 are likewise foreign, such as Mes-ki-in-ga-še-ir, son of the god Shamash (and the founder of the “mythical” dynasty of Erech of which dGish-bil-ga-mesh is the fifth member),62 and En-me-ir-kár his son.
In a still earlier “mythical” dynasty, we encounter names like Ga-lu-mu-um, Zu-ga-gi-ib, Ar-pi, [31]E-ta-na,63 which are distinctly foreign, while such names as En-me(n)-nun-na and Bar-sal-nun-na strike one again as “Sumerianized” names rather than as genuine Sumerian formations.64 Some of these names, as Galumum, Arpi and Etana, are so Amoritic in appearance, that one may hazard the conjecture of their western origin. May Gilgamesh likewise belong to the Amurru65 region, or does he represent a foreigner from the East in contrast to Enkidu, whose name, we have seen, may have been Baal-Ṭôb in the West, with which region he is according to the Epic so familiar? It must be confessed that the second element ga-mesh would fit in well with a Semitic origin for the name, for the element impresses one as the participial form of a Semitic stem g-m-š, just as in the second element of Meskin-gašer we have such a form. Gil might then be the name of a West-Semitic deity. Such conjectures, however, can for the present not be substantiated, and we must content ourselves with the conclusion that Gilgamesh as the real name of the hero, or at least the form which comes closest to the real name, points to a foreign origin for the hero, and that such forms as dGish-bil-ga-mesh and dGish-bíl-gi-mesh and other variants are “Sumerianized” forms for which an artificial etymology was brought forward to convey the [32]idea of the “original hero” or the hero par excellence. By means of this “play” on the name, which reverts to the compilers of the Sumerian version of the Epic, Gilgamesh was converted into a Sumerian figure, just as the name Enkidu may have been introduced as a Sumerian translation of his Amoritic name. dGish at all events is an abbreviated form of the “Sumerianized” name, introduced by the compilers of the earliest Akkadian version, which was produced naturally under the influence of the Sumerian version. 
Later, as the Epic continued to grow, a phonetic writing was introduced, dGish-gi-mash, which is in a measure a compromise between the genuine name and the “Sumerianized” form, but at the same time an approach to the real pronunciation. Next to the new light thrown upon the names and original character of the two main figures of the Epic, one of the chief points of interest in the Pennsylvania fragment is the proof that it furnishes for a striking resemblance of the two heroes, Gish and Enkidu, to one another. In interpreting the dream of Gish, his mother, Ninsun, lays stress upon the fact that the dream portends the coming of someone who is like Gish, “born in the field and reared in the mountain” (lines 18–19). Both, therefore, are shown by this description to have come to Babylonia from a mountainous region, i.e., they are foreigners; and in the case of Enkidu we have seen that the mountain in all probability refers to a region in the West, while the same may also be the case with Gish. The resemblance of the two heroes to one another extends to their personal appearance. When Enkidu appears on the streets of Erech, the people are struck by this resemblance. They remark that he is “like Gish,” though “shorter in stature” (lines 179–180). Enkidu is described as a rival or counterpart.66 This relationship between the two is suggested also by the Assyrian version. In the creation of Enkidu by Aruru, the people urge the goddess to create the “counterpart” (zikru) of Gilgamesh, someone who will be like him (ma-ši-il) (Tablet I, 2, 31).
Enkidu not only comes from the mountain,67 but the mountain is specifically designated [33]as his birth-place (I, 4, 2), precisely as in the Pennsylvania tablet, while in another passage he is also described, as in our tablet, as “born in the field.”68 Still more significant is the designation of Gilgamesh as the talimu, “younger brother,” of Enkidu.69 In accord with this, we find Gilgamesh in his lament over Enkidu describing him as a “younger brother” (ku-ta-ni);70 and again in the last tablet of the Epic, Gilgamesh is referred to as the “brother” of Enkidu.71 This close relationship reverts to the Sumerian version, for the Constantinople fragment (Langdon, above, p. 13) begins with the designation of Gish-bil-ga-mesh as “his brother.” By “his” no doubt Enkidu is meant. Likewise in the Sumerian text published by Zimmern (above, p. 13) Gilgamesh appears as the brother of Enkidu (rev. 1, 17). Turning to the numerous representations of Gilgamesh and Enkidu on Seal Cylinders,72 we find this resemblance of the two heroes to each other strikingly confirmed. Both are represented as bearded, with the strands arranged in the same fashion. The face in both cases is broad, with curls protruding at the side of the head, though at times these curls are lacking in the case of Enkidu. What is particularly striking is to find Gilgamesh generally a little taller than Enkidu, thus bearing out the statement in the Pennsylvania tablet that Enkidu is “shorter in stature.” There are, to be sure, also some distinguishing marks between the two. Thus Enkidu is generally represented with animal hoofs, but not always.73 Enkidu is commonly portrayed with the horns of a bison, but again this sign is wanting in quite a number of instances.74 The hoofs and the horns mark the period when Enkidu lived with animals and much like an [34]animal. Most remarkable, however, of all are cylinders on which we find the two heroes almost exactly alike as, for example, Ward No. 
199 where two figures, the one a duplicate of the other (except that one is just a shade taller), are in conflict with each other. Dr. Ward was puzzled by this representation and sets it down as a “fantastic” scene in which “each Gilgamesh is stabbing the other.” In the light of the Pennsylvania tablet, this scene is clearly the conflict between the two heroes described in column 6, preliminary to their forming a friendship. Even in the realm of myth the human experience holds good that there is nothing like a good fight as a basis for a subsequent alliance. The fragment describes this conflict as a furious one in which Gilgamesh is worsted, and his wounded pride assuaged by the generous victor, who comforts his vanquished enemy by the assurance that he was destined for something higher than to be a mere “Hercules.” He was singled out for the exercise of royal authority. True to the description of the two heroes in the Pennsylvania tablet as alike, one the counterpart of the other, the seal cylinder portrays them almost exactly alike, as alike as two brothers could possibly be; with just enough distinction to make it clear on close inspection that two figures are intended and not one repeated for the sake of symmetry. There are slight variations in the manner in which the hair is worn, and slightly varying expressions of the face, just enough to make it evident that the one is intended for Gilgamesh and the other for Enkidu. When, therefore, in another specimen, No. 173, we find a Gilgamesh holding his counterpart by the legs, it is merely another aspect of the fight between the two heroes, one of whom is intended to represent Enkidu, and not, as Dr. Ward supposed, a grotesque repetition of Gilgamesh.75 The description of Enkidu in the Pennsylvania tablet as a parallel figure to Gilgamesh leads us to a consideration of the relationship of the two figures to one another. 
Many years ago it was pointed out that the Gilgamesh Epic was a composite tale in which various stories of an independent origin had been combined and brought into more or less artificial connection with the heros eponymos of southern Babylonia.76 We may now go a step further and point out that not [35]only is Enkidu originally an entirely independent figure, having no connection with Gish or Gilgamesh, but that the latter is really depicted in the Epic as the counterpart of Enkidu, a reflection who has been given the traits of extraordinary physical power that belong to Enkidu. This is shown in the first place by the fact that in the encounter it is Enkidu who triumphs over Gilgamesh. The entire analysis of the episode of the meeting between the two heroes as given by Gressmann77 must be revised. It is not Enkidu who is terrified and who is warned against the encounter. It is Gilgamesh who, during the night on his way from the house in which the goddess Ishḫara lies, encounters Enkidu on the highway. Enkidu “blocks the path”78 of Gilgamesh. He prevents Gilgamesh from re-entering the house,79 and the two attack each other “like oxen.”80 They grapple with each other, and Enkidu forces Gilgamesh to the ground. Enkidu is, therefore, the real hero whose traits of physical prowess are afterwards transferred to Gilgamesh. Similarly in the next episode, the struggle against Ḫuwawa, the Yale tablet makes it clear that in the original form of the tale Enkidu is the real hero. All warn Gish against the undertaking—the elders of Erech, Enkidu, and also the workmen. “Why dost thou desire to do this?”81 they say to him. “Thou art young, and thy heart carries thee away. Thou knowest not what thou proposest to do.”82 This part of the incident is now better known to us through the latest fragment of the Assyrian version discovered and published by King.83 The elders say to Gilgamesh: “Do not trust, O Gilgamesh, in thy strength! Be warned(?) against trusting to thy attack! 
The one who goes before will save his companion,84 He who has foresight will save his friend.85 [36] Let Enkidu go before thee. He knows the roads to the cedar forest; He is skilled in battle and has seen fight.” Gilgamesh is sufficiently impressed by this warning to invite Enkidu to accompany him on a visit to his mother, Ninsun, for the purpose of receiving her counsel.86 It is only after Enkidu, who himself hesitates and tries to dissuade Gish, decides to accompany the latter that the elders of Erech are reconciled and encourage Gish for the fray. The two in concert proceed against Ḫuwawa. Gilgamesh alone cannot carry out the plan. Now when a tale thus associates two figures in one deed, one of the two has been added to the original tale. In the present case there can be little doubt that Enkidu, without whom Gish cannot proceed, who is specifically described as “acquainted with the way … to the entrance of the forest”87 in which Ḫuwawa dwells is the original vanquisher. Naturally, the Epic aims to conceal this fact as much as possible ad majorem gloriam of Gilgamesh. It tries to put the one who became the favorite hero into the foreground. Therefore, in both the Babylonian and the Assyrian version Enkidu is represented as hesitating, and Gilgamesh as determined to go ahead. Gilgamesh, in fact, accuses Enkidu of cowardice and boldly declares that he will proceed even though failure stare him in the face.88 Traces of the older view, however, in which Gilgamesh is the one for whom one fears the outcome, crop out; as, for example, in the complaint of Gilgamesh’s mother to Shamash that the latter has stirred the heart of her son to take the distant way to Ḫu(m)baba, “To a fight unknown to him, he advances, An expedition unknown to him he undertakes.”89 Ninsun evidently fears the consequences when her son informs her of his intention and asks her counsel. 
The answer of Shamash is not preserved, but no doubt it was of a reassuring character, as was the answer of the Sun-god to Gish’s appeal and prayer as set forth in the Yale tablet.90 [37] Again, as a further indication that Enkidu is the real conqueror of Ḫuwawa, we find the coming contest revealed to Enkidu no less than three times in dreams, which Gilgamesh interprets.91 Since the person who dreams is always the one to whom the dream applies, we may see in these dreams a further trace of the primary rôle originally assigned to Enkidu. Another exploit which, according to the Assyrian version, the two heroes perform in concert is the killing of a bull, sent by Anu at the instance of Ishtar to avenge an insult offered to the goddess by Gilgamesh, who rejects her offer of marriage. In the fragmentary description of the contest with the bull, we find Enkidu “seizing” the monster by “its tail.”92 That Enkidu originally played the part of the slayer is also shown by the statement that it is he who insults Ishtar by throwing a piece of the carcass into the goddess’ face,93 adding also an insulting speech; and this despite the fact that Ishtar in her rage accuses Gilgamesh of killing the bull.94 It is thus evident that the Epic alters the original character of the episodes in order to find a place for Gilgamesh, with the further desire to assign to the latter the chief rôle. Be it noted also that Enkidu, not Gilgamesh, is punished for the insult to Ishtar. Enkidu must therefore in the original form of the episode have been the guilty party, who is stricken with mortal disease as a punishment to which after twelve days he succumbs.95 In view of this, we may supply the name of Enkidu in the little song introduced at the close of the encounter with the bull, and not Gilgamesh as has hitherto been done. “Who is distinguished among the heroes? Who is glorious among men? 
[Enkidu] is distinguished among heroes, [Enkidu] is glorious among men.”96 [38]Finally, the killing of lions is directly ascribed to Enkidu in the Pennsylvania tablet: “Lions he attacked *     *     *     *     * Lions he overcame”97 whereas Gilgamesh appears to be afraid of lions. On his long search for Utnapishtim he says: “On reaching the entrance of the mountain at night I saw lions and was afraid.”98 He prays to Sin and Ishtar to protect and save him. When, therefore, in another passage some one celebrates Gilgamesh as the one who overcame the “guardian,” who dispatched Ḫu(m)baba in the cedar forest, who killed lions and overthrew the bull,99 we have the completion of the process which transferred to Gilgamesh exploits and powers which originally belonged to Enkidu, though ordinarily the process stops short at making Gilgamesh a sharer in the exploits; with the natural tendency, to be sure, to enlarge the share of the favorite. We can now understand why the two heroes are described in the Pennsylvania tablet as alike, as born in the same place, aye, as brothers. Gilgamesh in the Epic is merely a reflex of Enkidu. The latter is the real hero and presumably, therefore, the older figure.100 Gilgamesh resembles Enkidu, because he is originally Enkidu. The “resemblance” motif is merely the manner in which in the course of the partly popular, partly literary transfer, the recollection is preserved that Enkidu is the original, and Gilgamesh the copy. The artificiality of the process which brings the two heroes together is apparent in the dreams of Gilgamesh which are interpreted by his mother as portending the coming of Enkidu. Not the conflict is foreseen, but the subsequent close association, naïvely described as due to the personal charm which Enkidu exercises, which will lead Gilgamesh to fall in love with the one whom he is to meet. The two will become one, like man and wife. 
[39] On the basis of our investigations, we are now in a position to reconstruct in part the cycle of episodes that once formed part of an Enkidu Epic. The fight between Enkidu and Gilgamesh, in which the former is the victor, is typical of the kind of tales told of Enkidu. He is the real prototype of the Greek Hercules. He slays lions, he overcomes a powerful opponent dwelling in the forests of Lebanon, he kills the bull, and he finally succumbs to disease sent as a punishment by an angry goddess. The death of Enkidu naturally formed the close of the Enkidu Epic, which in its original form may, of course, have included other exploits besides those taken over into the Gilgamesh Epic. There is another aspect of the figure of Enkidu which is brought forward in the Pennsylvania tablet more clearly than had hitherto been the case. Many years ago attention was called to certain striking resemblances between Enkidu and the figure of the first man as described in the early chapters of Genesis.101 At that time we had merely the Assyrian version of the Gilgamesh Epic at our disposal, and the main point of contact was the description of Enkidu living with the animals, drinking and feeding like an animal, until a woman is brought to him with whom he engages in sexual intercourse. This suggested that Enkidu was a picture of primeval man, while the woman reminded one of Eve, who when she is brought to Adam becomes his helpmate and inseparable companion. The Biblical tale stands, of course, on a much higher level, and is introduced, as are other traditions and tales of primitive times, in the style of a parable to convey certain religious teachings. For all that, suggestions of earlier conceptions crop out in the picture of Adam surrounded by animals to which he assigns names. Such a phrase as “there was no helpmate corresponding to him” becomes intelligible on the supposition of an existing tradition or belief, that man once lived and, indeed, cohabited with animals. 
The tales in the early chapters of Genesis must rest on very early popular traditions, which have been cleared of mythological and other objectionable features in order to adapt them to the purpose of the Hebrew compilers, to serve as a medium for illustrating [40]certain religious teachings regarding man’s place in nature and his higher destiny. From the resemblance between Enkidu and Adam it does not, of course, follow that the latter is modelled upon the former, but only that both rest on similar traditions of the condition under which men lived in primeval days prior to the beginnings of human culture. We may now pass beyond these general indications and recognize in the story of Enkidu as revealed by the Pennsylvania tablet an attempt to trace the evolution of primitive man from low beginnings to the regular and orderly family life associated with advanced culture. The new tablet furnishes a further illustration for the surprisingly early tendency among the Babylonian literati to connect with popular tales teachings of a religious or ethical character. Just as the episode between Gilgamesh and the maiden Sabitum is made the occasion for introducing reflections on the inevitable fate of man to encounter death, so the meeting of Enkidu with the woman becomes the medium of impressing the lesson of human progress through the substitution of bread and wine for milk and water, through the institution of the family, and through work and the laying up of resources. This is the significance of the address to Enkidu in column 4 of the Pennsylvania tablet, even though certain expressions in it are somewhat obscure. The connection of the entire episode of Enkidu and the woman with Gilgamesh is very artificial; and it becomes much more intelligible if we disassociate it from its present entanglement in the Epic. In Gilgamesh’s dream, portending the meeting with Enkidu, nothing is said of the woman who is the companion of the latter. 
The passage in which Enkidu is created by Aruru to oppose Gilgamesh102 betrays evidence of having been worked over in order to bring Enkidu into association with the longing of the people of Erech to get rid of a tyrannical character. The people in their distress appeal to Aruru to create a rival to Gilgamesh. In response, “Aruru upon hearing this created a man of Anu in her heart.” Now this “man of Anu” cannot possibly be Enkidu, for the sufficient reason that a few lines further on Enkidu is described as an [41]offspring of Ninib. Moreover, the being created is not a “counterpart” of Gilgamesh, but an animal-man, as the description that follows shows. We must separate lines 30–33 in which the creation of the “Anu man” is described from lines 34–41 in which the creation of Enkidu is narrated. Indeed, these lines strike one as the proper beginning of the original Enkidu story, which would naturally start out with his birth and end with his death. The description is clearly an account of the creation of the first man, in which capacity Enkidu is brought forward. “Aruru washed her hands, broke off clay, threw it on the field103 … created Enkidu, the hero, a lofty offspring of the host of Ninib.”104 The description of Enkidu follows, with his body covered with hair like an animal, and eating and drinking with the animals. There follows an episode105 which has no connection whatsoever with the Gilgamesh Epic, but which is clearly intended to illustrate how Enkidu came to abandon the life with the animals. A hunter sees Enkidu and is amazed at the strange sight—an animal and yet a man. Enkidu, as though resenting his condition, becomes enraged at the sight of the hunter, and the latter goes to his father and tells him of the strange creature whom he is unable to catch. 
In reply, the father advises his son to take a woman with him when next he goes out on his pursuit, and to have the woman remove her dress in the presence of Enkidu, who will then approach her, and after intercourse with her will abandon the animals among whom he lives. By this device he will catch the strange creature. Lines 14–18 of column 3 in the first tablet in which the father of the hunter refers to Gilgamesh must be regarded as a later insertion, a part of the reconstruction of the tale to connect the episode with Gilgamesh. The advice of the father to his son, the hunter, begins, line 19, “Go my hunter, take with thee a woman.” [42]In the reconstructed tale, the father tells his son to go to Gilgamesh to relate to him the strange appearance of the animal-man; but there is clearly no purpose in this, as is shown by the fact that when the hunter does so, Gilgamesh makes precisely the same speech as does the father of the hunter. Lines 40–44 of column 3, in which Gilgamesh is represented as speaking to the hunter form a complete doublet to lines 19–24, beginning “Go, my hunter, take with thee a woman, etc.” and similarly the description of Enkidu appears twice, lines 2–12 in an address of the hunter to his father, and lines 29–39 in the address of the hunter to Gilgamesh. The artificiality of the process of introducing Gilgamesh into the episode is revealed by this awkward and entirely meaningless repetition. We may therefore reconstruct the first two scenes in the Enkidu Epic as follows:106 Tablet I, col. 2, 34–35: Creation of Enkidu by Aruru. 36–41: Description of Enkidu’s hairy body and of his life with the animals. 42–50: The hunter sees Enkidu, who shows his anger, as also his woe, at his condition. 3, 1–12: The hunter tells his father of the strange being who pulls up the traps which the hunter digs, and who tears the nets so that the hunter is unable to catch him or the animals. 
19–24: The father of the hunter advises his son on his next expedition to take a woman with him in order to lure the strange being from his life with the animals. Line 25, beginning “On the advice of his father,” must have set forth, in the original form of the episode, how the hunter procured the woman and took her with him to meet Enkidu. Column 4 gives in detail the meeting between the two, and naïvely describes how the woman exposes her charms to Enkidu, who is captivated by her and stays with her six days and seven nights. The animals see the change in Enkidu and run away from him. [43]He has been transformed through the woman. So far the episode. In the Assyrian version there follows an address of the woman to Enkidu beginning (col. 4, 34): “Beautiful art thou, Enkidu, like a god art thou.” We find her urging him to go with her to Erech, there to meet Gilgamesh and to enjoy the pleasures of city life with plenty of beautiful maidens. Gilgamesh, she adds, will expect Enkidu, for the coming of the latter to Erech has been foretold in a dream. It is evident that here we have again the later transformation of the Enkidu Epic in order to bring the two heroes together. Will it be considered too bold if we assume that in the original form the address of the woman and the construction of the episode were such as we find preserved in part in columns 2 to 4 of the Pennsylvania tablet, which forms part of the new material that can now be added to the Epic? The address of the woman begins in line 51 of the Pennsylvania tablet: “I gaze upon thee, Enkidu, like a god art thou.” This corresponds to the line in the Assyrian version (I, 4, 34) as given above, just as lines 52–53: “Why with the cattle Dost thou roam across the field?” correspond to I, 4, 35, of the Assyrian version. 
There follows in both the old Babylonian and the Assyrian version the appeal of the woman to Enkidu, to allow her to lead him to Erech where Gilgamesh dwells (Pennsylvania tablet lines 54–61 = Assyrian version I, 4, 36–39); but in the Pennsylvania tablet we now have a second speech (lines 62–63) beginning like the first one with al-ka, “come:” “Come, arise from the accursed ground.” Enkidu consents, and now the woman takes off her garments and clothes the naked Enkidu, while putting another garment on herself. She takes hold of his hand and leads him to the sheepfolds (not to Erech!!), where bread and wine are placed before him. Accustomed hitherto to sucking milk with cattle, Enkidu does not know what to do with the strange food until encouraged and instructed by the woman. The entire third column is taken up with this introduction [44]of Enkidu to civilized life in a pastoral community, and the scene ends with Enkidu becoming a guardian of flocks. Now all this has nothing to do with Gilgamesh, and clearly sets forth an entirely different idea from the one embodied in the meeting of the two heroes. In the original Enkidu tale, the animal-man is looked upon as the type of a primitive savage, and the point of the tale is to illustrate in the naïve manner characteristic of folklore the evolution to the higher form of pastoral life. This aspect of the incident is, therefore, to be separated from the other phase which has as its chief motif the bringing of the two heroes together. We now obtain, thanks to the new section revealed by the Pennsylvania tablet, a further analogy107 with the story of Adam and Eve, but with this striking difference, that whereas in the Babylonian tale the woman is the medium leading man to the higher life, in the Biblical story the woman is the tempter who brings misfortune to man. 
This contrast is, however, not inherent in the Biblical story, but due to the point of view of the Biblical writer, who is somewhat pessimistically inclined and looks upon primitive life, when man went naked and lived in a garden, eating of fruits that grew of themselves, as the blessed life in contrast to advanced culture which leads to agriculture and necessitates hard work as the means of securing one’s substance. Hence the woman through whom Adam eats of the tree of knowledge and becomes conscious of being naked is looked upon as an evil tempter, entailing the loss of the primeval life of bliss in a gorgeous Paradise. The Babylonian point of view is optimistic. The change to civilized life—involving the wearing of clothes and the eating of food that is cultivated (bread and wine) is looked upon as an advance. Hence the woman is viewed as the medium of raising man to a higher level. The feature common to the Biblical and Babylonian tales is the attachment of a lesson to early folk-tales. The story of Adam and Eve,108 as the story of Enkidu and the woman, is told with a purpose. Starting with early traditions of men’s primitive life on earth, that may have arisen independently, Hebrew and [45]Babylonian writers diverged, each group going its own way, each reflecting the particular point of view from which the evolution of human society was viewed. Leaving the analogy between the Biblical and Babylonian tales aside, the main point of value for us in the Babylonian story of Enkidu and the woman is the proof furnished by the analysis, made possible through the Pennsylvania tablet, that the tale can be separated from its subsequent connection with Gilgamesh. We can continue this process of separation in the fourth column, where the woman instructs Enkidu in the further duty of living his life with the woman decreed for him, to raise a family, to engage in work, to build cities and to gather resources. 
All this is looked upon in the same optimistic spirit as marking progress, whereas the Biblical writer, consistent with his point of view, looks upon work as a curse, and makes Cain, the murderer, also the founder of cities. The step to the higher forms of life is not an advance according to the J document. It is interesting to note that even the phrase the “cursed ground” occurs in both the Babylonian and Biblical tales; but whereas in the latter (Gen. 3, 17) it is because of the hard work entailed in raising the products of the earth that the ground is cursed, in the former (lines 62–63) it is the place in which Enkidu lives before he advances to the dignity of human life that is “cursed,” and which he is asked to leave. Adam is expelled from Paradise as a punishment, whereas Enkidu is implored to leave it as a necessary step towards progress to a higher form of existence. The contrast between the Babylonian and the Biblical writer extends to the view taken of viniculture. The Biblical writer (again the J document) looks upon Noah’s drunkenness as a disgrace. Noah loses his sense of shame and uncovers himself (Genesis 9, 21), whereas in the Babylonian description Enkidu’s jolly spirit after he has drunk seven jars of wine meets with approval. The Biblical point of view is that he who drinks wine becomes drunk;109 the Babylonian says, if you drink wine you become happy.110 If the thesis here set forth of the original character and import of the episode of Enkidu with the woman is correct, we may again regard lines 149–153 of the Pennsylvania tablet, in which Gilgamesh is introduced, as a later addition to bring the two heroes into association. [46]The episode in its original form ended with the introduction of Enkidu first to pastoral life, and then to the still higher city life with regulated forms of social existence. 
Now, to be sure, this Enkidu has little in common with the Enkidu who is described as a powerful warrior, a Hercules, who kills lions, overcomes the giant Ḫuwawa, and dispatches a great bull, but it is the nature of folklore everywhere to attach to traditions about a favorite hero all kinds of tales with which originally he had nothing to do. Enkidu, as such a favorite, is viewed also as the type of primitive man,111 and so there arose gradually an Epic which began with his birth, pictured him as half-animal half-man, told how he emerged from this state, how he became civilized, was clothed, learned to eat food and drink wine, how he shaved off the hair with which his body was covered,112 anointed himself—in short, “He became manlike.”113 Thereupon he is taught his duties as a husband, is introduced to the work of building, and to laying aside supplies, and the like. The fully-developed and full-fledged hero then engages in various exploits, of which some are now embodied in the Gilgamesh Epic. Who this Enkidu was, we are not in a position to determine, but the suggestion has been thrown out above that he is a personage foreign to Babylonia, that his home appears to be in the undefined Amurru district, and that he conquers that district. The original tale of Enkidu, if this view be correct, must therefore have been carried to the Euphrates Valley, at a very remote period, with one of the migratory waves that brought a western people as invaders into Babylonia. Here the tale was combined with stories current of another hero, Gilgamesh—perhaps also of Western origin—whose conquest of Erech likewise represents an invasion of Babylonia. The center of the Gilgamesh tale was Erech, and in the process of combining the stories of Enkidu and Gilgamesh, Enkidu is brought to Erech and the two perform exploits [47]in common. In such a combination, the aim would be to utilize all the incidents of both tales. 
The woman who accompanies Enkidu, therefore, becomes the medium of bringing the two heroes together. The story of the evolution of primitive man to civilized life is transformed into the tale of Enkidu’s removal to Erech, and elaborated with all kinds of details, among which we have, as perhaps embodying a genuine historical tradition, the encounter of the two heroes. Before passing on, we have merely to note the very large part taken in both the old Babylonian and the Assyrian version by the struggle against Ḫuwawa. The entire Yale tablet—forming, as we have seen, the third of the series—is taken up with the preparation for the struggle, and with the repeated warnings given to Gilgamesh against the dangerous undertaking. The fourth tablet must have recounted the struggle itself, and it is not improbable that this episode extended into the fifth tablet, since in the Assyrian version this is the case. The elaboration of the story is in itself an argument in favor of assuming some historical background for it—the recollection of the conquest of Amurru by some powerful warrior; and we have seen that this conquest must be ascribed to Enkidu and not to Gilgamesh. If, now, Enkidu is not only the older figure but the one who is the real hero of the most notable episode in the Gilgamesh Epic; if, furthermore, Enkidu is the Hercules who kills lions and dispatches the bull sent by an enraged goddess, what becomes of Gilgamesh? What is left for him? In the first place, he is definitely the conqueror of Erech. He builds the wall of Erech,114 and we may assume that the designation of the city as Uruk supûri, “the walled Erech,”115 rests upon this tradition. He is also associated with the great temple Eanna, “the heavenly house,” in Erech. 
To Gilgamesh belongs also the unenviable tradition of having exercised his rule in Erech so harshly that the people are impelled to implore Aruru to create a rival who may rid [48]the district of the cruel tyrant, who is described as snatching sons and daughters from their families, and in other ways terrifying the population—an early example of “Schrecklichkeit.” Tablets II to V inclusive of the Assyrian version being taken up with the Ḫuwawa episode, modified with a view of bringing the two heroes together, we come at once to the sixth tablet, which tells the story of how the goddess Ishtar wooed Gilgamesh, and of the latter’s rejection of her advances. This tale is distinctly a nature myth. The attempt of Gressmann116 to find some historical background to the episode is a failure. The goddess Ishtar symbolizes the earth which woos the sun in the spring, but whose love is fatal, for after a few months the sun’s power begins to wane. Gilgamesh, who in incantation hymns is invoked in terms which show that he was conceived as a sun-god,117 recalls to the goddess how she changed her lovers into animals, like Circe of Greek mythology, and brought them to grief. Enraged at Gilgamesh’s insult to her vanity, she flies to her father Anu and cries for revenge. At this point the episode of the creation of the bull is introduced, but if the analysis above given is correct it is Enkidu who is the hero in dispatching the bull, and we must assume that the sickness with which Gilgamesh is smitten is the punishment sent by Anu to avenge the insult to his daughter. This sickness symbolizes the waning strength of the sun after midsummer is past. The sun recedes from the earth, and this was pictured in the myth as the sun-god’s rejection of Ishtar; Gilgamesh’s fear of death marks the approach of the winter season, when the sun appears to have lost its vigor completely and is near to death. 
The entire episode is, therefore, a nature myth, symbolical of the passing of spring to midsummer and then to the bare season. The myth has been attached to Gilgamesh as a favorite figure, and then woven into a pattern with the episode of Enkidu and the bull. The bull episode can be detached from the nature myth without any loss to the symbolism of the tale of Ishtar and Gilgamesh. As already suggested, with Enkidu’s death after this conquest of the bull the original Enkidu Epic came to an end. In order to connect Gilgamesh with Enkidu, the former is represented as sharing [49]in the struggle against the bull. Enkidu is punished with death, while Gilgamesh is smitten with disease. Since both shared equally in the guilt, the punishment should have been the same for both. The differentiation may be taken as an indication that Gilgamesh’s disease has nothing to do with the bull episode, but is merely part of the nature myth. Gilgamesh now begins a series of wanderings in search of the restoration of his vigor, and this motif is evidently a continuation of the nature myth to symbolize the sun’s wanderings during the dark winter in the hope of renewed vigor with the coming of the spring. Professor Haupt’s view is that the disease from which Gilgamesh is supposed to be suffering is of a venereal character, affecting the organs of reproduction. This would confirm the position here taken that the myth symbolizes the loss of the sun’s vigor. The sun’s rays are no longer strong enough to fertilize the earth. In accord with this, Gilgamesh’s search for healing leads him to the dark regions118 in which the scorpion-men dwell. The terrors of the region symbolize the gloom of the winter season. At last Gilgamesh reaches a region of light again, described as a landscape situated at the sea. The maiden in control of this region bolts the gate against Gilgamesh’s approach, but the latter forces his entrance. 
It is the picture of the sun-god bursting through the darkness, to emerge as the youthful reinvigorated sun-god of the spring. Now with the tendency to attach to popular tales and nature myths lessons illustrative of current beliefs and aspirations, Gilgamesh’s search for renewal of life is viewed as man’s longing for eternal life. The sun-god’s waning power after midsummer is past suggests man’s growing weakness after the meridian of life has been left behind. Winter is death, and man longs to escape it. Gilgamesh’s wanderings are used as illustration of this longing, and accordingly the search for life becomes also the quest for immortality. Can the precious boon of eternal life be achieved? Popular fancy created the figure of a favorite of the gods who had escaped a destructive deluge in which all mankind had perished.119 Gilgamesh hears [50]of this favorite and determines to seek him out and learn from him the secret of eternal life. The deluge story, again a pure nature myth, symbolical of the rainy season which destroys all life in nature, is thus attached to the Epic. Gilgamesh after many adventures finds himself in the presence of the survivor of the Deluge who, although human, enjoys immortal life among the gods. He asks the survivor how he came to escape the common fate of mankind, and in reply Utnapishtim tells the story of the catastrophe that brought about universal destruction. The moral of the tale is obvious. Only those singled out by the special favor of the gods can hope to be removed to the distant “source of the streams” and live forever. The rest of mankind must face death as the end of life. 
That the story of the Deluge is told in the eleventh tablet of the series, corresponding to the eleventh month, known as the month of “rain curse”120 and marking the height of the rainy season, may be intentional, just as it may not be accidental that Gilgamesh’s rejection of Ishtar is recounted in the sixth tablet, corresponding to the sixth month,121 which marks the end of the summer season. The two tales may have formed part of a cycle of myths, distributed among the months of the year. The Gilgamesh Epic, however, does not form such a cycle. Both myths have been artificially attached to the adventures of the hero. For the deluge story we now have the definite proof for its independent existence, through Dr. Poebel’s publication of a Sumerian text which embodies the tale,122 and without any reference [51]to Gilgamesh. Similarly, Scheil and Hilprecht have published fragments of deluge stories written in Akkadian and likewise without any connection with the Gilgamesh Epic.123 In the Epic the story leads to another episode attached to Gilgamesh, namely, the search for a magic plant growing in deep water, which has the power of restoring old age to youth. Utnapishtim, the survivor of the deluge, is moved through pity for Gilgamesh, worn out by his long wanderings. At the request of his wife, Utnapishtim decides to tell Gilgamesh of this plant, and he succeeds in finding it. He plucks it and decides to take it back to Erech so that all may enjoy the benefit, but on his way stops to bathe in a cool cistern. A serpent comes along and snatches the plant from him, and he is forced to return to Erech with his purpose unachieved. Man cannot hope, when old age comes on, to escape death as the end of everything. 
Lastly, the twelfth tablet of the Assyrian version of the Gilgamesh Epic is of a purely didactic character, bearing evidence of having been added as a further illustration of the current belief that there is no escape from the nether world to which all must go after life has come to an end. Proper burial and suitable care of the dead represent all that can be done in order to secure a fairly comfortable rest for those who have passed out of this world. Enkidu is once more introduced into this episode. His shade is invoked by Gilgamesh and rises up out of the lower world to give a discouraging reply to Gilgamesh’s request, “Tell me, my friend, tell me, my friend, The law of the earth which thou hast experienced, tell me,” The mournful message comes back: “I cannot tell thee, my friend, I cannot tell.” Death is a mystery and must always remain such. The historical Gilgamesh has clearly no connection with the figure introduced into [52]this twelfth tablet. Indeed, as already suggested, the Gilgamesh Epic must have ended with the return to Erech, as related at the close of the eleventh tablet. The twelfth tablet was added by some school-men of Babylonia (or perhaps of Assyria), purely for the purpose of conveying a summary of the teachings in regard to the fate of the dead. Whether these six episodes covering the sixth to the twelfth tablets, (1) the nature myth, (2) the killing of the divine bull, (3) the punishment of Gilgamesh and the death of Enkidu, (4) Gilgamesh’s wanderings, (5) the Deluge, (6) the search for immortality, were all included at the time that the old Babylonian version was compiled cannot, of course, be determined until we have that version in a more complete form. Since the two tablets thus far recovered show that as early as 2000 B.C. 
the Enkidu tale had already been amalgamated with the current stories about Gilgamesh, and the endeavor made to transfer the traits of the former to the latter, it is eminently likely that the story of Ishtar’s unhappy love adventure with Gilgamesh was included, as well as Gilgamesh’s punishment and the death of Enkidu. With the evidence furnished by Meissner’s fragment of a version of the old Babylonian revision and by our two tablets, of the early disposition to make popular tales the medium of illustrating current beliefs and the teachings of the temple schools, it may furthermore be concluded that the death of Enkidu and the punishment of Gilgamesh were utilized for didactic purposes in the old Babylonian version. On the other hand, the proof for the existence of the deluge story in the Hammurabi period and some centuries later, independent of any connection with the Gilgamesh Epic, raises the question whether in the old Babylonian version, of which our two tablets form a part, the deluge tale was already woven into the pattern of the Epic. At all events, till proof to the contrary is forthcoming, we may assume that the twelfth tablet of the Assyrian version, though also reverting to a Babylonian original, dates as the latest addition to the Epic from a period subsequent to 2000 B.C.; and that the same is probably the case with the eleventh tablet. 
To sum up, there are four main currents that flow together in the Gilgamesh Epic even in its old Babylonian form: (1) the adventures of a mighty warrior Enkidu, resting perhaps on a faint tradition [53]of the conquest of Amurru by the hero; (2) the more definite recollection of the exploits of a foreign invader of Babylonia by the name of Gilgamesh, whose home appears likewise to have been in the West;124 (3) nature myths and didactic tales transferred to Enkidu and Gilgamesh as popular figures; and (4) the process of weaving the traditions, exploits, myths and didactic tales together, in the course of which process Gilgamesh becomes the main hero, and Enkidu his companion. Furthermore, our investigation has shown that to Enkidu belongs the episode with the woman, used to illustrate the evolution of primitive man to the ways and conditions of civilized life, the conquest of Ḫuwawa in the land of Amurru, the killing of lions and also of the bull, while Gilgamesh is the hero who conquers Erech. Identified with the sun-god, the nature myth of the union of the sun with the earth and the subsequent separation of the two is also transferred to him. The wanderings of the hero, smitten with disease, are a continuation of the nature myth, symbolizing the waning vigor of the sun with the approach of the wintry season. The details of the process which led to making Gilgamesh the favorite figure, to whom the traits and exploits of Enkidu and of the sun-god are transferred, escape us, but of the fact that Enkidu is the older figure, of whom certain adventures were set forth in a tale that once had an independent existence, there can now be little doubt in the face of the evidence furnished by the two tablets of the old Babylonian version; just as the study of these tablets shows that in the combination of the tales of Enkidu and Gilgamesh, the former is the prototype of which Gilgamesh is the copy. 
If the two are regarded as brothers, as born in the same place, even resembling one another in appearance and carrying out their adventures in common, it is because in the process of combination Gilgamesh becomes the reflex of Enkidu. That Enkidu is not the figure created by Aruru to relieve Erech of its tyrannical ruler is also shown by the fact that Gilgamesh remains in control of Erech. It is to Erech that he returns when he fails of his purpose to learn the secret of escape from old age and death. Erech is, therefore, not relieved of the presence of the ruthless ruler through Enkidu. The “Man of Anu” formed by Aruru as a deliverer is confused in the course of the growth of the [54]Epic with Enkidu, the offspring of Ninib, and in this way we obtain the strange contradiction of Enkidu and Gilgamesh appearing first as bitter rivals and then as close and inseparable friends. It is of the nature of Epic compositions everywhere to eliminate unnecessary figures by concentrating on one favorite the traits belonging to another or to several others. The close association of Enkidu and Gilgamesh which becomes one of the striking features in the combination of the tales of these two heroes naturally recalls the “Heavenly Twins” motif, which has been so fully and so suggestively treated by Professor J. Rendell Harris in his Cult of the Heavenly Twins, (London, 1906). 
Professor Harris has conclusively shown how widespread the tendency is to associate two divine or semi-divine beings in myths and legends as inseparable companions125 or twins, like Castor and Pollux, Romulus and Remus,126 the Acvins in the Rig-Veda,127 Cain and Abel, Jacob and Esau in the Old Testament, the Kabiri of the Phoenicians,128 Herakles and Iphikles in Greek mythology, Ambrica and Fidelio in Teutonic mythology, Patollo and Potrimpo in old Prussian mythology, Cautes and Cautopates in Mithraism, Jesus and Thomas (according to the Syriac Acts of Thomas), and the various illustrations of “Dioscuri in Christian Legends,” set forth by Dr. Harris in his work under this title, which carries the motif far down into the period of legends about Christian Saints who appear in pairs, including the reference to such a pair in Shakespeare’s Henry V: “And Crispin Crispian shall ne’er go by From that day to the ending of the world.”—(Act, IV, 3, 57–58.) There are indeed certain parallels which suggest that Enkidu-Gilgamesh may represent a Babylonian counterpart to the “Heavenly [55]Twins.” In the Indo-Iranian, Greek and Roman mythology, the twins almost invariably act together. In unison they proceed on expeditions to punish enemies.129 But after all, the parallels are of too general a character to be of much moment; and moreover the parallels stop short at the critical point, for Gilgamesh though worsted is not killed by Enkidu, whereas one of the “Heavenly Twins” is always killed by the brother, as Abel is by Cain, and Iphikles by his twin brother Herakles. 
Even the trait which is frequent in the earliest forms of the “Heavenly Twins,” according to which one is immortal and the other is mortal, though applying in a measure to Enkidu who is killed by Ishtar, while Gilgamesh the offspring of a divine pair is only smitten with disease, is too unsubstantial to warrant more than a general comparison between the Enkidu-Gilgamesh pair and the various forms of the “twin” motif found throughout the ancient world. For all that, the point is of some interest that in the Gilgamesh Epic we should encounter two figures who are portrayed as possessing the same traits and accomplishing feats in common, which suggest a partial parallel to the various forms in which the twin-motif appears in the mythologies, folk-lore and legends of many nations; and it may be that in some of these instances the duplication is due, as in the case of Enkidu and Gilgamesh, to an actual transfer of the traits of one figure to another who usurped his place. In concluding this study of the two recently discovered tablets of the old Babylonian version of the Gilgamesh Epic which has brought us several steps further in the interpretation and in our understanding of the method of composition of the most notable literary production of ancient Babylonia, it will be proper to consider the literary relationship of the old Babylonian to the Assyrian version. We have already referred to the different form in which the names of the chief figures appear in the old Babylonian version, dGish as against dGish-gì(n)-mash, dEn-ki-dũ as against dEn-ki-dú, Ḫu-wa-wa as against Ḫu(m)-ba-ba. 
Erech appears as Uruk ribîtim, “Erech of [56]the Plazas,” as against Uruk supûri, “walled Erech” (or “Erech within the walls”), in the Assyrian version.130 These variations point to an independent recension for the Assyrian revision; and this conclusion is confirmed by a comparison of parallel passages in our two tablets with the Assyrian version, for such parallels rarely extend to verbal agreements in details, and, moreover, show that the Assyrian version has been elaborated. Beginning with the Pennsylvania tablet, column I is covered in the Assyrian version by tablet I, 5, 25, to 6, 33, though, as pointed out above, in the Assyrian version we have the anticipation of the dreams of Gilgamesh and their interpretation through their recital to Enkidu by his female companion, whereas in the old Babylonian version we have the dreams directly given in a conversation between Gilgamesh and his mother. In the anticipation, there would naturally be some omissions. So lines 4–5 and 12–13 of the Pennsylvania tablet do not appear in the Assyrian version, but in their place is a line (I, 5, 35), to be restored to “[I saw him and like] a woman I fell in love with him,” which occurs in the old Babylonian version only in connection with the second dream. The point is of importance as showing that in the Babylonian version the first dream lays stress upon the omen of the falling meteor, as symbolizing the coming of Enkidu, whereas the second dream more specifically reveals Enkidu as a man,131 of whom Gilgamesh is instantly enamored. Strikingly variant lines, though conveying the same idea, are frequent. Thus line 14 of the Babylonian version reads “I bore it and carried it to thee” and appears in the Assyrian version (I, 5, 35b supplied from 6, 26) as “I threw it (or him) at thy feet”132 [57] with an additional line in elaboration, “Thou didst bring him into contact with me,”133 which anticipates the speech of the mother (line 41 = Assyrian version I, 6, 33).
Line 10 of the Pennsylvania tablet has pa-ḫi-ir as against iz-za-az I, 5, 31. Line 8 has ik-ta-bi-it as against da-an in the Assyrian version I, 5, 29. More significant is the variant to line 9 “I became weak and its weight I could not bear” as against I, 5, 30. “Its strength was overpowering,134 and I could not endure its weight.” The important lines 31–36 are not found in the Assyrian version, with the exception of I, 6, 27, which corresponds to lines 33–34, but this lack of correspondence is probably due to the fact that the Assyrian version represents the anticipation of the dreams which, as already suggested, might well omit some details. As against this we have in the Assyrian version I, 6, 23–25, an elaboration of line 30 in the Pennsylvania tablet and taken over from the recital of the first dream. Through the Assyrian version I, 6, 31–32, we can restore the closing lines of column I of the Pennsylvania tablet, while with line 33 = line 45 of the Pennsylvania tablet, the parallel between the two versions comes to an end. Lines 34–43 of the Assyrian version (bringing tablet I to a close)135 represent an elaboration of the speech of Ninsun, followed by a further address of Gilgamesh to his mother, and by the determination of Gilgamesh to seek out Enkidu.136 Nothing of this sort appears to have been included in the old Babylonian version.[58]Our text proceeds with the scene between Enkidu and the woman, in which the latter by her charms and her appeal endeavors to lead Enkidu away from his life with the animals. From the abrupt manner in which the scene is introduced in line 43 of the Pennsylvania tablet, it is evident that this cannot be the first mention of the woman. The meeting must have been recounted in the first tablet, as is the case in the Assyrian version.137 The second tablet takes up the direct recital of the dreams of Gilgamesh and then continues the narrative. 
Whether in the old Babylonian version the scene between Enkidu and the woman was described with the same naïve details, as in the Assyrian version, of the sexual intercourse between the two for six days and seven nights cannot of course be determined, though presumably the Assyrian version, with the tendency of epics to become more elaborate as they pass from age to age, added some realistic touches. Assuming that lines 44–63 of the Pennsylvania tablet—the cohabitation of Enkidu and the address of the woman—is a repetition of what was already described in the first tablet, the comparison with the Assyrian version I, 4, 16–41, not only points to the elaboration of the later version, but likewise to an independent recension, even where parallel lines can be picked out. Only lines 46–48 of the Pennsylvania tablet form a complete parallel to line 21 of column 4 of the Assyrian version. The description in lines 22–32 of column 4 is missing, though it may, of course, have been included in part in the recital in the first tablet of the old Babylonian version. Lines 49–59 of the Pennsylvania tablet are covered by 33–39, the only slight difference being the specific mention in line 58 of the Pennsylvania tablet of Eanna, the temple in Erech, described as “the dwelling of Anu,” whereas in the Assyrian version Eanna is merely referred to as the “holy house” and described as “the dwelling of Anu and Ishtar,” where Ishtar is clearly a later addition. Leaving aside lines 60–61, which may be merely a variant (though independent) of line 39 of column 4 of the Assyrian version, we now have in the Pennsylvania tablet a second speech of the woman to Enkidu (not represented in the Assyrian version) beginning like the first one with alka, “Come” (lines 62–63), in which she asks Enkidu to leave the “accursed ground” in which he dwells. 
This speech, as the description which follows, extending into columns 3–4, [59] and telling how the woman clothed Enkidu, how she brought him to the sheep folds, how she taught him to eat bread and to drink wine, and how she instructed him in the ways of civilization, must have been included in the second tablet of the Assyrian version which has come down to us in a very imperfect form. Nor is the scene in which Enkidu and Gilgamesh have their encounter found in the preserved portions of the second (or possibly the third) tablet of the Assyrian version, but only a brief reference to it in the fourth tablet,138 in which in Epic style the story is repeated, leading up to the second exploit: the joint campaign of Enkidu and Gilgamesh against Ḫuwawa. This reference, covering only seven lines, corresponds to lines 192–231 of the Pennsylvania tablet; but the former being the repetition and the latter the original recital, the comparison to be instituted merely reveals again the independence of the Assyrian version, as shown in the use of kibsu, “tread” (IV, 2, 46), for šêpu, “foot” (l. 216), and of i-na-uš, “quake” (line 50), as against ir-tu-tu (ll. 221 and 226). Such variants as dGish êribam ûl iddin (l. 217) against dGilgamesh ana šurûbi ûl namdin (IV, 2, 47), and again iṣṣabtûma kima lîm, “they grappled like an ox,” against iṣṣabtûma ina bâb bît emuti, “they grappled at the gate of the family house” (IV, 2, 48), all point once more to the literary independence of the Assyrian version. The end of the conflict and the reconciliation of the two heroes is likewise missing in the Assyrian version. It may have been referred to at the beginning of column 3¹³⁹ of Tablet IV.
Coming to the Yale tablet, the few passages in which a comparison [60]may be instituted with the fourth tablet of the Assyrian version, to which in a general way it must correspond, are not sufficient to warrant any conclusions, beyond the confirmation of the literary independence of the Assyrian version. The section comprised within lines 72–89, where Enkidu’s grief at his friend’s decision to fight Ḫuwawa is described140, and he makes confession of his own physical exhaustion, may correspond to Tablet IV, column 4, of the Assyrian version. This would fit in with the beginning of the reverse, the first two lines of which (136–137) correspond to column 5 of the fourth tablet of the Assyrian version, with a variation “seven-fold fear”141 as against “fear of men” in the Assyrian version. If lines 138–139 (in column 4) of the Yale tablet correspond to line 7 of column 5 of Tablet IV of the Assyrian version, we would again have an illustration of the elaboration of the later version by the addition of lines 3–6. But beyond this we have merely the comparison of the description of Ḫuwawa “Whose roar is a flood, whose mouth is fire, and whose breath is death” which occurs twice in the Yale tablet (lines 110–111 and 196–197), with the same phrase in the Assyrian version Tablet IV, 5, 3—but here, as just pointed out, with an elaboration. Practically, therefore, the entire Yale tablet represents an addition to our knowledge of the Ḫuwawa episode, and until we are fortunate enough to discover more fragments of the fourth tablet of the Assyrian version, we must content ourselves with the conclusions reached from a comparison of the Pennsylvania tablet with the parallels in the Assyrian version. 
It may be noted as a general point of resemblance in the exterior form of the old Babylonian and Assyrian versions that both were inscribed on tablets containing six columns, three on the obverse and three on the reverse; and that the length of the tablets—an average of 40 to 50 lines—was about the same, thus revealing in the external form a conventional size for the tablets in the older period, which was carried over into later times. [61]
1 See for further details of this royal library, Jastrow, Civilization of Babylonia and Assyria, p. 21 seq. 2 Das Babylonische Nimrodepos (Leipzig, 1884–1891), supplemented by Haupt’s article Die Zwölfte Tafel des Babylonischen Nimrodepos in BA I, pp. 48–79, containing the fragments of the twelfth tablet. The fragments of the Epic in Ashurbanapal’s library—some sixty—represent portions of several copies. Sin-liḳî-unnini—perhaps from Erech, since this name appears as that of a family in tablets from Erech (see Clay, Legal Documents from Erech, Index, p. 73)—is named in a list of texts (K 9717—Haupt’s edition No. 51, line 18) as the editor of the Epic, though probably he was not the only compiler. Since the publication of Haupt’s edition, a few fragments were added by him as an appendix to Alfred Jeremias’ Izdubar-Nimrod (Leipzig, 1891), Plates II–IV, and two more are embodied in Jensen’s transliteration of all the fragments in the Keilinschriftliche Bibliothek VI, pp. 116–265, with elaborate notes, pp. 421–531. Furthermore a fragment, obtained from supplementary excavations at Kouyunjik, has been published by L. W. King in his Supplement to the Catalogue of the Cuneiform Tablets in the Kouyunjik Collection of the British Museum No. 56 and PSBA Vol. 36, pp. 64–68. Recently a fragment of the 6th tablet from the excavations at Assur has been published by Ebeling, Keilschrifttexte aus Assur Religiösen Inhalts No. 115, and one may expect further portions to turn up.
The designation “Nimrod Epic,” on the supposition that the hero of the Babylonian Epic is identical with Nimrod, the “mighty hunter” of Genesis 10, has now been generally abandoned, in the absence of any evidence that the Babylonian hero bore a name like Nimrod. For all that, the description of Nimrod as the “mighty hunter” and the occurrence of a “hunter” in the Babylonian Epic (Assyrian version Tablet I)—though he is not the hero—points to a confusion in the Hebrew form of the borrowed tradition between Gilgamesh and Nimrod. The latest French translation of the Epic is by Dhorme, Choix de Textes Religieux Assyro-Babyloniens (Paris, 1907), pp. 182–325; the latest German translation by Ungnad-Gressmann, Das Gilgamesch-Epos (Göttingen, 1911), with a valuable analysis and discussion. These two translations now supersede Jensen’s translation in the Keilinschriftliche Bibliothek, which, however, is still valuable because of the detailed notes, containing a wealth of lexicographical material. Ungnad also gave a partial translation in Gressmann-Ranke, Altorientalische Texte und Bilder I, pp. 39–61. In English, we have translations of substantial portions by Muss-Arnolt in Harper’s Assyrian and Babylonian Literature (New York, 1901), pp. 324–368; by Jastrow, Religion of Babylonia and Assyria (Boston, 1898), Chap. XXIII; by Clay in Light on the Old Testament from Babel, pp. 78–84; by Rogers in Cuneiform Parallels to the Old Testament, pp. 80–103; and most recently by Jastrow in Sacred Books and Early Literature of the East (ed. C. F. Horne, New York, 1917), Vol. I, pp. 187–220. 3 See Luckenbill in JAOS, Vol. 37, p. 452 seq. Prof. Clay, it should be added, clings to the older reading, Hammurabi, which is retained in this volume. 4 ZA, Vol. 14, pp. 277–292.
5 The survivor of the Deluge is usually designated as Ut-napishtim in the Epic, but in one passage (Assyrian version, Tablet XI, 196), he is designated as Atra-ḫasis “the very wise one.” Similarly, in a second version of the Deluge story, also found in Ashurbanapal’s library (IV R² additions, p. 9, line 11). The two names clearly point to two versions, which in accordance with the manner of ancient compositions were merged into one. See an article by Jastrow in ZA, Vol. 13, pp. 288–301. 6 Published by Scheil in Recueil des Travaux, etc. Vol. 20, pp. 55–58. 7 The text does not form part of the Gilgamesh Epic, as the colophon, differing from the one attached to the Epic, shows. 8 Ein altbabylonisches Fragment des Gilgamosepos (MVAG 1902, No. 1). 9 On these variant forms of the two names see the discussion below, p. 24. 10 The passage is paralleled by Ecc. 9, 7–9. See Jastrow, A Gentle Cynic, p. 172 seq. 11 Among the Nippur tablets in the collection of the University of Pennsylvania Museum. The fragment was published by Dr. Poebel in his Historical and Grammatical Texts No. 23. See also Poebel in the Museum Journal, Vol. IV, p. 47, and an article by Dr. Langdon in the same Journal, Vol. VII, pp. 178–181, though Langdon fails to credit Dr. Poebel with the discovery and publication of the important tablet. 12 No. 55 in Langdon’s Historical and Religious Texts from the Temple Library of Nippur (Munich, 1914). 13 No. 5 in his Sumerian Liturgical Texts. (Philadelphia, 1917) 14 See on this name below, p. 23. 15 See further below, p. 37 seq. 16 See Poebel, Historical and Grammatical Texts, No. 1, and Jastrow in JAOS, Vol. 36, pp. 122–131 and 274–299. 17 See an article by Jastrow, Sumerian and Akkadian Views of Beginnings (JAOS Vol. 36, pp. 274–299). 18 See on this point Eduard Meyer, Sumerier und Semiten in Babylonien (Berlin, 1906), p. 107 seq., whose view is followed in Jastrow, Civilization of Babylonia and Assyria, p. 121. 
See also Clay, Empire of the Amorites (Yale University Press, 1919), p. 23 et seq. 19 See the discussion below, p. 24 seq. 20 Dr. Poebel published an article on the tablet in OLZ, 1914, pp. 4–6, in which he called attention to the correct name for the mother of Gilgamesh, which was settled by the tablet as Ninsun. 21 Historical Texts No. 2, Column 2, 26. See the discussion in Historical and Grammatical Texts, p. 123, seq. 22 See Fostat in OLZ, 1915, p. 367. 23 Publications of the University of Pennsylvania Museum, Babylonian Section, Vol. X, No. 3 (Philadelphia, 1917). It is to be regretted that Dr. Langdon should not have given full credit to Dr. Poebel for his discovery of the tablet. He merely refers in an obscure footnote to Dr. Poebel’s having made a copy. 24 E.g., in the very first note on page 211, and again in a note on page 213. 25 Dr. Langdon neglected to copy the signs 4 šú-si = 240 which appear on the edge of the tablet. He also misunderstood the word šú-tu-ur in the colophon which he translated “written,” taking the word from a stem šaṭâru, “write.” The form šú-tu-ur is III, 1, from atâru, “to be in excess of,” and indicates, presumably, that the text is a copy “enlarged” from an older original. See the Commentary to the colophon, p. 86. 26 Museum Journal, Vol. VIII, p. 29. 27 See below, p. 23. 28 I follow the enumeration of tablets, columns and lines in Jensen’s edition, though some fragments appear to have been placed by him in a wrong position. 29 According to Bezold’s investigation, Verbalsuffixformen als Alterskriterien babylonisch-assyrischer Inschriften (Heidelberg Akad. d. Wiss., Philos.-Histor. Klasse, 1910, 9te Abhandlung), the bulk of the tablets in Ashurbanapal’s library are copies of originals dating from about 1500 B.C. It does not follow, however, that all the copies date from originals of the same period. 
Bezold reaches the conclusion on the basis of various forms for verbal suffixes, that the fragments from the Ashurbanapal Library actually date from three distinct periods ranging from before c. 1450 to c. 700 B.C. 30 “Before thou comest from the mountain, Gilgamesh in Erech will see thy dreams,” after which the dreams are recounted by the woman to Enkidu. The expression “thy dreams” means here “dreams about thee.” (Tablet I, 5, 23–24). 31 Lines 100–101. 32 In a paper read before the American Oriental Society at New Haven, April 4, 1918. 33 See the commentary to col. 4 of the Yale tablet for further details. 34 This is no doubt the correct reading of the three signs which used to be read Iz-tu-bar or Gish-du-bar. The first sign has commonly the value Gish, the second can be read Gin or Gi (Brünnow No. 11900) and the third Mash as well as Bar. See Ungnad in Ungnad-Gressmann, Das Gilgamesch-Epos, p. 76, and Poebel, Historical and Grammatical Texts, p. 123. 35 So also in Sumerian (Zimmern, Sumerische Kultlieder aus altbabylonischer Zeit, No. 196, rev. 14 and 16.) 36 The sign used, LUM (Brünnow No. 11183), could have the value ḫu as well as ḫum. 37 The addition “father-in-law of Moses” to the name Ḫobab b. Re’uel in this passage must refer to Re’uel, and not to Ḫobab. In Judges 4, 11, the gloss “of the Bene Ḫobab, the father-in-law of Moses” must be separated into two: (1) “Bene Ḫobab,” and (2) “father-in-law of Moses.” The latter addition rests on an erroneous tradition, or is intended as a brief reminder that Ḫobab is identical with the son of Re’uel. 38 See his List of Personal Names from the Temple School of Nippur, p. 122. Ḫu-um-ba-bi-tu and ši-kin ḫu-wa-wa also occur in Omen Texts (CT XXVII, 4, 8–9 = Pl. 3, 17 = Pl. 6, 3–4 = CT XXVIII, 14, 12). The contrast to ḫuwawa is ligru, “dwarf” (CT XXVII, 4, 12 and 14 = Pl. 6, 7.9 = Pl. 3, 19). See Jastrow, Religion Babyloniens und Assyriens, II, p. 913, Note 7. 
Ḫuwawa, therefore, has the force of “monster.” 39 Ungnad-Gressmann, Das Gilgamesch-Epos, p. 111 seq. 40 Ungnad, l. c., p. 77, called attention to this name, but failed to draw the conclusion that Ḫu(m)baba therefore belongs to the West and not to the East. 41 First pointed out by Ungnad in OLZ 1910, p. 306, on the basis of CT XVIII, 30, 10, where En-gi-dú appears in the column furnishing phonetic readings. 42 See Clay, Amurru, pp. 74, 129, etc. 43 Tablet I, 2, 39–40; 3, 6–7 and 33–34; 4, 3–4. 44 Tablet I, 2, 1 and IX, 2, 16. Note also the statement about Gilgamesh that “his body is flesh of the gods” (Tablet IX, 2, 14; X, 1, 7). 45 BOR IV, p. 264. 46 Lewin, Die Scholien des Theodor bar Koni zur Patriarchengeschichte (Berlin, 1905), p. 2. See Gressmann in Ungnad-Gressmann, Das Gilgamesch-Epos, p. 83, who points out that the first element of גלמגוס compared with the second of גמיגמוס gives the exact form that we require, namely, Gilgamos. 47 Tablet I, col. 2, is taken up with this episode. 48 See Poebel, Historical and Grammatical Texts, p. 123. 49 See Poebel, Historical Texts No. 2, col. 2, 26. 50 Hilprecht, Old Babylonian Inscriptions I, 1 No. 26. 51 Delitzsch, Assyrische Lesestücke, p. 88, VI, 2–3. Cf. also CT XXV, 28 (K 7659) 3, where we must evidently supply [Esigga]-tuk, for which in the following line we have again Gish-bil-ga-mesh as an equivalent. See Meissner, OLZ 1910, 99. 52 See, e.g., Barton, Haverford Collection II No. 27, Col. I, 14, etc. 53 Deimel, Pantheon Babylonicum, p. 95. 54 CT XII, 50 (K 4359) obv. 17. 55 See Barton, Origin and Development of Babylonian Writing, II, p. 99 seq., for various explanations, though all centering around the same idea of the picture of fire in some form. 56 See the passages quoted by Poebel, Historical and Grammatical Texts, p. 126. 57 E.g., Genesis 4, 20, Jabal, “the father of tent-dwelling and cattle holding;” Jubal (4, 21), “the father of harp and pipe striking.” 58 See particularly the plays (in the J.
Document) upon the names of the twelve sons of Jacob, which are brought forward either as tribal characteristics, or as suggested by some incident or utterance by the mother at the birth of each son. 59 The designation is variously explained by Arabic writers. See Beidhawi’s Commentary (ed. Fleischer), to Súra 18, 82. 60 The writing Gish-gi-mash as an approach to the pronunciation Gilgamesh would thus represent the beginning of the artificial process which seeks to interpret the first syllable as “hero.” 61 See above, p. 27. 62 Poebel, Historical Texts, p. 115 seq. 63 Many years ago (BA III, p. 376) I equated Etana with Ethan in the Old Testament—therefore a West Semitic name. 64 See Clay, The Empire of the Amorites, p. 80. 65 Professor Clay strongly favors an Amoritic origin also for Gilgamesh. His explanation of the name is set forth in his recent work on The Empire of the Amorites, page 89, and is also referred to in his work on Amurru, page 79, and in his volume of Miscellaneous Inscriptions in the Yale Babylonian Collection, page 3, note. According to Professor Clay the original form of the hero’s name was West Semitic, and was something like Bilga-Mash, the meaning of which was perhaps “the offspring of Mash.” For the first element in this division of the name cf. Piliḳam, the name of a ruler of an early dynasty, and Balaḳ of the Old Testament. In view of the fact that the axe figures so prominently in the Epic as an instrument wielded by Gilgamesh, Professor Clay furthermore thinks it reasonable to assume that the name was interpreted by the Babylonian scribe as “the axe of Mash.” In this way he would account for the use of the determinative for weapons, which is also the sign Gish, in the name. It is certainly noteworthy that the ideogram Gish-Tún in the later form of Gish-Tún-mash = pašu, “axe,” CT XVI, 38:14b, etc. Tun also = pilaḳu “axe,” CT xii, 10:34b. 
Names with similar element (besides Piliḳam) are Belaḳu of the Hammurabi period, Bilaḳḳu of the Cassite period, etc. It is only proper to add that Professor Jastrow assumes the responsibility for the explanation of the form and etymology of the name Gilgamesh proposed in this volume. The question is one in regard to which legitimate differences of opinion will prevail among scholars until through some chance a definite decision, one way or the other, can be reached. 66 me-iḫ-rù (line 191). 67 Tablet I, 5, 23. Cf. I, 3, 2 and 29. 68 Tablet IV, 4, 7 and I, 5, 3. 69 Assyrian version, Tablet II, 3b 34, in an address of Shamash to Enkidu. 70 So Assyrian version, Tablet VIII, 3, 11. Also supplied VIII, 5, 20 and 21; and X, 1, 46–47 and 5, 6–7. 71 Tablet XII, 3, 25. 72 Ward, Seal Cylinders of Western Asia, Chap. X, and the same author’s Cylinders and other Ancient Oriental Seals—Morgan collection Nos. 19–50. 73 E.g., Ward No. 192, Enkidu has human legs like Gilgamesh; also No. 189, where it is difficult to say which is Gilgamesh, and which is Enkidu. The clothed one is probably Gilgamesh, though not infrequently Gilgamesh is also represented as nude, or merely with a girdle around his waist. 74 E.g., Ward, Nos. 173, 174, 190, 191, 195 as well as 189 and 192. 75 On the other hand, in Ward Nos. 459 and 461, the conflict between the two heroes is depicted with the heroes distinguished in more conventional fashion, Enkidu having the hoofs of an animal, and also with a varying arrangement of beard and hair. 76 See Jastrow, Religion of Babylonia and Assyria (Boston, 1898), p. 468 seq. 77 Ungnad-Gressmann, Das Gilgamesch-Epos, p. 90 seq. 78 Pennsylvania tablet, l. 198 = Assyrian version, Tablet IV, 2, 37. 79 “Enkidu blocked the gate” (Pennsylvania tablet, line 215) = Assyrian version Tablet IV, 2, 46: “Enkidu interposed his foot at the gate of the family house.” 80 Pennsylvania tablet, lines 218 and 224. 81 Yale tablet, line 198; also to be supplied lines 13–14. 
82 Yale tablet, lines 190 and 191. 83 PSBA 1914, 65 seq. = Jensen III, 1a, 4–11, which can now be completed and supplemented by the new fragment. 84 I.e., Enkidu will save Gilgamesh. 85 These two lines impress one as popular sayings—here applied to Enkidu. 86 King’s fragment, col. I, 13–27, which now enables us to complete Jensen III, 1a, 12–21. 87 Yale tablet, lines 252–253. 88 Yale tablet, lines 143–148 = Assyrian version, Tablet IV, 6, 26 seq. 89 Assyrian version, Tablet III, 2a, 13–14. 90 Lines 215–222. 91 Assyrian version, Tablet V, Columns 3–4. We have to assume that in line 13 of column 4 (Jensen, p. 164), Enkidu takes up the thread of conversation, as is shown by line 22: “Enkidu brought his dream to him and spoke to Gilgamesh.” 92 Assyrian version, Tablet VI, lines 146–147. 93 Lines 178–183. 94 Lines 176–177. 95 Tablet VII, Column 6. 96 Assyrian version, Tablet VI, 200–203. These words are put into the mouth of Gilgamesh (lines 198–199). It is, therefore, unlikely that he would sing his own praise. Both Jensen and Ungnad admit that Enkidu is to be supplied in at least one of the lines. 97 Lines 109 and 112. 98 Assyrian version, Tablet IX, 1, 8–9. 99 Tablet VIII, 5, 2–6. 100 So also Gressmann in Ungnad-Gressmann, Das Gilgamesch-Epos, p. 97, regards Enkidu as the older figure. 101 See Jastrow, Adam and Eve in Babylonian Literature, AJSL, Vol. 15, pp. 193–214. 102 Assyrian version, Tablet I, 2, 31–36. 103 It will be recalled that Enkidu is always spoken of as “born in the field.” 104 Note the repetition ibtani “created” in line 33 of the “man of Anu” and in line 35 of the offspring of Ninib. The creation of the former is by the “heart,” i.e., by the will of Aruru, the creation of the latter is an act of moulding out of clay. 105 Tablet I, Column 3. 106 Following as usual the enumeration of lines in Jensen’s edition. 
107 An analogy does not involve a dependence of one tale upon the other, but merely implies that both rest on similar traditions, which may have arisen independently. 108 Note that the name of Eve is not mentioned till after the fall (Genesis 3, 20). Before that she is merely ishsha, i.e., “woman,” just as in the Babylonian tale the woman who guides Enkidu is ḫarimtu, “woman.” 109 “And he drank and became drunk” (Genesis 9, 21). 110 “His heart became glad and his face shone” (Pennsylvania Tablet, lines 100–101). 111 That in the combination of this Enkidu with tales of primitive man, inconsistent features should have been introduced, such as the union of Enkidu with the woman as the beginning of a higher life, whereas the presence of a hunter and his father shows that human society was already in existence, is characteristic of folk-tales, which are indifferent to details that may be contradictory to the general setting of the story. 112 Pennsylvania tablet, lines 102–104. 113 Line 105. 114 Tablet I, 1, 9. See also the reference to the wall of Erech as an “old construction” of Gilgamesh, in the inscription of An-Am in the days of Sin-gamil (Hilprecht, Old Babylonian Inscriptions, I, No. 26). Cf. IV R² 52, 3, 53. 115 The invariable designation in the Assyrian version as against Uruk ribîtim, “Erech of the plazas,” in the old Babylonian version. 116 In Ungnad-Gressmann, Das Gilgamesch-Epos, p. 123 seq. 117 See Jensen, p. 266. Gilgamesh is addressed as “judge,” as the one who inspects the divisions of the earth, precisely as Shamash is celebrated. In line 8 of the hymn in question, Gilgamesh is in fact addressed as Shamash. 118 The darkness is emphasized with each advance in the hero’s wanderings (Tablet IX, col. 5). 119 This tale is again a nature myth, marking the change from the dry to the rainy season. The Deluge is an annual occurrence in the Euphrates Valley through the overflow of the two rivers.
Only the canal system, directing the overflow into the fields, changed the curse into a blessing. In contrast to the Deluge, we have in the Assyrian creation story the drying up of the primeval waters so that the earth makes its appearance with the change from the rainy to the dry season. The world is created in the spring, according to the Akkadian view which is reflected in the Biblical creation story, as related in the P. document. See Jastrow, Sumerian and Akkadian Views of Beginnings (JAOS, Vol. 36, p. 295 seq.). 120 Aš-am in Sumerian corresponding to the Akkadian Šabaṭu, which conveys the idea of destruction. 121 The month is known as the “Mission of Ishtar” in Sumerian, in allusion to another nature myth which describes Ishtar’s disappearance from earth and her mission to the lower world. 122 Historical Texts No. 1. The Sumerian name of the survivor is Zi-ū-gíd-du or perhaps Zi-ū-sū-du (cf. King, Legends of Babylon and Egypt, p. 65, note 4), signifying “He who lengthened the day of life,” i.e., the one of long life, of which Ut-napishtim (“Day of Life”) in the Assyrian version seems to be an abbreviated Akkadian rendering, with the omission of the verb. Such is King’s view, which is here followed. See also CT XVIII, 30, 9, and Langdon, Sumerian Epic of Paradise, p. 90, who, however, enters upon further speculations that are fanciful. 123 See the translation in Ungnad-Gressmann, Das Gilgamesch-Epos, pp. 69 seq. and 73. 124 According to Professor Clay, quite certainly Amurru, just as in the case of Enkidu. 125 Gressmann in Ungnad-Gressmann, Das Gilgamesch-Epos, p. 100 seq. touches upon this motif, but fails to see the main point that the companions are also twins or at least brothers. Hence such examples as Abraham and Lot, David and Jonathan, Achilles and Patroclus, Eteokles and Polyneikes, are not parallels to Gilgamesh-Enkidu, but belong to the enlargement of the motif so as to include companions who are not regarded as brothers. 126 Or Romus.
See Rendell Harris, l. c., p. 59, note 2. 127 One might also include the primeval pair Yama-Yami with their equivalents in Iranian mythology (Carnoy, Iranian Mythology, p. 294 seq.). 128 Becoming, however, a triad and later increased to seven. Cf. Rendell Harris, l. c., p. 32. 129 I am indebted to my friend, Professor A. J. Carnoy, of the University of Louvain, for having kindly gathered and placed at my disposal material on the “twin-brother” motif from Indo-European sources, supplemental to Rendell Harris’ work. 130 On the other hand, Uruk mâtum for the district of Erech, i.e., the territory over which the city holds sway, appears in both versions (Pennsylvania tablet, l. 10 = Assyrian version I, 5, 36). 131 “My likeness” (line 27). It should be noted, however, that lines 32–44 of I, 5, in Jensen’s edition are part of a fragment K 9245 (not published, but merely copied by Bezold and Johns, and placed at Jensen’s disposal), which may represent a duplicate to I, 6, 23–34, with which it agrees entirely except for one line, viz., line 34 of K 9245 which is not found in column 6, 23–34. If this be correct, then there is lacking after line 31 of column 5, the interpretation of the dream given in the Pennsylvania tablet in lines 17–23. 132 ina šap-li-ki, literally, “below thee,” whereas in the old Babylonian version we have ana ṣi-ri-ka, “towards thee.” 133 Repeated I, 6, 28. 134 ul-tap-rid ki-is-su-šú-ma. The verb is from parâdu, “violent.” For kissu, “strong,” see CT XVI, 25, 48–49. Langdon (Gilgamesh Epic, p. 211, note 5) renders the phrase: “he shook his murderous weapon!!”—another illustration of his haphazard way of translating texts. 135 Shown by the colophon (Jeremias, Izdubar-Nimrod, Plate IV). 136 Lines 42–43 must be taken as part of the narrative of the compiler, who tells us that after the woman had informed Enkidu that Gilgamesh already knew of Enkidu’s coming through dreams interpreted by Ninsun, Gilgamesh actually set out and encountered Enkidu.
137 Tablet I, col. 4. See also above, p. 19. 138 IV, 2, 44–50. The word ullanum (l. 43), “once” or “since,” points to the following being a reference to a former recital, and not an original recital. 139 Only the lower half (Haupt’s edition, p. 82) is preserved. 140 “The eyes of Enkidu were filled with tears,” corresponding to IV, 4, 10. 141 Unless indeed the number “seven” is a slip for the sign ša. See the commentary to the line.

Pennsylvania Tablet

The 240 lines of the six columns of the text are enumerated in succession, with an indication on the margin where a new column begins. This method, followed also in the case of the Yale tablet, seems preferable to Langdon’s breaking up of the text into Obverse and Reverse, with a separate enumeration for each of the six columns. In order, however, to facilitate a comparison with Langdon’s edition, a table is added:

Obverse
Col. I, 1 = Line 1 of our text.
Col. I, 5 = Line 5
Col. I, 10 = Line 10
Col. I, 15 = Line 15
Col. I, 20 = Line 20
Col. I, 25 = Line 25
Col. I, 30 = Line 30
Col. I, 35 = Line 35
Col. II, 1 = Line 41
Col. II, 5 = Line 45
Col. II, 10 = Line 50
Col. II, 15 = Line 55
Col. II, 20 = Line 60
Col. II, 25 = Line 65
Col. II, 30 = Line 70
Col. II, 35 = Line 75
Col. III, 1 = Line 81
Col. III, 5 = Line 85
Col. III, 10 = Line 90
Col. III, 15 = Line 95
Col. III, 20 = Line 100
Col. III, 25 = Line 105
Col. III, 30 = Line 110
Col. III, 35 = Line 115

Reverse
Col. I, 1 (= Col. IV) = Line 131 of our text.
Col. I, 5 = Line 135
Col. I, 10 = Line 140
Col. I, 15 = Line 145
Col. I, 20 = Line 150
Col. I, 25 = Line 155
Col. I, 30 = Line 160
Col. II, 1 (= Col. V) = Line 171
Col. II, 5 = Line 175
Col. II, 10 = Line 180
Col. II, 15 = Line 185
Col. II, 20 = Line 190
Col. II, 25 = Line 195
Col. II, 30 = Line 200
Col. III, 1 (= Col. VI) = Line 208
Col. III, 5 = Line 212
Col. III, 10 = Line 217
Col. III, 15 = Line 222
Col. III, 20 = Line 227
Col. III, 25 = Line 232
Col. III, 30 = Line 237
Col. III, 33 = Line 240

[62]
Pennsylvania Tablet.
Transliteration.
Col. I.
1 it-bi-e-ma dGiš šú-na-tam i-pa-áš-šar
2 iz-za-kàr-am a-na um-mi-šú
3 um-mi i-na šá-at mu-ši-ti-ia
4 šá-am-ḫa-ku-ma at-ta-na-al-la-ak
5 i-na bi-ri-it it-lu-tim
6 ib-ba-šú-nim-ma ka-ka-bu šá-ma-i
7 [ki]-iṣ-rù šá A-nim im-ḳu-ut a-na ṣi-ri-ia
8 áš-ši-šú-ma ik-ta-bi-it e-li-ia
9 ú-ni-iš-šú-ma nu-uš-šá-šú ú-ul il-ti-’i
10 Urukki ma-tum pa-ḫi-ir e-li-šú
11 it-lu-tum ú-na-šá-ku ši-pi-šú
12 ú-um-mi-id-ma pu-ti
13 i-mi-du ia-ti
14 áš-ši-a-šú-ma ab-ba-la-áš-šú a-na ṣi-ri-ki
15 um-mi dGiš mu-di-a-at ka-la-ma
16 iz-za-kàr-am a-na dGiš
17 mi-in-di dGiš šá ki-ma ka-ti
18 i-na ṣi-ri i-wa-li-id-ma
19 ú-ra-ab-bi-šú šá-du-ú
20 ta-mar-šú-ma [kima Sal(?)] ta-ḫa-du at-ta
21 it-lu-tum ú-na-šá-ku ši-pi-šú
22 tí-iṭ-ṭi-ra-áš-[šú tu-ut]-tu-ú-ma
23 ta-tar-ra-[as-su] a-na ṣi-[ri]-ia
24 [uš]-ti-nim-ma i-ta-mar šá-ni-tam
[63]
25 [šú-na]-ta i-ta-wa-a-am a-na um-mi-šú
26 [um-mi] a-ta-mar šá-ni-tam
27 [šú-na-tu a-ta]-mar e-mi-a i-na su-ḳi-im
28 [šá Uruk]ki ri-bi-tim
29 ḫa-aṣ-ṣi-nu na-di-i-ma
30 e-li-šú pa-aḫ-ru
31 ḫa-aṣ-ṣi-nu-um-ma šá-ni bu-nu-šú
32 a-mur-šú-ma aḫ-ta-du a-na-ku
33 a-ra-am-šú-ma ki-ma áš-šá-tim
34 a-ḫa-ab-bu-ub el-šú
35 el-ki-šú-ma áš-ta-ka-an-šú
36 a-na a-ḫi-ia
37 um-mi dGiš mu-da-at [ka]-la-ma
38 [iz-za-kàr-am a-na dGiš]
39 [dGiš šá ta-mu-ru amêlu]
40 [ta-ḫa-ab-bu-ub ki-ma áš-šá-tim el-šú]
Col. II.
41 áš-šum uš-[ta]-ma-ḫa-ru it-ti-ka
42 dGiš šú-na-tam i-pa-šar
43 dEn-ki-[dũ wa]-ši-ib ma-ḫar ḫa-ri-im-tim
44 ur-[šá ir]-ḫa-mu di-da-šá(?)
ip-tí-[e]
45 [dEn-ki]-dũ im-ta-ši a-šar i-wa-al-du
46 ûm, 6 ù 7 mu-ši-a-tim
47 dEn-[ki-dũ] ti-bi-i-ma
48 šá-[am-ka-ta] ir-ḫi
49 ḫa-[ri-im-tum pa-a]-šá i-pu-šá-am-ma
50 iz-za-[kàr-am] a-na dEn-ki-dũ
51 a-na-tal-ka dEn-ki-dũ ki-ma ili ta-ba-áš-ši
52 am-mi-nim it-ti na-ma-áš-te-e
53 ta-at-ta-[na-al]-ak ṣi-ra-am
[64]
54 al-kam lu-úr-di-ka
55 a-na libbi [Urukki] ri-bi-tim
56 a-na bît [el]-lim mu-šá-bi šá A-nim
57 dEn-ki-dũ ti-bi lu-ru-ka
58 a-na Ê-[an]-na mu-šá-bi šá A-nim
59 a-šar [dGiš gi]-it-ma-[lu] ne-pi-ši-tim
60 ù at-[ta] ki-[ma Sal ta-ḫa]-bu-[ub]-šú
61 ta-[ra-am-šú ki-ma] ra-ma-an-ka
62 al-ka ti-ba i-[na] ga-ag-ga-ri
63 ma-a-ag-ri-i-im
64 iš-me a-wa-as-sa im-ta-ḫar ga-ba-šá
65 mi-il-[kum] šá aššatim
66 im-ta-ḳu-ut a-na libbi-šú
67 iš-ḫu-ut li-ib-šá-am
68 iš-ti-nam ú-la-ab-bi-iš-sú
69 li-ib-[šá-am] šá-ni-a-am
70 ši-i it-ta-al-ba-áš
71 ṣa-ab-tat ga-as-su
72 ki-ma [ili] i-ri-id-di-šú
73 a-na gu-up-ri šá-ri-i-im
74 a-šar tar-ba-ṣi-im
75 i-na [áš]-ri-šú [im]-ḫu-ru ri-ia-ú
76 [ù šú-u dEn-ki-dũ i-lit-ta-šú šá-du-um-ma]
77 [it-ti ṣabâti-ma ik-ka-la šam-ma]
78 [it-ti bu-lim maš-ḳa-a i-šat-ti]
79 [it-ti na-ma-áš-te-e mê i-ṭab lib-ba-šú]
(Perhaps one additional line missing.)
Col. III.
81 ši-iz-ba šá na-ma-áš-te-e
82 i-te-en-ni-ik
83 a-ka-lam iš-ku-nu ma-ḫar-šú
84 ib-tí-ik-ma i-na-at-tal
85 ù ip-pa-al-la-as
[65]
86 ú-ul i-di dEn-ki-dũ
87 aklam a-na a-ka-lim
88 šikaram a-na šá-te-e-im
89 la-a lum-mu-ud
90 ḫa-ri-im-tum pi-šá i-pu-šá-am-ma
91 iz-za-kàr-am a-na dEn-ki-dũ
92 a-ku-ul ak-lam dEn-ki-dũ
93 zi-ma-at ba-la-ṭi-im
94 šikaram ši-ti ši-im-ti ma-ti
95 i-ku-ul a-ak-lam dEn-ki-dũ
96 a-di ši-bi-e-šú
97 šikaram iš-ti-a-am
98 7 aṣ-ṣa-am-mi-im
99 it-tap-šar kab-ta-tum i-na-an-gu
100 i-li-iṣ libba-šú-ma
101 pa-nu-šú [it]-tam-ru
102 ul-tap-pi-it [lùŠÚ]-I
103 šú-ḫu-ra-am pa-ga-ar-šú
104 šá-am-nam ip-ta-šá-áš-ma
105 a-we-li-iš i-we
106 il-ba-áš li-ib-šá-am
107 ki-ma mu-ti i-ba-áš-ši
108 il-ki ka-ak-ka-šú
109 la-bi ú-gi-ir-ri
110 uš-sa-ak-pu re’ûti mu-ši-a-tim
111 ut-tap-pi-iš šib-ba-ri
112 la-bi uk-ta-ši-id
113 it-ti-[lu] na-ki-[di-e] ra-bu-tum
114 dEn-ki-dũ ma-aṣ-ṣa-ar-šú-nu
115 a-we-lum giš-ru-um
116 iš-te-en it-lum
117 a-na [na-ki-di-e(?) i]-za-ak-ki-ir
(About five lines missing.)
Col. IV.
(About eight lines missing.)
131 i-ip-pu-uš ul-ṣa-am
132 iš-ši-ma i-ni-i-šú
133 i-ta-mar a-we-lam
[66]
134 iz-za-kàr-am a-na ḫarimtim
135 šá-am-ka-at uk-ki-ši a-we-lam
136 a-na mi-nim il-li-kam
137 zi-ki-ir-šú lu-uš-šú
138 ḫa-ri-im-tum iš-ta-si a-we-lam
139 i-ba-uš-su-um-ma i-ta-mar-šú
140 e-di-il e-eš ta-ḫi-[il-la]-am
141 lim-nu a-la-ku ma-na-aḫ-[ti]-ka
142 e-pi-šú i-pu-šá-am-ma
143 iz-za-kàr-am a-na dEn-[ki-dũ]
144 bi-ti-iš e-mu-tim ik ……
145 ši-ma-a-at ni-ši-i-ma
146 tu-a-(?)-ar e-lu-tim
147 a-na âli(?) dup-šak-ki-i e-ṣi-en
148 uk-la-at âli(?) e-mi-sa a-a-ḫa-tim
149 a-na šarri šá Urukki ri-bi-tim
150 pi-ti pu-uk epiši(-ši) a-na ḫa-a-a-ri
151 a-na dGiš šarri šá Urukki ri-bi-tim
152 pi-ti pu-uk epiši(-ši)
153 a-na ḫa-a-a-ri
154 áš-ša-at ši-ma-tim i-ra-aḫ-ḫi
155 šú-ú pa-na-nu-um-ma
156 mu-uk wa-ar-ka-nu
157 i-na mi-il-ki šá ili ga-bi-ma
158 i-na bi-ti-iḳ a-bu-un-na-ti-šú
159 ši-ma-as-su
160 a-na zi-ik-ri it-li-im
161 i-ri-ku pa-nu-šú
(About three lines missing.)
[67]
Col. V.
(About six lines missing.)
171 i-il-la-ak [dEn-ki-dũ i-na pa-ni]
172 u-šá-am-ka-at [wa]-ar-ki-šú
173 i-ru-ub-ma a-na libbi Urukki ri-bi-tim
174 ip-ḫur um-ma-nu-um i-na ṣi-ri-šú
175 iz-zi-za-am-ma i-na su-ḳi-im
176 šá Urukki ri-bi-tim
177 pa-aḫ-ra-a-ma ni-šú
178 i-ta-wa-a i-na ṣi-ri-šú
179 a-na ṣalam dGiš ma-ši-il pi-it-tam
180 la-nam šá-pi-il
181 si-ma …. [šá-ki-i pu]-uk-ku-ul
182 ............. i-pa-ka-du
183 i-[na mâti da-an e-mu]-ki i-wa
184 ši-iz-ba šá na-ma-aš-te-e
185 i-te-en-ni-ik
186 ka-a-a-na i-na [libbi] Urukki kak-ki-a-tum
187 it-lu-tum ú-te-el-li-lu
188 šá-ki-in ur-šá-nu
189 a-na itli šá i-šá-ru zi-mu-šú
190 a-na dGiš ki-ma i-li-im
191 šá-ki-iš-šum me-iḫ-rù
192 a-na dIš-ḫa-ra ma-a-a-lum
193 na-di-i-ma
194 dGiš it-[ti-il-ma wa-ar-ka-tim]
195 i-na mu-ši in-ni-[ib-bi]-it
196 i-na-ag-šá-am-ma
197 it-ta-[zi-iz dEn-ki-dũ] i-na sûḳim
198 ip-ta-ra-[aṣ a-la]-ak-tam
199 šá dGiš
200 [a-na e-pi-iš] da-na-ni-iš-šú
(About three lines missing.)
[68]
Col. VI.
(About four lines missing.)
208 šar(?)-ḫa
209 dGiš …
210 i-na ṣi-ri-[šú il-li-ka-am dEn-ki-dũ]
211 i-ḫa-an-ni-ib [pi-ir-ta-šú]
212 it-bi-ma [il-li-ik]
213 a-na pa-ni-šú
214 it-tam-ḫa-ru i-na ri-bi-tum ma-ti
215 dEn-ki-dũ ba-ba-am ip-ta-ri-ik
216 i-na ši-pi-šú
217 dGiš e-ri-ba-am ú-ul id-di-in
218 iṣ-ṣa-ab-tu-ma ki-ma li-i-im
219 i-lu-du
220 zi-ip-pa-am ’i-bu-tu
221 i-ga-rum ir-tu-tu
222 dGiš ù dEn-ki-dũ
223 iṣ-ṣa-ab-tu-ú-ma
224 ki-ma li-i-im i-lu-du
225 zi-ip-pa-am ’i-bu-tu
226 i-ga-rum ir-tu-tú
227 ik-mi-is-ma dGiš
228 i-na ga-ag-ga-ri ši-ip-šú
229 ip-ši-iḫ uz-za-šú-ma
230 i-ni-iḫ i-ra-as-su
231 iš-tu i-ra-su i-ni-ḫu
232 dEn-ki-dũ a-na šá-ši-im
233 iz-za-kàr-am a-na dGiš
234 ki-ma iš-te-en-ma um-ma-ka
235 ú-li-id-ka
236 ri-im-tum šá su-pu-ri
237 dNin-sun-na
238 ul-lu e-li mu-ti ri-eš-ka
239 šar-ru-tú šá ni-ši
240 i-ši-im-kum dEn-lil
241 duppu 2 kam-ma
242 šú-tu-ur e-li …………………
243 4 šú-ši
[62]
Translation.
Col. I.
1 Gish sought to interpret the dream;
2 Spoke to his mother:
3 “My mother, during my night
4 I became strong and moved about
5 among the heroes;
6 And from the starry heaven
7 A meteor(?)
of Anu fell upon me:
8 I bore it and it grew heavy upon me,
9 I became weak and its weight I could not endure.
10 The land of Erech gathered about it.
11 The heroes kissed its feet.1
12 It was raised up before me.
13 They stood me up.2
14 I bore it and carried it to thee.”
15 The mother of Gish, who knows all things,
16 Spoke to Gish:
17 “Some one, O Gish, who like thee
18 In the field was born and
19 Whom the mountain has reared,
20 Thou wilt see (him) and [like a woman(?)] thou wilt rejoice.
21 Heroes will kiss his feet.
22 Thou wilt spare [him and wilt endeavor]
23 To lead him to me.”
24 He slept and saw another
[63]
25 Dream, which he reported to his mother:
26 [“My mother,] I have seen another
27 [Dream.] My likeness I have seen in the streets
28 [Of Erech] of the plazas.
29 An axe was brandished, and
30 They gathered about him;
31 And the axe made him angry.
32 I saw him and I rejoiced,
33 I loved him as a woman,
34 I embraced him.
35 I took him and regarded him
36 As my brother.”
37 The mother of Gish, who knows all things,
38 [Spoke to Gish]:
39 [“O Gish, the man whom thou sawest,]
40 [Whom thou didst embrace like a woman].
Col. II.
41 (means) that he is to be associated with thee.”
42 Gish understood the dream.
43 [As] Enki[du] was sitting before the woman,
44 [Her] loins(?) he embraced, her vagina(?) he opened.
45 [Enkidu] forgot the place where he was born.
46 Six days and seven nights
47 Enkidu continued
48 To cohabit with [the courtesan].
49 [The woman] opened her [mouth] and
50 Spoke to Enkidu:
51 “I gaze upon thee, O Enkidu, like a god art thou!
52 Why with the cattle
53 Dost thou [roam] across the field?
[64]
54 Come, let me lead thee
55 into [Erech] of the plazas,
56 to the holy house, the dwelling of Anu,
57 O, Enkidu arise, let me conduct thee
58 To Eanna, the dwelling of Anu,
59 The place [where Gish is, perfect] in vitality.
60 And thou [like a wife wilt embrace] him.
61 Thou [wilt love him like] thyself.
62 Come, arise from the ground
63 (that is) cursed.”
64 He heard her word and accepted her speech.
65The counsel of the woman 66Entered his heart. 67She stripped off a garment, 68Clothed him with one. 69Another garment 70She kept on herself. 71She took hold of his hand. 72Like [a god(?)] she brought him 73To the fertile meadow, 74The place of the sheepfolds. 75In that place they received food; 76[For he, Enkidu, whose birthplace was the mountain,] 77[With the gazelles he was accustomed to eat herbs,] 78[With the cattle to drink water,] 79[With the water beings he was happy.] (Perhaps one additional line missing.) Col. III. 81Milk of the cattle 82He was accustomed to suck. 83Food they placed before him, 84He broke (it) off and looked 85And gazed.[65] 86Enkidu had not known 87To eat food. 88To drink wine 89He had not been taught. 90The woman opened her mouth and 91Spoke to Enkidu: 92“Eat food, O Enkidu, 93The provender of life! 94Drink wine, the custom of the land!” 95Enkidu ate food 96Till he was satiated. 97Wine he drank, 98Seven goblets. 99His spirit was loosened, he became hilarious. 100His heart became glad and 101His face shone. 102[The barber(?)] removed 103The hair on his body. 104He was anointed with oil. 105He became manlike. 106He put on a garment, 107He was like a man. 108He took his weapon; 109Lions he attacked, 110(so that) the night shepherds could rest. 111He plunged the dagger; 112Lions he overcame. 113The great [shepherds] lay down; 114Enkidu was their protector. 115The strong man, 116The unique hero, 117To [the shepherds(?)] he speaks: (About five lines missing.) Col. IV. (About eight lines missing.) 131Making merry. 132He lifted up his eyes, 133He sees the man.[66] 134He spoke to the woman: 135“O, courtesan, lure on the man. 136Why has he come to me? 137His name I will destroy.” 138The woman called to the man 139Who approaches to him3 and he beholds him. 140“Away! 
why dost thou [quake(?)] 141Evil is the course of thy activity.”4 142Then he5 opened his mouth and 143Spoke to Enkidu: 144”[To have (?)] a family home 145Is the destiny of men, and 146The prerogative(?) of the nobles. 147For the city(?) load the workbaskets! 148Food supply for the city lay to one side! 149For the King of Erech of the plazas, 150Open the hymen(?), perform the marriage act! 151For Gish, the King of Erech of the plazas, 152Open the hymen(?), 153Perform the marriage act! 154With the legitimate wife one should cohabit. 155So before, 156As well as in the future.6 157By the decree pronounced by a god, 158From the cutting of his umbilical cord 159(Such) is his fate.” 160At the speech of the hero 161His face grew pale. (About three lines missing.) [67] Col. V. (About six lines missing.) 171[Enkidu] went [in front], 172And the courtesan behind him. 173He entered into Erech of the plazas. 174The people gathered about him. 175As he stood in the streets 176Of Erech of the plazas, 177The men gathered, 178Saying in regard to him: 179“Like the form of Gish he has suddenly become; 180shorter in stature. 181[In his structure high(?)], powerful, 182.......... overseeing(?) 183In the land strong of power has he become. 184Milk of cattle 185He was accustomed to suck.” 186Steadily(?) in Erech ..... 187The heroes rejoiced. 188He became a leader. 189To the hero of fine appearance, 190To Gish, like a god, 191He became a rival to him.7 192For Ishḫara a couch 193Was stretched, and 194Gish [lay down, and afterwards(?)] 195In the night he fled. 196He approaches and 197[Enkidu stood] in the streets. 198He blocked the path 199of Gish. 200At the exhibit of his power, (About three lines missing.) [68] Col. VI. (About four lines missing.) 208Strong(?) … 209Gish 210Against him [Enkidu proceeded], 211[His hair] luxuriant. 212He started [to go] 213Towards him. 214They met in the plaza of the district. 215Enkidu blocked the gate 216With his foot, 217Not permitting Gish to enter. 
218They seized (each other), like oxen, 219They fought. 220The threshold they demolished; 221The wall they impaired. 222Gish and Enkidu 223Seized (each other). 224Like oxen they fought. 225The threshold they demolished; 226The wall they impaired. 227Gish bent 228His foot to the ground,8 229His wrath was appeased, 230His breast was quieted. 231When his breast was quieted, 232Enkidu to him 233Spoke, to Gish: 234“As a unique one, thy mother 235bore thee. 236The wild cow of the stall,9 237Ninsun, 238Has exalted thy head above men. 239Kingship over men 240Enlil has decreed for thee. 241Second tablet, 242enlarged beyond [the original(?)]. 243240 lines. [69] 1 I.e., paid homage to the meteor. 2 I.e., the heroes of Erech raised me to my feet, or perhaps in the sense of “supported me.” 3 I.e., Enkidu. 4 I.e., “thy way of life.” 5 I.e., the man. 6 I.e., an idiomatic phrase meaning “for all times.” 7 I.e., Enkidu became like Gish, godlike. Cf. col. 2, 11. 8 He was thrown and therefore vanquished. 9 Epithet given to Ninsun. See the commentary to the line. Commentary on the Pennsylvania Tablet. Line 1. The verb tibû with pašâru expresses the aim of Gish to secure an interpretation for his dream. This disposes of Langdon’s note 1 on page 211 of his edition, in which he also erroneously speaks of our text as “late.” Pašâru is not a variant of zakâru. Both verbs occur just as here in the Assyrian version I, 5, 25. Line 3. ina šât mušitia, “in this my night,” i.e., in the course of this night of mine. A curious way of putting it, but the expression occurs also in the Assyrian version, e.g., I, 5, 26 (parallel passage to ours) and II, 4a, 14. In the Yale tablet we find, similarly, mu-ši-it-ka (l. 262), “thy night,” i.e., “at night to thee.” Line 5. Before Langdon put down the strange statement of Gish “wandering about in the midst of omens” (misreading id-da-tim for it-lu-tim), he might have asked himself the question, what it could possibly mean. How can one walk among omens? 
Line 6. ka-ka-bu šá-ma-i must be taken as a compound term for “starry heaven.” The parallel passage in the Assyrian version (Tablet I, 5, 27) has the ideograph for star, with the plural sign as a variant. Literally, therefore, “The starry heaven (or “the stars in heaven”) was there,” etc. Langdon’s note 2 on page 211 rests on an erroneous reading. Line 7. kiṣru šá Anim, “mass of Anu,” appears to be the designation of a meteor, which might well be described as a “mass” coming from Anu, i.e., from the god of heaven who becomes the personification of the heavens in general. In the Assyrian version (I, 5, 28) we have kima ki-iṣ-rù, i.e., “something like a mass of heaven.” Note also I, 3, 16, where in a description of Gilgamesh, his strength is said to be “strong like a mass (i.e., a meteor) of heaven.” Line 9. For nuššašu ûl iltê we have a parallel in the Hebrew phrase נִלְאֵיתִי נְשֹׂא (Isaiah 1, 14). Line 10. Uruk mâtum, as the designation for the district of Erech, occurs in the Assyrian version, e.g., I, 5, 31, and IV, 2, 38; also to be supplied, I, 6, 23. For paḫir the parallel in the Assyrian version has iz-za-az (I, 5, 31), but in VI, 197, we find paḫ-ru and paḫ-ra. Line 17. mi-in-di does not mean “truly” as Langdon translates, but “some one.” It occurs also in the Assyrian version X, 1, 13, mi-in-di-e ma-an-nu-ṵ, “this is some one who,” etc. [70] Line 18. Cf. Assyrian version I, 5, 3, and IV, 4, 7, ina ṣiri âlid—both passages referring to Enkidu. Line 21. Cf. Assyrian version II, 3b, 38, with malkê, “kings,” as a synonym of itlutum. Line 23. ta-tar-ra-as-sú from tarâṣu, “direct,” “guide,” etc. Line 24. I take uš-ti-nim-ma as III, 2, from išênu (יָשֵׁן), the verb underlying šittu, “sleep,” and šuttu, “dream.” Line 26. Cf. Assyrian version I, 6, 21—a complete parallel. Line 28. Uruk ri-bi-tim, the standing phrase in both tablets of the old Babylonian version, for which in the Assyrian version we have Uruk su-pu-ri. 
The former term suggests the “broad space” outside of the city or the “common” in a village community, while supûri, “enclosed,” would refer to the city within the walls. Dr. W. F. Albright (in a private communication) suggests “Erech of the plazas” as a suitable translation for Uruk ribîtim. A third term, Uruk mâtum (see above, note to line 10), though designating rather the district of which Erech was the capital, appears to be used as a synonym to Uruk ribîtim, as may be concluded from the phrase i-na ri-bi-tum ma-ti (l. 214 of the Pennsylvania tablet), which clearly means the “plaza” of the city. One naturally thinks of רְחֹבֹת עִיר in Genesis 10, 11—the equivalent of Babylonian ri-bi-tu âli—which can hardly be the name of a city. It appears to be a gloss, as is הִיא הָעִיר הַגְּדֹלָה at the end of v. 12. The latter gloss is misplaced, since it clearly describes “Nineveh,” mentioned in v. 11. Inasmuch as רְחֹבֹת עִיר immediately follows the mention of Nineveh, it seems simplest to take the phrase as designating the “outside” or “suburbs” of the city, a complete parallel, therefore, to ri-bi-tu mâti in our text. Nineveh, together with the “suburbs,” forms the “great city.” Uruk ribîtim is, therefore, a designation for “greater Erech,” proper to a capital city, which by its gradual growth would take in more than its original confines. “Erech of the plazas” must have come to be used as an honorific designation of this important center as early as 2000 B. C., whereas later, perhaps because of its decline, the epithet no longer seemed appropriate and was replaced by the more modest designation of “walled Erech,” with an allusion to the tradition which ascribed the building of the wall of the city to Gilgamesh. At all [71] events, all three expressions, “Erech of the plazas,” “Erech walled” and “Erech land,” are to be regarded as synonymous. The position once held by Erech follows also from its ideographic designation (Brünnow No. 
4796) by the sign “house” with a “gunufied” extension, which conveys the idea of Unu = šubtu, or “dwelling” par excellence. The pronunciation Unug or Unuk (see the gloss u-nu-uk, VR 23, 8a), composed of unu, “dwelling,” and ki, “place,” is hardly to be regarded as older than Uruk, which is to be resolved into uru, “city,” and ki, “place,” but rather as a play upon the name, both Unu + ki and Uru + ki conveying the same idea of the city or the dwelling place par excellence. As the seat of the second oldest dynasty according to Babylonian traditions (see Poebel’s list in Historical and Grammatical Texts No. 2), Erech no doubt was regarded as having been at one time “the city,” i.e., the capital of the entire Euphrates Valley. Line 31. A difficult line for which Langdon proposes the translation: “Another axe seemed his visage”!!—which may be picturesque, but hardly a description befitting a hero. How can a man’s face seem to be an axe? Langdon attaches šá-ni in the sense of “second” to the preceding word “axe,” whereas šanî bunušu, “change of his countenance” or “his countenance being changed,” is to be taken as a phrase to convey the idea of “being disturbed,” “displeased” or “angry.” The phrase is of the same kind as the well-known šunnu ṭêmu, “changing of reason,” to denote “insanity.” See the passages in Muss-Arnolt, Assyrian Dictionary, pp. 355 and 1068. In Hebrew, too, we have the same two phrases, e.g., וַיְשַׁנּוֹ אֶת־טַעְמוֹ (I Sam. 21, 14 = Ps. 34, 1), “and he changed his reason,” i.e., feigned insanity, and מְשַׁנֶּה פָּנָיו (Job 14, 20), “changing his face,” to indicate a radical alteration in the frame of mind. There is a still closer parallel in Biblical Aramaic: Dan. 3, 19, “The form of his visage was changed,” meaning “he was enraged.” Fortunately, the same phrase occurs also in the Yale tablet (l. 
192), šá-nu-ú bu-nu-šú, in a connection which leaves no doubt that the aroused fury of the tyrant Ḫuwawa is described by it: ”Ḫuwawa heard and his face was changed” precisely, therefore, as we should say—following Biblical usage—“his countenance fell.” Cf. also the phrase pânušu arpu, “his countenance [72]was darkened” (Assyrian version I, 2, 48), to express “anger.” The line, therefore, in the Pennsylvania tablet must describe Enkidu’s anger. With the brandishing of the axe the hero’s anger was also stirred up. The touch was added to prepare us for the continuation in which Gish describes how, despite this (or perhaps just because of it), Enkidu seemed so attractive that Gish instantly fell in love with him. May perhaps the emphatic form ḫaṣinumma (line 31) against ḫaṣinu (line 29) have been used to indicate “The axe it was,” or “because of the axe?” It would be worth while to examine other texts of the Hammurabi period with a view of determining the scope in the use and meaning of the emphatic ma when added to a substantive. Line 32. The combination amur ù aḫtadu occurs also in the El-Amarna Letters, No. 18, 12. Line 34. In view of the common Hebrew, Syriac and Arabic חָבַב “to love,” it seems preferable to read here, as in the other passages in the Assyrian versions (I, 4, 15; 4, 35; 6, 27, etc.), a-ḫa-ab-bu-ub, aḫ-bu-ub, iḫ-bu-bu, etc. (instead of with p), and to render “embrace.” Lines 38–40, completing the column, may be supplied from the Assyrian version I, 6, 30–32, in conjunction with lines 33–34 of our text. The beginning of line 32 in Jensen’s version is therefore to be filled out [ta-ra-am-šú ki]-i. Line 43. The restoration at the beginning of this line En-ki-[dũ wa]-ši-ib ma-ḫar ḫa-ri-im-tim enables us to restore also the beginning of the second tablet of the Assyrian version (cf. the colophon of the fragment 81, 7–27, 93, in Jeremias, Izdubar-Nimrod, plate IV = Jensen, p. 134), [dEn-ki-dũ wa-ši-ib] ma-ḫar-šá. Line 44. 
The restoration of this line is largely conjectural, based on the supposition that its contents correspond in a general way to I, 4, 16, of the Assyrian version. The reading di-da is quite certain, as is also ip-ti-[e]; and since both words occur in the line of the Assyrian version in question, it is tempting to supply at the beginning ur-[šá] = “her loins” (cf. Holma, Namen der Körperteile, etc., p. 101), which is likewise found in the same line of the Assyrian version. At all events the line describes the fascination exercised [73]upon Enkidu by the woman’s bodily charms, which make him forget everything else. Lines 46–47 form a parallel to I, 4, 21, of the Assyrian version. The form šamkatu, “courtesan,” is constant in the old Babylonian version (ll. 135 and 172), as against šamḫatu in the Assyrian version (I, 3, 19, 40, 45; 4, 16), which also uses the plural šam-ḫa-a-ti (II, 3b, 40). The interchange between ḫ and k is not without precedent (cf. Meissner, Altbabylonisches Privatrecht, page 107, note 2, and more particularly Chiera, List of Personal Names, page 37). In view of the evidence, set forth in the Introduction, for the assumption that the Enkidu story has been combined with a tale of the evolution of primitive man to civilized life, it is reasonable to suggest that in the original Enkidu story the female companion was called šamkatu, “courtesan,” whereas in the tale of the primitive man, which was transferred to Enkidu, the associate was ḫarimtu, a “woman,” just as in the Genesis tale, the companion of Adam is simply called ishshâ, “woman.” Note that in the Assyrian parallel (Tablet I, 4, 26) we have two readings, ir-ḫi (imperf.) and a variant i-ri-ḫi (present). The former is the better reading, as our tablet shows. Lines 49–59 run parallel to the Assyrian version I, 4, 33–38, with slight variations which have been discussed above, p. 58, and from which we may conclude that the Assyrian version represents an independent redaction. 
Since in our tablet we have presumably the repetition of what may have been in part at least set forth in the first tablet of the old Babylonian version, we must not press the parallelism with the first tablet of the Assyrian version too far; but it is noticeable nevertheless (1) that our tablet contains lines 57–58 which are not represented in the Assyrian version, and (2) that the second speech of the “woman” beginning, line 62, with al-ka, “come” (just as the first speech, line 54), is likewise not found in the first tablet of the Assyrian version; which on the other hand contains a line (39) not in the Babylonian version, besides the detailed answer of Enkidu (I 4, 42–5, 5). Line 6, which reads “Enkidu and the woman went (il-li-ku) to walled Erech,” is also not found in the second tablet of the old Babylonian version. Line 63. For magrû, “accursed,” see the frequent use in Astrological texts (Jastrow, Religion Babyloniens und Assyriens II, page [74]450, note 2). Langdon, by his strange error in separating ma-a-ag-ri-im into two words ma-a-ak and ri-i-im, with a still stranger rendering: “unto the place yonder of the shepherds!!”, naturally misses the point of this important speech. Line 64 corresponds to I, 4, 40, of the Assyrian version, which has an additional line, leading to the answer of Enkidu. From here on, our tablet furnishes material not represented in the Assyrian version, but which was no doubt included in the second tablet of that version of which we have only a few fragments. Line 70 must be interpreted as indicating that the woman kept one garment for herself. Ittalbaš would accordingly mean, “she kept on.” The female dress appears to have consisted of an upper and a lower garment. Line 72. The restoration “like a god” is favored by line 51, where Enkidu is likened to a god, and is further confirmed by l. 190. Line 73. gupru is identical with gu-up-ri (Thompson, Reports of the Magicians and Astrologers, etc., 223 rev. 2 and 223a rev. 
8), and must be correlated to gipâru (Muss-Arnolt, Assyrian Dictionary, p. 229a), “planted field,” “meadow,” and the like. Thompson’s translation “men” (as though a synonym of gabru) is to be corrected accordingly. Line 74. There is nothing missing between a-šar and tar-ba-ṣi-im. Line 75. ri-ia-ú, which Langdon renders “shepherd,” is the equivalent of the Arabic riʿy and Hebrew רְעִי “pasturage,” “fodder.” We have usually the feminine form ri-i-tu (Muss-Arnolt, Assyrian Dictionary, p. 990b). The break at the end of the second column is not serious. Evidently Enkidu, still accustomed to live like an animal, is first led to the sheepfolds, and this suggests a repetition of the description of his former life. Of the four or five lines missing, we may conjecturally restore four, on the basis of the Assyrian version, Tablet I, 4, 2–5, or I, 2, 39–41. This would then join on well to the beginning of column 3. Line 81. Both here and in l. 52 our text has na-ma-áš-te-e, as against nam-maš-ši-i in the Assyrian version, e.g., Tablet I, 2, 41; 4, 5, etc.,—the feminine form, therefore, as against the masculine. Langdon’s note 3 on page 213 is misleading. In astrological texts we also find nam-maš-te; e.g., Thompson, Reports of the Magicians and Astrologers, etc., No. 200, Obv. 2. [75] Line 93. zi-ma-at (for simat) ba-la-ṭi-im is not “conformity of life” as Langdon renders, but that which “belongs to life” like si-mat pag-ri-šá, “belonging to her body,” in the Assyrian version III, 2a, 3 (Jensen, page 146). “Food,” says the woman, “is the staff of life.” Line 94. Langdon’s strange rendering “of the conditions and fate of the land” rests upon an erroneous reading (see the corrections, Appendix I), which is the more inexcusable because in line 97 the same ideogram, Kàš = šikaru, “wine,” occurs, and is correctly rendered by him. Šimti mâti is not the “fate of the land,” but the “fixed custom of the land.” Line 98. 
aṣ-ṣa-mi-im (plural of aṣṣamu), which Langdon takes as an adverb in the sense of “times,” is a well-known word for a large “goblet,” which occurs in Incantation texts, e.g., CT XVI, 24, obv. 1, 19, mê a-ṣa-am-mi-e šú-puk, “pour out goblets of water.” Line 18 of the passage shows that aṣammu is a Sumerian loan word. Line 99. it-tap-šar, I, 2, from pašâru, “loosen.” In combination with kabtatum (from kabitatum, yielding two forms: kabtatum, by elision of i, and kabittu, by elision of a), “liver,” pašâru has the force of becoming cheerful. Cf. ka-bit-ta-ki lip-pa-šir (ZA V., p. 67, line 14). Line 100. Note the customary combination of “liver” (kabtatum) and “heart” (libbu) for “disposition” and “mind,” just as in the standing phrase in penitential prayers: “May thy liver be appeased, thy heart be quieted.” Line 102. The restoration [lùŠÚ]-I = gallabu “barber” (Delitzsch, Sumer. Glossar, p. 267) was suggested to me by Dr. H. F. Lutz. The ideographic writing “raising the hand” is interesting as recalling the gesture of shaving or cutting. Cf. a reference to a barber in Lutz, Early Babylonian Letters from Larsa, No. 109, 6. Line 103. Langdon has correctly rendered šuḫuru as “hair,” and has seen that we have here a loan-word from the Sumerian Suḫur = kimmatu, “hair,” according to the Syllabary Sb 357 (cf. Delitzsch, Sumer. Glossar., p. 253). For kimmatu, “hair,” more specifically hair of the head and face, see Holma, Namen der Körperteile, page 3. The same sign Suḫur or Suḫ (Brünnow No. 8615), with Lal, i.e., “hanging hair,” designates the “beard” (ziḳnu, cf. Brünnow, No. 8620, and Holma, l. c., p. 36), and it is interesting to [76] note that we have šuḫuru (introduced as a loan-word) for the barbershop, according to II R, 21, 27c (= CT XII, 41). Ê suḫur(ra) (i.e., house of the hair) = šú-ḫu-ru. In view of all this, we may regard as assured Holma’s conjecture to read šú-[ḫur-ma-šú] in the list 93074 obv. (MVAG 1904, p. 203; and Holma, Beiträge z. Assyr. Lexikon, p. 
36), as the Akkadian equivalent to Suḫur-Maš-Ḫa and the name of a fish, so called because it appeared to have a double “beard” (cf. Holma, Namen der Körperteile). One is tempted, furthermore, to see in the difficult word שכירה (Isaiah 7, 20) a loan-word from our šuḫuru, and to take the words אֶת־הָרֹאשׁ וְשַׂעַר הָרַגְלַיִם “the head and hair of the feet” (euphemistic for the hair around the privates), as an explanatory gloss to the rare word שכירה for “hair” of the body in general—just as in the passage in the Pennsylvania tablet. The verse in Isaiah would then read, “The Lord on that day will shave with the razor the hair (השכירה), and even the beard will be removed.” The rest of the verse would represent a series of explanatory glosses: (a) “Beyond the river” (i.e., Assyria), a gloss to יְגַלַּח; (b) “with the king of Assyria,” a gloss to בְּתַעַר “with a razor;” and (c) “the hair of the head and hair of the feet,” a gloss to השכירה. For “hair of the feet” we have an interesting equivalent in Babylonian šu-ḫur (and šú-ḫu-ur) šêpi (CT XII, 41, 23–24 c-d). Cf. also Boissier, Documents Assyriens relatifs aux Présages, p. 258, 4–5. The Babylonian phrase is like the Hebrew one to be interpreted as a euphemism for the hair around the male or female organ. To be sure, the change from ה to כ in השכירה constitutes an objection, but not a serious one in the case of a loan-word, which would aim to give the pronunciation of the original word, rather than the correct etymological equivalent. The writing with aspirated כ fulfills this condition. (Cf. šamkatum and šamḫatum, above p. 73). The passage in Isaiah being a reference to Assyria, the prophet might be tempted to use a foreign word to make his point more emphatic. To take השכירה as “hired,” as has hitherto been done, and to translate “with a hired razor,” is not only to suppose a very wooden metaphor, but is grammatically difficult, since השכירה would be a feminine adjective attached to a masculine substantive. 
Coming back to our passage in the Pennsylvania tablet, it is to [77]be noted that Enkidu is described as covered “all over his body with hair” (Assyrian version, Tablet I, 2, 36) like an animal. To convert him into a civilized man, the hair is removed. Line 107. mutu does not mean “husband” here, as Langdon supposes, but must be taken as in l. 238 in the more general sense of “man,” for which there is good evidence. Line 109. la-bi (plural form) are “lions”—not “panthers” as Langdon has it. The verb ú-gi-ir-ri is from gâru, “to attack.” Langdon by separating ú from gi-ir-ri gets a totally wrong and indeed absurd meaning. See the corrections in the Appendix. He takes the sign ú for the copula (!!) which of course is impossible. Line 110. Read uš-sa-ak-pu, III, 1, of sakâpu, which is frequently used for “lying down” and is in fact a synonym of ṣalâlu. See Muss-Arnolt, Assyrian Dictionary, page 758a. The original has very clearly Síb (= rê’u, “shepherd”) with the plural sign. The “shepherds of the night,” who could now rest since Enkidu had killed the lions, are of course the shepherds who were accustomed to watch the flocks during the night. Line 111. ut-tap-pi-iš is II, 2, napâšu, “to make a hole,” hence “to plunge” in connection with a weapon. Šib-ba-ri is, of course, not “mountain goats,” as Langdon renders, but a by-form to šibbiru, “stick,” and designates some special weapon. Since on seal cylinders depicting Enkidu killing lions and other animals the hero is armed with a dagger, this is presumably the weapon šibbaru. Line 113. Langdon’s translation is again out of the question and purely fanciful. The traces favor the restoration na-ki-[di-e], “shepherds,” and since the line appears to be a parallel to line 110, I venture to suggest at the beginning [it-ti]-lu from na’âlu, “lie down”—a synonym, therefore, to sakâpu in line 110. The shepherds can sleep quietly after Enkidu has become the “guardian” of the flocks. 
In the Assyrian version (Tablet II, 3a, 4) Enkidu is called a na-kid, “shepherd,” and in the preceding line we likewise have lùNa-Kid with the plural sign, i.e., “shepherds.” This would point to nakidu being a Sumerian loan-word, unless it is vice versa, a word that has gone over into the Sumerian from Akkadian. Is perhaps the fragment in question (K 8574) in the Assyrian version (Haupt’s ed. No. 25) the parallel to our passage? If in line 4 of this fragment we could read šú for sa, i.e., na-kid-šú-nu, “their shepherd,” we would have a [78] parallel to line 114 of the Pennsylvania tablet, with na-kid as a synonym to maṣṣaru, “protector.” The preceding line would then be completed as follows: [it-ti-lu]-nim-ma na-kidmeš [ra-bu-tum] (or perhaps only it-ti-lu-ma, since the nim is not certain) and would correspond to line 113 of the Pennsylvania tablet. Inasmuch as the writing on the tiny fragment is very much blurred, it is quite possible that in line 2 we must read šib-ba-ri (instead of bar-ba-ri), which would furnish a parallel to line 111 of the Pennsylvania tablet. The difference between Bar and Šib is slight, and the one sign might easily be mistaken for the other in the case of close writing. The continuation of line 2 of the fragment would then correspond to line 112 of the Pennsylvania tablet, while line 1 of the fragment might be completed [re-e]-u-ti(?) šá [mu-ši-a-tim], though this is by no means certain. The break at the close of column 3 (about 5 lines) and the top of column 4 (about 8 lines) is a most serious interruption in the narrative, and makes it difficult to pick up the thread where the tablet again becomes readable. We cannot be certain whether the “strong man, the unique hero” who addresses some one (lines 115–117) is Enkidu or Gish or some other personage, but presumably Gish is meant. In the Assyrian version, Tablet I, 3, 2 and 29, we find Gilgamesh described as the “unique hero” and in l. 
234 of the Pennsylvania tablet Gish is called “unique,” while again, in the Assyrian version, Tablet I, 2, 15 and 26, he is designated as gašru as in our text. Assuming this, whom does he address? Perhaps the shepherds? In either case he receives an answer that rejoices him. If the fragment of the Assyrian version (K 8574) above discussed is the equivalent to the close of column 3 of the Pennsylvania tablet, we may go one step further, and with some measure of assurance assume that Gish is told of Enkidu’s exploits and that the latter is approaching Erech. This pleases Gish, but Enkidu when he sees Gish(?) is stirred to anger and wants to annihilate him. At this point, the “man” (who is probably Gish, though the possibility of a third personage must be admitted) intervenes and in a long speech sets forth the destiny and higher aims of mankind. The contrast between Enkidu and Gish (or the third party) is that between the primitive [79]savage and the civilized being. The contrast is put in the form of an opposition between the two. The primitive man is the stronger and wishes to destroy the one whom he regards as a natural foe and rival. On the other hand, the one who stands on a higher plane wants to lift his fellow up. The whole of column 4, therefore, forms part of the lesson attached to the story of Enkidu, who, identified with man in a primitive stage, is made the medium of illustrating how the higher plane is reached through the guiding influences of the woman’s hold on man, an influence exercised, to be sure, with the help of her bodily charms. Line 135. uk-ki-ši (imperative form) does not mean “take away,” as Langdon (who entirely misses the point of the whole passage) renders, but on the contrary, “lure him on,” “entrap him,” and the like. The verb occurs also in the Yale tablet, ll. 183 and 186. Line 137. Langdon’s note to lu-uš-šú had better be passed over in silence. The form is II. 1, from ešû, “destroy.” Line 139. 
Since the man whom the woman calls approaches Enkidu, the subject of both verbs is the man, and the object is Enkidu; i.e., therefore, “The man approaches Enkidu and beholds him.” Line 140. Langdon’s interpretation of this line again is purely fanciful. E-di-il cannot, of course, be a “phonetic variant” of edir; and certainly the line does not describe the state of mind of the woman. Lines 140–141 are to be taken as an expression of amazement at Enkidu’s appearance. The first word appears to be an imperative in the sense of “Be off,” “Away,” from dâlu, “move, roam.” The second word e-eš, “why,” occurs with the same verb dâlu in the Meissner fragment: e-eš ta-da-al (column 3, 1), “why dost thou roam about?” The verb at the end of the line may perhaps be completed to ta-ḫi-il-la-am. The last sign appears to be am, but may be ma, in which case we should have to complete simply ta-ḫi-il-ma. Taḫîl would be the second person present of ḫîlu. Cf. i-ḫi-il, frequently in astrological texts, e.g., Virolleaud, Adad No. 3, lines 21 and 33. Line 141. The reading lim-nu at the beginning, instead of Langdon’s mi-nu, is quite certain, as is also ma-na-aḫ-ti-ka instead of what Langdon proposes, which gives no sense whatever. Manaḫtu in the sense of the “toil” and “activity of life” (like עָמָל throughout the Book of Ecclesiastes) occurs in the introductory lines to [80]the Assyrian version of the Epic I, 1, 8, ka-lu ma-na-aḫ-ti-[šu], “all of his toil,” i.e., all of his career. Line 142. The subject of the verb cannot be the woman, as Langdon supposes, for the text in that case, e.g., line 49, would have said pi-šá (“her mouth”) not pi-šú (“his mouth”). The long speech, detailing the function and destiny of civilized man, is placed in the mouth of the man who meets Enkidu. In the Introduction it has been pointed out that lines 149 and 151 of the speech appear to be due to later modifications of the speech designed to connect the episode with Gish. 
Assuming this to be the case, the speech sets forth the following five distinct aims of human life: (1) establishing a home (line 144), (2) work (line 147), (3) storing up resources (line 148), (4) marriage (line 150), (5) monogamy (line 154); all of which is put down as established for all time by divine decree (lines 155–157), and as man’s fate from his birth (lines 158–159). Line 144. bi-ti-iš e-mu-ti is for bîti šá e-mu-ti, just as ḳab-lu-uš Ti-a-ma-ti (Assyrian Creation Myth, IV, 65) stands for ḳablu šá Tiamti. Cf. bît e-mu-ti (Assyrian version, IV, 2, 46 and 48). The end of the line is lost beyond recovery, but the general sense is clear. Line 146. tu-a-ar is a possible reading. It may be the construct of tu-a-ru, of frequent occurrence in legal texts and having some such meaning as “right,” “claim” or “prerogative.” See the passages given by Muss-Arnolt, Assyrian Dictionary, p. 1139b. Line 148. The reading uk-la-at, “food,” and then in the wider sense “food supply,” “provisions,” is quite certain. The fourth sign looks like the one for “city.” E-mi-sa may stand for e-mid-sa, “place it.” The general sense of the line, at all events, is clear, as giving the advice to gather resources. It fits in with the Babylonian outlook on life to regard work and wealth as the fruits of work and as a proper purpose in life. Line 150 (repeated lines 152–153) is a puzzling line. To render piti pûk epši (or epiši), as Langdon proposes, “open, addressing thy speech,” is philologically and in every other respect inadmissible. The word pu-uk (which Langdon takes for “thy mouth”!!) can, of course, be nothing but the construct form of pukku, which occurs in the Assyrian version in the sense of “net” (pu-uk-ku I, 2, 9 and 21, and also in the colophon to the eleventh tablet furnishing the [81]beginning of the twelfth tablet (Haupt’s edition No. 56), as well as in column 2, 29, and column 3, 6, of this twelfth tablet). 
In the two last named passages pukku is a synonym of mekû, which from the general meaning of “enclosure” comes to be a euphemistic expression for the female organ. So, for example, in the Assyrian Creation Myth, Tablet IV, 66 (synonym of ḳablu, “waist,” etc.). See Holma, Namen der Körperteile, page 158. Our word pukku must be taken in this same sense as a designation of the female organ—perhaps more specifically the “hymen” as the “net,” though the womb in general might also be designated as a “net” or “enclosure.” Kak-(ši) is no doubt to be read epši, as Langdon correctly saw; or perhaps better, epiši. An expression like ip-ši-šú lul-la-a (Assyrian version, I, 4, 13; also line 19, i-pu-us-su-ma lul-la-a), with the explanation šipir zinništi, “the work of woman” (i.e., after the fashion of woman), shows that epêšu is used in connection with the sexual act. The phrase pitî pûk epiši a-na ḫa-a-a-ri, literally “open the net, perform the act for marriage,” therefore designates the fulfillment of the marriage act, and the line is intended to point to marriage with the accompanying sexual intercourse as one of the duties of man. While the general meaning is thus clear, the introduction of Gish is puzzling, except on the supposition that lines 149 and 151 represent later additions to connect the speech, detailing the advance to civilized life, with the hero. See above, p. 45 seq.

Line 154. aššat šimâtim is the “legitimate wife,” and the line inculcates monogamy as against promiscuous sexual intercourse. We know that monogamy was the rule in Babylonia, though a man could in addition to the wife recognized as the legalized spouse take a concubine, or his wife could give her husband a slave as a concubine. Even in that case, according to the Hammurabi Code, §§145–146, the wife retained her status. The Code throughout assumes that a man has only one wife—the aššat šimâtim of our text.
The phrase “so” (or “that”) before “as afterwards” is to be taken as an idiomatic expression—“so it was and so it should be for all times”—somewhat like the phrase maḫriam ù arkiam, “for all times,” in legal documents (CT VIII, 38c, 22–23). For the use of mûk see Behrens, Assyrisch-Babylonische Briefe, p. 3.

Line 158. i-na bi-ti-iḳ a-bu-un-na-ti-šú. Another puzzling line, for which Langdon proposes “in the work of his presence,” which [82] is as obscure as the original. In a note he says that apunnâti means “nostrils,” which is certainly wrong. There has been considerable discussion about this term (see Holma, Namen der Körperteile, pages 150 and 157), the meaning of which has been advanced by Christian’s discussion in OLZ 1914, p. 397. From this it appears that it must designate a part of the body which could acquire a wider significance so as to be used as a synonym for “totality,” since it appears in a list of equivalents for Dur = nap-ḫa-ru, “totality,” ka-lu-ma, “all,” a-bu-un-na-tum e-ṣi-im-tum, “bony structure,” and kul-la-tum, “totality” (CT XII, 10, 7–10). Christian shows that it may be the “navel,” which could well acquire a wider significance for the body in general; but we may go a step further and specify the “umbilical cord” (tentatively suggested also by Christian) as the primary meaning, then the “navel,” and from this the “body” in general. The structure of the umbilical cord as a series of strands would account for designating it by a plural form abunnâti, as also for the fact that one could speak of a right and left side of the abunnâti. To distinguish between the “umbilical cord” and the “navel,” the ideograph Dur (the common meaning of which is riksu, “bond” [Delitzsch, Sumer. Glossar., p. 150]), was used for the former, while for the latter Li Dur was employed, though the reading in Akkadian in both cases was the same.
The expression “with (or at) the cutting of his umbilical cord” would mean, therefore, “from his birth”—since the cutting of the cord which united the child with the mother marks the beginning of the separate life. Lines 158–159, therefore, in concluding the address to Enkidu, emphasize in a picturesque way that what has been set forth is man’s fate for which he has been destined from birth. [See now Albright’s remarks on abunnatu in the Revue d’Assyriologie 16, pp. 173–175, with whose conclusion, however, that it means primarily “backbone” and then “stature,” I cannot agree.]

In the break of about three lines at the bottom of column 4, and of about six at the beginning of column 5, there must have been set forth the effect of the address on Enkidu and the indication of his readiness to accept the advice; as in a former passage (line 64), Enkidu showed himself willing to follow the woman. At all events the two now proceed to the heart of the city. Enkidu is in front [83] and the woman behind him. The scene up to this point must have taken place outside of Erech—in the suburbs or approaches to the city, where the meadows and the sheepfolds were situated.

Line 174. um-ma-nu-um are not the “artisans,” as Langdon supposes, but the “people” of Erech, just as in the Assyrian version, Tablet IV, 1, 40, where the word occurs in connection with i-dip-pi-ir, which is perhaps to be taken as a synonym of paḫâru, “gather;” so also i-dip-pir (Tablet I, 2, 40) “gathers with the flock.”

Lines 180–182 must have contained the description of Enkidu’s resemblance to Gish, but the lines are too mutilated to permit of any certain restoration. See the corrections (Appendix) for a suggested reading for the end of line 181.

Line 183 can be restored with considerable probability on the basis of the Assyrian version, Tablet I, 3, 3 and 30, where Enkidu is described as one “whose power is strong in the land.”

Lines 186–187.
The puzzling word, to be read apparently kak-ki-a-tum, can hardly mean “weapons,” as Langdon proposes. In that case we should expect kakkê; and, moreover, to so render gives no sense, especially since the verb ú-te-el-li-lu is without much question to be rendered “rejoiced,” and not “purified.” Kakkiatum—if this be the correct reading—may be a designation of Erech like ribîtim.

Lines 188–189 are again entirely misunderstood by Langdon, owing to erroneous readings. See the corrections in the Appendix.

Line 190. i-li-im in this line is used like Hebrew Elohîm, “God.”

Line 191. šakiššum = šakin-šum, as correctly explained by Langdon.

Line 192. With this line a new episode begins which, owing to the gap at the beginning of column 6, is somewhat obscure. The episode leads to the hostile encounter between Gish and Enkidu. It is referred to in column 2 of the fourth tablet of the Assyrian version. Lines 35–50—all that is preserved of this column—form in part a parallel to columns 5–6 of the Pennsylvania tablet, but in much briefer form, since what on the Pennsylvania tablet is the incident itself is on the fourth tablet of the Assyrian version merely a repeated summary of the relationship between the two heroes, leading up to the expedition against Ḫu(m)baba. Lines 38–40 of [84] column 2 of the Assyrian version correspond to lines 174–177 of the Pennsylvania tablet, and lines 44–50 to lines 192–221. It would seem that Gish proceeds stealthily at night to go to the goddess Ishḫara, who lies on a couch in the bît êmuti, the “family house” (Assyrian version, Tablet IV, 2, 46–48). He encounters Enkidu in the street, and the latter blocks Gish’s path, puts his foot in the gate leading to the house where the goddess is, and thus prevents Gish from entering. Thereupon the two have a fierce encounter in which Gish is worsted. The meaning of the episode itself is not clear. Does Enkidu propose to deprive Gish, here viewed as a god (cf.
line 190 of the Pennsylvania tablet = Assyrian version, Tablet I, 4, 45, “like a god”), of his spouse, the goddess Ishḫara—another form of Ishtar? Or are the two heroes, the one a counterpart of the other, contesting for the possession of a goddess? Is it in this scene that Enkidu becomes the “rival” (me-iḫ-rù, line 191 of the Pennsylvania tablet) of the divine Gish? We must content ourselves with having obtained through the Pennsylvania tablet a clearer indication of the occasion of the fight between the two heroes, and leave the further explanation of the episode till a fortunate chance may throw additional light upon it. There is perhaps a reference to the episode in the Assyrian version, Tablet II, 3b, 35–36.

Line 196. For i-na-ag-šá-am (from nagâšu), Langdon proposes the purely fanciful “embracing her in sleep,” whereas it clearly means “he approaches.” Cf. Muss-Arnolt, Assyrian Dictionary, page 645a.

Lines 197–200 appear to correspond to Tablet IV, 2, 35–37, of the Assyrian version, though not forming a complete parallel. We may therefore supply at the beginning of line 35 of the Assyrian version [ittaziz] Enkidu, corresponding to line 197 of the Pennsylvania tablet. Line 36 of IV, 2, certainly appears to correspond to line 200 (dan-nu-ti = da-na-ni-iš-šú).

Line 208. The first sign looks more like šar, though ur is possible.

Line 211 is clearly a description of Enkidu, as is shown by a comparison with the Assyrian version I, 2, 37: [pi]-ti-ik pi-ir-ti-šú uḫ-tan-na-ba kima dNidaba, “The form of his hair sprouted like wheat.” We must therefore supply Enkidu in the preceding line. Tablet IV, 4, 6, of the Assyrian version also contains a reference to the flowing hair of Enkidu. [85]

Line 212. For the completion of the line cf. Harper, Assyrian and Babylonian Letters, No. 214.

Line 214. For ribîtu mâti see the note above to line 28 of column 1.

Lines 215–217 correspond almost entirely to the Assyrian version IV, 2, 46–48.
The variations ki-ib-su in place of šêpu, and kima lîm, “like oxen,” instead of ina bâb êmuti (repeated from line 46), ana šurûbi for êribam, are slight though interesting. The Assyrian version shows that the “gate” in line 215 is “the gate of the family house” in which the goddess Ishḫara lies.

Lines 218–228. The detailed description of the fight between the two heroes is only partially preserved in the Assyrian version.

Line 218. li-i-im is evidently to be taken as plural here as in line 224, just as su-ḳi-im (lines 27 and 175), ri-bi-tim (lines 4, 28, etc.), tarbaṣim (line 74), aṣṣamim (line 98) are plural forms. Our text furnishes, as does also the Yale tablet, an interesting illustration of the vacillation in the Hammurabi period in the twofold use of im: (a) as an indication of the plural (as in Hebrew), and (b) as a mere emphatic ending (lines 63, 73, and 232), which becomes predominant in the post-Hammurabi age.

Line 227. Gilgamesh is often represented on seal cylinders as kneeling, e.g., Ward Seal Cylinders Nos. 159, 160, 165. Cf. also Assyrian version V, 3, 6, where Gilgamesh is described as kneeling, though here in prayer. See further the commentary to the Yale tablet, line 215.

Line 229. We must of course read uz-za-šú, “his anger,” and not uṣ-ṣa-šú, “his javelin,” as Langdon does, which gives no sense.

Line 231. Langdon’s note is erroneous. He again misses the point. The stem of the verb here as in line 230 (i-ni-iḫ) is the common nâḫu, used so constantly in connection with pašâḫu, to designate the cessation of anger.

Line 234. ištên applied to Gish designates him of course as “unique,” not as “an ordinary man,” as Langdon supposes.

Line 236. On this title “wild cow of the stall” for Ninsun, see Poebel in OLZ 1914, page 6, to whom we owe the correct view regarding the name of Gilgamesh’s mother.

Line 238. mu-ti here cannot mean “husband,” but “man” in [86] general. See above note to line 107.
Langdon’s strange misreading ri-eš-su for ri-eš-ka (“thy head”) leads him again to miss the point, namely that Enkidu comforts his rival by telling him that he is destined for a career above that of the ordinary man. He is to be more than a mere prize fighter; he is to be a king, and no doubt in the ancient sense, as the representative of the deity. This is indicated by the statement that the kingship is decreed for him by Enlil. Similarly, Ḫu(m)baba or Ḫuwawa is designated by Enlil to inspire terror among men (Assyrian version, Tablet IV, 5, 2 and 5), i-šim-šú dEnlil = Yale tablet, l. 137, where this is to be supplied. This position accorded to Enlil is an important index for the origin of the Epic, which is thus shown to date from a period when the patron deity of Nippur was acknowledged as the general head of the pantheon. This justifies us in going back several centuries at least before Hammurabi for the beginning of the Gilgamesh story. If it had originated in the Hammurabi period, we should have had Marduk introduced instead of Enlil.

Line 242. As has been pointed out in the corrections to the text (Appendix), šú-tu-ur can only be III, 1, from atâru, “to be in excess of.” It is a pity that the balance of the line is broken off, since this is the first instance of a colophon beginning with the term in question. In some way šutûr must indicate that the copy of the text has been “enlarged.” It is tempting to fill out the line šú-tu-ur e-li [duppi labiri], and to render “enlarged from an original,” as an indication of an independent recension of the Epic in the Hammurabi period. All this, however, is purely conjectural, and we must patiently hope for more tablets of the Old Babylonian version to turn up. The chances are that some portions of the same edition as the Yale and Pennsylvania tablets are in the hands of dealers at present or have been sold to European museums.
The war has seriously interfered with the possibility of tracing the whereabouts of groups of tablets that ought never to have been separated.

[87]

Yale Tablet.

Transliteration.

(About ten lines missing.)

Col. I.
11 .................. [ib]-ri(?)
12 [mi-im-ma(?) šá(?)]-kú-tu wa(?)-ak-rum
13 [am-mi-nim] ta-aḫ-ši-iḫ
14 [an-ni]-a-am [e-pi]-šá-am
15 ...... mi-im[-ma šá-kú-tu(?)]ma-
16 di-iš
17 [am-mi]-nim [taḫ]-ši-iḫ
18 [ur(?)]-ta-du-ú [a-na ki-i]š-tim
19 ši-ip-ra-am it-[ta-šú]-ú i-na [nišê]
20 it-ta-áš-šú-ú-ma
21 i-pu-šú ru-ḫu-tam
22 .................. uš-ta-di-nu
23 ............................. bu
24 ...............................
(About 17 lines missing.)
40 .............. nam-........
41 .................... u ib-[ri] .....
42 .............. ú-na-i-du ......
43 [zi-ik]-ra-am ú-[tí-ir]-ru
44 [a-na] ḫa-ri-[im]-tim
45 [i]-pu(?)-šú a-na sa-[ka]-pu-ti

Col. II.
(About eleven lines missing.)
57 ... šú(?)-mu(?) ...............
58 ma-ḫi-ra-am [šá i-ši-šú]
59 šú-uk-ni-šum-[ma] ...............
60 la-al-la-ru-[tu] ..................
61 um-mi d-[Giš mu-di-a-at ka-la-ma]
62 i-na ma-[ḫar dŠamaš i-di-šá iš-ši]
[88]
63 šá ú
64 i-na-an(?)-[na am-mi-nim]
65 ta-[aš-kun(?) a-na ma-ri-ia li-ib-bi la]
66 ṣa-[li-la te-mid-su]
67 .............................
(About four lines missing.)
72 i-na [šá dEn-ki-dũ im-la-a] di-[im-tam]
73 il-[pu-ut li]-ib-ba-šú-[ma]
74 [zar-biš(?)] uš-ta-ni-[iḫ]
75 [i-na šá dEn]-ki-dũ im-la-a di-im-tam
76 [il-pu-ut] li-ib-ba-šú-ma
77 [zar-biš(?)] uš-ta-ni-[iḫ]
78 [dGiš ú-ta]-ab-bil pa-ni-šú
79 [iz-za-kar-am] a-na dEn-ki-dũ
80 [ib-ri am-mi-nim] i-na-ka
81 [im-la-a di-im]-tam
82 [il-pu-ut li-ib-bi]-ka
83 [zar-biš tu-uš-ta]-ni-iḫ
84 [dEn-ki-dũ pi-šú i-pu-šá]-am-ma
85 iz-za-[kàr-am] a-na dGiš
86 ta-ab-bi-a-tum ib-ri
87 uš-ta-li-pa da-1da-ni-ia
88 a-ḫa-a-a ir-ma-a-ma
89 e-mu-ki i-ni-iš
90 dGiš pi-šú i-pu-šá-am-ma
91 iz-za-kàr-am a-na dEn-ki-dũ
(About four lines missing.)

Col. III.
96 ..... [a-di dḪu]-wa-wa da-pi-nu
97 .................. ra-[am(?)-ma]
98 ................
[ú-ḫal]-li-ik
99 [lu-ur-ra-du a-na ki-iš-ti šá] iserini
[89]
100 ............ lam(?) ḫal-bu
101 ............ [li]-li-is-su
102 .............. lu(?)-up-ti-šú
103 dEn-ki-dũ pi-šú i-pu-šá-am-ma
104 iz-za-kàr-am a-na dGiš
105 i-di-ma ib-ri i-na šadî(-i)
106 i-nu-ma at-ta-la-ku it-ti bu-lim
107 a-na ištên(-en) kas-gíd-ta-a-an nu-ma-at ki-iš-tum
108 [e-di-iš(?)] ur-ra-du a-na libbi-šá
109 d[Ḫu-wa]-wa ri-ig-ma-šú a-bu-bu
110 pi-[šú] dBil-gi-ma
111 na-pi-iš-šú mu-tum
112 am-mi-nim ta-aḫ-ši-iḫ
113 an-ni-a-am e-pi-šá-am
114 ga-[ba]-al-la ma-ḫa-ar
115 [šú]-pa-at dḪu-wa-wa
116 (d)Giš pi-šú i-pu-šá-am-ma
117 [iz-za-k]àr-am a-na dEn-ki-dũ
118 ....... su(?)-lu-li a-šá-ki2-šá
119 ............. [i-na ki-iš]-tim
120 ...............................
121 ik(?) .........................
122 a-na ..........................
123 mu-šá-ab [dḪu-wa-wa] .......
124 ḫa-aṣ-si-nu .................
125 at-ta lu(?) .................
126 a-na-ku lu-[ur-ra-du a-na ki-iš-tim]
127 dEn-ki-dũ pi-šú i-pu-[šá-am-ma]
128 iz-za-kàr-am a-na [dGiš]
129 ki-i ni[il]-la-ak [iš-te-niš(?)]
130 a-na ki-iš-ti [šá iṣerini]
131 na-ṣi-ir-šá dGiš muḳ-[tab-lu]
132 da-a-an la ṣa[-li-lu(?)]
133 dḪu-wa-wa dpi-ir-[ḫu ša (?)]
[90]
134 dAdad iš ..........
135 šú-ú ..................

Col. IV.
136 áš-šúm šú-ul-lu-m[u ki-iš-ti šá iṣerini]
137 pu-ul-ḫi-a-tim 7 [šú(?) i-šim-šú dEnlil]
138 dGiš pi-šú i-pu [šá-am-ma]
139 iz-za-kàr-am a-na [dEn-ki-dũ]
140 ma-an-nu ib-ri e-lu-ú šá-[ru-ba(?)]
141 i-ṭib-ma it-ti dŠamaš da-ri-iš ú-[me-šú]
142 a-we-lu-tum ba-ba-nu ú-tam-mu-šá-[ma]
143 mi-im-ma šá i-te-ni-pu-šú šá-ru-ba
144 at-ta an-na-nu-um-ma ta-dar mu-tam
145 ul iš-šú da-na-nu ḳar-ra-du-ti-ka
146 lu-ul-li-ik-ma i-na pa-ni-ka
147 pi-ka li-iš-si-a-am ṭi-ḫi-e ta-du-ur
148 šum-ma am-ta-ḳu-ut šú-mi lu-uš-zi-iz
149 dGiš mi3-it-ti dḪu-wa-wa da-pi-nim
150 il(?)-ḳu-ut iš-tu
151 i-wa-al-dam-ma tar-bi-a i-na šam-mu(?) Il(?)
152 iš-ḫi-it-ka-ma la-bu ka-la-ma ti-di
153 it- ku(?) ..... [il(?)]-pu-tu-(?) ma .....
154 .............. ka-ma
155 .............. ši pi-ti
156 ............ ki-ma re’i(?)
na-gi-la sa-rak-ti
157 .... [ta-šá-s]i-a-am tu-lim-mi-in li-ib-bi
158 [ga-ti lu]-uš-ku-un-ma
159 [lu-u-ri]-ba-am iṣerini
[91]
160 [šú-ma sá]-ṭa-ru-ú a-na-ku lu-uš-ta-ak-na
161 [pu-tu-ku(?)] ib-ri a-na ki-iš-ka-tim lu-mu-ḫa
162 [be-le-e li-iš-]-pu-ku i-na maḫ-ri-ni
163 [pu-tu]-ku a-na ki-iš-ka-ti-i i-mu-ḫu
164 wa-áš-bu uš-ta-da-nu um-mi-a-nu
165 pa-ši iš-pu-ku ra-bu-tim
166 ḫa-aṣ-si-ni 3 biltu-ta-a-an iš-tap-ku
167 pa-aṭ-ri iš-pu-ku ra-bu-tim
168 me-še-li-tum 2 biltu-ta-a-an
169 ṣi-ip-ru 30 ma-na-ta-a-an šá a-ḫi-ši-na
170 išid(?) pa-aṭ-ri 30 ma-na-ta-a-an ḫuraṣi
171 [d]Giš ù [dEn-ki-]dũ 10 biltu-ta-a-an šá-ak-nu]
172 .... ul-la . .[Uruk]ki 7 i-di-il-šú
173 ...... iš-me-ma um-ma-nu ib-bi-ra
174 [uš-te-(?)]-mi-a i-na sûḳi šá Urukki ri-bi-tim
175 ...... [u-še(?)]-ṣa-šú dGiš
176 [ina sûḳi šá(?) Urukki] ri-bi-tim
177 [dEn-ki-dũ(?) ú]-šá-ab i-na maḫ-ri-šú
178 ..... [ki-a-am(?) i-ga]-ab-bi
179 [........ Urukki ri]-bi-tim
180 [ma-ḫa-ar-šú]

Col. V.
181 dGiš šá i-ga-ab-bu-ú lu-mu-ur
182 šá šú-um-šú it-ta-nam-ma-la ma-ta-tum
183 lu-uk-šú-su-ma i-na ki-iš-ti iṣerini
184 ki-ma da-an-nu pi-ir-ḫu-um šá Urukki
[92]
185 lu-ši-eš-mi ma-tam
186 ga-ti lu-uš-ku-un-ma lu-uk-[šú]4-su-ma iṣerini
187 šú-ma šá-ṭa-ru-ú a-na-ku lu-uš-tak-nam
188 ši-bu-tum šá Urukki ri-bi-tim
189 zi-ik-ra ú-ti-ir-ru a-na dGiš
190 ṣi-iḫ-ri-ti-ma dGiš libbi-ka na-ši-ka
191 mi-im-ma šá te-te-ni-pu-šú la ti-di
192 ni-ši-im-me-ma dḪu-wa-wa šá-nu-ú bu-nu-šú
193 ma-an-nu-um [uš-tam]-ḫa-ru ka-ak-ki-šú
194 a-na ištên(-en) [kas-gíd-ta-a]-an nu-ma-at kišti
195 ma-an-nu šá [ur-ra]-du a-na libbi-šá
196 dḪu-wa-wa ri-ig-ma-šú a-bu-bu
197 pi-šú dBil-gi-ma na-pi-su mu-tum
198 am-mi-nim taḫ-ši-iḫ an-ni-a-am e-pi-šá
199 ga-ba-al-la ma-ḫa-ar šú-pa-at dḪu-wa-wa
200 iš-me-e-ma dGiš zi-ki-ir ma-li-[ki]-šú
201 ip-pa-al-sa-am-ma i-ṣi-iḫ a-na ib-[ri-šú]
202 i-na-an-na ib-[ri] ki-a-am [a-ga-ab-bi]
203 a-pa-al-aḫ-šú-ma a-[al-la-ak a-na kišti]
204 [lu]ul-[lik it-ti-ka a-na ki-iš-ti iṣerini(?)]
(About five lines missing.)
210 ........................ -ma
211 li ...............
-ka
[93]
212 ilu-ka li(?) ..............-ka
213 ḫarrana li-šá-[tir-ka a-na šú-ul-mi]
214 a-na kar šá [Urukki ri-bi-tim]
215 ka-mi-is-ma dGiš [ma-ḫa-ar dŠamaš(?)]
216 a-wa-at i-ga-ab-[bu-šú-ma]
217 a-al-la-ak dŠamaš katâ-[ka a-ṣa-bat]
218 ul-la-nu lu-uš-li-ma na-pi-[iš-ti]
219 te-ir-ra-an-ni a-na kar i-[na Urukki]
220 ṣi-il-[la]m šú-ku-un [a-na ia-a-ši(?)]
221 iš-si-ma dGiš ib-[ri.....]
222 te-ir-ta-šú ..........
223 is(?) ..............
224 tam ................
225 ........................
226 i-nu(?)-[ma] ..................
(About two lines missing.)

Col. VI.
229 [a-na-ku] dGiš [i-ik]-ka-di ma-tum
230 ........... ḫarrana šá la al-[kam] ma-ti-ma
231 .... a-ka-lu ..... la(?) i-di
232 [ul-la-nu] lu-uš-li-[mu] a-na-ku
233 [lu-ud-lul]-ka i-na [ḫ]u-ud li-ib-bi
234 ...... [šú]-ḳu-ut-[ti] la-li-ka
235 [lu-še-šib(?)] - ka i-na kussêmeš
236 ....................... ú-nu-su
237 [bêlêmeš(?)ú-ti-ir]-ru ra-bu-tum
238 [ka-aš-tum] ù iš-pa-tum
239 [i-na] ga-ti iš-ku-nu
240 [il-]te-ki pa-ši
241 ....... -ri iš-pa-as-su
[94]
242 ..... [a-na] ili šá-ni-tam
243 [it-ti pa(?)] - tar-[šú] i-na ši-ip-pi-šú
244 ........
i-ip-pu-šú a-la-kam
245 [ša]-niš ú-ga-ra-bu dGiš
246 [a-di ma]-ti tu-ut-te-ir a-na libbi Urukki
247 [ši-bu]-tum i-ka-ra-bu-šú
248 [a-na] ḫarrani i-ma-li-ku dGiš
249 [la t]a-at-kal dGiš a-na e-[mu]-ḳi-ka
250 [a-]ka-lu šú-wa-ra-ma ú-ṣur ra-ma-an-ka
251 [li]-il-lik dEn-ki-dũ i-na pa-ni-ka
252 [ur-ḫa]-am a-we-ir a-lik ḫarrana(-na)
253 [a-di] šá kišti ni-ri-bi-tim
254 [šá(?)] [d]Ḫu-wa-wa ka-li-šú-nu ši-ip-pi-iḫ(?)-šú
255 [ša(?)a-lik] maḫ-ra tap-pa-a ú-šá-lim
256 [ḫarrana](-na)-šú šú-wa-ra-[ma ú-ṣur ra-ma-na-ka]
257 [li-šak-šid]-ka ir-[ni-ta]-ka dŠamaš
258 [ta]-ak-bi-a-at pi-ka li-kal-li-ma i-na-ka
259 li-ip-ti-ḳu pa-da-nam pi-ḫi-tam
260 ḫarrana li-iš-ta-zi-ik a-na ki-ib-si-ka
261 šá-di-a li-iš-ta-zi-ik a-na šêpi-ka
262 mu-ši-it-ka aw-a-at ta-ḫa-du-ú
263 li-ib-la-ma dLugal-ban-da li-iz-zi-iz-ka
[95]
264 i-na ir-ni-ti-ka
265 ki-ma ṣi-iḫ-ri ir-ni-ta-ka-ma luš-mida(-da)
266 i-na na-ri šá dḪu-wa-wa šá tu-ṣa-ma-ru
267 mi-zi ši-pi-ka
268 i-na bat-ba-ti-ka ḫi-ri bu-ur-tam
269 lu-ka-a-a-nu mê ellu i-na na-di-ka
270 [ka-]su-tim me-e a-na dŠamaš ta-na-di
271 [li-iš]ta-ḫa-sa-as dLugal-ban-da
272 [dEn-ki-]dũ pi-su i-pu-šá-am-ma, iz-za-kàr a-na dGiš
273 [is(?)]-tu(?) ta-áš-dan-nu e-pu-uš a-la-kam
274 [la pa]la-aḫ libbi-ka ia-ti tu-uk-la-ni
275 [šú-ku-]un i-di-a-am šú-pa-as-su
276 [ḫarrana(?)]šá dḪu-wa-wa it-ta-la-ku
277 .......... ki-bi-ma te-[ir]-šú-nu-ti
(Three lines missing.)

L.E.
281 .............. nam-ma-la
282 ............... il-li-ku it-ti-ia
283 ............... ba-ku-nu-ši-im
284 ......... [ul]-la(?)-nu i-na ḫu-ud li-ib-bi
285 [i-na še-me-e] an-ni-a ga-ba-šú
286 e-diš ḫarrana(?) uš-te-[zi-ik]
287 a-lik dGiš lu-[ul-lik a-na pa-ni-ka]
288 li-lik il-ka ..........
289 li-šá-ak-lim-[ka ḫarrana] ......
290 dGiš ù[dEn-ki-dũ] .......
291 mu-di-eš ..........
292 bi-ri-[su-nu] ........

[87]

Translation.

(About ten lines missing.)

Col. I.
11 .................. (my friend?)
12 [Something] that is exceedingly difficult,
13 [Why] dost thou desire
14 [to do this?]
15 .... something (?)
that is very [difficult (?)],
16 [Why dost thou] desire
17 [to go down to the forest]?
18 A message [they carried] among [men]
19 They carried about.
20 They made a ....
21 .............. they brought
22 ..............................
23 ..............................
(About 17 lines missing.)
40 .............................
41 ................... my friend
42 ................ they raised .....
43 answer [they returned.]
44 [To] the woman
45 They proceeded to the overthrowing

Col. II.
(About eleven lines missing.)
57 .......... name(?) .............
58 [The one who is] a rival [to him]
59 subdue and ................
60 Wailing ................
61 The mother [of Gish, who knows everything]
62 Before [Shamash raised her hand]
[88]
63 Who
64 Now(?) [why]
65 hast thou stirred up the heart for my son,
66 [Restlessness imposed upon him (?)]
67 ............................
(About four lines missing.)
72 The eyes [of Enkidu filled with tears].
73 [He clutched] his heart;
74 [Sadly(?)] he sighed.
75 [The eyes of En]kidu filled with tears.
76 [He clutched] his heart;
77 [Sadly(?)] he sighed.
78 The face [of Gish was grieved].
79 [He spoke] to Enkidu:
80 [“My friend, why are] thy eyes
81 [Filled with tears]?
82 Thy [heart clutched]
83 Dost thou sigh [sadly(?)]?”
84 [Enkidu opened his mouth] and
85 spoke to Gish:
86 “Attacks, my friend,
87 have exhausted my strength(?).
88 My arms are lame,
89 my strength has become weak.”
90 Gish opened his mouth and
91 spoke to Enkidu:
(About four lines missing.)

Col. III.
96 ..... [until] Ḫuwawa, [the terrible],
97 ........................
98 ............ [I destroyed].
99 [I will go down to the] cedar forest,
[89]
100 ................... the jungle
101 ............... tambourine (?)
102 ................ I will open it.
103 Enkidu opened his mouth and
104 spoke to Gish:
105 “Know, my friend, in the mountain,
106 when I moved about with the cattle
107 to a distance of one double hour into the heart of the forest,
108 [Alone?]
I penetrated within it,
109 [To] Ḫuwawa, whose roar is a flood,
110 whose mouth is fire,
111 whose breath is death.
112 Why dost thou desire
113 To do this?
114 To advance towards
115 the dwelling(?) of Ḫuwawa?”
116 Gish opened his mouth and
117 [spoke] to Enkidu:
118 “... [the covering(?)] I will destroy.
119 ....[in the forest]
120 ....................
121 ....................
122 To .................
123 The dwelling [of Ḫuwawa]
124 The axe ..........
125 Thou ..........
126 I will [go down to the forest].”
127 Enkidu opened his mouth and
128 spoke to [Gish:]
129 “When [together(?)] we go down
130 To the [cedar] forest,
131 whose guardian, O warrior Gish,
132 a power(?) without [rest(?)],
133 Ḫuwawa, an offspring(?) of ....
[90]
134 Adad ......................
135 He ........................

Col. IV.
136 To keep safe [the cedar forest],
137 [Enlil has decreed for it] seven-fold terror.”
138 Gish [opened] his mouth and
139 spoke to [Enkidu]:
140 “Whoever, my friend, overcomes (?) [terror(?)],
141 it is well (for him) with Shamash for the length of [his days].
142 Mankind will speak of it at the gates.
143 Wherever terror is to be faced,
144 Thou, forsooth, art in fear of death.
145 Thy prowess lacks strength.
146 I will go before thee.
147 Though thy mouth calls to me; “thou art afraid to approach.”
148 If I fall, I will establish my name.
149 Gish, the corpse(?) of Ḫuwawa, the terrible one,
150 has snatched (?) from the time that
151 My offspring was born in ......
152 The lion restrained (?) thee, all of which thou knowest.
153 ........................
154 .............. thee and
155 ................ open (?)
156 ........ like a shepherd(?) .....
157 [When thou callest to me], thou afflictest my heart.
158 I am determined
159 [to enter] the cedar forest.
[91]
160 I will, indeed, establish my name.
161 [The work(?)], my friend, to the artisans I will entrust.
162 [Weapons(?)] let them mould before us.”
163 [The work(?)] to the artisans they entrusted.
164 A dwelling(?) they assigned to the workmen.
165 Hatchets the masters moulded:
166 Axes of 3 talents each they moulded.
167 Lances the masters moulded;
168 Blades(?) of 2 talents each,
169 A spear of 30 mina each attached to them.
170 The hilt of the lances of 30 mina in gold
171 Gish and [Enki]du were equipped with 10 talents each
172 .......... in Erech seven its ....
173 ....... the people heard and ....
174 [proclaimed(?)] in the street of Erech of the plazas.
175 ..... Gish [brought him out(?)]
176 [In the street (?)] of Erech of the plazas
177 [Enkidu(?)] sat before him
178 ..... [thus] he spoke:
179 “........ [of Erech] of the plazas
180 ............ [before him]

Col. V.
181 Gish of whom they speak, let me see!
182 whose name fills the lands.
183 I will lure him to the cedar forest,
184 Like a strong offspring of Erech.
[92]
185 I will let the land hear (that)
186 I am determined to lure (him) in the cedar (forest)5.
187 A name I will establish.”
188 The elders of Erech of the plazas
189 brought word to Gish:
190 “Thou art young, O Gish, and thy heart carries thee away.
191 Thou dost not know what thou proposest to do.
192 We hear that Ḫuwawa is enraged.
193 Who has ever opposed his weapon?
194 To one [double hour] in the heart of the forest,
195 Who has ever penetrated into it?
196 Ḫuwawa, whose roar is a deluge,
197 whose mouth is fire, whose breath is death.
198 Why dost thou desire to do this?
199 To advance towards the dwelling (?) of Ḫuwawa?”
200 Gish heard the report of his counsellors.
201 He saw and cried out to [his] friend:
202 “Now, my friend, thus [I speak].
203 I fear him, but [I will go to the cedar forest(?)];
204 I will go [with thee to the cedar forest].
(About five lines missing.)
210 ..............................
211 May ................... thee
[93]
212 Thy god may (?) ........ thee;
213 On the road may he guide [thee in safety(?)].
214 At the rampart of [Erech of the plazas],
215 Gish kneeled down [before Shamash(?)],
216 A word then he spoke [to him]:
217 “I will go, O Shamash, [thy] hands [I seize hold of].
218 When I shall have saved [my life],
219 Bring me back to the rampart [in Erech].
220 Grant protection [to me ?]!”
221 Gish cried, “[my friend] ......
222 His oracle ..................
223 ........................
224 ........................
225 ........................
226 When (?)
(About two lines missing.)

Col. VI.
229 “[I(?)] Gish, the strong one (?) of the land.
230 ...... A road which I have never [trodden];
231 ........ food ...... do not (?) know.
232 [When] I shall have succeeded,
233 [I will praise] thee in the joy of my heart,
234 [I will extol (?)] the superiority of thy power,
235 [I will seat thee] on thrones.”
236 .................. his vessel(?)
237 The masters [brought the weapons (?)];
238 [bow] and quiver
239 They placed in hand.
240 [He took] the hatchet.
241 ................. his quiver.
[94]
242 ..... [to] the god(?) a second time
243 [With his lance(?)] in his girdle,
244 ......... they took the road.
245 [Again] they approached Gish!
246 “[How long] till thou returnest to Erech?”
247 [Again the elders] approached him.
248 [For] the road they counselled Gish:
249 “Do [not] rely, O Gish, on thy strength!
250 Provide food and save thyself!
251 Let Enkidu go before thee.
252 He is acquainted with the way, he has trodden the road
253 [to] the entrance of the forest.
254 of Ḫuwawa all of them his ......
255 [He who goes] in advance will save the companion.
256 Provide for his [road] and [save thyself]!
257 (May) Shamash [carry out] thy endeavor!
258 May he make thy eyes see the prophecy of thy mouth.
259 May he track out (for thee) the closed path!
260 May he level the road for thy treading!
261 May he level the mountain for thy foot!
262 During thy night6 the word that wilt rejoice
263 may Lugal-banda convey, and stand by thee
[95]
264 in thy endeavor!
265 Like a youth may he establish thy endeavor!
266 In the river of Ḫuwawa as thou plannest,
267 wash thy feet!
268 Round about thee dig a well!
269 May there be pure water constantly for thy libation
270 Goblets of water pour out to Shamash!
271 [May] Lugal-banda take note of it!”
272 [Enkidu] opened his mouth and spoke to Gish:
273 “[Since thou art resolved] to take the road.
274 Thy heart [be not afraid,] trust to me!
275 [Confide] to my hand his dwelling(?)!”
276 [on the road to] Ḫuwawa they proceeded.
277 ....... command their return
(Three lines missing.)

L.E.
281 ............... were filled.
282 .......... they will go with me.
283 ...............................
284 .................. joyfully.
285 [Upon hearing] this word of his,
286 Alone, the road(?) [he levelled].
287 “Go, O Gish [I will go before thee(?)].
288 May thy god(?) go .........
289 May he show [thee the road !] .....
290 Gish and [Enkidu]
291 Knowingly ....................
292 Between [them] ................

[96]

Lines 13–14 (also line 16). See for the restoration, lines 112–13.

Line 62. For the restoration, see Jensen, p. 146 (Tablet III, 2a, 9).

Lines 64–66. Restored on the basis of the Assyrian version, ib. line 10.

Line 72. Cf. Assyrian version, Tablet IV, 4, 10, and restore at the end of this line di-im-tam as in our text, instead of Jensen’s conjecture.

Lines 74, 77 and 83. The restoration zar-biš, suggested by the Assyrian version, Tablet IV, 4, 4.

Lines 76 and 82. Cf. Assyrian version, Tablet VIII, 3, 18.

Line 78. ú-ta-ab-bil from abâlu, “grieve” or “darkened.” Cf. uš-ta-kal (Assyrian version, ib. line 9), where, perhaps, we are to restore it-ta-[bil pa-ni-šú].

Line 87. uš-ta-li-pa from elêpu, “exhaust.” See Muss-Arnolt, Assyrian Dictionary, p. 49a.

Line 89. Cf. Assyrian version, ib. line 11, and restore the end of the line there to i-ni-iš, as in our text.

Line 96. For dapinu as an epithet of Ḫuwawa, see Assyrian version, Tablet III, 2a, 17, and 3a, 12. Dapinu occurs also as a description of an ox (Rm 618, Bezold, Catalogue of the Kouyunjik Tablets, etc., p. 1627).

Line 98. The restoration on the basis of ib. III, 2a, 18.

Lines 96–98 may possibly form a parallel to ib.
lines 17–18, which would then read about as follows: “Until I overcome Ḫuwawa, the terrible, and all the evil in the land I shall have destroyed.” At the same time, it is possible that we are to restore [lu-ul]-li-ik at the end of line 98. Line 101. lilissu occurs in the Assyrian version, Tablet IV, 6, 36. Line 100. For ḫalbu, “jungle,” see Assyrian version, Tablet V, 3, 39 (p. 160). Lines 109–111. These lines enable us properly to restore Assyrian version, Tablet IV, 5, 3 = Haupt’s edition, p. 83 (col. 5, 3). No doubt the text read as ours mu-tum (or mu-u-tum) na-pis-su. Line 115. šupatu, which occurs again in line 199 and also line 275.šú-pa-as-su (= šupat-su) must have some such meaning as [97]“dwelling,” demanded by the context. [Dhorme refers me to OLZ 1916, p. 145]. Line 129. Restored on the basis of the Assyrian version, Tablet IV, 6, 38. Line 131. The restoration muḳtablu, tentatively suggested on the basis of CT XVIII, 30, 7b, where muḳtablu, “warrior,” appears as one of the designations of Gilgamesh, followed by a-lik pa-na, “the one who goes in advance,” or “leader”—the phrase so constantly used in the Ḫuwawa episode. Line 132. Cf. Assyrian version, Tablet I, 5, 18–19. Lines 136–137. These two lines restored on the basis of Jensen IV, 5, 2 and 5. The variant in the Assyrian version, šá niše (written Ukumeš in one case and Lumeš in the other), for the numeral 7 in our text to designate a terror of the largest and most widespread character, is interesting. The number 7 is similarly used as a designation of Gilgamesh, who is called Esigga imin, “seven-fold strong,” i.e., supremely strong (CT XVIII, 30, 6–8). Similarly, Enkidu, ib. line 10, is designated a-rá imina, “seven-fold.” Line 149. A difficult line because of the uncertainty of the reading at the beginning of the following line. The most obvious meaning of mi-it-tu is “corpse,” though in the Assyrian version šalamtu is used (Assyrian version, Tablet V, 2, 42). On the other hand, it is possible—as Dr. 
Lutz suggested to me—that mittu, despite the manner of writing, is identical with miṭṭú, the name of a divine weapon, well-known from the Assyrian creation myth (Tablet IV, 130), and other passages. The combination miṭ-ṭu šá-ḳu-ú-, “lofty weapon,” in the Bilingual text IV, R², 18 No. 3, 31–32, would favor the meaning “weapon” in our passage, since [šá]-ḳu-tu is a possible restoration at the beginning of line 150. However, the writing mi-it-ti points too distinctly to a derivative of the stem mâtu, and until a satisfactory explanation of lines 150–152 is forthcoming, we must stick to the meaning “corpse” and read the verb il-ḳu-ut. Line 152. The context suggests “lion” for the puzzling la-bu. Line 156. Another puzzling line. Dr. Clay’s copy is an accurate reproduction of what is distinguishable. At the close of the line there appears to be a sign written over an erasure. Line 158. [ga-ti lu-]uš-kun as in line 186, literally, “I will place my hand,” i.e., I purpose, I am determined. [98] Line 160. The restoration on the basis of the parallel line 187. Note the interesting phrase, “writing a name” in the sense of acquiring “fame.” Line 161. The kiškattê, “artisans,” are introduced also in the Assyrian version, Tablet VI, 187, to look at the enormous size and weight of the horns of the slain divine bull. See for other passages Muss-Arnolt Assyrian Dictionary, p. 450b. At the beginning of this line, we must seek for the same word as in line 163. Line 162. While the restoration belê, “weapon,” is purely conjectural, the context clearly demands some such word. I choose belê in preference to kakkê, in view of the Assyrian version, Tablet VI, 1. Line 163. Putuku (or putukku) from patâku would be an appropriate word for the fabrication of weapons. Line 165. The rabûtim here, as in line 167, I take as the “master mechanics” as contrasted with the ummianu, “common workmen,” or journeymen. 
A parallel to this forging of the weapons for the two heroes is to be found in the Sumerian fragment of the Gilgamesh Epic published by Langdon, Historical and Religious Texts from the Temple Library of Nippur (Munich, 1914), No. 55, 1–15. Lines 168–170 describe the forging of the various parts of the lances for the two heroes. The ṣipru is the spear point Muss-Arnolt, Assyrian Dictionary, p. 886b; the išid paṭri is clearly the “hilt,” and the mešelitum I therefore take as the “blade” proper. The word occurs here for the first time, so far as I can see. For 30 minas, see Assyrian version, Tablet VI, 189, as the weight of the two horns of the divine bull. Each axe weighing 3 biltu, and the lance with point and hilt 3 biltu we would have to assume 4 biltu for each pašu, so as to get a total of 10 biltu as the weight of the weapons for each hero. The lance is depicted on seal cylinders representing Gilgamesh and Enkidu, for example, Ward, Seal Cylinders, No. 199, and also in Nos. 184 and 191 in the field, with the broad hilt; and in an enlarged form in No. 648. Note the clear indication of the hilt. The two figures are Gilgamesh and Enkidu—not two Gilgameshes, as Ward assumed. See above, page 34. A different weapon is the club or mace, as seen in Ward, Nos. 170 and 173. This appears also to be the weapon which Gilgamesh holds in his hand on the colossal figure from the palace of Sargon (Jastrow, Civilization of [99]Babylonia and Assyria, Pl. LVII), though it has been given a somewhat grotesque character by a perhaps intentional approach to the scimitar, associated with Marduk (see Ward, Seal Cylinders, Chap. XXVII). The exact determination of the various weapons depicted on seal-cylinders merits a special study. Line 181. Begins a speech of Ḫuwawa, extending to line 187, reported to Gish by the elders (line 188–189), who add a further warning to the youthful and impetuous hero. Line 183. lu-uk-šú-su (also l. 
186), from akâšu, “drive on” or “lure on,” occurs on the Pennsylvania tablet, line 135, uk-ki-ši, “lure on” or “entrap,” which Langdon erroneously renders “take away” and thereby misses the point completely. See the comment to the line of the Pennsylvania tablet in question. Line 192. On the phrase šanû bunu, “change of countenance,” in the sense of “enraged,” see the note to the Pennsylvania tablet, l.31. Line 194. nu-ma-at occurs in a tablet published by Meissner, Altbabyl. Privatrecht, No. 100, with bît abi, which shows that the total confine of a property is meant; here, therefore, the “interior” of the forest or heart. It is hardly a “by-form” of nuptum as Muss-Arnolt, Assyrian Dictionary, p. 690b, and others have supposed, though nu-um-tum in one passage quoted by Muss-Arnolt, ib. p. 705a, may have arisen from an aspirate pronunciation of the p in nubtum. Line 215. The kneeling attitude of prayer is an interesting touch. It symbolizes submission, as is shown by the description of Gilgamesh’s defeat in the encounter with Enkidu (Pennsylvania tablet, l. 227), where Gilgamesh is represented as forced to “kneel” to the ground. Again in the Assyrian version, Tablet V, 4, 6, Gilgamesh kneels down (though the reading ka-mis is not certain) and has a vision. Line 229. It is much to be regretted that this line is so badly preserved, for it would have enabled us definitely to restore the opening line of the Assyrian version of the Gilgamesh Epic. The fragment published by Jeremias in his appendix to his Izdubar-Nimrod, Plate IV, gives us the end of the colophon line to the Epic, reading ……… di ma-a-ti (cf. ib., Pl. I, 1. … a-ti). Our text evidently reproduces the same phrase and enables us to supply ka, as well as [100]the name of the hero Gišh of which there are distinct traces. The missing word, therefore, describes the hero as the ruler, or controller of the land. But what are the two signs before ka? 
A participial form from pakâdu, which one naturally thinks of, is impossible because of the ka, and for the same reason one cannot supply the word for shepherd (nakidu). One might think of ka-ak-ka-du, except that kakkadu is not used for “head” in the sense of “chief” of the land. I venture to restore [i-ik-]ka-di, “strong one.” Our text at all events disposes of Haupt’s conjecture iš-di ma-a-ti (JAOS 22, p. 11), “Bottom of the earth,” as also of Ungnad’s proposed [a-di pa]-a-ti, “to the ends” (Ungnad-Gressmann, Gilgamesch-Epos, p. 6, note), or a reading di-ma-a-ti, “pillars.” The first line of the Assyrian version would now read šá nak-ba i-mu-ru [dGis-gi(n)-maš i-ik-ka]-di ma-a-ti, i.e., “The one who saw everything, Gilgamesh the strong one (?) of the land.” We may at all events be quite certain that the name of the hero occurred in the first line and that he was described by some epithet indicating his superior position. Lines 229–235 are again an address of Gilgamesh to the sun-god, after having received a favorable “oracle” from the god (line 222). The hero promises to honor and to celebrate the god, by erecting thrones for him. Lines 237–244 describe the arming of the hero by the “master” craftsman. In addition to the pašu and paṭru, the bow (?) and quiver are given to him. Line 249 is paralleled in the new fragment of the Assyrian version published by King in PSBA 1914, page 66 (col. 1, 2), except that this fragment adds gi-mir to e-mu-ḳi-ka. Lines 251–252 correspond to column 1, 6–8, of King’s fragment, with interesting variations “battle” and “fight” instead of “way” and “road,” which show that in the interval between the old Babylonian and the Assyrian version, the real reason why Enkidu should lead the way, namely, because he knows the country in which Ḫuwawa dwells (lines 252–253), was supplemented by describing Enkidu also as being more experienced in battle than Gilgamesh. Line 254. 
I am unable to furnish a satisfactory rendering for this line, owing to the uncertainty of the word at the end. Can it [101]be “his household,” from the stem which in Hebrew gives us מִשְׁפָּחָה “family?” Line 255. Is paralleled by col. 1, 4, of King’s new fragment. The episode of Gišh and Enkidu proceeding to Ninsun, the mother of Gish, to obtain her counsel, which follows in King’s fragment, appears to have been omitted in the old Babylonian version. Such an elaboration of the tale is exactly what we should expect as it passed down the ages. Line 257. Our text shows that irnittu (lines 257, 264, 265) means primarily “endeavor,” and then success in one’s endeavor, or “triumph.” Lines 266–270. Do not appear to refer to rites performed after a victory, as might at a first glance appear, but merely voice the hope that Gišh will completely take possession of Ḫuwawa’s territory, so as to wash up after the fight in Ḫuwawa’s own stream; and the hope is also expressed that he may find pure water in Ḫuwawa’s land in abundance, to offer a libation to Šhamašh. Line 275. On šú-pa-as-su = šupat-su, see above, to l. 115. [Note on Sabitum (above, p. 11) In a communication before the Oriental Club of Philadelphia (Feb. 10, 1920), Prof. Haupt made the suggestion that sa-bi-tum (or tu), hitherto regarded as a proper name, is an epithet describing the woman who dwells at the seashore which Gilgamesh in the course of his wanderings reaches, as an “innkeeper”. It is noticeable that the term always appears without the determinative placed before proper names; and since in the old Babylonian version (so far as preserved) and in the Assyrian version, the determinative is invariably used, its consistent absence in the case of sabitum (Assyrian Version, Tablet X, 1, 1, 10, 15, 20; 2, 15–16 [sa-bit]; Meissner fragment col. 2, 11–12) speaks in favor of Professor Haupt’s suggestion. 
The meaning “innkeeper”, while not as yet found in Babylonian-Assyrian literature is most plausible, since we have sabū as a general name for ’drink’, though originally designating perhaps more specifically sesame wine (Muss-Arnolt, Assyrian Dictionary, p. 745b) or distilled brandy, according to Prof. Haupt. Similarly, in the Aramaic dialects, sebha is used for “to drink” and in the Pael to “furnish drink”. Muss-Arnolt in [102]his Assyrian Dictionary, 746b, has also recognized that sabitum was originally an epithet and compares the Aramaic sebhoyâthâ(p1) “barmaids”. In view of the bad reputation of inns in ancient Babylonia as brothels, it would be natural for an epithet like sabitum to become the equivalent to “public” women, just as the inn was a “public” house. Sabitum would, therefore, have the same force as šamḫatu (the “harlot”), used in the Gilgamesh Epic by the side of ḫarimtu “woman” (see the note to line 46 of Pennsylvania Tablet). The Sumerian term for the female innkeeper is Sal Geštinna “the woman of the wine,” known to us from the Hammurabi Code §§108–111. The bad reputation of inns is confirmed by these statutes, for the house of the Sal Geštinna is a gathering place for outlaws. The punishment of a female devotee who enters the “house of a wine woman” (bît Sal Geštinna §110) is death. It was not “prohibition” that prompted so severe a punishment, but the recognition of the purpose for which a devotee would enter such a house of ill repute. The speech of the sabitum or innkeeper to Gilgamesh (above, p. 12) was, therefore, an invitation to stay with her, instead of seeking for life elsewhere. Viewed as coming from a “public woman” the address becomes significant. The invitation would be parallel to the temptation offered by the ḫarimtu in the first tablet of the Enkidu, and to which Enkidu succumbs. The incident in the tablet would, therefore, form a parallel in the adventures of Gilgamesh to the one that originally belonged to the Enkidu cycle. 
Finally, it is quite possible that sabitum is actually the Akkadian equivalent of the Sumerian Sal Geštinna, though naturally until this equation is confirmed by a syllabary or by other direct evidence, it remains a conjecture. See now also Albright’s remarks on Sabitum in the A. J. S. L. 36, pp. 269 seq.] [103] 1 Scribal error for an. 2 Text apparently di. 3 Hardly ul. 4 Omitted by scribe. 5 Kišti omitted by scribe. 6 I.e., at night to thee, may Lugal-banda, etc. Corrections to the Text of Langdon’s Edition of the Pennsylvania Tablet.1 Column 1. 5. Read it-lu-tim (“heroes”) instead of id-da-tim (“omens”). 6. Read ka-ka-bu instead of ka-ka-’a. This disposes of Langdon’s note 2 on p. 211. 9 Read ú-ni-iš-šú-ma, “I became weak” (from enêšu, “weak”) instead of ilam iš-šú-ma, “He bore a net”(!). This disposes of Langdon’s note 5 on page 211. 10. Read Urukki instead of ad-ki. Langdon’s note 7 is wrong. 12. Langdon’s note 8 is wrong. ú-um-mid-ma pu-ti does not mean “he attained my front.” 14. Read ab-ba-la-áš-šú instead of at-ba-la-áš-šú. 15. Read mu-di-a-at instead of mu-u-da-a-at. 20. Read ta-ḫa-du instead of an impossible [sa]-ah-ḫa-ta—two mistakes in one word. Supply kima Sal before taḫadu. 22. Read áš-šú instead of šú; and at the end of the line read [tu-ut]-tu-ú-ma instead of šú-ú-zu. 23. Read ta-tar-ra-[as-su]. 24. Read [uš]-ti-nim-ma instead of [iš]-ti-lam-ma. 28. Read at the beginning šá instead of ina. 29. Langdon’s text and transliteration of the first word do not tally. Read ḫa-aṣ-ṣi-nu, just as in line 31. 32. Read aḫ-ta-du (“I rejoiced”) instead of aḫ-ta-ta. Column 2. 4. Read at the end of the line di-da-šá(?) ip-tí-[e] instead of Di-?-al-lu-un (!). 5. Supply dEn-ki-dū at the beginning. Traces point to this reading. 19. Read [gi]-it-ma-[lu] after dGiš, as suggested by the Assyrian version, Tablet I, 4, 38, where emûḳu (“strength”) replaces nepištu of our text. 20. Read at-[ta kima Sal ta-ḫa]-bu-[ub]-šú. 21. Read ta-[ra-am-šú ki-ma]. [104] 23. 
Read as one word ma-a-ag-ri-i-im (“accursed”), spelled in characteristic Hammurabi fashion, instead of dividing into two words ma-a-ak and ri-i-im, as Langdon does, who suggests as a translation “unto the place yonder(?) of the shepherd”(!). 24. Read im-ta-ḫar instead of im-ta-gar. 32. Supply ili(?) after ki-ma. 33. Read šá-ri-i-im as one word. 35. Read i-na [áš]-ri-šú [im]-ḫu-ru. 36. Traces at beginning point to either ù or ki (= itti). Restoration of lines 36–39 (perhaps to be distributed into five lines) on the basis of the Assyrian version, Tablet I, 4, 2–5. Column 3. 14. Read Kàš (= šikaram, “wine”) ši-ti, “drink,” as in line 17, instead of bi-iš-ti, which leads Langdon to render this perfectly simple line “of the conditions and the fate of the land”(!). 21. Read it-tam-ru instead of it-ta-bir-ru. 22. Supply [lùŠú]-I. 29. Read ú-gi-ir-ri from garû (“attack), instead of separating into ú and gi-ir-ri, as Langdon does, who translates “and the lion.” The sign used can never stand for the copula! Nor is girru, “lion!” 30. Read Síbmeš, “shepherds,” instead of šab-[ši]-eš! 31. šib-ba-ri is not “mountain goat,” nor can ut-tap-pi-iš mean “capture.” The first word means “dagger,” and the second “he drew out.” 33. Read it-ti-[lu] na-ki-[di-e], instead of itti immer nakie which yields no sense. Langdon’s rendering, even on the basis of his reading of the line, is a grammatical monstrosity. 35. Read giš instead of wa. 37. Read perhaps a-na [na-ki-di-e i]- za-ak-ki-ir. Column 4. 4. The first sign is clearly iz, not ta, as Langdon has it in note 1 on page 216. 9. The fourth sign is su, not šú. 10. Separate e-eš (“why”) from the following. Read ta-ḫi-[il], followed, perhaps, by la. The last sign is not certain; it may be ma. [105] 11. Read lim-nu instead of mi-nu. In the same line read a-la-ku ma-na-aḫ-[ti]-ka instead of a-la-ku-zu(!) na-aḫ … ma, which, naturally, Langdon cannot translate. 16. Read e-lu-tim instead of pa-a-ta-tim. 
The first sign of the line, tu, is not certain, because apparently written over an erasure. The second sign may be a. Some one has scratched the tablet at this point. 18. Read uk-la-at âli (?) instead of ug-ad-ad-lil, which gives no possible sense! Column 5. 2. Read [wa]-ar-ki-šú. 8. Read i-ta-wa-a instead of i-ta-me-a. The word pi-it-tam belongs to line 9! The sign pi is unmistakable. This disposes of note 1 on p. 218. 9. Read Mi = ṣalmu, “image.” This disposes of Langdon’s note 2 on page 218. Of six notes on this page, four are wrong. 11. The first sign appears to be si and the second ma. At the end we are perhaps to supply [šá-ki-i pu]-uk-ku-ul, on the basis of the Assyrian version, Tablet IV, 2, 45, šá-ki-i pu-[uk-ku-ul]. 12. Traces at end of line suggest i-pa(?)-ka-du. 13. Read i-[na mâti da-an e-mu]-ki i-wa. 18. Read ur-šá-nu instead of ip-šá-nu. 19. Read i-šá-ru instead of i-tu-ru. 24. The reading it-ti after dGiš is suggested by the traces. 25. Read in-ni-[ib-bi-it] at the end of the line. 28. Read ip-ta-ra-[aṣ a-la]-ak-tam at the end of the line, as in the Assyrian version, Tablet IV, 2, 37. 30. The conjectural restoration is based on the Assyrian version, Tablet IV, 2, 36. Column 6. 3. Read i-na ṣi-ri-[šú]. 5. Supply [il-li-ik]. 21. Langdon’s text has a superfluous ga. 22. Read uz-za-šú, “his anger,” instead of uṣ-ṣa-šú, “his javelin” (!). 23. Read i-ni-iḫ i-ra-as-su, i.e., “his breast was quieted,” in the sense of “his anger was appeased.” 31. Read ri-eš-ka instead of ri-eš-su. [106] In general, it should be noted that the indications of the number of lines missing at the bottom of columns 1–3 and at the top of columns 4–6 as given by Langdon are misleading. Nor should he have drawn any lines at the bottom of columns 1–3 as though the tablet were complete. Besides in very many cases the space indications of what is missing within a line are inaccurate. Dr. 
Langdon also omitted to copy the statement on the edge: 4 šú-ši, i.e., “240 lines;” and in the colophon he mistranslates šú-tu-ur, “written,” as though from šaṭâru, “write,” whereas the form is the permansive III, 1, of atâru, “to be in excess of.” The sign tu never has the value ṭu! In all, Langdon has misread the text or mistransliterated it in over forty places, and of the 204 preserved lines he has mistranslated about one-half. 1 The enumeration here is according to Langdon’s edition. Plates I–VII. The Yale Tablet.

      Compared with other versions of the Epic of Gilgamesh, this version looks more closely at Gilgamesh's search for immortality after Enkidu's death. The "us" in this instance is Gilgamesh and his search for a cure, while the "them" is the enemies trying to stop him, including the forces he comes across. The text creates this distinction by presenting Gilgamesh as the main character and as the one in need of a cure, because he struggles to come to terms with the fact that he will die one day. Moreover, Enkidu turned Gilgamesh into a noble figure who used his power for good, which makes him more likeable and is why the reader roots for him to find a cure. Gilgamesh as a figure shows that in his time period men were the ones seen as strong leaders, since the female characters in all versions of the text do not have dynamic roles that showcase their personalities or even their endearing qualities. There are more political and nationalistic themes than in the Sumerian versions, which illustrates how language can play a role in how a culture is perceived. By drawing on Gilgamesh's strong characteristics, the text ultimately portrays the civilization of Uruk and creates a sense of identity as a result. CC BY Ajey Sasimugunthan (contact)

    1. TABLE OF CONTENTS

      There are some things in the Tyler system that do not seem to have equivalents here.
      * Error Codes
      * Code Version
      * Country Codes
      * State Codes
      * Filing Security
      * Data Field Config
      * Status Codes

    1. Reviewer #2 (Public Review):

      Summary:

      The authors build a colossal anatomical model of juvenile rat non-barrel primary somatosensory cortex, including inputs from the thalamus. This enhances past models by incorporating information on the shape of the cortex and estimated densities of various types of excitatory and inhibitory neurons across layers. This is intended to enable an analysis of the micro- and mesoscopic organisation of cortical connectivity and to be a base anatomical model for large-scale simulations of physiology.

      Strengths:

      • The authors incorporate many diverse data sources on morphology and connectivity.

      • This paper takes on the challenging task of linking micro- and mesoscale connectivity.

      • By building in the shape of the cortex, the authors were able to link cortical geometry to connectivity. In particular, they make an unexpected prediction that cortical conicality affects the modularity of local connectivity, which should be testable.

      • The authors' analysis of the model led to the interesting prediction that layer 5 neurons connect local modules, which may be testable in the future and provides a basis for linking detailed anatomy to functional computations.

      • The visualisation of the anatomy in various forms is excellent.

      • A subnetwork of the model is openly shared (but see question below).

      Weaknesses:

      • Why was non-barrel S1 of the juvenile rat cortex selected as the target for this huge modelling effort? This is not explained.

      • There is no effort to determine how specific or generalisable the findings here are to other parts of the cortex.

      • Although there is a link to physiological modelling in another paper, there is no clear pathway from this type of model to an understanding of how the specific function of the modelled areas may emerge here (and not in other cortical areas).

      • In a few places the manuscript could be improved by being more specific in the language, for example:
        - "our anatomy-based approach has been shown to be powerful", I would prefer instead to read about specific contributions of past papers to the field, and how this builds on them.
        - similarly: "ensuring that the total number of synapses in a region-to-region pathway matches biology." Biology here is a loose term and implies too much confidence in the matching to some ground truth. Please instead describe the source of the data, including the type of experiment.

      • Some of the decisions seem a little ad-hoc, and the means to assess those decisions are not always available to the reader e.g.
        - pg. 10. "Based on these results, we decided that the local connectome sufficed to model connectivity within a region.". What is the basis for this decision? Can it be formalised?
        - "In the remaining layers the results of the objective classification were used to validate the class assignments of individual pyramidal cells. We found the objective classification to match the expert classification closely (i.e., for 80-90% of the morphologies). Consequently, we considered the expert classification to be sufficiently accurate to build the model." The description of the validation is a little informal. How many experts were there? What are their initials? Was inter-rater or intra-rater reliability assessed? What are these numbers? The match with Kanari's classification accuracy should be reported exactly. There are clearly experts among the author list, but we are all fallible without good controls in place, and they should be more explicit about those controls here, in my opinion.
        - "Morphology selection was then performed as previously (Markram et al., 2015), that is, a morphology was selected randomly from the top 10% scorers for a given position." A lot of the decisions seem a little ad-hoc, without justification other than this group had previously done the same thing. For example, why 10% here? Shouldn't this be based on selecting from all of the reasonable morphologies?

      • I would like to know if one of the key results relating to modularity and cortical geometry can be further explored. In particular, there seem to be sharp changes in the data at the end of the modelled cortical regions, which need to be explored or explained further.

      • The shape of the juvenile cortex - a key novelty of this work - was based on merely a scalar reduction of the adult cortex. This is very surprising, and surely an oversimplification. Huge efforts have gone into modelling the complex nonlinear development of the cortex, by teams including the developing Human Connectome Project. For such a fundamental aspect of this work, why isn't it possible to reconstruct the shape of this relatively small part of the juvenile rat cortex?

      • The same relative laminar depths are used for all subregions. This will have a large impact on the model. However, relative laminar depths can change drastically across the cortex (see e.g. many papers by Palomero-Gallagher, Zilles, and colleagues). The authors should incorporate the real laminar depths, or, failing that, show evidence to show that the laminar depth differences across the subregions included in the model are negligible.

      • The authors perform an affine mapping between mouse and rat cortex. This is again surprising. In human imaging, affine mappings are insufficient to map between two individual brains of the same species and nonlinear transformations are instead used. That an affine transformation should be considered sufficient to map between two different species is then very surprising. For some models, this may be fine, but there is a supposed emphasis here on biological precision in terms of anatomical location.

      • One of the most interesting conclusions, that the connectivity pattern observed is in part due to cooperative synapse formation, is based on analyses that are unfortunately not shown.

      • Open code:
        - Why is only a subvolume available to the community?
        - Live nature of the model. This is such a colossal model, and effort, that I worry that it may be quite difficult to update in light of new data. For example, how much person and computer time would it take to update the model to account for different layer sizes across subregions? Or to more precisely account for the shape of the juvenile rat cortex?
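
      The morphology-selection rule quoted in the weaknesses above (a random draw from the top 10% of scorers for a position) can be made concrete with a minimal sketch. This is not the authors' code: the function name, the score dictionary, and the `fraction` parameter are hypothetical, and the sketch only illustrates that the 10% cut-off enters as a free parameter that could be varied or justified.

```python
import math
import random


def select_from_top_fraction(scores, fraction=0.10, rng=None):
    """Pick one candidate at random from the best-scoring `fraction`.

    `scores` maps a candidate name to a placement score (higher is better).
    All names here are hypothetical; this is not the published pipeline.
    """
    if not scores:
        raise ValueError("no candidates to choose from")
    rng = rng or random.Random()
    # Rank candidates from best to worst score.
    ranked = sorted(scores, key=scores.get, reverse=True)
    # Always keep at least one candidate, even for very small pools.
    n_top = max(1, math.ceil(fraction * len(ranked)))
    return rng.choice(ranked[:n_top])


# Made-up scores for 20 morphologies at one soma position.
example_scores = {f"morph_{i:02d}": i / 20 for i in range(20)}
chosen = select_from_top_fraction(example_scores, rng=random.Random(0))
```

      Exposing `fraction` as an argument would make it straightforward to check how sensitive downstream results are to the 10% choice.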

    2. Reviewer #3 (Public Review):

      This manuscript reports a detailed model of the rat non-barrel somatosensory cortex, consisting of 4.2 million morphologically and biophysically detailed neuron models, arranged in space and connected according to highly sophisticated rules informed by diverse experimental data. Due to its breadth and sophistication, the model will undoubtedly be of interest to the community, and the reporting of anatomical details of modeling in this paper is important for understanding all the assumptions and procedures involved in constructing the model. While a useful contribution to this field, the model and the manuscript could be improved by employing data more directly and comparing simple features of the model's connectivity - in particular, connection probabilities - with relevant experimental data.

      The manuscript is well-written overall but contains a substantial number of confusing or unclear statements, and some important information is not provided.

      Below, major concerns are listed, followed by more specific but still important issues.

      MAJOR ISSUES

      (1) Cortical connectivity.

      Section 2.3, "Local, mid-range and extrinsic connectivity modeled separately", and Figure 4: I am confused about what is done here and why. The authors have target data for connectivity (Figure 4B1). But then they use an apposition-based algorithm that results in connectivity that is quite different from the data (Figure 4B2, C). They then use a correction based on the data (Figure 4E) to arrive at a more realistic connectivity. Why not set the connectivity based on the data right away then? That would seem like a more straightforward approach.

      The same comment applies to Section 2.4., "Specificity of axonal targeting": the distributions of synapses on different types of target cell compartments were not well captured by the original model based on axon-dendrite overlap and pruning, so the authors introduced further pruning to match data specificity. While details of this process and what worked and what didn't may be interesting to some, overall it is not surprising, as it has been well known that cell types exhibit connectivity that is much more specific than "Peters rule" or its simple variations. The question is, since one has the data, why not use the data in the first place to set up the connectivity, instead of using the convoluted process of employing axon-dendrite overlap followed by multiple corrections?

      Most importantly, what is missing from the whole paper is the characterization of connection probabilities, at least for the local circuit within one area. Such connection probabilities can be obtained from the data that the authors already use here, such as the MICRONS dataset. Another good source of such data is Campagnola et al., Science, 2022. Both datasets are for mouse V1, but they provide a comprehensive characterization across all cortical layers, thus offering a good benchmark for comparison of the model with the data. It would be important for the authors to show how the connection probabilities realized in their model for different cell types compare with these data.
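
      As a minimal sketch of the comparison requested here, the following computes connection probabilities per (presynaptic type, postsynaptic type) pair from a list of directed connections. The function name, cell-type labels, and toy edge list are assumptions for illustration, not part of the model or of the cited datasets. A real comparison would also need to control for intersomatic distance, which this sketch ignores.

```python
from collections import Counter
from itertools import product


def connection_probabilities(cell_types, edges):
    """Return {(pre_type, post_type): connection probability}.

    cell_types: dict mapping neuron id -> cell-type label
    edges: iterable of directed (pre_id, post_id) connections
    """
    type_counts = Counter(cell_types.values())
    connected = Counter()
    for pre, post in set(edges):  # count each ordered pair once
        if pre != post:  # ignore autapses
            connected[(cell_types[pre], cell_types[post])] += 1

    probs = {}
    for pre_t, post_t in product(type_counts, repeat=2):
        n_pre, n_post = type_counts[pre_t], type_counts[post_t]
        # Number of possible ordered pairs, excluding self-pairs.
        possible = n_pre * (n_post - 1) if pre_t == post_t else n_pre * n_post
        found = connected[(pre_t, post_t)]
        probs[(pre_t, post_t)] = found / possible if possible else 0.0
    return probs


# Toy circuit with made-up labels and edges.
toy_types = {0: "L5_PC", 1: "L5_PC", 2: "PV", 3: "PV"}
toy_edges = [(0, 1), (0, 2), (1, 2), (2, 0)]
toy_probs = connection_probabilities(toy_types, toy_edges)
```

      The same table computed from the model and from an experimental dataset could then be compared entry by entry.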

      (2) Section 2.5, "Structure of thalamic inputs" and Figure 6.

      The text in section 2.5 should provide more details on what was done - namely, that the thalamic axons were generated based on the axon density profiles and then synapses were established based on their overlap with cortical dendrites. Figure S10, where the target axon densities from data and the model axon densities are compared, is not even mentioned here. Now, Figure S10 only shows that the axon densities were generated in a way that matches the data reasonably well. However, how can we know that it results in connectivity that agrees with data? Are there data sources that can be used for that purpose? For example, the authors show that in their model "the peaks of the mean number of thalamic inputs per neuron occur at lower depths than the peaks of the synaptic density". Is this prediction of the model consistent with any available data?

      Most importantly, the authors should show how the different cell types in their model are targeted by the thalamic inputs in each layer. Experimental studies have been done suggesting specificity in targeting of interneuron types by thalamic axons, such as PV cells being targeted strongly whereas SST and VIP cells being targeted less.

      (3) "We have therefore made not only the model but also most of our tool chain openly available to the public (Figure 1; step 7)." In fact it is not the whole model that is made publicly available, but only about 5% of it (211,000 out of 4,200,000 neurons). Also, why is "most" of the tool chain made openly available, and not the whole tool chain?

      OTHER ISSUES

      "At each soma location, a reconstruction of the corresponding m-type was chosen based on the size and shape of its dendritic and axonal trees (Figure S6). Additionally, it was rotated to according to the orientation towards the cortical surface at that point."

      After this procedure, were cells additionally rotated around the white matter-pia axis? If yes, then how much and randomly or not? If not, then why not? Such rotations would seem important because otherwise additional order potentially not present in the real cortex is introduced in the model affecting connectivity and possibly also in vivo physiology (such as the dynamics of the extracellular electric field).

      The term "new in vivo reconstructions" for the 58 neurons used in this paper in addition to "in vitro reconstructions" is a misnomer. It is not straightforward to see where the procedure is described, but then one finds that the part of Methods that describes experimental manipulations is mostly about that (so, a clearer pointer to that part of Methods could be useful). However, the description in Methods makes it clear that it is only labeling that is done in vivo; the microscopy and reconstruction are done subsequently in vitro. I would recommend changing the terminology here, as it is confusing. Also, can the authors show reconstructions of these neurons in the supplementary figures? Is the reconstruction shown in Figure 4A representative?

      In the Discussion, "This was taken into account during the modeling of the anatomical composition, e.g. by using three-dimensional, layer-specific neuron density profiles that match biological measurements, and by ensuring the biologically correct orientation of model neurons with respect to the orientation towards the cortical surface. As local connectivity was derived from axo-dendritic appositions in the anatomical model, it was strongly affected by these aspects. However, this approach alone was insufficient at the large spatial scale of the model, as it was limited to connections at distances below 1000μm."

      As mentioned above, it is not clear that this approach was sufficient for local connectivity either. It would be great if the authors showed a systematic comparison of local connection probabilities between different cell types in their model with experimental data and commented here in the Discussion about how well the model agrees with the data.

      In the Discussion: "The combined connectome therefore captures important correlations at that level, such as slender-tufted layer 5 PCs sending strong non-local cortico-cortical connections, but thick-tufted layer 5 PCs not." (Also the corresponding findings in Results.)

      If I understand this statement correctly, it may not agree with biological data. See analysis from MICRONS dataset in Bodor et al., https://www.biorxiv.org/content/10.1101/2023.10.18.562531v1.

      Table 2 is confusing. What do pluses and minuses mean? What does it mean that some entries have two pluses? This table is not mentioned anywhere else in the text. If pluses mean some meaningful predictions of the model, then their distribution in the table seems quite liberal and arbitrary. It is not clear to me that the model makes that many predictions, especially for type-specificity and plasticity. Also, why is the hippocampus mentioned in this table? I don't see anything about the hippocampus anywhere else in the paper.

      In the Discussion, "Thus, we made the tools to improve our model also openly available (see Data and Code availability section)." As mentioned before, the authors themselves write that they made "most of our tool chain openly available to the public", but not all of it.

      Table S2 has multiple question marks. It is not clear whether the "predictions" listed in that table are truly well-thought-out and/or whether experimental confirmations are real.

      Introduction: It would be quite appropriate to cite here Einevoll et al., Neuron, 2019 ("The Scientific Case for Brain Simulations").

    1. an impressive amount of ink considering that many other major biblical stories are told in less than one Torah portion.

      It's quite amusing to think that even when it comes to his story, Joseph's life occupies more space than other characters in the Bible.

    1. If a problem is shared by only a handful of people, it's probably not worth programming a solution. Great Programmers Solve Important Problems The best programmers aren't simply the ones that write the best solutions: they're the ones that solve the best problems. The best programmers write kernels that allow billions of people to run other software, write highly reliable code that puts astronauts into space, write crawlers and indexers that organize the world's information. They make the right choices not only about how to solve a problem, but what problem to solve.

      It is precisely this grandiose idea of what counts as a valuable programmer and a valuable problem that leaves unattended the solutions that do not sound ambitious.

      We would rather terraform Mars than the plundered Amazon.

      In contrast, situated software has allowed us to solve problems for small communities at HackBo, our local hackerspace, or by helping with language preservation in the Amazon.

      In the examples, all the problems to be solved seem grandiose: billions of people, the world's information, astronauts. It seems that the neighbor, the family, and the local community fall outside those imaginaries. At the very least, the idea that important problems are also everyday and small is worth communicating more assertively and repeatedly.

      One interesting thing is that Breck recounts in another post how software should save people time, and there he reveals a sensitivity to small problems, the ones that mattered to his family and to him as a child/teenager: having 20 more minutes to play, or saving his family the 20 minutes it took to connect to the internet.

      What I think we need is a way to express software for care: of people, of the planet, of time. Something like convivial software, along the lines of Ivan Illich's convivial technologies.

    1. n, Canada, 2013, p. 3), or what Aquino-Sterling (2016) defined as “pedagogical Spanish”—the language and literacy competencies bilingual teachers require for the effective work of teaching in Spanish across the curriculum in K–12 bilingual schools, and for competently meeting the professional language demands of working with students, colleagues, administrators, parents, and the larger bilingual school community (p. 51). In this sense, bilingual teacher education programs across the nation are called to facilitate opportunities for future teachers to develop what the U.S. Department of State defines as “Full Professional Proficiency”—“[the ability] to use the language fluently and accurately on all levels pertinent to professional needs” (U.S. Department of State, n.d., Proficiency Code #4), Spanish for teaching content-area knowledge in K-12 bilingual schools in this case

      What does full professional proficiency look like for someone who has been uprooted from the Spanish language and culture and has become part of the expatriate Spanish-speaking community?


    1. Author response:

      The following is the authors’ response to the previous reviews.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      My main concern is still in place. It is unclear whether the proposed method can find actual goal states, and as a result it is unclear what states it finds. Table S1 mentions the model BIOMD0000000454, which is a small metabolic pathway with known equations given in "Example One" in "Metabolic Control Analysis: Rereading Reder". In this model the goal states can be calculated analytically.

      Regarding your statements below: I am not concerned that your method will be less efficient than random search (or any other search..) on small models, but I think it is important for the readers to have evidence that your method is able to discover true goal states at least in small networks, used in your study. You do show that your method scales to complex models. So, in my opinion, the missing part is to show that it is able to find true goal states.

      "...For simple models whose true steady-state distribution can be derived numerically and/or analytically, it is very likely that their exploration will be much simpler and this is not where a lot of improvement over random search may be found, which explains our focus on more complex models..."

      We thank you for your response and for your concerns on the lack of evidence that our method is able to re-discover the true goal states of simple models when these are known a priori. We acknowledge that adding these simple cases is useful for completeness. We did not include these simple models in our main study because in most cases a basic random search over the initial conditions will lead to the re-discovery of these goal states. For instance for the mentioned model BIOMD0000000454 described in the "Example One" from the "Metabolic Control Analysis: Rereading Reder" paper, several simplifying assumptions are made such that the system only has one steady state (x1=0.056, x2=0.769, x3=4.231) which can be found analytically as shown in the paper. In that simple case, this goal state is also straightforward to find with numerical simulation as any valid initial condition will converge to it.

      To address the concerns of the reviewer, we propose to add an additional "sanity check" figure in the supplementary material of the revised paper (Figure S4), as well as a “sanity check” subsection in the “Methods”, to present additional experiments made on simple models such as this one. The new figure and subsection can be viewed in the paper’s interactive version available online https://developmentalsystems.org/curious-exploration-of-grn-competencies, and we plan to include them as such in the next revision. We have also included the full code to reproduce this sanity check as a ‘sanity_check.ipynb’ Jupyter notebook in the GitHub repository (https://github.com/flowersteam/curious-exploration-of-grn-competencies/blob/main/notebooks/sanity_check.ipynb).

      In the new figure S4-b, we show the results of our exploration pipeline on the suggested model BIOMD0000000454 as described in the "Example One" of the paper. These results provide evidence that the curiosity search is able to recover the correct unique goal state (x1=0.056, x2=0.769, x3=4.231), as expected.

      We also include a second sanity check on BIOMD0000000341, which models the dynamics of beta-cell mass, insulin and glucose. This model has two stable fixed points representing physiological (B=300, I=10, G=100) and pathological (B=0, I=0, G=600) steady states, which are the known ground truth steady states as described in Figure 3 of the "A Model of β-Cell Mass, Insulin, and Glucose Kinetics: Pathways to Diabetes" paper. Again, as expected, curiosity search is able to recover those two steady states (Figure S4-a).
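The random-search baseline discussed in this response can be sketched as follows: sample initial conditions uniformly, simulate each to the end, and collect the distinct end states. This is a minimal sketch, assuming a toy one-dimensional bistable system (dx/dt = x - x^3, with stable fixed points at -1 and +1) as a stand-in. It does not reproduce the equations of BIOMD0000000341 or BIOMD0000000454, and all function names are hypothetical.

```python
import random


def simulate(x0, f, dt=0.01, steps=5000):
    """Integrate dx/dt = f(x) with forward Euler and return the final state."""
    x = x0
    for _ in range(steps):
        x += dt * f(x)
    return x


def random_search_steady_states(f, low, high, n_samples=200, decimals=3, seed=0):
    """Sample initial conditions uniformly and collect the distinct end states."""
    rng = random.Random(seed)
    found = set()
    for _ in range(n_samples):
        x_end = simulate(rng.uniform(low, high), f)
        found.add(round(x_end, decimals))  # merge numerically identical end states
    return sorted(found)


def toy_bistable(x):
    """Toy stand-in system: stable fixed points at -1 and +1, unstable at 0."""
    return x - x**3


steady_states = random_search_steady_states(toy_bistable, -2.0, 2.0)
```

For a system this simple, uniform sampling of initial conditions recovers both attractors almost immediately, which is consistent with the point made above that simple models leave little room for improvement over random search.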

      As stated in our previous answer, our main study focuses on more complex models that are not limited to one or a few attractors that can easily be discovered with random initial conditions. Regarding the mentioned BIOMD0000000454, something that may have been confusing for the reviewer is that we did include it in our main study but, as specified in the caption of table S4, unlike what is done in the "example one" of the original paper, we let the metabolite concentrations y1,...,y5 evolve in time (instead of enforcing them as constants). When doing so, the resulting dynamics of the system are more complex and exhibit a spectrum of possible steady states (unknown a priori), which differs from the previous case with a single steady state. In that case, the new attractors are not easy to find analytically and the proposed curiosity search becomes interesting as it is able to uncover the distribution of possible steady states much more efficiently than a random search baseline, as shown in the new figures S4-c and S4-d.

      We hope that these new results will address the reviewer’s concerns and provide evidence to the readers on the validity of the approach on simple networks.

      eLife assessment

      This important study develops a machine learning method to reveal hidden unknown functions and behavior in gene regulatory networks by searching parameter space in an efficient way. The evidence for some parts of the paper is still incomplete and needs systematic comparison to other methods and to the ground truth, but the work will be of broad interest to anyone working in biology of all stripes since the ideas reach beyond gene regulatory networks to revealing hidden functions in any complex system with many interacting parts.

      We thank the editors and reviewers for their positive assessment and constructive suggestions. In our response, we acknowledge the importance of systematic comparison to other methods and to the ground truth, when available. However we also emphasize the challenges associated with evaluating such methods in the context of uncovering hidden behaviors in complex biological networks as the ground truth is often unknown. We hope that our explanations will clarify the potential of our approach in advancing the exploration of these systems.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary: This paper suggests to apply intrinsically-motivated exploration for the discovery of robust goal states in gene regulatory networks.

      Strengths:

      The paper is well written. The biological motivation and the need for such methods are formulated extraordinarily well. The battery of experimental models is impressive.

      We thank the reviewer for sharing interest in the research problem and for recognizing the strengths of our work.

      Weaknesses:

      (1) The proposed method is compared to the random search. That says little about the performance with regard to the true steady-state goal sets. The latter could be calculated at least for a few simple ODE (e.g., BIOMD0000000454, 'Metabolic Control Analysis: Rereading Reder'). The experiment with 'oscillator circuits' may not be directly interpolated to the other models.

      The lack of comparison to the ground truth goal set (attractors of ODE) from arbitrary initial conditions makes it hard to evaluate the true performance/contribution of the method. A part of the used models can be analyzed numerically using JAX, while there are models that can be analyzed analytically.

      "...The true versatility of the GRN is unknown and can only be inferred through empirical exploration and proxy metrics....": one could perform a sensitivity analysis of the ODEs, identifying stable equilibria. That could provide a proxy for the ground truth 'versatility'.
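The stability check suggested by the reviewer can be illustrated with a minimal sketch: linearize the ODE at a candidate equilibrium and test whether all eigenvalues of the Jacobian have negative real part. The example assumes a hypothetical two-variable toy system (dx/dt = x - x^3, dy/dt = -y), not any of the models in the study. The Jacobian is estimated by central finite differences, and the eigenvalue formula used here only covers the 2x2 case.

```python
import cmath


def jacobian(f, state, eps=1e-6):
    """Estimate the Jacobian of f at `state` by central finite differences."""
    n = len(state)
    jac = [[0.0] * n for _ in range(n)]
    for j in range(n):
        up, down = list(state), list(state)
        up[j] += eps
        down[j] -= eps
        f_up, f_down = f(up), f(down)
        for i in range(n):
            jac[i][j] = (f_up[i] - f_down[i]) / (2 * eps)
    return jac


def eigenvalues_2x2(jac):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    (a, b), (c, d) = jac
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2


def is_stable(f, equilibrium):
    """An equilibrium is linearly stable if every eigenvalue has negative real part."""
    return all(ev.real < 0 for ev in eigenvalues_2x2(jacobian(f, equilibrium)))


def toy_field(state):
    """Toy vector field with equilibria at (-1, 0), (0, 0) and (1, 0)."""
    x, y = state
    return [x - x**3, -y]
```

Calling `is_stable(toy_field, [1.0, 0.0])` classifies that equilibrium as stable, while the origin is a saddle and is rejected. For larger networks the equilibria themselves would first have to be located, which is the harder part of the problem.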

      We agree with the reviewer that one primary concern is to properly evaluate the effectiveness of the proposed method. However, as we move toward complex pathways, the “true” steady-state goal sets are often unknown, which is where machine learning methods such as the one we propose are particularly interesting (but challenging to evaluate).

      For simple models whose true steady-state distribution can be derived numerically and/or analytically, it is very likely that their exploration will be much simpler and this is not where a lot of improvement over random search may be found, which explains our focus on more complex models. While we agree that it is still interesting to evaluate exploration methods on these simple models for checking their behavior, it is not clear how to scale this analysis to the targeted more complex systems.

      For systems whose true steady state distribution cannot be derived analytically or numerically, we believe that random search is a pertinent baseline as it is commonly used in the literature to discover the attractors/trajectories of a biological network. For instance, Venkatachalapathy et al. [1] initialize stochastic simulations at multiple randomly sampled starting conditions (which is called a kinetic Monte Carlo-based method) to capture the steady states of a biological system. Similarly, Donzé et al. [29] use a Monte Carlo approach to compute the reachable set of a biological network «when the number of parameters  is large and their uncertain range  is not negligible». For the considered models, the true steady-state goal set is unknown, which is why we chose comparison with random search. We added a “Statistics” subsection in the Methods section providing additional details about the statistical analyses we perform between our method and the random search baseline.

      (2) The proposed method is based on 'Intrinsically Motivated Goal Exploration Processes with Automatic Curriculum Learning', which assumes state action trajectories [s_{t_0:t}, a_{t_0:t}] ('2.1 Notations and Assumptions' in the IMGEP paper). However, the models used in the current work do not include external control actions, but rather only the initial conditions can be set. It is not clear from the methods whether IMGEP was adapted to this setting, and how the exploration policy was designed w/o actual time-dependent actions. What does "...generates candidate intervention parameters to achieve the current goal....", mean considering that interventions 'Sets the initial state...' as explained in Table 2?

      We thank the reviewer for asking for clarification, as indeed the IMGEP methodology originates from developmental robotics scenarios which generally focus on the problem of robotic sequential decision-making, therefore assuming state action trajectories as presented in Forestier et al. [65]. However, in both cases, note that the IMGEP is responsible for sampling parameters which then govern the exploration of the dynamical system. In Forestier et al. [65], the IMGEP also only sets one vector at the start (denoted ) which was specifying parameters of a movement (like the initial state of the GRN), which was then actually produced with dynamic motion primitives which are dynamical system equations similar to GRN ODEs, so the two systems are mathematically equivalent. More generally, while in our case the “intervention” of the IMGEP (denoted ) only controls the initial state of the GRN, future work could consider more advanced sequential interventions simply by setting parameters of an action policy  at the start which could be called during the GRN’s trajectory to sample control actions  where  would be the state of the GRN. In practice this would also require setting only one vector at the start, so it would remain the same exploration algorithm and only the space of parameters would change, which illustrates the generality of the approach.
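The exploration loop described in this answer, where a single parameter vector is set at the start and the system then runs on its own, can be sketched in a heavily simplified form. The sketch below is an assumption-laden illustration, not the authors' implementation: it uses a one-dimensional parameter and outcome, uniform goal sampling in a fixed range, nearest-neighbor selection in the history, and Gaussian mutation. The function names and the toy `system` are hypothetical.

```python
import random


def imgep(system, n_random=20, n_goal=180, sigma=0.05, seed=0):
    """Simplified goal exploration loop over a scalar parameter in [0, 1].

    system: function mapping a parameter (e.g. an initial state) to an outcome.
    Returns the history as a list of (parameter, outcome) pairs.
    """
    rng = random.Random(seed)
    history = []

    # Bootstrap phase: try random parameters to seed the history.
    for _ in range(n_random):
        p = rng.random()
        history.append((p, system(p)))

    # Goal-directed phase: sample a goal in outcome space, reuse the parameter
    # whose outcome was closest to that goal, and perturb it slightly.
    for _ in range(n_goal):
        goal = rng.random()
        p_near, _ = min(history, key=lambda pair: abs(pair[1] - goal))
        p = min(1.0, max(0.0, p_near + rng.gauss(0.0, sigma)))
        history.append((p, system(p)))
    return history


def toy_system(p):
    """Toy redundant mapping: most parameters produce outcomes close to 0."""
    return p ** 8


exploration_history = imgep(toy_system)
```

Because the only thing the loop manipulates is the parameter handed to `system`, replacing the scalar by the parameters of an action policy would leave the loop unchanged, which is the generality argument made above.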

      (3) Fig 2 shows the phase space for (ERK, RKIPP_RP) without mentioning the typical full scale of ERK, RKIPP_RP. It is unclear whether the path from (0, 0) to (~0.575, ~3.75) at t=1000 is significant on the typical scale of this phase space.

      The purpose of Figure 2 is to illustrate an example of GRN trajectory in transcriptional space, and to illustrate what “interventions” and “perturbations” can be in that context. To that end we have used the fixed initial conditions provided in the BIOMD0000000647, replicating Figure 5 of Cho et al. [56].

      While we are not sure what the reviewer means by the “typical” scale of this phase space, we would like to point the reviewer toward Figure 8, which shows examples of paths that reach farther points in the same phase space (up to ~10 in RKIPP_RP levels and ~300 in ERK levels). However, while the paths displayed in Figure 8 are possible (and were discovered with the IMGEP), note that they may be “rarer” to occur naturally, in the sense that a large portion of the initial conditions tested with random search tend to converge toward smaller (ERK, RKIPP_RP) steady-state values similar to the ones displayed in Figure 2.

      (4) Table 2:

      a. Where is 'effective intervention' used in the method?

      b. In my opinion 'controllability', 'trainability', and 'versatility' are different terms. If their correspondence is important I would suggest extending/enhancing the column "Proposed Isomorphism". Otherwise, it may be confusing.

      a) We thank the reviewer for pointing out that “effective intervention” is not explicitly used in the method. The idea here is that as we are exploring a complex dynamical system (here the GRN), some of the sampled interventions will be particularly effective at revealing novel unseen outcomes whereas others will fail to produce a qualitative change to the distribution of discovered outcomes. What we show in this paper, for instance in Figure 3a and Figure 4, is that the IMGEP method is particularly sample-efficient in finding those “effective interventions”, at least more than a random exploration. However we agree that the term “effective intervention” is ambiguous (does not say effective in what) and we have replaced it with “salient intervention” in the revised version.

      b) We thank the reviewer for highlighting some confusing terms in our chosen vocabulary, and we have clarified those terms in the revised version. We agree that controllability/trainability and versatility are not exactly equivalent concepts, as controllability/trainability typically refers to the amount to which a system is externally controllable/trainable whereas versatility typically refers to the inherent adaptability or diversity of behaviors that a system can exhibit in response to inputs or conditions. However, they are both measuring the extent of states that can be reached by the system under a distribution of stimuli/conditions, whether natural conditions or engineered ones, which is why we believe that their correspondence is relevant.

      I don't see how this table generalizes "concepts from dynamical complex systems and behavioral sciences under a common navigation task perspective".

      We have replaced the verb “generalize” with “investigate” in the revised version.

      Reviewer #2 (Public Review):

      Summary:

      Etcheverry et al. present two computational frameworks for exploring the functional capabilities of gene regulatory networks (GRNs). The first is a framework based on intrinsically-motivated exploration, here used to reveal the set of steady states achievable by a given gene regulatory network as a function of initial conditions. The second is a behaviorist framework, here used to assess the robustness of steady states to dynamical perturbations experienced along typical trajectories to those steady states. In Figs. 1-5, the authors convincingly show how these frameworks can explore and quantify the diversity of behaviors that can be displayed by GRNs. In Figs. 6-9, the authors present applications of their framework to the analysis and control of GRNs, but the support presented for their case studies is often incomplete.

      Strengths:

      Overall, the paper presents an important development for exploring and understanding GRNs/dynamical systems broadly, with solid evidence supporting the first half of their paper in a narratively clear way.

      The behaviorist point of view for robustness is potentially of interest to a broad community, and to my knowledge introduces novel considerations for defining robustness in the GRN context.

      We thank the reviewer for recognizing the strengths and novelty of the proposed experimental framework for exploring and understanding GRNs, and complex dynamical systems more generally. We agree that the results presented in the section “Possible Reuses of the Behavioral Catalog and Framework” (Fig 6-9) can be seen as incomplete along certain aspects, which we tried to make as explicit as possible throughout the paper, and why we explicitly state that these are “preliminary experiments”. Despite the discussed limitations, we believe that these experiments are still very useful to illustrate the variety of potential use-cases in which the community could benefit from such computational methods and experimental framework, and build on for future work.

      Some specific weaknesses, mostly concerning incomplete analyses in the second half of the paper:

      (1) The analysis presented in Fig. 6 is exciting but preliminary. Are there other appropriate methods for constructing energy landscapes from dynamical trajectories in gene regulatory networks? How do the results in this particular case study compare to other GRNs studied in the paper?

      We are not aware of other methods than the one proposed by Venkatachalapathy et al. [1] for constructing an energy landscape given an input set of recorded dynamical trajectories, although it might indeed be the case. We want to emphasize that any of such methods would anyway depend on the input set of trajectories, and should therefore benefit from a set that is more representative of the diversity of behaviors that can be achieved by the GRN, which is why we believe the results presented in Figure 6 are interesting. As the IMGEP was able to find a higher diversity of reachable goal states (and corresponding trajectories) for many of the studied GRNs, we believe that similar effects should be observable when constructing the energy landscapes for these GRN models, with the discovery of additional or wider “valleys” of reachable steady states.

      Additionally, it is unclear whether the analysis presented in Fig. 6C is appropriate. In particular, if the pseudopotential landscapes are constructed from statistics of visited states along trajectories to the steady state, then the trajectories derived from dynamical perturbations do not only reflect the underlying pseudo-landscape of the GRN. Instead, they also include contributions from the perturbations themselves.

      We agree that the landscape displayed in Fig. 6C integrates contributions from the perturbations on the GRN’s behavior, and that these can shape the landscape in various ways, for instance affecting the paths that are accessible, the shape/depth of certain valleys, etc. But we believe that qualitatively or quantitatively analyzing the effect of these perturbations on the landscape is precisely what is interesting here: it might help 1) understand how a system responds to a range of perturbations and visualize which behaviors are robust to those perturbations, and 2) design better strategies for manipulating those systems to produce certain behaviors.

      (2) In Fig. 7, I'm not sure how much is possible to take away from the results as given here, as they depend sensitively on the cohort of 432 (GRN, Z) pairs used. The comparison against random networks is well-motivated. However, as the authors note, comparison between organismal categories is more difficult due to low sample size; for instance, the "plant" and "slime mold" categories each only have 1 associated GRN. Additionally, the "n/a" category is difficult to interpret.

      We acknowledge that this part is speculative as stated in the paper: “the surveyed database is relatively small with respect to the wealth of available models and biological pathways, so we can hardly claim that these results represent the true distribution of competencies across these organism categories”. However, when further data is available, the same methodology can be reused and we believe that the resulting statistical analyses could be very informative to compare organismal (or other) categories.

      (3) In Fig. 8, it is unclear whether the behavioral catalog generated is important to the intervention design problem of moving a system from one attractor basin to another. The authors note that evolutionary searches or SGD could also be used to solve the problem. Is the analysis somehow enabled by the behavioral catalog in a way that is complementary to those methods? If not, comparison against those methods (or others e.g. optimal control) would strengthen the paper.

      We thank the reviewer for asking to clarify this point, which might not be clearly explained in the paper. Here the behavioral catalog is indeed used in a complementary way to the optimization method, by identifying a representative set of reachable attractors which are then used to define the optimization problem. For instance here, thanks to the catalog, we 1) were able to identify a “disease” region and several possible reachable states in that region and 2) used several of these states as starting points of our optimization problem, where we want to find a single intervention that can successfully and robustly reset all those points, as illustrated in Figure 8. Please note that given this problem formulation, a simple random search was used as an optimization strategy. When we mention more advanced techniques such as EA or SGD, it is to say that they might be more efficient optimizers than random search. However, we agree that in many cases optimizing directly will not work when starting from a random or bad initial guess, even with EA or SGD. In that case the discovered behavioral catalog can be useful to better initialize this local search and make it more efficient/useful, akin to what is done in Figure 9.

      (4) The analysis presented in Fig. 9 also is preliminary. The authors note that there exist many algorithms for choosing/identifying the parameter values of a dynamical system that give rise to a desired time-series. It would be a stronger result to compare their approach to more sophisticated methods, as opposed to random search and SGD. Other options from the recent literature include Bayesian techniques, sparse nonlinear regression techniques (e.g. SINDy), and evolutionary searches. The authors note that some methods require fine-tuning in order to be successful, but even so, it would be good to know the degree of fine-tuning which is necessary compared to their method.

      We agree that the analysis presented in Figure 9 is preliminary, and thank the reviewer for the suggestion. We would first like to refer to other papers from the ML literature that have more thoroughly analyzed this issue, such as Colas et al. [74] and Pugh et al. [34], and shown the interest of diversity-driven strategies as promising alternatives. Additionally, as suggested by the reviewer, we added a comparison to the CMA-ES algorithm in the revised version in order to complete our analysis. CMA-ES is an evolutionary algorithm which is self-adaptive in the optimization steps and that is known to be better suited than SGD to escape local minima when the number of parameters is not too high (here we only have 15 parameters). However, our results showed that while CMA-ES explores the solution space more than SGD does at the beginning of optimization, it also ultimately converges into a local minimum similarly to SGD. The best solution converges toward a constant signal (of the target b) but fails to maintain the target oscillations, similar to the solutions discovered by gradient descent. We tried this for a few hyperparameters (init mean and std) but always found similar results. We have updated the figure 9 image and caption, as well as descriptive text, to include these new results in the revised version. We also added a reference to the CMA-ES paper in the citations.

      Reviewer #1 (Recommendations For The Authors):

      I would suggest conducting a more rigorous analysis of the performance by estimating/approximating the ground truth robust goal sets in important GRNs.

      Also, the use of terminology from different disciplines can be improved. Please see my comments above. Specifically, the connection between controllability in dynamical control systems and versatility used in this paper is unclear.

      We hope to have addressed the reviewer's concerns in our previous answers.

      Reviewer #2 (Recommendations For The Authors):

      Fig 4b: I'm not sure if DBSCAN is the appropriate method to use here, as the visual focus on the core elements of the clusters downplays the full convex hull of the points that random sampling achieves in Z space. An analysis based on convex hulls or the ball-coverage from Fig. 3b would presumably generate plots that were more similar between random sampling and curiosity search. If the goal is to highlight redundancy/non-linearity in the mapping between Z and I, another approach might be to simply bin Z-space in a grid, or to use a clustering algorithm that is less stringent about core/noise distinctions.

      We thank the reviewer for the suggestion. This plot is intended to give the reader an understanding of why a method that uniformly samples goals in Z (what the IMGEP is doing) is more efficient than a method that uniformly samples parameters in I (what the random search is doing) in systems for which there is high redundancy/non-linearity in the mapping between I and Z. We agree that binning the Z-space in a grid and counting the number of achieved bins is a way to measure this quantitatively, and it is in fact very close to what we do in Figure 3 to measure the achieved diversity. We believe, however, that the clustering and coloring provide additional intuition about why this is the case: they illustrate that large regions of the intervention space map to small regions in the outcome space and vice versa.
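
The grid-binning measure discussed above can be sketched in a few lines. This is only an illustration under stated assumptions: the map `toy_mapping` from I to Z is a made-up, highly redundant function, not one of the GRN models from the paper, and the function names, the 10 x 10 grid and the sample size are hypothetical choices.

```python
import random


def occupied_bins(points, lo, hi, bins_per_axis):
    """Return the set of grid cells (as index tuples) hit by at least one point."""
    width = (hi - lo) / bins_per_axis
    cells = set()
    for point in points:
        cell = tuple(
            min(int((coord - lo) / width), bins_per_axis - 1)  # clamp coord == hi
            for coord in point
        )
        cells.add(cell)
    return cells


def toy_mapping(i1, i2):
    """Hypothetical redundant map from I to Z: most inputs land near the origin."""
    return (i1 ** 4, i2 ** 4)


rng = random.Random(0)
params = [(rng.random(), rng.random()) for _ in range(500)]

# Uniform sampling in I, pushed through the nonlinear map into Z.
z_from_random_params = [toy_mapping(i1, i2) for i1, i2 in params]
# Idealised comparison: points spread uniformly in Z itself.
z_uniform = [(rng.random(), rng.random()) for _ in range(500)]

coverage_random = len(occupied_bins(z_from_random_params, 0.0, 1.0, 10))
coverage_uniform = len(occupied_bins(z_uniform, 0.0, 1.0, 10))
print(coverage_random, coverage_uniform)
```

With a map like this one, uniform samples in I pile up in a few cells of Z, so the number of occupied bins is lower than for points spread uniformly in Z. That is the effect the clustering plot is meant to convey visually.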

      Additional changes in the revised version:

      We added a sentence in the Methods section as well as in the caption of Table S1 providing additional details about the way we simulate the biological models from the BioModels website.

      We corrected a wrong reference to Figure 4 in the Methods “Sensitivity measure” subsection; it now refers to Figure 5.

    1. Reviewer #2 (Public Review):

      In this manuscript, Yang et al. present a modeling framework to understand the pattern of response biases and variance observed in delayed-response orientation estimation tasks. They combine a series of modeling approaches to show that coupled sensory-memory networks are in a better position than single-area models to support experimentally observed delay-dependent response bias and variance in cardinal compared to oblique orientations. These errors can emerge from a population-code approach that implements efficient coding and Bayesian inference principles and is coupled to a memory module that introduces random maintenance errors. A biological implementation of such operation is found when coupling two neural network modules, a sensory module with connectivity inhomogeneities that reflect environment priors, and a memory module with strong homogeneous connectivity that sustains continuous ring attractor function. Comparison with single-network solutions that combine both connectivity inhomogeneities and memory attractors shows that two-area models can more easily reproduce the patterns of errors observed experimentally.

      Strengths:

      The model provides an integration of two modeling approaches to the computational bases of behavioral biases: one based on Bayesian and efficient coding principles, and one based on attractor dynamics. These two perspectives are not usually integrated consistently in existing studies, which this manuscript beautifully achieves. This is a conceptual advancement, especially because it brings together the perceptual and memory components of common laboratory tasks.

      The proposed two-area model provides a biologically plausible implementation of efficient coding and Bayesian inference principles, which interact seamlessly with a memory buffer to produce a complex pattern of delay-dependent response errors. No previous model had achieved this.

      Weaknesses:

      The correspondence between the various computational models is not clearly shown. It is not easy to see this correspondence clearly because network function is illustrated with different representations for different models. In particular, the Bayesian model of Figure 2 is illustrated with population responses for different stimuli and delays, while the attractor models of Figures 3 and 4 are illustrated with neuronal tuning curves but not population activity.

      The proposed model has stronger feedback than feedforward connections between the sensory and memory modules (J_f = 0.1 and J_b = 0.25). This is not the common assumption when thinking about hierarchical processing in the brain. The manuscript argues that error patterns remain similar as long as the product of J_f and J_b is constant, so it is unclear why the authors preferred this network example as opposed to one with J_b = 0.1 and J_f = 0.25.

    1. To get the code, send an HTTP POST request to the /v1/DeviceCode endpoint.

      To use the code, send an HTTP POST request to the /v1/DeviceCode endpoint and authenticate with this user code as the tenant's Global Admin.
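
As a rough sketch of the request described above, the snippet below builds (but does not send) a POST to the /v1/DeviceCode path. The host name, the JSON content type and the payload fields are all assumptions made for illustration; the source names only the path and the HTTP method, so check the actual API documentation before relying on any of these details.

```python
import json
import urllib.request

# Hypothetical host: the source names only the /v1/DeviceCode path.
BASE_URL = "https://api.example.com"


def build_device_code_request(payload):
    """Build (but do not send) an HTTP POST request to /v1/DeviceCode."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=BASE_URL + "/v1/DeviceCode",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed content type
        method="POST",
    )


# The payload fields are placeholders; the real API may expect different ones.
request = build_device_code_request({"example_field": "example_value"})
# Sending would be: urllib.request.urlopen(request)
```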

    1. Late-replicating regions show higher under-representation in both non-subtelomeric and subtelomeric regions (N.S. p > 0.05; ***p < 0.0001, Wilcoxon rank test). All 200-bp windows of measured under-replication are split into bins based on their replication timing (data from ref. 51). Colour code corresponds to the proximity to telomeres (1:100 kb). ~0.1% of 200-bp regions have under-representation less than −20%; for visualization we plot them at −20%.

      After release from G1, subtelomeric regions were more under-replicated than regions further away from telomeric regions.

    1. Reviewer #3 (Public Review):

      Summary:

      Wang and van Ede investigate whether and how attention re-orients within visual working memory following expected and unexpected centrally presented memory tests. Using a combination of spatial modulations in neural activity (EEG-alpha lateralization) and gaze bias quantified as time courses of microsaccade rate, the authors examined how retro cues with varying levels of reliability influence attentional deployment and subsequent memory performance. The conclusion is that attentional re-orienting occurs within visual working memory, even when tested centrally, with distinct patterns following expected and unexpected tests. The findings provide new value for the field and are likely of broad interest and impact, by highlighting working memory as an action-bound process (in)dependent on (an ambiguous) past.

      Strengths:

      The study uniquely integrates behavioral data (accuracy and reaction time), EEG-alpha activity, and gaze tracking to provide a comprehensive analysis of attentional re-orienting within visual working memory. As is typical for this research group, the validity of the findings follows from the task design, which effectively manipulates the reliability of retro cues and isolates attentional processes related to memory tests. The use of well-established markers for spatial attention (i.e. alpha lateralization) and a more recently entangled dependent variable (gaze bias) is commendable. Utilizing these dependent metrics, the concise report presents a thorough analysis of the scaling effects of cue reliability on attentional deployment, both at the behavioral and neural levels. The clear demonstration of prolonged attentional deployment following unexpected memory tests is particularly noteworthy; although there are, by definition, no significant time clusters because time is not a factor in a statistical sense, the jackknife approach is convincing. Overall, the evidence is compelling, allowing the conclusion of a second stage of internal attentional deployment following both expected and unexpected memory tests, highlighting the importance of memory verification and re-orienting processes.

      Weaknesses:

      I want to stress upfront that these weaknesses are not specific to the presented work and do not affect my recommendation of the paper in its present form.

      The sample size is consistent with previous studies, but a larger sample could enhance the generalizability and robustness of the findings. The authors acknowledge high noise levels in EEG-alpha activity, which may affect the reliability of this marker. This is a general issue in non-invasive electrophysiology that cannot be handled by the authors, but an interested reader should be aware of it. Effectively, the sensitivity of the gaze analysis appears "better" in part due to the better SNR. The latter also sets the boundaries for single-trial analyses, as the authors correctly mention. In terms of generalizability, I am convinced that the main outcome will likely generalize to different samples and stimulus types. Yet, as is typical for the field, future research could explore different contexts and task demands to validate and extend the findings. The authors provide here how and why (including sharing of data and code).

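    1. Instructions for creating snippets can be found here. Useful, ready-made snippets can also be found as VS Code extensions in the marketplace. The most important snippet is the one for the console.log() command, for example, clog. It can be created like this: { "console.log": { "prefix": "clog", "body": [ "console.log('$1')" ], "description": "Log output to console" } } Debugging your code with console.log() is so common that Visual Studio Code has this snippet built in. To use it, type log and press Tab to autocomplete. More full-featured console.log() snippet extensions can be found in the marketplace.

      Later

      About configuring code snippet shortcuts in VS Code.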
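
For readability, the snippet definition quoted above can be rebuilt as a Python dictionary and serialized to indented JSON, ready to paste into a VS Code user snippets file. This is a minimal sketch: the content mirrors the quoted snippet, with the trailing comma after the single body line dropped so that strict JSON parsers accept it. The variable names are illustrative only.

```python
import json

# Same snippet definition as in the quoted text, without the trailing comma
# after the single body line so that strict JSON parsers accept it.
snippet = {
    "console.log": {
        "prefix": "clog",
        "body": ["console.log('$1')"],
        "description": "Log output to console",
    }
}

snippet_json = json.dumps(snippet, indent=2)
print(snippet_json)
```

Typing the prefix `clog` and pressing Tab would then expand to the body, with the cursor placed at the `$1` tab stop.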

    1. Stripped out all the legacy "desktop UI" stuff, and replaced with a simpler "multi-page notebook" metaphor, then it could be massively more compelling to people. It then becomes a "personal notebook" for doing little sketches / experiments. If it's also "social" ie. has chat streams. Or is like the Smallest Federated Wiki. Or has other ways to sync sketches and pages etc. then this would be spectacular. And the Smalltalk VM / infrastructure is perfect for it.

      I have found the GT/Lepiter GUI pretty compelling for learners in my local hackerspace and in the information science department, both spaces where I'm a facilitator/teacher. It provides a pretty focused experience, stripped of the overwhelming initial experience of the Pharo/Squeak GUI. It is not well suited for "classical Smalltalkers", though: I have been talking with some of them, and they find the DX too specific and even cumbersome for some tasks they usually do (that has not been our case so far).

      In our last use case at the university, the students are creating a personal code repository in Fossil, with data narratives, and they do a critical/annotated reading using Hypothesis (this very technology), which is a kind of personal, public, wiki-like portfolio for data narratives. They also put in their own repositories the reading notes for the data stories I published previously, where I introduce Smalltalk or give an introduction to data representation and processing in Pharo.

      This could be another approach for wikis in the classroom, one that is an alternative to our use of interpersonal wikis with TiddlyWiki. At some point, and in a pretty organic way, the idea would be to have them all integrated and powered by "context aware" and thematic chatbots (made in Pharo).

    2. What I think Smalltalk should look like in 2018 is something like JuPyter / iPython notebook. Or, at a pinch, HyperCard. I open "Smalltalk" (whether that's a browser-based version equivalent to Amber, LivelyKernel or Peter Fisk's Smalltalk Express, or a desktop version like PharoLanguage or SqueakLanguage), and what I see is a "smart notebook" type metaphor: A single page that takes up the whole window. To which I can start adding "cells" or "cards" containing either code or "literate" style documentation, or output produced by the code. You'd still have tools like the Class Browser etc. But they'd be integrated within the same UI. Ie. the class browser is just more "pages" in the notebook. There's no workspace or transcript because every page can have live code on it. This UI is immediate. And focused on "do something".

      On a similar approach, I created and actively developed from 2014 to 2019 Grafoscopio, which, while being inside Pharo and a companion of all other tools, was providing a computable outliner to do something: write computable and reproducible documents and bridge the gap between the IDE and the app for a more mature audience (a similar approach for children was previously tested in Squeak, with Etoys).

      This allowed me to write the Grafoscopio Manual (2016) inside Grafoscopio, or to do some hacktivist republishing with the community, as we did with the Data Journalism Handbook (2018).

      Of course, since those initiatives came from the so-called "Global South", and Grafoscopio was my first "real program" ever, they lacked the visibility of Global North initiatives, like the ones you collected in Smart Academic Notebook, but they were acknowledged and appreciated in small/specialized communities, like the Pharo community.

      With the new GUI/DX provided by Lepiter (2021), I have been migrating the Grafoscopio Lessons from the previous half decade to this technology with the MiniDocs package, and I imagine Grafoscopio becoming more of a software distribution on top of Pharo/GT, providing documentation and collaboration workflows and improved outlining with packages like TiddlyWikiPharo or the Brea decoupled CMS / static site generator.

      BTW, as I don't know how to add comments or suggest updates, I wonder why this note has not been updated with Lepiter, as it provides pretty much the experience you have been advocating for since 2018 and it is already in your wiki/bliki. Maybe it is just a matter of some wiki refactoring and a links update.

    1. An eSTL gets an input signal “L1” and should give an expected output signal “L*”. In case one of the temperature sensors reaches a temperature beyond the programmed limit, the output signal L* is disabled and an LED error code is generated to detect the reason for tripping. The main contactor K1 is also deactivated in this case.

      The input to L1 on the ESTL comes from the A10 I/O PCB Terminal X22 pin number 1 (marked with an arrow on the pcb). This is the 240V output from A10 to the safety chain.

      The output from the ESTL (Terminal L*) terminates on connection A1 of the K1 contactor coil, thus potentially switching K1 on or off, dependent of course upon the condition of the safety chain.

      At the same time the output from the ESTL (Terminal L*) also returns to pin 3 of A10 terminal X22 to signal to A10 the condition of the safety chain, enabling the processor to provide appropriate indications of safety chain status. (Service 72 or Error 72).

      Where two ESTLs are employed (20-2/1E models), the output terminal L* on the first ESTL is connected directly to the input terminal L1 on the second ESTL, thus connecting their outputs in series, meaning both ESTL outputs must be closed for the K1 contactor to energise.
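
The series wiring described above amounts to a logical AND. The sketch below is a simplified model, not a wiring specification: the function name and the boolean inputs are hypothetical, and real fault-finding should follow the wiring diagrams in the training manual.

```python
def k1_coil_energised(l1_input_live, estl_outputs_closed):
    """Model one or two eSTLs wired in series between L1 and the K1 coil (A1)."""
    # The signal reaches K1 only if the input is live and every eSTL passes it on.
    return l1_input_live and all(estl_outputs_closed)


# Two eSTLs (20-2/1E models): one tripped eSTL is enough to drop out K1.
print(k1_coil_energised(True, [True, True]))
print(k1_coil_energised(True, [True, False]))
```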

      See section 24 of the training manual for electrical wiring diagrams - commencing page 181.

    2. The password to enter the service menu for technicians is “TECLEVEL”

      It is no longer possible to enter the iCombi Classic Service menu unless you have downloaded the Rational PIN Creator App and have a valid log in and password.

      Attempts to enter the Service Menu will generate a QR code on the screen.

      Scanning the QR code with the Rational PIN Creator app will generate a PIN which if used on the screen will enable entry to the Service Menu for technicians.

      See Video Library "Access to Service menu for Technicians"

      If you have undergone technical training by Rational UK or its approved training partners you can obtain a log in and password from Rational UK.

    3. 34.32

      Error code description:

      Faulty data communication to automatic ignition controller.

      Condition for error detection:

      BUS signal from automatic ignition controller is missing or is not transmitted for at least 5 seconds at a time.

      Error area:

      Data transfer cable, automatic ignition controller.

      Relevant causes/components:

      • Electrical connection to components
      • Automatic ignition controller
    4. 34.16

      Error code description:

      Faulty data communication to the I/O board.

      Condition for error detection:

      BUS signal from I/O board is missing or is not transmitted for at least 5 seconds at a time.

      Error area:

      Data transfer cable, I/O board

      Relevant causes/components:

      • Electrical connection to components
      • I/O board
    5. 34.8

      Error code description:

      Faulty data communication to pump PCB.

      Condition for error detection:

      BUS signal from pump PCB is missing or is not transmitted for at least 5 seconds at a time.

      Error area:

      Data transfer cable, pump PCB

      Relevant causes/components:

      • Electrical connection to components
      • Pump PCB
    6. 34.4

      Error code description:

      Faulty data communication to bottom fan motor.

      Condition for error detection:

      BUS signal from bottom fan motor is missing or is not transmitted for at least 5 seconds at a time.

      Error area:

      Data transfer cable, control electronics for bottom fan motor

      Relevant causes/components:

      • eSTL has initialised
      • Electrical connection to components
      • Control electronics for bottom fan motor
    7. 34.2

      Error code description:

      Faulty data communication to the middle fan motor on floor units and the bottom fan motor on 10 tray table top units.

      Condition for error detection:

      BUS signal from middle fan motor on floor units and the bottom fan motor on 10 tray table top units is missing or is not transmitted for at least 5 seconds at a time.

      Error area:

      Data transfer cable, control electronics for middle fan motor on floor units and the bottom fan motor on 10 tray table top units

      Relevant causes/components:

      • eSTL has initialised
      • Electrical connection to components
      • Control electronics for middle fan motor
    8. 27

      Error code description:

      ICP XS & CMP XS ONLY

      Malfunction when opening/closing the drain valve

      Condition for error detection:

      The drain valve receives a request to open/close, but the drain valve still does not detect an end position after 4 minutes.

      Error area: Drain valve

      Relevant causes/components:

      • Drain valve
      • Electrical connection to components
    9. 14

      Error code description: The steam generator water level detection is faulty.

      Condition for error detection: The steam generator is filled via valve Y1, but no “full” water level is measured by the level electrode, even after a long time.

      If the measured water flow from the CDS sensor is significantly greater than the reference volume of the steam generator, this service error 14 is triggered.

      Error area:

      Water level detection in the steam generator, reference volume steam generator, water leakage when filling the steam generator.

      Relevant causes/components:

      • Level electrode
      • Electrical connection to components
      • Conductance water/threshold value for level detection
      • Steam generator reference volume

    10. 12

      Error code description:

      Incorrect water quantity/flow measurement when filling the steam generator.

      Condition for error detection:

      A flow rate in a corresponding tolerance range is expected on the CDS sensor. The water also actively enters the steam generator, the level electrode indicates a full water level, but the CDS sensor passes on an error/implausible flow or no signal at all.

      Error area:

      Water supply, water flow detection, fill level detection

      Relevant causes/components:

      • Water supply
      • Solenoid valve block with CDS sensor
      • Electrical connection (CDS sensor – I/O module)
      • Level electrode
      • Steam generator reference volume
      • Incorrect CDS pulse saved
    11. 34.1

      Error code description:

      Faulty data communication to top fan motor.

      Condition for error detection:

      BUS signal from top fan motor is missing or is not forwarded for at least 5 seconds at a time.

      Error area:

      Data transfer cable, control electronics for top fan motor

      Relevant causes/components:

      • eSTL has initialised
      • Electrical connection to components
      • Control electronics for top fan motor

      Other information:

      Before disconnecting the unit or replacing a component, download all data from the unit to a USB stick. This data is required if further support from RATIONAL is required.

      NOTE: This error is reset automatically as soon as a BUS signal is present for at least 5 seconds at a time.

      NOTE: If the ESTL has triggered (Service 72), one or more Service 34 errors are often displayed because the eSTL de-energises the affected BUS participants.

      After a successful reset of the Service 72 error, the Service 34 errors are also reset automatically.
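
The Service 34 entries above share one detection rule: the error is raised when a BUS participant is silent for at least 5 seconds at a time, and it resets automatically once the signal has been present for at least 5 seconds at a time. The following is a minimal sketch of that rule with a lookup of the sub-codes listed in these notes. The class and its method names are hypothetical and say nothing about how the unit's firmware is actually written. The sub-codes happen to be powers of two, but the source does not say whether combined values are ever displayed, so only exact codes are looked up.

```python
SERVICE_34_COMPONENTS = {
    "34.1": "top fan motor",
    "34.2": "middle fan motor (floor units) / bottom fan motor (10 tray table top units)",
    "34.4": "bottom fan motor",
    "34.8": "pump PCB",
    "34.16": "I/O board",
    "34.32": "automatic ignition controller",
}

TIMEOUT_SECONDS = 5.0  # from the text: "at least 5 seconds at a time"


class BusMonitor:
    """Track one BUS participant: error after 5 s of silence, reset after 5 s of signal."""

    def __init__(self, code):
        self.code = code
        self.error_active = False
        self._streak_state = True   # True = signal present in the current streak
        self._streak_seconds = 0.0

    def update(self, signal_present, elapsed_seconds):
        """Feed one observation covering elapsed_seconds; return the error state."""
        if signal_present == self._streak_state:
            self._streak_seconds += elapsed_seconds
        else:
            self._streak_state = signal_present
            self._streak_seconds = elapsed_seconds
        if self._streak_seconds >= TIMEOUT_SECONDS:
            # A full 5 s streak of silence raises the error; of signal, clears it.
            self.error_active = not signal_present
        return self.error_active


monitor = BusMonitor("34.8")
monitor.update(False, 5.0)  # pump PCB silent for 5 s
print(monitor.code, SERVICE_34_COMPONENTS[monitor.code], monitor.error_active)
```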

    12. 25

      Service 25

      Error code description: No or insufficient water circulation is detected during an iCareSystem cleaning program, self-test or module test.

      Condition for error detection: Water hitting the fan wheel during circulation increases the output of the fan motor.

      If this output increase is too low, the system assumes that no or too little water is entering the cooking cabinet.

      Relevant causes/components:

      • Drain sieve
      • Connection between control box and cleaning box
      • Cleaning carried out incorrectly
      • Only for XS units (6-2/3): circulation pump or drain valve faulty
      • Cleaning pipe
      • Air baffle, baffle plate
      • Cleaning box
      • Fan motor
      • Foam brake
      • Faulty water outlet/water loss

    13. The password to enter the service menu for technicians is “TECLEVEL”

      It is no longer possible to enter the iCombi Pro Service menu unless you have downloaded the Rational PIN Creator App and have a valid log in and password.

      Attempts to enter the Service Menu will generate a QR code on the screen.

      Scanning the QR code with the Rational PIN Creator app will generate a PIN which if used on the screen will enable entry to the Service Menu for technicians.

      See Video Library "Access to Service menu for Technicians"

      If you have undergone technical training by Rational UK or its approved training partners you can obtain a log in and password from Rational UK.

    14. X5: B4 thermocouple humidity

      Thermocouple B4 measures the temperature behind the top circulation fan during the initial Self test calibration process and during any subsequent manual calibration. It also monitors the cabinet temperature for humidity control above the local boiling point.

      The combi-steamer checks the temperature of thermocouple B4 every second (in all modes). As soon as an implausible temperature value is measured, service error code Service 20.4 is indicated.

      Check the value of temperature sensor B4 in the service menu:

      For iCombi Pro: Select the steam or humidity launchers - go to Diagnosis.

      For iCombi Classic: Select Diagnostics and B4

      If this value is approx. 615°C [1140°F], either the sensor is defective (Open Circuit) or the screw connection on the inner cabinet is loose or not adequately insulated or the electrical connection between the temperature sensor and the I/O PCB is faulty.

    15. X4: B2 thermocouple control

      B2 is situated in the Control box (formerly known as the quench box) and its function is to reduce the volume of steam emissions from the vent stack and also to reduce the temperature of waste water through the drain to a level below 65 degrees during normal oven cooking processes.

      The combi-steamer checks the temperature of thermocouple B2 every second (in all modes).

      As soon as an implausible temperature value is measured, service error code Service 20.2 is indicated.

      Check the value of temperature sensor B2 in the service menu:

      For iCombi Pro: Select the steam or humidity launchers - go to Diagnosis.

      For iCombi Classic: Select Diagnostics and B2

      If this value is approx. 615°C [1140°F], either the sensor is defective (Open Circuit) or the electrical connection between the temperature sensor and the I/O PCB is faulty.
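
The diagnostic rule for B4 and B2 above (a displayed value of approx. 615°C points to an open circuit or a faulty connection) can be sketched as a simple check. The tolerance of 10°C is an assumption, since the text says only "approx.", and the function names are hypothetical. The conversion also shows where the quoted Fahrenheit figure comes from: 615°C is 1139°F, which the notes round to 1140°F.

```python
OPEN_CIRCUIT_READING_C = 615.0   # approximate displayed value, from the text
TOLERANCE_C = 10.0               # assumption: the text only says "approx."


def celsius_to_fahrenheit(temp_c):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return temp_c * 9.0 / 5.0 + 32.0


def looks_like_open_circuit(displayed_temp_c):
    """Flag a displayed thermocouple value close to the open-circuit reading."""
    return abs(displayed_temp_c - OPEN_CIRCUIT_READING_C) <= TOLERANCE_C


print(celsius_to_fahrenheit(OPEN_CIRCUIT_READING_C))
print(looks_like_open_circuit(614.0), looks_like_open_circuit(180.0))
```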

    1. §3. When Phaedra sees Hippolytus for the very first time in the narrative of Pausanias 2.32.3, as I noted in the posting for 2018.06.21, she is already falling in love with the youthful hero. In that posting, I was worrying about the translation ‘fall in love’ for erân/erâsthai in the “present” or imperfective aspect of the relevant verb used by Pausanias—and for erasthênai in its aorist aspect, as he uses it elsewhere. In the present posting, 2018.08.03, I still worry about that translation—and I continue to prefer the wording ‘conceive an erotic passion’ as a more accurate way to capture the moment—but now I worry more about the actual moment of erotic passion in Pausanias 2.32.3. As we will see, that moment is really a recurrence of moments. The storytelling of Pausanias points to an untold number of moments for experiencing the erotic passion—as expressed by the “present” or imperfective aspect of the verb, erân, and by the imperfect tense of the verb apo-blepein ‘gaze away, look off into the distance’. Further, there is a divine force that presides over all these moments, embodied in the sacralized role of Aphrodite as the kataskopiā, ‘the one who is looking down from on high’. §4. Here is the relevant passage in Pausanias, where our traveler speaks of the enclosure containing the space that is sacred to both Hippolytus and Phaedra as cult heroes: {2.32.3} In the other part of the enclosure [peribolos] is a racecourse [stadion] named after Hippolytus, and looming over it is a shrine [nāos] of Aphrodite [invoked by way of the epithet] kataskopiā [‘looking down from the heights’]. Here is the reason [for the epithet]: it was at this very spot, whenever Hippolytus was exercising-naked [gumnazesthai], that she, Phaedra, feeling-an-erotic-passion-for [erân] him, used-to-gaze-away [imperfect of apo-blepein] at him from above. A myrtle bush [mursinē] still grows here, and its leaves—as I wrote at an earlier point [= 1.22.2]—have holes pricked into them. 
Whenever Phaedra was-feeling-there-was-no-way-out [aporeîn] and could find no relief for her erotic-passion [erōs], she would take it out on the leaves of this myrtle bush, wantonly injuring them. {2.32.4} There is also a tomb [taphos] of Phaedra, not far from the tomb [mnēma] of Hippolytus, and it [= the mnēma] is heaped-up-as-a-tumulus [kekhōstai] near the myrtle bush [mursinē]. The statue [agalma] of Asklepios was made by Timotheus, but the people of Troizen say that it is not Asklepios, but a likeness [eikōn] of Hippolytus. Also, when I saw the House [oikiā] of Hippolytus, I knew that it was his abode. In front of it is situated what they call the Fountain [krēnē] of Hēraklēs, since Hēraklēs, as the people of Troizen say, discovered the water. §5. Before further comment on Pausanias 2.32.3, I note a detail in my translation of 2.32.4. I take it that Pausanias here is guardedly indicating that he saw the tomb of Hippolytus himself, situated next to the tomb of Phaedra. Our traveler is guarded because, as he said earlier at 2.32.1 about the hero cult of Hippolytus, the people of Troizen ‘do not show [apophainein] his tomb [taphos], though they know where it is’. In the wording of Pausanias, oikiā ‘house’ can refer to the ‘abode’ of a cult hero, that is, to his tomb. And he ostentatiously uses this word here at 2.32.4. A telling parallel is the wording at Pausanias 2.23.2, where he refers to the tomb of the cult hero Adrastos as an oikiā while he calls the nearby tomb of Amphiaraos simply a hieron ‘sanctuary’—and while, even more simply, he refers to the nearby tomb of Eriphyle, wife of Amphiaraos, as a mnēma, the literal meaning of which is ‘memorial marker’. This same word mnēma is used by Pausanias here at 2.32.4 with reference to the tomb of Hippolytus. Other examples where oikiā refers to tombs of cult heroes include 2.36.8, 5.14.7, 5.20.6, 9.11.1, 9.12.3, 9.16.5, 9.16.7. §6.
Returning to Pausanias 2.32.3, I conclude by arguing that the role of the goddess Aphrodite in the visualization of Phaedra’s recurrent erotic passion complements the role of the goddess Artemis in a visualization that we saw being brought to life in the poetry of Euripides. Whereas the role of Aphrodite is to be always available as the agent of erotic desire, the corresponding role of Artemis is to maintain her eternal unavailability as the object of that desire. Always unavailable, Artemis thus becomes the very picture of what is erotically desirable.

      In the narratives, Hippolytus is depicted as a paragon of chastity and self-discipline, qualities that define his heroism within the cultural context of ancient Greece. His rejection of Phaedra's advances is rooted in his dedication to the goddess Artemis and his adherence to a code of moral purity. This portrayal aligns with the ideal of the male hero as one who resists temptation and remains steadfast in his principles, even at the cost of his own life. Phaedra, on the other hand, embodies the complexities of female desire within a patriarchal society. In Euripides' Hippolytus, her passion for Hippolytus is portrayed as an uncontrollable force that ultimately leads to her destruction and the downfall of Hippolytus. Her role as a woman who transgresses the boundaries of acceptable female behavior highlights the dangers of unchecked female desire, reinforcing the cultural belief that women’s emotions must be controlled and contained. The tragedy of Phaedra is not just her unfulfilled love but also the societal constraints that define her actions as inherently destructive. Pausanias' reference to the myth of Phaedra and Hippolytus, as discussed in the source, offers a more subdued version of the narrative, focusing less on the psychological torment of Phaedra and more on the broader mythological context. This difference in emphasis reflects varying cultural attitudes towards gender and heroism. While Euripides explores the inner turmoil of his characters, highlighting the destructive power of female desire, Pausanias presents a more neutral account, possibly influenced by the historical and cultural lens through which he viewed the myth. When comparing the versions of the Phaedra and Hippolytus story in Euripides and Pausanias, it is evident that Euripides' version is more focused on the emotional and psychological aspects of the characters, particularly Phaedra. 
Euripides' portrayal of Phaedra’s inner conflict and her ultimate decision to falsely accuse Hippolytus after he rejects her advances emphasizes the tragic consequences of her unbridled passion. In contrast, Pausanias’ version is less concerned with the emotional depth of the characters and more with the events themselves, reflecting a different approach to the narrative that is more aligned with the recording of history and myth rather than the exploration of character psychology. This difference in focus can be attributed to the cultural and political contexts in which these works were created. Euripides, writing in a period of Athenian democracy, was likely influenced by the social and philosophical debates of his time, including those related to gender and the role of women in society. Pausanias, writing in a later period, may have been more influenced by the desire to preserve and record myths as part of the cultural heritage, leading to a more straightforward recounting of the story. Comparison Between Individual Works: When comparing the story of Phaedra and Hippolytus with other similar narratives, such as the story of Joseph, we see a recurring theme of male chastity and female desire. In both stories, the male hero is depicted as morally superior, resisting the advances of a woman who is driven by passion. This resistance enhances the hero’s status as a figure of virtue and integrity, while the woman’s desire is portrayed as dangerous and destructive. However, there are also significant differences: In the story of Joseph, his refusal leads to his imprisonment, but he is ultimately vindicated and rises to a position of power. In contrast, Hippolytus’ rejection of Phaedra leads to his death, underscoring the tragic nature of Greek heroism, where even the most virtuous are not immune to the whims of fate. Lastly, Euripides' language is rich in emotional intensity, capturing the turmoil and despair that drive the characters to their tragic ends. 
Phaedra’s monologues, in particular, offer insight into her conflicting emotions, torn between her illicit love for Hippolytus and her sense of duty and shame. I find Pausanias’ account more straightforward and less emotionally charged. His language is descriptive and factual, focusing on the sequence of events rather than the inner lives of the characters. This difference in linguistic style reflects the different purposes of the texts: Euripides’ play is a work of drama intended to evoke strong emotions and provoke thought, while Pausanias’ account is more concerned with documenting the myth for posterity. Viewed critically, however, Euripides’ emotionally charged language and complex character interactions can be seen as a reflection of the intellectual and cultural climate of classical Athens, where issues of gender, morality, and human nature were hotly debated. Pausanias’ more restrained language, on the other hand, reflects his role as a chronicler of myths, where the emphasis is on preservation rather than interpretation. CC BY Aarushi Attray (contact)

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this work, the authors provide a comprehensive description of transcriptional regulation in Pseudomonas syringae by investigating the binding characteristics of various transcription factors. They uncover the hierarchical network structure of the transcriptome by identifying top-, middle-, and bottom-level transcription factors that govern the flow of information in the network. Additionally, they assess the functional variability and conservation of transcription factors across different strains of P. syringae by studying DNA-binding characteristics. These findings notably expand our current knowledge of the P. syringae transcriptome.

      The findings associated with crosstalk between transcription factors and pathways, and the diversity of transcription factor functions across strains provide valuable insights into the transcriptional regulatory network of P. syringae. However, these results are at times underwhelming as their significance is unclear. This study would benefit from a discussion of the implications of transcription factor crosstalk on the functioning of the organism as a whole. Additionally, the implications of variability in transcription factor functions on the phenotype of the strains studied would further this analysis.

      Overall, this manuscript serves as a key resource for researchers studying the transcriptional regulatory network of P. syringae.

      Thank you for your positive comments.

      Reviewer #2 (Public Review):

      Summary:

      The phytopathogenic bacterium Pseudomonas syringae is comprised of many pathovars with different host plant species and has been used as a model organism to study bacterial pathogenesis in plants. Transcriptional regulation is key to plant infection and adaptation to host environments by this bacterium. However, researchers have focused on a limited number of transcription factors (TFs) that regulate virulence-related pathways. Thus, a comprehensive, systems-level understanding of regulatory interactions between transcription factors in P. syringae has not been achieved.

      This study by Sun et al performed ChIP-seq analysis of 170 out of 301 TFs in P. syringae pv. syringae 1448A and used this unique dataset to infer transcriptional regulatory networks in this bacterium. The network analyses revealed hierarchical interactions between TFs, various network motifs, and co-regulation of target genes by TF pairs, which collectively mediate information flow. As discussed, the structure and properties of the P. syringae transcriptional regulatory networks are somewhat different from those identified in humans, yeast, and E. coli, highlighting the significance of this study. Further, the authors made use of the P. syringae transcriptional regulatory networks to find TFs of unknown functions to be involved in virulence-related pathways. For some of these TFs, their target specificity and biological functions, such as motility and biofilm formation, were experimentally validated. Of particular interest is the finding that despite conservation of TFs between P. syringae pv. syringae 1448A, P. syringae pv. tomato DC3000, P. syringae pv. syringae B728a, and P. syringae pv. actinidiae C48, some of the conserved TFs show different repertoires of target genes in these four P. syringae strains.

      Thank you for your positive comments.

      Strengths:

      This study presents a systems-level analysis of transcriptional regulatory networks in relation to P. syringae virulence and metabolism, and highlights differences in transcriptional regulatory landscapes of conserved TFs between different P. syringae strains, and develops a user-friendly database for mining the ChIP-seq data generated in this study. These findings and resources will be valuable to researchers in the fields of systems biology, bacteriology, and plant-microbe interactions.

      Thank you for your positive comments.

      Weaknesses:

      No major weaknesses were found, but some of the results may need to be interpreted with caution. ChIP-seq was performed with bacterial strains overexpressing TFs. This may cause artificial binding of TFs to promoters which may not occur when TFs are expressed at physiological levels. Another caution is applied to the interpretation of the biological functions of TFs. The biological roles of the tested TFs are based on in vitro experiments. Thus, functional relevance of the tested TFs during plant infection and/or survival under natural environmental conditions remains to be demonstrated.

      Thank you for your comments; we agree with the reviewer. To rule out artificial binding of TFs, we performed EMSA to verify the analyzed targets. Our EMSA results confirmed the analyzed binding peaks.

      To verify the biological functions of TFs, we also performed in vivo motility and biofilm production assays (Figures 3b-d). To further examine the biological functions of TFs, we performed a plant infection assay of TF PSPPH2193 under natural environmental conditions (bean leaves). As shown in Figures S6c and g, both the motility and the virulence of the ∆PSPPH2193 strain were significantly reduced compared with the WT strain. These results showed that TF PSPPH2193 positively regulates the pathogenicity of P. syringae by modulating bacterial motility.

      Reviewer #3 (Public Review):

      Summary:

      This study aims to understand gene regulation of the plant bacterial pathogen Pseudomonas syringae. Although the function of some TFs has been characterized in this strain, a global picture of the gene regulatory network remains elusive. The authors conducted a large-scale ChIP-seq analysis, covering 170 out of 301 TFs of this strain, and revealed gene regulatory hierarchy with functional validation of some previously uncharacterized TFs.

      Thank you for your positive comments.

      Strengths:

      - This study provides one of the largest ChIP-seq datasets for a single bacterial strain, covering more than half of its TFs. This impressive resource enabled comprehensive systems-level analysis of the TF hierarchy.

      - This study identified novel gene regulation and function with validations through biochemical and genetic experiments.

      - The authors attempted broad analyses, including comparisons between different bacterial strains, providing further insights into the diversity and conservation of gene regulatory mechanisms.

      Thank you for your positive comments.

      Weaknesses:

      (1) Some conclusions are not backed by quantitative or statistical analyses, and they are sometimes overinterpreted.

      Thank you for your comments. We used a hypergeometric test in this analysis. Although only one gene was enriched in some pathways, the adjusted p-value was less than 0.05. We added the details in the revised manuscript.

      (2) Some figures and analyses are not well explained, and I was not able to understand them.

      Thank you for your comments, and we are sorry for the confusion. We defined ‘indirect interaction’ as ‘co-association’ and ‘cooperativity’ as ‘if the common target of two TFs is from a TF’. We added the definition of "indirect interaction" and "cooperativity" in the revised legend.

      For Figure S3a, the low co-association scores and large peak numbers of these top-level TFs indicated that top-level TFs tend to regulate target genes on their own rather than co-regulate them with other top-level TFs. PSPPH4700 is an example of this pattern. We revised the sentence to ‘For example, the top-level TF PSPPH4700 yielded over 1,700 peaks but cooperated with only 24 top-level TFs with low co-association scores about 0.05 (Supplementary Table 2b).’.

      We analyzed high co-association scores of 125 TFs in three levels and further determined the co-association patterns. To identify the tendency of co-association of all these 125 TFs, the co-association patterns were classified into 4 clusters. Bottom-level TFs tend to co-regulate target genes with other TFs. We revised the sentence in the revised manuscript.

      For Figure 2b, in C1, C2 and C4, many bottom-level TFs showed co-association patterns with other TFs, especially bottom-level TFs (shown in C4). To explore the regulatory pattern in C3, the peak locations in target genes of MexT were analyzed together with those of the TFs in C3. Seven top-level TFs (PSPPH1435, PSPPH1758, PSPPH2193, PSPPH2454, PSPPH4638, PSPPH4998 and PSPPH3411), three middle-level TFs (PSPPH1100, PSPPH5132 and PSPPH5144) and four bottom-level TFs (PSPPH0700, PSPPH2300, PSPPH2444 and PSPPH2580) were compared with MexT. MexT showed higher co-association scores (more than 60 scores) with more top-level TFs. Therefore, we concluded that MexT has closer co-association relationships with top-level TFs. We added the statement in the revised manuscript.

      For Figure 1a, the hierarchical network showed different numbers of TFs in the three levels (54 top-level TFs, 62 middle-level TFs and 147 bottom-level TFs), which indicated that more than half of the TFs (the bottom-level TFs) tend to be regulated by other TFs and then bind directly to target genes. This finding showed a downward regulatory direction of transcriptional regulation in P. syringae. We revised the statement in the revised manuscript.

      (3) The Method section lacks depth, especially in data analyses. It is strongly recommended that the authors share their analysis codes so that others can reproduce the analyses.

      Thank you for your comments. We defined the intergenic region before each TF sequence as the promoter region. As the pHM1 plasmid carries its own constitutive promoter (lacZ promoter), we amplified the TF-coding sequence and cloned it into the site following the promoter. TF protein expression was driven by the plasmid promoter. Psph 1448A was used for our main ChIP-seq. We added the details in the revised manuscript.

      For Figure S3, we performed GO analysis on genes that were co-bound by TF pairs. We added the details in the revised manuscript.

      We shared our analysis codes on the website (https://github.com/dengxinb2315/PS-PATRnet-code) in the Data Availability.

      Recommendations for the authors

      Reviewer #1 (Recommendations For The Authors):

      (1) The specific strain of Pseudomonas syringae used in the study outside of the evolutionary analysis should be specified in the abstract and main text.

      Thank you for your suggestion. We revised the statements in the abstract and main text to specify the strains.

      (2) The language used throughout the manuscript should be revised for clarity, conciseness, and readability.

      Thank you for your suggestion. The language throughout the manuscript has been revised by a scientific editor who is a native speaker of English.

      (2) Line 688: Replace "80C" with "-80C".

      Thank you for your correction. We revised ‘80℃’ to ‘-80℃’. Please see Line 713.

      (3) Line 172 - 173: The abbreviations TT, MM, BB, TM, TB, and MB need to be expanded in the main text before their use.

      Thank you for your suggestion. We added the abbreviations TT, MM, BB, TM, TB, and MB in the manuscript. Please see Lines 172-174.

      Reviewer #2 (Recommendations For The Authors):

      Major points

      (1) The name of the P. syringae strains used in each experiment/analysis should be explicitly stated (most experiments were carried out with P. syringae strain 1448A). This should also be applied to the introduction where many papers on P. syringae are cited without clear indication of strain names. I think this amendment is essential because target genes and thus biological functions of TFs could be different between P. syringae strains, as shown in the present study.

      Thank you for your suggestion. We revised the P. syringae strains in the citations throughout the manuscript.

      (2) How many TFs were analyzed throughout the study? Most sentences including line 22 in the abstract say 170, but I also found some say 270 (for example, line 106 and line 149). The legend of Figure 1 says 262. More detailed information is required regarding the datasets used for each analysis.

      Thank you for your suggestion. The number of TFs analyzed by ChIP-seq in this research is 170, the number of TFs analyzed by HT-SELEX in our previous research is 100. Hierarchical analysis integrated data from ChIP-seq and HT-SELEX which included 270 TFs. As 8 TFs did not show hierarchical characteristic, the legend of Figure 1 said 262 TFs. We added the data source in the revised manuscript. Please see Lines 104, 147, 160 and 1082.

      (3) Figure 1b: Please define "indirect interaction" and "cooperativity" in the legend as well as in the text. I only found the definition of "direct interaction".

      Sorry for the missing information. We defined ‘indirect interaction’ and ‘cooperativity’ as ‘co-association’ and ‘if the common target of two TFs is from a TF’, respectively. We added the definition of "indirect interaction" and "cooperativity" in the revised legend. Please see Lines 174-176, 1084-1086.

      (4) I found it very interesting that conserved TFs show different repertoires of target genes in different P. syringae strains. This suggests the rewiring of transcriptional regulatory networks in P. syringae strains, but the underlying mechanism is not explored in the current manuscript. It can be easily tested whether these conserved TFs bind to similar or different motifs by motif enrichment analysis. If they bind to similar motifs, it is possible that the promoter sequences of their target genes have diversified. Addressing or at least discussing these points would provide molecular insights into the diversification of the transcriptional regulatory networks in P. syringae. Similarly, functional enrichment analysis of target genes can be used to test whether the conserved TFs regulate different biological processes.

      Thank you for your suggestion. We added the motif analysis and functional enrichment analysis of target genes of TFs (PSPPH3122 and PSPPH4127) in different P. syringae strains. We found two different motifs (AGACN4GATCAA and CGGACGN3GATCA) in 1448A and DC3000 strains, respectively. We also performed the GO analysis and found the specific functions of PSPPH3122 in Psph 1448A compared with Pst DC3000 and Pss B728a strains, including recombinase activity and DNA recombination. For PSPPH4127, we found four different motifs in four P. syringae strains. GO analysis showed its relationship with recombinase activity in Psph 1448A strain, and RNA binding, structural constituent of ribosome, translation and ribosome in Pss B728a strain. These results indicated the highly functional diversity of TFs in P. syringae. We added these points in the Results part, and Figure S9-S10 in the revised manuscript. Please see Lines 497-509.
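      As an illustration only, the two motifs quoted above can be turned into searchable patterns. The sketch below assumes the common convention that `N4` stands for four arbitrary nucleotides; the helper names `motif_to_regex` and `find_motif` and the toy sequence are hypothetical and are not taken from the authors' code. It scans the forward strand only and reports non-overlapping matches.

```python
import re


def motif_to_regex(motif: str) -> str:
    """Expand spacers such as 'N4' into '[ACGT]{4}'; a bare 'N' matches one base."""
    expanded = re.sub(r"N(\d+)", lambda m: "[ACGT]{%s}" % m.group(1), motif)
    return expanded.replace("N", "[ACGT]")


def find_motif(motif: str, sequence: str) -> list[int]:
    """Return 0-based start positions of non-overlapping forward-strand matches."""
    pattern = re.compile(motif_to_regex(motif))
    return [m.start() for m in pattern.finditer(sequence.upper())]


# Hypothetical toy sequence, not real P. syringae data.
toy_promoter = "ttAGACACGTGATCAAtt"
print(motif_to_regex("AGACN4GATCAA"))
print(find_motif("AGACN4GATCAA", toy_promoter))
```

      A real comparison between strains would also need to scan the reverse complement and to score degenerate positions, which this sketch deliberately leaves out.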

      (5) Related to point 4, it would be quite useful if a list of orthologous genes of 1448A TFs in the other tested P. syringae strains were provided. Such information may also enhance the utility of the database developed in this study.

      Thank you for your suggestion. We added the list of orthologous genes of the 301 Psph 1448A TFs in the other tested P. syringae strains in Supplementary Table 5. Please see Line 467 and Supplementary Table 5.

      (6) Lines 243-246: It is unclear how these functional enrichment analyses were performed. Did you use target genes regulated by individual TFs or those coregulated by pairs of TFs? Please add more information for the sake of readers.

      Thank you for your suggestion. We performed the functional enrichment analyses by hypergeometric test (BH-adjusted p < 0.05) via using target genes regulated by individual TFs. We added the details in the Results part. Please see Lines 248-252, 270, 1194-1195, 1199-1200 and 1205-1206.
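      For readers unfamiliar with this kind of test, a minimal sketch of a hypergeometric enrichment test followed by Benjamini-Hochberg (BH) adjustment is given below. It is not the authors' pipeline: the function names and all gene counts are hypothetical, and only the Python standard library is used. It also shows why a category with a single enriched gene can still fall below 0.05 when the category itself is very small.

```python
from math import comb


def hypergeom_sf(k: int, M: int, n: int, N: int) -> float:
    """P(X >= k): at least k annotated genes among N targets drawn
    from a genome of M genes, n of which carry the annotation."""
    upper = min(n, N)
    total = comb(M, N)
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / total


def bh_adjust(pvals: list[float]) -> list[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical numbers: 5000 genes in the genome, a category with only
# 2 annotated genes, a TF with 50 targets, 1 of which is in the category.
p_single_gene = hypergeom_sf(k=1, M=5000, n=2, N=50)
print(p_single_gene)
print(bh_adjust([0.01, 0.04, 0.03]))
```

      Whether such a single-gene category survives adjustment depends on how many categories are tested, which is why the adjusted value rather than the raw one is the quantity to report.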

      Minor points

      (1) Lines 167-168: I may not understand correctly, but you might want to say "downward-pointing edges" instead of "upward-pointing edges".

      Thank you for your correction. We revised ‘upward-pointing edges’ to ‘downward-pointing edges’. Please see Line 166.

      (2) Line 174: "physical interactions" should be amended to "direct interactions".

      Thank you for your correction. We revised ‘physical interactions’ to ‘direct interactions’. Please see Line 177.

      (3) Line 224: Could you please explain why bacterial growth in plant tissues is considered an example of "multi-stability"?

      Thank you for your suggestion. We are sorry for the incorrect statement. We meant ‘plant intercellular spaces’ as an example of a scenario of ‘multi-stability’. We revised the sentence to ‘These auto-regulators are important and always act as repressors in scenarios of multi-stability, such as plant intercellular spaces’. Please see Lines 224-226.

      (4) Line 254-257: Here, the definition of "tether binding" is introduced, but it is not very clear to me. In my understanding, tethered binding is an indirect binding of a TF to a target gene through protein-protein interaction with other TF that directly binds to the promoter of the target gene.

      Thank you for your suggestion, and we agree with you. We referred to the paper published in 2012 (Wang et al., 2012) and revised the statement of ‘tether binding’ to ‘This finding suggested that these TFs indirectly regulated target genes through protein-protein interaction with other TFs that directly binds to the promoters of target genes, a phenomenon defined as tethered binding’. Please see Lines 259-262.

      (5) Lines 341-343: Figure 3b shows qRT-PCR of hopAE1, not hrpR.

      Thank you for your correction. We revised ‘hrpR’ to ‘hopAE1’. Please see Line 349.

      (6) Lines 500 and Figure 6b: It is hard to see edges from module 12 to others. So, it would be better to provide numeric information (number of TFs and target genes) in the text.

      Thank you for your suggestion. Module 12 includes 22 TFs and 318 target genes. We added the statement of numeric information about Module 12 in the revised manuscript. Please see Lines 536-537.

      (7) Line 519: Figure S4b is not the EMSA data for PSPPH3798. Should it be Figure S4e?

      Thank you for your correction. We revised to ‘Figure S4e’. Please see Line 545.

      (8) Line 522: Figure S6b is not relevant to the statement here.

      Thank you for your correction. We deleted the ‘Figure S6b’ here. Please see Line 547.

      (9) Line 593: prokaryotic transcriptional regulatory networks -> eukaryotic transcriptional regulatory networks?

      Thank you for your correction. We revised ‘prokaryotic transcriptional regulatory networks’ to ‘eukaryotic transcriptional regulatory networks’. Please see Line 618.

      (10) Figure S3 requires images of higher resolution. Especially, values for the color codes are not readable or very hard to see.

      Thank you for your suggestion. To make the images clearer, we enlarged the images, changed the color codes, and divided them into three figures. Please see the revised Figures S3-S5 and corresponding Figure legends at Lines 1191-1206.

      Reviewer #3 (Recommendations For The Authors):

      (1) Some conclusions are not backed by quantitative or statistical analyses, and they are sometimes overinterpreted.

      L221: "Taken together, the simplest and most effective submodule M1 and the coregulatory submodule M13 played crucial roles in the transcriptional regulation of TFs in P. syringae."

      The authors did not provide any evidence supporting the functional importance of any of these submodules. M13 is most enriched within the locked loop, but its size is much smaller than simple loops. What evidence supports the importance of this particular submodule?

      Thank you for your suggestion. In the eukaryote Saccharomyces cerevisiae and the prokaryote Escherichia coli, which have the best-characterized transcriptional regulation networks, the feed-forward loop (called M13 in this article) appears numerous times in the networks and performs different biological functions. M1 appeared more frequently than the other modules by an order of magnitude. We revised the sentence to ‘Taken together, the most numerous but simplest submodule M1 played a crucial role in the transcriptional regulation of TFs in P. syringae.’ Please see Lines 222-224.

      L223: "...we found 92 auto-regulators...These auto-regulators are important and always act as repressors in scenarios of multi-stability, such as in plant intercellular spaces where bacteria grow (Figure 1d)(Alon, 2007). These regulators are regarded as bistable switches that further influence the expression of downstream genes."

      Are these claims supported by any evidence?

      Thank you for your suggestion. We referred to the following articles:

      (1) Alon. Nature Reviews Genetics. 2007 (Alon, 2007).

      Repression by a transcription factor of the transcription of its target genes is considered negative regulation. Negative autoregulators account for half of the repressors in E. coli and occur in many eukaryotes. The repressors control the concentration of the target product by suppressing its expression, which accelerates the return of cells to the steady state.

      (2) Becskei et al. Nature. 2000; Rosenfeld et al. Journal of Molecular Biology. 2002 (Becskei & Serrano, 2000; Rosenfeld, Elowitz, & Alon, 2002).

      Fluorescence assays confirmed that the negative autoregulatory module (the negative autoregulator TetR) took less time to reach the log phase than the unregulated group and reduced cell-to-cell fluctuations in the steady-state level of the transcription factor. Some negative autoregulators are shown in these papers, such as LexA, CysB and SrlA-D.

      In our research, we also identified many autoregulators including CysB and LexA2 (annotated as LexA repressor). We revised the sentence to ‘In addition, we found 92 auto-regulators in our hierarchy network. These auto-regulators are important and always act as repressors in scenarios of multi-stability, such as plant intercellular spaces (Figure 1d) (Alon, 2007). For example, LexA and CysB as negative autoregulators were indicated to reduce cell-to-cell fluctuations in the steady-state level of the transcription factor (Becskei & Serrano, 2000; Rosenfeld et al. 2002).’. Please see Lines 224-229.

      L265: "This finding indicated that the bottom-level TFs, which were more easily regulated, tended to cooperate with downstream genes and other intra-level TFs."

      Could the authors provide more explanation to reach this conclusion from the data? Analyzing the number of highly co-accessing TFs does not sufficiently support this conclusion. The clustering of TFs (C1-C4) is incomplete, and each TF level (Top/Middle/Bottom) contains different numbers of TFs. Since the authors calculated all-by-all co-association scores for these 125 TFs, they can group these scores into 6 possible combinations (TT, TM, TB, MM, MB, BB) and show the distribution of co-association scores.

      Thank you for your suggestion. We indicated that the bottom-level TFs preferred to regulate target genes through cooperation with other TFs. To further support this claim, we analyzed the proportion of interactions involving bottom-level TFs among all TF-pair interactions and among direct interactions, based on the results in Figure 1B. Bottom-level TFs accounted for 43% and 49% of these interactions, respectively, whereas top-level and middle-level TFs accounted for only 20% and 28%, respectively. We revised the statement ‘Based on the analysis in Figure 1B, we found that the proportions of bottom-level TF interaction in all the TF pair interactions and direct interaction were 43% and 49%. These results indicated that the bottom-level TFs tended to regulate downstream genes through cooperating with other level TFs.’ in the revised manuscript. Please see Lines 269-272.

      As not every TF showed co-association with other TFs, we collected only the 125 TFs with co-association scores. To assign TFs to levels, we divided them into three levels according to hierarchy height: a hierarchy height from -1 to -0.3 represents the bottom level, from -0.3 to 0.3 the middle level, and from 0.3 to 1 the top level. Each level was equally divided by height scores. We suggest that the different numbers of TFs in the three levels reflect a characteristic of transcriptional regulation in P. syringae.
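      The level assignment described above can be sketched as a simple threshold rule. This is a hedged illustration, not the authors' code: the function name `hierarchy_level` is hypothetical, and because the response does not say which level the boundary values -0.3 and 0.3 belong to, the sketch assumes both boundaries fall in the middle level.

```python
def hierarchy_level(height: float) -> str:
    """Map a hierarchy height in [-1, 1] to 'bottom', 'middle' or 'top'.

    Assumption: the boundary values -0.3 and 0.3 are assigned to the
    middle level; the source text does not specify this.
    """
    if not -1.0 <= height <= 1.0:
        raise ValueError("hierarchy height must lie between -1 and 1")
    if height < -0.3:
        return "bottom"
    if height <= 0.3:
        return "middle"
    return "top"


# Hypothetical heights, for illustration only.
for h in (-0.8, 0.0, 0.9):
    print(h, hierarchy_level(h))
```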

      Thank you for your suggestion. As the co-association patterns were determined by co-association scores of the same TFs, we first grouped the co-association scores into 3 possible TF pairs (TT, MM, and BB, in Figures S3a, S4a and S5a). Our results indicated that higher co-association scores preferred to occur in bottom-level TFs. We revised the statement in the revised manuscript. Please see Lines 244-252.

      (2) Some figures and analyses are not well explained, and I was not able to understand them.

      Figure 1b: The terms "direct," "indirect," and "cooperativity" require further clarification as their definitions in the text (L169-183) are unclear to me. This ambiguity hampers the evaluation of the authors' discussion regarding TF-TF interactions (L561-584), an important theme of this study. The figure includes concepts discussed in later sections (e.g., cooperativity), making it difficult to understand. A diagram explaining these concepts would be highly helpful for readers to understand.

      Sorry for the missing information. We defined ‘indirect interaction’ as ‘co-association’, ‘cooperativity’ as ‘if the common target of two TFs is from a TF’. We added the definition of "indirect interaction" and "cooperativity" in the revised manuscript and legend. Please see Lines 174-176 and 1085-1087.

      L253: "Notably, we found that TFs at the top level, without cooperating TFs, exhibited a large number of binding peaks (Figure S3a)."

      I could not understand this sentence. Did the authors mean that top-level TFs with a large number of peaks showed a low level of co-association? If so, does this data suggest that these TFs do not tend to cooperate with other TFs? I was confused by the discussion in L253-L261.

      Thank you for your comment, and we agree with you. The low co-association scores and large peak numbers of these top-level TFs indicated that top-level TFs preferred to solely regulate target genes, but not to co-regulate with other top-level TFs.

      Thank you for your comment. From L253-256, PSPPH4700 was an example to show that top-level TFs with low co-association scores and large peak numbers tend to solely regulate target genes, but not to co-regulate with other top-level TFs. We revised the sentence to ‘For example, the top-level TF PSPPH4700 yielded over 1,700 peaks, but cooperated with only 24 top-level TFs with low co-association scores about 0.05 (Supplementary Table 2b).’.

      From L257-261, we analyzed high co-association scores of 125 TFs in three levels and further determined the co-association patterns. To identify the tendency of co-association of all these 125 TFs, the co-association patterns were classified into 4 clusters. Bottom-level TFs tend to co-regulate target genes with other TFs. We revised the sentence. Please see Lines 262-264, 265-266 and 269-272.

      L287: "The analysis of the peak locations of MexT demonstrated that MexT showed closer co-association relationships with top-level TFs (Figure 2b)."

      I could reach this conclusion by seeing Figure 2b. Additional explanation and/or data visualization would be appreciated.

      Thank you for your suggestion. In C1, C2 and C4, many bottom-level TFs showed co-association patterns with other TFs, especially bottom-level TFs (shown in C4). To explore the regulatory pattern in C3, the peak locations in target genes of MexT were analyzed together with those of the TFs in C3. Seven top-level TFs (PSPPH1435, PSPPH1758, PSPPH2193, PSPPH2454, PSPPH4638, PSPPH4998 and PSPPH3411), three middle-level TFs (PSPPH1100, PSPPH5132 and PSPPH5144) and four bottom-level TFs (PSPPH0700, PSPPH2300, PSPPH2444 and PSPPH2580) were compared with MexT. MexT showed higher co-association scores (more than 60 scores) with more top-level TFs. Therefore, we concluded that MexT has closer co-association relationships with top-level TFs. We added the statement in the revised manuscript. Please see Lines 291-296.

      Figure 6cd: What kind of enrichment analysis did the authors perform? Was any statistical test used? The figure only shows the number of genes, and sometimes the number is only 1 for a functional category. Can it be considered as significant enrichment?

      Thank you for your comment. We used a hypergeometric test in this analysis. Although only one gene was enriched in some pathways, the adjusted p-value was less than 0.05. We added the details in the revised manuscript. Please see Lines 533-534.

      L169: "The hierarchical network revealed a downward information flow, suggesting the prioritization of collaboration between different hierarchy levels."

      Can the authors please explain the logic behind this statement more in detail?

      Thank you for your comment. The hierarchical network showed different numbers of TFs in the three levels (54 top-level TFs, 62 middle-level TFs and 147 bottom-level TFs), which indicated that more than half of the TFs (the bottom-level TFs) tend to be regulated by other TFs and then bind directly to target genes. This finding showed a downward regulatory direction of transcriptional regulation in P. syringae. We revised the statement in the revised manuscript. Please see Lines 167-170.

      (3) The Method section lacks depth, especially on data analyses.

      How did the authors define promoter regions of each gene? How were operons treated in their analyses? Was P. syringae 1448A used for their main ChIP-seq?

      Thank you for your comment. We defined the intergenic region before each TF sequence as the promoter region.

      As pHM1 plasmid carries its own constitutive promoter (lacZ promoter), we amplified the TF-coding sequence and cloned into the site following the promoter. The TF protein expression was activated by the promoter of plasmid.

      P. syringae 1448A was used for our main ChIP-seq. We added the details in the revised manuscript. Please see Lines 705 and 727-730.

      Figure S3: I am not sure how the GO analyses were done. For example, in the case of the top-level TF PSPPH4700, did the authors perform GO analysis on genes that are co-bound by PSPPH4700 and any other top-level TFs?

      Thank you for your comment and we agree with you. We performed GO analysis on genes that were co-bound by TF pairs in the same level. We added the details in the revised manuscript. Please see Lines 248-252.

      The analysis presented in Figure 6a needs more explanation of the methodology employed by the authors.

      Thank you for your comment. We added more details for the analysis in Figure 6a. Please see Lines 514-522.

      It is strongly recommended that the authors share their analysis codes so that others can reproduce the analyses.

      Thank you for your comment. We shared our analysis codes on the website (https://github.com/dengxinb2315/PS-PATRnet-code) in the Data Availability. Please see Lines 800-801.

      (4) Other:

      Figure 3: I suggest putting additional panel labels to facilitate the interpretation of the figure.

      Thank you for your suggestion. We added detailed labels; please see the revised Figures 3 and 4.

      I spotted several potential errors:

      L106: 170 TFs?

      Thank you for your comment, and we are sorry for the missing details. For the hierarchical network, we integrated the DNA-binding data of 170 TFs in this study and 100 TFs in our previous SELEX research. We added the details in the revised manuscript. Please see Lines 104, 147 and 159-160.

      L592: P. syringae not E. coli?

      Thank you for your comment. Here we discussed the hierarchical characteristics in E. coli. We revised the statement in the revised manuscript. Please see Line 618.

      L593: eukaryotic not prokaryotic?

      Thank you for your correction. Here we discussed the feedforward loops in our study. We revised the statement in the revised manuscript. Please see Line 618.

      References

      Alon, U. (2007). Network motifs: theory and experimental approaches. Nature Reviews Genetics, 8(6), 450-461.

      Becskei, A., & Serrano, L. (2000). Engineering stability in gene networks by autoregulation. Nature, 405(6786), 590-593.

      Rosenfeld, N., Elowitz, M. B., & Alon, U. (2002). Negative autoregulation speeds the response times of transcription networks. Journal of Molecular Biology, 323(5), 785-793.

      Wang, J., Zhuang, J., Iyer, S., Lin, X., Whitfield, T. W., Greven, M. C., . . . Cheng, Y. (2012). Sequence features and chromatin structure around the genomic regions bound by 119 human transcription factors. Genome Research, 22(9), 1798-1812.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Corridori et al. introduce IGNITE, a computational framework to infer gene regulatory networks (GRNs) from scRNA-seq data leveraging the kinetic Ising model, which can be used to simulate synthetic gene expression and perform in-silico knockout experiments. Other similar frameworks exist, but none combine these three aspects. The authors have generated a scRNA-seq dataset of murine ESC differentiation, which they use to compare their method with others. Specifically, they show that they can infer known regulatory interactions, that they can generate data similar to the original, and that the method can potentially predict gene expression changes in transcription factor knock-out perturbations.

      Major comments:

      • Many of the authors' claims are backed by qualitative results and are not properly quantified. In Fig2, the authors qualitatively compare gene-gene correlations between the original data and their prediction. Instead of just visualizing them, they should compute and report the Spearman correlation between the original expression and the predicted one. The Fraction of Agreement is not a good metric for comparing knockout predictions, since it depends entirely on the class imbalance of signs: for example, if the selected genes are 75% positive and 25% negative, a naive predictor that only outputs positive predictions will still achieve a high score. Instead, the authors should quantify this with Spearman correlation or RMSE and compare across methods. In FigS4a-b the authors qualitatively claim that other methods could not predict the expected cell composition; they should quantify this and report the values across methods. When comparing against the ground truth network, the fraction of correctly inferred interactions is technically the same as precision but ignores recall. I suggest the authors compute precision, recall and a combined F1 score to compare the evaluated methods. The authors claim that the method is scalable to a larger number of genes, but no data is provided; they should show how their method compares to others in memory usage and running time when using different numbers of cells and genes.
      • The authors need to better describe which tests were performed when talking about significance, which thresholds and which corrections, if any, were employed.
      • To reduce the dimensionality of the scRNA-seq data, the authors apply t-SNE and then run UMAP on the t-SNE output to project the data into a lower-dimensional space. This is fundamentally wrong, since distances are not well preserved in t-SNE. Instead, the authors should first employ PCA and then UMAP. Additionally, the authors use UMAP distances in the Slingshot pseudotime calculation. As with t-SNE, UMAP distances have no real meaning and should only be used for visualization purposes. Instead, the authors should provide Slingshot with the obtained PCA embeddings.
      • Dictys (PMID: 37537351) is a known GRN inference method that can also simulate gene expression but is missing from the benchmark; the authors should add it to the method comparison.
      • The current manuscript is not reproducible since it is missing the method's code, the code to reproduce the figures and the generated scRNA-seq data.
      • Authors claim that the method is scalable to a larger number of genes but no data is provided to back this claim. They should show how their method compares to others when using a different number of cells and number of genes.
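The metric concerns raised in the first major comment can be sketched in a few lines. The example below is an illustration only, with hypothetical gene labels and made-up sign vectors. It shows that a predictor that always outputs "positive" reaches a Fraction of Agreement equal to the share of positive genes, and how precision, recall and F1 separate two aspects that a single "fraction of correctly inferred interactions" merges.

```python
def fraction_of_agreement(true_signs, predicted_signs):
    """Share of genes whose predicted sign of change matches the true sign."""
    matches = sum(t == p for t, p in zip(true_signs, predicted_signs))
    return matches / len(true_signs)


def precision_recall_f1(reference_edges, inferred_edges):
    """Precision, recall and F1 of inferred edges against a reference edge set."""
    reference_edges, inferred_edges = set(reference_edges), set(inferred_edges)
    true_positives = len(reference_edges & inferred_edges)
    precision = true_positives / len(inferred_edges) if inferred_edges else 0.0
    recall = true_positives / len(reference_edges) if reference_edges else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical knockout response: 75% of genes go up (+1), 25% go down (-1).
true_signs = [1] * 15 + [-1] * 5
naive_prediction = [1] * 20  # always predicts "up"
foa = fraction_of_agreement(true_signs, naive_prediction)

# Hypothetical networks as (regulator, target) pairs.
reference = {("geneA", "geneB"), ("geneA", "geneC"), ("geneB", "geneD"), ("geneC", "geneD")}
inferred = {("geneA", "geneB"), ("geneB", "geneD")}
precision, recall, f1 = precision_recall_f1(reference, inferred)

print(foa, precision, recall, round(f1, 3))
```

Here the naive predictor scores 0.75 without using any information, and the inferred network has perfect precision but recovers only half of the reference edges, which F1 penalizes.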

      Minor points:

      • In the introduction, authors mention multimodal GRN inference methods but do not provide any references.
      • In Table 1, CellOracle is annotated as not being able to do multiple KO which is wrong. Additionally, the authors mention that IGNITE uses no prior knowledge which is not really true since it requires pseudotime ordering. The authors should add a column to Table 1 whether methods require pseudotime.
      • It is unclear what the dashed arrow of Fig1b means. Moreover, plotting gene expression values on top of UMAPs can be misleading; instead, the authors should plot the gene expression distributions binned by pseudotime.
      • The authors report a p-value of 1.04×10^-171, which is below the detection limit (see PMID: 30921532). The authors should change it to an interval such as p < 2.2×10^-16.
      • To make CellOracle results easier to interpret and more comparable, authors should run it at the atlas level instead of at the cell type level, this way generating only one GRN. This can be achieved by assigning the same cluster label to all cells.
      • Experimental values in FigS3b seem to have been repeated and do not match the previous ones for IGNITE and SCODE.
      • It is unclear what the different circles mean in Fig5b.

      Significance

      This manuscript is an incremental and methodological work for specialized audiences. Its strengths are that the authors employ a kinetic Ising model for GRN inference and that they provide a single framework capable of inferring, simulating and perturbing gene expression. The main limitations are that the claims should be better quantified and that the code and data need to be made accessible.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      Corridori and colleagues propose IGNITE, a novel method to recover Gene Regulatory Networks (GRNs) from single-cell RNA-sequencing (scRNA-seq) data. Their method solves the inverse Ising problem by generating a cohort of candidate GRNs and optimising them to minimise the difference from the input expression matrix. The authors report that IGNITE is able to predict wild-type data and simulate both single and multiple gene knockouts. The authors benchmark this method on an in-house dataset of differentiating pluripotent stem cells (PSCs). They focus on a small set of genes known to be involved in PSC differentiation into formative cells. The authors benchmark IGNITE against state-of-the-art tools (SCODE, MaxEnt and CELLORACLE). They evaluate IGNITE's ability to predict wild-type gene expression by comparing its output with experimental data and with SCODE. They conclude that the tool has generative capacity comparable with SCODE. They also evaluate IGNITE's ability to recover known interactions relative to other tools, without finding that it significantly outperforms them.

      Major comments

      • Are the key conclusions convincing?

      Conclusions appear convincing, although model generalizability could be shown more thoroughly. For instance, analysing another publicly available dataset could help demonstrate the effects of hyperparameters on GRN predictions and their robustness across different experiments.

      • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      Claims are well supported by data.

      • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      I think the work would benefit from an additional benchmark on a different cellular system. This experiment would show how hyperparameters generalise across datasets and would give potential users insight into how to tune them.

      Also, how does the model scale with the number of genes? A benchmark on computation time and resources required to infer GRN of growing size would be valuable in the adoption of this tool.

      In addition, I think the GRN comparison benchmark presented in section (3.4) would benefit from a quantitative discussion. The authors show inferred GRNs in Figure 4 and S5. For instance, measuring matrix similarity (when appropriate) would help in understanding how the predicted GRNs compare. I understand that the authors attempt to do so by focusing on validated interactions and computing the fraction of correctly inferred interactions (FCI), but I think a measurement of the overall similarity (e.g. Pearson correlation) would add to this.
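One way to read the "overall similarity" suggestion is to flatten two inferred interaction matrices and correlate their entries. The sketch below assumes both matrices use the same gene order and that their entries are comparable in meaning; the matrices and function names are hypothetical and do not come from the manuscript.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / sqrt(var_x * var_y)


def grn_similarity(matrix_a, matrix_b):
    """Correlate two interaction matrices entry by entry (same gene order assumed)."""
    flat_a = [value for row in matrix_a for value in row]
    flat_b = [value for row in matrix_b for value in row]
    return pearson(flat_a, flat_b)


# Hypothetical 3-gene interaction matrices from two inference methods.
grn_method_1 = [[0.0, 0.8, -0.5], [0.3, 0.0, 0.1], [-0.6, 0.2, 0.0]]
grn_method_2 = [[0.0, 0.6, -0.7], [0.1, 0.0, 0.3], [-0.4, 0.1, 0.0]]

similarity = grn_similarity(grn_method_1, grn_method_2)
print(f"Pearson r between GRNs: {similarity:.3f}")
```

Pearson correlation ignores overall scale, which is convenient when methods report interaction strengths in different units, but it would hide a method that systematically shrinks all weights.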

      Another comment concerns the dependency between Correlation Matrices Distance (CMD) and FCI, shown in Figure 5. I understand that the IGNITE GRNs that maximise FCI are not the same as those that minimise CMD. However, it looks like the GRNs that maximise FCI carry more biological information. I wonder whether optimization for one or the other metric could be left to the end user as a tunable parameter.

      Authors should discuss why the expression of some genes does not follow the expected trends (Fig 1C vs Fig S1A). Out of the 24 genes they select for their analysis, at least four do not follow the expected trends: Sox2, according to literature, is a Naive gene, however, in Figure 1C its gene expression pattern is more similar to Formative late genes. Other genes with similar "unexpected" patterns are Zic3, Etv4 and Sall4.

      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      I think the suggested experiments are doable as long as the authors use publicly available data; the in-house dataset they generated for this study is enough to show applicability. For example, the datasets analysed in the SCODE paper (https://doi.org/10.1093/bioinformatics/btx194) could be used as a second benchmark. The point of applying the tool to another dataset is to show how it generalises across different biological systems, experiments and, potentially, sequencing technologies.

      • Are the data and the methods presented in such a way that they can be reproduced?

      The methods section is really clear. To enable reproducibility, the raw scRNA-seq data, the IGNITE source code and the code written to benchmark it should be released publicly in appropriate repositories (e.g. ENA, GitHub, Binder).

      • Are the experiments adequately replicated and statistical analysis adequate?

      Yes.

      Minor comments

      • Specific experimental issues that are easily addressable.

      Related to the Sox2 expression pattern is the binarization shown in Figure 2D. How is it possible that Sox2 is always marked as active? Could the authors clarify how these outlier behaviours emerge and propose mitigation strategies, if any?

      In section 5.11.2 it is unclear whether the x_i are in log scale or not. Since the model starts from binarized, log-transformed expression values, should the generated ones not be on the same scale as the input?

      • Are prior studies referenced appropriately?

      Yes, referencing is clear.

      • Are the text and figures clear and accurate?

      Yes, figures appear to be clear, readable and well documented both in captions and main text.

      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      Section 3.3 could be improved by better describing the experimental datasets. Only in the methods section is it clearly stated that experimental data for single KO experiments were retrieved from the literature.

      Check typesetting:

      • parenthesis missing in Eq. 1
      • Leftover $ in section 3.1
      • Parenthesis missing in Section 3.3
      • Misplaced comma in section 5.2.1

      Significance

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      The paper presents a method to infer GRNs from scRNA-seq data alone. Applications include GRN prediction and the simulation of perturbations. This paper represents a technical advance in the field, as it is the first application of the inverse Ising problem to GRN inference.

      • Place the work in the context of the existing literature (provide references, where appropriate).

      The paper itself presents the landscape of GRN inference tools using scRNA-seq data: SCODE, MaxEnt and CELLORACLE. More tools exist; for instance, SCENIC (https://doi.org/10.1038/nmeth.4463) mainly relies on co-expression matrices. Other tools exist but require additional data types, e.g. GRaNIE and GRaNPA (https://doi.org/10.15252/msb.202311627) leverage physical interaction data (ATAC-seq, ChIP-seq). Similarly, DeepFlyBrain uses deep neural networks to infer eGRNs in Drosophila (https://doi.org/10.1038/s41586-021-04262-z). The value of tools like IGNITE and its competitors is that they do not require additional data types, which, in turn, helps in controlling experimental costs.

      • State what audience might be interested in and influenced by the reported findings.

      The paper might be of interest to biologists interested in regulation of gene expression. The tool might turn out to be useful in planning experimental work by guiding the choice of perturbations to introduce in experimental systems.

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      I am a computational biologist.

      I do not have sufficient expertise to evaluate the mathematical details of the method.

    1. definition of this humanism. We are used to saying that language, in a sense, defines the human. Yet it seems to me, following urban planners but also the founding myths of our cultures, that what is distinctive of humankind is also its way of shaping and inhabiting space.

      Code creates linguistic architectures, structural units that go on to form even larger units (e.g. social networks)

    1. Author response:

      Reviewer #1 - Public Review

      This report describes work aiming to delineate multi-modal MRI correlates of psychopathology from a large cohort of children of 9-11 years from the ABCD cohort. While uni-modal characterisations have been made, the authors rightly argue that multi-modal approaches in imaging are vital to comprehensively and robustly capture modes of large-scale brain variation that may be associated with pathology. The primary analysis integrates structural and resting-state functional data, while post-hoc analyses on subsamples incorporate task and diffusion data. Five latent components (LCs) are identified, with the first three, corresponding to p-factor, internal/externalising, and neurodevelopmental Michelini Factors, described in detail. In addition, associations of these components with primary and secondary RSFC functional gradients were identified, and LCs were validated in a replication sample via assessment of correlations of loadings.

      1.1) This work is clearly novel and a comprehensive study of associations within this dataset. Multi-modal analyses are challenging to perform, but this work is methodologically rigorous, with careful implementation of discovery and replication assessments, and primary and exploratory analyses. The ABCD dataset is large, and behavioural and MRI protocols seem appropriate and extensive enough for this study. The study lays out comprehensive associations between MRI brain measures and behaviour that appear to recapitulate the established hierarchical structure of psychopathology.

      We thank Reviewer 1 for appreciating our methods and findings, and we address their suggestions below:

      1.2) The work does have weaknesses, some of them acknowledged. There is limited focus on the strength of observed associations. While the latent component loadings seem reliably reproducible in the behavioural domain, this is considerably less the case in the imaging modalities. A considerable proportion of statistical results focuses on spatial associations in loadings between modalities - it seems likely that these reflect intrinsic correlations between modalities, rather than associations specific to any latent component.

      We appreciate the Reviewer’s comment, and have minimized the reporting of correlations between the loadings from the different modalities in the revised Results (specifically subsections on LC1, LC2, and LC3). We now refer to Table S4 in each subsection for this information: “Spatial correlations between modality-specific loadings are reported in Supplementary file 1c.”

      For completeness, we report the intrinsic correlations between the different modalities in Supplementary file 1c (P.19):

      “Lastly, although the current work aimed to reduce intrinsic correlations between variables within a given modality through running a PCA before the PLS approach, intrinsic correlations between measures and modalities may potentially be a remaining factor influencing the PLS solution. We, thus, provided an additional overview of the intrinsic correlations between the different neuroimaging data modalities in the supporting results (Supplementary file 1c).”

      1.3) Assessment of associations with functional gradients is similarly a little hard to interpret. Thus, it is hard to judge the implications for our understanding of the neurophysiological basis of psychopathology and the ability of MRI to provide clinical tools for, say, stratification.

      We now provide additional context, including a rising body of theoretical and empirical work, that outlines the value of functional gradients and cortical hierarchies in the understanding of brain development and psychopathology. Please see P.26.

      “Initially demonstrated at the level of intrinsic functional connectivity (Margulies et al., 2016), follow up work confirmed a similar cortical patterning using microarchitectural in-vivo MRI indices related to cortical myelination (Burt et al., 2018; Huntenburg et al., 2017; Paquola et al., 2019), post-mortem cytoarchitecture (Goulas et al., 2018; Paquola et al., 2020, 2019), or post-mortem microarray gene expression (Burt et al., 2018). Spatiotemporal patterns in the formation and maturation of large-scale networks have been found to follow a similar sensory-to-association axis; moreover, there is the emerging view that this framework may offer key insights into brain plasticity and susceptibility to psychopathology (Sydnor et al., 2021). In particular, the increased vulnerability of transmodal association cortices in late childhood and early adolescence has been suggested to relate to prolonged maturation and potential for plastic reconfigurations of these systems (Paquola et al., 2019; Park et al., 2022b). Between mid-childhood and early adolescence, heteromodal association systems such as the default network become progressively more integrated among distant regions, while being more differentiated from spatially adjacent systems, paralleling the development of cognitive control, as well as increasingly abstract and logical thinking. [...] This suggests that neurodevelopmental difficulties might be related to alterations in various processes underpinned by sensory and association regions, as well as the macroscale balance and hierarchy of these systems, in line with previous findings in several neurodevelopmental conditions, including autism, schizophrenia, as well as epilepsy, showing a decreased differentiation between the two anchors of this gradient (Hong et al., 2019). In future work, it will be important to evaluate these tools for diagnostics and population stratification. 
      In particular, the compact and low-dimensional perspective of gradients may prove beneficial in terms of biomarker reliability as well as phenotypic prediction, as previously demonstrated using typically developing cohorts (Hong et al., 2020). On the other hand, it will be of interest to explore to what extent alterations in connectivity along sensory-to-transmodal hierarchies provide sufficient graduality to differentiate between specific psychopathologies, or whether they, as the current work suggests, mainly reflect risk for general psychopathology and atypical development.”

      1.4) The observation of a recapitulation of psychopathology hierarchy may be somewhat undermined by the relatively modest strength of the components in the imaging domain.

      We thank the Reviewer for this comment, and have now expressed this limitation in the revised Discussion, P.23.

      “The p factor, internalizing, externalizing, and neurodevelopmental dimensions were each associated with distinct morphological and intrinsic functional connectivity signatures, although these relationships varied in strength.”

      1.5) The task fMRI was assessed with a fairly basic functional connectivity approach, not using task timings to more specifically extract network responses.

      In the revised Discussion on P.24, we acknowledge that more in-depth analyses of task-based fMRI may have offered additional insights into state-dependent changes in functional architecture.

      “While the current work derived main imaging signatures from resting-state fMRI as well as grey matter morphometry, we could nevertheless demonstrate associations to white matter architecture (derived from diffusion MRI tractography) and recover similar dimensions when using task-based fMRI connectivity. Despite subtle variations in the strength of observed associations, the latter finding provided additional support that the different behavioral dimensions of psychopathology more generally relate to alterations in functional connectivity. Given that task-based fMRI data offers numerous avenues for analytical exploration, our findings may motivate follow-up work assessing associations to network- and gradient-based response strength and timing with respect to external stimuli across different functional states.”

      1.6) Overall, the authors achieve their aim to provide a detailed multimodal characterisation of MRI correlations of psychopathology. Code and data are available and well organised and should provide a valuable resource for researchers wanting to understand MRI-based neural correlates of psycho-pathology-related behavioural traits in this important age group. It is largely a descriptive study, with comparisons to previous uni-modal work, but without particularly strong testing of neuroscience hypotheses.

      We thank the Reviewer for recognizing the detail and rigor of our data-driven study and the extensive code and data documentation.

      Reviewer #2 - Public Review

      In "Multi-modal Neural Correlates of Childhood Psychopathology" Kebets et al. integrate multi-modal neuroimaging data using machine learning to delineate dissociable links to diverse dimensions of psychopathology in the ABCD sample. This paper had numerous strengths including a superb use of a large resource dataset, appropriate analyses, beautiful visualizations, clear writing, and highly interpretable results from a data-driven analysis. Overall, I think it would certainly be of interest to a general readership. That being said, I do have several comments for the authors to consider.

      We thank Dr Satterthwaite for the positive evaluation and helpful comments.

      2.1) Out-of-sample testing: while the permutation testing procedure for the PLS is entirely appropriate, without out-of-sample testing the reported effect sizes are likely inflated.

      As discussed in the editorial summary of essential revisions, we agree that out-of-sample prediction indeed provides stronger estimates of generalizability. We assess this by applying the PCA coefficients derived from the discovery cohort imaging data to the replication cohort imaging data. The resulting PCA scores and behavioral data were then z-scored using the mean and standard deviation of the replication cohort. The SVD weights derived from the discovery cohort were applied to the normalized replication cohort data to derive imaging and behavioral composite scores, which were used to recover the contribution of each imaging and behavioral variable to the LCs (i.e., loadings). Out-of-sample replicability of imaging (mean r=0.681, S.D.=0.131) and behavioral (mean r=0.948, S.D.=0.022) loadings was generally high across LCs 1-5. This analysis is reported in the revised manuscript (P.18).

      “Generalizability of reported findings was also assessed by directly applying PCA coefficients and latent component weights from the PLS analysis performed in the discovery cohort to the replication sample data. Out-of-sample prediction was overall high across LCs 1-5 for both imaging (mean r=0.681, S.D.=0.131) and behavioral (mean r=0.948, S.D.=0.022) loadings.”
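The out-of-sample procedure described in response 2.1 can be sketched for a single latent component as follows. This is a simplified illustration, not the authors' pipeline: the PCA projection step is omitted, the weights and data are hypothetical, and loadings are assumed here to be Pearson correlations between each variable and the composite score.

```python
from math import sqrt


def zscore_columns(rows):
    """Standardize each column with the mean and sample SD of these rows."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    sds = [sqrt(sum((v - m) ** 2 for v in col) / (n - 1)) for col, m in zip(columns, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def composite_scores(z_rows, weights):
    """Project standardized data onto weights estimated in another cohort."""
    return [sum(v * w for v, w in zip(row, weights)) for row in z_rows]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return covariance / sqrt(
        sum((a - mean_x) ** 2 for a in x) * sum((b - mean_y) ** 2 for b in y)
    )


def loadings(rows, scores):
    """Assumed definition: correlation of each variable with the composite score."""
    return [pearson(list(col), scores) for col in zip(*rows)]


# Hypothetical weights for one latent component, estimated in the discovery cohort.
discovery_weights = [0.6, 0.8]
# Hypothetical replication-cohort data (rows: participants, columns: variables).
replication_data = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0]]

z_replication = zscore_columns(replication_data)  # replication mean and SD
scores = composite_scores(z_replication, discovery_weights)
replication_loadings = loadings(replication_data, scores)
print(replication_loadings)
```

Replicability would then be summarized by correlating these replication loadings with the discovery loadings for each component, which is the kind of mean r value the response reports.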

      2.2) Site/family structure: it was unclear how site/family structure were handled as covariates.

      Only unrelated participants were included in discovery and replication samples (see P.6). The site variable was regressed out of the imaging and behavioral data prior to the PLS analysis using the residuals from a multiple linear model which also included age, age², sex, and ethnicity. This is now clarified on P.29:

      “Prior to the PLS analysis, effects of age, age², sex, site, and ethnicity were regressed out from the behavioral and imaging data using a multiple linear regression to ensure that the LCs would not be driven by possible confounders (Kebets et al., 2021, 2019; Xia et al., 2018). The imaging and behavioral residuals of this procedure were input to the PLS analysis.”
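Regressing confounds out of a measure amounts to fitting a multiple linear regression and keeping the residuals. The sketch below shows this for one measure using ordinary least squares via the normal equations. All data are hypothetical, age is centered before squaring for numerical stability (an assumption, not stated in the response), and site and ethnicity are left out; in practice they would enter as dummy-coded columns.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - known) / aug[r][r]
    return x


def residualize(y, confounds):
    """Return residuals of y after an OLS fit on an intercept plus confounds."""
    design = [[1.0] + list(row) for row in confounds]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(design, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    return [v - sum(b * x for b, x in zip(beta, r)) for r, v in zip(design, y)]


# Hypothetical participants: centered age (years) and sex coded 0/1.
age_centered = [-1.0, -0.5, 0.0, 0.5, 1.0, -0.8, 0.8, 0.1]
sex = [0, 1, 0, 1, 1, 0, 0, 1]
confounds = [[a, a * a, s] for a, s in zip(age_centered, sex)]
measure = [2.1, 2.4, 2.2, 2.9, 3.0, 2.0, 2.7, 2.6]  # hypothetical imaging metric

residuals = residualize(measure, confounds)
print([round(r, 3) for r in residuals])
```

The residuals are uncorrelated with every confound column by construction, so any component later extracted from them cannot be driven linearly by those variables.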

      2.3) Anatomical features: I was a bit surprised to see volume, surface area, and thickness all evaluated - and that there were several comments on the correspondence between the SA and volume in the results section. Given that cortical volume is simply a product of SA and CT (and mainly driven by SA), this result may be pre-required.

      As suggested, we reduced the reporting of correlations between the loadings from the different modalities in the revised Results (specifically subsections on LC1, LC2, and LC3). Instead, we now refer to Table S4 in each subsection for this information: “Spatial correlations between modality-specific loadings are reported in Supplementary file 1c.”

      We also reran the PLS analysis while only including thickness and surface area as our structural metrics, to account for potential redundancy of these measures with volume. This analysis and associated findings are reported on P.36 and P.19:

      “As cortical volume is a result of both thickness and surface area, we repeated our main PLS analysis while excluding cortical volume from our imaging metrics and report the consistency of these findings with our main model.”

      “Third, to account for redundancy within structural imaging metrics included in our main PLS model (i.e., cortical volume is a result of both thickness and surface area), we also repeated our main analysis while excluding cortical volume from our imaging metrics. Findings were very similar to those in our main analysis, with an average absolute correlation of 0.898±0.114 across imaging composite scores of LCs 1-5.”

      2.4) Ethnicity: the rationale for regressing ethnicity from the data was unclear and may conflict with current best practices.

      We thank the Reviewer for this comment. In light of recent discussions on including this covariate in large datasets such as ABCD (e.g., Saragosa-Harris et al., 2022), we elaborate on our rationale for including this variable in our model in the revised manuscript on P.30:

      “Of note, the inclusion of ethnicity as a covariate in imaging studies has been recently called into question. In the present study, we included this variable in our main model as a proxy for social inequalities relating to race and ethnicity alongside biological factors (age, sex) with documented effects on brain organization and neurodevelopmental symptomatology queried in the CBCL.”

      We also assess the replicability of our analyses when removing race and ethnicity covariates prior to computing the PLS analysis and correlating imaging and behavioral composite scores across both models. We report resulting correlations in the revised manuscript (P.37, 19, and 27):

      “We also assessed the replicability of our findings when removing race and ethnicity covariates prior to computing the PLS analysis and correlating imaging and behavioral composite scores across both models.”

      “Moreover, repeating the PLS analysis while excluding this variable as a model covariate yielded overall similar imaging and behavioral composite scores across LCs to our original analysis. Across LCs 1-5, the average absolute correlations reached r=0.636±0.248 for imaging composite scores, and r=0.715±0.269 for behavioral composite scores. Removing these covariates seemed to exert stronger effects on LC3 and LC4 for both imaging and behavior, as lower correlations across models were specifically observed for these components.”

      “Although we could consider some socio-demographic variables and proxies of social inequalities relating to race and ethnicity as covariates in our main model, the relationship of these social factors to structural and functional brain phenotypes remains to be established with more targeted analyses.”

      2.5) Data quality: the authors did an admirable job in controlling for data quality in the analyses of functional connectivity data. However, it is unclear if a comparable measure of data quality was used for the T1/dMRI analyses. This likely will result in inflated effect sizes in some cases; it has the potential to reduce sensitivity to real effects.

      We agree that data quality was not accounted for in our analysis of T1w- and diffusion-derived metrics. We now accounted for T1w image quality by adding manual quality control ratings to the regressors applied to all structural imaging metrics prior to performing the PLS analysis, and reported the consistency of this new model with original findings. See P.36, P.19:

      “We also considered manual quality control ratings as a measure of T1w scan quality. This metric was included as a covariate in a multiple linear regression model accounting for potential confounds in the structural imaging data, in addition to age, age², sex, site, ethnicity, ICV, and total surface area. Downstream PLS results were then benchmarked against those obtained from our main model.”

      “Considering scan quality in T1w-derived metrics (from manual quality control ratings) yielded similar results to our main analysis, with an average correlation of 0.986±0.014 across imaging composite scores.”

      As for diffusion imaging, we also regressed out effects of head motion in addition to age, age², sex, site, and ethnicity from FA and MD measures and reported the consistency with our original results (P.36, P.19):

      “We tested another model which additionally included head motion parameters as regressors in our analyses of FA and MD measures, and assessed the consistency of findings from both models.”

      “Additionally considering head motion parameters from diffusion imaging metrics in our model yielded consistent results to those in our main analyses (mean r=0.891, S.D.=0.103; r=0.733-0.998).”

      Reviewer #3 - Public Review

      In this study, the authors utilized the Adolescent Brain Cognitive Development dataset to investigate the relationship between structural and functional brain network patterns and dimensions of psychopathology. They identified multiple components, including a general psychopathology (p) factor that exhibited a strong association with multimodal imaging features. The connectivity signatures associated with the p factor and neurodevelopmental dimensions aligned with the sensory-to-transmodal axis of cortical organization, which is linked to complex cognition and psychopathology risk. The findings were consistent across two separate subsamples and remained robust when accounting for variations in analytical parameters, thus contributing to a better understanding of the biological mechanisms underlying psychopathology dimensions and offering potential brain-based vulnerability markers.

      3.1) An intriguing aspect of this study is the integration of multiple neuroimaging modalities, combining structural and functional measures, to comprehensively assess the covariance with various symptom combinations. This approach provides a multidimensional understanding of the risk patterns associated with mental illness development.

      We thank the Reviewer for acknowledging the multimodal approach, and for the constructive suggestions.

      3.2) The paper delves deeper into established behavioral latent variables such as the p factor, internalizing, externalizing, and neurodevelopmental dimensions, revealing their distinct associations with morphological and intrinsic functional connectivity signatures. This sheds light on the neurobiological underpinnings of these dimensions.

      We are happy to hear the Reviewer appreciates the gain in understanding neural underpinnings of dimensions of psychopathology resulting from the current work.

      3.3) The robustness of the findings is a notable strength, as they were validated in a separate replication sample and remained consistent even when accounting for different parameter variations in the analysis methodology. This reinforces the generalizability and reliability of the results.

      We appreciate that the Reviewer found our robustness and generalizability assessment convincing.

      3.4) Based on their findings, the authors suggest that the observed variations in resting-state functional connectivity may indicate shared neurobiological substrates specific to certain symptoms. However, it should be noted that differences in resting-state connectivity between groups can stem from various factors, as highlighted in the existing literature. For instance, discrepancies in the interpretation of instructions during the resting state scan can influence the results. Hence, while their findings may indicate biological distinctions, they could also reflect differences in behavior.

      For the ABCD dataset, resting-state fMRI scans were based on eyes open and passive viewing of a crosshair, and are thus homogenized. We acknowledge, however, that there may still be state-to-state fluctuations contributing to the findings, and this is now discussed in the revised Discussion, on P.28. Note, however, that prior literature has generally also suggested rather modest impacts of cognitive and daily variation on resting-state functional networks, compared to much more dominating inter-individual and inter-group factors.

      “Finally, while prior research has shown that resting-state fMRI networks may be affected by differences in instructions and study paradigm (e.g., with respect to eyes open vs closed) (Agcaoglu et al., 2019), the resting-state fMRI paradigm is homogenized in the ABCD study to be passive viewing of a centrally presented fixation cross. It is nevertheless possible that there were slight variations in compliance and instructions that contributed to differences in associated functional architecture. Notably, however, there is a mounting literature based on high-definition fMRI acquisitions suggesting that functional networks are mainly dominated by common organizational principles and stable individual features, with substantially more modest contributions from task-state variability (Gratton et al. 2018). These findings, thus, suggest that resting-state fMRI markers can serve as powerful phenotypes of psychiatric conditions, and potential biomarkers (Abraham et al., 2017; Gratton et al., 2020; Parkes et al., 2020).”

      3.5) The authors conducted several analyses to investigate the relationship between imaging loadings associated with latent components and the principal functional gradient. They found several associations between principal gradient scores and both within- and between-network resting-state functional connectivity (RSFC) loadings. Assessing the analysis presented here proves challenging due to the nature of relating loadings, which are partly based on the RSFC, to gradients derived from RSFC. Consequently, a certain level of correlation between these two variables would be expected, making it difficult to determine the significance of the authors' findings. It would be more intriguing if a direct correlation between the composite scores reflecting behavior and the gradients were to yield statistically significant results.

      We thank the Reviewer for the comment, and agree that investigating gradient-behavior relationships could offer additional insights into the neural basis of psychiatric symptomatology. However, the current analysis pipeline precludes this direct comparison which is performed on a region-by-region basis across the span of the cortical gradient. Indeed, the behavioral loadings are provided for each CBCL item, and not cortical regions.

      The Reviewer also raises concerns of potential circularity in our analysis, as we compared imaging loadings, which are partially based on RSFC, and gradient values generated from the same RSFC data. In response to this comment, we cross-validated our findings using an RSFC gradient derived from an independent dataset (HCP), showing highly consistent findings to those presented in the manuscript. This correlation is now reported in the Results section P.15.

      “A similar pattern of findings was observed when cross-validating between- and within-network RSFC loadings to a RSFC gradient derived from an independent dataset (HCP), with strongest correlations seen for between-network RSFC loadings for LC1 and LC3 (LC1: r=0.50, p_spin<0.001; LC3: r=0.37, p_spin<0.001).”

      We furthermore note similar correlations between imaging loadings and T1w/T2w ratio in the same participants, a proxy of intracortical microstructure and hierarchy (Glasser et al., 2011). These findings are now detailed in the revised Results, P.15-16:

      “Of note, we obtain similar correlations when using T1w/T2w ratio in the same participants, a proxy of intracortical microstructure and hierarchy (Glasser et al., 2011). Specifically, we observed the strongest association between this microstructural marker of the cortical hierarchy and between-network RSFC loadings related to LC1 (r=-0.43, p_spin<0.001).”

      3.6) Lastly, regarding the interpretation of the first identified latent component, I have some reservations. Upon examining the loadings, it appears that LC1 primarily reflects impulse control issues rather than representing a comprehensive p-factor. Furthermore, it is worth noting that within the field, there is an ongoing debate concerning the interpretation and utilization of the p-factor. An insightful publication on this topic is "The p factor is the sum of its parts, for now" (Fried et al, 2021), which explains that the p-factor emerges as a result of a positive manifold, but it does not necessarily provide insights into the underlying mechanisms that generated the data.

      We thank the Reviewer for this comment, and added greater nuance into the discussion of the association to the p factor. We furthermore discuss some of the ongoing debate about the use of the p factor, and cite the recommended publication on P.27.

      “Other factors have also been suggested to impact the development of psychopathology, such as executive functioning deficits, earlier pubertal timing, negative life events (Brieant et al., 2021), maternal depression, or psychological factors (e.g., low effortful control, high neuroticism, negative affectivity). Inclusion of such data could also help to add further mechanistic insights into the rather synoptic proxy measure of the p factor itself (Fried et al., 2021), and to potentially assess shared and unique effects of the p factor vis-à-vis highly correlated measures of impulse control.”

    2. Reviewer #1 (Public Review):

      This report describes work aiming to delineate multi-modal MRI correlates of psychopathology from a large cohort of children of 9-11 years from the ABCD cohort. While uni-modal characterisations have been made, the authors rightly argue that multi-modal approaches in imaging are vital to comprehensively and robustly capture modes of large-scale brain variation that may be associated with pathology. The primary analysis integrates structural and resting-state functional data, while post-hoc analyses on subsamples incorporate task and diffusion data. Five latent components (LCs) are identified, with the first three, corresponding to p-factor, internal/externalising, and neurodevelopmental Michelini Factors, described in detail. In addition, associations of these components with primary and secondary RSFC functional gradients were identified, and LCs were validated in a replication sample via assessment of correlations of loadings.

      This work is clearly novel and a comprehensive study of associations within this dataset. Multi-modal analyses are challenging to perform, but this work is methodologically rigorous, with careful implementation of discovery and replication assessments, and primary and exploratory analyses. The ABCD dataset is large, and behavioural and MRI protocols seem appropriate and extensive enough for this study. The study lays out comprehensive associations between MRI brain measures and behaviour that appear to recapitulate the established hierarchical structure of psychopathology.

      The work does have weaknesses, some of them acknowledged. There is limited focus on the strength of observed associations. While the latent component loadings seem reliably reproducible in the behavioural domain, this is considerably less the case in the imaging modalities. A considerable proportion of statistical results focuses on spatial associations in loadings between modalities - it seems likely that these reflect intrinsic correlations between modalities, rather than associations specific to any latent component. Assessment of associations with functional gradients is similarly a little hard to interpret. Thus, it is hard to judge the implications for our understanding of the neurophysiological basis of psychopathology and the ability of MRI to provide clinical tools for, say, stratification. The observation of a recapitulation of psychopathology hierarchy may be somewhat undermined by the relatively modest strength of the components in the imaging domain. The task fMRI was assessed with a fairly basic functional connectivity approach, not using task timings to more specifically extract network responses.

      Overall, the authors achieve their aim to provide a detailed multimodal characterisation of MRI correlations of psychopathology. Code and data are available and well organised and should provide a valuable resource for researchers wanting to understand MRI-based neural correlates of psychopathology-related behavioural traits in this important age group. It is largely a descriptive study, with comparisons to previous uni-modal work, but without particularly strong testing of neuroscience hypotheses.

    1. Welcome back and welcome to this CloudTrail demo where we're going to set up an organizational trail and configure it to log data for all accounts in our organization to S3 and CloudWatch logs.

      The first step is that you'll need to be logged into the IAM admin user of the management account of the organization. As a reminder, this is the general account. To set up an organizational trail, you always need to be logged into the management account. To set up individual trails, you can do that locally inside each of your accounts, but it's always more efficient to use an organizational trail.

      Now, before we start the demonstration, I want to talk briefly about CloudTrail pricing. I'll make sure this link is in the lesson description, but essentially there is a fairly simple pricing structure to CloudTrail that you need to be aware of.

      The 90-day history that's enabled by default in every AWS account is free. You don't get charged for that; it comes free by default with every AWS account. Next, you have the ability to get one copy of management events free in every region in each AWS account. This means creating one trail that's configured for management events in each region in each AWS account, and that comes for free. If you create any additional trails, so you get any additional copies of management events, they are charged at two dollars per 100,000 events. That won't apply to us in this demonstration, but you need to be aware of that if you're using this in production.

      Logging data events comes at a charge regardless of the number, so we're not going to enable data events for this demo lesson. But if you do enable it, then that comes at a charge of 10 cents per 100,000 events, irrespective of how many trails you have. This charge applies from the first time you're logging any data events.

      What we'll be doing in this demo lesson is setting up an organizational trail which will create a trail in every region in every account inside the organization. But because we get one for free in every region in every account, we won't incur any charges for the CloudTrail side of things. We will be charged for any S3 storage that we use. However, S3 also comes with a free tier allocation for storage, which I don't expect us to breach.
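The pricing rules above come down to simple arithmetic. Here is a minimal sketch, assuming the rates quoted in this lesson (two dollars per 100,000 events for additional copies of management events, 10 cents per 100,000 data events). The function name is made up for illustration, and current AWS pricing may differ from these figures.

```python
MANAGEMENT_RATE_PER_100K = 2.00  # USD, applies to additional copies only (rate quoted in the lesson)
DATA_RATE_PER_100K = 0.10        # USD, charged from the first data event (rate quoted in the lesson)

def estimate_cloudtrail_cost(management_events, management_copies, data_events):
    """Estimate CloudTrail event charges in USD for one region of one account."""
    # The first copy of management events is free; only extra copies are billed.
    billable_copies = max(management_copies - 1, 0)
    management_cost = billable_copies * management_events / 100_000 * MANAGEMENT_RATE_PER_100K
    data_cost = data_events / 100_000 * DATA_RATE_PER_100K
    return round(management_cost + data_cost, 2)

# This demo: one trail logging management events only.
demo_cost = estimate_cloudtrail_cost(500_000, 1, 0)
# Hypothetical production account: two trails plus 1,000,000 data events.
production_cost = estimate_cloudtrail_cost(500_000, 2, 1_000_000)
print(demo_cost, production_cost)
```

S3 storage and request charges are separate and are not modelled here.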

      With that being said, let's get started and implement this solution. To do that, we need to be logged in to the console UI again in the management account of the organization. Then we need to move to the CloudTrail console. If you've been here recently, it will be in the Recently Visited Services. If not, just type CloudTrail in the Find Services box and then open the CloudTrail console.

      Once you're at the console, you might see a screen like this. If you do, then you can just click on the hamburger menu on the left and then go ahead and click on trails. Now, depending on when you're doing this demo, if you see any warnings about a new or old console version, make sure that you select the new version so your console looks like what's on screen now.

      Once you're here, we need to create a trail, so go ahead and click on create trail. To create a trail, you're going to be asked for a few important pieces of information, the first of which is the trail name. For trail name, we're going to use "animals4life.org," so just go ahead and enter that. By default, with this new UI version, when you create a trail, it's going to create it in all AWS regions in your account. If you're logged into the management account of the organization, as we are, you also have the ability to enable it for all regions in all accounts of your organization. We're going to do that because this allows us to have one single logging location for all CloudTrail logs in all regions in all of our accounts, so go ahead and check this box.

      By default, CloudTrail stores all of its logs in an S3 bucket. When you're creating a trail, you have the ability to either create a new S3 bucket to use or you can use an existing bucket. We're going to go ahead and create a brand new bucket for this trail. Bucket names within S3 need to be globally unique, so it needs to be a unique name across all regions and across all AWS accounts. We're going to call this bucket starting with "cloudtrail" (all lowercase, because S3 bucket names can't contain uppercase letters), then a hyphen, then "animals-for-life," another hyphen, and then you'll need to put a random number. You’ll need to pick something different from me and different from every other student doing this demo. If you get an error about the bucket name being in use, you just need to change this random number.
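The naming approach described here (a fixed prefix plus a random number) can be sketched as follows. The helper names are hypothetical, and the pattern checks only the core S3 naming rules: 3 to 63 characters, lowercase letters, digits, hyphens and dots, starting and ending with a letter or digit. It can't tell you whether a name is already taken; only S3 can do that when you try to create the bucket.

```python
import random
import re

# Core S3 bucket naming rules; additional rules (for example, no IP-address
# style names) are not checked in this sketch.
BUCKET_NAME_PATTERN = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")

def make_bucket_name(prefix="cloudtrail-animals-for-life"):
    """Append a random six-digit number to improve the odds of a unique name."""
    return f"{prefix}-{random.randint(100000, 999999)}"

def is_valid_bucket_name(name):
    """Return True if the name satisfies the core naming rules above."""
    return bool(BUCKET_NAME_PATTERN.match(name))

name = make_bucket_name()
print(name, is_valid_bucket_name(name))
# Uppercase letters are not allowed, so this name is rejected.
print(is_valid_bucket_name("CloudTrail-animals-for-life-1337"))
```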

      You're also able to specify if you want the log files stored in the S3 bucket to be encrypted. This is done using SSE-KMS encryption. This is something that we'll be covering elsewhere in the course, and for production usage, you would definitely want to use it. For this demonstration, to keep things simple, we're not going to encrypt the log files, so go ahead and untick this box.

      Under additional options, you're able to select log file validation, which adds an extra layer of security. This means that if any of the log files are tampered with, you have the ability to determine that. This is a really useful feature if you're performing any account-level audits, and in most production situations, I do enable this. You can also elect to have SNS notification delivery, so every time log files are delivered into this S3 bucket, you can have a notification. This is useful for production usage or if you need to integrate this with any non-AWS systems, but for this demonstration, we'll leave this one unchecked.

      You also have the ability, as well as storing these log files into S3, to store them in CloudWatch logs. This gives you extra functionality because it allows you to perform searches, look at the logs from a historical context inside the CloudWatch logs user interface, as well as define event-driven processes. You can configure CloudWatch logs to scan these CloudTrail logs and, in the event that any particular piece of text occurs in the logs (e.g., any API call, any actions by a user), you can generate an event that can invoke, for example, a Lambda function or spawn some other event-driven processing. Don't worry if you don't understand exactly what this means at this point; I'll be talking about all of this functionality in detail elsewhere in the course. For this demonstration, we are going to enable CloudTrail to put these logs into CloudWatch logs as well, so check this box. You can choose a log group name within CloudWatch logs for these CloudTrail logs. If you want to customize this, you can, but we're going to leave it as the default.

      As with everything inside AWS, if a service is acting on our behalf, we need to give it the permissions to interact with other AWS services, and CloudTrail is no exception. We need to give CloudTrail the ability to interact with CloudWatch logs, and we do that using an IAM role. Don’t worry, we’ll be talking about IAM roles in detail elsewhere in the course. For this demonstration, just go ahead and select "new" because we're going to create a new IAM role that will give CloudTrail the ability to enter data into CloudWatch logs.

      Now we need to provide a role name, so go ahead and enter "CloudTrail_role_for_CloudWatch_logs" and then an underscore and then "animals_for_life." The name doesn’t really matter, but in production settings, you'll want to make sure that you're able to determine what these roles are for, so we’ll use a standard naming format. If you expand the policy document, you'll be able to see the exact policy document or IAM policy document that will be used to give this role the permissions to interact with CloudWatch logs. Don’t worry if you don’t fully understand policy documents at this point; we’ll be using them throughout the course, and over time you'll become much more comfortable with exactly how they're used. At a high level, this policy document will be attached to this role, and this is what will give CloudTrail the ability to interact with CloudWatch logs.
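To give a feel for what sits behind a role like this, here is a rough sketch of the two policy documents involved, built as Python dictionaries and printed as JSON. The account ID, region and log group name are placeholders, and the exact document the console generates may differ. The two actions shown are the ones CloudTrail needs to write into a log group.

```python
import json

# Placeholder values, not taken from the demo account.
ACCOUNT_ID = "111122223333"
REGION = "us-east-1"
LOG_GROUP = "example-cloudtrail-log-group"

# Trust policy: lets the CloudTrail service assume the role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "cloudtrail.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

# Permissions policy: lets the role create log streams and write log events.
permissions_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["logs:CreateLogStream", "logs:PutLogEvents"],
        "Resource": f"arn:aws:logs:{REGION}:{ACCOUNT_ID}:log-group:{LOG_GROUP}:log-stream:*",
    }],
}

print(json.dumps(trust_policy, indent=2))
print(json.dumps(permissions_policy, indent=2))
```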

      At this point, just scroll down; that's everything that we need to do, so go ahead and click on "next." Now, you'll need to select what type of events you want this trail to log. You’ve got three different choices. The default is to log only management events, so this logs any events against the account or AWS resources (e.g., starting or stopping an EC2 instance, creating or deleting an EBS volume). You've also got data events, which give you the ability to log any actions against things inside resources. Currently, CloudTrail supports a wide range of services for data event logging. For this demonstration, we won't be setting this up with data events initially because I’ll be covering this elsewhere in the course. So, go back to the top and uncheck data events.

      You also have the ability to log insight events, which can identify any unusual activity, errors, or user behavior on your account. This is especially useful from a security perspective. For this demonstration, we won’t be logging any insight events; we’re just going to log management events. For management events, you can further filter down to read or write or both and optionally exclude KMS or RDS data API events. For this demo lesson, we’re just going to leave it as default, so make sure that read and write are checked. Once you've done that, go ahead and click on "next." On this screen, just review everything. If it all looks good, click on "create trail."

      Now, if you get an error saying the S3 bucket already exists, you'll just need to choose a new bucket name. Click on "edit" at the top, change the bucket name to something that's globally unique, and then follow that process through again and create the trail.

      After a few moments, the trail will be created. It should say "US East Northern Virginia" as the home region. Even though you didn't get the option to select it because it's selected by default, it is a multi-region trail. Finally, it is an organizational trail, which means that this trail is now logging any CloudTrail events from all regions in all accounts in this AWS organization.

      Now, this isn't real-time, and when you first enable it, it can take some time for anything to start to appear in either S3 or CloudWatch logs. At this stage, I recommend that you pause the video and wait for 10 to 15 minutes before continuing, because the initial delivery of that first set of log files through to S3 can take some time. So pause the video, wait 10 to 15 minutes, and then you can resume.

      Next, right-click the link under the S3 bucket and open that in a new tab. Go to that tab, and you should start to see a folder structure being created inside the S3 bucket. Let's move down through this folder structure, starting with CloudTrail. Go to US East 1 and continue down through this folder structure.

      In my case, I have quite a few of these log files that have been delivered already. I'm going to pick one of them, the most recent, and just click on Open. Depending on the browser that you're using, you might have to download and then uncompress this file. Because I'm using Firefox, it can natively open the GZ compressed file and then automatically open the JSON log file inside it.

      So this is an example of a CloudTrail event. We're able to see the user identity that actually generated this event. In this case, it's me, the IAM admin user. We can see the account ID that this event is for. We can see the event source, the event name, the region, the source IP address, the user agent (in this case, the console), and all of the relevant information for this particular interaction with the AWS APIs is logged inside this CloudTrail event.

      Don’t worry if this doesn’t make a lot of sense at this point. You’ll get plenty of opportunities to interact with this type of logging event as you go through the various theory and practical lessons within the course. For now, I just want to highlight exactly what to expect with CloudTrail logs.
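If you'd rather read one of these files with a script than in the browser, the steps are the same: decompress the GZ file, parse the JSON, and walk the records. Below is a minimal sketch using only the standard library. The sample record is made up (all values are placeholders), although the field names follow the standard CloudTrail record layout, and the function name is hypothetical.

```python
import gzip
import io
import json

# A made-up record using standard CloudTrail field names; values are placeholders.
sample = {"Records": [{
    "eventTime": "2024-08-01T12:00:00Z",
    "eventSource": "ec2.amazonaws.com",
    "eventName": "StopInstances",
    "awsRegion": "us-east-1",
    "sourceIPAddress": "203.0.113.10",
    "userAgent": "console.amazonaws.com",
    "userIdentity": {"type": "IAMUser", "userName": "iamadmin", "accountId": "111122223333"},
}]}

# Simulate a delivered log file: gzip-compressed JSON.
compressed = gzip.compress(json.dumps(sample).encode("utf-8"))

def summarize_events(fileobj):
    """Yield (user, event source, event name, region) for each record in a log file."""
    with gzip.open(fileobj, "rt", encoding="utf-8") as handle:
        for record in json.load(handle)["Records"]:
            identity = record.get("userIdentity", {})
            # Not every identity type carries a user name, so fall back to the type.
            user = identity.get("userName", identity.get("type", "unknown"))
            yield (user, record["eventSource"], record["eventName"], record["awsRegion"])

summaries = list(summarize_events(io.BytesIO(compressed)))
for summary in summaries:
    print(summary)
```

For a real file downloaded from the bucket, you would pass an opened file (for example `open(path, "rb")`) instead of the in-memory buffer.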

      Since we’ve enabled all of this logging information to also go into CloudWatch logs, we can take a look at that as well. So back at the CloudTrail console, click on Services, type CloudWatch, wait for it to pop up, locate Logs underneath CloudWatch, and then open that in a new tab.

      Inside CloudWatch, on the left-hand menu, look for Logs, and then Log Groups, and open that. You might need to give this a short while to populate, but once it does, you should see a log group for the CloudTrail that you’ve just created. Go ahead and open that log group.

      Inside it, you’ll see a number of log streams. These log streams will start with your unique organizational code, which will be different for you. Then there will be the account number of the account that it represents. Again, these will be different for you. And then there’ll be the region name. Because I’m only interacting with the Northern Virginia region, currently, the only ones that I see are for US East 1.

      In this particular account that I’m in, the general account of the organization, if I look at the ARN (Amazon Resource Name) at the top or after US East 1 here, this number is my account number. This is the account number of my general account. So if I look at the log streams, you’ll be able to see that this account (the general account) matches this particular log stream. You’ll be able to do the same thing in your account. If you look for this account ID and then match it with one of the log streams, you'll be able to pull the logs for the general AWS account.
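Matching an account ID to its log streams can also be done with a short script. The sketch below assumes stream names of the form organization ID, account ID, the literal text "CloudTrail", then the region, joined by underscores. The lesson only states the order of the parts, so treat the exact layout as an assumption; the stream names and the function name are made up for illustration.

```python
import re

# Assumed layout: <organization ID>_<account ID>_CloudTrail_<region>,
# optionally followed by a suffix that this pattern ignores.
STREAM_PATTERN = re.compile(
    r"^(?P<org>o-[a-z0-9]+)_(?P<account>\d{12})_CloudTrail_(?P<region>[a-z]{2}(?:-[a-z]+)+-\d)"
)

def streams_for_account(stream_names, account_id):
    """Return (stream name, region) pairs that belong to the given account."""
    matches = []
    for name in stream_names:
        parsed = STREAM_PATTERN.match(name)
        if parsed and parsed.group("account") == account_id:
            matches.append((name, parsed.group("region")))
    return matches

# Placeholder stream names for two accounts in the same organization.
streams = [
    "o-exampleorgid_111122223333_CloudTrail_us-east-1",
    "o-exampleorgid_444455556666_CloudTrail_us-east-1",
]
general_account_streams = streams_for_account(streams, "111122223333")
print(general_account_streams)
```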

      If I go inside this particular log stream, as CloudTrail logs any activity in this account, all of that information will be populated into CloudWatch logs. And that’s what I can see here. If I expand one of these log entries, we’ll see the same formatted CloudTrail event that I just showed you in my text editor. So the only difference when using CloudWatch logs is that the CloudTrail events also get entered into a log stream in a log group within CloudWatch logs. The format looks very similar.

      Returning to the CloudTrail console, one last thing I want to highlight: if you expand the menu on the left, whether you enable a particular trail or not, you’ve always got access to the event history. The event history stores a log of all CloudTrail events for the last 90 days for this particular account, even if you don’t have a specific trail enabled. This is standard functionality. What a trail allows you to do is customize exactly what happens to that data. This area of the console, the event history, is always useful if you want to search for a particular event, maybe check who’s logged onto the account recently, or look at exactly what the IAM admin user has been doing within this particular AWS account.

      The reason why we created a trail is to persistently store that data in S3 as well as put it into CloudWatch logs, which gives us that extra functionality.

      One thing you need to be aware of is that S3, as a service, provides a certain amount of resource under the free tier available in every new AWS account, so you can store a certain amount of data in S3 free of charge. The problem with CloudTrail, and especially organizational trails, is that they generate quite a large number of requests. There is also, in addition to space, a number of requests per month that are part of the free tier.

      If you leave this CloudTrail enabled for the duration of your studies, for the entire month, it is possible that this will go slightly over the free tier allocation for requests within the S3 service. You might see warnings that you’re approaching a billable threshold, and you might even get a couple of cents of bill per month if you leave this enabled all the time. To avoid that, just go to Trails, open up the trail that you’ve created, and then click on Stop Logging. You’ll need to confirm that by clicking on Stop Logging, and at that point, no logging will occur into the S3 bucket or into CloudWatch logs, and you won’t experience those charges.

      For any production usage, the low cost of this service means that you would normally leave it enabled in all situations. But to keep costs within the free tier for this course, you can, if required, just go ahead and stop the logging. If you don’t mind a few cents per month of S3 charges for CloudTrail, then by all means, go ahead and leave it enabled.

      With that being said, that’s everything I wanted to cover in this demo lesson. So go ahead, complete the lesson, and when you're ready, I look forward to you joining me in the next.

    1. Welcome to this lesson, where I'm going to introduce the theory and architecture of CloudWatch Logs.

      I've already covered the metrics side of CloudWatch earlier in the course, and I'm covering the logs part now because you'll be using it when we cover CloudTrail. In the CloudTrail demo, we'll be setting up CloudTrail and using CloudWatch Logs as a destination for those logs. So, you'll need to understand it, and we'll be covering the architecture in this lesson. Let's jump in and get started.

      CloudWatch Logs is a public service. The endpoint to which applications connect is hosted in the AWS public zone. This means you can use the product within AWS VPCs, from on-premises environments, and even other cloud platforms, assuming that you have network connectivity as well as AWS permissions.

      The CloudWatch Logs product allows you to store, monitor, and access logging data. Logging data, at a very basic level, consists of a piece of information (the data) and a timestamp. The timestamp generally includes the year, month, day, hour, minute, second, and timezone. There can be more fields, but at a minimum, it's generally a timestamp and some data.

      CloudWatch Logs has built-in integrations with many AWS services, including EC2, VPC Flow Logs, Lambda, CloudTrail, Route 53, and many more. Any services that integrate with CloudWatch Logs can store data directly inside the product. Security for this is generally provided by using IAM roles or service roles.

      For anything outside AWS, such as logging custom application or OS logs on EC2, you can use the unified CloudWatch agent. I’ve mentioned this before and will be demoing it later in the EC2 section of the course. This is how anything outside of AWS products and services can log data into CloudWatch Logs. So, it’s either AWS service integrations or the unified CloudWatch agent. There is a third way, using development kits for AWS to implement logging into CloudWatch Logs directly into your application, but that tends to be covered in developer and DevOps AWS courses. For now, just remember either AWS service integrations or the unified CloudWatch agent.

      CloudWatch Logs is also capable of taking logging data and generating a metric from it, using what's known as a metric filter. Imagine a situation where you have a Linux instance, and one of the operating system log files logs any failed connection attempts via SSH. If this logging information is injected into CloudWatch Logs, a metric filter can scan those logs constantly. Anytime it sees a mention of a failed SSH connection, it can increment a metric within CloudWatch. You can then have alarms based on that metric, and I’ll be demoing that very thing later in the course.
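
      To make the metric filter idea concrete, here is a small Python sketch. It is only a toy model and doesn't call AWS: the `MetricFilter` class, the sample log lines, and the alarm threshold are all made up for illustration, and it uses a plain substring match rather than the real CloudWatch Logs filter pattern syntax.

```python
from dataclasses import dataclass


@dataclass
class MetricFilter:
    """Toy stand-in for a metric filter: counts messages containing a pattern."""
    pattern: str
    metric_name: str
    value: int = 0

    def process(self, message: str) -> None:
        # Plain substring match (an assumption for this sketch);
        # real filter patterns have their own syntax.
        if self.pattern in message:
            self.value += 1


# Hypothetical lines from an SSH daemon log file.
sample_log = [
    "Aug 12 03:00:01 host sshd[812]: Failed password for root from 203.0.113.5 port 4242 ssh2",
    "Aug 12 03:00:07 host sshd[812]: Accepted publickey for admin from 198.51.100.7 port 5151 ssh2",
    "Aug 12 03:00:09 host sshd[813]: Failed password for invalid user test from 203.0.113.5 port 4243 ssh2",
]

failed_ssh = MetricFilter(pattern="Failed password", metric_name="FailedSSHLogins")
for line in sample_log:
    failed_ssh.process(line)

# An alarm is just a threshold check on the metric (threshold chosen arbitrarily).
ALARM_THRESHOLD = 2
alarm_triggered = failed_ssh.value >= ALARM_THRESHOLD
print(failed_ssh.metric_name, failed_ssh.value, alarm_triggered)
```

      The point is the flow: log data goes in, the filter turns matching lines into a number, and the alarm only ever looks at that number.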

      Let’s look at the architecture visually because I'll be showing you how this works in practice in the CloudTrail demo, which will be coming up later in the section. Architecturally, CloudWatch Logs looks like this: It’s a regional service. So, for this example, let’s assume we’re talking about us-east-1.

      The starting point is our logging sources, which can include AWS products and services, mobile or server-based applications, external compute services (virtual or physical servers), databases, or even external APIs. These sources inject data into CloudWatch Logs as log events.

      Log events consist of a timestamp and a message block. CloudWatch Logs treats this message as a raw block of data. It can be anything you want, but there are ways the data can be interpreted, with fields and columns defined. Log events are stored inside log streams, which are essentially a sequence of log events from the same source.

      For example, if you had a log file stored on multiple EC2 instances that you wanted to inject into CloudWatch Logs, each log stream would represent the log file for one instance. So, you’d have one log stream for instance one and one log stream for instance two. Each log stream is an ordered set of log events for a specific source.

      We also have log groups, which are containers for multiple log streams of the same type of logging. Continuing the example, we would have one log group containing everything for that log file. Inside this log group would be different log streams, each representing one source. Each log stream is a collection of log events. Every time an item was added to the log file on a single EC2 instance, there would be one log event inside one log stream for that instance.
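
      If it helps to see that hierarchy as a data structure, here is a minimal Python sketch of log groups, log streams, and log events. The class names, the `put_event` helper, and the sample values are hypothetical and exist only to mirror the description above; this is not the AWS SDK.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class LogEvent:
    timestamp: int  # milliseconds since the epoch (unit assumed for this sketch)
    message: str    # treated as a raw block of data


@dataclass
class LogStream:
    """An ordered sequence of log events from one source."""
    name: str
    events: List[LogEvent] = field(default_factory=list)


@dataclass
class LogGroup:
    """A container for log streams of the same type, plus shared settings."""
    name: str
    retention_days: Optional[int] = None  # applies to every stream in the group
    streams: Dict[str, LogStream] = field(default_factory=dict)

    def put_event(self, stream_name: str, timestamp: int, message: str) -> None:
        # Create the stream the first time a source writes to it.
        stream = self.streams.setdefault(stream_name, LogStream(stream_name))
        stream.events.append(LogEvent(timestamp, message))


# One log group for one log file, with one stream per EC2 instance.
group = LogGroup(name="/var/log/secure", retention_days=30)
group.put_event("instance-1", 1723431601000, "Failed password for root")
group.put_event("instance-2", 1723431602000, "Accepted publickey for admin")
group.put_event("instance-1", 1723431603000, "Failed password for root")

for stream in group.streams.values():
    print(stream.name, len(stream.events))
```

      Each new line in the log file on an instance becomes one more event in that instance's stream, while settings such as retention live on the group.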

      A log group also stores configuration settings, such as retention settings and permissions. When we define these settings on a log group, they apply to all log streams within that log group. It’s also where metric filters are defined. These filters constantly review any log events for any log streams in that log group, looking for certain patterns, such as an application error code or a failed SSH login. When detected, these metric filters increment a metric, and metrics can have associated alarms. These alarms can notify administrators or integrate with AWS or external systems to take action.

      CloudWatch Logs is a powerful product. This is the high-level architecture, but don’t worry—you’ll get plenty of exposure to it throughout the course because many AWS products integrate with CloudWatch Logs and use it to store their logging data. We’ll be coming back to this product time and again as we progress through the course. CloudTrail uses CloudWatch Logs, Lambda uses CloudWatch Logs, and VPC Flow Logs use CloudWatch Logs. There are many examples of AWS products where we’ll be integrating them with CloudWatch Logs.

      I just wanted to introduce it at this early stage of the course. That’s everything I wanted to cover in this theory lesson. Thanks for watching. Go ahead, complete this video, and when you’re ready, join me in the next.

    1. Welcome back, and in this demo lesson, I want to give you some experience working with Service Control Policies (SCPs).

      At this point, you've created the AWS account structure which you'll be using for the remainder of the course. You've set up an AWS organization, with the general account that created it becoming the management account. Additionally, you've invited the production AWS account into the organization and created the development account within it.

      In this demo lesson, I want to show you how you can use SCPs to restrict what identities within an AWS account can do. This is a feature of AWS Organizations.

      Before we dive in, let's tidy up the AWS organization. Make sure you're logged into the general account, the management account of the organization, and then navigate to the organization's console. You can either type that into the 'Find Services' box or select it from 'Recently Used Services.'

      As discussed in previous lessons, AWS Organizations allows you to organize accounts with a hierarchical structure. Currently, there's only the root container of the organization. To create a hierarchical structure, we need to add some organizational units. We will create a development organizational unit and a production organizational unit.

      Select the root container at the top of the organizational structure. Click on "Actions" and then "Create New." For the production organizational unit, name it 'prod.' Scroll down and click on "Create Organizational Unit." Next, do the same for the development unit: select 'Root,' click on "Actions," and then "Create New." Under 'Name,' type 'dev,' scroll down, and click on "Create Organizational Unit."

      Now, we need to move our AWS accounts into these relevant organizational units. Currently, the Development, Production, and General accounts are all contained in the root container, which is the topmost point of our hierarchical structure.

      To move the accounts, select the Production AWS account, click on "Actions," and then "Move." In the dialogue that appears, select the Production Organizational Unit and click "Move." Repeat this process for the Development AWS account: select the Development AWS account, click "Actions," then "Move," and select the 'dev' OU before clicking "Move."

      Now, we've successfully moved the two AWS accounts into their respective organizational units. If you select each organizational unit in turn, you can see that 'prod' contains the production AWS account, and 'dev' contains the development AWS account. This simple hierarchical structure is now in place.

      To prepare for the demo part of this lesson where we look at SCPs, move back to the AWS console. Click on AWS, then the account dropdown, and switch roles into the production AWS account by selecting 'Prod' from 'Role History.'

      Once you're in the production account, create an S3 bucket. Type S3 into the 'Find Services' box or find it in 'Recently Used Services' and navigate to the S3 console. Click on "Create Bucket." For the bucket name, call it 'catpics' followed by a random number, because S3 bucket names must be globally unique. I’ll use 1, lots of 3s, and then 7. Ensure you select the US East 1 region for the bucket. Scroll down and click "Create Bucket."

      After creating the bucket, go inside it and upload some files. Click on "Add Files," then download the cat picture linked to this lesson to your local machine. Upload this cat picture to the S3 bucket by selecting it and clicking "Open," then "Upload" to complete the process.

      Once the upload finishes, you can view the picture of Samson. Click on it to see Samson looking pretty sleepy. This demonstrates that you can currently access the Samson.jpg object while operating within the production AWS account.

      The key point here is that you’ve assumed an IAM role. By switching roles into the production account, you’ve assumed the role called "organization account access role," which has the administrator access managed policy attached.

      Now, we’ll demonstrate how this can be restricted using SCPs. Move back to the main AWS console. Click on the account dropdown and switch back to the general AWS account. Navigate to AWS Organizations, then Policies. Currently, most options are disabled, including Service Control Policies, Tag Policies, AI Services Opt-out Policies, and Backup Policies.

      Click on Service Control Policies and then "Enable" to activate this functionality. This action adds the "Full AWS Access" policy to the entire organization, which imposes no restrictions, so all AWS accounts maintain full access to all AWS services.

      To create our own service control policy, download the file named DenyS3.json linked to this lesson and open it in a code editor. This SCP contains two statements. The first statement is an allow statement with an effect of allow, action as star (wildcard), and resource as star (wildcard). This replicates the full AWS access SCP applied by default. The second statement is a deny statement that denies any S3 actions on any AWS resource. This explicit deny overrides the explicit allow for S3 actions, resulting in access to all AWS services except S3.
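
      As a rough guide to what that file contains, here is a Python sketch that rebuilds the two statements as described and prints them as JSON. This is a reconstruction from the description above, so the real DenyS3.json may differ in layout or include extra fields such as statement IDs. The `is_allowed` helper is hypothetical and only illustrates the "explicit deny beats allow" rule; it is not how AWS evaluates policies internally, and it matches action names case-sensitively for simplicity.

```python
import json
from fnmatch import fnmatchcase

# Reconstructed from the lesson's description of DenyS3.json (an assumption).
deny_s3_scp = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "*", "Resource": "*"},
        {"Effect": "Deny", "Action": "s3:*", "Resource": "*"},
    ],
}


def is_allowed(policy: dict, action: str) -> bool:
    """Toy evaluation: any matching Deny wins, otherwise a matching Allow is needed."""
    effects = [
        statement["Effect"]
        for statement in policy["Statement"]
        if fnmatchcase(action, statement["Action"])
    ]
    if "Deny" in effects:
        return False
    return "Allow" in effects


print(json.dumps(deny_s3_scp, indent=2))
print("s3:ListAllMyBuckets ->", is_allowed(deny_s3_scp, "s3:ListAllMyBuckets"))
print("ec2:DescribeInstances ->", is_allowed(deny_s3_scp, "ec2:DescribeInstances"))
```

      Keep in mind that an SCP only sets the boundary of what an account can do. Identities inside the account still need their own identity policies to be granted anything within that boundary.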

      Copy the content of the DenyS3.json file into your clipboard. Move back to the AWS console, go to the policy section, and select Service Control Policies. Click "Create Policy," delete the existing JSON in the policy box, and paste the copied content. Name this policy "Allow all except S3" and create it.

      Now, go to AWS Accounts on the left menu, select the prod OU, and click on the Policies tab. Attach the new policy "Allow all except S3" by clicking "Attach" in the applied policies box. We will also detach the full AWS access policy directly attached. Check the box next to full AWS access, click "Detach," and confirm by clicking "Detach Policy."

      Now, the only service control policy directly attached to production is "Allow all except S3," which allows access to all AWS products and services except S3.

      To verify, go back to the main AWS console and switch roles into the production AWS account. Go to the S3 console and you should receive a permissions error, indicating that you don't have access to list buckets. This is because the SCP attached to the production account explicitly denies S3 access. Access to other services remains unaffected, so you can still interact with EC2.

      If we switch back to the general account, reattach the full AWS access policy, and detach "Allow all except S3," the production account will regain access to S3. By following the same process, you’ll be able to access the S3 bucket and view the object once again.

      This illustrates how SCPs can be used to restrict access for identities within an AWS account, in this case, the production AWS account.

      To clean up, delete the bucket. Select the catpics bucket, click "Empty," type "permanently delete," and select "Empty." Once that's done, you can delete the bucket by selecting it, clicking "Delete," confirming the bucket name, and then clicking "Delete Bucket."

      You’ve now demonstrated full control over S3, evidenced by successfully deleting the bucket. This concludes the demo lesson. You’ve created and applied an SCP that restricts S3 access, observed its effects, and cleaned up. We’ll discuss more about boundaries and restrictions in future lessons. For now, complete this video, and I'll look forward to seeing you in the next lesson.

    1. Welcome back.

      In this lesson, I want to continue immediately from the last one by discussing when and where you might use IAM roles. By talking through some good scenarios for using roles, I want to make sure that you're comfortable with selecting these types of situations where you would choose to use an IAM role and where you wouldn't, because that's essential for real-world AWS usage and for answering exam questions correctly.

      So let's get started.

      One of the most common uses of roles within the same AWS account is for AWS services themselves. AWS services operate on your behalf and need access rights to perform certain actions. An example of this is AWS Lambda. Now, I know I haven't covered Lambda yet, but it's a function as a service product. What this means is that you give Lambda some code and create a Lambda function. This function, when it runs, might do things like start and stop EC2 instances, perform backups, or run real-time data processing. What it does exactly isn't all that relevant for this lesson. The key thing, though, is that a Lambda function, as with most AWS things, has no permissions by default. A Lambda function is not an AWS identity. It's a component of a service, and so it needs some way of getting permissions to do things when it runs. Running a Lambda function is known as a function invocation or a function execution using Lambda terminology.

      So anything that's not an AWS identity, this might be an application or a script running on a piece of compute hardware somewhere, needs to be given permissions on AWS using access keys. Rather than hard-coding some access keys into your Lambda function, there's actually a better way. To provide these permissions, we can create an IAM role known as a Lambda execution role. This execution role has a trust policy which trusts the Lambda service. This means that Lambda is allowed to assume that role whenever a function is executed. This role has a permissions policy which grants access to AWS products and services.
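
      To show what those two policies might look like, here is a Python sketch that builds a trust policy and a permissions policy as dictionaries and prints them as JSON. The trust policy names the Lambda service principal, `lambda.amazonaws.com`. The permissions policy is purely an example I've picked to match the "start and stop EC2 instances" idea; a real function would get whatever permissions it actually needs.

```python
import json

# Trust policy: who is allowed to assume the role (here, the Lambda service).
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "lambda.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# Permissions policy: what the role can do once assumed (example actions only).
permissions_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["ec2:StartInstances", "ec2:StopInstances"],
            "Resource": "*",
        }
    ],
}

print(json.dumps(trust_policy, indent=2))
print(json.dumps(permissions_policy, indent=2))
```

      Notice that neither document contains any access keys. The credentials only exist temporarily, once the role has been assumed.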

      When the function runs, it uses the sts:AssumeRole operation, and then the Secure Token Service generates temporary security credentials. These temporary credentials are used by the runtime environment in which the Lambda function runs to access AWS resources based on the permissions the role’s permissions policy has. The code is running in a runtime environment, and it's the runtime environment that assumes the role. The runtime environment gets these temporary security credentials, and then the whole environment, which the code is running inside, can use these credentials to access AWS resources.

      So why would you use a role for this? What makes this scenario perfect for using a role? Well, if we didn't use a role, you would need to hard-code permissions into the Lambda function by explicitly providing access keys for that function to use. Where possible, you should avoid doing that because, A, it's a security risk, and B, it causes problems if you ever need to change or rotate those access keys. It's always better for AWS products and services, where possible, to use a role, because when a role is assumed, it provides a temporary set of credentials with enough time to complete a task, and then these are discarded.

      For a given Lambda function, you might have one copy running at once, zero copies, 50 copies, a hundred copies, or even more. You can't determine this number in advance. Remember my rule from the previous lesson: if you don't know the number of principals, whether that's because there are multiple or because the number is uncertain, then a role is probably the ideal identity to use. In this case, using a role and allowing Lambda to obtain temporary credentials is the ideal way of providing it with permissions. When AWS services do something on your behalf, a role is always the preferred option because you don't need to provide any static credentials.

      Okay, so let's move on to the next scenario.

      Another situation where roles are useful is emergency or out-of-the-usual situations. Here’s a familiar scenario that you might find in a workplace. This is Wayne, and Wayne works in a business's service desk team. This team is given read-only access to a customer's AWS account so that they can keep an eye on performance. The idea is that anything more risky than this read-only level of access is handled by a more senior technical team. We don't want to give Wayne's team long-term permissions to do anything more destructive than this read-only access, but there are always going to be situations which occur when we least want them, normally 3:00 a.m. on a Sunday morning, when a customer might call with an urgent issue where they need Wayne's help to maybe stop or start an instance, or maybe even terminate an EC2 instance and recreate it.

      So 99% of the time, Wayne and his team are happy with this read-only access, but there are situations when he needs more. This is a break-glass style situation. The name comes from the physical world, where a key for something is kept behind glass. It might be a key for a room that a certain team doesn't normally have access to, or maybe for a safe or a filing cabinet. Whatever it is, the glass provides a barrier, meaning that when people break it, they really mean to break it. It’s a confirmation step. So if you break a piece of glass to get a key to do something, there needs to be an intention behind it. Anyone can break the glass and retrieve the key, but having the glass results in the action only happening when it's really needed. At other times, whatever the key is for remains locked. And you can also tell when it’s been used and when it hasn’t.

      A role can perform the same thing inside an AWS account. Wayne can assume an emergency role when absolutely required. When he does, he'll gain additional permissions based on the role's permissions policy. For a short time, Wayne will, in effect, become the role. This access will be logged and Wayne will know to only use the role under exceptional circumstances. Wayne’s normal permissions can remain at read-only, which protects him and the customer, but he can obtain more if required when it’s really needed. So that’s another situation where a role might be a great solution.

      Another scenario when roles come in handy is when you're adding AWS into an existing corporate environment. You might have an existing physical network and an existing provider of identities, known as an identity provider, that your staff use to log into various systems. For the sake of this example, let’s just say that it's Microsoft Active Directory. In this scenario, you might want to offer your staff single sign-on, known as SSO, allowing them to use their existing logins to access AWS. Or you might have upwards of 5,000 accounts. Remember, there’s the 5,000 IAM user limit. So for a corporation with more than 5,000 staff, you can’t offer each of them an IAM user. That is beyond the capabilities of IAM.

      Roles are often used when you want to reuse your existing identities for use within AWS. Why? Because external accounts can’t be used directly. You can’t access an S3 bucket directly using an Active Directory account. Remember this fact. External accounts or external identities cannot be used directly to access AWS resources. You can’t directly use Facebook, Twitter, or Google identities to interact with AWS. There is a separate process which allows you to use these external identities, which I’ll be talking about later in the course.

      Architecturally, what happens is you allow an IAM role inside your AWS account to be assumed by one of the external identities, which is in Active Directory in this case. When the role is assumed, temporary credentials are generated and these are used to access the resources. There are ways that this is hidden behind the console UI so that it appears seamless, but that's what happens behind the scenes. I'll be covering this in much more detail later in the course when I talk about identity federation, but I wanted to introduce it here because it is one of the major use cases for IAM roles.

      Now, the reason roles are so important when an existing identity provider such as Active Directory is involved is, remember, the 5,000 IAM user limit in an account. So if your business has more than 5,000 accounts, then you can’t simply create an IAM user for each of those accounts, even if you wanted to. 5,000 is a hard limit. It can't be changed. Even if you could create more than 5,000 IAM users, would you actually want to manage 5,000 extra accounts? Using a role in this way, so giving permissions to an external identity provider and allowing external identities to assume this role, is called ID Federation. It means you have a small number of roles to manage and external identities can use these roles to access your AWS resources.

      Another common situation where you might use roles is if you're designing the architecture for a popular mobile application. Maybe it's a ride-sharing application which has millions of users. The application needs to store and retrieve data from a database product in AWS, such as DynamoDB. Now, I've already explained two very important but related concepts on the previous screen. Firstly, that when you interact with AWS resources, you need to use an AWS identity. And then secondly, that there’s this 5,000 IAM user limit per account. So designing an application with this many users which needs access to AWS resources, if you could only use IAM users or identities in AWS, it would be a problem because of this 5,000 user limit. It’s a hard limit and it can’t be raised.

      Now, this is a problem which can be fixed with a process called Web Identity Federation, which uses IAM roles. You might have noticed that most mobile applications you’ve used allow you to sign in using a web identity. This might be Twitter, Facebook, Google, and potentially many others. If we utilize this architecture for our mobile application, we can trust these identities and allow these identities to assume an IAM role. This is based on that role’s trust policy. So they can assume that role, gain access to temporary security credentials, and use those credentials to access AWS resources, such as DynamoDB. This is a form of Web Identity Federation, and I'll be covering it in much more detail later in the course.
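
      As a rough illustration of a trust policy that trusts a web identity, here is a Python sketch that prints one as JSON. I've assumed Google as the identity provider, and the application client ID is a made-up placeholder. Real designs often put a broker service in front of the provider, so treat this only as a picture of the idea.

```python
import json

# Placeholder value; a real application would use its own registered client ID.
HYPOTHETICAL_APP_CLIENT_ID = "example-app-client-id"

web_identity_trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            # The external identity provider being trusted (Google in this sketch).
            "Principal": {"Federated": "accounts.google.com"},
            "Action": "sts:AssumeRoleWithWebIdentity",
            # Only accept tokens that were issued for this particular application.
            "Condition": {
                "StringEquals": {"accounts.google.com:aud": HYPOTHETICAL_APP_CLIENT_ID}
            },
        }
    ],
}

print(json.dumps(web_identity_trust_policy, indent=2))
```

      The role's permissions policy would then grant access to the resources the application needs, such as the DynamoDB table.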

      The use of roles in this situation has many advantages. First, there are no AWS credentials stored in the application, which makes it a much more preferred option from a security point of view. If an application is exploited for whatever reason, there’s no chance of credentials being leaked, and it uses an IAM role which you can directly control from your AWS account. Secondly, it makes use of existing accounts that your customers probably already have, so they don't need yet another account to access your service. And lastly, it can scale to hundreds of millions of users and beyond. It means you don’t need to worry about the 5,000 user IAM limit. This is really important for the exam. There are very often questions on how you can architect solutions which will work for mobile applications. Using ID Federation, so using IAM roles, is how you can accomplish that. And again, I'll be providing much more information on ID Federation later in the course.

      Now, one scenario I want to cover before we finish up this lesson is cross-account access. In an upcoming lesson, I’ll be introducing AWS Organizations and you will get to see this type of usage in practice. It’s actually how we work in a multi-account environment. Picture the scenario that's on screen now: two AWS accounts, yours and a partner account. Let’s say your partner organization offers an application which processes scientific data and they want you to store any data inside an S3 bucket that’s in their account. Your account has thousands of identities, and the partner IT team doesn’t want to create IAM users in their account for all of your staff. In this situation, the best approach is to use a role in the partner account. Your users can assume that role, get temporary security credentials, and use those to upload objects. Because the IAM role in the partner account is an identity in that account, using that role means that any objects that you upload to that bucket are owned by the partner account. So it’s a very simple way of handling permissions when operating between accounts.
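
      Here is a small Python sketch of what assuming the partner's role involves from your side. It only builds the request parameters and doesn't call AWS. The account ID, role name, and session name are hypothetical, and `build_assume_role_request` is a helper I've invented for this example. `RoleArn` and `RoleSessionName` are the parameter names used by the STS AssumeRole operation.

```python
def build_assume_role_request(account_id: str, role_name: str, session_name: str) -> dict:
    """Build the parameters for an STS AssumeRole call against another account."""
    if not (len(account_id) == 12 and account_id.isdigit()):
        raise ValueError("AWS account IDs are 12-digit numbers")
    return {
        "RoleArn": f"arn:aws:iam::{account_id}:role/{role_name}",
        "RoleSessionName": session_name,
    }


# Hypothetical partner account and role.
request = build_assume_role_request("111122223333", "PartnerUploadRole", "data-upload")
print(request["RoleArn"])

# With boto3 installed and credentials configured, the parameters could be used as:
#   import boto3
#   response = boto3.client("sts").assume_role(**request)
# The "Credentials" entry in the response holds the temporary AccessKeyId,
# SecretAccessKey, and SessionToken used to upload to the partner's bucket.
```

      The role's trust policy in the partner account is what decides whether identities from your account are allowed to make this call.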

      Roles can be used cross-account to give access to individual resources like S3 in the onscreen example, or you can use roles to give access to a whole account. You’ll see this in the upcoming AWS Organization demo lesson. In that lesson, we’re going to configure it so a role in all of the different AWS accounts that we’ll be using for this course can be assumed from the general account. It means you won’t need to log in to all of these different AWS accounts. It makes multi-account management really simple.

      I hope by this point you start to get a feel for when roles are used. Even if you’re a little vague, you will learn more as you go through the course. For now, just a basic understanding is enough. Roles are difficult to understand at first, so you’re doing well if you’re anything but confused at this point. I promise you, as we go through the course and you get more experience, it will become second nature.

      So at this point, that’s everything I wanted to cover. Thanks for watching. Go ahead and complete this video, and when you're ready, join me in the next lesson.

    1. Have a registration code?

      This is only if the user doesn't follow the hyperlinked button. I think it should be moved to the bottom of sign in or be in a troubleshooting flow only

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Ever-improving techniques allow the detailed capture of brain morphology and function to the point where individual brain anatomy becomes an important factor. This study investigated detailed sulcal morphology in the parieto-occipital junction. Using cutting-edge methods, it provides important insights into local anatomy, individual variability, and local brain function. The presented work advances the field and will stimulate future research into this important area.

      Strengths:

      Detailed, very thorough methodology. Multiple raters mapped detailed sulci in a large cohort. The identified sulcal features and their functional and behavioural relevance are then studied using various complementary methods. The results provide compelling evidence for the importance of the described sulcal features and their proposed relationship to cortical brain function.

      We thank the Reviewer for highlighting the strengths of our methods and findings.

      Weaknesses:

      A detailed description/depiction of the various sulcal patterns is missing.

      We agree that adding these details for the newly described sulci is necessary and have now done so. These details are included in the Results (Page 6):

      “Beyond characterizing the incidence of sulci, it is also common in the neuroanatomical literature to qualitatively characterize sulci on the basis of fractionation and intersection with surrounding sulci (termed “sulcal types”; for examples in other cortical expanses, see Chiavaras & Petrides, 2000; Drudik et al., 2023; Miller et al., 2021; Paus et al., 1996; Weiner et al., 2014; Willbrand, Parker, et al., 2022). All four sulci most commonly did not intersect with other sulci (see Supplementary Tables 1-4 for a summary of the sulcal types of the slocs and pAngs dorsal and ventral components). The sulcal types were also highly comparable between hemispheres (rs > .99, ps < .001).”

      And in four new Supplementary Tables.

      A possible relationship between sulcal morphology and individual demographics might provide more insight into anatomical variability.

      We have conducted additional analyses to relate sulcal incidence to demographic features (age and gender). These results are included on Pages 5-6:

      “Given that sulcal incidence and patterning is also sometimes related to demographic features (Cachia et al., 2021; Leonard et al., 2009; Wei et al., 2017), subsequent GLMs relating the incidence and patterning of the three more variable sulci (slocs-d, pAngs-v, and pAngs-d) to demographic features (age and gender) revealed no associations for any sulcus (ps > .05).”

      The unique dataset offers an opportunity to provide insights into laterality effects that should be explored.

      We included hemisphere as a factor in all models for this exact reason. Throughout the paper, we have edited the text to ensure that these laterality effects are more apparent to readers.

      Further, we have a Supplementary Results section on hemispheric effects regarding the slocs-v, cSTS3, and lTOS:

      “Hemispheric asymmetries in morphological, architectural, and functional features with regards to the slocs-v, cSTS3, and lTOS comparison

      We observed a sulcus x metric x hemisphere interaction on the morphological and architectural features of the slocs-v (F(4.20, 289.81) = 4.16, η2 = 0.01, p = .002; the cSTS3 is discussed in the next section). Post hoc tests showed that this interaction was driven by the slocs-v being cortically thinner in the left than the right hemisphere (p < .001; Fig. 2a).

      There was also a sulcus x network x hemisphere interaction on the functional connectivity profiles (using functional connectivity parcellations from (Kong et al., 2019)) of the slocs-v and lTOS (F(32, 2144) = 3.99, η2 = 0.06, p < .001; the cSTS3 is discussed in the next section). Post hoc tests showed that this interaction was driven by three effects: (i) the slocs-v overlapped more with the Default C subnetwork in the left than the right hemisphere (p = .013), (ii) the lTOS overlapped more with Visual A subnetwork in the right than the left hemisphere (p = .002), and (iii) the lTOS overlapped more with the Visual B subnetwork in the left than the right hemisphere (p = .002; Fig. 2b).”

      As well as the other STS rami on morphology:

      “It is also worth noting that there was a sulcus x metric x hemisphere interaction (F(4, 284.12) = 6.60, η2 = 0.08, p < .001). Post hoc tests showed that: (i) the cSTS3 was smaller (p < .001) and thinner (p = .025) in the left than the right hemisphere (Supplementary Fig. 8a), (ii) the cSTS2 was shallower (p = .004) and thicker (p < .001) in the right than left hemisphere (Supplementary Fig. 8a), and (iii) the cSTS1 was shallower (p < .001), smaller (p = .002), thinner (p = .001), and less myelinated (p < .001) in the left than the right hemisphere (Supplementary Fig. 8a).”

      And functional connectivity of the STS rami:

      “There was also a sulcus x network x hemisphere interaction (F(32, 2208) = 12.26, η2 = 0.15, p < .001). Post hoc tests showed differences for each cSTS component. Here, the cSTS1 overlapped more with the Auditory network (p < .001), less with the Control B subnetwork (p < .001), more with the Control C subnetwork (p < .001), less with the Default B subnetwork (p < .001), more with the Default C subnetwork (p < .001), more with the Ventral Attention B subnetwork (p < .001), and more with the Visual A subnetwork (p = .024) in the right than in the left hemisphere (Supplementary Fig. 8b). In addition, the cSTS2 overlapped more with the Control B subnetwork (p < .001), more with the Control C subnetwork (p < .001), less with the Default B subnetwork (p < .001), and less with the Temporal-Parietal network (p = .011) in the right than in the left hemisphere (Supplementary Fig. 8b). Finally, the cSTS3 overlapped more with the Control B subnetwork (p = .002), less with the Default B subnetwork (p = .014), more with the Default C subnetwork (p = .022), and less with the Ventral Attention B subnetwork (p = .029) in the right than in the left hemisphere (Supplementary Fig. 8b).”

      Reviewer #2 (Public Review):

      Summary: After manually labeling 144 human adult hemispheres in the lateral parieto-occipital junction (LPOJ), the authors 1) propose a nomenclature for 4 previously unnamed highly variable sulci located between the temporal and parietal or occipital lobes, 2) focus on one of these newly named sulci, namely the ventral supralateral occipital sulcus (slocs-v) and compare it to neighboring sulci to demonstrate its specificity (in terms of depth, surface area, gray matter thickness, myelination, and connectivity), 3) relate the morphology of a subgroup of sulci from the region including the slocs-v to the performance in a spatial orientation task, demonstrating behavioral and morphological specificity. In addition to these results, the authors propose an extended reflection on the relationship between these newly named landmarks and previous anatomical studies, a reflection about the slocs-v related to functional and cytoarchitectonic parcellations as well as anatomic connectivity and an insight about potential anatomical mechanisms relating sulcation and behavior.

      Strengths:

      - To my knowledge, this is the first study addressing the variable tertiary sulci located between the superior temporal sulcus (STS) and intraparietal sulcus (IPS).

      - This is a very comprehensive study addressing altogether anatomical, architectural, functional and cognitive aspects.

      - The definition of highly variable yet highly reproducible sulci such as the slocs-v feeds the community with new anatomo-functional landmarks (which is emphasized by the provision of a probability map in supp. mat., which in my opinion should be proposed in the main body).

      - The comparison of different features between the slocs-v and similar sulci is useful to demonstrate their difference.

      - The detailed comparison of the present study with state of the art contextualizes and strengthens the novel findings.

      - The functional study complements the anatomical description and points towards cognitive specificity related to a subset of sulci from the LPOJ

      - The discussion offers a proposition of theoretical interpretation of the findings

      - The data and code are mostly available online (raw data made available upon request).

      We thank the Reviewer for highlighting the strengths of our methods, analyses, and applications of our findings.

      Weaknesses:

      - While three independent raters labeled all hemispheres, one single expert finalized the decision. Because no information is reported on the inter-rater variability, this somehow equates to a single expert labeling the whole cohort, which could result in biased labellings and therefore affect the reproducibility of the new labels.

      To expedite the process of defining thousands of sulci at the individual level in multiple regions, our group does not use an approach amenable to calculating inter-rater agreement. Our method consists of a two-tiered procedure. Here, authors YT and TG defined sulci, which were then checked by a trained expert (EHW). These were then checked again by the senior author (KSW). We emphasize that this process has produced reproducible anatomical results in other regions such as posteromedial cortex (Willbrand et al., 2023 Science Advances; Willbrand et al., 2023 Communications Biology; Maboudian et al., 2024 The Journal of Neuroscience), ventral temporal cortex (Weiner et al., 2014 NeuroImage; Miller et al., 2020 Scientific Reports; Parker et al., 2023 Brain Structure and Function), and lateral prefrontal cortex (Miller et al., 2021 The Journal of Neuroscience; Voorhies et al., 2021 Nature Communications; Yao et al., 2022 Cerebral Cortex; Willbrand et al., 2022 Brain Structure and Function; Willbrand et al., 2023 The Journal of Neuroscience) across age groups, species, and clinical populations. Further, in the Supplemental Materials we provide post mortem images showing that these sulci exist outside of cortical reconstructions, supporting this updated sulcal schematic of the lateral parieto-occipital junction. For the present study, we emphasize that by the time the final tier of our method was reached, only a very small percentage (~2%) of sulcal definitions were actually modified. We will include an exact percentage in future publications in LPC/LPOJ.

      - 3 out of the 4 newly labeled sulci are only described in the very first part and never reused. This should be emphasized as it is far from obvious at first glance of the article.

      We have edited the Abstract (shown below, on Page 1) and the paper throughout to make clear that our focus is on the slocs-v rather than the other three sulci.

      “After defining thousands of sulci in a young adult cohort, we revised the previous LPC/LPOJ sulcal landscape to include four previously overlooked, small, shallow, and variable sulci. One of these sulci (ventral supralateral occipital sulcus, slocs-v) is present in nearly every hemisphere and is morphologically, architecturally, and functionally dissociable from neighboring sulci. A data-driven, model-based approach, relating sulcal depth to behavior further revealed that the morphology of only a subset of LPC/LPOJ sulci, including the slocs-v, is related to performance on a spatial orientation task.”

      It is worth noting that we have added additional analyses that include the other three newly-characterized sulci in response to Reviewer 1. We first described the relationship between these sulci and demographic features, alongside analyses on the patterning of these sulci, which are included in the Results (Page 6):

      “Beyond characterizing the incidence of sulci, it is also common in the neuroanatomical literature to qualitatively characterize sulci on the basis of fractionation and intersection with surrounding sulci (termed “sulcal types”; for examples in other cortical expanses, see Chiavaras & Petrides, 2000; Drudik et al., 2023; Miller et al., 2021; Paus et al., 1996; Weiner et al., 2014; Willbrand, Parker, et al., 2022). All four sulci most commonly did not intersect with other sulci (see Supplementary Tables 1-4 for a summary of the sulcal types of the slocs and pAngs dorsal and ventral components). The sulcal types were also highly comparable between hemispheres (rs > .99, ps < .001). Though we characterize these sulci in this paper for the first time, the location of these four sulci is consistent with the presence of variable “accessory sulci” in this cortical expanse mentioned in prior modern and classic studies (Supplementary Methods). We could also identify these sulci in post-mortem hemispheres (Supplementary Figs. 2, 3), ensuring that these sulci were not an artifact of the cortical reconstruction process.

      Given that sulcal incidence and patterning is also sometimes related to demographic features (Cachia et al., 2021; Leonard et al., 2009; Wei et al., 2017), subsequent GLMs relating the incidence and patterning of the three more variable sulci (slocs-d, pAngs-v, and pAngs-d) to demographic features (age and gender) revealed no associations for any sulcus (ps > .05).  Finally, to help guide future research on these newly- and previously-classified LPC/LPOJ sulci, we generated probabilistic maps of each of these 17 sulci and share them with the field with the publication of this paper (Supplementary Fig. 6; Data availability).”

      - The tone of the article suggests a discovery of these 4 sulci when some of them have already been reported (as rightfully highlighted in the article), though not named nor studied specifically. This is slightly misleading as I interpret the first part of the article as a proposition of nomenclature rather than a discovery of sulci.

      We have toned down our language throughout the paper, emphasizing that this paper updates the sulcal landscape of LPC/LPOJ by taking into account sulci that have not been comprehensively described previously. For example, in the Abstract (Page 1), we now write:

      “After defining thousands of sulci in a young adult cohort, we revised the previous LPC/LPOJ sulcal landscape to include four previously overlooked, small, shallow, and variable sulci. One of these sulci (ventral supralateral occipital sulcus, slocs-v) is present in nearly every hemisphere and is morphologically, architecturally, and functionally dissociable from neighboring sulci. A data-driven, model-based approach, relating sulcal depth to behavior further revealed that the morphology of only a subset of LPC/LPOJ sulci, including the slocs-v, is related to performance on a spatial orientation task.”

      - The article never mentions the concept of merging of sulcal elements and the potential effect it could have on the labeling of the newly named variable sulci.

      We emphasize that we use multiple surfaces (pial, inflated, smoothwm) to help distinguish intersecting sulci from one another. We include extra text in the Methods (Page 21):

      “We defined LPC/LPOJ sulci for each participant based on the most recent schematics of sulcal patterning by Petrides (2019) as well as pial, inflated, and smoothed white matter (smoothwm) FreeSurfer cortical surface reconstructions of each individual. In some cases, the precise start or end point of a sulcus can be difficult to determine on a surface (Borne et al., 2020); however, examining consensus across multiple surfaces allowed us to clearly determine each sulcal boundary in each individual.”

      Further, upon quantifying the patterning of these variable sulci, we found that they are independent of other sulci the majority of the time (described in the Results on Page 6):

      “Beyond characterizing the incidence of sulci, it is also common in the neuroanatomical literature to qualitatively characterize sulci on the basis of fractionation and intersection with surrounding sulci (termed “sulcal types”; for examples in other cortical expanses, see Chiavaras & Petrides, 2000; Drudik et al., 2023; Miller et al., 2021; Paus et al., 1996; Weiner et al., 2014; Willbrand, Parker, et al., 2022). All four sulci most commonly did not intersect with other sulci (see Supplementary Tables 1-4 for a summary of the sulcal types of the slocs and pAngs dorsal and ventral components). The sulcal types were also highly comparable between hemispheres (rs > .99, ps < .001).”

      Thus, merging sulcal elements likely had a minimal impact on the present definitions.

      - The definition of the new sulci is solely based on their localization relative to other sulci which are themselves variable (e.g. the 3rd branch of the STS can show different locations and different orientation, potentially affecting the definition of the slocs-v). This is not addressed in the discussion.

      As displayed in our probabilistic maps of these sulci (Supplementary Fig. 6), the cSTS components (2-4) are actually relatively consistent between individuals, and thus, future investigators can utilize these maps to help define these sulci in new hemispheres.

      Nevertheless, there is, of course, individual variability in the location of these sulci, and we do agree that this point brought up by the Reviewer is important. We have now added text to the Limitations section of the Discussion (Pages 15-16):

      “The main limitation of our study is that presently, the most accurate methodology to define sulci, especially the small, shallow, and variable PTS, requires researchers to manually trace each structure on the cortical surface reconstructions. This method is limited due to the individual variability of cortical sulcal patterning (Fig. 1, Supplementary Fig. 5), which makes it challenging to identify sulci, let alone PTS, without extensive experience and practice. However, we anticipate that our probabilistic maps will provide a starting point and, hopefully, expedite the identification of these sulci in new participants. This method is also arduous and time-consuming, which, on the one hand, limits the sample size in terms of number of participants, while on the other, results in thousands of precisely defined sulci. This push-pull relationship reflects a broader conversation in the human brain mapping and cognitive neuroscience fields between a balance of large N studies and “precision imaging” studies in individual participants (Allen et al., 2022; Gratton et al., 2022; Naselaris et al., 2021; Rosenberg and Finn, 2022).”
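
As a rough illustration of how a probabilistic map of this kind is commonly built (a minimal sketch, not the authors' pipeline): assuming each participant's manual label for one sulcus has already been resampled to a common template surface and stored as a binary per-vertex mask, the map value at each vertex is simply the proportion of participants whose label covers that vertex. The function names, the toy masks, and the 0.5 threshold below are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def probability_map(label_masks: Sequence[Sequence[int]]) -> List[float]:
    """Per-vertex proportion of participants whose label covers that vertex.

    Assumes every mask is binary (0/1) and aligned to the same template
    surface, so that index v refers to the same vertex in every participant.
    """
    if not label_masks:
        raise ValueError("need at least one participant mask")
    n_vertices = len(label_masks[0])
    if any(len(mask) != n_vertices for mask in label_masks):
        raise ValueError("all masks must share the template vertex count")
    n_participants = len(label_masks)
    return [
        sum(mask[v] for mask in label_masks) / n_participants
        for v in range(n_vertices)
    ]


def threshold_map(prob_map: Sequence[float], min_overlap: float) -> List[int]:
    """Keep vertices labeled in at least `min_overlap` of participants."""
    return [1 if p >= min_overlap else 0 for p in prob_map]


# Toy example: four participants, four template vertices (hypothetical data).
masks = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
]
pmap = probability_map(masks)
candidate_region = threshold_map(pmap, min_overlap=0.5)
```

A thresholded map like `candidate_region` could serve as a starting search region in a new hemisphere; given the individual variability noted above, it would still need to be checked against that individual's own surfaces.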

      - The new sulci are only defined in terms of localization relative to other sulci, and no other property is described (general length, depth, orientation, shape...), making it hard for a new observer to take labeling decisions in case of conflict.

      To help guide future investigators, we now show these metrics for all sulci in Supplemental Figure 7, so that general morphology can assist in identifying these sulci.

      - The very assertive tone of the article conveys the idea that these sulci are identifiable certainly in most cases, when by definition these highly variable tertiary sulci are sometimes very difficult to take decisions on.

      The highly variable nature of three of the four putative tertiary sulci (slocs-v, slocs-d, pAngs-v, pAngs-d) described here is why we focused on the slocs-v (as it is identifiable in nearly all hemispheres). However, we have edited our language throughout the text to also emphasize the variability of these sulci. For example, in the Results (Page 5), we now write:

      “In previous research in small sample sizes, neuroanatomists noticed shallow sulci in this cortical expanse (Supplementary Methods and Supplementary Figs. 1-4 for historical details). In the present study, we fully update this sulcal landscape considering these overlooked indentations. In addition to defining the 13 sulci previously described within the LPC/LPOJ, as well as the posterior superior temporal cortex (Methods) (Petrides, 2019) in individual participants, we could also identify as many as four small and shallow PTS situated within the LPC/LPOJ that were highly variable across individuals and uncharted until now (Supplementary Methods and Supplementary Figs. 1-4). Macroanatomically, we could identify two sulci between the cSTS3 and the IPS-PO/lTOS ventrally and two sulci between the cSTS2 and the pips/IPS dorsally. We focus our analyses on the slocs-v since it was identifiable in nearly every hemisphere.”

      - I am not absolutely convinced with the labeling proposed of a previously reported sulcus, namely the posterior intermediate parietal sulcus.

      In defining previously-identified LPC sulci, we followed the previous labeling procedure by Petrides (2019) alongside historical definitions (detailed in Supplementary Figures 1-4). Nevertheless, future deep learning algorithms using these and other data can be used to rectify discrepancies in labeling (e.g., Borne et al., 2020 Medical Image Analysis; Lyu et al., 2021 NeuroImage). We discuss these points in the Limitations section of the Discussion (Pages 16-17):

      “The main limitation of our study is that presently, the most accurate methodology to define sulci, especially the small, shallow, and variable PTS, requires researchers to manually trace each structure on the cortical surface reconstructions. This method is limited due to the individual variability of cortical sulcal patterning (Fig. 1, Supplementary Fig. 5), which makes it challenging to identify sulci without extensive experience and practice. However, we anticipate that our probabilistic maps will provide a starting point and, hopefully, expedite the identification of these sulci in new participants. This should accelerate the process of subsequent studies confirming the accuracy of our updated schematic of LPC/LPOJ. This manual method is also arduous and time-consuming, which, on the one hand, limits the sample size in terms of number of participants, while on the other, results in thousands of precisely defined sulci. This push-pull relationship reflects a broader conversation in the human brain mapping and cognitive neuroscience fields between a balance of large N studies and “precision imaging” studies in individual participants (Allen et al., 2022; Gratton et al., 2022; Naselaris et al., 2021; Rosenberg & Finn, 2022). Though our sample size is comparable to other studies that produced reliable results relating sulcal morphology to brain function and cognition (e.g., Cachia et al., 2021; Garrison et al., 2015; Lopez-Persem et al., 2019; Miller et al., 2021; Roell et al., 2021; Voorhies et al., 2021; Weiner, 2019; Willbrand, Parker, et al., 2022; Willbrand, Voorhies, et al., 2022; Yao et al., 2022), ongoing work that uses deep learning algorithms to automatically define sulci should result in much larger sample sizes in future studies (Borne et al., 2020; Lyu et al., 2021). Finally, the time-consuming manual definitions of primary, secondary, and PTS also limit the cortical expanse explored in each study, thus restricting the present study to LPC/LPOJ.”

      Assuming that the labelling of all sulci reported in the article is reproducible, the different results are convincing and in general, this study achieves its aims in defining more precisely the sulcation of the LPOJ and looking into its functional/cognitive value. This work clearly offers a finer understanding of sulcal pattern in this region, and lacks only little for the new markers to be convincingly demonstrated. An overall coherence of the labelling can still be inferred from the supplementary material which support the results and therefore the conclusions, yet, addressing some of the weaknesses listed above would greatly enhance the impact of this work. This work is important to the understanding of sulcal variability and its implications on functional and cognitive aspects.

      We thank the Reviewer for their positive remarks on the implications of this work.

      Reviewer #3 (Public Review):

      Summary: 72 subjects, and 144 hemispheres, from the Human Connectome Project had their parietal sulci manually traced. This identified the presence of previously undescribed shallow sulci. One of these sulci, the ventral supralateral occipital sulcus (slocs-v), was then demonstrated to have functional specificity in spatial orientation. The discussion furthermore provides an eloquent overview of our understanding of the anatomy of the parietal cortex, situating their new work into the broader field. Finally, this paper stimulates further debate about the relative value of detailed manual anatomy, inherently limited in participant numbers and areas of the brain covered, against fully automated processing that can cover thousands of participants but easily misses the kinds of anatomical details described here.

      Strengths:

      - This is the first paper describing the tertiary sulci of the parietal cortex with this level of detail, identifying novel shallow sulci and mapping them to behaviour and function.

      - It is a very elegantly written paper, situating the current work into the broader field.

      - The combination of detailed anatomy and function and behaviour is superb.

      We thank the Reviewer for their positive remarks on the paper and our findings.

      Weaknesses:

      - The numbers of subjects are inherently limited both in number as well as in typically developing young adults.

      We emphasize that the sample size is limited due to the arduous nature of manually defining sulci; however, we provide probabilistic maps with the publication of this work to help expedite this process for future investigators. Further, with improved deep learning algorithms, the sample sizes in future neuroanatomical studies should be enhanced. We discuss these points in the Limitations section of the Discussion (Pages 16-17):

      “The main limitation of our study is that presently, the most accurate methodology to define sulci, especially the small, shallow, and variable PTS, requires researchers to manually trace each structure on the cortical surface reconstructions. This method is limited due to the individual variability of cortical sulcal patterning (Fig. 1, Supplementary Fig. 5), which makes it challenging to identify sulci without extensive experience and practice. However, we anticipate that our probabilistic maps will provide a starting point and, hopefully, expedite the identification of these sulci in new participants. This should accelerate the process of subsequent studies confirming the accuracy of our updated schematic of LPC/LPOJ. This manual method is also arduous and time-consuming, which, on the one hand, limits the sample size in terms of number of participants, while on the other, results in thousands of precisely defined sulci. This push-pull relationship reflects a broader conversation in the human brain mapping and cognitive neuroscience fields between a balance of large N studies and “precision imaging” studies in individual participants (Allen et al., 2022; Gratton et al., 2022; Naselaris et al., 2021; Rosenberg & Finn, 2022). Though our sample size is comparable to other studies that produced reliable results relating sulcal morphology to brain function and cognition (e.g., Cachia et al., 2021; Garrison et al., 2015; Lopez-Persem et al., 2019; Miller et al., 2021; Roell et al., 2021; Voorhies et al., 2021; Weiner, 2019; Willbrand, Parker, et al., 2022; Willbrand, Voorhies, et al., 2022; Yao et al., 2022), ongoing work that uses deep learning algorithms to automatically define sulci should result in much larger sample sizes in future studies (Borne et al., 2020; Lyu et al., 2021). The time-consuming manual definitions of primary, secondary, and PTS also limit the cortical expanse explored in each study, thus restricting the present study to LPC/LPOJ.”

      - While the paper begins by describing four new sulci, only one is explored further in greater detail.

      Due to the increased variability of three of the four newly-classified sulci, we chose to only focus on the slocs-v given that it was present in nearly all hemispheres. In response to other reviewers, we have conducted additional analyses that also describe these new sulci and potential factors related to their incidence (Page 6):

      “Given that sulcal incidence and patterning is also sometimes related to demographic features (Cachia et al., 2021; Leonard et al., 2009; Wei et al., 2017), subsequent GLMs relating the incidence and patterning of the three more variable sulci (slocs-d, pAngs-v, and pAngs-d) to demographic features (age and gender) revealed no associations for any sulcus (ps > .05).”

      In addition, given that sulcal variability is cognitively (e.g., Amiez et al., 2018 Scientific Reports; Cachia et al., 2021 Frontiers in Neuroanatomy; Garrison et al., 2015 Nature Communications; Willbrand et al., 2022, 2023 Brain Structure & Function), anatomically (e.g., Amiez et al., 2021 Communications Biology; Vogt et al., 1995 Journal of Comparative Neurology), functionally (e.g., Lopez Persem et al., 2019 The Journal of Neuroscience), and translationally (e.g., Yucel et al., 2002 Biological Psychiatry) relevant, future research can investigate these relationships regarding the slocs-d and pAngs components. We have added text to the Limitations section of the Discussion (Pages 17-18) to discuss this:

      “Finally, although we did not focus on the relationship between the other three PTS (slocs-d, pAngs-v, and pAngs-d) to anatomical and functional features of LPC and cognition, given that variability in sulcal incidence is cognitively (Amiez et al., 2018; Cachia et al., 2021; Garrison et al., 2015; Willbrand, Jackson, et al., 2023; Willbrand, Voorhies, et al., 2022), anatomically (Amiez et al., 2021; Vogt et al., 1995), functionally (Lopez-Persem et al., 2019), and translationally (Clark et al., 2010; Le Provost et al., 2003; Meredith et al., 2012; Nakamura et al., 2020; Yücel et al., 2002, 2003) relevant, future work can also examine the relationship between the more variable slocs-d, pAngs-v, and pAngs-d and these features.”

      - There is some tension between calling the discovered sulci new vs acknowledging they have already been reported, but not named.

      We have edited the manuscript throughout to emphasize our primary focus on revising the LPC/LPOJ sulcal landscape to include these often overlooked small, shallow, and variable putative tertiary sulci, rather than using the terms “discovered sulci” and “new.”

      - The anatomy of the sulci, as opposed to their relation to other sulci, could be described in greater detail.

      Beyond the radar plots in the main text which compare specific groupings of sulci, we now show the morphological metrics for all sulci investigated in the present work in Supplemental Figure 7.

      Overall, to summarize, I greatly enjoyed this paper and believe it to be a highly valued contribution to the field.

      We are glad the Reviewer enjoyed reading our paper and thank them for their positive thoughts on the potential impact of this work on the field.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) The slocs-v is found in 71 subjects left and right. Is that the same subject?

      No, these are different subjects.

      (2) How were the 72 subjects chosen?

      The subjects were randomly selected from the HCP database, as described in the Methods (Page 18):

      “Here, we used 72 randomly-selected participants, balanced for gender (following the terminology of the HCP data dictionary), from the HCP database (50% female, 22-36 years old, and 90% right-handed; there was no effect of handedness on our behavioral tasks; Supplementary materials) that were also analyzed in several prior studies (Hathaway et al., 2023; Miller et al., 2021, 2020; Willbrand et al., 2023b, 2023c, 2022a).”

      (3) Are there effects of laterality on sulcal pattern? Table?

      We now include sulcal pattern results in the Results section and Supplementary Materials; there were no laterality effects on sulcal pattern.

      (4) Depiction/description of common sulcal patterns

      We now include sulcal pattern results in the Results section and Supplementary Materials.

      (5) Is there a relationship between sulcal patterns and demographic features?

      We now include analyses on this in the Results section. There is no relationship between sulcal patterns and demographic features.

      (6) Just for clarity, the sulcal features are studied and extracted in native space?

      Yes, sulcal features are studied and extracted in native space, as described in the Methods section (Page 19):

      “Anatomical T1-weighted (T1-w) MRI scans (0.8 mm voxel resolution) were obtained in native space from the HCP database. Reconstructions of the cortical surfaces of each participant were generated using FreeSurfer (v6.0.0), a software package used for processing and analyzing human brain MRI images (surfer.nmr.mgh.harvard.edu) (Dale et al., 1999; Fischl et al., 1999). All subsequent sulcal labeling and extraction of anatomical metrics were calculated from these native space reconstructions generated through the HCP’s version of the FreeSurfer pipeline (Glasser et al., 2013).”

      (7) The authors use "Gender". Are they referring to biological sex (female/male) or socially defined characteristics (man/woman etc.)?

      The term gender refers to socially defined characteristics, as used by the HCP data dictionary (Methods, Page 18):

      “Here, we used 72 randomly-selected participants, balanced for gender (following the terminology of the HCP data dictionary), from the HCP database (50% female, 22-36 years old, and 90% right-handed; there was no effect of handedness on our behavioral tasks; Supplementary materials) that were also analyzed in several prior studies (Hathaway et al., 2023; Miller et al., 2021, 2020; Willbrand et al., 2023b, 2023c, 2022a).”

      (8) Fig 2. Grey is poorly visible compared to green and blue.

      The shade of gray has been edited to be more distinguishable.

      (9) The relationship between behavior and sulcal features is significant but weak.

      We acknowledge that the morphological-behavioral relationship identified in the present study explains a modest amount of variance; however, the more important aspect of the finding is that multiple sulci identified in the model are recently-characterized sulci in LPC/LPOJ identified by our group and others (Petrides, 2019), and thus, the relationship would have been overlooked or lost if these sulci were not identified. We have added text to the Limitations section of the Discussion (Pages 17-18) to emphasize this point:

      “It is also worth noting that the morphological-behavioral relationship identified in the present study explains a modest amount of variance; however, the more important aspect of our findings is that multiple sulci identified in our model-based approach are recently-characterized sulci in LPC/LPOJ identified by our group and others (Petrides, 2019), and thus, the relationship would have been overlooked or lost if these sulci were not identified.”

      (10) The Limitation section could be expanded.

      We have added additional text to flesh out the Limitations section of the Discussion (Pages 17-18):

      “It is also worth noting that the morphological-behavioral relationship identified in the present study explains a modest amount of variance; however, the more important aspect of our findings is that multiple sulci identified in our model-based approach are recently-characterized sulci in LPC/LPOJ identified by our group and others (Petrides, 2019), and thus, the relationship would have been overlooked or lost if these sulci were not identified. Finally, although we did not focus on the relationship between the other three PTS (slocs-d, pAngs-v, and pAngs-d) to anatomical and functional features of LPC and cognition, given that variability in sulcal incidence is cognitively (Amiez et al., 2018; Cachia et al., 2021; Garrison et al., 2015; Willbrand, Jackson, et al., 2023; Willbrand, Voorhies, et al., 2022), anatomically (Amiez et al., 2021; Vogt et al., 1995), functionally (Lopez-Persem et al., 2019), and translationally (Clark et al., 2010; Le Provost et al., 2003; Meredith et al., 2012; Nakamura et al., 2020; Yücel et al., 2002, 2003) relevant, future work can also examine the relationship between the more variable slocs-d, pAngs-v, and pAngs-d and these features.”

      Reviewer #2 (Recommendations For The Authors):

      First, I would like to thank the authors for their important contribution to the field of sulcal studies and anatomo-functional correlates. My main comments about the work are treated in the public review, and I will only address details in this section. I have detected a number of typos which are harder to report from a document in which lines are not numbered. Could you please submit a numbered document for the next iteration?

      - p2. "hominoid-specific, shallow indentations, or sulci" - can lead to misunderstanding that sulci are hominoid-specific and shallow

      Sentence has been rewritten:

      “Of all the neuroanatomical features to target, recent work shows that morphological features of the shallower, later developing, hominoid-specific indentations of the cerebral cortex (also known as putative tertiary sulci, PTS) are not only functionally and cognitively meaningful, but also are particularly impacted by multiple brain-related disorders and aging (Amiez et al., 2019, 2018; Ammons et al., 2021; Cachia et al., 2021; Fornito et al., 2004; Garrison et al., 2015; Harper et al., 2022; Hathaway et al., 2023; Lopez-Persem et al., 2019; Miller et al., 2021, 2020; Nakamura et al., 2020; Parker et al., 2023; Voorhies et al., 2021; Weiner, 2019; Willbrand et al., 2023b, 2023c, 2022a, 2022b; Yao et al., 2022).”

      - p2. next sentence (starting with "The combination [...]": not clear that you are addressing tertiary sulci here, maybe introduce the concept beforehand?

      The previous sentence (just above) has been edited to introduce putative tertiary sulci beforehand.

      - p5. error in numbering of sulci relative to Fig1. (5,6,7,8 -> 6,7,8,9)

      Sulcal numbering has been fixed.

      - p5. reference to supp mat -> I would have expected the nomenclature used in Borne et al. 2020 to be discussed alongside the state of the art. How would you relate F.I.P.r.int.1 and F.I.P.r.int.2 to the sulci you describe?

      We thank the Reviewer for bringing up this relevant literature. The F.I.P.r.int. 1 and 2 are described as rami of the IPS, whereas the slocs and pAngs are independent, small indentations near the IPS, but not part of the complex. Nevertheless, future work should integrate these two schematics together to establish the most comprehensive sulcal map of LPC/LOPJ. We have added text to the Supplementary Methods detailing the differences between the F.I.P.r.int.1 and F.I.P.r.int.2 and slocs-/pAngs:

      “slocs/pAng vs. F.I.P.r.int.1 and F.I.P.r.int.2

      Recent work (Borne et al., 2020; Perrot et al., 2011) identified two intermediate rami of the IPS (F.I.P.r.int.1 and F.I.P.r.int.2) that were not defined in the present investigation. Crucially, the newly classified sulci here (slocs and pAngs) are distinguishable from the two F.I.P.r.int. in that the F.I.P.r.int. are branches coming off the main body of the IPS (Borne et al., 2020; Perrot et al., 2011), whereas the slocs/pAngs are predominantly non-intersecting (“free”) structures that never intersected with the IPS (Supplementary Tables 1-4).”

      - p6. Fig 1.a. labelling discrepancy between line 1 and 2, column 4: the labels 10 and 11 from the inflated hemisphere do not match the labels 10 and 11 in the pial surface. Fig 1.b. swapped label 2 and 3 in the 4th hemisphere

      These aspects of Figure 1 have been edited accordingly.

      - p7. "(iii) the slocs-v was thicker than both the cSTS3 and lTOS" -> the slocs-v showed thicker gray matter?

      The sentence has been adjusted (Page 7):

      “(iii) the slocs-v showed thicker gray matter than both the cSTS3 and lTOS (ps < .001),”

      - p9. Six left hemisphere LPC/LPOJ sulci were related to spatial orientation task performance -> missing

      Fixed (Page 9):

      “Six left hemisphere LPC/LPOJ sulci were related to spatial orientation task performance (Fig. 3a, b).”

      - p14. "Steel and colleagues" -> missing space

      Fixed (Page 14):

      “Furthermore, the slocs-v appears to lie at the junction of scene-perception and place-memory activity (a transition that also consistently co-localizes with the HCP-MMP area PGp) as identified by Steel and colleagues (2021).”

      - p20. Probability maps "we share these maps with the field" -> specify link to data availability

      The link to data availability has been added (Page 21):

      “To aid future studies interested in investigating LPC/LPOJ sulci, we share these maps with the field (Data availability).”

      Reviewer #3 (Recommendations For The Authors):

      No detailed recommendations not already present in the rest of the review.

    2. Reviewer #2 (Public Review):

      Summary:

      After manually labelling 144 human adult hemispheres in the lateral parieto-occipital junction (LPOJ), the authors 1) propose a nomenclature for 4 previously unnamed highly variable sulci located between the temporal and parietal or occipital lobes, 2) focus on one of these newly named sulci, namely the ventral supralateral occipital sulcus (slocs-v) and compare it to neighbouring sulci to demonstrate its specificity (in terms of depth, surface area, gray matter thickness, myelination, and connectivity), 3) relate the morphology of a subgroup of sulci from the region including the slocs-v to the performance in a spatial orientation task, demonstrating behavioural and morphological specificity. In addition to these results, the authors propose an extended reflection on the relationship between these newly named landmarks and previous anatomical studies, a reflection about the slocs-v related to functional and cytoarchitectonic parcellations as well as anatomic connectivity and an insight about potential anatomical mechanisms relating sulcation and behaviour.

      Strengths:

      - To my knowledge, this is the first study addressing the variable tertiary sulci located between the superior temporal sulcus (STS) and intra-parietal sulcus (IPS).
      - This is a very comprehensive study addressing altogether anatomical, architectural, functional and cognitive aspects.
      - The definition of highly variable yet highly reproducible sulci such as the slocs-v feeds the community with new anatomo-functional landmarks (which is emphasized by the provision of a probability map in supp. mat., which in my opinion should be proposed in the main body).
      - The comparison of different features between the slocs-v and similar sulci is useful to demonstrate their difference.
      - The detailed comparison of the present study with state of the art contextualises and strengthens the novel findings.
      - The functional study complements the anatomical description and points towards cognitive specificity related to a subset of sulci from the LPOJ.
      - The discussion offers a proposition of theoretical interpretation of the findings.
      - The data and code are mostly available online (raw data made available upon request).

      Weaknesses:

      - While the identification of the sulci has been done thoroughly with expert validation, the sulci have not been labelled in a way that enables the demonstration of the reproducibility of the labelling.

      The proposed methodology is convincing in identifying and studying the relationship between highly variable sulci and cognition. This improves our refined understanding of the general anatomical variability in the LPOJ and its potential functional/cognitive correlates. This work is important to the understanding of sulcal variability and its implications on functional and cognitive aspects.

      Comments on revised version:

      Thank you for the elegant and informative work.

    1. Inside these structural tags, as you have surely seen in the embedded code snippet, you can also use the generic <div> and <span> tags to create blocks within your content, to which you can then apply styling.

      I would like to understand why the generic <div> tags inside the header and footer tags have disappeared, when the passage above says we can have them. Thanks in advance.

    1. Reviewer #2 (Public Review):

      Summary:

      This paper addresses an important computational problem in learning and memory. Why do related memory representations sometimes become more similar to each other (integration) and sometimes more distinct (differentiation)? Classic supervised learning models predict that shared associations should cause memories to integrate, but these models have recently been challenged by empirical data showing that shared associations can sometimes cause differentiation. The authors have previously proposed that unsupervised learning may account for these unintuitive data. Here, they follow up on this idea by actually implementing an unsupervised neural network model that updates the connections between memories based on the amount of coactivity between them. The authors use their modeling framework to simulate three recent empirical studies, showing that their model captures aspects of these findings that are hard to account for with supervised learning.

      Overall, this is a strong and clearly described work that is likely to have a positive impact on computational and empirical work in learning and memory. While the authors have written about some of the ideas discussed in this paper previously, a fully implemented and openly available model is a clear advance that will benefit the field. It is not easy to translate a high-level description of a learning rule into a model that actually runs and behaves as expected. The fact that the authors have made all their code available makes it likely that other researchers will extend the model in numerous interesting ways, many of which the authors have discussed and highlighted in their paper.

      Strengths:

      The authors succeed in demonstrating that unsupervised learning with a simple u-shaped rule can produce results that are qualitatively in line with the empirical reports. In each of the three models, the authors manipulate stimulus similarity (following Chanales et al.), shared vs distinct associations (following Favila et al.), or learning strength (a stand-in for blocked versus interleaved learning schedule; following Schlichting et al.). In all cases, with hand-tuning of additional parameters, the authors are able to produce model representations that fit the empirical results, but that can't easily be accounted for by supervised learning. Demonstrating these effects isn't trivial and a formal modeling framework for doing so is a valuable contribution. Overall, the work is very thorough. The authors investigate many different aspects of the learning dynamics (learning rate, oscillation strength, hidden layer overlap etc) in these models and produce several key insights. Of particular value are their demonstrations that when differentiation occurs, it occurs very quickly and asymmetrically and results in anti-correlated representations, as well as the distinction between symmetric and asymmetric integration in their model. The authors thoroughly acknowledge the relative difficulty of producing differentiation in their models relative to integration, and are now more clear about why they don't necessarily view this as a mismatch with the empirical data. The authors are also more clear about the complicated activation dynamics in their model and why critical ranges for some parameters can't be given -- the number of interacting parameters means that there are many combinations that could produce the critical activation dynamics and thus the same result. Despite this complexity, the paper is very clearly written; the authors do a good job of both formally describing their model as well as giving readers a high level sense of how many of their critical model components work.

      Weaknesses:

      Though the u-shaped learning rule is essential to this framework, the paper doesn't do any formal investigation of this learning rule or comparison with other learning rules. The authors do have a strong theoretical interest in this rule as well as experimental precedent for testing this rule, which they now thoroughly discuss in the paper. Still, a stronger argument in support of the non-monotonic plasticity hypothesis could have been made by comparing this learning rule to alternatives. Additionally, the authors' choice of strongly prewiring associations makes it difficult to think about how their model maps onto experimental contexts where associations are only weakly learned. However, the authors thoroughly acknowledge why this was necessary and discuss this limitation in the paper.

    1. Reviewer #2 (Public Review):

      Guan and colleagues address the question of how a single neuroblast produces a defined number of progeny, and what influences its decommissioning. The focus of the experiments are two well-studied RNA-binding proteins: Imp and Syp. The Authors find that these factors play an important role in determining the number of neurons in their preferred model system of VNC motor neurons coming from a single lineage (LinA/15) by separate functions taking place at specific stages of development of this lineage: influencing the life-span of the LinA neuroblast to control its timely decommissioning and functioning in the Late-born post-mitotic neurons to influence cell death after the appropriate number of progeny is generated. The post-mitotic role of Imp/Syp in regulating programmed-cell death (PCD) is also correlated with a specific code of key transcription factors that are suspected to influence neuronal identity, linking the fate of neuronal survival with its specification. This paper addresses a wide scope of phenotypes related to the same factors, thus providing an intriguing demonstration of how the nervous system is constructed by context-specific changes in key developmental regulators. The bulk of conclusions drawn by the authors are supported by careful experimental evidence, and the findings are a useful addition to an important topic in developmental neuroscience.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      This study addresses the temporal patterning of a specific Drosophila CNS neuroblast lineage, focusing on its larval development. They find that a temporal cascade, involving the Imp and Syb genes changes the fate of one daughter cell/branch, from glioblast (GB) to programmed cell death (PCD), as well as gates the decommissioning of the NB at the end of neurogenesis.

      I believe there are some inaccuracies in this summary. We address temporal patterning during larval and pupal stages until the adult stage. The Imp and Syp genes change the fate of one daughter cell/branch from survival to programmed cell death (PCD). The change from glioblast (GB) to PCD, which occurs at an early time point, is not addressed here. The main point of the paper is missing:

      • Last-born MNs undergo apoptosis due to their failure to express a functional TF code, and this code is post-transcriptionally regulated by the opposite expression of Imp and Syp in immature MNs.

      Reviewer #2 (Public Review):

      Summary:

      Guan and colleagues address the question of how a single neuroblast produces a defined number of progeny, and what influences its decommissioning. The focus of the experiments are two well-studied RNA-binding proteins: Imp and Syp. The Authors find that these factors play an important role in determining the number of neurons in their preferred model system of VNC motor neurons coming from a single lineage (LinA/15) by separate functions taking place at specific stages of development of this lineage: influencing the life-span of the LinA neuroblast to control its timely decommissioning and functioning in the Late-born post-mitotic neurons to influence cell death after the appropriate number of progeny is generated. The post-mitotic role of Imp/Syp in regulating programmed-cell death (PCD) is also correlated with a specific code of key transcription factors that are suspected to influence neuronal identity, linking the fate of neuronal survival with its specification. This paper addresses a wide scope of phenotypes related to the same factors, thus providing an intriguing demonstration of how the nervous system is constructed by context-specific changes in key developmental regulators.

      The bulk of conclusions drawn by the authors are supported by careful experimental evidence, and the findings are a useful addition to an important topic in developmental neuroscience.

      I could not summarize the paper better.

      Strengths:

      A major strength is the use of a genetic labeling tool that allows the authors to specifically analyze and manipulate one neuronal lineage. This allows for simultaneous study of both the progenitors and post-mitotic progeny. As a result the paper conveys a lot of useful information for this particular neuronal lineage. Furthermore, addressing the association of cell fate specification, taking advantage of this lab's extensive prior work in the system, with developmentally-regulated programmed cell death is an important contribution to the field.

      Beyond Imp/Syp, additional characterization of this model system is provided in characterizing a previously unrecognized death of a hemilineage in early-born neurons.

      Thanks!

      Weaknesses:

      The main observation that distinguishes this study from others that have investigated Imp/Syp in the fly nervous system is the role played in late-born post-mitotic neurons to regulate programmed cell death. This is an important and plausible (based on the presented findings) newly discovered role for these proteins. However, the precision of experiments is not particularly strong, which limits the authors' claims. The genetic strategy used to manipulate Imp/Syp or the TF code appears to be applied throughout the entire lineage, or all neuronal progeny, and not restricted to only the late-born cells. Can the authors rule out survival of the early-born hemi-lineage normally fated to die? Therefore statements such as “To further investigate this possibility, we used the MARCM technique to change the TF code of last-born MNs without affecting the expression of Imp and Syp” should be qualified to specify that the result is obtained by misexpressing these factors throughout the entire lineage.

      We agree that our genetic manipulations affect the entire lineage or all neuronal progeny. We do not have genetic tools to gain such precision. We have changed our descriptions to specify the entire lineage or all neuronal progeny. As the reviewer raised, we were also concerned about the possibility that the overexpression of Imp or knockdown of Syp could induce the survival of the early-born hemilineage. We have two experiments that rule out this possibility:

      (1) In late LL3 larvae, Imp OE or syp MARCM clones do not change the number of cells (see Guan et al., 2022), indicating that the hemilineage that died by PCD is not affected. If Imp or Syp played a role in the survival of the hemilineage, we would see at least a 50% increase in the number of MNs at this stage.

      (2) The MARCM experiment using the VGlut driver to overexpress P35 or Imp allows us to manipulate only elav+ VGlut+ neurons. The hemilineage removed by PCD is elav- VGlut- and is not affected by this experiment. Consequently, the increase in MNs in adults with genetic manipulation can only be the result of the survival of the other hemilineage (elav+, VGlut+). Moreover, this experiment shows an increase in the number of neurons in the adult but not in LL3, demonstrating that the hemilineage (elav- VGlut-) is still removed by PCD with this genetic manipulation.

      The authors make an observation that differs from other systems in which Imp/Syp have been studied: that the expression of the two proteins appears to be independent and not influenced by cross-regulation. However there is a lack of investigation as to what effect this may have on how Imp/Syp regulate temporal identity. A key implication of the previously observed cross-regulation in the fly mushroom body is that the ratio of Imp/Syp could change over the life of the NB which would permit different neuronal identities. Without cross-regulation, do the authors still observe a gradient in the expression pattern of time? Because the data is presented with Imp and Syp stained in different brain samples, and without quantification across different stages, this is unclear. The authors use the term 'gradient' but changes in levels of these factors are not evident from the presented data.

      We have now quantified the transcriptional activity of Imp and Syp in the NB over time using smFISH. We have also quantified the relative expression of Imp and Syp protein in the NB over time by co-immunostaining. Additionally, we quantified the relative expression of Imp and Syp protein in postmitotic neurons as a function of their birth order in late LL3 larvae. All these data show opposing temporal gradients of Imp and Syp in the NB and opposing spatial gradients in immature neurons according to their birth order (Figure 4). How these gradients are established in our system remains to be elucidated.

      Reviewer #3 (Public Review):

      This study by Guan and co-workers focuses on a model neuronal lineage in the developing Drosophila nervous system, revealing interesting aspects about: a) the generation of supernumerary cells, later destined for apoptosis; and, b) new insights into the mechanisms that regulate this process. The two RNA-binding proteins, Imp and Syp, are shown to be expressed in temporally largely complementary patterns, their expression defining early vs later born neurons in this lineage, and thus also regulating the apoptotic elimination. Moreover, neuronal 'fate' transcription factors that are downstream of Imp and signatures of early-born neurons, can also be sufficient to convert later born cells to an earlier 'fate', including survival.

      The authors provide solid evidence for most of their statements, including the temporal windows during which the early and the later-born motoneurons are generated by this model lineage, how this relates to patterns of cell death by apoptosis and that mis-expression of early-born transcription factors in later-born cells can be sufficient to block apoptosis (part of, and perhaps indicative of the late-born identity).

      Other studies have previously outlined analogous, mutually antagonistic roles for Imp and Syp during nervous system development in Drosophila, in different parts and at different stages, with which the working model of this study aligns.

      Overall, this study adds to and extends current working models and evidence on the developmental mechanisms that underlie temporal cell fate decisions.

      I could not summarize the paper better.

      Reviewer #1 (Recommendations For The Authors):

      While this is an interesting topic, I raised two issues in my original review.

      (1) Against the backdrop of numerous previous studies linking many developmental regulators, including tTFs, to programmed cell death in the developing CNS, which in several cases have involved identifying key PCD genes and decoding the molecular regulatory interplay between regulators and PCD genes, this study does not provide any new insight into the regulation of developmental PCD in the CNS.

      The authors have not added any new data to address this shortcoming.

      I agree with the reviewer that we did not attempt to link Imp/Syp with the temporal transcription factor (tTF) cascade or spatial selectors such as Hox genes. However, this decision was intentional as our primary focus was on studying immature MNs. It is worth noting that the decommissioning of NBs by autophagic cell death or terminal differentiation, which is mediated by Imp/Syp in other lineages, has not been correlated with tTFs or spatial selectors. Although we have not directly examined the involvement of the hb + sv > kr > pdm > cas > cas-svp > Grh cascade in the decommissioning of the Lin A neuroblast, our preliminary data indicate that Hb, Sv, Pdm, and Cas are not expressed in the Lin A NB, while Grh is consistently expressed in the NB (Wenyue et al., 2022). Thus, it is unlikely that this particular tTF cascade is implicated in Lin A neuroblast decommissioning. In contrast, spatial selectors, such as the Hox gene Antp, play an opposing role compared to HOX transcription factors in abdominal NBs. In the Lin A lineage, Antp promotes survival (Baek, Enriquez, & Mann, 2013). Here, to avoid repeating what has already been described in the literature, we focused on the role of Imp/Syp in postmitotic neurons and revealed that the precise elimination of MNs is linked to the control of TFs expressed in the MNs.

      (2) I raised the issue that it is unclear if Imp/Syp acts in the NB, and/or in IMC/GMC, and/or in the daughter cells generated from these.

      I agree with the reviewer's concern regarding the unclear function of Imp/Syp, i.e., whether it acts in the NB, IMC/GMC, or daughter cells. To address this, one possible approach would be to attempt rescuing Imp and Syp mutants by transgenic expression in specific cell types, such as NBs, IMC/GMC, or GB/daughter cells. However, we have not conducted such experiments as we were skeptical about the outcome. Previously published work has used drivers expressed in NBs, IMC/GMC, or postmitotic neurons to decipher the function of a gene in a specific cell type, but the results of these experiments must be taken with caution. Using NB/GMC drivers to study gene function can lead to effects not only in the NB but also in its progeny, including GMC or postmitotic neurons, due to the perdurance and stability of the Gal4 and UAS-gene expression system. For instance, dpn-Gal4 UAS-GFP not only labels the NB but also many of its progeny, even if Dpn is only expressed in NBs. Likewise, elav-Gal4 is expressed in the NB and GMCs.

      However, our overexpression of Imp in immature neurons using VGlut demonstrates that Imp promotes cell survival through an autonomous function in these neurons. This driver is only expressed in postmitotic neurons (elav+) and not in the NB, IMC/GMC, or in the hemilineage eliminated by cell death (elav- VGlut-).

      Reviewer #2 (Recommendations For The Authors):

      Oddly knockdown of Imp in the neuroblast (Fig. 5D) only led to death at 8h APF, when Imp is no longer expressed. Do the authors have an explanation as to how the stem cell can survive until this point? A discussion would be helpful.

      The simple explanation is the efficiency of RNAi. The imp-/- MARCM clones (Guan et al., 2022) lead to a stronger reduction of MNs in LL3.

      A simple experiment I would recommend is to repeat the antibody stainings of staged larvae/pupae (Fig. 4) having the anti-Imp/Syp antibodies in the same brain sample, and perhaps a quantification of the ratio in the NB. Given that the species in which the ABs were raised seem compatible, this should be feasible. As it stands now, there is no indication of whether the ratio of Imp vs Syp changes over time.

      We have now quantified the transcriptional activity of Imp and Syp in the NB over time. We have also quantified the relative expression of Imp and Syp proteins in the NB over time and quantified the relative expression of Imp and Syp proteins in postmitotic neurons as a function of their birth in late LL3 larvae. How these gradients are established in our system still remains to be identified.

      Minor errors/suggestions:

      Fig 4. Time legend at the top goes A, B, C, E, F (no D). So it doesn't match the panels below

      Yes, we have made the corrections.

      Sentence repeated in Intro:

      The process of terminating NB neurogenesis through autophagic cell death or terminal differentiation is commonly referred to as decommissioning.

      Yes, corrections have been made.

      In Figure 1 they say 'type Ib' and in Figure 2 they say 'type 1b'.

      We have changed it to type 1b.

      In Fig2A-It's hard to see lack of Elav and Fig2G-It's hard to see presence of Dcp1. Panels could be adjusted to emphasize these results

      We have increased the size of the panels and made two separate panels where only the elav and Dcp1 signals are present.

      Observations that the result is equivalent in all thoracic segments is expected, since all legs need the same number of neurons. This is nice to have but can be in the supplement.

      Overall the figure number seems excessive, especially considering that much of the results included (particularly the NB results) are findings consistent with previous papers and some is characterization of the system that does not fit well with the main focus regarding Imp/Syp (i.e., death of one hemi-lineage).

      Figure 5 and 6 can be joined as one.

      We have combined Figures 5 and 6, showing only the T1 segments.

      There is some discrepancy between graphs Fig7F and K: At LL3 the number of neurons is different for the control in 7F and the count in K

      Yes, because the genetic backgrounds are not the same and we are not counting the same type of cells. In 7F, we are counting the elav+ and VGlut+ cells, whereas in Figure 7K, we are counting all the elav+ in Lin A, including those elav+ VGlut-. VGlut expression arrives a bit later after elav+, which is why we have fewer elav+ cells in 7F. In other words, VGlut MARCM clones do not label all Lin A elav+ cells. I have clarified this in the figure.

      Reviewer #3 (Recommendations For The Authors):

      Main comment: on the notion of Imp and Syp gradients:

      p. 5, related to figure 4 - there are clearly distinct windows for predominantly (if not exclusively) Imp, and later, Syp expression in lineage 15, with a phase of co-expression.

      However, based on the data shown, it is unclear whether these windows represent gradients, as repeatedly stated. If the notion of gradients is derived from other studies, on other lineages, then this would be good to clarify. Alternatively, the idea of temporally opposing gradients of Imp and Syp would need to be demonstrated for this lineage.

      For example, a more accurate way to describe this study's data is given on p.7 "In conclusion, our findings demonstrate that the opposite expression pattern of Imp and Syp in postmitotic neurons precisely shapes the size of Lin A/15 lineage by controlling the pattern of PCD in immature MNs (Fig. 8)."

      We have now quantified the transcriptional activity of Imp and Syp in the NB over time. We have also quantified the relative expression of Imp and Syp proteins in the NB over time. We have also quantified the relative expression of Imp and Syp proteins in postmitotic neurons as a function of their birth in late LL3 larvae. How these gradients are established in our system still remains to be identified.

      Minor points:

      p.6, related to figure 7: Are numbers of EDU- early born and EDU+, late born, MNs expressed as means in the main text? As written, it suggests absence of any variability, which one would expect and which is shown in Fig.7 data.

      Yes, we have added averages in the text.

      Methods: the author name 'Lacin' has been mis-spelled

      Sorry about that, it's been corrected.

    1. Note: This response was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      We would first like to thank the reviewers for their careful reading and thoughtful feedback.

      We have substantially revised the manuscript and included additional experimental evidence on O-GlcNAc and OGT/OGA protein levels in the placenta of embryos bearing the OGT-Y851A hypomorphic mutation.

      Overall, we believe our improved manuscript provides compelling evidence that the glycosyltransferase activity of OGT, and thus the O-GlcNAc modification itself, plays a sexually dimorphic function in placental development and the developmental repression of retrotransposons in the developing embryo.

      We have addressed each of the reviewers' comments below. The original comments (C) are in italic, our responses (R) in Roman font.

      Reviewer #1

      Evidence, reproducibility and clarity

      C1: Formichetti et al. developed mice with OGT catalytic dead mutations and then studied their function during early embryogenesis. Not surprisingly, dramatic reduction in OGT activity failed to produce embryos; however, mild reduction in OGT did produce animals. The authors then use the T931 animals that have a mild reduction in activity to further characterize the function in the early embryo. Not surprisingly, male mice showed changes in gene expression, implantation sub-lethality, and an uptick in loss of retrotransposon silencing. The authors also show that an even milder reduction in OGT activity (Y851A) affects male placenta function and chromatin remodeling. Finally, the authors make a less stable OGT transgene within the mouse and again found embryogenesis issues in the males and alterations in numerous gene families including mTOR signaling and p53 function. All in all, this is an interesting study that tracks functions of OGT in early embryonic development. The studies are well-controlled and rigorous.

      R1: We thank the reviewer for their clear understanding and their appreciation of the rigor and impact of this work.

      Significance

      C2: This is a good study and novel. Not only is it of interest to reproductive biologists, but it echoes themes found in O-GlcNAc biology.

      R2: We are pleased that the reviewer underlined the novelty of the study and its impact across fields.

      Reviewer #2

      Evidence, reproducibility and clarity

      Comments to authors

      C3: To investigate the function of OGT at specific developmental stages, the authors perturbed OGT's function in vivo by creating a murine allelic series featuring four single amino acid substitutions that variably reduced OGT's catalytic activity. The goal was to identify the direct effect of O-GlcNAcylation, using a sophisticated collection of genetic mutants to evaluate in vivo the role of this modification at early stages of development. Overall, the severity of embryonic lethality correlated with the extent of catalytic impairment of OGT, demonstrating that the O-GlcNAc modification is essential for early development. The study represents a substantial advance in our understanding of OGT and O-GlcNAcylation in mammalian development. The creation of novel murine models and inducible systems is an important contribution, providing powerful tools for future research in this field. The insights into the role of OGT's catalytic activity and its involvement in epigenetic regulation during embryonic development are noteworthy, opening new avenues for research.

      R3: We thank the reviewer for their insightful comments and are grateful for the supporting statements. Please find below detailed responses to all your comments.

      However, there are a few considerations and concerns:

      Major:

      C4: 1. An assumption of the study is that different mutations cause different levels of O-GlcNAcylation rather than alterations in substrate specificity. It might be important to test, at least in cultured cells, that the different mutations do not change the preference of OGT to modify certain proteins rather than others, which can provide alternative explanations for their findings.

      R4: Thank you for asking this question; it helped us to better explain the rationale behind the choice of the Ogt amino-acid substitutions.

      This is a critical point that we carefully considered in the design of the single amino-acid substitutions. Two lines of evidence support that the precise mutations created impact the catalytic rate without modifying the substrate specificity:

      First, as explained in the text, the choice of the single amino-acid substitutions was driven by previous structural and enzymology knowledge. The impact of the four point mutations selected on OGT protein stability and on the Michaelis-Menten kinetic values had previously been determined experimentally (Fig. 1A legend and Martinez-Fleites, C. et al. Nature Structural & Molecular Biology 2008; https://doi.org/10.1038/nsmb.1443).

      There is a second important rationale that we added to the revised manuscript: the four point mutations selected are all located in the catalytic domain (specifically, H568A in the N-Cat domain and Y851A, T931A and Q849N in the C-Cat domain), while substrate recognition is mediated by two other domains, namely the intervening domain (Int-D; https://doi.org/10.1038/s41589-023-01422-2) and the tetratricopeptide repeat (TPR) superhelix (10.1021/jacs.7b13546; https://doi.org/10.1073/pnas.2303690120). For both these reasons, it is extremely unlikely that these mutations influence substrate specificity.

      C5.1: 2. In Fig 1D and 1H, the thresholds to define a gene or TE as differentially expressed are not strong. According to the figure legends, "any" change in terms of log2Fc was considered as DE and colored. I think the figures should illustrate better that the changes are subtle, by for example adding a dotted line (at least) in the value 0.5 of the y-axis. These subtle transcriptional changes should be reflected better in certain paragraphs where the expression of TEs are presented/and discussed as a hallmark of the absence of O-GlcNAcylation in the OGT-mutants. The same happens with Suppl Fig 3C (changes are very minor). Applying a stronger threshold, among the upregulated genes, only Xist will be significantly overexpressed. If a gentle threshold needs to be applied to this data, authors should at least justify the reasons behind doing so. Same for Fig2D.

      R5.1: The reviewer means Figure 2D for the MA plot of gene expression and Figure 2H for retrotransposon expression. These figures, like all MA plots, now include a dashed line to indicate log2FC = 0.5.

      The text is explicit about the subtle changes in transcription; it reads "with 2/3 of the genes downregulated and 90% of the significant changes below 1 log2FC"; "most of the OgtT931del/Y embryos showed a low magnitude upregulation of retrotransposons".

      The revised text states "Notably, most of the OgtT931del/Y embryos showed a low magnitude (log2FC < 1) upregulation of retrotransposons".

      We expand on this topic in the next response (R5.2) noting that changes in gene expression upon O-GlcNAc perturbation in different systems were previously characterized as subtle and widespread. We suggest that this phenotype may arise from the scarcely understood pleiotropic function of O-GlcNAc in fine-tuning gene expression; this phenotype could have a biological significance.

      C5.2: If a gentle threshold needs to be applied to this data, authors should at least justify the reasons behind doing so. Same for Fig2D.

      R5.2: Previous studies in different systems reported that O-GlcNAc perturbation causes a widespread change in gene expression of low magnitude (https://doi.org/10.1101/2024.01.22.576677, https://www.pnas.org/doi/10.1073/pnas.2218332120). To call differentially expressed genes, we use the same thresholds as a recent functional Ogt study in ES cells, specifically: p<0.05 (Wald test), any FC (Li et al. PNAS 2023, https://www.pnas.org/doi/10.1073/pnas.2218332120). The p value threshold is standard; the absence of an FC threshold reflects the insufficient knowledge of the significance of the low magnitude changes observed across many transcripts.
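
      The effect of the threshold choice described in R5.2 can be illustrated with a minimal Python sketch. The gene names, fold changes and p values below are made up for illustration and are not taken from the study; `GeneResult` and `call_de` are hypothetical helpers, not functions from our analysis code.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    gene: str      # placeholder identifier (hypothetical)
    log2fc: float  # log2 fold change, mutant versus WT
    pvalue: float  # p value from the differential expression test


def call_de(results, p_cutoff=0.05, min_abs_log2fc=0.0):
    """Return the genes called differentially expressed.

    With min_abs_log2fc=0.0, any fold change passes, so only the
    p value decides (the "p<0.05, any FC" rule described in R5.2).
    """
    return [
        r for r in results
        if r.pvalue < p_cutoff and abs(r.log2fc) >= min_abs_log2fc
    ]


# Made-up example rows, for illustration only.
rows = [
    GeneResult("gene_a", 2.1, 1e-6),
    GeneResult("gene_b", 0.3, 0.01),
    GeneResult("gene_c", -0.4, 0.03),
    GeneResult("gene_d", -0.8, 0.20),
    GeneResult("gene_e", 0.6, 0.04),
]

any_fc = call_de(rows)                      # p value only
strict = call_de(rows, min_abs_log2fc=0.5)  # adds the 0.5 line as a cutoff

print([r.gene for r in any_fc])
print([r.gene for r in strict])
```

      In this toy example the stricter rule discards the low magnitude changes (gene_b and gene_c), which is the class of change that the "any FC" rule is meant to retain.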

      C6: 3. In Figure 2B, the T931del allele was recovered in the blastocyst population with a very high frequency, even higher than the male WT group (T931del: 10; WT: 3). This observation suggests that the T931del allele did not significantly affect blastocyst survival. Further clarification or additional experiments might be necessary to understand the implications of this finding on early developmental stages.

      R6: This observation is only a hint, as the numbers of blastocysts recovered were too small to test for deviation from a Mendelian distribution; more experiments would be needed to perform such tests. These experiments are onerous because the low frequency of germline transmission is incompatible with maintaining this mutation by breeding heterozygous animals. A new mouse line would therefore need to be created by CRISPR-HDR targeting in the zygote in order to compute statistics on Mendelian ratios. Importantly, this question (does T931del affect blastocyst survival?) is peripheral, and the results of these experiments would not affect our conclusions in any way.
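
      For readers who wish to see what a test of Mendelian distribution involves, a minimal sketch of an exact two-sided binomial test follows. It assumes a 1:1 expected ratio of mutant to WT alleles among male blastocysts, which is an assumption made for illustration only; the counts in the usage line are those quoted by the reviewer in C6, and `binom_two_sided_p` is a hypothetical helper written with the Python standard library.

```python
from math import comb


def binom_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p value.

    Sums the probabilities of all outcomes that are no more likely
    than the observed count k out of n under success probability p.
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    # Small relative tolerance so that ties are not lost to rounding.
    cutoff = pmf[k] * (1 + 1e-9)
    return min(1.0, sum(q for q in pmf if q <= cutoff))


# Counts quoted in C6: 10 T931del versus 3 WT male blastocysts.
# The 1:1 expectation (p=0.5) is an assumption of this sketch.
p_value = binom_two_sided_p(10, 10 + 3)
print(f"two-sided exact p = {p_value:.3f}")
```

      With so few embryos, an exact test of this kind can only detect very large deviations from the expected ratio.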

      C7: 4. Similarly, in Figure 2G, there is an apparent higher expression of TE expression in the T931A/Y embryos group than in the T931del/Y group, which combined with the higher frequency of blastocyst generated in this latest group it may indicate a deeper molecular consequence after the deletion of the T931. A comparison of the transcriptome between these two cell lines would help to address this possibility. Also, the authors should compare the O-GlcNAc levels of WT, T931A, and T931del mutant blastocysts by immunostaining, similar to what was done in Figure S5F.

      R7: We agree that a direct comparison between the two mutations of the T931 residue would be interesting; however, this comment is very difficult to address experimentally for the reasons outlined below:

      Firstly, it is not possible to perform a statistical comparison of the T931A/Y versus T931del/Y transcriptomes with the data generated because the number of hemizygous T931A/Y embryos (n=2) is too small. Hence, it cannot be ruled out that the seemingly milder retrotransposon reactivation in one of the T931A/Y embryos occurred by chance.

      Secondly, considering the low magnitude of gene expression changes upon O-GlcNAc genetic perturbation, numerous (>>3) hemizygous blastocysts of each genotype would be needed to statistically assess the penetrance of the molecular phenotype and perform the differential expression analysis. Because females heterozygous for the T931 mutations transmit the mutant allele at very low frequency, these experiments require numerous de novo CRISPR injection sessions.

      Thirdly, for the immunostaining of O-GlcNAc to be semi-quantitative, a large number of hemizygous blastocysts for each genotype would be required (note that in Figure S5F, 29 morulae per condition were imaged), thus requiring numerous CRISPR injection experiments as discussed above. Moreover, O-GlcNAc changes could be subtler than expected based on the strong reduction of OGT activity, since Ogt expression is upregulated as a compensatory mechanism in the OgtT931A/del blastocysts (Fig. S2D), making quantification even more challenging despite a high number of stained embryos.

      In sum, these in vivo experiments are difficult and require sacrificing many animals (about 20 females per CRISPR injection experiment). Because the results would bring refinement to the study but would not change our conclusions, we suggest that the cost/benefit ratio is too high.

      C8: 5. In Boulard et al. 2019 O-GlcNAcylation was shown to be sufficient to modulate expression of DNA methylation-dependent TEs. It would be interesting to know (or at least discuss) if the changes in TE expression observed in OGT-mutant embryos in this study involve changes in DNA methylation. Ideally, some DNA methylation measurement optimized for low input numbers of cells would be useful.

      R8: Thank you for making the link with our previous study. In the PNAS paper, we report that targeted removal of O-GlcNAc at proteins bound to specific TEs (e.g. IAPez) causes their full-blown reactivation without detectable changes in DNA methylation, thus suggesting a role of the O-GlcNAc modification for the silencing of methylated TEs downstream or independent of DNA methylation. We agree that it would be informative to quantify DNA methylation in the T931-mutant blastocysts to test if the in vitro result is the same in vivo, but this would require performing onerous microinjection sessions as explained above.

      C9: 6. The data related with the OGT-degron system in MEFs seem disconnected with the rest of the manuscript. While the developmental models (blastocyst, etc) elegantly assess the contribution of O-GlcNAcylation to the control of cell survival and gene expression through the use of different OGT mutants, the degron system is a system of graded depletion that unfortunately was only possible to be used in MEFs (instead of embryos). Thus, the results obtained with the degron system in MEFs are difficult to intersect with the data from the use of OGT-mutants in embryos. Even though there are obvious interesting questions that one may want to know about this OGT degron MEF system, none of them would demonstrate a direct role for O-GlcNAcylation in cellular function, the major point addressed in the developmental system. Using the degron system in embryonic stem cells might have provided a more parallel comparison. The authors should discuss this point in more detail and either use ESC instead of MEFs or provide a stronger justification for the use of MEFs over ESC.

      R9: We thank the reviewer for their clear understanding of the system. The choice of primary MEFs as an in vitro model was imposed by technical limitations we encountered during the study. We fully agree that ES cells are the model of choice for preimplantation embryos; thus we initially derived ES cells and obtained only one male clone bearing the AID degron system. Upon auxin addition to the culture media, OGT's level remained unchanged in ES cells. Thus, the ES cell model was not usable. To test the AID degron in a different cell type, we then derived MEFs and showed its effectiveness (Figures 4C and S4C-E), which also allowed us to collect functional data on OGT's cellular function (Figures 4D-F). We took the comment on board and clarified the rationale for studying MEFs in the revised manuscript. We agree that it remains to be verified that the OGT-dependent pathways uncovered in MEFs are relevant in the preimplantation embryo. Despite this caveat, we feel the mouse model for endogenous OGT-degron, as well as the negative results in vivo and conclusions in MEFs, should be shared with the community, which could take advantage of our results to refine the system.

      Minor:

      C10: 7. In Fig 2C the color and shape codes are confusing to understand - there are some colors/shapes that are not represented in the PCA plot. The same in Fig 3H, where in the PCA plot there are pink triangles that do not match with the code legends.

      R10: We apologize for the confusion with the legends of Figures 2C and 3H, which we have made unambiguous in the revised version (as well as Figures S2B,C and S3C).

      C11: 8. In the figure legends of Figures 2D, 2E, 2F, and 2H, the notation should be corrected from "OgtT931A/Y" to "OgtT931del/Y".

      R11: This has been corrected; many thanks for bringing it to our attention.

      Significance

      C12: To investigate the function of OGT at specific developmental stages, the authors perturbed OGT's function in vivo by creating a murine allelic series featuring four single amino acid substitutions that variably reduced OGT's catalytic activity. The goal was to identify the direct effect of O-GlcNAcylation, using a sophisticated collection of genetic mutants to evaluate in vivo the role of this modification at early stages of development. Overall, the severity of embryonic lethality correlated with the extent of catalytic impairment of OGT, demonstrating that the O-GlcNAc modification is essential for early development.

      R12: We thank the reviewer for their clear understanding of our work and their appreciation of the biological importance of the findings.

      Reviewer #3

      Evidence, reproducibility and clarity

      C13: This is a conceptually interesting paper that attempts to leverage the knowledge of OGT catalysis to begin to dissect OGT function. The evidence is presented in a straightforward fashion and is in general well documented. The breeding strategies are well informed and the paper draws heavily on previous work carried out in the mouse.

      R13: We greatly appreciate the overall supporting review. However, we fail to understand what is meant by "the paper draws heavily on previous work carried out in the mouse". This comment may stem from a misunderstanding because this work is not based on any previously published study. Specifically, neither the seven murine alleles presented and analyzed nor the single embryo-transcriptomic data sets on which our conclusions are based have been published elsewhere.

      To put this work into context, before our study there were two seminal studies published two decades ago that reported the essential role of Ogt for mouse development, but no molecular profiling was performed (10.1073/pnas.100471497, 10.1128/mcb.24.4.1680-1690.2004). The two Ogt loss-of-function alleles studied in these papers were deemed not suitable for interrogating molecular phenotypes because they caused cell death, which confounds molecular profiling, and embryonic lethality at implantation, which prevents study of the sexually dimorphic role of Ogt in the placenta. To overcome this long-standing problem, we created seven new murine alleles, which allowed us to tease apart molecular phenotypes at key stages of mouse embryonic development, focusing on the blastocyst and the placenta.

      Significance

      C14: The paper describes tools which will help dissect the many potential roles of O-GlcNAc addition in early development. As it stands, this is a descriptive manuscript that will lead to hypothesis generation and testing and this should not be undervalued. The biological reagents produced and characterized will be of general interest to the field. Most of the findings presented represented a verification of existing ideas in the field but this is not meant as a criticism since part of the motivation for the approach was to generate a reproducible system for analyzing the biological phenomena.

      R14: We thank the reviewer for their appreciation of the importance of experimentally testing ideas shared in the field without direct evidence.

      However, we must respectfully disagree with the qualification of "descriptive manuscript". This qualification may stem from the particular difficulty of accessing the molecular details of how the O-GlcNAc modification exerts the biological functions we report. We are fully cognizant of the limitations of the study, which we address in the discussion section and in R20.2. However, we feel that the adjective "descriptive" is not a fair qualification because we provide numerous novel lines of functional evidence. Specifically, we introduce two novel orthogonal in vivo perturbations for endogenous Ogt that allowed us to interrogate for the first time its function in the developing mouse embryo. These perturbations allow us to draw causative (not descriptive) conclusions on the essential role of the O-GlcNAc modification itself for preimplantation development, its sexually dimorphic role in the placenta and its requirement in vivo for the stable repression of retrotransposons.

      C15: There are perhaps some bioinformatic shortcuts taken that may need to be corrected upon thorough review. These do not lessen the overall impact of the contribution.

      R15: All the code written for the bioinformatic analyses performed in this study is publicly available: https://github.com/boulardlab/Ogt_mouse_models_Formichetti2024. The reviewer needs to specify which bioinformatic analysis they suggest could be improved.

      Reviewer #4

      Evidence, reproducibility and clarity

      Summary

      C16: O-GlcNAcylation is the fundamental post-translational modification of numerous nuclear and cytosolic proteins. OGT is the sole enzyme catalyzing O-GlcNAc addition onto the proteins. The essentiality of OGT for early development and cellular viability has been established by using OGT-KO mice and cell lines. However, it remains to be elucidated whether the catalytic activity of OGT is required for the early development, and if the catalytic activity of OGT is required what are the functions of OGT or O-GlcNAcylation in early development due to a lack of appropriate mouse models. To overcome the technical difficulty of manipulating the levels of O-GlcNAcylation in early embryos, Formichetti et al. created the series of four mouse models (OgtY851A, OgtT931A, OgtQ849N, and OgtH568A) with different OGT activity by introducing single amino acid substitution in the catalytic domain. By analyzing the inheritance of the hypomorphic OGT alleles and the lethality of mouse embryos, they discovered OGT activity is a critical factor for early development. Subsequently, RNA-seq analyses with two mouse models showing the maternal inheritance of the hypomorphic OGT alleles indicated that severe hypo-OGT activity altered transcription and silencing of retrotransposon in preimplantation development while mild reduction of OGT's activity affected placental development in a sexually dimorphic manner rather than preimplantation development. Furthermore, to study the function of OGT at specific developmental stages, they developed a mouse model bearing endogenously AID-tagged OGT for acute degradation of OGT. Although the degron system wasn't efficient in preimplantation embryos, they discovered quick transcriptional changes upon OGT deletion in MEFs.
      The quality of the manuscript is good because the question to be solved was appropriately set, the approach was well designed, and their findings were interesting, although their writing was sometimes hard to understand as I raised in my following comments. Nevertheless, there are several points to be fixed before being published.

      R16: We thank the reviewer for their clear understanding of our work and their appreciation of the biological importance of the findings. Your comprehensive review of the manuscript and the questions you raised were extremely helpful in improving the manuscript and fully addressing its limitations. Below, we respond to each comment in full; we have also revised the manuscript to improve clarity and have included novel results.

      Major Comments

      C17: 1. Although the authors showed in vitro activity of each mutant of OGT used in this manuscript by referencing the previous literature, they never showed the levels of global O-GlcNAcylation (and OGT itself) in their established mouse embryos. Although it could be impossible to determine O-GlcNAc levels in OgtQ849N and OgtH568A embryos because of the lack of germline transmission and founder line, respectively, they could do that in OgtY851A and OgtT931A embryos. Given that Y851A and T931A mutants had similar VMAX/KM with different VMAX, it is possible that their activity is comparable or Y851A has even lower activity in vivo depending on the concentration of UDP-GlcNAc in embryos. Therefore, it is critical to assess whether in vivo OGT activity is correlated with that in vitro as expected to conclude that severity of sub-Mendelian inheritance is proportional to the reduction of activity of OGT in vivo. Moreover, since the authors developed the elegant system to deplete OGT, the activity of Q849N and H568A mutant OGT can be examined at least in cells by expressing them in MEFs with OGT-degron system. Thus, I propose determination of global O-GlcNAc levels compensated by OGT levels by western blotting in OgtY851A, and OgtT931A embryos or MEFs with the OGT degron system re-expressing the individual four mutant OGTs. If the protein amount is insufficient for western blotting in the embryos because of the sizes of the earlier stages of embryos, I believe the author could address this by utilizing immunofluorescence as shown in Figure S5.

      R17: We fully agree that this is an important point that requires revision. The only mutation for which the level of O-GlcNAc and OGT can be assessed by western blot in vivo is Y851A, the other mutations resulting in embryonic lethality before the blastocyst stage.

      We have included in the revised manuscript western blot analyses of protein expression for OGT, OGA and O-GlcNAc levels in the placenta of the OgtY851A mutants (new Figures 3C,D). The new data show that OGT is upregulated at the protein level in homozygous females, in good agreement with our transcriptomic analysis. Furthermore, O-GlcNAc levels were slightly reduced in homozygous and hemizygous placentae thus showing the impact of the point mutation on global O-GlcNAc levels in the placentae. Moreover, the analysis of OGA protein level unexpectedly revealed the enrichment of a previously uncharacterized OGA fast migrating isoform in hemizygous and homozygous placentae.

      We agree that it would be informative to compare O-GlcNAc levels in OgtT931A versus OgtY851A embryos. A comparison implies performing the experiment at the same developmental stage, which has to be the blastocyst stage or prior because T931A/Y embryos die around implantation. Because the blastocyst is made of approximately 140 cells, many single blastocysts would need to be pooled to obtain the necessary protein input for a western blot. We are not aware of another study performing western blots on pooled blastocysts. An additional major challenge for this experiment is the need to genotype and sex the blastocysts before pooling. Thus, the feasibility of this experiment is uncertain.

      As an alternative, the reviewer suggests measuring O-GlcNAc levels in the degron MEFs after introduction of OGT transgenes bearing the mutation studied. This experiment would not be conclusive because of residual O-GlcNAc after OGT degradation (Figure S4E). Furthermore, the O-GlcNAc proteome is dynamic during development (as shown in the developing brain by Liu et al. https://doi.org/10.1371/journal.pone.0043724), therefore the MEFs results would have limited value to explain our results in the early embryo.

      In sum, available technologies to quantify O-GlcNAc (e.g. western blot, mass spectrometry) are inadequate for low-input samples such as the early embryo. However, our series of hypomorphic alleles, backed up by in vitro enzymology measurements, brings indirect evidence to this question. Specifically, the qualitative correlation between the measured OGT activity in vitro and the developmental phenotype indicates that the resulting relative levels of O-GlcNAc are consistent with in vitro measurements.
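
      The kinetic argument raised in C17 (two mutants with similar VMAX/KM but different VMAX may differ in activity depending on the UDP-GlcNAc concentration) can be sketched numerically with the Michaelis-Menten equation, v = VMAX[S]/(KM + [S]). The parameter values below are hypothetical, in arbitrary units, and chosen only so that both enzymes share the same VMAX/KM; they are not the measured values for any OGT mutant.

```python
def mm_rate(s, vmax, km):
    """Michaelis-Menten rate v = Vmax * [S] / (Km + [S])."""
    return vmax * s / (km + s)


# Hypothetical enzymes with identical Vmax/Km (0.1) but different Vmax.
enzyme_a = {"vmax": 10.0, "km": 100.0}
enzyme_b = {"vmax": 2.0, "km": 20.0}

for s in (0.01, 1.0, 100.0, 1e6):  # substrate concentration, arbitrary units
    va = mm_rate(s, **enzyme_a)
    vb = mm_rate(s, **enzyme_b)
    print(f"[S]={s:g}: v_a={va:.4g}, v_b={vb:.4g}, ratio={va / vb:.3f}")
```

      At low substrate concentration the two rates are nearly equal, because the rate is governed by VMAX/KM; at saturating substrate the ratio approaches the ratio of the VMAX values. This is why the in vivo ranking of two such mutants cannot be read from VMAX/KM alone.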

      C18.1: 2. I didn't understand why the authors couldn't find any founder lines of the OgtH568A mutant. Was that because mosaic mice with OgtH568A mutation are lethal?

      R18.1: To answer this question, it is important to recall two key features of the biological system:

      1) The mutation H568A was reported to disrupt the glycosyltransferase activity completely (10.1038/nsmb.1443). Hence, OGT-H568A is catalytically dead.

      2) We performed the CRISPR-HDR targeting in the 1-cell embryo.

      Based on these premises, the absence of F0 animals with the OgtH568A mutation (0/31) suggests that introducing this mutation causes embryonic lethality in both males and females. This hypothesis is consistent with the previously reported lethality around implantation of Ogt-null alleles (10.1128/mcb.24.4.1680-1690.2004). It is possible that the sgRNA is very efficient and results in homozygous mutations in all female zygotes injected (as we have not obtained heterozygous females bearing these mutations). High efficiency of the targeted mutagenesis in the zygote results in mutants where all or the majority of cells bear the mutation (no or low mosaicism). The high number of microinjections performed (416 embryos over the 3 injection sessions) allows us to make these claims.

      C18.2: Also, I believe there was no explanation why the OgtQ849N allele showed no maternal inheritance. Was that because Q849N possesses enough activity for sustaining mosaic embryos, but not oocytes? The authors should better explain these points in the manuscript text.

      R18.2: Thank you for this comment; we agree that this maternal effect phenotype demands further explanation.

      The phenotype observed suggests two possibilities: either that the oocyte cannot mature or that the cleavage-stage embryo cannot develop with the resulting lower levels of O-GlcNAc. The cleavage-stage embryo does not transcribe a catalytically active OGT before the 8-cell stage and thus relies on the OGT protein inherited from the oocyte until this stage (https://doi.org/10.1101/2024.01.22.576677).

      We added this interpretation of the result to the text:

      "The lack of maternal transmission of the Q849N allele from seemingly mosaic founder females is likely explained by the reliance of the cleavage stage embryo onto the oocyte payload of OGT and O-GlcNAc modified proteins. Specifically, Ogt's exons encoding for the catalytic domains are not detectable before the 8-cell stage, while OGT full-length protein is present and thus maternally inherited (Formichetti et al, 2024)."

      C19: 3. The authors serendipitously found a T931del-allele in the "WT" allele of the OgtT931A line, and suggested that T931del had milder activity loss, although the lethality of embryos was greatly mitigated. Nevertheless, transcriptome analyses in male blastocysts revealed that 120 genes' expression was changed in T931del/Y males. This raised the question about which mutant OGT has higher activity, Y851A or T931del. I think comparing the activity of Y851A and T931del mutants in MEFs with OGT-degron system is important to confirm the proportional relationship between activity and phenotypic severity.

      R19: We agree that it is a limitation that the effect of the T931del mutation on OGT activity has not been biochemically characterized. However, the important point here is that our assessment of phenotypic severity based on maternal inheritance of the mutant allele and embryonic lethality is based on the point mutations for which the catalytic activity has been determined, namely Y851A, T931A, Q849N and H568A, but not T931del.

      We studied the serendipitously discovered T931del mutation to obtain transcriptional insights in the blastocyst. Because the deleted residue T931 is key for the binding to the donor substrate, we can reasonably assume that this mutation affects the catalytic activity, albeit to an undetermined level.

      Hence, our conclusions regarding the requirement of O-GlcNAcylation for development are unaffected by the lack of biochemical knowledge on T931del.

      C20.1: 4. Regarding transcriptomes of T931del/Y, the authors found the upregulation of proteasomal activity and stress granules along with the downregulation of amino acid metabolism, mitochondrial respiration, and so on. To validate the results, the authors should perform qPCR on several up- or down-regulated genes.

      R20.1: We agree that, in principle, qPCR validation is desirable. However, this validation experiment is particularly expensive in this case because it requires numerous CRISPR zygote pronuclear injection sessions.

      The conclusions of the RNA-seq analysis are strongly supported by a high number of biological replicates (n=10). This high number of biological replicates was essential to obtain sufficient statistical power to quantify with a high level of confidence transcriptional changes of low magnitude (below 2-fold change, see R5.1 and R5.2).

      Therefore, the qPCR validation experiment would require repeating the CRISPR zygote pronuclear injection sessions with the same high number of animals. This represents a major investment in experimental work and the sacrifice of about 40 animals. Importantly, the RNA-seq results presented are robust because of the high number of biological replicates and the high number of sequencing reads per sample. Thus, we argue that qPCR validation is not essential and that the high cost of this experiment is difficult to justify.

      C20.2: In addition, according to Figure S2E, the authors pointed out that at least for genes upregulated in OgtT931A embryos, the changes were not explained by a developmentally delayed transcriptome, suggesting that upregulation of these genes was the cause of developmental delay. Therefore, I strongly encourage them to discuss in the manuscript text how up-regulated genes could contribute to developmental delay.

      R20.2: Throughout the manuscript, we have been cautious to avoid establishing causal relationships between the differentially expressed genes uncovered and the developmental phenotypes (e.g. delayed development). Two main obstacles prevent us from establishing causality with the data available. Firstly, it is not possible to disentangle differential gene expression from developmental delay (in other words, we have no way to tell which is the cause and which is the consequence). Secondly, O-GlcNAc modifies over 5000 proteins and the developing embryo is a particularly dynamic system; thus we cannot know whether the differentially expressed promoters are direct targets of O-GlcNAc modified proteins or, alternatively, a secondary effect of another molecular alteration (for example, of the proteome). We address this limitation of the study in the discussion section.

      C21: 5. Regarding the transcriptome in OgtY851A mice, Y851A/Y male mice had huge transcriptomic differences, while Y851A/Y851A female mice barely had any. Although it seems to agree with the number of Ogt alleles, I wonder whether other X-linked genes expressed higher in female placenta as shown in Figure 3C could attenuate the effects of decreased OGT activity. I don't think this possibility can be excluded, unless the authors further decrease OGT activity in Y851A/Y851A female placenta and obtain the similar results as for male placenta. Or if they compared the levels of global O-GlcNAcylation between Y851A/Y and Y851A/Y851A mouse placentas and discovered they had similar levels of O-GlcNAcylation, then the authors could conclude that the number of Ogt alleles was not the reason of sexual-dimorphism. The authors should determine the levels of O-GlcNAcylation in Y851A/Y and Y851A/Y851A mouse placentas and/or at least discuss the above possibilities in the manuscript text.

      R21: Thank you for the thoughtful feedback. We agree that the most likely explanation for the higher sensitivity of male placentas, as compared to female placentas, to reduced OGT activity is the difference in Ogt copy number, especially because Ogt escapes X-chromosome inactivation in the placenta (new Figure S3A).

      Western blot quantification of global O-GlcNAc levels has now been performed (new Figures 3C,D). We measured similar levels of O-GlcNAc in Y851A/Y and Y851A/Y851A placentas (lower than in WT males in both cases), but we cannot exclude that the western blot lacks the dynamic range required to detect a subtle difference. In fact, homozygous females were expected to have an intermediate level between WT males and hemizygous males, and the difference between the two male genotypes (also considering sample-to-sample variability) is already small when quantified from the blot (new Figure 3D). If O-GlcNAc levels are indeed identical in homozygous females and hemizygous males, it is possible that an X-linked modifier attenuates the impact of hypo-O-GlcNAcylation in the female mutant placenta. Thank you for this idea, which we included in the revised manuscript:

      "Of note, the lower sensitivity of the homozygous females' transcriptome to Ogt disruption (Fig. 3F,I and S3B) seems difficult to reconcile with their (lower) O-GlcNAc level, which is comparable to that of the hemizygous males (Fig. 3C). It is possible that the western blot technique is not sensitive enough to detect subtle differences in O-GlcNAcylation. An alternative hypothesis, if O-GlcNAc levels were truly identical between Y851A/Y and Y851A/Y851A, could be the existence of a modifier in females that could be an XCI escapee."

      C22: 6. In terms of the transcriptome in OgtY851A mice, similar to comment 4, the authors should confirm their transcriptomics data shown as Figure 3D by qPCR. In addition, the authors should describe the potential mechanisms by which the differentiation of precursor cells of LaTPs and JZPs were disrupted. Were master regulators of the differentiation known to be O-GlcNAcylated and loss of O-GlcNAcylation perturbed the function?

      R22: As for the whole embryo discussed in R20.2, we also interpret the gene expression phenotype observed in the placenta cautiously. Specifically, we state in the manuscript that it could be caused either by an impact of lower O-GlcNAcylation on placental differentiation or by a general delay in placentation or in the development of the embryo as a whole. The hypothesis of a general delay (of the whole embryo and/or of placental formation specifically) is supported by the downregulation of essentially all markers of more differentiated cell types and the upregulation of the precursor marker. We favor this hypothesis because it is consistent with what was observed with the T931 mutants and also with the enzymatic removal of O-GlcNAc in the zygote (Formichetti et al., 2024 BioRxiv). Because of the thousands of O-GlcNAcylated proteins present in the cell, it is impossible to know which molecular mechanism is responsible; it could even start at much earlier stages.

      Minor Comments

      C23: 1. Regarding DFP461-463 mutant, I couldn't understand the point of this figure because the results had no difference, and the meaning of the mutation was quite different from the others. Thus, the figure was awkward and a little confusing to me. If the authors still want to include the figures, I would suggest that they should reorganize the position of the figure (maybe after figure 3 is better to show you had tried to investigate the effects of nuclear localization of OGT on the changes of transcriptomes) and add some results. Since WT OGT seems to be localized mainly in the cytosol at steady state (Figure S1B and S1C), the effect of mutation on its nuclear localization should not be obvious. Therefore, it is difficult to conclude the mutation had no effect on the nuclear localization unless the ratio of nuclear and cytosol localization is quantified. Also, I wonder whether the O-GlcNAc levels of nuclear and cytosolic proteins in the mutant cells were comparable to those in WT cells. If this is the case, the results would also support the authors' conclusion.

      R23: We took the comments on board and made it clearer that the rationale for the DFP461-463 mutant was an attempt to separate OGT's nuclear and cytosolic functions. We fully agree that these results are peripheral, and thus we presented these results in Supplementary Figure 1 (not in the main figure).

      The biochemical evidence presented in Fig S1C shows that the genetic substitution of DFP to AAA on endogenous OGT has no detectable impact on its nuclear localization in primary MEFs. This result is far more authoritative than the evidence provided by Seo et al. 2016 (doi: 10.1038/srep34614), which is based on the overexpression of OGT transgenes in HeLa cells. Importantly, Seo et al. 2016 did not assess the impact of their mutations on endogenous OGT.

      We believe that the negative results we obtained with the DFP461-463 mouse model should be extremely valuable for the field. Firstly, science can move forward only if both negative and positive results are shared. In this specific case, we found that mutation of endogenous OGT in MEFs yielded a different result from the previously reported overexpression of the same mutant construct in HeLa cells. Secondly, we want to make the Ogt-NLS- mouse model available for further investigations.

      C24: 2. Since OGT or O-GlcNAcylation regulates chromatin status, the authors analyzed the gene expression profiles of retrotransposons in T931del/Y or T931A/Y mice. Is it possible to investigate if the release of gene silencing is also seen in non-retrotransposon genes? I assumed retrotransposons might be a well-established system to analyze gene silencing status, however, if the authors could find similar effects on genes other than retrotransposons, that would be highly valuable.

      R24: This is an interesting idea. This notion refers to the activation of promoters that are normally epigenetically repressed (i.e. silent despite the presence of all trans-acting factors required for their expression). Epigenetically repressed promoters include retrotransposons, imprinted genes and germline-specific genes that are normally expressed in germ cells and maintained in a repressed state in somatic cells (10.1038/s41580-019-0159-6). Testing mono-allelic expression of imprinted genes requires F1 hybrids. Thus, we assessed whether well-studied germline-specific genes could be released from silencing in T931del/Y or T931A/Y blastocysts and found no evidence for it (see dot plot below). The unbiased transcriptomic analysis presented in the manuscript shows that the products of upregulated genes are enriched in mRNA processing (Figure 2E), but these genes are not normally epigenetically repressed. Thus, contrary to retrotransposons, the role of O-GlcNAc at cellular gene promoters appears not to be linked to epigenetic silencing. This could be explained by the many different protein substrates of O-GlcNAc.

      C25: 3. OgtY851A mice with milder OGT activity loss didn't exhibit impaired preimplantation development, but did display postimplantation development such as placental development, suggesting that O-GlcNAcylation of proteins required for preimplantation and postimplantation development relies on different degrees of OGT activity. I wonder whether global O-GlcNAc levels in embryos in preimplantation and postimplantation developmental stages are different or not. This might include both the pattern of blotting and intensities. The results would give the authors an explanation why the dependency on OGT activity was different in two developmental stages. Can the authors provide data? If not, then the authors should at least describe hypotheses in the manuscript to address these questions.

      R25: We recently reported that the subcellular patterns of O-GlcNAc are highly dynamic during preimplantation development (Formichetti et al. 2024, BioRxiv). The most striking O-GlcNAc remodeling we observed is the enrichment of nuclear O-GlcNAc as compared to cytoplasmic O-GlcNAc that is concomitant to embryonic genome activation (Formichetti et al. 2024, BioRxiv). We quantified the ratio of the nuclear/cytoplasmic signal by immunofluorescence, but absolute quantification is not possible with this method. Due to the limited number of cells of the preimplantation embryo, this analysis cannot be performed by western blot. Hence, there is no appropriate method to quantitatively compare O-GlcNAc levels between preimplantation and postimplantation embryos.

      C26: 4. The authors' AID-degron system elegantly worked in MEFs but was inefficient in preimplantation embryos. I wonder if this was because of the high expression of the shorter isoform of OGT detected as OGTp78 in the author's western blot. Is it possible to examine this possibility in the embryos? Either way, the authors should describe a potential explanation for why the efficiency in the embryos was low. In addition, the authors should describe why they inserted the AID tag only into the longest OGT isoform.

      R26: This is a good point. The smallest isoform, OGTp78, bears the catalytic domain and thus can partially compensate for the degradation of OGTp110. Note that the level of OGTp78 is low and does not increase upon OGTp110 degradation; thus any compensation can only be partial (Figures S4A and S4D). Alternative hypotheses for the ineffectiveness of the degron system in ex vivo grown embryos include: i) an expression level of OsTIR that may be too low in the early embryo (Rosa26 promoter not being activated at EGA), ii) a possible steric hindrance of the N-terminal AID tag in these cells, iii) a likely suboptimal auxin concentration, which had to be kept low because of its toxicity to the embryo. Testing these possibilities is very difficult in preimplantation embryos.

      It is unclear how the OGTp78 isoform is produced; it was hypothesized to originate from an alternative transcription start site (https://doi.org/10.1007/s00335-001-2108-9). We initially attempted to target both isoforms by inserting the AID tag at the C-terminus, but we were unsuccessful in producing this mouse model. It is possible that the C-terminus that is near the catalytic site cannot tolerate the AID knock-in.

      C27: 5. In Figure S1C, is the band detected right below OGTp78 in nuclei fractions non-specific or do both bands correspond to OGTp78 ?

      R27: To answer this question, a knockout control would be needed. Because OGTp78 is not targeted by our AID-degron, we cannot test the specificity of these bands using our perturbation toolkit.

      C28: 6. Figure 1D top row third column: hemizgous -> hemizygous

      R28: Many thanks; the embarrassing typo has been corrected.

      C29: 7. Figure 1D second row third column: hemyzygous -> hemizygous

      R29: Thanks for bringing this other typo to our attention; it is now corrected.

      Reviewer #4 (Significance (Required)):

      General assessment: strengths and limitations

      C30: Strength: This manuscript elegantly revealed the requirement of OGT in mammalian development by taking advantage of knock-in mouse models with different OGT activity. In addition, the manuscript provided interesting and important transcriptomics data in both pre- and post-implantation embryos of OGT mutant mice. These data sets could help explain in the future the detailed mechanisms by which OGT or O-GlcNAcylation regulates mammalian development. Furthermore, the AID-tagged OGT system would be a useful tool for other researchers studying OGT function.

      Limitation: Although they found interesting changes in the transcriptomes of developing mice with different OGT activity, they lack data showing how these changes caused the observed phenotypes. In other words, there are fewer mechanistic insights into the developmental problems seen in mice with different OGT activity.

      In addition, although I agree that the question of whether OGT activity itself is crucial for the early development of mammals has long remained unresolved, I assume people thought OGT activity is actually important for mammalian development through the observation of OGT-linked congenital disorders of glycosylation.

      Therefore, I would say the novelty of the manuscript is a little less impactful. Furthermore, although AID-tagged OGT system revealed fundamental questions regarding the transcriptional changes upon acute depletion of OGT in cellular levels, the system was inefficient in mouse embryos. So, they showed nothing about developmental-stage specific requirements of OGT.

      Advance: The manuscript can fill a current gap regarding requirement of OGT in mammalian development. Also, the manuscript developed a series of mutant mice with different OGT activity and an AID-tagged OGT mouse line. These mice provide technical advances.

      Audience: The manuscript will be of interest to researchers in specific fields such as glycobiology, developmental biology, and clinical fields.

      Describe your expertise: Biochemistry, Glycobiology, Cell biology

      R30: We are thankful for the constructive and supportive review.

      We fully agree with the limitations of the study and discussed them in the manuscript. Our in vivo approach revealed the most phenotypically relevant transcriptional phenotypes resulting from OGT catalytic impairment during embryonic development. We make the mouse models created for this study available to the community to facilitate follow-up studies aiming at exploring the underlying molecular details.

      As pointed out in the comments, the requirement of OGT glycosyltransferase activity for mammalian development was widely assumed by the field, but this belief was without direct experimental evidence. This study provides the first in vivo evidence for this important conclusion.

      Conclusion: The reviewers' comments were tremendously useful for improving the clarity of the manuscript and adding important new in vivo evidence. We note that none of the reviewers provided any reason to doubt our important conclusions:

      • The demonstration that the enzymatic activity of Ogt, thus the O-GlcNAc modification itself, is essential for preimplantation development.
      • The finding that a mild reduction of OGT's activity is sufficient to perturb the silencing of multiple families of retrotransposons in the growing embryo.
      • The indication, from transcriptomes of hypo-O-GlcNAcylated embryos, of a developmental retardation upon a mild O-GlcNAc perturbation.

      • The discovery that OGT's rapid depletion in vitro downregulates basal cellular function, including translation. This result provides mechanistic support to the embryonic growth delay resulting from decreasing O-GlcNAc in vivo.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Comments to authors

      To investigate the function of OGT at specific developmental stages, the authors perturbed OGT's function in vivo by creating a murine allelic series featuring four single amino acid substitutions that variably reduced OGT's catalytic activity. The goal was to identify the direct effect of O-GlcNAcylation, using a sophisticated collection of genetic mutants to evaluate in vivo the role of this modification at early stages of development. Overall, the severity of embryonic lethality correlated with the extent of catalytic impairment of OGT, demonstrating that the O-GlcNAc modification is essential for early development.

      The study represents a substantial advance in our understanding of OGT and O-GlcNAcylation in mammalian development. The creation of novel murine models and inducible systems is an important contribution, providing powerful tools for future research in this field. The insights into the role of OGT's catalytic activity and its involvement in epigenetic regulation during embryonic development are noteworthy, opening new avenues for research. However, there are a few considerations and concerns:

      Major:

      1. An assumption of the study is that different mutations cause different levels of O-GlcNAcylation rather than alterations in substrate specificity. It might be important to test, at least in cultured cells, that the different mutations do not change the preference of OGT to modify certain proteins rather than others, which can provide alternative explanations for their findings.
      2. In Fig 1D and 1H, the thresholds used to define a gene or TE as differentially expressed are not stringent. According to the figure legends, "any" change in terms of log2FC was considered as DE and colored. I think the figures should better illustrate that the changes are subtle, for example by adding (at least) a dotted line at the value 0.5 of the y-axis. These subtle transcriptional changes should be better reflected in the paragraphs where the expression of TEs is presented and discussed as a hallmark of the absence of O-GlcNAcylation in the OGT mutants. The same applies to Suppl Fig 3C (changes are very minor). Similarly, in Fig 2C, the changes in gene expression are lower than log2FC 1 (which represents a doubling in absolute expression). Applying a more stringent threshold, only Xist would be significantly overexpressed among the upregulated genes. If a lenient threshold needs to be applied to these data, the authors should at least justify the reasons for doing so. Same for Fig 2D.
      3. In Figure 2B, the T931del allele was recovered in the blastocyst population with a very high frequency, even higher than the male WT group (T931del: 10; WT: 3). This observation suggests that the T931del allele did not significantly affect blastocyst survival. Further clarification or additional experiments might be necessary to understand the implications of this finding on early developmental stages.
      4. Similarly, in Figure 2G, TE expression appears higher in the T931A/Y embryo group than in the T931del/Y group, which, combined with the higher frequency of blastocysts generated in the latter group, may indicate a deeper molecular consequence of the deletion of T931. A comparison of the transcriptomes of these two mutant lines would help to address this possibility. Also, the authors should compare the O-GlcNAc levels of WT, T931A, and T931del mutant blastocysts by immunostaining, similar to what was done in Figure S5F.
      5. In Boulard et al. 2019 O-GlcNAcylation was shown to be sufficient to modulate expression of DNA methylation-dependent TEs. It would be interesting to know (or at least discuss) if the changes in TE expression observed in OGT-mutant embryos in this study involve changes in DNA methylation. Ideally, some DNA methylation measurement optimized for low input numbers of cells would be useful.
      6. The data related to the OGT-degron system in MEFs seem disconnected from the rest of the manuscript. While the developmental models (blastocyst, etc.) elegantly assess the contribution of O-GlcNAcylation to the control of cell survival and gene expression through the use of different OGT mutants, the degron system is a system of graded depletion that unfortunately could only be used in MEFs (instead of embryos). Thus, the results obtained with the degron system in MEFs are difficult to intersect with the data from the use of OGT mutants in embryos. Even though there are obvious interesting questions that one may want to ask about this OGT degron MEF system, none of them would demonstrate a direct role for O-GlcNAcylation in cellular function, the major point addressed in the developmental system. Using the degron system in embryonic stem cells might have provided a more parallel comparison. The authors should discuss this point in more detail and either use ESCs instead of MEFs or provide a stronger justification for the use of MEFs over ESCs.
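
      The threshold concern raised in point 2 can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the gene labels, the 0.05 significance cutoff and all numeric values below are hypothetical. It only shows how the same set of calls shrinks when a log2 fold-change cutoff is added to the significance filter, and that a log2FC of 1 corresponds to a two-fold change.

```python
def fold_change(log2fc: float) -> float:
    """Convert a log2 fold change into a linear fold change."""
    return 2.0 ** log2fc


def classify(log2fc: float, padj: float,
             lfc_cutoff: float = 0.0, alpha: float = 0.05) -> str:
    """Label a feature 'up', 'down' or 'ns' (not significant).

    lfc_cutoff=0.0 mimics calling 'any' significant change as DE;
    lfc_cutoff=1.0 additionally requires at least a two-fold change.
    Both cutoffs are illustrative assumptions, not values from the study.
    """
    if padj >= alpha or abs(log2fc) <= lfc_cutoff:
        return "ns"
    return "up" if log2fc > 0 else "down"


# Hypothetical (log2FC, adjusted p-value) pairs, for illustration only.
results = {
    "gene_a": (0.3, 0.001),
    "gene_b": (1.4, 0.001),
    "gene_c": (-0.6, 0.200),
}

lenient = {g: classify(lfc, p) for g, (lfc, p) in results.items()}
strict = {g: classify(lfc, p, lfc_cutoff=1.0) for g, (lfc, p) in results.items()}
```

      Under these made-up numbers, `gene_a` is called with the lenient setting but not with the stricter one, which is the kind of difference a dotted reference line in the figures would make visible.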

      Minor:

      1. In Fig 2C the color and shape codes are confusing to understand - there are some colors/shapes that are not represented in the PCA plot. The same in Fig 3H, where in the PCA plot there are pink triangles that do not match with the code legends.
      2. In the figure legends of Figures 2D, 2E, 2F, and 2H, the notation should be corrected from "OgtT931A/Y" to "OgtT931del/Y".

      Significance

      To investigate the function of OGT at specific developmental stages, the authors perturbed OGT's function in vivo by creating a murine allelic series featuring four single amino acid substitutions that variably reduced OGT's catalytic activity. The goal was to identify the direct effect of O-GlcNAcylation, using a sophisticated collection of genetic mutants to evaluate in vivo the role of this modification at early stages of development. Overall, the severity of embryonic lethality correlated with the extent of catalytic impairment of OGT, demonstrating that the O-GlcNAc modification is essential for early development.

    1. Reviewer #2 (Public Review):

      Pyoverdines, siderophores produced by many Pseudomonads, are one of the most diverse groups of specialized metabolites and frequently used as model systems. Thousands of Pseudomonas genomes are available, but large scale analyses of pyoverdines are hampered by the biosynthetic gene clusters (BGCs) being spread across multiple genomic loci and existing tools' inability to accurately predict amino acid substrates of the biosynthetic adenylation (A) domains. The authors present a bioinformatics pipeline that identifies pyoverdine BGCs and predicts the A domain substrates with high accuracy. They tackled a second challenging problem by developing an algorithm to differentiate between outer membrane receptor selectivity for pyoverdines versus other siderophores and substrates. The authors applied their dataset to thousands of Pseudomonas strains, producing the first comprehensive overview of pyoverdines and their receptors and predicting many new structural variants.

      The A domain substrate prediction is impressive, including the correction of entries in the MIBiG database. Their high accuracy came from a relatively small training dataset of A domains from 13 pyoverdine BGCs. The authors acknowledge that this small dataset does not include all substrates, and correctly point out that new sequence/structure pairs can be added to the training set to refine the prediction algorithm. The workflow unfortunately cannot differentiate between different variants of Asp and OHOrn. To validate their predictions, they elucidated structures of several new pyoverdines, and their predictions performed well. The authors tested their workflow on Burkholderiales A domains and had good results, suggesting it can be used on other taxa. Skimming through the source code and data, the algorithm itself appears to be sound and a clear improvement over existing tools for pyoverdine BGC annotation.

      Predicting outer membrane receptor specificity is likewise a challenging problem and the authors have made a promising achievement by finding specific gene regions that differentiate the pyoverdine receptor FpvA from FpvB and other receptor families. Their predictions were not tested experimentally, but the finding that only predicted FpvA receptors were proximate to the biosynthesis genes lends credence to the predictive power of the workflow. The authors find predicted pyoverdine receptors across an impressive 468 genera, an exciting finding for expanding the role of pyoverdines as public goods beyond Pseudomonas. However, whether or not these receptors can actually recognize pyoverdines (and if so, which structures!) remains to be investigated.

      In all, the authors have assembled a rich dataset that will enable large scale comparative genomic analyses. This dataset could be used by a variety of researchers, including those studying natural product evolution, public good eco/evo dynamics, and NRPS engineering.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Review: 

      This study used ATAC-Seq to characterize chromatin accessibility during stages of GABAergic neuron development in induced pluripotent stem cells (iPSCs) derived from both Dravet Syndrome (DS) patients and healthy donors. The authors report accelerated GABAergic maturation to a point, followed by further differentiation into a perturbed chromatin profile, in the cells from patients. In a preliminary analysis, valproic acid, an anti-seizure medication commonly used in patients with DS, increased open chromatin in both patient and control iPSCs in a nonspecific manner, and to different degrees in cultures derived from different patients. These findings provide new information about DS-associated changes in chromatin, and provide further evidence for developmental abnormalities in interneurons with DS. 

      Strengths:

      This is a novel study that aims to investigate the epigenetic changes that occur in a sodium channel model of epilepsy; these changes are often ignored but may be an interesting area for future therapeutics. In general, the flow of the paper is good, and the figures are well-designed.

      Reply: Thank you for your positive feedback about our work.

      Weaknesses:

      The most substantial weakness relates to the observation that DS is often viewed as a monogenic form of epilepsy. It is directly linked to SCN1A gene haploinsufficiency (Yu et al, 2006; Ogiwara et al, 2007). The gene product is Nav1.1, the alpha subunit of voltage-gated sodium channel type I that regulates neuronal excitability. Yet, analysis was conducted at time points of GABAergic interneuron differentiation in which SCN1A is likely not expressed. The paper would be strengthened if SCN1A expression and Nav1.1 protein were examined across the experimental time course. If SCN1A is not yet expressed, this would complicate any explanation of how the observed epigenetic changes might arise. It also seems counterintuitive that the absence of a sodium channel can accelerate differentiation, when, a priori, one might expect the opposite (a 'less neuronal' signal). 

      Thanks, this is an important point! In our revised manuscript, we have incorporated data on the expression of SCN1A at d19 and d65 of GABAergic development in both the control and patient groups. We first retrieved data from our previous RNA-Seq analysis, showing SCN1A gene expression in our cells at both d19 and d65. We have now updated our text on SCN1A gene expression in the revised manuscript (Revised Supplementary Figure 1A, revised text Line 108-109). Second, we confirmed the dynamics of SCN1A expression by real-time quantitative RT-PCR analysis at four time points of GABAergic development (d0, d19, d35 and d65). Notably, expression of SCN1A was detected by qRT-PCR from d19 onwards and increased with differentiation. We have now included this information in the revised manuscript (Revised Supplementary Figure 1B, revised text Line 112).

      Related to this, another important limitation of the study is that the controls are cells derived from healthy individuals and not from isogenic lines. The usage of isogenic lines is extremely relevant for every study in which iPSC-derived somatic cells are used to model a disease, but specifically in diseases like DS, in which the genetic background has an ascertained impact on disease phenotype (Cetica et al, 2017 and others). This serious limitation should be considered.

      Yes, we fully agree that isogenic, edited patient-derived iPSCs would have been the ideal controls. At an early stage we therefore invested considerable time and effort in generating isogenic lines from patient-derived iPSCs. However, editing of the SCN1A variants in patient-derived iPSCs was unsuccessful after several trials and modifications, so we finally turned to iPSCs from healthy donors. This is now discussed together with other limitations of our study in the revised manuscript (end of discussion section, lines 499-506).

      In addition, the authors should provide data on variability across cell lines and differentiations to help convince the reader that the results can be attributed to genetic defects, rather than variability across individuals. 

      This is a valuable point. In the revised manuscript, we have now added plots and IF staining from individual samples to give the readers a complete picture on how they are distributed (Revised Supplementary Figure 1C, Revised Supplementary Figure 2, and Revised Supplementary Figure 4).

      In the revised manuscript, we explain in more detail the strategy used to compare the two groups (cases vs. controls). In our analysis, we first compared the dynamic changes of chromatin accessibility cell line by cell line across differentiation. We then extracted the common changes from different cell lines at each time point (Revised text line 152-155, line 226-228). Using this strategy, we extracted the common changes confined to the control and patient groups, respectively. With this approach we avoid capturing the variability across individuals.

      Additionally, the authors acknowledge the variability of the differentiations and cell lines, which is commendable, and they attribute this to "possibly reflecting cell line specific and endogenous differences reported previously", but it could also have to do with cell death. This is a large confounding factor for ATAC-seq. Certainly, Sup Fig 1C shows lower FRiP scores, consistent with cell death, and there seems to be a lot of death in the representative images. Moreover, the iGABA neurons are very difficult to keep alive, especially to 65 days, without co-culturing with glia and/or glutamatergic neurons. The authors should comment on how much these factors may have influenced their results.

      With this point in mind, we re-examined the QC of our ATAC-Seq across all samples. As shown in revised Supplementary Figure 2C and Supplementary Figure 4C, our cutoff for FRiP is 15%, and all samples have a FRiP of more than 15%. At the later time points (d35 or d65), we did not observe a FRiP <15%. We therefore feel confident that the quality of the ATAC-Seq is good enough for downstream analysis and data interpretation.
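
      For readers unfamiliar with the metric, FRiP (fraction of reads in peaks) is the number of reads overlapping called peaks divided by the total number of reads considered. The sketch below is a hedged illustration of applying the 15% cutoff mentioned above; the sample names and read counts are hypothetical, and whether "total" means all mapped reads or filtered fragments is an assumption that depends on the pipeline used.

```python
def frip(reads_in_peaks: int, total_reads: int) -> float:
    """Fraction of reads in peaks: reads overlapping peaks / total reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= reads_in_peaks <= total_reads:
        raise ValueError("reads_in_peaks must lie between 0 and total_reads")
    return reads_in_peaks / total_reads


def passes_frip(counts: dict[str, tuple[int, int]],
                cutoff: float = 0.15) -> dict[str, bool]:
    """Flag each sample as passing (True) or failing (False) the cutoff.

    counts maps a sample name to (reads_in_peaks, total_reads).
    The 0.15 default follows the 15% cutoff quoted in the response;
    treating a value exactly at the cutoff as a pass is an assumption.
    """
    return {name: frip(*pair) >= cutoff for name, pair in counts.items()}


# Hypothetical read counts, for illustration only.
example_counts = {
    "control_d19": (3_000_000, 10_000_000),
    "patient_d65": (1_000_000, 10_000_000),
}
qc = passes_frip(example_counts)
```

      In this made-up example the first sample (FRiP 0.30) passes and the second (FRiP 0.10) would be excluded.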

      Regarding the differentiation protocol, we follow a directed protocol for differentiating iPSCs towards interneurons. The protocol is described in detail by Maroof et al. (reference 34) and was slightly modified in our lab (described in reference 13). With our modified protocol, GABAergic cells are viable beyond day 65 without the need for co-cultures with astrocytes or microglia. This is also reflected by the electrophysiological activity of interneurons at d65 and at later time points (reference 13). Additionally, our ambition was to obtain a homogeneous cell population for further analysis. Adding other cell types to the cultures would have interfered with downstream processing and created a need for cell sorting. Using our protocol, we obtain viable GABA interneurons after up to 100 days in culture. To assess the viability of our cells at the point of sampling (other than by morphological assessment), we used Trypan blue staining and an automated cell counter. Only samples with a viability >90%, which is a commonly used cut-off for cell viability, were processed for ATAC-seq. We have now modified the method section in the revised version to describe the GABAergic differentiation and sampling (line 519-529).

      Finally, changes in gene expression are only inferred, as no RNA levels were measured. If RNA-seq was not possible it would have been good to see at least some of the key genes/findings corroborated with RNA/protein levels vs chromatin accessibility alone, particularly given that these molecular readouts do not always correlate. 

      In our revised manuscript, we include our recently published RNA-seq data obtained at d19 and d65. We also correlated the RNA-seq and ATAC-seq data obtained from the same samples. The Pearson correlations between gene expression and chromatin accessibility were within the range 0.49-0.57 (Revised Supplementary Figure 2G, Revised Supplementary Figure 4G), which is acceptable according to standard criteria. The results confirmed that the quality of the ATAC-seq data is sufficient for the analysis of expression levels and chromatin openness in key genes. We also added gene expression levels from RNA-seq (d19 and d65) to our revised manuscript (Revised Figure 1G, Revised Figure 2G). Finally, we performed qRT-PCR analysis of key genes in each cluster, and the results are now included in the revised version (Revised Supplementary Figure 3E, Revised Supplementary Figure 5E).
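      The response does not state how per-gene accessibility and expression were summarized before being correlated, so the following is only a sketch of the Pearson computation itself. It assumes one accessibility value and one expression value per gene (for example, log-transformed signals); the numbers are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-gene values, for illustration only (not the study's data).
accessibility = [1.0, 2.0, 3.0, 4.0, 5.0]
expression = [2.0, 1.0, 4.0, 3.0, 5.0]

r = pearson_r(accessibility, expression)
print(round(r, 2))
```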

      Additional Points:

      (1) Representative images for cell-identity markers for only D65 are shown, and not D0, D19, and D35 though it is stated in the text that this was performed. At a minimum, these representative images should be shown for all lines. 

      As suggested, we have now added images for cell identity markers of all iPSC lines in the revised version (Revised Supplementary Figure 1C).

      (2) What QC was performed on iPSC lines, i.e. karyotype/CNV analysis and confirmation of genotypes?

      All iPSC lines used in this study have been fully characterized according to standard, state-of-the-art procedures: expression of pluripotency and stemness genes has been shown by immunostaining, flow cytometry and scorecard analysis; integrity of the genome has been assessed by karyotyping using G-banding; differentiation capacity was characterized using an embryoid body assay in combination with scorecard analysis; and genotypes were verified by Sanger sequencing. Please see the following publications for full datasets: Schuster et al., Neurobiol Dis 2019; Schuster et al., Stem Cell Res 2019; Sobol et al., Stem Cells and Development 2015. In our lab, the integrity of iPSC lines is routinely verified using flow cytometry (expression of TRA-1-60 and SSEA4), immunostaining (expression of NANOG, SOX2 and OCT4), Sanger sequencing (targeting variants in the SCN1A gene), cell morphology analysis and mycoplasma testing with MycoAlert® (Lonza).

      (3) Were all experiments performed on a single differentiation? Or multiples? Were the differentiations performed with the same type? If not, was batch considered in the analysis? 

      Thank you for raising this question. The Materials and Methods text has been modified as follows to better describe the differentiation and sampling procedure:

      “GABAergic interneuron differentiation from iPSCs was performed as previously described (reference 13). The protocol utilizes DUAL SMAD inhibition to induce neurogenesis towards neural stem cells for 10 days, followed by patterning with high levels of sonic hedgehog for nine days towards cortically fated neuronal progenitor cells (NPC) and subsequent maturation for 46 days, i.e. a total of 65 days (Figure 1A). Neuronal cells at day 65 and onwards are healthy and viable as judged by morphological assessment by light microscopy. Differentiation was performed at least 3 times per cell line.  

      Cell cultures were sampled at day 0 (D0), D19, D35 and D65 by harvesting cells with TrypLE and centrifugation (300 x g, 3 min). Harvested cells were counted and assessed for viability using trypan blue staining and an automated EVE cell counter (Nano Entek). Samples with a viability of >90% were chosen for ATAC-Seq library preparation (see below).”

      I also assume that technical replicates were merged, and then all three biological replicates were kept for each analysis and outliers were not removed, e.g. Control_D19_8F seems like an example of an outlier. 

      This is a valuable point. We agree that there is variability across the three healthy donors and the three patients, respectively, but the quality of the ATAC-Seq data is good after multiple QC assessments (Revised Supplementary Figure 2B-D). The color code in Supplementary Figure 1C may be misleading, as the Pearson correlation of all samples was displayed. Overall, the correlation among ATAC-seq replicates is above 0.8. At the same time, we observed that samples at d0 cluster together, but samples at the later time points do not. We interpret this as related to cell-line-specific plasticity of chromatin dynamics during differentiation. The observation agrees with our PCA results (Revised Supplementary Figure 2F).

      (4) In Figure 1C, it is intriguing that the ATACseq signal gets stronger in imN. One might expect it to be strongest in the iPSCs which are undifferentiated and have the highest levels of open chromatin. Is this a function of sequencing depth, or are all the Y-axes normalized across all time points? 

      This is another valuable point. Figure 1C presents the average chromatin openness for cluster-specific regions, not for the entire genome, which is why the chromatin openness at D35 is higher than at other time points. The genome-wide chromatin openness is presented in revised Supplementary Figure 2D, and we have now updated the figure legend to avoid any potential misunderstanding.

      The sequencing depth is in a similar range for all samples. To give readers a complete picture, we also present the number of sequencing reads for each sample (Revised Supplementary Figure 2A and Revised Supplementary Figure 4A). The Y-axes of the genome browser tracks were normalized, and we added the normalized values to the figures.

      (5) In Figure 1F, are these all enriched terms, or were they prioritized somehow? 

      Yes, the enriched terms are prioritized based on biological relevance, and we have now clarified this in the updated legend of the manuscript. In addition, all enriched terms are now included in revised Supplementary Table 2 and Supplementary Table 4.

      (6) In Figure 1G (also the same plots in Fig 2/3), are all these images normalized i.e. there is no scale bar for each track, and do they represent and aggregate BAM/bigwig?

      Yes, the genome browser tracks were normalized and we have now revised the figures by adding scale bars.

      It would be good to show in supplement the variability across cell lines/diffs - particularly given the variability in the heatmap/PCA - and demonstrate the rigor/reproducibility of these results. This comment applies to all these plots across the 3 figures, particularly as in some instances the samples appear to cluster by individual first and then time point (Sup Fig 3B). 

      Thanks. We have now revised the figure with plots showing individual samples. 

      How confident are the authors that these effects are driven by genotype and not a single cell line? In the Fig 3D representation of NANOG, it is very difficult to see any difference between patient and control. 

      In Figure 3D, we showed common chromatin dynamics in the control and patient groups. To avoid any misunderstanding, we have now updated our legend in the revised manuscript. 

      (7) For the changes in occupancy annotation (UTR/exon/intron etc), are these differences still significant after correcting for variability from cell line to cell line at each time point? I.e. rather than average across all three samples, what is the range?

      Revised accordingly.

      (8) The VPA timepoint is not well-justified. Given that VPA would be administered in patients with fully mature inhibitory neurons, it is difficult to determine the biological relevance. I appreciate that this is a limitation of the model, but this should at least be addressed in the manuscript. 

      We agree that our model system of GABAergic interneuron development has limitations and that the cells may not fully recapitulate development and physiology in vivo. Obvious factors to consider in our system are the directed protocol used to enrich for GABAergic interneurons and the differentiation timeline, which is restricted to 65 days. This is now discussed (lines 499-506).

      Recommendations for the authors:

      (1) The term 'mutation' has been replaced with the term ' pathogenic variant' or likely pathogenic variant depending on the context, please see PMID: 25741868 

      Thank you for pointing this out. We have replaced all instances of “mutation” with “pathogenic variant” throughout the manuscript.

      (2) It is unclear what the nomenclature for sample labelling is in Supplementary Figure 1, e.g. 7C, 8F, 1B.  

      We apologize for the confusion. These are cell line names. We labeled all data and images according to cell line name, i.e. control lines: Ctl1B, Ctl7C and Ctl8F; patient lines: DD1C, DD4A and DD5A. To avoid any potential confusion, we have added a note to the revised legend of Supplementary Figure 1B.

      (3) Can the authors confirm that the Deseq2 FDR values are Benjamini-Hochberg procedure corrected per default settings? If so, this should ideally be added to methods or legend for clarity 

      Yes, DESeq2 was run with default settings, under which the reported FDR values are Benjamini-Hochberg adjusted. This has been added to the Methods section of the revised manuscript.
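      For reference, the Benjamini-Hochberg adjustment itself is short enough to write out. The sketch below is a plain-Python illustration of the procedure, not DESeq2's code; DESeq2's default pipeline also applies independent filtering before adjustment, which is not reproduced here. The p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values, for illustration only.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
print(adjusted)
```

      Each raw p-value is scaled by m/rank, and the running minimum keeps the adjusted values from decreasing as the raw p-values increase.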

      (4) While it makes sense that the authors present the data in the order of Figure 1, and Figure 2, this actually makes it quite difficult to compare the two datasets, especially for the functional enrichment in the "F" figures. It may be helpful to consider re-organizing the figure order. For instance, for the long-term potentiation signal in the DS-iPSCs, what does this mean in terms of biological relevance? Or maybe Figure 2 needs to be supplementary given that Figure 3 is a more direct comparison.  

      Thank you for the suggestions. We attempted to reorganize during our revision. We still believe it is easier for the audience to grasp the main message if we organize it according to our current workflow—first presenting an individual differential landscape for controls and patients, and then comparing the common and unique aspects among them.

    1. Abstract. Background: Visualization is an indispensable facet of genomic data analysis. Despite the abundance of specialized visualization tools, there remains a distinct need for tailored solutions. However, their implementation typically requires extensive programming expertise from bioinformaticians and software developers, especially when building interactive applications. Toolkits based on visualization grammars offer a more accessible, declarative way to author new visualizations. Nevertheless, current grammar-based solutions fall short in adequately supporting the interactive analysis of large data sets with extensive sample collections, a pivotal task often encountered in cancer research. Results: We present GenomeSpy, a grammar-based toolkit for authoring tailored, interactive visualizations for genomic data analysis. Users can implement new visualization designs with little effort by using combinatorial building blocks that are put together with a declarative language. These fully customizable visualizations can be embedded in web pages or end-user-oriented applications. The toolkit also includes a fully customizable but user-friendly application for analyzing sample collections, which may comprise genomic and clinical data. Findings can be bookmarked and shared as links that incorporate provenance information. A distinctive element of GenomeSpy’s architecture is its effective use of the graphics processing unit (GPU) in all rendering. GPU usage enables a high frame rate and smoothly animated interactions, such as navigation within a genome. We demonstrate the utility of GenomeSpy by characterizing the genomic landscape of 753 ovarian cancer samples from patients in the DECIDER clinical trial. Our results expand the understanding of the genomic architecture in ovarian cancer, particularly the diversity of chromosomal instability. We also show how GenomeSpy enabled the discovery of clinically actionable genomic aberrations. Conclusions: GenomeSpy is a visualization toolkit applicable to a wide range of tasks pertinent to genome analysis. It offers high flexibility and exceptional performance in interactive analysis. The toolkit is open source with an MIT license, implemented in JavaScript, and available at https://genomespy.app/.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giae040), where the paper and peer reviews are published openly under a CC-BY 4.0 license. These peer reviews were as follows:

      Reviewer 3: Luca Beltrame

      Lavikka and coworkers present an interesting visualization framework and associated application for genomics visualization. The challenges outlined by the authors in finding appropriate visualization tools for large-scale genomics data were also experienced by this reviewer, and thus better and improved tools are always welcome.

      The manuscript is well laid out, presenting the key facts in a proper manner. The use of GPU rendering for graphs is an excellent move, and I expect it to be extremely useful even for machines with lower-end GPUs. The code looks reasonably written and commented (being an application, this too is important for a review). I have also tested the examples, and indeed the software is very useful (the documentation should, however, point out that some issues regarding saving the canvas still exist). One may argue that the use of JSON for the graph grammar can be awkward, but at the same time other file formats may be more problematic and/or require specialized parsers (which open yet another can of worms).

      Documentation is also logically organized. As a minor suggestion, the authors may want to add some form of search to their documentation page.

      There is an open question that the authors may want to answer: they explicitly mention GISTIC 1.0 for the G-score plots. Is there a specific reason why they chose 1.0? The 2.0 algorithm is far more robust and produces more reliable results.

    1. def show_spread(left, middle, right):
           print(f"left {left} middle {middle} right {right}")

       all_in_list = [1, 2, 3]
       show_spread(*all_in_list)

       all_in_dict = {"right": 30, "left": 10, "middle": 20}
       show_spread(**all_in_dict)

      I asked Claude: Can you expand on what's happening here?

      Claude said:

      Certainly! This code demonstrates two important Python features: argument unpacking with the * and ** operators. Let's break it down:

      First, we define a function show_spread that takes three parameters: left, middle, and right.

      Then, we have two examples of calling this function using different forms of argument unpacking:

      a) List unpacking with *:

          all_in_list = [1, 2, 3]
          show_spread(*all_in_list)

      Here, *all_in_list unpacks the list into individual arguments. It's equivalent to calling show_spread(1, 2, 3).

      b) Dictionary unpacking with **:

          all_in_dict = {"right": 30, "left": 10, "middle": 20}
          show_spread(**all_in_dict)

      Here, **all_in_dict unpacks the dictionary into keyword arguments. It's equivalent to calling show_spread(left=10, middle=20, right=30).

      The key points are:

      • * unpacks an iterable (like a list or tuple) into positional arguments.
      • ** unpacks a dictionary into keyword arguments. When using **, the keys in the dictionary must match the parameter names in the function.

      This technique is powerful for making function calls more flexible and for passing collections of values as separate arguments.
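      Two follow-on cases may be worth trying alongside the explanation above: combining * and ** in a single call, and seeing what happens when a dictionary key does not match a parameter name. This is a small sketch; show_spread is changed here to return its string rather than print it, purely so the result can be inspected.

```python
def show_spread(left, middle, right):
    # Variant of the original function: returns the string instead of printing.
    return f"left {left} middle {middle} right {right}"


# Positional and keyword unpacking can be combined in one call.
first = [1]
rest = {"right": 3, "middle": 2}
combined = show_spread(*first, **rest)
print(combined)

# A key that is not a parameter name makes the call fail with TypeError.
error_message = None
try:
    show_spread(**{"left": 1, "middle": 2, "top": 3})
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```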

    1. The seasoned engineer learns that sometimes the best code is the code you never wrote. They become adept at delegating tasks, capitalizing on the strengths of their colleagues, and asking the dreaded question, "But why?" — a question that often leads to the heart of what needs to be solved, avoiding unnecessary work and focusing on what truly adds value.
    1. The song's criticism on mass media is mainly related to sensationalism.

      "Good" things are usually not sensational. They do not demand attention, hence why the code of known/unknown based on selectors for attention filters it out.

      Reference Hans-Georg Moeller's explanations of Luhmann's mass media theory based on functionally differentiated systems theory.

      Can also compare to Simone Weil's thoughts on collectives and opinion; organizations (thus most part of mass media) should not be allowed to form opinions as this is an act of the intellect, only residing in the individual. Opinion of any form meant to spread lies or parts of the truth rather than the whole truth should be disallowed according to her because truth is a foundational, even the most sacred, need for the soul.

      People must be protected against misinformation.

    1. It is a fun environment to write, think, and work in. Most of our new code gets its start as a Lepiter document, where we play in snippets and add views until the code base has formed, almost as if by accident. It is hard to describe, but I was thinking to do a live-coding session or two when I find the time.

      Agreed. Lepiter creates a pretty joyful experience for what I would call "story driven development" and/or data narratives. It makes pretty fluent the "story/argument first driven" workflow that we already had with Grafoscopio and that is pretty usual in social sciences and humanities, in contrast with the "test first driven" approach of development cultures.

      In my experience it is easier to introduce non-developers to this kind of mindset, as we don't fight against some established tradition. Of course, different traditions can prioritize the starting/main point differently.

    2. page := thisSnippet page

      Being able to refer from the code inside a document to the document that contains it was also a need we felt with Grafoscopio. For that, we used thisNotebook, as can be seen in our republication of the Data Journalism Handbook (in Spanish) or in this screenshot from its repository:

      In our case, because my approach was not to convert the document inside Pharo but to leverage Pandoc, thisNotebook allows us to provide PDF export options on the Markdown version of the notebook, to produce high-quality PDFs besides the HTML export. As the screenshot above shows, the options were stored inside the document itself, internalizing what would be an external shell command and increasing document reproducibility on the publication front as well (which added to the data reproducibility front). This was years before similar approaches like Jupyter Book or Quarto, and I still think that Pharo-based tools can have leaner reproducible documentation workflows than their counterparts in other languages.

      Having support for similar ideas later in Lepiter, implemented by more experienced programmers, in the form of thisSnippet, and being able to compose it with page and database, has been a real time saver in migrating some lessons from Grafoscopio to Lepiter.

    1. Reviewer #1 (Public Review):

      Summary:

      The authors report a study aimed at understanding the brain's representations of viewed actions, with a particular aim to distinguish regions that encode observed body movements, from those that encode the effects of actions on objects. They adopt a cross-decoding multivariate fMRI approach, scanning adult observers who viewed full-cue actions, pantomimes of those actions, minimal skeletal depictions of those actions, and abstract animations that captured analogous effects to those actions. Decoding across different pairs of these actions allowed the authors to pull out the contributions of different action features in a given region's representation. The main hypothesis, which was largely confirmed, was that the superior parietal lobe (SPL) more strongly encodes movements of the body, whereas the anterior inferior parietal lobe (aIPL) codes for action effects of outcomes. Specifically, region of interest analyses showed dissociations in the successful cross-decoding of action category across full-cue and skeletal or abstract depictions. Their analyses also highlight the importance of the lateral occipito-temporal cortex (LOTC) in coding action effects. They also find some preliminary evidence about the organisation of action kinds in the regions examined.

      Strengths:

      The paper is well-written, and it addresses a topic of emerging interest where social vision and intuitive physics intersect. The use of cross-decoding to examine actions and their effects across four different stimulus formats is a strength of the study. Likewise, the a priori identification of regions of interest (supplemented by additional full-brain analyses) is a strength.

      Weaknesses:

      I found that the main limitation of the article was in the underpinning theoretical reasoning. The authors appeal to the idea of "action effect structures (AES)", as an abstract representation of the consequences of an action that does not specify (as I understand it) the exact means by which that effect is caused, nor the specific objects involved. This concept has some face validity, but it is not developed very fully in the paper, rather simply asserted. The authors make the claim that "The identification of action effect structure representations in aIPL has implications for theories of action understanding" but it would have been nice to hear more about what those theoretical implications are. More generally, I was not very clear on the direction of the claim here. Is there independent evidence for AES (if so, what is it?) and this study tests the following prediction, that AES should be associated with a specific brain region that does not also code other action properties such as body movements? Or, is the idea that this finding -- that there is a brain region that is sensitive to outcomes more than movements -- is the key new evidence for AES?

      On a more specific but still important point, I was not always clear that the significant, but numerically rather small, decoding effects are sufficient to support strong claims about what is encoded or represented in a region. This concern of course applies to many multivariate decoding neuroimaging studies. In this instance, I wondered specifically whether the decoding effects necessarily reflected a full five-way distinction amongst the action kinds, or instead (for example) a significantly different pattern evoked by one action compared to all of the other four (which in turn might be similar). This concern is partly increased by the confusion matrices that are presented in the supplementary materials, which don't necessarily convey a strong classification amongst action kinds. The cluster analyses are interesting and appear to be somewhat regular over the different regions, which helps. However, it is hard to assess these findings statistically, and it may be that similar clusters would be found in early visual areas too.

    2. Reviewer #2 (Public Review):

      Summary:

      This study uses an elegant design, using cross-decoding of multivariate fMRI patterns across different types of stimuli, to convincingly show a functional dissociation between two sub-regions of the parietal cortex, the anterior inferior parietal lobe (aIPL) and superior parietal lobe (SPL) in visually processing actions. Specifically, aIPL is found to be sensitive to the causal effects of observed actions (e.g. whether an action causes an object to compress or to break into two parts), and SPL to the motion patterns of the body in executing those actions.

      To show this, the authors assess how well linear classifiers trained to distinguish fMRI patterns of response to actions in one stimulus type can generalize to another stimulus type. They choose stimulus types that abstract away specific dimensions of interest. To reveal sensitivity to the causal effects of actions, regardless of low-level details or motion patterns, they use abstract animations that depict a particular kind of object manipulation: e.g. breaking, hitting, or squashing an object. To reveal sensitivity to motion patterns, independently of causal effects on objects, they use point-light displays (PLDs) of figures performing the same actions. Finally, full videos of actors performing actions are used as the stimuli providing the most complete, and naturalistic information. Pantomime videos, with actors mimicking the execution of an action without visible objects, are used as an intermediate condition providing more cues than PLDs but less than real action videos (e.g. the hands are visible, unlike in PLDs, but the object is absent and has to be inferred). By training classifiers on animations, and testing their generalization to full-action videos, the classifiers' sensitivity to the causal effect of actions, independently of visual appearance, can be assessed. By training them on PLDs and testing them on videos, their sensitivity to motion patterns, independent of the causal effect of actions, can be assessed, as PLDs contain no information about an action's effect on objects.
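      The cross-decoding logic described above (train on one stimulus type, test on another) can be sketched in a few lines. This is an illustration only: it swaps in a simple nearest-centroid classifier rather than whatever linear classifier the authors used, and the "voxel patterns" are made-up three-dimensional vectors, not fMRI data. Both decoding directions are computed, since the review notes that asymmetries across directions were examined.

```python
def centroid(patterns):
    """Element-wise mean of a list of equal-length patterns."""
    n = len(patterns)
    return [sum(column) / n for column in zip(*patterns)]


def train(patterns, labels):
    """Nearest-centroid 'model': one mean pattern per class label."""
    by_label = {}
    for pattern, label in zip(patterns, labels):
        by_label.setdefault(label, []).append(pattern)
    return {label: centroid(group) for label, group in by_label.items()}


def predict(model, pattern):
    """Assign the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(pattern, center))
    return min(model, key=lambda label: sq_dist(model[label]))


def cross_decode(train_x, train_y, test_x, test_y):
    """Train on one stimulus type, test on another; return accuracy."""
    model = train(train_x, train_y)
    hits = sum(predict(model, x) == y for x, y in zip(test_x, test_y))
    return hits / len(test_y)


# Hypothetical patterns for two action classes in two stimulus types.
labels = ["break", "break", "squash", "squash"]
animation = [[1.0, 0.0, 0.2], [0.9, 0.1, 0.1], [0.0, 1.0, 0.2], [0.1, 0.9, 0.3]]
video = [[0.8, 0.2, 0.9], [0.7, 0.3, 0.8], [0.2, 0.8, 0.9], [0.3, 0.6, 0.7]]

anim_to_video = cross_decode(animation, labels, video, labels)
video_to_anim = cross_decode(video, labels, animation, labels)
print(anim_to_video, video_to_anim)
```

      In the toy data, the third dimension differs between stimulus types but not between actions, so a classifier can only generalize by relying on the dimensions that carry action information in both formats. That is the sense in which above-chance cross-decoding points to a representation shared across stimulus types.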

      These analyses reveal that aIPL can generalize between animations and videos, indicating that it is sensitive to action effects. Conversely, SPL is found to generalize between PLDs and videos, showing that it is more sensitive to motion patterns. A searchlight analysis confirms this pattern of results, particularly showing that action-animation decoding is specific to right aIPL, and revealing an additional cluster in LOTC, which is included in subsequent analyses. Action-PLD decoding is more widespread across the whole action observation network.

      This study provides a valuable contribution to the understanding of functional specialization in the action observation network. It uses an original and robust experimental design to provide convincing evidence that understanding the causal effects of actions is a meaningful component of visual action processing and that it is specifically localized in aIPL and LOTC.

      Strengths:

      The authors cleverly managed to isolate specific aspects of real-world actions (causal effects, motion patterns) in an elegant experimental design, and by testing generalization across different stimulus types rather than within-category decoding performance, they show results that are convincing and readily interpretable. Moreover, they clearly took great care to eliminate potential confounds in their experimental design (for example, by carefully ordering scanning sessions by increasing realism, such that the participants could not associate animation with the corresponding real-world action), and to increase stimulus diversity for different stimulus types. They also carefully examine their own analysis pipeline, and transparently expose it to the reader (for example, by showing asymmetries across decoding directions in Figure S3). Overall, this is an extremely careful and robust paper.

      Weaknesses:

      I list several ways in which the paper could be improved below. More than 'weaknesses', these are either ambiguities in the exact claims made, or points that could be strengthened by additional analyses. I don't believe any of the claims or analyses presented in the paper show any strong weaknesses, problematic confounds, or anything that requires revising the claims substantially.

      (1) Functional specialization claims: throughout the paper, it is not clear what the exact claims of functional specialization are. While, as can be seen in Figure 3A, the difference between action-animation cross-decoding is significantly higher in aIPL, decoding performance is also above chance in right SPL, although this is not a strong effect. More importantly, action-PLD cross-decoding is robustly above chance in both right and left aIPL, implying that this region is sensitive to motion patterns as well as causal effects. I am not questioning that the difference between the two ROIs exists - that is very convincingly shown. But sentences such as "distinct neural systems for the processing of observed body movements in SPL and the effect they induce in aIPL" (lines 111-112, Introduction) and "aIPL encodes abstract representations of action effect structures independently of motion and object identity" (lines 127-128, Introduction) do not seem fully justified when action-PLD cross-decoding is overall stronger than action-animation cross-decoding in aIPL. Is the claim, then, that in addition to being sensitive to motion patterns, aIPL contains a neural code for abstracted causal effects, e.g. involving a separate neural subpopulation or a different coding scheme? Moreover, if sensitivity to motion patterns is not specific to SPL, but can be found in a broad network of areas (including aIPL itself), can it really be claimed that this area plays a specific role, similar to the specific role of aIPL in encoding causal effects? There is indeed, as can be seen in Figure 3A, a difference between action-PLD decoding in SPL and aIPL, but based on the searchlight map shown in Figure 3B I would guess that a similar difference would be found by comparing aIPL to several other regions. The authors should clarify these ambiguities.

      (2) Causal effect information in PLDs: the reasoning behind the use of PLD stimuli is to have a condition that isolates motion patterns from the causal effects of actions. However, it is not clear whether PLDs really contain as little information about action effects as claimed. Cross-decoding between animations and PLDs is significant in both aIPL and LOTC, as shown in Figure 4. This indicates that PLDs do contain some information about action effects. This could also be tested behaviorally by asking participants to assign PLDs to the correct action category. In general, disentangling the roles of motion patterns and implied causal effects in driving action-PLD cross-decoding (which is the main dependent variable in the paper) would strengthen the paper's message. For example, it is possible that the strong action-PLD cross-decoding observed in aIPL relies on a substantially different encoding from, say, SPL, an encoding that perhaps reflects causal effects more than motion patterns. One way to exploratively assess this would be to integrate the clustering analysis shown in Figure S1 with a more complete picture, including animation-PLD and action-PLD decoding in aIPL.

      (3) Nature of the motion representations: it is not clear what the nature of the putatively motion-driven representation driving action-PLD cross-decoding is. While, as you note in the Introduction, other regions such as the superior temporal sulcus have been extensively studied, with the understanding that they are part of a feedforward network of areas analyzing increasingly complex motion patterns (e.g. Giese & Poggio, Nature Reviews Neuroscience 2003), it doesn't seem like the way in which SPL represents these stimuli is similarly well understood. While the action-PLD cross-decoding shown here is a convincing additional piece of evidence for a motion-based representation in SPL, an interesting additional analysis would be to compare, for example, RDMs of different actions in this region with explicit computational models. These could be, for example, classic motion energy models inspired by the response characteristics of regions such as V5/MT, which have been shown to predict cortical responses and psychophysical performance both for natural videos (e.g. Nishimoto et al., Current Biology 2011) and PLDs (Casile & Giese, Journal of Vision 2005). A similar cross-decoding analysis between videos and PLDs as that conducted on the fMRI patterns could be done on these models' features, obtaining RDMs that could directly be compared with those from SPL. This would be a very informative analysis that could enrich our knowledge of a relatively unexplored region in action recognition. Please note, however, that action recognition is not my field of expertise, so it is possible that there are practical difficulties in conducting such an analysis that I am not aware of. In this case, I kindly ask the authors to explain what these difficulties could be.

      (4) Clustering analysis: I found the clustering analysis shown in Figure S1 very clever and informative. However, there are two things that I think the authors should clarify. First, it's not clear whether the three categories of object change were inferred post-hoc from the data or determined beforehand. It is completely fine if these were just inferred post-hoc, I just believe this ambiguity should be clarified explicitly. Second, while action-anim decoding in aIPL and LOTC looks like it is consistently clustered, the clustering of action-PLD decoding in SPL and LOTC looks less reliable. The authors interpret this clustering as corresponding to the unimanual vs. bimanual distinction, but for example "drink" (a unimanual action) is grouped with "break" and "squash" (bimanual actions) in left SPL and grouped entirely separately from the unimanual and bimanual clusters in left LOTC. Statistically testing the robustness of these clusters would help clarify whether it is the case that action-PLD in SPL and LOTC has no semantically interpretable organizing principle, as might be the case for a representation based entirely on motion pattern, or rather that it is a different organizing principle from action-anim, such as the unimanual vs. bimanual distinction proposed by the authors. I don't have much experience with statistical testing of clustering analyses, but I think a permutation-based approach, wherein a measure of cluster robustness, such as the Silhouette score, is computed for the clusters found in the data and compared to a null distribution of such measures obtained by permuting the data labels, should be feasible. In a quick literature search, I have found several papers describing similar approaches: e.g. Hennig (2007), "Cluster-wise assessment of cluster stability"; Tibshirani et al. (2001) "Estimating the Number of Clusters in a Data Set Via the Gap Statistic". These are just pointers to potentially useful approaches; the authors are much better qualified to pick the most appropriate and convenient method. However, I do think such a statistical test would strengthen the clustering analysis shown here. With this statistical test, and the more exhaustive exposition of results I suggested in point 2 above (e.g. including animation-PLD and action-PLD decoding in aIPL), I believe the clustering analysis could even be moved to the main text and occupy a more prominent position in the paper.
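
      The permutation idea described in point 4 can be sketched as follows. This is only an illustration under stated assumptions, not the reviewer's or the authors' analysis: the toy distance matrix, the labels, and the function names `mean_silhouette` and `permutation_test` are hypothetical, and the silhouette is computed in plain Python to keep the example self-contained.

```python
import random


def mean_silhouette(dist, labels):
    """Mean silhouette width from a square distance matrix and cluster labels."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")
    total = 0.0
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:  # singleton cluster: silhouette is taken as 0
            continue
        a = sum(dist[i][j] for j in own) / len(own)  # mean within-cluster distance
        b = min(  # mean distance to the nearest other cluster
            sum(dist[i][j] for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(labels)


def permutation_test(dist, labels, n_perm=2000, seed=0):
    """Compare the observed silhouette with a null built by shuffling labels."""
    rng = random.Random(seed)
    observed = mean_silhouette(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if mean_silhouette(dist, shuffled) >= observed:
            hits += 1
    # The +1 terms count the observed labelling as one member of the null.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy data: eight items forming two well-separated groups on a line.
positions = [0, 1, 2, 3, 20, 21, 22, 23]
dist = [[abs(p - q) for q in positions] for p in positions]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

observed, p_value = permutation_test(dist, labels)
print(f"silhouette = {observed:.3f}, permutation p = {p_value:.3f}")
```

      In a real analysis the distance matrix would come from the decoding-based dissimilarities, and the labels from the dendrogram cut; both choices are left open here.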

      (5) ROI selection: this is a minor point, related to the method used for assigning voxels to a specific ROI. In the description in the Methods (page 16, lines 514-24), the authors mention using the MNI coordinates of the center locations of Brodmann areas. Does this mean that they then extracted a sphere around this location, or did they use a mask based on the entire Brodmann area? The latter approach is what I'm most familiar with, so if the authors chose to use a sphere instead, could they clarify why? Or, if they did use the entire Brodmann area as a mask, and not just its center coordinates, this should be made clearer in the text.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment:

      This study presents a valuable framework and findings that contribute to our understanding of the brain as a fractal object by observing the stability of its shape property within 11 primate species and by highlighting an application to the effects of aging on the human brain. The evidence provided is solid but the link between brain shape and the underlying anatomy remains unclear. This study will be of interest to neuroscientists interested in brain morphology, whether from an evolutionary, fundamental or pathological point of view, and to physicists and mathematicians interested in modeling the shapes of complex objects.

      We have now clarified the outstanding question of whether our model outputs can be related to actual primate brain anatomy, a question that we believe arose mainly from comments regarding the validity of our output of apparently thicker cortices than nature can produce.

      We address this point in more detail in the point-by-point response below, but want to address this misunderstanding directly here: Our algorithm does not produce thicker cortices with increasing coarse-graining scales; in fact, the cortical thickness never exceeds the actual cortical thickness in our outputs, but rather thins with each coarse-graining scale. In other words, we believe that our outputs are fully in line with neuroanatomy across species.

      Reviewer #2 (Public Review): 

      In this manuscript, the authors analyze the shapes of cerebral cortices from several primate species, including subgroups of young and old humans, to characterize commonalities in patterns of gyrification, cortical thickness, and cortical surface area. The authors state that the observed scaling law shares properties with fractals, where shape properties are similar across several spatial scales. One way the authors assess this is to perform a "cortical melting" operation that they have devised on surface models obtained from several primate species. The authors also explore differences in shape properties between brains of young (~20 year old) and old (~80) humans. A challenge the authors acknowledge struggling with in reviewing the manuscript is merging "complex mathematical concepts and a perplexing biological phenomenon." This reviewer remains a bit skeptical about whether the complexity of the mathematical concepts being drawn from is justified by the advances made in our ability to infer new things about the shape of the cerebral cortex.

      To allow scientists from all backgrounds to adopt these complex ideas, we have made our code to “melt” the brains and for further downstream analysis publicly available. We have now also provided a graphical user interface, to allow users without substantial coding experience to run the analysis. We also believe that the algorithmic concepts are easy to understand due to the similarity to the coarse-graining procedures found in long-standing and well-accepted box-counting algorithms.

      Beyond the theoretical insight of the fractal nature of cortices and providing an explicit and crucial link between vastly different brains that are gyrified and those that are not, we believe that the advance gained by our methods for future applications is clearly demonstrated in our proof-of-principle with a four-fold increase in effect size. For reference, an effect size of 8 would translate to an almost perfect separation of groups, i.e. an ideal biomarker with near 100% sensitivity and specificity.
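
      The link between effect size and group separation mentioned above can be made concrete with a short calculation. This sketch assumes the effect size is a Cohen's d between two normally distributed groups with equal variance, split at the midpoint between their means; these assumptions and the function names are ours for illustration, not taken from the paper.

```python
from math import erf, sqrt


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def midpoint_accuracy(d):
    """Sensitivity (equal to specificity) for two unit-variance normal groups
    whose means differ by d, classified with a threshold at the midpoint."""
    return normal_cdf(d / 2.0)


for d in (2.0, 8.0):
    print(f"d = {d:.0f}: sensitivity = specificity = {midpoint_accuracy(d):.5f}")
```

      Under these assumptions an effect size of 2 corresponds to roughly 84% sensitivity and specificity, while an effect size of 8 exceeds 99.99%.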

      (1) The series of operations to coarse-grain the cortex illustrated in Figure 1 produces image segmentations that do not resemble real brains.

      As re-iterated in our Methods and Discussion: “Note, of course, that the coarse-grained brain surfaces are an output of our algorithm alone and are not to be directly/naively likened to actual brain surfaces, e.g. in terms of the location or shape of the folds. Our comparisons here between coarse-grained brains and actual brains is purely on the level of morphometrics across the whole cortex.”

      Fig. 1 therefore serves as an explanation to the reader on the algorithmic outputs, but each melted brain is not supposed to be directly/visually compared to actual brains. Similar to algorithms measuring the fractal dimension, or the exposed surface area of a given brain, the intermediate outputs of these algorithms are not supposed to represent any biologically observed brain structures, but rather serve as an abstraction to obtain meaningful morphometrics.

      We additionally added a note to the caption of Fig. 1 to clarify this point:

      “Note that the actual size of the brains for analysis are rescaled (see Methods and Fig. 3); we display all brains scaled at an equal size here for the ease of visualisation of the method.”

      Finally, we also edited the entire paper for terminology to clearly distinguish the terms of (1) the cortex as a 3D object, (2) coarse-grained and voxelised versions thereof, and (3) summary morphological measures derived from the former. When we invite comparisons in our paper between real brains and coarse-grained brains, this is always at the level of summary morphological measures, not at the level of the 3D objects/voxelisations themselves.

      The process to assign voxels in downsampled images to cortex and white matter is biased towards the former, as only 4 corners of a given voxel need to lie inside the original pial surface, but all 8 corners need to lie inside the white matter surface for a voxel to be assigned to white matter. The reason for introducing this bias (and the extent to which it is present in the authors' implementation) is not provided.

      This detail was in the Supplementary, and we have now added additional clarification on this specific point to our Supplementary:

      “In detail, we assign all voxels in the grid with at least four corners inside the original pial surface to the pial voxelization. This process allows the exposed surface to remain approximately constant with increasing voxel sizes. A constant exposed surface is desirable, as we only want to gradually ‘melt’ and fuse the gyri, but not grow the bounding/exposed surface as well. We want the extrinsic area to remain approximately constant as we decrease the intrinsic area via coarse-graining; it is like generating iterates of a Koch curve in reverse, from more to less detailed, by increasing the length of smallest line segment.

      We then assign voxels with all eight corners inside the original white matter surface to the white matter voxelization. This is to ensure integrity of the white matter, as otherwise white matter voxels in gyri may become detached from the core white matter, and thus artificially increase white matter surface area. Indeed, the main results of the paper are not very sensitive to this decision using all eight corners, vs. e.g. only four corners, as we do not directly use white matter surface area for the scaling law measurements. However, we still maintained this choice in case future work wants to make use of the white matter voxelisations or derivative measures.”
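
      The corner-counting rule quoted above can be sketched in a few lines. This is a hypothetical illustration, not the authors' released code: the function name, the "grey" label for voxels in the pial but not the white matter voxelisation, and the assumption that every corner inside the white matter surface is also inside the pial surface are all ours.

```python
def label_voxel(corners_in_pial, corners_in_white, pial_min=4, white_min=8):
    """Label one cubic voxel from how many of its 8 corners fall inside
    the pial surface and inside the white matter surface."""
    if not 0 <= corners_in_white <= corners_in_pial <= 8:
        raise ValueError("expected 0 <= corners_in_white <= corners_in_pial <= 8")
    if corners_in_white >= white_min:
        return "white"    # all eight corners inside the white matter surface
    if corners_in_pial >= pial_min:
        return "grey"     # at least four corners inside the pial surface
    return "outside"


print(label_voxel(8, 8))  # fully inside white matter
print(label_voxel(8, 5))  # straddles the grey/white interface
print(label_voxel(4, 0))  # half inside the pial surface
print(label_voxel(3, 0))  # mostly outside the pial surface
```

      With the default thresholds, a voxel straddling the grey/white interface is still labelled, which is the situation the four-corner rule is meant to handle.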

      Note, on the point of white matter integrity, that if both grey and white matter voxelisations require all 8 corners to be inside the respective mesh, there will be voxels not assigned to either at the grey/white matter interface, causing potential downstream issues.

      We further acknowledge:

      “Of course, our proposed procedure is not the only conceivable way to erase shape details below a given scale; and we are actively working on related algorithms that are also computationally cheaper. Nevertheless, the current version requires no fine-tuning, is computationally feasible and conceptually simple, thus making it a natural choice for introducing the methodology and approach.”

      The authors provide an intuitive explanation of why thickness relates to folding characteristics, but ultimately an issue for this reviewer is that, e.g., for the right-most panel in Figure 2b, the cortex consists of several 4.9 mm-sided voxels, implying a >2 cm thick cortex. A structure with these morphological properties is not consistent with the anatomical organization of typical mammalian neocortex.

      We assume the reviewer refers to Fig. 1B with the panel on scale=4.9mm. We would like to point out that Fig. 1 serves as an explanation of the voxelisation method. For the actual analysis and Results, we are using re-scaled brains (see Fig. 2 with the ever-decreasing brain sizes). The rescaling procedure is now expanded as below:

      “Morphological properties, such as cortical thicknesses measured in our ‘melted’ brains are to be understood as a thickness relative to the size of the brain. Therefore, to analyse the scaling behaviour of the different coarse-grained realisations of the same brain, we apply an isometric rescaling process that leaves all dimensionless shape properties unaffected (more details in Suppl. S3.1). Conceptually, this process fixes the voxel size, and instead resizes the surfaces relative to the voxel size, which ensures that we can compare the coarse-grained realisations to the original cortices, and test if the former, like the latter, also scale according to Eqn. (1). Resizing, or more precisely, shrinking the cortical surface is mathematically equivalent to increasing the box size in our coarse-graining method. Both achieved an erasure of folding details below a certain threshold. After rescaling, as an example, the cortical thickness also shrinks with increasing levels of coarse-graining, and never exceeds the thickness measured at native scale.”
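
      The statement that isometric rescaling leaves dimensionless shape properties unaffected can be checked numerically. The sketch below is a generic illustration: the input values, the scale factor, and the two ratios chosen are hypothetical and are not the quantities or code used in the paper.

```python
from math import isclose


def rescale(thickness, total_area, exposed_area, factor):
    """Isometric rescaling: lengths scale with factor, areas with factor squared."""
    return thickness * factor, total_area * factor**2, exposed_area * factor**2


def shape_ratios(thickness, total_area, exposed_area):
    """Two dimensionless combinations of the three measures."""
    return total_area / exposed_area, thickness**2 / exposed_area


# Hypothetical values: thickness in mm, areas in mm^2.
native = (2.5, 100000.0, 40000.0)
shrunk = rescale(*native, factor=0.5)

print("native ratios:", shape_ratios(*native))
print("shrunk ratios:", shape_ratios(*shrunk))
assert all(isclose(a, b) for a, b in zip(shape_ratios(*native), shape_ratios(*shrunk)))
```

      The thickness itself halves under this rescaling, while both ratios are unchanged, which is why comparisons across scales are made on dimensionless quantities.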

      We additionally added a note to the caption of Fig. 1 to clarify this point:

      “Note that the actual size of the brains for analysis are rescaled (see Methods and Fig. 3); we display all brains scaled at an equal size here for the ease of visualisation of the method.”

      Finally, we also edited the entire paper for terminology to clearly distinguish the terms of (1) the cortex as a 3D object, (2) coarse-grained versions thereof, and (3) summary morphological measures derived from the former. When we invite comparisons in our paper between real brains and coarse-grained brains, this is always at the level of summary morphological measures, not at the level of the 3D objects themselves and their detailed anatomical features.

      (2) For the comparison between 20-year-old and 80-year-old brains, a well-documented difference is that the older age group possesses more cerebrospinal fluid due to tissue atrophy, and the distances between the walls of gyri become greater. This difference is borne out in the left column of Figure 4b. It seems this additional spacing between gyri in 80 year olds requires more extensive down-sampling (larger scale values in Figure 4a) to achieve a similar shape parameter K as for the 20 year olds. The authors assert that K provides a more sensitive measure (associated with a large effect size) than currently used ones for distinguishing brains of young vs. old people. A more explicit, or elaborate, interpretation of the numbers produced in this manuscript, in terms of brain shape, might make this analysis more appealing to researchers in the aging field.

      We already removed the main results relating to K and aging in our last revision to avoid confusion. This is now only in the supplementary analysis, and our claim of K being a more sensitive measure for age and ageing – whilst still true – will be presented in more detail in a series of upcoming papers.

      (3) In the Discussion, it is stated that self-similarity, operating on all length scales, should be used as a test for existing and future models of gyrification mechanisms. Given the lack of association between the abstract mathematical parameters described in this study and explicit properties of brain tissue and its constituents, it is difficult to envision how the coarse-graining operation can be used to guide development of "models of cortical gyrification."

      We have clarified in more detail what we originally meant in the Discussion:

      “Finally, this dual universality is also a more stringent test for existing and future models of cortical gyrification mechanisms at relevant scales, and one that moreover is applicable to individual cortices. For example, any models that explicitly simulate a cortical surface as an output could be directly coarse-grained with our method and the morphological trajectories can be compared with those of actual human and primate cortices. The simulated cortices would only be ‘valid’ in terms of the dual universality, if it also produces the same morphological trajectories.”

      However, we agree with the reviewer that our paper could be misread as demanding direct comparisons of each coarse-grained brain with an actual brain, and we have now added the following text to clarify that this is not our intention for the proposed method or outputs.

      “Note, we do not suggest to directly compare coarse-grained brain surfaces with actual biological brain surfaces. As we noted earlier, the coarse-grained brain surfaces are an output of our algorithm alone and not to be directly/naively likened to actual brain surfaces, e.g. in terms of the location or shape of the folds. Our comparisons here between coarse-grained brains and actual brains is purely on the level of morphometrics across the whole cortex.”

      Indeed, the dual universality imposes restrictive constraints on the possible shapes of real cortices, but does not fully specify them. Presumably, the location of individual folds in different individuals and species will depend on their respective evolutionary histories, so there is no reason to expect a match in fold location between the ‘melted’ cortices of more gyrified species, on one hand, and the cortex of a less-gyrified one, on the other, even if their global morphological parameters and global mechanism of folding coincide.

      (4) There are several who advocate for analyzing cortical mid-thickness surfaces, as the pial surface over-represents gyral tips compared to the bottoms of sulci in the surface area. The authors indicate that analyses of mid-thickness representations will be taken on in future work, but this seems to be a relevant control for accepting the conclusions of this manuscript.

      In the context of some applications and methods, we agree that the mid-surface is a meaningful surface to analyse. However, in our work, the mid-surface is not. The fractal estimation rests on the assumption that the exposed area hugs the object of interest (hence convex hull of the pial surface), as the relationship between the extrinsic and intrinsic areas across scales determine the fractal relationship (Eq. 2). If we used the mid-surface instead of the pial surface for all estimation, this would not represent the actual object of interest, and it is separated from the convex hull. Estimating a new convex hull based on the mid surface would be the equivalent of asking for the fractal dimension of the mid-surface, not of the cortical ribbon. In other words, it would be a different question, bound to yield a different answer.

      Hence, we indicated in our original response that we only have a provisional answer, but more work beyond the scope of this paper is required to answer this question, as it is a separate question. The mid-surface, as a morphological structure in its own right, will have its own scaling properties, and our provisional understanding is that these also yield a scaling law parallel to those of the cortical ribbon with the same or a similar fractal dimension. But more systematic work is required to investigate this question at native scale and across scales.

      Reviewer #3 (Public Review):

      Summary: Through a rigorous methodology, the authors demonstrated that within 11 different primates, the shape of the brain followed a universal scaling law with fractal properties. They enhanced the universality of this result by showing the concordance of their results with a previous study investigating 70 mammalian brains, and the discordance of their results with other folded objects that are not brains. They incidentally illustrated potential applications of this fractal property of the brain by observing a scale-dependent effect of aging on the human brain.

      Strengths: 

      - New hierarchical way of expressing cortical shapes at different scales derived from previous report through implementation of a coarse-graining procedure 

      - Investigation of 11 primate brains and contextualisation with other mammals based on prior literature 

      - Proposition of tool to analyse cortical morphology requiring no fine tuning and computationally achievable 

      - Positioning of results in comparison to previous works reinforcing the validity of the observation. 

      - Illustration of scale-dependence of effects of brain aging in the human.

      Weaknesses: 

      - The notion of cortical shape, while being central to the article, is not really defined, leaving some interpretation to the reader 

      - The organization of the manuscript is unconventional, leading to mixed contents in different sections (sections mixing introduction and method, methods and results, results and discussion...). As a result, the reader discovers the content of the article along the way, it is not obvious at what stages the methods are introduced, and the results are sometimes presented and argued in the same section, hindering objectivity. 

      To improve the document, I would suggest a modification and restructuring of the article such that: 1) by the end of the introduction the reader understands clearly what question is addressed and the value it holds for the community, 2) by the end of the methods the reader understands clearly all the tools that will be used to answer that question (not just the new method), 3) by the end of the results the reader holds the objective results obtained by applying these tools on the available data (without subjective interpretations and justifications), and 4) by the end of the discussion the reader understands the interpretation and contextualisation of the study, and clearly grasps the potential of the method depicted for the better understanding of brain folding mechanisms and properties. 

      We thank this reviewer again for their attention to detail and constructive comments. We have followed the detailed suggestions provided by the reviewer in the Recommendations For The Authors, and summarise the main changes here:

      - We have restructured all sections to be more clearly following Introduction, Methods, Results, and Discussion; by using subsections, we believe the structure is now more accessible to readers.

      - We have now clarified the concept of “cortical shape”, as we use it in our paper in several places, by distinguishing clearly the object of study, and the morphological properties measured from it.

      Recommendations for the authors: 

      Reviewer #2 (Recommendations For The Authors): None 

      Reviewer #3 (Recommendations For The Authors): 

      I once again compliment the authors for their elegant work. I am happy with the way they covered my first feedback. My second review takes into account some comments made by other reviewers with which I agree. 

      We thank this reviewer again for their attention to detail and constructive comments.

      Recommendations for clarifications: 

      General comments: The purpose of the article could be made clearer in the introduction. When I differentiate results from discussion, I think of results as objective measures or observations, while discussion will relate to the interpretation of these results (including comparison with previous literature, in most cases). 

      We have restructured all sections to be more clearly following Introduction, Methods, Results, and Discussion; by using subsections, we believe the structure is now more accessible to readers.

      - l.39: define or discuss "cortical shape" 

      We have gone through the entire paper and corrected for any ambiguities. We specifically distinguish between the cortex as a structure overall, shape measures derived from this structure, and coarse-grained versions of the structure.

      - l.48-74: this would match either an introduction or a discussion rather than a methods section. 

      Done

      - l.98-106: this would match a discussion rather than a methods section. 

      Done

      - l.111: here could be a good spot to discuss the 4 vs 8 corners for inclusion of pial vs white matter voxelization 

      We have discussed this in the more detailed Supplementary section now, as after restructuring, this appears to be the more suitable place.

      - l.140-180: it feels that this section mixes methods, results and discussion of the results 

      We agree and we have resolved this by removing sentences and re-arranging sections.

      - l.183-217: mix of results and discussion 

      We agree and we have resolved this by removing sentences and re-arranging sections.

      Small cosmetic suggestions: 

      - l.44: conservation of 'some' quantities: vague 

      Changed to conservation of morphological relationships across evolution

      - l.66: order of citations ([24, 22,23]) 

      Will be fixed at proof stage depending on format of references.

      - l.77: delete space between citation and period 

      Done

      - l.77: I would delete 'say' 

      Done

      - l.86: 'but to also analyse' -> 'to analyse' 

      Done

      - l.105: remove 'we are encouraged that' 

      Done

      - l.111: 'also see' -> 'see also' 

      Done

      - l.164: 'remarkable': subjective 

      Done

      - l.189: define approx. abbreviation 

      Done

      - l.190: 'approx' -> 'approx.' 

      Revised

      - l.195: 'dramatic': subjective 

      Removed

      - l.246: 'much' -> vague 

      Explained

    1. for - adjacency - ecology of communications - Nora Bateson -:indyweb - Deep Humanity

      Summary - A good summary of the common thread of an ecology of communication between 4 systems thinkers

      adjacency - between - ecology of communications - Nora Bateson -:Indyweb - Deep Humanity - adjacency relationship - The author summarised the salient points of a Nate Hagens Great Simplification interview with Nora Bateson on the subject of an ecology of communications - It addresses the need to use language to speak to multiple contexts of the conversants. - The epistemologically-foundational ideas of - people centered and - interpersonal information - of the indyweb / Indranet architecture are based on the Deep Humanity ideas of - individual / collective gestalt - each individual's unique lebenswelt - the multi-meaningverse inherent in any group - symmathesetic fingerprint - perspectival knowing - salience mismatch inherent in communication due to - encoding meaning from one unique meaningverse / lebenswelt to common language code - decoding meaning from another unique meaningverse / lebenswelt

    1. Information related to the characteristics of the reported peptide sequences, including their biological activities, descriptions, experimental information, and related publications or patents, was downloaded from all data sources. Then, Python scripts were implemented to process the raw data downloaded from the data sources and transform the information for loading into the Peptipedia v2.0. A length filter was made, containing only peptides with a length equal to or less than 150 residues and higher than three residues. Also, the collected peptides were classified as canonical (with only the 20 natural amino acids) or non-canonical peptides. Then, a semantic analysis was generated to recognise the biological activity of the peptide sequence through the available description in the data sources. Finally, a loader Python script was implemented to load the register in the Peptipedia v2.0, developing a scalable ETL strategy (Extract, Transformation, and Load) for each utilised data source.

      would you be willing to annotate inline in the text where each of the scripts that does one of these actions is available? there's some information I'd be interested in hunting down for certain peptides, and having a pointer to the code that does each specific thing would be helpful to accomplish this
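
      For readers following along, the length filter and canonical/non-canonical classification described in the passage can be sketched as below. This is not the Peptipedia code: the function names, the record layout, and the use of one-letter amino acid codes are assumptions made for illustration only.

```python
# One-letter codes of the 20 natural amino acids.
CANONICAL_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def passes_length_filter(sequence, min_exclusive=3, max_inclusive=150):
    """Keep peptides longer than three residues and no longer than 150."""
    return min_exclusive < len(sequence) <= max_inclusive


def is_canonical(sequence):
    """True when every residue is one of the 20 natural amino acids."""
    return set(sequence.upper()) <= CANONICAL_RESIDUES


def transform(sequences):
    """Filter by length, then tag each remaining peptide as canonical or not."""
    return [
        {"sequence": seq, "canonical": is_canonical(seq)}
        for seq in sequences
        if passes_length_filter(seq)
    ]


# Hypothetical input: too short, canonical, non-canonical (contains X), too long.
records = transform(["ACD", "ACDE", "ACDEX", "A" * 151])
for record in records:
    print(record)
```

      The semantic analysis of activity descriptions and the loading step are not sketched, since the passage does not give enough detail to reproduce them.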

    1. Author response:

      The following is the authors’ response to the original reviews.

      We would like to thank you very much for reviewing our manuscript and express our sincere appreciation for the valuable and thoughtful comments that led us to significantly improve the manuscript on Fshr-ZsGreen reporter mice. We have taken your comments seriously and made a major revision of the manuscript. Here is a summary of the revision:

      (1) New data on Fshr expression have been added to the revised manuscript:

      a. Fshr expression in the testis and adipose tissues (WAT and BAT) of B6 mice;

      b. Fshr expression in the testis of B6 by RNA-smFISH;

      c. Comparison of Fshr expression in the testis and ovary between Fshr-ZsGreen and B6 mice by ddRT-PCR, to show that Fshr expression is not interrupted by insertion of the P2A-ZsGreen vector;

      d. Reduction of Fshr expression in osteocytes within the femoral sections from DMP1-CreERT2:Fshrfl/fl mice;

      e. Fshr expression in an established Leydig cell line (TM3) by immunofluorescence and ddRT-PCR, which also show Fshr located in the nuclei of TM3 cells;

      f. Fshr expression at scRNA-seq level from 5 public single cell portals as Supplementary Data 3 to support our findings of the widespread expression pattern of Fshr, particularly in Leydig cells.

      (2) Re-organization of Figure 2 with a new legend.

      (3) A new paragraph is added to the Discussion Section of the revised MS to explain the function of the P2A peptide in the generation of GFP reporter mice and why Fshr expression is not interrupted by the P2A-ZsGreen insertion in the Fshr-ZsGreen reporter.

      (4) Deletion of Figure 1-D-c, as it is not necessary.

      (5) Replacement of Figure 8-A (the left panel) with a reduced exposure time image.

      (6) Amended parts of the revised MS are labeled in red.

      A point by point response to the Reviewers’ comments:

      Reviewer 1:

      One of the shocking observations in this manuscript is the expression of FSHR in Leydig cells. Other observations are in the osteoblasts and endothelial cells as well as epithelial cells in different organs. The expression of ZsGreen in these tissues seems high and one shall start questioning if there are other mechanisms at play here.

      First, the turnover of fluorescent proteins is long, longer than 48h, which means that they accumulate at a different speed than the endogenous FSHR. This means that ZsGreen will accumulate over time while the FSHR receptor might be degraded almost immediately. This correlates with mRNA expression (as measured by the authors) but not with the results of other single-cell sequencing studies (see below).

      The expression of ZsGreen in Leydig cells seems much higher than in Sertoli cells, this is "disturbing" to put it mildly. This is visible in both the ZsGreen expression and the FISH assay (Figure 2 B-D).

      Thank you for these valuable comments. We added new data on Fshr expression to demonstrate the presence of Fshr in Leydig cells of B6 mice, detected by immunofluorescence staining, RNA-smFISH and ddRT-PCR, as well as in TM3 cells, a Leydig cell line isolated from a male mouse, in the revised MS (Fig. 2E, F and G). These data demonstrate that normal Fshr expression is not interrupted by insertion of the P2A-ZsGreen vector into a locus located between exon 10 and the stop codon. We use ZsGreen as an indicator of active Fshr promoter status, rather than as a method to measure Fshr expression, which is done by ddRT-PCR. These data are shown in Figure 2G of the revised MS.

      In addition, we provide scRNA-seq-based evidence of Fshr expression in human Leydig cells from two single-cell portals (DISCO and BioGPS), as shown in Supplementary Data 3 in the revised MS. We also cite a recent report on scRNA-seq analysis of Fshr expression in Hu sheep as Reference 65 (PMID: 37541020), which also clearly shows Fshr expression in Leydig cells at the single-cell level.

      We believe that the lack of Fshr expression in some single-cell databases may be due to degradation of the Fshr transcript during the preparation of single-cell populations. In our laboratory, we spent more than 6 months optimizing methods and reagents to preserve an mRNA integrity score above 8 for RNA-seq.

      The expression in WAT and BAT is also questionable as the expression of ZsGreen is high everywhere. That makes it difficult to believe that the images are truly informative. For example, the stainings of aorta show the ZsGreen expression where elastin and collagen fibres are - these are not "cells" and therefore are not expressing ZsGreen.

      FISH expression (for FSHR) in WT mice is missing.

      Also, the tissue sections were stained with the IgG only (neg control) but in practice both the KI and the WT tissues should be stained with the primary and secondary antibodies. The only control that I could think of to truly get a sense of this would be a tagged receptor (N-terminal) that could then be analysed by immunohistochemistry.

      Reply 2 and 3: Thank you for these comments. New data on Fshr expression in WAT and BAT of B6 mice by immunofluorescence staining, and in the testis of B6 mice by immunofluorescence staining and RNA-smFISH, have been added to the revised MS (Fig. 2D and E, and Fig. 4G), showing patterns similar to those of Fshr-ZsGreen mice. Furthermore, we provide more evidence of Fshr expression as Supplementary Data 3, obtained from 4 public single-cell portals, showing FSHR expression in a wide range of organs and tissues (including different fractions of adipose cells) of humans, mice and rats at the single-cell level. Please also see the Fshr expression pattern in adipose tissues shown by immunostaining in previous reports (Fig. 3a of PMID: 28538730 and Fig. 2 of PMID: 25754247) 2 3, which is similar to our finding. These data should address your concerns about Fshr expression in WAT, BAT and other organs/tissues.

      Regarding “For example, the stainings of aorta show the ZsGreen expression where elastin and collagen fibres are - these are not "cells" and therefore are not expressing ZsGreen.”: we believe that you were referring to the image of the aorta in Supplementary Data 2. However, please take a look at the images of the aorta in Figure 5C, which show that the layer of ‘elastin and collagen fibres’ stains positively for EMCN and a-SMA, colocalized with Fshr expression and DAPI staining at 1000X magnification. This indicates that endothelial cells and cellular membranes are present in this layer, not just ‘elastin and collagen’.

      The authors also claim:

      To functionally prove the presence of FSHR in osteoblasts/osteocytes, we also deleted FSHR in osteocytes using an inducible model. The conditional knockout of FSHR triggered a much more profound increase in bone mass and decrease in fat mass than blockade by FSHR antibodies (unpublished data).

      This would be a good control for all their images. I think it is necessary to make the large claim of extragonadal expression, as well as intragonadal such as Leydig cells.

      Thank you for this very encouraging comment. As you suggested, we added to the revised MS a result showing reduced Fshr expression in osteocytes from DMP1-CreERT2+:Fshrfl/fl mice treated with tamoxifen (Figure 3D), demonstrating the presence of Fshr in osteocytes and the specificity of the Fshr antibody. Furthermore, we incorporated your advice on making the ‘large claim of extragonadal and intragonadal expression of Fshr’ into the revised MS, marked in red.

      Claiming that the under-developed Leydig cells in FSHR KO animals are due to a direct effect of the FSHR, and not via a cross-talk between Sertoli and Leydig cells, is too much of a claim. It might be speculated to some degree but as written at the moment it suggests this is "proven".

      Thank you for pointing out this incorrect claim; we apologize for it. We have deleted this claim from the revised MS.

      We also do not know if this FSHR expressed is a spliced form that would also result in the expression of ZsGreen but in a non-functional FSHR, or whether the FSHR is immediately degraded after expression. The insertion of the ZsGreen might have disturbed the epigenetics, transcription, or biosynthesis of the mRNA regulation.

      Thanks for this comment. In the revised MS, we added a new section explaining the function of the P2A peptide in generating a GFP reporter by sgRNA-guided site-specific knock-in of the P2A-ZsGreen vector through CRISPR/Cas9. We also provided a new result comparing Fshr expression in the testes and ovaries of Fshr-ZsGreen and B6 mice, showing equivalent Fshr expression between the two (Figure 2G), which indicates that the insertion of the P2A vector does not interrupt Fshr expression.

      The authors should go through single-cell data of WT mice to show the existence of the FSHR transcript(s). For example here: https://www.nature.com/articles/sdata2018192

      Thank you so much for the valuable comment. Yes, we took your critical advice and checked Fshr expression in 4 single-cell portals, including DISCO, GTEx, BioGPS and the Human single cell portal. We present the collected data as Supplementary Data 3 in the revised MS; they strongly support our findings of wider Fshr expression. In particular, Fshr expression in Leydig cells is shown by scRNA-seq studies of human cells from DISCO and BioGPS, as well as by a recent study in Hu sheep (PMID: 37541020) 1, which we cited in the revised MS.

      Reviewer 2:

      Is the FSHR expression pattern affected by the knockin mice (no side-by-side comparison between wt and GSGreen mice, using in situ hybridization and ddRTPCR, at least in the gonads, is provided)?

      Thanks for the comment. In the revised MS, we provide a set of new data on Fshr expression in the testis, ovary, WAT and BAT of B6 mice by immunofluorescence staining and RNA-smFISH, showing similar expression patterns. Additionally, we performed ddRT-PCR to compare Fshr expression in the testes and ovaries of Fshr-ZsGreen and B6 mice, demonstrating equivalent Fshr expression between the two. Interestingly, we also observed significantly higher Fshr expression in the testis than in the ovary (more than 30-fold).

      Is the splicing pattern of the FSHR affected in the knockin compared to wt mice, at least in the gonads?

      Thanks for the question. Please see our reply to Reviewer 1 regarding the function of the P2A peptide used to generate GFP reporters. Although we did not directly assess the splicing pattern, we provide a comparison of Fshr expression in Figure 2F of the revised MS, which indirectly indicates no change in the splicing pattern. In the future, we will assess the splicing pattern of Fshr, which has been neglected in the field.

      Are there any additional off-target insertions of GSGreen in these mice? Are similar results observed in separate founder mice?

      Thanks for the questions. As we describe in detail in the Methods section of the MS, the Fshr-ZsGreen reporter was produced by site-specific long ssDNA recombination of the P2A-ZsGreen targeting vector into the locus between exon 10 and the stop codon by CRISPR/Cas9, guided by a site-specific single guide RNA (sgRNA). We showed the results of Southern blot, DNA sequencing and site-specific PCR, proving the site-specific insertion of P2A-ZsGreen, as shown in Figure 1. Because the recombination is site-specific, in standard practice only one founder line is required for the study, and there are no additional off-target insertions.

      How long is GSGreen half-life? Could a very long half-life be a major reason for the extremely large expression pattern observed?

      Thanks for the question. The half-life of ZsGreen, also called ZsGreen1, is at least 26 h in mammalian cells, or slightly longer owing to its tetrameric structure, in contrast with the monomeric configuration of other well-known fluorescent proteins (PMID: 17510373) 4. The rationale for using this GFP protein is that ZsGreen is an exceptionally bright green fluorescent protein, up to 4X brighter than EGFP, and is ideally suited for whole-cell labelling and promoter-reporter studies, considering the higher turnover and rapid degradation of the Fshr transcript. In this study, we used ZsGreen as a monitor or indicator of the active endogenous Fshr promoter, rather than as a means of measuring promoter activity. Therefore, regardless of whether it accumulates, ZsGreen driven by the Fshr promoter indicates the presence of an active Fshr promoter in the defined cells. Instead, we used ddRT-PCR to measure Fshr expression levels in this study. In addition, we provide single-cell sequencing-based evidence from 4 public single-cell portals to support our findings of wide Fshr expression. Please see Supplementary Data 3 in the revised MS.

      References:

      (1) Su J, Song Y, Yang Y, et al. Study on the changes of LHR, FSHR and AR with the development of testis cells in Hu sheep. Anim Reprod Sci. Sep 2023;256:107306. doi:10.1016/j.anireprosci.2023.107306

      (2) Liu P, Ji Y, Yuen T, et al. Blocking FSH induces thermogenic adipose tissue and reduces body fat. Nature. Jun 1 2017;546(7656):107-112. doi:10.1038/nature22342

      (3) Liu XM, Chan HC, Ding GL, et al. FSH regulates fat accumulation and redistribution in aging through the Galphai/Ca(2+)/CREB pathway. Aging Cell. Jun 2015;14(3):409-20. doi:10.1111/acel.12331

      (4) Bell P, Vandenberghe LH, Wu D, Johnston J, Limberis M, Wilson JM. A comparative analysis of novel fluorescent proteins as reporters for gene transfer studies. J Histochem Cytochem. Sep 2007;55(9):931-9. doi:10.1369/jhc.7A7180.2007

    1. On responding to the first round of reviews, the authors have nicely adjusted their wording and fairly describe the results of their study. Certain markers were identified for further investigation. Yet, an overall non-obvious relationship between immune markers and HIV reservoirs has been shown previously, and despite the attempt to leverage powerful ML algorithms, they are not magical and cannot reveal strong relationships that fundamentally do not exist. In addition, categorical classification is for now hard to interpret and the more powerful ML algorithms do not seem to outperform more classic regression methods. Therefore, it remains relatively hard to evaluate the utility of this kind of study.

      Initial summary:

      Semenova et al. have studied a large cross-sectional cohort of people living with HIV on suppressive ART, N=115, and performed high dimensional flow-cytometry to then search for associations between immunological and clinical parameters and intact/total HIV DNA levels.

      A number of interesting data science/ML approaches were explored on the data and the project seems a serious undertaking. However, like many other studies that have looked for these kinds of associations, there was not a very strong signal. Of course the goal of unsupervised learning is to find new hypotheses that aren't obvious to human eyes, but I felt in that context, there were (1) results slightly oversold, (2) some questions about methodology in terms mostly of reservoir levels, and (3) results were not sufficiently translated back into meaning in terms of clinical outcomes.

      Strengths:

      The study is evidently a large and impressive undertaking and combines many cutting edge statistical techniques with a comprehensive experimental cohort of people living with HIV, notably inclusive of populations underrepresented in HIV science. A number of intriguing hypotheses are put forward that could be explored further. Data will be shared and could be a useful repository for more specific analyses.

      Weaknesses:

      Despite the detailed experiments and methods, there was not a very strong signal for variable(s) predicting HIV reservoir size. The Spearman coefficients are ~0.3 (somewhat weak, and acknowledged as such) and predictive models reach 70-80% prediction levels, though of sometimes categorical variables that are challenging to interpret.

      There are some questions about methodology, as well as some conclusions that are not completely supported by results, or at minimum not sufficiently contextualized in terms of clinical significance. Edit, authors have substantially revised the text.

      On associations: the false discovery rate correction was set at 5%, but data appear underdetermined with fewer observations than variables (144vars > 115ppts), and it isn't always clear if/when variables are related (e.g inverses of one another, for instance %CD4 and %CD8).

      The modeling of reservoir size was unusual, typically intact and defective HIV DNA are analyzed on a log10 scale (both for decays and predicting rebound). Also sometimes in this analysis levels are normalized (presumably to max/min?, e.g. S5), and given the large within-host variation of level we see in other works, it is not trivial to predict any downstream impact of normalization across population vs within person. Edit, fixed.

      Also, the qualitative characterization of low/high reservoir is not standard, and naturally will split by early/later ART if done as above/below median. Given the continuous nature of these data it seems throughout that predicting above/below median is a little hard to translate into clinical meaning.

      Lastly, work is comprehensive and appears solid, but the code was not shared to see how calculations were performed. Edit, fixed.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Semenova et al. have studied a large cross-sectional cohort of people living with HIV on suppressive ART, N=115, and performed high dimensional flow cytometry to then search for associations between immunological and clinical parameters and intact/total HIV DNA levels.

      A number of interesting data science/ML approaches were explored on the data and the project seems a serious undertaking. However, like many other studies that have looked for these kinds of associations, there was not a very strong signal. Of course, the goal of unsupervised learning is to find new hypotheses that aren't obvious to human eyes, but I felt in that context, there were (1) results slightly oversold, (2) some questions about methodology in terms mostly of reservoir levels, and (3) results were not sufficiently translated back into meaning in terms of clinical outcomes.

      We appreciate the reviewer’s perspective. In our revised version of the manuscript, we have attempted to address these concerns by more adequately explaining the limitations of the study and by more thoroughly discussing the context of the findings. We are not able to associate the findings with specific clinical outcomes for individual study participants, but we speculate about the overall biological meaning of these associations across the cohort. We do not disagree with the reviewer, but we find the associations statistically significant; they potentially reflect real biological associations and form the basis for future hypothesis-testing research.

      Strengths:

      The study is evidently a large and impressive undertaking and combines many cutting-edge statistical techniques with a comprehensive experimental cohort of people living with HIV, notably inclusive of populations underrepresented in HIV science. A number of intriguing hypotheses are put forward that could be explored further. Sharing the data could create a useful repository for more specific analyses.

      We thank the reviewer for this assessment.

      Weaknesses:

      Despite the detailed experiments and methods, there was not a very strong signal for the variable(s) predicting HIV reservoir size. The Spearman coefficients are ~0.3, (somewhat weak, and acknowledged as such) and predictive models reach 70-80% prediction levels, though sometimes categorical variables are challenging to interpret.

      We agree with the reviewer that individual parameters are only weakly correlated with the HIV reservoir, likely reflecting the complex and multi-factorial nature of reservoir/immune cell interactions.  Nevertheless, these associations are statistically significant and form the basis for functional testing in viral persistence.

      There are some questions about methodology, as well as some conclusions that are not completely supported by results, or at minimum not sufficiently contextualized in terms of clinical significance.  On associations: the false discovery rate correction was set at 5%, but data appear underdetermined with fewer observations than variables (144vars > 115ppts), and it isn't always clear if/when variables are related (e.g inverses of one another, for instance, %CD4 and %CD8).

      When deriving a list of cell populations whose frequency would be correlated with the reservoir, we focused on well-defined cell types for which functional validation exists in the literature to consider them as distinct cell types.  For many of the populations, gating based on combinations of multiple markers leads to recovery of very few cells, and so we excluded some potential combinations from the analysis.  We are also making our raw data available for others to examine and find associations not considered by our manuscript.
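
      For readers checking how a 5% false discovery rate threshold behaves across many correlation tests, the step-up rule can be sketched as below. This is a minimal illustration that assumes the Benjamini-Hochberg procedure; the text above does not state which FDR method was used, and the function name and p-values are hypothetical, not taken from the study or its repository.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True if rejected at FDR level alpha.

    Assumption: Benjamini-Hochberg step-up rule (not confirmed by the source).
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k whose p-value satisfies p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis ranked at or below max_k.
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for ten immune-marker/reservoir correlations.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
significant = benjamini_hochberg(example_p)
```

      Because the threshold scales with the number of tests, strongly related variables (such as %CD4 and %CD8) add to the test count without adding independent information, which is one reason the variable list matters.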

      The modeling of reservoir size was unusual, typically intact and defective HIV DNA are analyzed on a log10 scale (both for decays and predicting rebound). Also, sometimes in this analysis levels are normalized (presumably to max/min?, e.g. S5), and given the large within-host variation of level we see in other works, it is not trivial to predict any downstream impact of normalization across population vs within-person.

      We have repeated the analysis using log10 transformed data and the new figures are shown in Figure 1 and S2-S5.

      Also, the qualitative characterization of low/high reservoir is not standard and naturally will split by early/later ART if done as above/below median. Given the continuous nature of these data, it seems throughout that predicting above/below median is a little hard to translate into clinical meaning.

      Our ML models included time before ART as a variable in the analysis, and this was not found to be a significant driver of the reservoir size associations, except for the percentage of intact proviruses (see Figure 2C). Furthermore, we analyzed whether any of the reservoir correlated immune variables were associated with time on ART and found that, although some immune variables are associated with time on therapy, this was not the case for most of them (Table S4). We agree that it is challenging to translate above or below median into clinical meaning for this cohort, but we emphasize that this study is primarily a hypothesis generating approach requiring additional validation for the associations observed.  We attempted to predict reservoir size as a continuous variable using the data and this approach was not successful (Figure S13). We believe that a significantly larger cohort will likely be required to generate a ML model that can accurately predict the reservoir as a continuous variable.  We have added additional discussion of this to the manuscript.
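
      The above/below-median outcome discussed here can be sketched as follows. This is only an illustration under stated assumptions: the values are invented (not study data), the function name is hypothetical, and assigning values equal to the median to the "low" group is an arbitrary choice.

```python
import math
from statistics import median


def binarize_at_median(values):
    """Label each value 1 ("high") if strictly above the cohort median, else 0 ("low")."""
    cutoff = median(values)
    return [1 if v > cutoff else 0 for v in values]


# Hypothetical intact HIV DNA values (copies per million CD4 T cells); not study data.
intact_dna = [12.0, 35.0, 80.0, 150.0, 900.0, 4000.0]

# log10 is monotonic, so the labels are the same on the raw and log10 scales.
log10_dna = [math.log10(v) for v in intact_dna]
labels = binarize_at_median(log10_dna)
```

      In this toy cohort, 150 and 4000 receive the same "high" label, which illustrates the loss of quantitative information that the reviewers describe.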

      Lastly, the work is comprehensive and appears solid, but the code was not shared to see how calculations were performed.

      We now provide a link to the code used to perform the analyses in the manuscript, https://github.com/lesiasemenova/ML_HIV_reservoir.

      Reviewer #2 (Public Review):

      Summary:

      Semenova et. al., performed a cross-sectional analysis of host immunophenotypes (using flow cytometry) and the peripheral CD4+ T cell HIV reservoir size (using the Intact Proviral DNA Assay, IPDA) from 115 people with HIV (PWH) on ART. The study mostly highlights the machine learning methods applied to these host and viral reservoir datasets but fails to interpret these complex analyses into (clinically, biologically) interpretable findings. For these reasons, the direct translational take-home message from this work is lost amidst a large list of findings (shown as clusters of associated markers) and sentences such as "this study highlights the utility of machine learning approaches to identify otherwise imperceptible global patterns" - lead to overinterpretation of their data.

      We have addressed the reviewer’s concern by modifications to the manuscript that enhance the interpretation of the findings in a clinical and biological context.

      Strengths:

      Measurement of host immunophenotyping measures (multiparameter flow cytometry) and peripheral HIV reservoir size (IPDA) from 115 PWH on ART.

      Major Weaknesses:

      (1) Overall, there is little to no interpretability of their machine learning analyses; findings appear as a "laundry list" of parameters with no interpretation of the estimated effect size and directionality of the observed associations. For example, Figure 2 might actually give an interpretation of each X increase in immunophenotyping parameter, we saw a Y increase/decrease in HIV reservoir measure.

      We have added additional text to the manuscript in which we attempt to provide more immunological and clinical interpretation of the associations.  We also have emphasized that these associations are still speculative and will require additional validation.  Nevertheless, our data should provide a rich source of new hypotheses regarding immune system/reservoir interaction that could be tested in future work.

      (2) The correlations all appear to be relatively weak, with most Spearman R in the 0.30 range or so.

      We agree with the reviewer that the associations are mostly weak, consistent with previous studies in this area. This is likely an inherent feature of the underlying biology: the reservoir is probably associated with the immune system in complex ways and involves stochastic processes that will limit the predictability of reservoir size using any single immune parameter. We have added additional text to the manuscript to make this point clearer.

      (3) The Discussion needs further work to help guide the reader. The sentence: "The correlative results from this present study corroborate many of these studies, and provide additional insights" is broad. The authors should spend some time here to clearly describe the prior literature (e.g., describe the strength and direction of the association observed in prior work linking PD-1 and HIV reservoir size, as well as specify which type of HIV reservoir measures were analyzed in these earlier studies, etc.) and how the current findings add to or are in contrast to those prior findings.

      We have added additional text to the manuscript to help guide the readers through the possible biological significance of the findings and the context with respect to prior literature.

      (4) The most interesting finding is buried on page 12 in the Discussion: "Uniquely, however, CD127 expression on CD4 T cells was significantly inversely associated with intact reservoir frequency." The authors should highlight this in the abstract, and title, and move this up in the Discussion. The paper describes a very high dimensional analysis and the key takeaways are not clear; the more the author can point the reader to the take-home points, the better their findings can have translatability to future follow-up mechanistic and/or validation studies.

      We appreciate the reviewer’s comment.  We have increased the emphasis on this finding in the revised version of the manuscript.

      (5) The authors should avoid overinterpretation of these results. For example in the Discussion on page 13 "The existence of two distinct clusters of PWH with different immune features and reservoir characteristics could have important implications for HIV cure strategies - these two groups may respond differently to a given approach, and cluster membership may need to be considered to optimize a given strategy." It is highly unlikely that future studies will be performing the breadth of parameters resulting here and then use these directly for optimizing therapy.

      Our analyses indicate that membership of study participants in cluster 1 or cluster 2 can be fairly accurately determined by a small number of individual parameters (KLRG1 etc., Figure 4F), and measuring the cells of PWH with the degree of breadth used in this paper would not be necessary to classify PWH into these clusters. As such, we feel that it is not unrealistic to speculate that this finding could turn out to be clinically useful, if it becomes clear that the clusters are biologically meaningful.

      (6) There are only TWO limitations listed here: cross-sectional study design and the use of peripheral blood samples. (The subsequent paragraph notes an additional weakness which is misclassification of intact sequences by IPDA). This is a very limited discussion and highlights the need to more critically evaluate their study for potential weaknesses.

      We have expanded on the list of limitations discussed in the manuscript. In particular, we now address the size of the cohort, the composition with respect to different genders and demographics, lack of information for the timing of ART and the lack of information regarding intracellular transcriptional pathways.

      (7) A major clinical predictor of HIV reservoir size and decay is the timing of ART initiation. The authors should include these (as well as other clinical covariate data - see #12 below) in their analyses and/or describe as limitations of their study.

      All of the participants that make up our cohort were treated during chronic infection, and the precise timing of ART initiation is unclear in most of these cases.  We have added additional information to explain this in the manuscript and include this in the list of limitations.

      Reviewer #3 (Public Review):

      Summary:

      This valuable study by Semenova and colleagues describes a large cross-sectional cohort of 115 individuals on ART. Participants contributed a single blood sample which underwent IPDA, and 25-color flow with various markers (pre and post-stimulation). The authors then used clustering, decision tree analyses, and machine learning to look for correlations between these immunophenotypic markers and several measures of HIV reservoir volume. They identified two distinct clusters that can be somewhat differentiated based on total HIV DNA level, intact HIV DNA level, and multiple T cell cellular markers of activation and exhaustion.

      The conclusions of the paper are supported by the data but the relationships between independent and dependent variables in the models are correlative with no mechanistic work to determine causality. It is unclear in most cases whether confounding variables could explain these correlations. If there is causality, then the data is not sufficient to infer directionality (ie does the immune environment impact the HIV reservoir or vice versa or both?). In addition, even with sophisticated and appropriate machine learning approaches, the models are not terribly predictive or highly correlated. For these reasons, the study is very much hypothesis-generating and will not impact cure strategies or HIV reservoir measurement strategies in the short term.

      We appreciate the reviewer’s comments regarding the value of our study.  We fully acknowledge that the causal nature and directionality of these associations are not yet clear and agree that the study is primarily hypothesis generating in nature.  Nevertheless, we feel that the hypotheses generated will be valuable to the field.  We have added additional text to the manuscript to emphasize the hypothesis generating nature of this paper.

      Strengths:

      The study cohort is large and diverse in terms of key input variables such as age, gender, and duration of ART. Selection of immune assays is appropriate. The authors used a wide array of bioinformatic approaches to examine correlations in the data. The paper was generally well-written and appropriately referenced.

      Weaknesses:

      (1) The major limitation of this work is that it is highly exploratory and not hypothesis-driven. While some interesting correlations are identified, these are clearly hypothesis-generating based on the observational study design.

      We agree that the major goal of this study was hypothesis generating and that our work is exploratory in nature. Performing experiments with mechanism testing goals in human participants with HIV is challenging.  Additionally, before such mechanistic studies can be undertaken, one must have hypotheses to test. As such we feel our study will be useful for the field in helping to identify hypotheses that could potentially be tested.

      (2) The study's cross-sectional nature limits the ability to make mechanistic inferences about reservoir persistence. For instance, it would be very interesting to know whether the reservoir cluster is a feature of an individual throughout ART, or whether this outcome is dynamic over time.

      We agree with the reviewer’s comment. Longitudinal studies are challenging to carry out with a study cohort of this size, and addressing questions such as the one raised by the reviewer would be of great interest. We believe our study nevertheless has value in identifying hypotheses that could be tested in a longitudinal study.

      (3) A fundamental issue is that I am concerned that binarizing the 3 reservoir metrics in a 50/50 fashion is for statistical convenience. First, by converting a continuous outcome into a simple binary outcome, the authors lose significant amounts of quantitative information. Second, the low and high reservoir outcomes are not actually demonstrated to be clinically meaningful: I presume that both contain many (?all) data points above levels where rebound would be expected soon after interruption of ART. Reservoir levels would also have no apparent outcome on the selection of cure approaches. Overall, dividing at the median seems biologically arbitrary to me.

      The reviewer raises a valid point that the clinical significance of above or below median reservoir metrics is unclear, and that the size of the reservoir has potentially little relation to rebound and cure approaches.  In the manuscript, we attempted to generate models that can predict reservoir size as a continuous variable in Figure S13 and find that this approach performs poorly, while a binarized approach was more successful. As such we have included both approaches in the manuscript.  It is possible that future studies with larger sample sizes and more detailed measurements will perform better for continuous variable prediction.  While this is a fairly large study (n=115) by the standards of HIV reservoir analyses, it is a small study by the standards of the machine learning field, and accurate predictive ML models for reservoir size as a continuous variable will likely require a much larger set of samples/participants.  Nevertheless, we feel our work has value as a template for ML approaches that may be informative for understanding HIV/immune interactions and generates novel hypotheses that could be validated by subsequent studies.

      (4) The two reservoir clusters are of potential interest as high total and intact with low % intact are discriminated somewhat by immune activation and exhaustion. This was the most interesting finding to me, but it is difficult to know whether this clustering is due to age, time on ART, other co-morbidity, ART adherence, or other possible unmeasured confounding variables.

      We agree that this finding is one of the more interesting outcomes of the study. We examined a number of these variables for association with cluster membership, and these data are reported in Figure S8A-D.  Age, years of ART and CD4 Nadir were all clearly different between the clusters.   The striking feature of this clustering, however, is the clear separation between the two groups of participants, as opposed to a continuous gradient of phenotypes.  This could reflect a bifurcation of outcomes for people with HIV, dynamic changes in the reservoir immune interactions over time, or different levels of untreated infection.  It is certainly possible that some other unmeasured confounding variables contribute to this outcome and we have attempted to make this limitation clearer.

      (5) At the individual level, there is substantial overlap between clusters according to total, intact, and % intact between the clusters. Therefore, the claim in the discussion that these 2 cluster phenotypes may require different therapeutic approaches seems rather speculative. That said, the discussion is very thoughtful about how these 2 clusters may develop with consideration of the initial insult of untreated infection and / or differences in immune recovery.

      We agree with the reviewer that this claim is speculative, and we have attempted to moderate the language of the text in the revised version.

      (6) The authors state that the machine learning algorithms allow for reasonable prediction of reservoir volume. It is subjective, but to me, 70% accuracy is very low. This is not a disappointing finding per se. The authors did their best with the available data. It is informative that the machine learning algorithms cannot reliably discriminate reservoir volume despite substantial amounts of input data. This implies that either key explanatory variables were not included in the models (such as viral genotype, host immune phenotype, and comorbidities) or that the outcome for testing the models is not meaningful (which may be possible with an arbitrary 50/50 split in the data relative to median HIV DNA volumes: see above).

      We acknowledge that the predictive power of the models generated from these data is modest and we have clarified this point in the revised manuscript. As the reviewer indicates, this may result from the influence of unmeasured variables and possible stochastic processes.  The data may thus demonstrate a limit to the predictability of reservoir size which may be inherent to the underlying biology.  As we mention above, this study size (n=115) is fairly small for the application of ML methods, and an increased sample size will likely improve the accuracy of the models. At this stage, the models we describe are not yet useful as predictive clinical tools, but are nonetheless useful as tools to describe the structure of the data and identify reservoir-associated immune cell types.

      (7) The decision tree is innovative and a useful addition, but does not provide enough discriminatory information to imply causality, mechanism, or directionality in terms of whether the immune phenotype is impacting the reservoir or vice versa or both. Tree accuracy of 80% is marginal for a decision tool.

      The reviewer is correct about these points.  In the revised manuscript, we have attempted to make it clear that we are not yet advocating using this approach as a decision tool, but simply a way to visualize the data and understand the structure of the dataset.  As we discuss above, the models will likely need to be trained on a larger dataset and achieve higher accuracy before use as a decision tool.

      (8) Figure 2: this is not a weakness of the analysis but I have a question about interpretation. If total HIV DNA is more predictive of immune phenotype than intact HIV DNA, does this potentially implicate a prior high burden of viral replication (high viral load &/or more prolonged time off ART) rather than ongoing reservoir stimulation as a contributor to immune phenotype? A similar thought could be applied to the fact that clustering could only be detected when applied to total HIV DNA-associated features. Many investigators do not consider defective HIV DNA to be "part of the reservoir" so it is interesting to speculate why these defective viruses appear to have more correlation with immunophenotype than intact viruses.

      We agree with the reviewer that this observation could reflect prior viral burden and we have added additional text to make this clearer.  Even so, we cannot rule out a model in which defective viral DNA is engaged in ongoing stimulation of the immune system during ART, leading to the stronger association between total DNA and the immune cell phenotypes. We hypothesize that the defective proviruses could potentially be triggering innate immune pattern recognition receptors via viral RNA or DNA, and a higher burden of the total reservoir leads to a stronger apparent association with the immune phenotype.  We have included text in the discussion about this hypothesis.

      (9) Overall, the authors need to do an even more careful job of emphasizing that these are all just correlations. For instance, HIV DNA cannot be proven to have a causal effect on the immunophenotype of the host with this study design. Similarly, immunophenotype may be affecting HIV DNA or the correlations between the two variables could be entirely due to a separate confounding variable.

      We have revised the text of the manuscript to emphasize this point, and we acknowledge that any causal relationships are, at this point, simply speculation. 

      (10) In general, in the intro, when the authors refer to the immune system, they do not consistently differentiate whether they are referring to the anti-HIV immune response, the reservoir itself, or both. More specifically, the sentence in the introduction listing various causes of immune activation should have citations. (To my knowledge, there is no study to date that definitively links proviral expression from reservoir cells in vivo to immune activation as it is next to impossible to remove the confounding possible imprint of previous HIV replication.) Similarly, it is worth mentioning that the depletion of intact proviruses is quite slow such that proviral expression can only be stimulating the immune system at a low level. Similarly, the statement "Viral protein expression during therapy likely maintains antigen-specific cells of the adaptive immune system" seems hard to dissociate from the persistence of immune cells that were reactive to viremia.

      We updated the text of the manuscript to address these points and have added additional citations as per the reviewer’s suggestion.

      (11) Given the many limitations of the study design and the inability of the models to discriminate reservoir volume and phenotype, the limitations section of the discussion seems rather brief.

      We have now expanded the limitations section of the discussion and added additional considerations. We now include a discussion of the study cohort size, composition and the detail provided by the assays.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      A few specific comments:

      "This pattern is likely indicative of a more profound association of total HIV DNA with host immunophenotype relative to intact HIV DNA."

      Most studies I have seen (e.g. single cell from Lichterfeld/Yu group) show intact proviruses are generally more activated/detectable/susceptible to immune selection, so I have a hard time thinking defective proviruses are actually more affected by immunotype.

      We hypothesize that this association is actually occurring in the opposite direction: the defective proviruses are having a greater impact on the immune phenotype, due to their greater number and potential ability to engage innate or adaptive immune receptors. We have clarified this point in the manuscript.

      "The existence of two distinct clusters of PWH with different immune features and reservoir characteristics could have important implications for HIV cure strategies - these two groups may respond differently to a given approach, and cluster membership may need to be considered to optimize a given strategy."

      I find this a bit of a reach, given that the definition of 2 categories depended on the total size.

      We have modified the language of this section to reduce the level of speculation.

      "This study is cross-sectional in nature and is primarily observational, so caution should be used interpreting findings associated with time on therapy".

      I found this an interesting statement because ultimately time on ART shows up throughout the analysis as a significant predictor, do you mean something about how time on ART could indicate other confounding variables like ART regimen or something?

      We have rephrased this comment to avoid confusion.  We were simply trying to make the point that we should avoid speculating about longitudinal dynamics from cross sectional data.

      "As expected, the plots showed no significant correlation for intact HIV DNA versus years of ART (Figure 1B), while total reservoir size was positively correlated with the time of ART (Figure 1A, Spearman r = 0.31)."
      Is this expected? Studies with longitudinal data almost uniformly show intact decay, at least for the first 10 or so years of ART, and defective/total stability (or slight decay). Also probably "time on ART" to not confuse with the duration of infection before ART.

      We have updated the language of this section to address this comment.  We have avoided comparing our data with respect to time on ART to longitudinal studies for reasons given above.
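
      For readers less familiar with the Spearman correlation quoted in the comment above, here is a minimal sketch: it is the Pearson correlation computed on ranks. The helper name and the numbers are hypothetical, and this simplified version assumes there are no tied values.

```python
import numpy as np

def spearman_r(x, y):
    """Spearman correlation as the Pearson correlation of ranks (assumes no ties)."""
    rank_x = np.argsort(np.argsort(x)).astype(float)
    rank_y = np.argsort(np.argsort(y)).astype(float)
    return float(np.corrcoef(rank_x, rank_y)[0, 1])

# Hypothetical values, not study data.
years_on_art = [1.0, 3.0, 5.0, 8.0, 12.0]
total_hiv_dna = [200.0, 150.0, 400.0, 380.0, 900.0]
rho = spearman_r(years_on_art, total_hiv_dna)
print(round(rho, 3))
```

      Because only ranks are used, the statistic captures any monotonic trend, but in a cross-sectional cohort it says nothing about within-person decay over time.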

      On dimensionality reduction, as this PaCMAP seems a relatively new technique (vs tSNE and UMAP which are more standard, but absolutely have their weaknesses), it does seem important to contextualize. I think it would still be useful to show PCA and assess the % variance of each additional dimension to assess the effective dimensionality. It would be helpful to show a plot of % variance by # components to see if there is a cutoff somewhere, and if PaCMAP is really picking this up to determine the 2 dimensions/2 clusters is ideal. Figure 4B ultimately shows a lot of low/high across those clusters, and since low/high is defined categorically it's hard to know which of those dots are very close to the other categories.

      We have added this analysis to the manuscript – found in Figure S9. The PCA plot indicates that members of the two clusters also separate on PCA although this separation is not as clear as for the PaCMAP plot.
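
      The "% variance by # components" check that the reviewer requested can be sketched as follows. This is a generic illustration on synthetic data, not the analysis behind Figure S9; the function name, the loading vector, and the noise level are all assumptions.

```python
import numpy as np

def explained_variance_ratio(X):
    """Fraction of total variance captured by each principal component."""
    centered = X - X.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()

rng = np.random.default_rng(0)
# Synthetic data: 100 "participants", 5 features driven mostly by one latent axis.
latent = rng.normal(size=(100, 1))
loadings = np.array([[1.0, 0.8, -0.6, 0.4, 0.2]])
X = latent @ loadings + 0.1 * rng.normal(size=(100, 5))

ratio = explained_variance_ratio(X)
print(np.round(ratio, 3))              # per-component share of variance
print(np.round(np.cumsum(ratio), 3))   # cumulative share, useful for spotting a cutoff
```

      A sharp drop after the first one or two components would suggest a low effective dimensionality, which is the kind of evidence the reviewer asks for before trusting a two-dimensional embedding.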

      Minor comments on writing etc:

      Intro

      -Needs some references on immune activation sequelae paragraph.

      We have added some additional references to this section.

      -"promote the entry of recently infected cells into the reservoir" -- that is only one possible mechanistic explanation, it's not unreasonable but it seems important to keep options open until we have more precise data that can illuminate the mechanism of the overabundance.

      We have modified the text to discuss additional hypotheses.

      -You might also reference Pankau et al Ppath for viral seeding near the time of ART.

      We have added this reference.

      -"Viral protein expression during therapy likely maintains antigen-specific cells of the adaptive immune system" - this was unclear to me, do you mean HIV-specific cells that act against HIV during ART? I think most studies show immunity against HIV (CD8 and CD4) wanes over time during ART.

      The Goonetilleke lab has recently generated data indicating that antiviral T cell responses are remarkably stable over time on ART, but we agree with the reviewer that the idea that ongoing antigen expression in the reservoir maintains these cells is speculative.  We have modified the text to make this point clearer.

      -Overall I think the introduction lacked a little bit of definitional precision: i.e. is the reservoir intact vs replication competent vs all HIV DNA and whether we are talking about PWH on long-term ART and how long we should be imagining? The first years of ART are certainly different than later, in terms of dynamics. The ultimate implications are likely specific for some of these categorizations.

      -"persistent sequelae of the massive disruptions to T cell homeostasis and lymphoid structures that occur during untreated HIV infection" needs a lot more context/referencing. For instance, Peter Hunt showed a decrease in activation after ART a long time ago.

      -Heather Best et al show T cell clonality stays perturbed after ART.

      We have updated the text of the introduction and added references to address the reviewer’s comments.

      Results

      -It would be important to mention the race of participants and any information about expected clades of acquired viruses, this gets mentioned eventually with reference to the Table but the breakdown would be helpful right away.

      We have added this information to the results section.

      -"performed Spearman correlations", may be calculated or tested?

      We have corrected the language for this sentence.

      Comments on figures:

      -Figure 1 data on linear scale (re discussion above) -- hard to even tell if there is a decay (to match with all we know from various long-term ART studies).

      -Figure 4 data is shown on ln (log_e) scale, which is hard to interpret for most people.

      -Figures 4 C,D, and E should have box plots to visually assess the significance.

      -Figure 4B legend says purple/pink but I think the colors are different in the plot, could be about transparency

      -Figure 5 it is now not clear if log_e(?).

      -Figure 6 "HIV reservoir characteristics" might be better to make this more explicit. Do you mean for instance in the 6B title Total HIV DNA per million CD4+ T cells I think?

      We have made these modifications.

      Reviewer #2 (Recommendations For The Authors):

      Minor Weaknesses:

      (1) The Introduction is too long and much of the text is not directly related to the study's research question and design.

      We have streamlined the introduction in the revised manuscript.

      (2) While no differences were seen by age or race, according to the authors, this is unlikely to be useful since the numbers are so small in some of these subcategories. Results from sensitivity analyses (e.g., excluding these individuals) may be more informative/useful.

      We agree that the lower numbers of participants for some subgroupings make it challenging to know for sure if there are any differences based on these variables.  We have added text to clarify this. We have added age, race and gender to the LOCO analysis and to the variable inflation importance analysis (Table S5).

      (3) For Figure 4, based on what was described in the Results section of the manuscript, the authors should clarify that the figures show results for TOTAL HIV DNA only (not intact DNA): "Dimension reduction machine learning approaches identified two robust clusters of PWH when using total HIV DNA reservoir-associated immune cell frequencies (Figure 4A), but not for intact or percentage intact HIV DNA (Figure 4B and 4C)".

      We have added this information.

      (4) The statement on page 5, first paragraph, "Interestingly, when we examined a plot of percent intact proviruses versus time on therapy (Figure 1C), we observed a biphasic decay pattern," is not new (Peluso JCI Insight 2020, Gandhi JID 2023, McMyn JCI 2023). Prior studies have clearly demonstrated this biphasic pattern and should be cited here, and the sentence should be reworded with something like "consistent with prior work", etc.

      We have added citations to these studies and rephrased this comment.

      (5) The Cohort and sample collection sections are somewhat thin. Further details on the cohort should include at the very minimum some description of the timing of ART initiation (is this mostly a chronic-treated cohort?) and important covariate data such as nadir CD4+ T cell count, pre-ART viral load, duration of ART suppression, etc.

      The cohort was treated during chronic infection, and we have clarified this in the manuscript.  Information regarding CD4 nadir and years on ART is included in Table 1.  Unfortunately, pre-ART viral load was not available for most members of this cohort, so we did not use it for analyses. The partial pre-ART viral load data is included with the dataset we are making publicly available.

      Reviewer #3 (Recommendations For The Authors):

      Minor points:

      (1) What is meant by CD4 nadir? Is this during primary infection or the time before ART initiation?

      We have clarified this description in the manuscript.  This term refers to the lowest CD4 count recorded during untreated infection.

      (2) The authors claim that determinants of reservoir size are starting to emerge but other than the timing of ART, I am not sure what studies they are referring to.

      We have updated the language of this section.  We intended to refer to studies looking at correlates of reservoir size, and feel that this is a more appropriate term than ‘determinants’.

      (3) The discussion does not tie in the model-generated hypotheses with the known mechanisms that sustain the reservoir: clonal proliferation balanced by death and subset differentiation. It would be interesting to tie in the proposed reservoir clusters with these known mechanisms.

      We have added additional text to the manuscript to address these mechanisms.

      (4) Figure 1: Total should be listed as total HIV DNA.

      We have updated this in the manuscript.

      (5) Figure 1C: Worth mentioning the paper by Reeves et al which raises the possibility that the flattening of intact HIV DNA at 9 years may be spurious due to small levels of misclassification of defective as intact.

      We have added this reference.

      (6) "Total reservoir frequency" should be "total HIV DNA concentration"

      We respectfully feel that “frequency” is a more accurate term than “concentration”, since we are expressing the reservoir as a fraction of the CD4 T cells, while “concentration” suggests a denominator of volume.

      (7) Figure S2-5: label y-axis total HIV DNA.

      We have updated this figure.

    1. Reviewer #1 (Public Review):

      Summary:

      Building upon their famous tool for the deconvolution of human transcriptomics data (EPIC), Gabriel et al. implemented a new methodology for the quantification of the cellular composition of samples profiled with Assay for Transposase-Accessible Chromatin sequencing (ATAC-seq). To build a signature for ATAC-seq deconvolution, they first created a compendium of ATAC-seq data and derived chromatin accessibility marker peaks and reference profiles for 12 cell types, encompassing immune cells, endothelial cells, and fibroblasts. Then, they coupled this novel signature with the EPIC deconvolution framework based on constrained least-square regression to derive a dedicated tool called EPIC-ATAC. The method was then assessed using real and pseudo-bulk ATAC-seq data from human peripheral blood mononuclear cells (PBMC) and, finally, applied to ATAC-seq data from breast cancer tumors to show it accurately quantifies their immune contexture.
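
      To make the "constrained least-square regression" idea concrete, here is a minimal sketch of non-negative least squares solved by projected gradient descent. It is a generic illustration, not EPIC-ATAC's implementation: the solver, the synthetic reference profiles, and the three-cell-type setup are assumptions, and the real tool applies additional constraints and handling that are not reproduced here.

```python
import numpy as np

def nnls_projected_gradient(A, b, n_iter=5000):
    """Minimize 0.5 * ||A x - b||^2 subject to x >= 0 by projected gradient descent."""
    x = np.zeros(A.shape[1])
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        gradient = A.T @ (A @ x - b)
        x = np.maximum(x - step * gradient, 0.0)  # project back onto x >= 0
    return x

rng = np.random.default_rng(0)
# Synthetic reference profiles: 40 marker peaks x 3 cell types (hypothetical).
reference = rng.uniform(size=(40, 3))
true_fractions = np.array([0.5, 0.3, 0.2])
bulk = reference @ true_fractions  # noiseless synthetic bulk sample

estimated = nnls_projected_gradient(reference, bulk)
fractions = estimated / estimated.sum()  # rescale so the fractions sum to 1
print(np.round(fractions, 3))
```

      With noisy real data the recovered fractions would only approximate the truth, which is why the benchmarking against PBMC and pseudo-bulk samples matters.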

      Strengths:

      Overall, the work is of very high quality. The proposed tool is timely; its implementation, characterization, and validation are based on rigorous methodologies and result in robust estimates. The newly generated validation data and the code are publicly available and well-documented. Therefore, I believe this work and the associated resources will greatly benefit the scientific community.

      Weaknesses:

      In the benchmarking analysis, EPIC-ATAC was compared also to deconvolution methods that were originally developed for transcriptomics and not for ATAC-seq data. However, the authors described in detail the specific settings used to analyze this different data modality as robustly as possible, and they discussed possible limitations and ideas for future improvement.

    1. By doubling the size of the tables in the eating areas, they increased cross-divisional talk in a very informal way. They found that cross-department cooperation increased after that, and the code output increased two months later.

      for - neuroscience - example - informal diversity - increases work efficacy - via sharing diverse and novel perspectives

    1. Welcome back, and in this lesson, I want to talk about AWS CloudFormation. I'm going to be brief because learning CloudFormation is something which will happen throughout the course, as we'll be using it to automate certain things. Before we dive in, I'll introduce the concepts you'll need and give you a chance to experience a simple practical example.

      CloudFormation is a tool which lets you create, update, and delete infrastructure in AWS in a consistent and repeatable way using templates. Rather than creating and updating resources manually, you create a template, and CloudFormation will do the rest on your behalf.

      At its base, CloudFormation uses templates. You can use a template to create AWS infrastructure using CloudFormation. You can also update a template and reapply it, which causes CloudFormation to update the infrastructure, and eventually, you can use CloudFormation to delete that same infrastructure. A CloudFormation template is written either in YAML or JSON. Depending on your experience, you might be familiar with one or both of these. If you haven't touched YAML or JSON before, don't worry. They achieve the same thing, and it's easy to convert between them. You might get to pick which one to use when writing templates, or your business might have a preference. It's mostly a matter of personal preference. Most people in the AWS space like one and dislike the other, though very few people like both. I am one of those who likes both. I started my AWS career using JSON but have come to appreciate the extra functionality that YAML offers. However, YAML can be easier to make mistakes with because it uses white spaces to indicate which parts belong to which other parts. Since spaces are not always visible, it can be a problem for less experienced engineers or architects. If I have to pick one, I'll use YAML. So for the rest of this lesson, I'll focus on YAML.

      I want to quickly step through what makes a template, the components of a template, and then discuss the architecture of CloudFormation before moving on to a demo. All templates have a list of resources, at least one. The resources section of a CloudFormation template tells CloudFormation what to do. If resources are added, CloudFormation creates them. If resources are updated, CloudFormation updates them. If resources are removed from a template and that template is reapplied, then physical resources are removed. The resources section of a template is the only mandatory part of a CloudFormation template, which makes sense because without resources, the template wouldn't do anything. The simple template that we'll use in the demo lesson immediately following this one has resources defined in it, and we'll step through those and evaluate exactly what they do.

      Next is the description section. This is a free text field that lets the author of the template add a description, as the name suggests. Generally, you would use this to provide details about what the template does, what resources get changed, and the cost of the template. Anything that you want users to know can be included in the description. The only restriction to be aware of is if you have both a description and an AWSTemplateFormatVersion, then the description needs to immediately follow the template format version. The template format version isn't mandatory, but if you use both, the description must directly follow the template format version. This has been used as a trick question in many AWS exams, so it pays to be aware of this restriction. The template format version allows AWS to extend standards over time. If it's omitted, the value is assumed.
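
      As a rough illustration of the sections covered so far, here is a minimal sketch of a template built as a Python dictionary and serialized to JSON, one of the two formats mentioned earlier. The logical name `Instance`, the description text, and the `ami-EXAMPLE` image ID are placeholders, not values from the lesson.

```python
import json

template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    # Per the lesson, the description directly follows the format version.
    "Description": "Creates a single EC2 instance (illustrative only).",
    "Resources": {
        "Instance": {  # logical resource name (placeholder)
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "ImageId": "ami-EXAMPLE",  # placeholder, not a real AMI ID
                "InstanceType": "t2.micro",
            },
        }
    },
}

template_json = json.dumps(template, indent=2)
print(template_json)
```

      Python dictionaries keep insertion order, so the serialized JSON lists the format version first, then the description, then the resources.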

      The metadata in the template is the next part I want to discuss. It has many functions, including some advanced ones. For example, metadata can control how different elements in the CloudFormation template are presented through the console UI. You can specify groupings, control the order, and add descriptions and labels, which helps in managing how the UI presents the template. Generally, the bigger your template and the wider the audience, the more likely it is to have a metadata section. Metadata serves other purposes, which I'll cover later in the course.

      The parameters section of a template allows you to add fields that prompt the user for more information. When applying the template from the console UI, you'll see boxes to type in or select from dropdowns. This can be used to specify things like the size of the instance to create, the name of something, or the number of availability zones to use. Parameters can have settings for valid entries and default values. You'll gain more experience with this as we progress through the course and use CloudFormation templates.
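
      A parameter with a default and a list of valid entries might look like the sketch below, again written as a Python dictionary that serializes to a JSON template. The parameter name `InstanceTypeParam`, the allowed sizes, and the `ami-EXAMPLE` image ID are assumptions made for illustration.

```python
import json

template = {
    "Parameters": {
        "InstanceTypeParam": {  # hypothetical parameter name
            "Type": "String",
            "Default": "t2.micro",
            "AllowedValues": ["t2.micro", "t2.small"],
            "Description": "Size of the instance to create",
        }
    },
    "Resources": {
        "Instance": {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "ImageId": "ami-EXAMPLE",  # placeholder, not a real AMI ID
                # Ref pulls in whatever value the user supplied for the parameter.
                "InstanceType": {"Ref": "InstanceTypeParam"},
            },
        }
    },
}

print(json.dumps(template, indent=2))
```

      In the console, a parameter with `AllowedValues` is the kind that appears as a dropdown rather than a free-text box.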

      The next section is mappings, which is another optional section of the CloudFormation template and something we won't use as much, especially when starting with CloudFormation. It allows you to create lookup tables. For example, you can create a mappings table called RegionAndInstanceTypeToAMI, which selects a specific Amazon Machine Image based on the region and environment type (e.g., test or prod). This is something you'll get experience with as the course continues, but I wanted to introduce it at this point.
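
      The lookup-table idea can be sketched like this. The map name comes from the lesson; the region key, the AMI IDs, and the `EnvironmentType` parameter name are placeholders, and `find_in_map` is a toy Python helper that mimics the lookup locally rather than anything CloudFormation provides.

```python
import json

template = {
    "Parameters": {
        "EnvironmentType": {"Type": "String", "AllowedValues": ["test", "prod"], "Default": "test"}
    },
    "Mappings": {
        "RegionAndInstanceTypeToAMI": {
            "us-east-1": {"test": "ami-TEST-EXAMPLE", "prod": "ami-PROD-EXAMPLE"},  # placeholder IDs
        }
    },
    "Resources": {
        "Instance": {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                # Fn::FindInMap takes [map name, top-level key, second-level key].
                "ImageId": {
                    "Fn::FindInMap": [
                        "RegionAndInstanceTypeToAMI",
                        {"Ref": "AWS::Region"},
                        {"Ref": "EnvironmentType"},
                    ]
                },
            },
        }
    },
}

def find_in_map(tpl, map_name, top_key, second_key):
    """Toy local emulation of the two-level mapping lookup."""
    return tpl["Mappings"][map_name][top_key][second_key]

print(find_in_map(template, "RegionAndInstanceTypeToAMI", "us-east-1", "prod"))
print(json.dumps(template["Mappings"], indent=2))
```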

      Next, let's talk about conditions. Conditions allow decision-making in the template, enabling certain things to occur only if a condition is met. Using conditions involves a two-step process. Step one is to create the condition. For instance, if a parameter is equal to "prod" (i.e., if the template is being used to create prod resources), then you create a condition called CreateProdResources. If the parameter "environment type" is set to "prod," the condition CreateProdResources will be true. Step two is using this condition within resources in the CloudFormation template. For example, a resource called Prodcatgifserver will only be created if the condition CreateProdResources is true. This will only be true if the "environment type" parameter is set to "prod" rather than "test." If it's set to "test," that resource won't be created.
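
      The two-step process can be sketched as follows. The condition and resource names follow the lesson; the `EnvironmentType` parameter spelling and the `ami-EXAMPLE` image ID are assumptions, and the two helper functions are a toy local evaluator that only understands this one `Fn::Equals` pattern.

```python
template = {
    "Parameters": {
        "EnvironmentType": {"Type": "String", "AllowedValues": ["test", "prod"], "Default": "test"}
    },
    # Step one: define the condition.
    "Conditions": {
        "CreateProdResources": {"Fn::Equals": [{"Ref": "EnvironmentType"}, "prod"]}
    },
    # Step two: attach the condition to a resource.
    "Resources": {
        "Prodcatgifserver": {
            "Type": "AWS::EC2::Instance",
            "Condition": "CreateProdResources",
            "Properties": {"ImageId": "ami-EXAMPLE"},  # placeholder, not a real AMI ID
        }
    },
}

def condition_is_true(tpl, name, params):
    """Toy evaluator: resolves Ref against params and compares the two sides."""
    left, right = tpl["Conditions"][name]["Fn::Equals"]
    resolve = lambda value: params[value["Ref"]] if isinstance(value, dict) else value
    return resolve(left) == resolve(right)

def resources_to_create(tpl, params):
    """List logical resources whose condition is absent or evaluates to true."""
    created = []
    for logical_id, resource in tpl["Resources"].items():
        condition = resource.get("Condition")
        if condition is None or condition_is_true(tpl, condition, params):
            created.append(logical_id)
    return created

print(resources_to_create(template, {"EnvironmentType": "prod"}))
print(resources_to_create(template, {"EnvironmentType": "test"}))
```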

      Finally, outputs are a way for the template to present outputs based on what's being created, updated, or deleted once the template is finished. For example, outputs might return the instance ID of an EC2 instance that's been created, or if the template creates a WordPress blog, it could return the admin or setup address for that blog.
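
      An output that returns the instance ID might be sketched like this. The output name `InstanceId`, the description text, and the `ami-EXAMPLE` image ID are placeholders chosen for the example.

```python
import json

template = {
    "Resources": {
        "Instance": {
            "Type": "AWS::EC2::Instance",
            "Properties": {"ImageId": "ami-EXAMPLE"},  # placeholder, not a real AMI ID
        }
    },
    "Outputs": {
        "InstanceId": {  # hypothetical output name
            "Description": "ID of the EC2 instance created by this template",
            # Ref on an EC2 instance resolves to its instance ID.
            "Value": {"Ref": "Instance"},
        }
    },
}

print(json.dumps(template["Outputs"], indent=2))
```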

      So, how exactly does CloudFormation use templates? CloudFormation starts with a template. A template contains resources and other elements you'll become familiar with as we use CloudFormation more. Let's take a simple example—a template that creates an EC2 instance. Resources inside a CloudFormation template are called logical resources. In this case, the logical resource is called "instance," with a type of AWS::EC2::Instance. The type tells CloudFormation what to create. Logical resources generally also have properties that CloudFormation uses to configure the resources in a specific way.

      When you provide a template to CloudFormation, it creates a stack, which contains all the logical resources defined in the template. A stack is a living and active representation of a template. One template could create one stack, or several stacks, or anywhere in between. A stack is created when you tell CloudFormation to do something with that template.

      For any logical resources in the stack, CloudFormation makes a corresponding physical resource in your AWS account. For example, if the stack contains a logical resource called "instance," which defines an EC2 instance, the physical resource is the actual EC2 instance created by CloudFormation. It's CloudFormation's job to keep the logical and physical resources in sync. When you use a template to create a stack, CloudFormation scans the template, creates a stack with logical resources, and then creates matching physical resources.

      You can also update a template and use it to update the stack. When you do this, the stack's logical resources will change—new ones may be added, existing ones updated or deleted. CloudFormation performs the same actions on the physical resources, adding, updating, or removing them as necessary. If you delete a stack, its logical resources are deleted, leading CloudFormation to delete the matching physical resources.
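
      The add, update, and remove behaviour described above can be pictured as a comparison between the old and new resource sections. The function below is a toy model written for this note, not how CloudFormation is actually implemented, and the resource names are hypothetical.

```python
def plan_changes(old_resources, new_resources):
    """Compare two Resources sections and report what a stack update would touch."""
    added = sorted(new_resources.keys() - old_resources.keys())
    removed = sorted(old_resources.keys() - new_resources.keys())
    updated = sorted(
        name
        for name in old_resources.keys() & new_resources.keys()
        if old_resources[name] != new_resources[name]
    )
    return {"add": added, "update": updated, "remove": removed}

old = {
    "Instance": {"Type": "AWS::EC2::Instance", "Properties": {"InstanceType": "t2.micro"}},
    "Bucket": {"Type": "AWS::S3::Bucket"},
}
new = {
    "Instance": {"Type": "AWS::EC2::Instance", "Properties": {"InstanceType": "t2.small"}},
    "Queue": {"Type": "AWS::SQS::Queue"},
}

plan = plan_changes(old, new)
print(plan)
```

      Deleting a stack corresponds to the case where the new resource section is empty: every logical resource lands in the remove list, and the matching physical resources go with it.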

      CloudFormation is a powerful tool that allows you to automate infrastructure. For instance, if you host WordPress blogs, you can use one template to create multiple deployments rather than setting up each site individually. CloudFormation can also be part of change management, allowing you to store templates in source code repositories, make changes, get approval, and apply them as needed. It can also be used for one-off deployments.

      Throughout this course, I'll be using CloudFormation to help you implement various things in demo lessons. If a demo lesson requires certain products to function, I might provide a CloudFormation template to set up the base infrastructure. Alternatively, you can use the template to implement the entire demo end-to-end. CloudFormation is super powerful, and you'll get plenty of exposure to it throughout the course.

      Now, that's all the theory I wanted to cover. The next lesson will be a demo where you'll use CloudFormation to create an EC2 instance. Remember in the EC2 demo lesson, where you created an EC2 instance? In the next demo lesson, you'll create a similar EC2 instance using CloudFormation, demonstrating how much quicker and easier it is to automate infrastructure tasks with CloudFormation. So go ahead, complete this video, and when you're ready, join me in the next lesson where we'll demo CloudFormation.

  3. Local file

    1. Reviewer #1 (Public Review):

      Summary:

      The authors applied a domain adaptation method using the principle of optimal transport (OT) to superimpose read count data onto each other. While the title suggests that the presented method is independent from and performs better than other methods of bias correction, the presented work uses a self-implemented version of GC bias correction as part of the OT domain adaptation. Performance comparisons were done both on normalized read counts and on copy number profiles, which is already the complete set of presented use cases. Results involving copy number profiles from iChorCNA were also subjected to the bias correction measures implemented there. It is not clear at many points which correction method actually causes the observed performance.
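
      For orientation, the simplest instance of optimal transport is the one-dimensional case with equal sample sizes, where the optimal map for a convex cost sends the i-th smallest source value to the i-th smallest target value. The sketch below shows only that textbook special case; it is not the authors' method, and the function name and numbers are hypothetical.

```python
import numpy as np

def ot_map_1d(source, target):
    """Map each source value to the target value of equal rank (equal sample sizes)."""
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    if source.shape != target.shape:
        raise ValueError("this sketch assumes equally sized 1-D samples")
    ranks = np.argsort(np.argsort(source))  # rank of each source value
    return np.sort(target)[ranks]

# Hypothetical normalized read counts for one bin across samples in two domains.
source_domain = [3.0, 1.0, 2.0]
target_domain = [10.0, 30.0, 20.0]
adapted = ot_map_1d(source_domain, target_domain)
print(adapted)
```

      After the mapping, the adapted values have exactly the target distribution while preserving the ordering of the source samples, which is the sense in which the two domains are superimposed.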

      Strengths:

      The quality of superimposing distributions of normalized read counts (and copy number profiles) was sufficiently shown using uniformly distributed p-values in the interval of 0 to 1 for healthy controls D7 and D8 which differed in the choice of library preparation kit.

      The ability to select a sample from the source domain for samples in the target domain was demonstrated.

      Weaknesses:

      Experiment Design:

      The chosen bias correction methods are not explicitly designed for nor aimed at domain adaptation. The benchmark against GC bias correction while doing GC bias correction during the OT procedure is probably the most striking flaw of the entire work. GC bias correction has the purpose of correction GC biases, wherever present, NOT correcting categorical pre-analytical variables of undefined character. A more thorough examination of the presented results should address why plain iChor CNA is the best performing "domain adaptation" in some cases. Also, the extent to which the implemented GC bias correction is contributing to the performance increase independent of the OT procedure should be assessed separately in each case.<br /> Moreover, the center-and-scale standardization is probably not the most relevant contestant in domain adaptation that is out there.

      Comparison of cohorts (domains), especially healthy controls from D7 and D8: it is not described which type of ChIP analysis was done for the healthy controls of the D7 domain. The utilized library preparation kit implies that D7 represents a subset of the available cfDNA in a plasma sample, obtained by precipitating only certain cfDNA fragments to which an undisclosed type of protein was bound. Even if the type of protein turns out to be histones, the extracted subset of cfDNA should not be regarded as coming from the same distribution of cfDNAs. For example, fragments of sub-mononucleosomal length would be depleted in the ChIP-seq data set, while these could be extracted in an untargeted cfDNA sequencing data set. It needs to be clarified why the authors deem D7 and D8 healthy controls to be identical with regard to SCNA analysis. It is best to start with the protein targets of the D7 ChIP-seq samples.

      From the Illumina TruSeq ChIP product description page: "TruSeq ChIP Library Preparation Kits provide a simple, cost-effective solution for generating chromatin immunoprecipitation sequencing (ChIP-Seq) libraries from ChIP-derived DNA. ChIP-seq leverages next-generation sequencing (NGS) to quickly and efficiently determine the distribution and abundance of DNA-bound protein targets of interest across the genome."

      Redundancy:

      Some parts of the results and discussion reappear in the methods. The description of the methodology should be concentrated in the methods section and only reiterated in a summarizing fashion where absolutely necessary. Unnecessary repetition inflates the presented work, which is not appealing to the reader. Rather, include more details of the utilized materials and methods in the corresponding section.

      Transparency:

      At the time of review, the code was not available under the provided link. A part of the healthy controls from D8 is not contained under the provided accession (367 healthy samples are available in the database vs. a sum of 499 healthy controls in D7 and D8).

      Neither in the paper nor in reference 4 is an explanation of what was targeted with the ChIP-seq approach.

      Consistency:

      It is not evident why a ChIP-seq library prep kit was used (sample cohorts designated as D7). The DNA isolation procedure was not presented as having an immunoprecipitation step. Furthermore, it is not clear which DNA-bound proteins were targeted during ChIP-seq, if such an immunoprecipitation was actually carried out. The authors self-implemented a GC bias correction procedure although they had already mentioned other procedures such as LIQUORICE. Also, tools already exist that can be used to correct GC bias, such as deepTools (github.com/deeptools/deepTools). Other GC bias correction algorithms designed specifically for cfDNA are Griffin (github.com/adoebley/Griffin) and GCparagon (github.com/BGSpiegl/GCparagon). When benchmarking against state-of-the-art cfDNA GC bias correction, these algorithms should appear, in a relevant scientific work, somewhere other than the introduction, preferably in the results section. It should be shown that the chosen GC bias correction method performs best under the given circumstances.

      Accuracy:

      Use clear labels for each group of samples. The domain number is not sufficient to effectively distinguish sample groups. Already the source name plus a simple enumeration would improve the clarity at some points.

      The healthy controls of D7 and D8 are described but the numbers do not add up (257 healthy controls in line 227 vs. 260 healthy controls in line 389). Please double check this and use representative sample cohort labels in the materials description for improved clarity!

      Avoid statements like "the rest" when talking about a mixed set of samples. It is not clear how many samples from which domain are addressed.

      For optimal transport, knowledge about the destination is required ("where do I want to transport to?") and, thus, the proposed method can never be unsupervised. It is always necessary to know the label of both the source and target domains. In practice, this is not often the case and users might fall prey to the error of superimposing data that is actually separated by valid differences in some experimental variables.

      Seemingly arbitrary cutoff values are mentioned. For example, it is not clear if choosing "the cutoff that produced the highest MCCs" is meant across methods or for each method separately (are the results for each method reported that also resulted in the highest MCC for that method?).
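
The ambiguity can be made concrete with a small sketch that tunes the cutoff either separately per method or once for all methods. The scores, labels, method names and candidate cutoffs below are hypothetical and are not taken from the manuscript; the point is only that the two selection rules can return different cutoffs, so the rule that was used should be reported.

```python
import math

def mcc(labels, predictions):
    """Matthews correlation coefficient for binary labels (0/1)."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def mcc_at(labels, scores, cutoff):
    """MCC when every score >= cutoff is called positive."""
    return mcc(labels, [1 if s >= cutoff else 0 for s in scores])

# Hypothetical ground truth and per-method scores (illustration only).
labels = [0, 0, 0, 1, 1, 1]
scores_by_method = {
    "method_a": [0.1, 0.2, 0.3, 0.6, 0.7, 0.9],
    "method_b": [0.5, 0.6, 0.7, 0.8, 0.85, 0.9],
}
cutoffs = [0.25, 0.5, 0.75]

# Variant 1: each method gets the cutoff that maximises its own MCC.
per_method = {
    name: max(((c, mcc_at(labels, s, c)) for c in cutoffs), key=lambda r: r[1])
    for name, s in scores_by_method.items()
}

# Variant 2: one shared cutoff maximising the mean MCC across methods.
def mean_mcc(cutoff):
    values = [mcc_at(labels, s, cutoff) for s in scores_by_method.values()]
    return sum(values) / len(values)

shared_cutoff = max(cutoffs, key=mean_mcc)
```

In this toy setting the per-method rule picks a different cutoff for each method, while the shared rule favours one of them, which is exactly the distinction the manuscript should spell out.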

      The Euclidean metric for assessing the similarity of (normalized) read counts is questionable for a high dimensional space: read counts are assessed for 1 Mb genomic intervals which yields around 3000 intervals (dimensions), depending on the number of excluded intervals (which was not described in more detail). There might be more appropriate measures in this high dimensional space.
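
A rough illustration of the concern, assuming independent uniform random coordinates (real per-bin read counts are correlated, so the effect may be weaker in practice): the relative spread of Euclidean distances from a query point shrinks as the number of dimensions grows, which makes nearest-neighbour style assignments less discriminative. The function name and parameters are hypothetical.

```python
import math
import random

def relative_contrast(dim, n_points=200, seed=0):
    """(max - min) / min of Euclidean distances from one query point to random points."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    distances = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        distances.append(math.dist(query, point))
    return (max(distances) - min(distances)) / min(distances)

# Two dimensions versus roughly the number of 1 Mb bins mentioned above.
low_dim_contrast = relative_contrast(dim=2)
high_dim_contrast = relative_contrast(dim=3000)
print(low_dim_contrast, high_dim_contrast)
```

Under these assumptions the contrast at 3000 dimensions is a small fraction of the contrast at 2 dimensions; whether the same holds for the actual read count data would need to be checked on that data.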

      It is sometimes not clear what data actually is presented. An example would be the caption of Figure 2, (C): it is suggested that all (320) ovarian cancer cases are shown in one copy number profile.

      Furthermore, the authors do not make a distinction between male and female samples. A clarification is needed as to why the authors think SCNAs of ovarian cancer samples should be called against a reference set that contains male controls. The procedure would likely benefit from a strict separation of male and female cases, which would also allow chrX (and chrY) to be included in downstream analysis.

      The GC bias and mappability correction implicitly done by iChorCNA for the SCNA profile comparison is presented as "no correction", which is highly misleading. (To clarify, this is deemed inappropriate, not just inaccurate.)

      The procedure does not give any significant improvement regarding the similarity of copy number profiles. The majority of the interpretations presented are off and in many instances favor the OT procedure in an unscientific and highly inappropriate manner.

      Apart from duplicate marking (which is not specified any further - provide the command(s)!), there is no information on which reads (or read pairs) were used (primary, secondary, supplementary, mapped in a proper pair, fragment length restrictions, clipping restrictions, etc.). The authors should explain why base quality score recalibration was done, as this might be an unnecessary step if the base quality values are not used later on.

      The adaptation method presented as "center-and-scale standardization" is inappropriate for unbalanced cancer profiles since it assumes the presence of identical SCNAs in all samples belonging to the same cancer entity. Please explain why normalizing 1 Mb genomic intervals to the average copy number across different cancer samples should be valid, or use another domain adaptation method for performance comparison.

      Statements like in line 83 (unsupervised DA) are plain wrong because transport from one domain to another requires the selection of a target domain based on a label, e.g., based on health status, cancer entity, or similar.

      Relevance and Appropriateness:

      Many of the presented results are not relevant, or details of the procedure were incomprehensible or incomplete, for example the results presented in Table 2 (sample assignment). The Euclidean metric seems to be inappropriate for high dimensional data. Also, the selection of the cutoff based on Euclidean distance seems to enable optimization in favor of the OT procedure. It is hypothesized that there might exist other cutoff values for which the selection of samples from the source domain would also work for other correction methods, but this is not further described. It could simply be the case that OT can assign a relationship between domains

      The statement that there are no continuous pre-analytical variables is wrong (304). The effect of target depth-of-coverage (DoC) was not analyzed although this represents one of the most common (continuous) and difficult to control variables in NGS data analysis. The inclusion of multiple samples from a single patient in a cohort likely represents introduction of a confounding factor ["contamination"] to the model training procedure: the temporal difference that lies between the taken samples of that patient represents leakage of information. As far as can be told from the presented data, this potential bias has not been ruled out (e.g., exclusion of all samples beyond the first from each patient or alternatively: picking all samples of a patient either for the training set or the test set).
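
The second option mentioned above (keeping all samples of a patient on the same side of the split) can be sketched as follows. The sample and patient identifiers are hypothetical, and the function is an illustration of a patient-grouped split, not the authors' procedure.

```python
import random

def split_by_patient(sample_to_patient, test_fraction=0.25, seed=0):
    """Assign whole patients to train or test so that no patient appears in both."""
    patients = sorted(set(sample_to_patient.values()))
    rng = random.Random(seed)
    rng.shuffle(patients)
    n_test = max(1, round(len(patients) * test_fraction))
    test_patients = set(patients[:n_test])
    train = [s for s, p in sample_to_patient.items() if p not in test_patients]
    test = [s for s, p in sample_to_patient.items() if p in test_patients]
    return train, test

# Hypothetical cohort: several patients contribute more than one sample.
sample_to_patient = {
    "s1": "P1", "s2": "P1", "s3": "P2", "s4": "P3",
    "s5": "P3", "s6": "P3", "s7": "P4", "s8": "P5",
}
train, test = split_by_patient(sample_to_patient)
train_patients = {sample_to_patient[s] for s in train}
test_patients = {sample_to_patient[s] for s in test}
```

Splitting at the patient level rather than the sample level is what prevents longitudinal samples of one patient from leaking information between training and test sets.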

      Conscientiousness:

      Statements like "good"/"best" on their own should be avoided. A clear description of why a certain procedure/methodology/algorithm performs better should be preferred in scientific writing (e.g., "highest MCC values" instead of "best MCC values"). Otherwise, such statements represent mere opinions of the author rather than an unbiased evaluation of the results. The domain D8 of healthy controls seems to contain samples from multiple sources (some published, others in-house). Contrary to the data availability statement (533), not all healthy control samples of the HEMA data set are available from ArrayExpress.

      Other Major Concerns:

      Potential Irrelevance:

      The manuscript represents a mere performance assessment of the proposed sWGS per-bin-read-count fitting procedure and, thus, a verification in its character, not a validation (although the model training itself was "validated" - but this is to be viewed separately from the validity of the achieved correction in a biological context). A proper (biological) validation is missing.

      It is of utmost importance that parameters of the adapted (transported) samples that lie outside of what has been optimized to be highly similar are checked to actually validate the procedure. In particular, biological signals and genome-wide parameters (GC content distribution before/after transport) need to be addressed, also in light of the rampant criticism of GC bias correction by the authors. At no point in the manuscript was GC bias addressed properly, i.e., how much of an improvement is expected from GC bias correction if there is no significant GC bias?

      The (potential - not clear so far) ability to make ChIP-seq data look like cfDNA data (even if only the copy number profiles appear highly similar) raises the concern that future users of the tool might superimpose domains that should not be superimposed from a biological point of view, because the true domains the superimposed cohorts belong to are different. The ability to superimpose anything onto anything is troubling. There is no control mechanism that allows for failure in cases where the superposition is invalid.

      Chromosome X was excluded which could be avoided if data sets were split according to biological sex.

      The difference between the distributions was never attributed to GC bias, hence, the benchmark against GC bias correction tools might not be relevant in the first place.

      Stability of OT data transformation:

      The authors state that the straightforward choice of lambda resulted in many occasions where disruptions (of unspecified nature and amplitude) are introduced in the copy number profiles of transformed data. It is not evident from the presented work to what extent this behavior was removed from the procedure, whether it can still occur, and how users could resolve such a problem on their own.

      In summary, the presented work needs considerable adaptation and additions before it can actually be considered a valuable contribution to the liquid biopsy field.

    1. Mr. B has risen beyond his real abilities by virtue of health, good looks, and being a good mixer. He married for money and he has done lots of other things for money. His code is not his own; it is that of his class—no worse, no better, He fits easily into whatever pattern is successful. That is his sole measure of value—success. Nazism as a minority movement would not attract him. As a movement likely to attain power, it would

      "That is his sole measure of value - success."

    1. F.relu

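      nn.ReLU() constructs a ReLU module object rather than calling a function, whereas F.relu() is a function call.

      This could also be written as return self.out(nn.ReLU()(self.hidden(X))), but there is no need to.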

      In the provided code, the ReLU layer is applied as a function within the forward method, using F.relu(self.hidden(X)). This means that the ReLU activation is not explicitly recorded as a separate layer in the model's structure. Instead, it is applied directly to the output of the hidden layer during the forward pass.

      If you want to explicitly include the ReLU layer in the model's structure, you can define it as a separate layer in the __init__ method and then use it in the forward method. Here's an example:

      ```python
      from torch import nn


      class MLP(nn.Module):
          def __init__(self):
              super().__init__()
              self.hidden = nn.Linear(20, 256)  # hidden layer
              self.relu = nn.ReLU()             # ReLU layer, registered as a submodule
              self.out = nn.Linear(256, 10)     # output layer

          def forward(self, X):
              return self.out(self.relu(self.hidden(X)))
      ```

      In this version, the ReLU layer is explicitly defined and included in the model's structure.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors developed an extension to the pairwise sequentially Markov coalescent model that allows multiple types of polymorphism data to be analyzed simultaneously. In this paper, they focus on SNPs and DNA methylation data. Since methylation markers mutate at a much faster rate than SNPs, this potentially gives the method better power to infer size history in the recent past. Additionally, they explored a model where there are both local and regional epimutational processes. Integrating additional types of heritable markers into SMC is a nice idea which I like in principle. However, a major caveat to this approach seems to be a strong dependence on knowing the epimutation rate. In Fig. 6 it is seen that, when the epimutation rate is known, inferences do indeed look better; but this is not necessarily true when the rate is not known. (See also major comment #1 below about the interpretation of these plots.) A roughly similar pattern emerges in Supp. Figs. 4-7; in general, results when the rates have to be estimated don't seem that much better than when focusing on SNPs alone. This carries over to the real data analysis too: the interpretation in Fig. 7 appears to hinge on whether the rates are known or estimated, and the estimated rates differ by a large amount from earlier published ones.

      Overall, this is an interesting research direction, and I think the method may hold more promise as we get more and better epigenetic data, and in particular better knowledge of the epigenetic mutational process. At the same time, I would be careful about placing too much emphasis on new findings that emerge solely by switching to SNP+SMP analysis.

      Major comments:

      - For all of the simulated demographic inference results, only plots are presented. This allows for qualitative but not quantitative comparisons to be made across different methods. It is not easy to tell which result is actually better. For example, in Supp. Fig. 5, eSMC2 seems slightly better in the ancient past, and times the trough more effectively, while SMCm seems a bit better in the very recent past. For a more rigorous approach, it would be useful to have accompanying tables that measure e.g. mean-squared error (along with confidence intervals) for each of the different scenarios, similar to what is already done in Tables 1 and 2 for estimating $r$.

      We believe this comment was addressed in the previous revision (Supp. Tables 6-10) by adding root mean square errors (RMSE) for the demographic estimates, including separate RMSE values for the recent and past portions of the demography.

      - 434: The discussion downplays the really odd result that inputting the true value of the mutation rate, in some cases, produces much worse estimates than when they are learned from data (SFig. 6)! I can't think of any reason why this should happen other than some sort of mathematical error or software bug. I strongly encourage the authors to pin down the cause of this puzzling behaviour. (Comment addressed in revision. Still, I find the explanation added at 449ff to be somewhat puzzling -- shouldn't the results of the regional HMM scan only improve if the true mutation rate is given?)

      We do understand that our results and explanation can appear counter-intuitive. As acknowledged by the reviewer, in the previous round of revision we explained at length that this puzzling behaviour arises from a discrepancy: methylation regions are assessed with an HMM that differs from the HMM used for the SMC inference. We are happy to clarify further in response to the new question of Reviewer #1:

      If Reviewer #1 means the SNP mutations (e.g. A → T), knowing the true mutation rate does not help the HMM to recover the region level methylation status.

      If Reviewer #1 means the epimutations (whether at the region level, the site level, or both), knowing the true epimutation rates could theoretically help the HMM to recover the region level methylation status. However, at present, our method does not leverage information from epimutation rates to infer the region level methylation status. Since inferring the epimutation rates is one of the goals of the SMC inference in this study, and since the region level methylation status is required to infer those rates, we suspect that using epimutation rates to infer the region level methylation status could be statistically inappropriate (generating some kind of circular estimation). Instead, our HMM uses only the proportion of methylated and unmethylated sites (estimated from the genome) to determine whether a region is most likely to be methylated or unmethylated. We now state this explicitly in the description of the HMM for methylation regions in the Methods section.

      We acknowledge that our HMM to infer region level methylation status could be improved, but this would be a complete project and study on its own (due to the underlying complexity of the finite site model and the lack of a consensus model for epimutations at evolutionary time scales). We believe our HMM was the best compromise between what was known about methylation and our goals when the study was conducted, and future work on the estimation of the methylation regions is definitely worth conducting.

      - As noted at 580, all of the added power from integrating SMPs/DMRs should come from improved estimation of recent TMRCAs. So, another way to study how much improvement there is would be to look at the true vs. estimated/posterior TMRCAs. Although I agree that demographic inference is ultimately the most relevant task, comparing TMRCA inference would eliminate other sources of differences between the methods (different optimization schemes, algorithmic/numerical quirks, and so forth). This could be a useful addition, and may also give you more insight into why the augmented SMC methods do worse in some cases. (Comment addressed in revision via Supp. Table 7.)

      - A general remark on the derivations in Section 2 of the supplement: I checked these formulas as best I could. But a cleaner, less tedious way of calculating these probabilities would be to express the mutation processes as continuous time Markov chains. Then all that is needed is to specify the rate matrices; computing the emission probabilities needed for the SMC methods reduces to manipulating the results of some matrix exponentials. In fact, because the processes are noninteracting, the rate matrix decomposes into a Kronecker sum of the individual rate matrices for each process, which is very easy to code up. And this structure can be exploited when computing the matrix exponential, if speed is an issue.
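
The reviewer's suggestion can be sketched in a few lines. For two non-interacting continuous-time Markov chains with rate matrices $A$ and $B$, the joint generator is the Kronecker sum $A \oplus B = A \otimes I + I \otimes B$, and $\exp((A \oplus B)t) = \exp(At) \otimes \exp(Bt)$. The two-state rate matrices and the time below are hypothetical values chosen for illustration, and the example is written in dependency-free Python with a truncated Taylor series; in practice one would likely use numpy.kron and scipy.linalg.expm instead.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def kron_sum(a, b):
    """Kronecker sum: generator of two independent chains run jointly."""
    left = kron(a, identity(len(b)))
    right = kron(identity(len(a)), b)
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(left, right)]

def expm(q, t, terms=40):
    """exp(Q t) by truncated Taylor series; adequate only for small, well-scaled Q t."""
    qt = [[x * t for x in row] for row in q]
    result = identity(len(q))
    term = identity(len(q))
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, qt)]
        result = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(result, term)]
    return result

# Hypothetical two-state rate matrices: a slow SNP-like process and a faster
# methylation-like process with unequal forward and backward rates.
snp = [[-0.1, 0.1], [0.1, -0.1]]
meth = [[-0.8, 0.8], [0.5, -0.5]]
t = 1.5

joint = expm(kron_sum(snp, meth), t)          # exponential of the 4x4 joint generator
product = kron(expm(snp, t), expm(meth, t))   # product of the two 2x2 exponentials
max_diff = max(abs(x - y) for r1, r2 in zip(joint, product) for x, y in zip(r1, r2))
```

The second route only ever exponentiates the small per-process matrices, which is the speed advantage the reviewer alludes to.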

      We believe this comment was acknowledged in the previous revision (line 649), and we thank the reviewer for this interesting insight.

      - Most (all?) of the SNP-only SMC methods allow for binning together consecutive observations to cut down on computation time. I did not see binning mentioned anywhere, did you consider it? If the method really processes every site, how long does it take to run?

      We believe this comment was addressed in the previous revision and was added to the manuscript in the Methods section (subsection: SMC optimization function).

      - 486: The assumed site and region (de)methylation rates listed here are several OOM different from what your method estimated (Supp. Tables 5-6). Yet, on simulated data your method is usually correct to within an order of magnitude (Supp. Table 4). How are we to interpret this much larger difference between the published estimates and yours? If the published estimates are not reliable, doesn't that call into question your interpretation of the blue line in Fig. 7 at 533? (Comment addressed in revision.)

      Reviewer #2 (Public Review):

      A limitation in using SNPs to understand recent histories of genomes is their low mutation frequency. Tellier et al. explore the possibility of adding hypermutable markers to SNP based methods for better resolution over short time frames. In particular, they hypothesize that epimutations (CG methylation and demethylation) could provide a useful marker for this purpose. Individual CGs in Arabidopsis tend to be either close to 100% methylated or close to 0%, and are inherited stably enough across generations that they can be treated as genetic markers. Small regions containing multiple CGs can also be treated as genetic markers based on their cumulative methylation level. In this manuscript, Tellier et al. develop computational methods to use CG methylation as a hypermutable genetic marker and test them on theoretical and real data sets. They do this both for individual CGs and small regions. My review is limited to the simple question of whether using CG methylation for this purpose makes sense at a conceptual level, not at the level of evaluating specific details of the methods. I have a small concern in that it is not clear that CG methylation measurements are nearly as binary in other plants and other eukaryotes as they are in Arabidopsis. However, I see no reason why the concept of this work is not sound. Especially in the future, as new sequencing technologies provide both base calling and methylation calling capabilities, using CG methylation in addition to SNPs could become a useful and feasible tool for population genetics in situations where SNPs are insufficient.

      We again thank Reviewer #2 for the positive comments.

      Reviewer #3 (Public Review):

      I very much like this approach and the idea of incorporating hypervariable markers. The method is intriguing, and the ability to e.g. estimate recombination rates, the size of DMRs, etc. is a really nice plus. I am not able to comment on the details of the statistical inference, but from what I can evaluate it seems reasonable and in principle the inclusion of highly mutable sites is a nice advance. This is an exciting new avenue for thinking about inference from genomic data. I remain a bit concerned about how well this will work in systems where much less is understood about methylation.

      The authors include some good caveats about applying this approach to other systems, but I think it would be helpful to empiricists outside of thaliana or perhaps mammalian systems to be given some indication of what to watch out for. In maize, for example, there is a nonbimodal distribution of CG methylation (35% of sites are greater than 10% and less than 90%) but this may well be due to mapping issues. The authors solve many of the issues I had concerns with by using gene body methylation, but this is only briefly mentioned on line 659. I'm assuming the authors' hope is that this method will be widely used, and I think it worth providing some guidance to workers who might do so but who are not as familiar with these kinds of data.

      We thank Reviewer #3 for the positive comments. We agree with Reviewer #3 concerning the application to data: our approach needs to be carefully thought through before it is applied. Our results clearly show that methylation processes are not yet understood well enough to apply our approach as we initially (maybe naively) designed it. Further investigations need to be conducted and appropriate theoretical models need to be developed before reliable results can be obtained, and we hope that our discussion points this out. However, our approach, the theoretical models and the additional tools contained in this study can help researchers investigate whether or not to use different genomic markers to build a common (potentially more reliable) ancestral history. We enhanced the discussion in this second revision by also clarifying the use of methylation from genic regions to avoid confusion (lines 700-731).

      Recommendations for the authors:  

      Reviewer #1 (Recommendations For The Authors):

      In added Supp. Table 7, I don't think these are in log10 units as stated in the caption.

      Well spotted! Indeed, the RMSE is not in log10 scale; we corrected the caption. We also added that the TMRCA used for the RMSE calculations is in units of generations to avoid potential confusion.

      Reviewer #3 (Recommendations for The Authors):

      I very much appreciate the authors' attention to previous questions. I would ask that a bit more is spent in the discussion on concerns/approaches empiricists should keep in mind -- I am wary of this being uncritically applied to data from non-model species. It was not clear to me, for example (only mentioned on line 659 in the discussion) that the thaliana data is only using gene-body methylation. This poses potential issues with background selection that the authors acknowledge appropriately, but also assuages many of my concerns about using genome-wide data. I think text with recommendations for data/filtering/etc or at least cautions of assumptions empiricists should be aware of would help.

      We apologize for the confusion at line 659. As written in the other section of the manuscript we meant CG sites in genic regions (and not only gene body methylated regions).

      Due to the manuscript’s structure, the data from Arabidopsis thaliana are only described at the very end of the manuscript (line 900+). However, a brief description can also be found at lines 291-296. We nevertheless added a sentence in the introduction (line 128) for clarity.

      We also agree with the comment made by Reviewer #3 concerning the application to data. In the discussion, we pointed out the risk of applying our approach to ill-understood (or ill-prepared) data and stressed the current need for studies on epimutation processes at evolutionary time scales (i.e. at the Ne time scale) (lines 700-703).

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer 1

      R1 Cell profiling is an emerging field with many applications in academia and industry. Finding better representations for heterogeneous cell populations is important and timely. However, unless convinced otherwise after a rebuttal/revision, the contribution of this paper, in our opinion, is mostly conceptual, but in its current form - not yet practical. This manuscript combined two concepts that were previously reported in the context of cell profiling, weakly supervised representations. Our expertise is in computational biology, and specifically applications of machine learning in microscopy.

      In our revised manuscript, we have aimed to better clarify the practical contributions of our work by demonstrating the effectiveness of the proposed concepts on real-world datasets. We hope that these revisions and our detailed responses address your concerns and highlight the potential impact of our approach.

      R1.1a. CytoSummaryNet is evaluated in comparison to aggregate-average profiling, although previous work has already reported representations that capture heterogeneity and self-supervision independently. To argue that both components of contrastive learning and sets representations are contributing to MoA prediction we believe that a separate evaluation for each component is required. Specifically, the authors can benchmark their previous work to directly evaluate a simpler population representation (PMID: 31064985, ref #13) - we are aware that the authors report a 20% improvement, but this was reported on a separate dataset. The authors can also compare to contrastive learning-based representations that rely on the aggregate (average) profile to assess and quantify the contribution of the sets representation.

      We agree that evaluating the individual contributions of the contrastive learning framework and single-cell data usage is important for understanding CytoSummaryNet's performance gains.

      To assess the impact of the contrastive formulation independently, we applied CytoSummaryNet to averaged profiles from the cpg0004 dataset. This isolated the effect of contrastive learning by eliminating single-cell heterogeneity. The experiment yielded a 32% relative improvement in mechanism of action retrieval, compared to the 68% gain achieved with single-cell data. These findings suggest that while the contrastive formulation contributes significantly to CytoSummaryNet's performance, leveraging single-cell information is crucial for maximizing its effectiveness. We have added a discussion of this experiment to the Results section:

      “We conducted an experiment to determine whether the improvements in mechanism of action retrieval were due solely to CytoSummaryNet's contrastive formulation or also influenced by the incorporation of single-cell data. We applied the CytoSummaryNet framework to pre-processed average profiles from the 10 μM dose point data of Batch 1 (cpg0004 dataset). This approach isolated the effect of the contrastive architecture by eliminating single-cell data variability. We adjusted the experimental setup by reducing the learning rate by a factor of 100, acknowledging the reduced task complexity. All other parameters remained as described in earlier experiments.

      This method yielded a less pronounced but still substantial improvement in mechanism of action retrieval, with an increase of 0.010 (32% enhancement - Table 1). However, this improvement was not as high as when the model processed single-cell level data (68% as noted above). These findings suggest that while CytoSummaryNet's contrastive formulation contributes to performance improvements, the integration of single-cell data plays a critical role in maximizing the efficacy of mechanism of action retrieval.”

      We don't believe comparing with PMID: 31064985 is useful: while the study showcased the usefulness of modeling heterogeneity using second-order statistics, its methodology is limited in scalability due to the computational burden of computing pairwise similarities for all perturbations, particularly in large datasets. Additionally, the study's reliance on similarity network fusion, while expedient, introduces complexity and inefficiency. We contend that this comparison does not align with our objective of testing the effectiveness of heterogeneity in isolation, as it primarily focuses on capturing second and first-order information. Thus, we do not consider this study a suitable baseline for comparison.

      R1.1b. The evaluation metric of mAP improvement in percentage is misleading, because a tiny improvement for a MoA prediction can lead to huge improvement in percentage, while a much larger improvement in MoA prediction can lead to a small improvement in percentage. For example, in Fig. 4, MEK inhibitor mAP improvement of ~0.35 is measured as ~50% improvement, while a much smaller mAP improvement can have the same effect near the origins (i.e., very poor MoA prediction).

      We agree that relying solely on percentage improvements can be misleading, especially when small absolute changes result in large percentage differences.

      However, we would like to clarify two key points regarding our reporting of percentage improvements:

      • We calculate the percentage improvement by first computing the average mAP across all compounds for both CytoSummaryNet and average profiling, and then comparing these averages. This approach is less susceptible to the influence of outlier improvements compared to calculating the average of individual compound percentage improvements.
      • We report percentage improvements alongside their corresponding absolute improvements. For example, the mAP improvement for Stain4 (test set) is reported as 0.052 (60%). To further clarify this point, we have updated the caption of Table 1 to explicitly state how the percentage improvements are calculated:

      The improvements are calculated as mAP(CytoSummaryNet)-mAP(average profiling). The percentage improvements are calculated as (mAP(CytoSummaryNet)-mAP(average profiling))/mAP(average profiling).
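
The two formulas above, and the contrast with averaging per-compound percentages, can be sketched as follows. This is a minimal illustration rather than the code used for Table 1: the function names and the per-compound mAP values are hypothetical.

```python
from statistics import mean


def map_improvement(map_model, map_baseline):
    """Return (absolute, relative) improvement computed on the averaged mAPs."""
    model_avg = mean(map_model)
    baseline_avg = mean(map_baseline)
    absolute = model_avg - baseline_avg
    relative = absolute / baseline_avg
    return absolute, relative


def mean_of_ratios(map_model, map_baseline):
    """Average of per-compound relative improvements (the alternative that is
    more sensitive to compounds with a near-zero baseline mAP)."""
    return mean((m - b) / b for m, b in zip(map_model, map_baseline))


# Hypothetical per-compound mAP values, for illustration only.
baseline = [0.02, 0.40, 0.50]
model = [0.06, 0.42, 0.53]

absolute, relative = map_improvement(model, baseline)
print(f"absolute: {absolute:.3f}, relative: {relative:.1%}")
print(f"mean of per-compound ratios: {mean_of_ratios(model, baseline):.1%}")
```

In this toy input, the first compound has a very low baseline, so its individual ratio dominates the mean of ratios, while the ratio of the averaged mAPs stays moderate.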

      R1.1b. (Subjective) visual assessment of this figure does not show a convincing contribution of CytoSummaryNet representations of the average profiling on the test set (3.33 uM). This issue might also be relevant for the task of replicate retrieval. All in all, the mAP improvement reported in Table 1 and throughout the manuscript (including the Abstract), is not a proper evaluation metric for CytoSummaryNet contribution. We suggest reporting the following evaluations:

      1. Visualizing the results of cpg0001 (Figs. 1-3) similarly to cpg0004 (Fig. 4), i.e., plotting the matched mAP for CytoSummaryNet vs. average profile.

      2. In Table 1, we suggest referring to the change in the number of predictable MoAs (MoAs that pass a mAP threshold) rather than the improvement in percentages. Another option is showing a graph of the predictability, with the X axis representing a threshold and Y-axis showing the number of MoAs passing it. For example see (PMID: 36344834, Fig. 2B) and (PMID: 37031208, Fig. 2A), both papers included contributions from the corresponding author of this manuscript.

      Regarding the suggestion to visualize the results for compound group cpg0001 similarly to cpg0004, unfortunately, this is not feasible due to the differences in data splitting between the two datasets. In cpg0001, an MoA might have one compound in the training set and another in the test or validation set. Reporting a single value per MoA would require combining these splits, which could be misleading as it would conflate performance across different data subsets.

      However, we appreciate the suggestion to represent the number of predictable MoAs that surpass a certain mAP threshold, as it provides another intuitive measure of performance. To address this, we have created a graph that visualizes the predictability of MoAs across various thresholds, similar to the examples provided in the referenced papers (PMID: 36344834, Figure 2B and PMID: 37031208, Figure 2A). This graph, with the x-axis depicting the threshold and the y-axis showing the number of MoAs meeting the criterion, has been added to Supplementary Material K.
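
The data behind such a threshold graph can be sketched as below. The MoA names, mAP values, and the choice of "at or above" as the passing criterion are assumptions made for illustration; they are not taken from Supplementary Material K.

```python
def count_predictable(map_per_moa, thresholds):
    """For each threshold, count the MoAs whose mAP is at or above it."""
    return [
        sum(1 for value in map_per_moa.values() if value >= threshold)
        for threshold in thresholds
    ]


# Hypothetical mAP values per mechanism of action.
average_profiling = {"MoA A": 0.10, "MoA B": 0.35, "MoA C": 0.05}
cytosummarynet = {"MoA A": 0.20, "MoA B": 0.70, "MoA C": 0.05}

thresholds = [0.0, 0.1, 0.3, 0.5]
baseline_counts = count_predictable(average_profiling, thresholds)
model_counts = count_predictable(cytosummarynet, thresholds)

for t, b, m in zip(thresholds, baseline_counts, model_counts):
    print(f"threshold {t:.1f}: average profiling={b}, CytoSummaryNet={m}")
```

Plotting the thresholds on the x-axis against each list of counts on the y-axis gives one curve per method.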

      R1.1c.i. "a subset of 18 compounds were designated as validation compounds" - 5 cross-validations of 18 compounds can make the evaluation complete. This can also enhance statistical power in figures 1-3.

      We appreciate your suggestion and acknowledge the potential benefits of employing cross-validation, particularly in enhancing statistical power. While we understand the merit of cross-validation for evaluating model performance and generalization to unseen data, we believe the results as presented already highlight the generalization characteristics of our methods.

      Specifically, (the new) Figure 3 demonstrates the model's improvement over average profiling in both training and validation plates, supporting its ability to generalize to unseen compounds (but not to unseen plates).

      While cross-validation could potentially enhance our analysis, retraining five new models solely for different validation set results may not substantially alter our conclusions, given the observed trends in Suppl Figure A1 and (the new) Figure 4, both of which show results across multiple stain sets (but a single train-test-validation split).


      R1.1c.ii. Clarify if the MoA results for cpg0001 are drawn from compounds from both the training and the validation datasets. If so, describe how the results differ between the sets in text and graphs.

      We confirm that the Mechanism of Action (MoA) retrieval results for cpg0001 are derived from all available compounds. It's important to note that the training and validation dataset split for the replicate retrieval task is different from the MoA prediction task. For replicate retrieval, we train using all available compounds and validate on a held-out set (see Figure 2). For MoA prediction, we train using the replicate retrieval task as the objective on all available compounds but validate using MoA retrieval, which is a distinct task. We have added a brief clarification in the main text to highlight the distinction between these tasks and how validation is performed for each:

      “We next addressed a more challenging task: predicting the mechanism of action class for each compound at the individual well level, rather than simply matching replicates of the exact same compound (Figure 5). It's also important to note that mechanism of action matching is a downstream task on which CytoSummaryNet is not explicitly trained. Consequently, improvements observed on the training and validation plates are more meaningful in this context, unlike in the previous task where only improvements on the test plate were meaningful. For similar reasons, we calculate the mechanism of action retrieval performance on all available compounds, combining both the training and validation sets. This approach is acceptable because we calculate the score on so-called "sister compounds" only—that is, different compounds that have the same mechanism of action annotation. This ensures there is no overlap between the mechanism of action retrieval task and the training task, maintaining the integrity of our evaluation. ”

      R1.1c.iii. "Mechanism of action retrieval is evaluated by quantifying a profile's ability to retrieve the profile of other compounds with the same annotated mechanism of action.". It was unclear to us if the evaluation of mAP for MoA identification can include finding replicates of the same compound. That is, whether finding a close replicate of the same compound would be included in the AP calculation. This would provide CytoSummaryNet with an inherent advantage as this is the task it is trained to do. We assume that this was not the case (and thus should be more clearly articulated), but if it was - results need to be re-evaluated excluding same-compound replicates.

      The evaluation excludes replicate wells of the same compound and only considers wells of other compounds with the same MoA. This methodology ensures that the model's performance on the MoA prediction task is not inflated by its ability to find replicates of the same compound, which is the objective of the replicate retrieval task. Please see the explanation we have added to the main text in our response to R1.1c.ii. Additionally, we have updated the Methods section to clearly describe this evaluation procedure:

      “Mechanism of action retrieval is evaluated by quantifying a profile’s ability to retrieve the profile of different compounds with the same annotated mechanism of action.”
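
To make the exclusion of same-compound replicates concrete, here is a minimal sketch of average precision for a single query well. It assumes the candidate wells are already sorted by decreasing similarity to the query; the compound and MoA labels are hypothetical, and the actual evaluation code may differ in its details.

```python
def average_precision(ranked_hits):
    """Average precision for a ranked list of booleans (True = correct match)."""
    hits = 0
    precision_sum = 0.0
    for rank, is_hit in enumerate(ranked_hits, start=1):
        if is_hit:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def moa_average_precision(query, ranked_candidates):
    """AP for one query well against candidates sorted by decreasing similarity.

    Each well is a (compound, moa) tuple. Wells of the query's own compound
    are dropped first, so only sister compounds can count as positives.
    """
    others = [c for c in ranked_candidates if c[0] != query[0]]
    return average_precision([c[1] == query[1] for c in others])


# Hypothetical ranking: the top hit is a replicate of the query compound.
query = ("cpd1", "MEK inhibitor")
ranked = [
    ("cpd1", "MEK inhibitor"),  # same compound: excluded from the evaluation
    ("cpd2", "MEK inhibitor"),  # sister compound: positive
    ("cpd3", "other"),          # negative
    ("cpd4", "MEK inhibitor"),  # sister compound: positive
]
ap = moa_average_precision(query, ranked)
print(f"AP with same-compound replicates excluded: {ap:.3f}")
```

Averaging this value over all query wells gives the mAP; because the replicate of `cpd1` is removed before ranking, the replicate retrieval objective cannot inflate the score.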



      R1.2a. The description of Stain2-5 was not clear for us at first (and second) read. The information is there, but more details will greatly enhance the reader's ability to follow. One suggestion is explicitly stating that these "stains" partitioning was already defined in ref 26. Another suggestion is laying out explicitly a concrete example on the differences between two of these stains. We believe highlighting the differences between stains will strengthen the claim of the paper, emphasizing the difficulty of generalizing to the out-of-distribution stain.

      We appreciate your feedback on the clarity of the Stain2-5 dataset descriptions; we certainly struggled to balance detail and concepts in describing these. We have made the following changes:

      • Explicitly mentioned that the partitioning of the Stain experiments was defined in https://pubmed.ncbi.nlm.nih.gov/37344608/: “The partitioning of the Stain experiments have been defined and explained previously [21].”
      • Moved an improved version of (now) Figure 2 from the Methods section to the main text to help visually explain how the stratification is done early on.
      • Added a new section in the Experimental Setup: Diversity of stain sets, which includes a concrete example highlighting the differences between Stain2 and Stain5 to emphasize the diversity in experimental setups within the same dataset: “Stain2-5 comprise a series of experiments which were conducted sequentially to optimize the experimental conditions for image-based cell profiling. These experiments gradually converged on the most optimal set of conditions; however, within each experiment, there were significant variations in the assay across plates. To illustrate the diversity in experimental setups within the dataset, we will highlight the differences between Stain2 and Stain5.

      Stain2 encompasses a wide range of nine different experimental protocols, employing various imaging techniques such as Widefield and Confocal microscopy, as well as specialized conditions like multiplane imaging and specific stains like MitoTracker Orange. This subset also includes plates acquired with strong pixel binning instead of default imaging and plates with varying concentrations of dyes like Hoechst. As a result, Stain2 exhibits greater variance in the experimental conditions across different plates compared to Stain5.

      In contrast, Stain5, the last experiment in the series, follows a more systematic approach, consistently using either confocal or default imaging across three well-defined conditions. Each condition in Stain5 utilizes a lower cell density of 1,000 cells per well compared to Stain2's 2,500 cells per well. Being the final experiment in the series, Stain5 had the least variance in experimental conditions.

      For training the models, we typically select the data containing the most variance to capture the broadest range of experimental variation. Therefore, we chose Stain2-4 for training, as they represented the majority of the data and captured the most experimental variation. We reserved Stain5 for testing to evaluate the model's ability to generalize to new experimental conditions with less variance.

      All StainX experiments were acquired in different passes, which may introduce additional batch effects.”

      These changes aim to provide a clearer understanding of the dataset's complexity and the challenges associated with generalizing to out-of-distribution data.

      R1.2b. What does each data point in Figures 1-3 represent? Is it the average mAP for the 18 validation compounds, using different seeds for model training? Why not visualize the data similarly to Fig. 4 so the improvement per compound can be clearly seen?

      The data points in (the new) Figures 3,4,5 represent the average mAP for each plate, calculated by first computing the mAP for each compound and then averaging across compounds to obtain the average mAP per plate. We have updated the figure captions to clarify this:

      "... (each data point is the average mAP of a plate) ..."

      While visualizing the mAP per compound, similar to (the new) Figure 6 for cpg0004, could provide insights into compound-level improvements, it would require creating numerous additional figures or one complex figure to adequately represent all the stratifications we are analyzing (plate, compound, Stain subset). By averaging the data per plate across different stratifications, we aim to provide a clearer and more comprehensible overview of the trends and improvements while allowing us to draw conclusions about generalization.
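
The two-step averaging behind each data point (mAP per compound, then the mean across compounds on a plate) can be sketched as follows. The record layout and the plate and compound names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def plate_level_map(ap_records):
    """Aggregate per-well average precision into one value per plate.

    ap_records: iterable of (plate, compound, average_precision) tuples.
    Step 1: mean AP per (plate, compound), i.e. the compound's mAP.
    Step 2: mean of those compound-level mAPs within each plate.
    """
    per_compound = defaultdict(list)
    for plate, compound, ap in ap_records:
        per_compound[(plate, compound)].append(ap)

    per_plate = defaultdict(list)
    for (plate, _compound), aps in per_compound.items():
        per_plate[plate].append(mean(aps))

    return {plate: mean(values) for plate, values in per_plate.items()}


# Hypothetical per-well AP values.
records = [
    ("plate_1", "cpd_A", 1.0),
    ("plate_1", "cpd_A", 0.5),
    ("plate_1", "cpd_B", 0.25),
    ("plate_2", "cpd_A", 0.5),
    ("plate_2", "cpd_A", 0.5),
]
plate_map = plate_level_map(records)
print(plate_map)
```

Averaging per compound first keeps compounds with more wells from dominating the plate-level value.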

      Please note: this comment is related to the comment R1.1b (Subjective)

      R1.2c. [On the topic of enhancing clarity and readability:] Justification and interpretation of the evaluation metrics.

      Please refer to our response to comment R1.1b, where we have addressed your concerns regarding the justification and interpretation of the evaluation metrics.

      R1.2d. Explicitly mentioning the number of MoAs for each datasets and statistics of number of compounds per MoA (e.g., average\median, min, max).

      We have added the following to the Experimental Setup: Data section:

      “A subset of the data was used for evaluating the mechanism of action retrieval task, focusing exclusively on compounds that belong to the same mechanism class. The Stain plates contained 47 unique mechanisms of action, with each compound replicated four times. Four mechanisms had only a single compound; the four mechanisms (and corresponding compounds) were excluded, resulting in 43 unique mechanisms used for evaluation. In the LINCS dataset, there were 1436 different mechanisms, but only 661 were used for evaluation because the remaining had only one compound.”
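
The exclusion rule described here (a mechanism needs at least two distinct compounds to be evaluated) can be sketched as below. The annotation table is a toy example, not the real compound list.

```python
from collections import defaultdict


def evaluable_mechanisms(compound_to_moa, min_compounds=2):
    """Keep only mechanisms annotated on at least `min_compounds` compounds."""
    moa_to_compounds = defaultdict(set)
    for compound, moa in compound_to_moa.items():
        moa_to_compounds[moa].add(compound)
    return {
        moa: compounds
        for moa, compounds in moa_to_compounds.items()
        if len(compounds) >= min_compounds
    }


# Hypothetical annotations: "MoA Z" has a single compound and is dropped.
annotations = {
    "cpd_1": "MoA X",
    "cpd_2": "MoA X",
    "cpd_3": "MoA Y",
    "cpd_4": "MoA Y",
    "cpd_5": "MoA Z",
}
kept = evaluable_mechanisms(annotations)
print(sorted(kept))
```

Applied to the numbers quoted above, the same rule takes the Stain plates from 47 to 43 mechanisms and the LINCS dataset from 1436 to 661.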

      R1.2e. The data split in general is not easily understood. Figure 8 is somewhat helpful, however in our view, it can be improved to enhance understanding of the different splits. Specifically, the training and validation compounds need to be embedded and highlighted within the figure.

      Thank you for highlighting this. We have completely revised the figure, now Figure 2, which we hope more clearly conveys the data split strategy.

      Please note: this comment is related to the comment R1.2a.





      R1.3a. Why was stain 5 used for the test, rather than the other stains?

      Stain2-5 were part of a series of experiments aimed at optimizing the experimental conditions for image-based cell profiling using Cell Painting. These experiments were conducted sequentially, gradually converging on the most optimal set of conditions. However, within each experiment, there were significant variations in the assay across plates, with earlier iterations (Stain2-4) having more variance in the experimental conditions compared to Stain5. As Stain5 was the last experiment in the series and consisted of only three different conditions, it had the least variance. For training the models, we typically select the data containing the most variance to capture the broadest range of experimental variation. Therefore, Stain2-4 were chosen for training, while Stain5 was reserved for testing to evaluate the model's ability to generalize to new experimental conditions with less variance.

      We have now clarified this in the Experimental Setup: Diversity of stain sets section. Please see our response to comment R1.2a. for the full citation.

      R1.3b How were the 18 validation compounds selected?

      20% of the compounds (n=18) were randomly selected and designated as validation compounds, with the remaining compounds assigned to the training set. We have now clarified this in the Results section:

      “Additionally, 20% of the compounds (n=18) were randomly selected and designated as validation compounds, with the remaining compounds assigned to the training set (Supplementary Material H).”
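
A random 20% hold-out of compounds can be sketched as follows. The total of 90 compounds is inferred from "20% (n=18)" and is an assumption, as are the compound identifiers and the seed; the actual split is the one listed in Supplementary Material H.

```python
import random


def split_compounds(compounds, validation_fraction=0.2, seed=0):
    """Randomly assign a fraction of compounds to validation, the rest to training."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    n_validation = round(len(compounds) * validation_fraction)
    validation = set(rng.sample(sorted(compounds), n_validation))
    training = set(compounds) - validation
    return training, validation


# Hypothetical identifiers; 90 compounds is an assumption (18 / 0.2).
compounds = [f"compound_{i:02d}" for i in range(90)]
training, validation = split_compounds(compounds)
print(len(training), len(validation))
```

Splitting by compound rather than by well keeps all replicates of a validation compound out of training.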

      R1.3c. For cpg0004, no justification for the specific doses selected (10uM - train, 3.33 uM - test) for the analysis in Figure 4. Why was the data split for the two dosages? For example, why not perform 5-fold cross validation on the compounds (e.g., of the highest dose)?

      We chose to use the 10 μM dose point as the training set because we expected this higher dosage to consist of stronger profiles with more variance than lower dose points, making it more suitable for training a model. We decided to use a separate test set at a different dose (3.33 μM) to assess the model's ability to generalize to new dosages. While cross-validation on the highest dose could also be informative, our approach aimed to balance the evaluation of the model's generalization capability with its ability to capture biologically relevant patterns across different dosages.

      This explanation has been added to the text:

      “We chose the 10 μM dose point for training because we expected this high dosage to produce stronger profiles with more variance than lower dose points, making it more suitable for model training.”

      “The multiple dose points in this dataset allowed us to create a separate hold-out test set using the 3.33 μM dose point data. This approach aimed to evaluate the model's performance on data with potentially weaker profiles and less variance, providing insights into its robustness and ability to capture biologically relevant patterns across dosages. While cross-validation on the 10 μM dose could also be informative, focusing on lower dose points offers a more challenging test of the model's capacity to generalize beyond its training conditions, although we do note that all compounds’ phenotypes would likely have been present in the 10 μM training dataset, given the compounds tested are the same in both.”

      R1.3d. A more detailed explanation on the logic behind using a training stain to test MoA retrieval will help readers appreciate these results. In our first read of this manuscript we did not grasp that, we did in a second read, but spoon-feeding your readers will help.

      This comment is related to the rationale behind training on one task and testing on another, which is addressed in our responses to comments R1.1c.ii and R1.1c.iii.

      R1.4 Assessment of interpretability is always tricky. But in this case, the authors can directly confirm their interpretation that the CytoSummaryNet representation prioritizes large uncrowded cells, by explicitly selecting these cells, and using their average profile re

      We progressively filtered out cells based on a quantile threshold for Cells_AreaShape features (MeanRadius, MaximumRadius, MedianRadius, and Area), which were identified as important in our interpretability analysis, and then computed average profiles using the remaining cells before determining the replicate retrieval mAP. In the exclusion experiment, we gradually left out cells as the threshold increased, while in the inclusion experiment, we progressively included larger cells from left to right.

      The results show that using only the largest cells does not significantly increase the performance. Instead, it is more important to include the large cells rather than only including small cells. The mAP saturates after a threshold of around 0.4, indicating that larger cells define the profile the most, and once enough cells are included to outweigh the smaller cell features, the profile does not change significantly by including even larger cells.

      These findings support our interpretation that CytoSummaryNet prioritizes large, uncrowded cells. While this approach could potentially be used as a general outlier removal strategy for cell profiling, further investigation is needed to assess its robustness and generalizability across different datasets and experimental conditions.

      We have created Supplementary Material L to report these findings and we additionally highlight them in the Results:

      “To further validate CytoSummaryNet's prioritization of large, uncrowded cells, we progressively filtered cells based on Cells_AreaShape features and observed the impact on replicate retrieval mAP (Supplementary Material L). The results support our interpretation and highlight the key role of larger cells in profile strength.”
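
The inclusion variant of this filtering experiment can be sketched as below, simplified to a single size feature. The nearest-rank quantile, the feature layout, and the toy cell values are assumptions; the experiment itself thresholds several Cells_AreaShape features and then computes replicate retrieval mAP on the resulting profiles.

```python
from statistics import mean


def quantile_value(values, q):
    """Value at quantile q (0 to 1), using a simple nearest-rank rule."""
    ordered = sorted(values)
    index = min(int(q * len(ordered)), len(ordered) - 1)
    return ordered[index]


def average_profile_of_large_cells(cells, area_index, q):
    """Average the feature vectors of cells whose area is at or above quantile q."""
    cutoff = quantile_value([cell[area_index] for cell in cells], q)
    kept = [cell for cell in cells if cell[area_index] >= cutoff]
    return [mean(column) for column in zip(*kept)]


# Hypothetical well with four cells: [area, some other feature].
cells = [[10, 1.0], [20, 2.0], [30, 3.0], [40, 4.0]]

all_cells_profile = average_profile_of_large_cells(cells, area_index=0, q=0.0)
large_cells_profile = average_profile_of_large_cells(cells, area_index=0, q=0.5)
print(all_cells_profile, large_cells_profile)
```

Sweeping `q` from 0 to 1 and recomputing the retrieval metric on each filtered average profile would trace how much the larger cells contribute.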

      R1.5. Placing this work in context of other weakly supervised representations. Previous papers used weakly supervised labels of proteins / experimental perturbations (e.g., compounds) to improve image-derived representations, but were not discussed in this context. These include PMID: 35879608, https://www.biorxiv.org/content/10.1101/2022.08.12.503783v2 (from the same research groups and can also be benchmarked in this context), https://pubs.rsc.org/en/content/articlelanding/2023/dd/d3dd00060e , and https://www.biorxiv.org/content/10.1101/2023.02.24.529975v1. We believe that a discussion explicitly referencing these papers in this specific context is important.

      While these studies provide valuable insights into improving cell population profiles using representation learning, our work focuses specifically on the question of single-cell aggregation methods. We chose to use classical features for our comparisons because they are the current standard in the field. This approach allows us to directly assess the performance of our method in the context of the most widely used feature extraction pipeline in practice. However, we see the value in incorporating them in future work and have mentioned them in the Discussion:

      “Recent studies exploring image-derived representations using weakly supervised and self-supervised learning [35][36] could inspire future research on using learned embeddings instead of classical features to enhance model-aggregated profiles.”

      R1.minor1. "Because the improved results could stem from prioritizing certain features over others during aggregation, we investigated each cell's importance during CytoSummaryNet aggregation by calculating a relevance score for each" - what is the relevance score? Would be helpful to provide some intuition in the Results.

      We have included more explanation of the relevance score in the Results section, following the explanation of sensitivity analysis (SA) and critical point analysis (CPA):

      “SA evaluates the model's predictions by analyzing the partial derivatives in a localized context, while CPA identifies the input cells with the most significant contribution to the model's output. The relevance scores of SA and CPA are min-max normalized per well and then combined by addition. The combination of the two is again min-max normalized, resulting in the SA and CPA combined relevance score (see Methods for details).”
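
The combination step described in this passage (min-max normalize each score per well, add, then min-max normalize again) can be sketched as follows. The raw SA and CPA values are hypothetical, and mapping a constant score vector to zeros is an assumption made here to avoid division by zero.

```python
def min_max(values):
    """Rescale values to [0, 1]; a constant vector maps to all zeros (assumption)."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]


def combined_relevance(sa_scores, cpa_scores):
    """Combine per-cell SA and CPA scores for one well into a single relevance score."""
    sa = min_max(sa_scores)
    cpa = min_max(cpa_scores)
    summed = [s + c for s, c in zip(sa, cpa)]
    return min_max(summed)


# Hypothetical raw scores for three cells in one well.
sa_scores = [1, 2, 3]
cpa_scores = [0, 4, 2]
relevance = combined_relevance(sa_scores, cpa_scores)
print(relevance)
```

Normalizing each score before the addition puts SA and CPA on the same scale, so neither dominates the sum merely because of its raw magnitude.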

      R1.minor2. Figure 1:

      1. Colors of the two methods too similar
      2. The dots are too close. It will be more easily interpreted if they were further apart.
      3. What do the dots stand for?
      4. We recommend considering moving this figure to the supp. material (the most important part of it is the results on the test set and it appears in Fig.2).

      1. We chose a lighter and darker version of the same color as a theme to simplify visualization, as this theme is used throughout (the new) Figures 3,4,5.
      2. We agree; we have now redrawn the figure to fix this.
      3. Each data point is the average mAP of a plate. Please see our answer for R1.2b as well.
      4. We believe that (the new) Figures 3,4,5 serve distinct purposes in testing various generalization hypotheses. We have added the following text to emphasize that the first figures are specifically about generalization hypothesis testing: “We first investigated CytoSummaryNet’s capacity to generalize to out-of-distribution data: unseen compounds, unseen experimental protocols, and unseen batches. The results of these investigations are visualized in Figures 3, 4, and 5, respectively.”

      R1.minor3 Figure 4: It is somewhat misleading to look at the training MoAs and validation MoAs embedded together in the same graph. We recommend showing only the test MoAs (train MoAs can move to SI).

      We addressed this comment in R1.1c.ii. To reiterate briefly, there are no training, validation, or test MoAs because these are not used as labels during the training process. There is an option to split them based on training and validation compounds, which is addressed in R1.1c.ii.


      R1.minor4 Figure 5

      1. Why only Stain3? What happens if we look at Stains 2,3 and 4 together? Stain 5?

      2. Should validation compounds and training compounds be analyzed separately?

      3. Subfigure (d): it is expected that the data will be classified by compound labels as it is the training task, but for this to be persuasive I would like to see this separately on the training compounds first and then and more importantly on the validation compounds.

      4. For subfigures (b) and (d): it appears there are not enough colors for d, which makes it partially not understandable. For example, the pink label in (d) shows a single compound which appears to represent two different MoAs. This is probably not the case, and it has two different compounds, but it cannot be inferred when they are represented by the same color.

      5. For the Subfigure (e) - only 1 circle looks justified (in the top left). And for that one, is it not a case of an outlier plate that would perhaps need to be removed from analysis? Is it not good that such a plate will be identified?

      1. We have addressed this point in the text, stating that the results are similar for Stain2 and Stain4. Stain5 represents an out-of-distribution subset because of a very different set of experimental conditions (see Experimental Setup: Diversity of stain sets). To improve clarity, we have revised the figure caption to reiterate this information:

      “... Stain2 and Stain4 yielded similar results (data not shown). …”

      2. For replicate retrieval, analyzing validation and training compounds separately is appropriate. However, this is not the case for MoA retrieval, as discussed in our responses to R1.1c.ii and R1.1c.i.
      3. We have created the requested plot (below) but ultimately decided not to include it in the manuscript because we believe that (the new) Figures 3 and 4 are more effective for making quantitative comparative claims.

      [Please see the full revision document for the figures]

      Top: training compounds (validation compounds grayed out); not all compounds are listed in the legend.

      Bottom: validation compounds (training compounds grayed out).

      Left: average profiling; Right: CytoSummaryNet

      4. We agree with your observation and have addressed this issue by labeling the center mass as a single class (gray) and highlighting only the outstanding pairs in color. Please refer to the updated figure and our response to R3.6 for more details.

      5. In the updated figure, we have revised the figure caption to focus solely on the annotation of same mechanism of action profile clusters, as indicated by the green ellipses. The annotation of isolated plate clusters has been removed (Figures 7e and 7f) to maintain consistency and avoid potential confusion. Despite being an outlier for Stain3, the plate (BR00115134bin1) clusters with Stain4 plates (Supplementary Figure F1, green annotated square inside the yellow annotated square), indicating it is not merely a noisy outlier and can provide insights into the out-of-sample performance of our model.

      R1.minor5a. Discussion: "perhaps in part due to its correction of batch effects" - is this statement based on Fig. 5F - we are not convinced.

      We appreciate the reviewer's scrutiny regarding our statement about batch effect correction. Upon reevaluation, we agree that this claim was not adequately substantiated by empirical data. We quantified the batch effects using comparison mean average precision for both average profiles and CytoSummaryNet profiles, and the statistical analysis revealed no significant difference between these profiles in terms of batch effect correction. Therefore, we have removed this theoretical argument from the manuscript entirely to ensure that all claims are strongly supported by the data presented.

      R1.minor5b. "Overall, these results improve upon the ~20% gains we previously observed using covariance features" - this is not the same dataset so it is hard to reach conclusions - perhaps compare performance directly on the same data?

      We have now explicitly clarified this is a different dataset. Please see our response to R1.1a for why a direct comparison was not performed. The following clarification can be found in the Discussion:

      “These results improve upon the ~20% gains previously observed using covariance features [13] albeit on a different dataset, and importantly, CytoSummaryNet effectively overcomes the challenge of recomputation after training, making it easier to use.”

      Reviewer 2

      R2.1 The authors present a well-developed and useful algorithm. The technical motivation and validation are very carefully and clearly explained, and their work is potentially useful to a varied audience.

      That said, I think the authors could do a better job, especially in the figures, of putting the algorithm in context for an audience that is unfamiliar with the cell painting assay. (a) For example, a figure towards the beginning of the paper with example images might help to set the stage. (b) Similarly a schematic of the algorithm earlier in the paper would provide a graphical overview. (c) For the sake of a biologically inclined audience, I would consider labeling the images in the caption by cell type and label.

      Thank you for your valuable suggestions on improving the accessibility of our figures for readers unfamiliar with the Cell Painting assay. We have made the following changes to address your comments:

      a. and b. To provide visual context and a graphical overview of the algorithm, we have moved the original Figure 7 to Figure 1. This figure now includes example images that help readers new to the Cell Painting assay.
      c. We have added relevant details to the example images in (the new) Figure 1.

      R2.2 The interpretability results were intriguing. The authors might consider further validating these interpretations by removing weakly informative cells from the dataset and retraining. Are the cells so uninformative that the algorithm does better without them, or are they just less informative than other cells?

      Please see our responses to R1.4 and R3.0

      R2.3 As far as I can tell, the authors only obliquely state whether the code associated with the manuscript is openly available. Posting the code is needed for reproducibility. I would provide not only a github, but a doi linked to the code, or some other permanent link.

      We have now added a Code Availability and Data Availability section, clearly stating that the code and data associated with the manuscript are openly available.

      R2.4 Incorporating biological heterogeneity into machine-learning driven problems is a critical research question. Replacing means/modes and such with a machine learning framework, the authors have identified a problem with potentially wide significance. The application to cell painting and related assays is of broad enough significance for many journals. However, the authors could further broaden the significance by commenting on other possible cell biology applications. What other applications might the algorithm be particularly suited for? Are there any possible roadblocks to wider use? What sorts of data has the code been tested on so far?

      We have added the following paragraph to discuss the broader applicability of CytoSummaryNet:

      “The architecture of CytoSummaryNet holds significant potential for broader applications beyond image-based cell profiling, accommodating tabular, permutation-invariant data and enhancing downstream task performance when applied to processed population-level profiles. Its versatility makes it valuable for any omics measurements where downstream tasks depend on measuring similarity between profiles. Future research could also explore CytoSummaryNet's applicability to genetic perturbations, expanding its utility in functional genomics.”
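As a toy illustration of what "permutation-invariant" means for cell-level data, the sketch below embeds each cell independently and then mean-pools the embeddings, so the summary profile does not depend on the order in which cells are listed. This is not the CytoSummaryNet architecture: the function names `embed` and `summarize`, the random weights, and the feature dimensions are all hypothetical and chosen only for the example.

```python
import random


def embed(cell, weights):
    """Per-cell feature map: one linear layer followed by ReLU (hypothetical)."""
    return [max(0.0, sum(w * x for w, x in zip(row, cell))) for row in weights]


def summarize(cells, weights):
    """Mean-pool the per-cell embeddings into one population-level profile."""
    embedded = [embed(cell, weights) for cell in cells]
    n_cells = len(embedded)
    return [sum(column) / n_cells for column in zip(*embedded)]


rng = random.Random(0)
# Hypothetical sizes: 3 input features per cell, 2 embedding dimensions, 5 cells.
weights = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
cells = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(5)]

shuffled = cells[:]
rng.shuffle(shuffled)

profile = summarize(cells, weights)
profile_shuffled = summarize(shuffled, weights)
# The two profiles agree up to floating-point rounding.
```

Because pooling is a symmetric function of the per-cell embeddings, any reordering of the cells yields the same profile, which is the property that lets a well of cells be treated as a set.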

      Reviewer 3

      R3.0 The authors have done a commendable job discussing the method, demonstrating its potential to outperform current models in profiling cell-based features. The work is of considerable significance and interest to a wide field of researchers working on the understanding of cell heterogeneity's impact on various biological phenomena and practical studies in pharmacology.

      One aspect that would further enhance the value of this work is an exploration of the method's separation power across different modes of action. For instance, it would be interesting to ascertain if the method's performance varies when dealing with actions that primarily affect size, those that affect marker expression, or compounds that significantly diminish cell numbers.

      Thank you for the encouraging comments!

      We have added the following to Results: Relevance scores reveal CytoSummaryNet's preference for large, isolated cells:

      “Statistical t-tests were conducted to identify the features that most effectively differentiate mechanisms of action from negative controls in average profiles, focusing on the three mechanisms of action where CytoSummaryNet demonstrates the most significant improvement and the three mechanisms where it shows the least. Consistent with our hypothesis that CytoSummaryNet emphasizes larger, more sparse cells, the important features for the CytoSummaryNet-improved mechanisms of action (Supplementary Material I1) often involve the radial distribution for the mitochondria and RNA channels. These metrics capture the fraction of those stains near the edge of the cell versus concentric rings towards the nucleus, which are more readily detectable in larger cells compared to small, rounded cells.

      In contrast, the important features for mechanisms of action not improved by CytoSummaryNet (Supplementary Material I) predominantly include correlation metrics between brightfield and various fluorescent channels, capturing spatial relationships between cellular components. Some of these mechanisms of action included compounds that were not individually distinguishable from negative controls, and CytoSummaryNet did not overcome the lack of phenotype in these cases. This suggests that while CytoSummaryNet excels in identifying certain cellular features, its effectiveness is limited when dealing with mechanisms of action that do not exhibit pronounced phenotypic changes.”

      We have also added supplementary material to support (I. Relevant features for CytoSummaryNet improvement).

      R3.0 Another test on datasets that are not concerned with chemical compounds, but rather genetic perturbations would greatly increase the reach of the method into the functional genomics community and beyond. This additional analysis could provide valuable insights into the versatility and applicability of the proposed method.

      We agree that testing the method's behavior on genetic perturbations would be interesting and could provide insights into its versatility. However, the efficacy of the methodology may vary depending on the specific properties of different genetic perturbation types.

      For example, the penetrance of phenotypes may differ between genetic and chemical perturbations. In some experimental setups, a selection agent ensures that nearly all cells receive a genetic perturbation (though not all may express a phenotype due to heterogeneity or varying levels of the target protein). Other experiments may omit such an agent. Additionally, different patterns might be observed in various classes of reagents, such as overexpression, CRISPR-Cas9 knockout (CRISPRn), CRISPR-interference (CRISPRi), and CRISPR-activation (CRISPRa).

      We believe that selecting a single experiment with one of these technologies would not adequately address the question of versatility. Instead, we propose future studies that may conclusively assess the method's performance across a variety of genetic perturbation types. This would provide a more comprehensive understanding of CytoSummaryNet's applicability in functional genomics and beyond. We have updated the Discussion section to reflect this:

      “Future research could also explore CytoSummaryNet's applicability to genetic perturbations, expanding its utility in functional genomics.”

      R3.1. The datasets were stratified based on plates and compounds. It would be beneficial to clarify the basis for data stratification applied for compounds. Was the data sampled based on structural or functional similarity of compounds? If not, what can be expected from the model if trained and validated using structurally or functionally diverse and non-diverse compounds?

      Thank you for raising the important question of data stratification based on compound similarity. In our study, the data stratification was performed by randomly sampling the compounds, without considering their structural or functional similarity.

      This approach may limit the generalizability of the learned representations to new structural or functional classes not captured in the training set. Consequently, the current methodology may not fully characterize the model’s performance across diverse compound structures.

      In future work, it would be valuable to explore the impact of compound diversity on model performance by stratifying data based on structural or functional similarity and comparing the results to our current random stratification approach to more thoroughly characterize the learned representations.

      R3.2. Is the method prioritizing a particular biological reaction of cells toward common chemical compounds, such as mitotic failure? Could this be oncology-specific, or is there more utility to it in other datasets?

      Our analysis of CytoSummaryNet's performance in (the new) Figure 6 reveals a strong improvement in MoAs targeting cancer-related pathways, such as MEK, HSP, MDM, dehydrogenase, and purine antagonist inhibitors. These MoAs share a common focus on cellular proliferation, survival, and metabolic processes, which are key characteristics of cancer cells.

      Given the composition of the cpg0004 dataset, which contains 1,258 unique MoAs with only 28 annotated as oncology-related, the likelihood of randomly selecting five oncology-related MoAs that show strong improvement is extremely low. This suggests that the observed prioritization is not due to chance.
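The "extremely low" likelihood can be made concrete with a short hypergeometric calculation. The sketch below assumes, as a simplification not stated in the response, that the five strongly improved MoAs are drawn uniformly at random without replacement from the 1,258 unique MoAs, and asks how likely it is that all five fall among the 28 oncology-annotated ones. The variable names are illustrative.

```python
from math import comb

TOTAL_MOAS = 1258     # unique MoAs in cpg0004 (from the response)
ONCOLOGY_MOAS = 28    # MoAs annotated as oncology-related (from the response)
TOP_K = 5             # strongly improved MoAs named in the response

# Probability that all TOP_K randomly chosen MoAs are oncology-related:
# C(28, 5) ways to pick five oncology MoAs, out of C(1258, 5) ways to pick any five.
p_all_oncology = comb(ONCOLOGY_MOAS, TOP_K) / comb(TOTAL_MOAS, TOP_K)

print(f"P(all {TOP_K} oncology-related by chance) = {p_all_oncology:.2e}")
```

Under this assumption the probability is on the order of 10^-9, which supports the claim that the pattern is unlikely to arise by chance; it does not account for how the five MoAs were selected for discussion.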

      Furthermore, the prioritization cannot be solely attributed to the frequency of oncology-related MoAs in the dataset. Other prevalent disease areas, such as neurology/psychiatry, infectious disease, and cardiology, do not exhibit similar improvements despite having higher MoA counts.

      While these findings indicate a potential prioritization of oncology-related MoAs by CytoSummaryNet, further research is necessary to fully understand the extent and implications of this bias. Future work should involve conducting similar analyses across other disease areas and cell types to assess the method's broader utility and identify areas for refinement and application. However, given the speculative nature of these observations, we have chosen not to update the manuscript to discuss this potential bias at this time.

      R3.3 Figures 1 and 2 demonstrate that the CytoSummaryNet profiles outperform average-aggregated profiles. However, the average profiling results seem more consistent when compared to CytoSummaryNet profiling. What further conditions or approaches can help improve CytoSummaryNet profiling results to be more consistent?

      The observed variability in CytoSummaryNet's performance is primarily due to the intentional technical variance in our datasets, where each plate tested different staining protocol variations. It's important to note that this level of technical variance is not typical in standard cell profiling experiments. In practice, the variance across plates would be much lower. We want to emphasize that while a model capable of generalizing across diverse experimental conditions might seem ideal, it may not be as practically useful in real-world scenarios. This is because such non-uniform conditions are uncommon in typical cell profiling experiments. In normal experimental settings, where technical variance is more controlled, we expect CytoSummaryNet's performance to be more consistent.

      R3.4 Can the poor performance on unseen data (in the case of stain 5) be overcome? If yes, how? If no, why not?

      We believe that the poor performance on unseen data, such as Stain 5, can be overcome depending on the nature of the unseen data. As shown in Figure 4 (panel 3), the model improves upon average profiling for unseen data when the experimental conditions are similar to the training set.

      The issue lies in the different experimental conditions. As explained in our response to R3.3, this could be addressed by including these experimental conditions in the training dataset. As long as CytoSummaryNet is trained (seen) and tested (unseen) on data generated under similar experimental conditions, we are confident that it will improve or perform as well as average profiling.

      It's important to note that the issue of generalization to vastly different experimental conditions was considered out of scope for this paper. The main focus is to introduce a new method that improves upon average profiling and can be readily used within a consistent experimental setup.

      R3.5 It needs to be mentioned how the feature data used for CytoSummaryNet profiling was normalized before training the model. What would be the impact of feature data normalization before model training? Would the model still outperform if the skewed feature data is normalized using square or log transformation before model training?

      We have clarified in the manuscript that we standardized the feature data on a plate-by-plate basis to achieve zero mean and unit variance across all cells per feature within each plate. We have added the following statement to improve clarity:

      “The data used to compute the average profiles and train the model were standardized at the plate-level, ensuring that all cell features across the plate had a zero mean and unit variance. The negative control wells were then removed from all plates."

      We chose standardization over transformations like squaring or logging to maintain a balanced scale across features while preserving the biological and morphological information inherent in the data. While transformations can reduce skewness and are useful for data spanning several orders of magnitude, they might distort biological relevance by compressing or expanding data ranges in ways that could obscure important cellular variations.

      Regarding the potential impact of square or log transformations on skewed feature data, these methods could improve the model's learning efficiency by making the feature distribution more symmetrical. However, the suitability and effectiveness of these techniques would depend on the specific data characteristics and the model architecture.

      Although not explored in this study, investigating various normalization techniques could be a valuable direction for future research to assess their impact on the performance and adaptability of CytoSummaryNet across diverse datasets and experimental setups.
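The plate-level standardization described in this response can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' pipeline: the record layout (`plate`, `is_negcon`, and a single `area` feature) and the function names are assumptions made for the example. Following the order given in the quoted statement, features are standardized across all cells on a plate first, and negative-control cells are removed afterwards.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize_per_plate(cells, feature_names):
    """Scale each feature to zero mean and unit variance within each plate."""
    by_plate = defaultdict(list)
    for cell in cells:
        by_plate[cell["plate"]].append(cell)

    standardized = []
    for plate_cells in by_plate.values():
        # Per-plate, per-feature mean and (population) standard deviation.
        stats = {}
        for name in feature_names:
            values = [c[name] for c in plate_cells]
            stats[name] = (mean(values), pstdev(values))
        for c in plate_cells:
            row = dict(c)
            for name in feature_names:
                mu, sigma = stats[name]
                # Constant features carry no information; map them to 0.
                row[name] = (c[name] - mu) / sigma if sigma > 0 else 0.0
            standardized.append(row)
    return standardized


def drop_negative_controls(cells):
    """Remove negative-control cells after standardization."""
    return [c for c in cells if not c["is_negcon"]]


# Hypothetical toy data: two plates on very different raw scales.
cells = [
    {"plate": "P1", "is_negcon": True, "area": 10.0},
    {"plate": "P1", "is_negcon": False, "area": 12.0},
    {"plate": "P1", "is_negcon": False, "area": 14.0},
    {"plate": "P2", "is_negcon": True, "area": 100.0},
    {"plate": "P2", "is_negcon": False, "area": 130.0},
    {"plate": "P2", "is_negcon": False, "area": 160.0},
]

standardized = standardize_per_plate(cells, ["area"])
treated_only = drop_negative_controls(standardized)
```

After this step both plates share a common scale even though their raw values differ by an order of magnitude. A log or square transformation, as raised by the reviewer, would be applied to the raw values before this step if it were used.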

      R3.6. In Figure 5 b and c, MoAs often seem to be represented by singular compounds and thus, the test (MoA prediction) is very similar to the training (compound ID). Given this context, a discussion about the extent this presents a circular argument supported by stats on the compound library used for training and testing would be beneficial.

      Clusters in (the new) Figure 7 that contain only replicates of a single compound would not yield an improved performance on the MoA task unless they also include replicates of other compounds sharing the same MoA in close proximity. Please see our response to R1.1c.iii. for details. To improve visual clarity and avoid misinterpretation, we have recomputed the colors for (the new) Figure 7 and grayed out overlapping points.

      R3.7 Can you estimate the minimum amount of supervision (fuzzy/sparse labels, often present in mislabeled compound libraries with dirty compounds and polypharmacology being present) that is needed for it to be efficiently trained?

      It's important to note that the metadata used by the model is only based on identifying replicates of the same compound. Mechanism of action (MoA) annotations, which can be erroneous due to dirty compounds, polypharmacology, and incomplete information, are not used in training at all. MoA annotations are only used in our evaluation, specifically for calculating the mAP for MoA retrieval.

      We have successfully trained CytoSummaryNet on 72 unique compounds with 4 replicates each. This is the current empirical minimum, but it is possible that the model could be trained effectively with even fewer compounds or replicates.

      Determining the absolute minimum amount of supervision required for efficient training would require further experimentation and analysis. Factors such as data quality, feature dimensionality, and model complexity could influence the required level of supervision.
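To make the evaluation metric concrete, the following is a generic sketch of mean average precision (mAP) for label retrieval: each profile is used as a query, all other profiles are ranked by cosine similarity, and precision is averaged over the ranks at which a profile with the same label appears. The exact mAP formulation used in the manuscript may differ (for example, in which profiles are eligible as matches); the function names and toy profiles here are hypothetical.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))


def average_precision(query, profiles, labels):
    """Average precision for one query profile against all other profiles."""
    others = [i for i in range(len(profiles)) if i != query]
    ranked = sorted(others, key=lambda i: cosine(profiles[query], profiles[i]), reverse=True)
    hits, precisions = 0, []
    for rank, i in enumerate(ranked, start=1):
        if labels[i] == labels[query]:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(profiles, labels):
    """Mean of the per-query average precision values."""
    scores = [average_precision(i, profiles, labels) for i in range(len(profiles))]
    return sum(scores) / len(scores)


# Toy profiles: two well-separated groups.
profiles = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
matching_labels = ["MEK", "MEK", "HSP", "HSP"]   # labels agree with the geometry
mismatched_labels = ["MEK", "HSP", "MEK", "HSP"]  # labels cut across the geometry

map_matching = mean_average_precision(profiles, matching_labels)
map_mismatched = mean_average_precision(profiles, mismatched_labels)
```

The same function can score replicate retrieval (labels are compound identifiers) or MoA retrieval (labels are MoA annotations); only the labels change, which is why annotation errors affect the evaluation but not the training signal.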

      R3.minor1 Figure 5: The x-axis and y-axis tick values are too small, and image resolution/size needs to be increased.

      We have made the following changes to address the concerns:

      • Increased the image resolution and size to improve clarity and readability.
      • Removed the x-axis and y-axis tick values, as they do not provide meaningful information in the context of UMAP visualizations.

      We believe these modifications enhance the visual presentation of the data and make it easier for readers to interpret the results.

      R3.minor2 The methods applied to optimize hyperparameters in supplementary data need to be included.

      We added the following to Supplementary Material D:

      “We used the Weights & Biases (WandB) sweep suite in combination with the BOHB (Bayesian Optimization and HyperBand) algorithm for hyperparameter sweeps. The BOHB algorithm [47] combines Bayesian optimization with bandit-based strategies to efficiently find optimal hyperparameters.

      Additionally, Table D1 provides an overview of all tunable hyperparameters and their chosen values based on a BOHB hyperparameter optimization.”

      R3.minor3 Figure 5(c, d): The names of compound 2 and Compound 5 need to be included in the labels.

      These compounds were obtained from external companies and are proprietary, necessitating their anonymization in our study. This has now been added in the caption of (the new) Figure 7:

      “Note that Compound2 and Compound5 are intentionally anonymized.”

      R3.minor4 Table C1: Plate descriptions need to be included.

      Table C1: The training, validation, and test set stratification for Stain2, Stain3, Stain4, and Stain5. Five training, four validation, and three test plates are used for Stain2, Stain3, and Stain4. Stain5 contains six test set plates only.

      | Split | Stain2 | Stain3 | Stain4 | Stain5 |
      |---|---|---|---|---|
      | Training | BR00113818 | BR00115128 | BR00116627 | |
      | Training | BR00113820 | BR00115125highexp | BR00116631 | |
      | Training | BR00112202 | BR00115133highexp | BR00116625 | |
      | Training | BR00112197binned | BR00115131 | BR00116630highexp | |
      | Training | BR00112198 | BR00115134 | 200922_015124-Vhighexp | |
      | Validation | BR00112197standard | BR00115129 | BR00116628highexp | |
      | Validation | BR00112197repeat | BR00115133 | BR00116629highexp | |
      | Validation | BR00112204 | BR00115128highexp | BR00116627highexp | |
      | Validation | BR00112201 | BR00115127 | BR00116629 | |
      | Test | BR00112199 | BR00115134bin1 | 200922_044247-Vbin1 | BR00120532 |
      | Test | BR00113819 | BR00115134multiplane | 200922_015124-V | BR00120270 |
      | Test | BR00113821 | BR00115126highexp | BR00116633bin1 | BR00120536 |
      | Test | | | | BR00120530 |
      | Test | | | | BR00120526 |
      | Test | | | | BR00120274 |

      We have added a reference to the metadata file in the description of Table C1: https://github.com/carpenter-singh-lab/2023_Cimini_NatureProtocols/blob/main/JUMPExperimentMasterTable.csv

      R3.minor5 Figure F1: Does the green box (stain 3) also involve training on plates from stain 4 (BR00116630highexp) and 5 (BR00120530) mentioned in Table C1? Please check the figure once again for possible errors.

      We have carefully re-examined Figure F1 and Table C1 to ensure their accuracy and consistency. Upon double-checking, we can confirm that the figure is indeed correct. We intentionally omitted the training and validation plates from Figure F1 to maintain clarity and readability, as including them resulted in a cluttered and difficult-to-interpret figure.

      Regarding the specific plates mentioned:

      • BR00116630highexp (Stain4) is used for training, as correctly stated in Table C1. This plate is considered an outlier within the Stain4 dataset and happens to cluster with the Stain3 plates in Figure F1.
      • BR00120530 (Stain5) is part of the test set only and correctly falls within the Stain5 cluster in Figure F1.

      To improve the clarity of the training, validation, and test split in Table C1, we have added a color scheme that visually distinguishes the different data subsets. This should make it easier for readers to understand the distribution of plates across the various splits.
    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      The authors present a well-developed and useful algorithm. The technical motivation and validation are very carefully and clearly explained, and their work is potentially useful to a varied audience.

      That said, I think the authors could do a better job, especially in the figures, of putting the algorithm in context for an audience that is unfamiliar with the cell painting assay. For example, a figure towards the beginning of the paper with example images might help to set the stage. Similarly a schematic of the algorithm earlier in the paper would provide a graphical overview. For the sake of a biologically inclined audience, I would consider labeling the images in the caption by cell type and label.

      The interpretability results were intriguing. The authors might consider further validating these interpretations by removing weakly informative cells from the dataset and retraining. Are the cells so uninformative that the algorithm does better without them, or are they just less informative than other cells?

      As far as I can tell, the authors only obliquely state whether the code associated with the manuscript is openly available. Posting the code is needed for reproducibility. I would provide not only a GitHub link, but also a DOI linked to the code, or some other permanent link.

      Significance

      Incorporating biological heterogeneity into machine-learning driven problems is a critical research question. Replacing means/modes and such with a machine learning framework, the authors have identified a problem with potentially wide significance. The application to cell painting and related assays is of broad enough significance for many journals. However, the authors could further broaden the significance by commenting on other possible cell biology applications. What other applications might the algorithm be particularly suited for? Are there any possible roadblocks to wider use? What sorts of data has the code been tested on so far?

  4. Jul 2024
    1. Background: Over the past few years, the rise of omics technologies has offered an exceptional chance to gain a deeper insight into the structural and functional characteristics of microbial communities. As a result, there is a growing demand for user friendly, reproducible, and versatile bioinformatic tools that can effectively harness multi-omics data to offer a holistic understanding of microbiomes. Previously, we introduced gNOMO, a bioinformatic pipeline specifically tailored to analyze microbiome multi-omics data in an integrative manner. In response to the evolving demands within the microbiome field and the growing necessity for integrated multi-omics data analysis, we have implemented substantial enhancements to the gNOMO pipeline. Results: Here, we present gNOMO2, a comprehensive and modular pipeline that can seamlessly manage various omics combinations, ranging from two to four distinct omics data types including 16S rRNA gene amplicon sequencing, metagenomics, metatranscriptomics, and metaproteomics. Furthermore, gNOMO2 features a specialized module for processing 16S rRNA gene amplicon sequencing data to create a protein database suitable for metaproteomics investigations. Moreover, it incorporates new differential abundance, integration and visualization approaches, all aimed at providing a more comprehensive toolkit and insightful analysis of microbiomes. The functionality of these new features is showcased through the use of four microbiome multi-omics datasets encompassing various ecosystems and omics combinations. gNOMO2 not only replicated most of the primary findings from these studies but also offered further valuable perspectives. Conclusions: gNOMO2 enables the thorough integration of taxonomic and functional analyses in microbiome multi-omics data, opening up avenues for novel insights in the field of both host associated and free-living microbiome research. gNOMO2 is available freely at https://github.com/muzafferarikan/gNOMO2.

      This work has been peer reviewed in GigaScience (see paper), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer name: Alexander Bartholomaus (original submission)

      Summary: "gNOMO2: a comprehensive and modular pipeline for integrated multi-omics analyses of microbiomes" by Arıkan and Muth presents a multi-omics tool for the analysis of prokaryotes. It is an evolution of the first version and offers various separate modules that take different types of input data. The authors present different example analyses based on already published data and reproduce the results. The manuscript is very well written (I could not detect a single typo) and it was fun to read! Well done! I have only very few comments and suggestions, see below. However, I had a problem executing the code.

      Key questions to answer: 1) Are the methods appropriate to the aims of the study, are they well described, and are necessary controls included? Yes 2) Are the conclusions adequately supported by the data shown? Yes 3) Please indicate the quality of language in the manuscript. Does it require a heavy editing for language and clarity? Very well written! 4) Are you able to assess all statistics in the manuscript, including the appropriateness of statistical tests used? No direct statistics given in the manuscript. Maybe the authors could include some example output as .zip file for interested potential users.

      Detailed comments to the manuscript: Line 168: What does "cleaned and redundancies are removed" mean? Are only identical genomes removed? Or are genome parts that are identical (I guess this barely exists, except for conserved gene parts such as 16S, or similar) removed? Or are only redundant genes removed? How is redundancy defined, e.g. a 99% identical stretch? Line 399-405: When looking at figure 5A I am wondering how Fluviicoccus and Methanosarcina in the MP fraction appear relatively abundant in some samples. Were they de novo assembled in the MG or MT modules? General comment on figures: I know that it is a hack to deal with automatic figure generation and especially the axis labels (as names have very different lengths). However, I think some figures might be hardly visible in the printed version; in particular, the axis labels for panel B are very small. Maybe you can put the critical figures separately in the supplement, e.g. each B panel on one page.

      Suggestions: As suggested above, maybe the authors could include some example output (a simple example) as a .zip file for interested potential users. This would give an idea of what the output looks like and what to expect besides the plots. Differential abundance tables might be more important than the plots, as users would generate their own plots for later publications.

      GitHub and software: I also tested the software and followed the instructions on GitHub. I successfully executed the "Requirements" and "Config" steps (including creation of the metadata file and copying of the amplicon reads) and tried to execute Module 1.

      However, the following error occurred (using up-to-date conda and snakemake on Ubuntu Linux):

      (snakemake) abartho@gmbs17:~/review_papers/GigaScience/gNOMO2$ snakemake -v
      6.15.5
      (snakemake) abartho@gmbs17:~/review_papers/GigaScience/gNOMO2$ snakemake -s workflow/Snakefile --cores 20
      SyntaxError in line 9 of /home/abartho/miniconda3/envs/snakemake/lib/python3.6/site-packages/smart_open/s3.py:
      future feature annotations is not defined (s3.py, line 9)
        File "/home/abartho/miniconda3/envs/snakemake/lib/python3.6/site-packages/smart_open/__init__.py", line 34, in <module>
        File "/home/abartho/miniconda3/envs/snakemake/lib/python3.6/site-packages/smart_open/smart_open_lib.py", line 35, in <module>
        File "/home/abartho/miniconda3/envs/snakemake/lib/python3.6/site-packages/smart_open/doctools.py", line 21, in <module>
        File "/home/abartho/miniconda3/envs/snakemake/lib/python3.6/site-packages/smart_open/transport.py", line 104, in <module>
        File "/home/abartho/miniconda3/envs/snakemake/lib/python3.6/site-packages/smart_open/transport.py", line 49, in register_transport
        File "/home/abartho/miniconda3/envs/snakemake/lib/python3.6/importlib/__init__.py", line 126, in import_module

      In addition to solving the problem, an example metadata file and some explanation of the output (which I have not seen yet) would be good for less experienced users.
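The message "future feature annotations is not defined" is what Python raises when a module containing `from __future__ import annotations` is imported by an interpreter older than 3.7, and the paths in the traceback point to a `python3.6` environment. The check below is a minimal, hypothetical sketch for confirming this in a given environment; it is not part of gNOMO2 or Snakemake, and the function name is illustrative.

```python
import sys


def supports_future_annotations(version_info=sys.version_info):
    """Return True if the interpreter accepts `from __future__ import annotations`.

    The `annotations` future feature (PEP 563) was introduced in Python 3.7.
    """
    return tuple(version_info[:2]) >= (3, 7)


if not supports_future_annotations():
    print(
        "Python %d.%d is too old for packages that use "
        "'from __future__ import annotations'; Python 3.7 or newer is needed."
        % tuple(sys.version_info[:2])
    )
```

If the check fails inside the activated environment, recreating the environment with a newer Python version, or pinning an older release of the affected dependency, are the usual ways out; which of the two suits gNOMO2 is for the authors to specify.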

    1. Cohort studies increasingly collect biosamples for molecular profiling and are observing molecular heterogeneity. High throughput RNA sequencing is providing large datasets capable of reflecting disease mechanisms. Clustering approaches have produced a number of tools to help dissect complex heterogeneous datasets, however, selecting the appropriate method and parameters to perform exploratory clustering analysis of transcriptomic data requires deep understanding of machine learning and extensive computational experimentation. Tools that assist with such decisions without prior field knowledge are nonexistent. To address this we have developed Omada, a suite of tools aiming to automate these processes and make robust unsupervised clustering of transcriptomic data more accessible through automated machine learning based functions. The efficiency of each tool was tested with five datasets characterised by different expression signal strengths to capture a wide spectrum of RNA expression datasets. Our toolkit’s decisions reflected the real number of stable partitions in datasets where the subgroups are discernible. Within datasets with less clear biological distinctions, our tools either formed stable subgroups with different expression profiles and robust clinical associations or revealed signs of problematic data such as biased measurements.

      This work has been peer reviewed in GigaScience (see paper), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer name: Casey S. Greene (original submission)

      The authors describe a system for clustering gene expression data. The manuscript describes clustering workflows (data cleaning, assessing data structure, etc).

      I found the manuscript difficult to read. It reads somewhat like a how-to guide and somewhat like a software package. I recommend approaching this as a software package, which would require adding evidence to support the choices made. Describe the purpose for the package, evidence for the choices made, benchmarking (compute and performance), describe application to one or more case studies, and discuss how the work fits into the context.

      The evaluation includes two simulation studies and then application to a few real datasets; however, for all real datasets the problem is either very easy or the answer is unknown. The largest challenges I have with the manuscript are the large number of arbitrarily selected parameters and the limited evidence available to support those as strong choices. Conceptually, an alternative strategy is to consider real clusters to be those that are robust over many clustering methods. In this case, the best clusters are those that are maximally detectable with a single method. While there exists software for the former strategy, this package implements the latter strategy. It is not intuitively clear to me that this framework is superior to the other for biological discovery. It seems like general clusters (i.e., those that persist across multiple parameterizations) may be the most fruitful to pursue. It would be helpful to provide evidence that the selected strategy has superior utility in at least some settings and a description of how those settings might be identified. I examined the vignette, and I found that it provided a set of examples. I can imagine that running this on larger datasets would be highly time-consuming. It would be helpful to add benchmarking or an estimate of compute time. Given that this seems feasible to parallelize, it might make sense to provide a mechanism for parallelization.

      I examined the software briefly. There are some comments. Dead code exists in some files. There is at least one typo in a filename (gene_singatures.R). Some of the choices that seemed arbitrary appear to be written into the software (e.g., get_top30percent_coefficients.R).

    1. There has been growing caution around biological foundation models due to potential biosecurity threats such as generating novel pathogenic viruses or guiding gain-of-function viral mutations

      It might be good to mention this rationale/justification and thought process earlier in the preprint so people understand why the code isn't available right now.

    1. In the CSS section we therefore find the name of the font used, its weight, its style, the font size, and the line height in pixels.

      It is impossible to get the CSS code in Figma. You can guess the font-family, font-style, font-weight, and font-size, but there is no obvious way to find the line height...

    1. Math camp was the happiest educational experience of my childhood. I loved theoretical math in grade school even and majored in philosophy of mathematics in college with the intention of going on in artificial intelligence or what at the time was called “quantificational logic” — roughly, machine language, translating human language into code and instructions that can be executed by computers.

      Shared experience by author

    1. Reviewer #2 (Public Review):

      The study presented by Paoli et al. explores temporal aspects of neuronal encoding of odors and their perception, using bees as a general model for insects. The neuronal encoding of the presence of an odor is not a static representation; rather, its neuronal representation is partly encoded by the temporal order in which parallel olfactory pathways participate and are combined. This aspect is not novel, and its relevance in odor encoding and recognition has been discussed for more than the past 20 years.

      The temporal richness of the olfactory code and its significance have traditionally been driven by results obtained based on electrophysiological methods with temporal resolution, allowing the identification and timing of the action potentials in the different populations of neurons whose combination encodes the identity of an odor. On the other hand, optophysiological methods that enable spatial resolution and cell identification in odor coding lack the temporal resolution to appreciate the intricacies of olfactory code dynamics.

      (1) In this context, the main merit of Paoli et al.'s work is achieving an optical recording that allows for spatial registration of olfactory codes with greater temporal detail than the classical method and, at the same time, with greater sensitivity to measure inhibitions as part of the olfactory code.

      The work clearly demonstrates how the onset and offset of odor stimulation triggers a dynamic code at the level of the first interneurons of the olfactory system that changes at every moment as a natural consequence of the local inhibitory interactions within the first olfactory neuropil, the antennal lobe. This gives rise to the interesting theory that each combination of activated neurons along this temporal sequence corresponds to the perception of a different odor. The extent to which the corresponding postsynaptic layers integrate this temporal information to drive the perception of an odor, or whether this sequence is, in a sense, a journey through different perceptions, is challenging to address experimentally.

      In their work, the authors propose a computational approach and olfactory learning experiments in bees to address these questions and evaluate whether the sequence of combinations drives a sequence of different perceptions. In my view, it is a highly inspiring piece of work that still leaves several questions unanswered.

      (2) In my opinion, the detailed temporal profile of the response of projection neurons and their respective probabilities of occurrence provide valuable information for understanding odor coding at the level of neurons transferring information from the antennal lobes to the mushroom bodies. An analysis of these probabilities in each animal, rather than in the population of animals that were measured, would aid in better comprehending the encoding function of such temporal profiles. Being able to identify the involved glomeruli and understanding the extent to which the sequence of patterns and inhibitions is conserved for each odor across different animals, as it is well known for the initial excitatory burst of activity observed in previous studies without the fine temporal detail, would also be highly significant.

      In my view, the computational approach serves as a useful tool to inspire future experiments; however, it appears somewhat simplistic in tackling the complexity of the subject. One question that I believe the researchers do not address is to what extent the inhibitions recorded in the projection neurons are integrated by the Kenyon cells and are functional for generating odor-specific patterns at that level.

      Lastly, the behavioral result indicating a difference in conditioned response latency after early or delayed learning protocol is interesting. However, it does not align with the expected time for the neuronal representation that was theoretically rewarded in the delayed protocol. This final result does not support the authors' interpretation regarding the existence of a smell and an after-smell as separate percepts that can serve as conditioned stimuli.

    1. Installing Visual Studio Code

      It would be good to give them some information about what happens if they already have VSC installed, so that it is clear to them.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      1. General Statements [optional]

      This section is optional. Insert here any general statements you wish to make about the goal of the study or about the reviews.

      We thank the reviewers for their insights and helpful suggestions on the manuscript. Based on these, we have prepared a revision plan for this manuscript, which is outlined below. We believe these revisions will improve the overall quality of the manuscript.

      2. Description of the planned revisions

      Insert here a point-by-point reply that explains what revisions, additional experimentations and analyses are planned to address the points raised by the referees.

      Reviewer #1

      (Evidence, reproducibility and clarity (Required)):

      Summary:

      This study builds on previous work from the same group, where they use Drosophila photoreceptors as a model system to investigate the role of ER-plasma membrane contact sites in an in vivo setting. The authors recently described a role of the ER-PM contact site protein dEsyt in regulating photoreceptor function in Drosophila. In this follow-up study, they explore whether this function of dEsyt is connected to Ca2+ signaling downstream of photoreceptor activation. Using a dEsyt mutant that should be unable to bind Ca2+, they find that Ca2+ to some extent is required for dEsyt localization, membrane contact site formation and photoreceptor function.

      Major comments:

      The use of photoreceptor cells in Drosophila is an elegant model system that enables studies of membrane contact sites and associated proteins in a native condition. The data presented by the authors clearly show that these structures are important for photoreceptor function, and that dEsyt plays a role at these sites. However, this was already known from previous studies by the same group. When it comes to whether these contacts are sensing Ca2+ changes and if these changes are acting through dEsyt, which is the focus of the current manuscript, the results are unclear to me and would need to be clarified by the authors both in text and with new experiments.

      1) What is the role of cellular Ca2+ signaling in the regulation of dEsyt function? There are several aspects here that need to be clarified. 1) How is WT dEsyt localization regulated by Ca2+? This could for example be evaluated in the mutant flies used in Fig. 1 (trpl302; trp343), where lack of light-induced Ca2+ influx would be predicted to result in a localization of dEsyt that resembles that observed for dEsytCaBM. 2) Is Ca2+ important for dEsyt localization, lipid exchange or both? The authors express a version of dEsyt with mutations made in all three C2 domains. In mammalian E-Syts, Ca2+ binding to the C2A domain is important for lipid exchange while binding to C2C (in E-Syt1) is important for interactions with lipids in the plasma membrane. Using more carefully designed mutants will allow the authors to determine how Ca2+ regulates dEsyt function in vivo. In addition, the authors must show experimentally that the mutant dEsytCaBM is unable to bind Ca2+ (could e.g. be done by acute Ca2+ changes in the cell-based model used in Fig. 3). Writing that "This transgene carrying a total of nine mutations should render the protein unable to bind calcium" (p. 6, line 173) is not sufficient.

      1) How is WT dEsyt localization regulated by Ca2+?

      We agree that further experimental evidence would be helpful in establishing the significance of cellular Ca2+ signaling in the control of dEsyt function. As suggested by the reviewer, the localization of wild type dEsyt will be examined in the mutants: norpAP24 (PLC null mutant) and trpl302; trp343 (protein null mutants of TRPL and TRP channels respectively) in which light induced calcium influx is eliminated. These data will be included in the revision.

      2) Is Ca2+ important for dEsyt localization, lipid exchange or both?

      We have already performed experiments to address the question of how important calcium binding to dEsyt is for lipid transport at the ER-PM interface in Drosophila photoreceptors. These results indicate a previously unexpected role for lipid exchange and will be included in the revision.

      3). Writing that "This transgene carrying a total of nine mutations should render the protein unable to bind calcium" (p. 6, line 173) is not sufficient.

      We concur with the reviewers that at present we do not have experimental data to demonstrate that dEsytCaBM can't bind Ca2+. However, as Reviewer 4 pointed out, it will be challenging to demonstrate this experimentally. A direct proof would only come from measurements of the calcium binding affinity of dEsyt (which involves protein purification that is beyond the scope of the current work). An indirect demonstration would be any cellular or in vivo experiment. In addition to the in silico analysis already included in Fig 2 C-F, we propose the following to strengthen that analysis: use an AlphaFold model to demonstrate that the arrangement of the calcium binding residues in the C2 domain of dEsyt is compatible with Ca2+ binding.

      2) The localization of dEsyt shown in Fig. 3B is a bit confusing. First of all, I would recommend including markers of the ER and the plasma membrane, because without these it is difficult to make statements about the localization of dEsyt to these structures.

      As suggested, to better appreciate the localization of dEsyt in photoreceptors, we will perform colocalization of dEsyt with markers of the PM (Rhabdomere) and ER (Sub Microvillar Cisternae).

      Second, it appears that WT dEsyt localizes to the reticular ER, and that the CaBM version localizes to the plasma membrane. This is somewhat opposite to mammalian ESyts, where mutations that prevent Ca2+ binding either had no effect (for ESyt2) or prevented (for ESyt1) the interaction with the plasma membrane. It also appears different from the localization in vivo (Fig. 3C). Clarifying this will be important. It will also be important to connect this localization to changes in Ca2+ and not just to the localization of a mutant that may or may not be deficient in Ca2+ binding (see comment above).

      In considering this comment, we need to bear in mind the following:

      • Mammalian cells have three genes that encode for Esyt: Esyt 1, 2 and 3 whereas the Drosophila genome encodes only a single gene for Esyt.
      • In terms of sequence similarity and structure, dEsyt and hEsyt2 are very similar. However, in contrast to hEsyt2 and hEsyt3, which localize to the plasma membrane (PMID: 17360437), dEsyt acts like hEsyt1 and localizes to the ER-PM junctions.
      • A single study (PMID: 27065097) has shown that the SMP domain of Esyt1 can transfer lipids in an in vitro assay. In our studies, we have noted an unexpected function for the SMP domain of dEsyt for in vivo function as measured through phenotypes in the eye (data will be presented in the revised ms).
      • While knocking out the single dEsyt in Drosophila photoreceptor neurons results in phenotypes (Nath et al., PMID: 32716137), to date, knocking out all three Esyts in mammalian cell culture models or mice has not revealed an in vivo phenotype.

      Bearing these points in mind, it may not be reasonable to expect every observation on mammalian Esyt to be recapitulated in the fly system or vice versa.

      3) I don't fully understand the time course of events. The authors show that dEsytCaBM is mislocalized already at day 1 in dark-reared flies (Fig. 3C) but this mislocalization is not accompanied by a change in MCS density or gap distance, and consistently does not influence the localization of RDGB. The authors next expose the flies to constant light illumination to trigger Ca2+ dependent signaling, and this leads to mislocalization of RDGB, perhaps indicating changes in MCS (this is not shown). From these results it is difficult to know what the role of dEsyt is. It would be necessary to also show a control where Ca2+ signaling is not induced, e.g. a parallel dark-control (same number of days but no illumination).

      It is important to remember that even complete loss of Esyt does not result in altered MCS or mislocalization of RDGB on day 1 post eclosion. This has been published by us previously (Nath et.al PMID: 32716137). Since we show in this manuscript that dEsytCaBM exerts a dominant negative effect when expressed in wild type and phenocopies dEsytKO, one might expect expression of dEsytCaBM to also lead to altered MCS density and mislocalization of RDGB by 6D constant light.

      Bearing this in mind, we will incorporate the following data in the manuscript:

      • Addition of MCS density in dEsytKO photoreceptors at Day1 in Figure 3C.
      • Electron Microscopy to check MCS density in Rh1>dEsytCaBM at Day 6CL with appropriate control genotypes.
      • Confocal Imaging: RDGB staining in Rh1>dEsytCaBM Day 6CD reared flies with appropriate control genotypes (a dark control where only reduced Ca2+ signaling is induced due to dark noise or spontaneous PLC activation).

      This is particularly important given that the authors show in Fig. 1 that preventing Ca2+ influx had a dramatic impact on MCS density even at day 1 (which is in sharp contrast to dEsytCaBM-expressing flies, that show normal morphology at day 1, which rather implies that dEsyt is not a major Ca2+ effector).

      In thinking about this comment, it is important to bear in mind the details of the experimental paradigm in use in each of the experiments while drawing comparisons between the observed results. Note that throughout the manuscript dEsytCaBM is expressed selectively in photoreceptors using the Rhodopsin enhancer, which drives expression of the transgene during late eye development. By contrast, in germ line mutant strains such as trpl302;trp343 the channels are blocked throughout development. Thus the phenotypes of trpl302;trp343 might be broader than that of expressing dEsytCaBM. Therefore, mutating the calcium binding residues of dEsyt and expressing it using the Rh1 enhancer in a specific developmental time window might not have the same impact on contact site density as completely blocking the major calcium permeable channels, TRP and TRPL, which are important to sustain the ongoing phototransduction cascade throughout development.

      4) The experiments done in dEsyt KO flies are important, and here the authors show that dEsyt1 could to some extent rescue all phenotypes. Some results are a bit puzzling. For example, dEsyt1CaBM localization in dEsyt1 KO flies is identical to that of WT dEsyt (Fig. 5C), which is in sharp contrast to the data shown in Fig. 3C. What is the reason for this? I would have anticipated the opposite (i.e. that in WT flies, dEsytCaBM can form dimers with endogenous dEsyt through SMP-domain interactions which may have an impact on its localization and the function of endogenous dEsyt, but that in the dEsyt KO cells, dEsytCaBM would show a different localization due to the lack of endogenous dEsyt to interact with). It is important to clarify as one of the major observations here is that dEsytCaBM no longer localizes to MCS. Since the CaBM version of dEsyt could rescue, to some extent, MCS density and delay photoreceptor degeneration, this implies that Ca2+ may not be required for regulation of dEsyt function or that the mutant is still able to partially bind to Ca2+.

      The localization shown in Fig 5C is not of dEsytCaBM in dEsytKO photoreceptors but the localization of RDGB in Rh1>dEsytCaBM; dEsytKO at Day 1 (Figure 5C i) and as a function of age and illumination- Day 6CL (Figure 5C ii).

      One experiment that would help the authors determine the function of dEsyt in vivo would be to use a mutant that lacks a functional SMP domain (ideally also with and without mutations in the C2-domains).

      There is information available to address the question of how the lipid binding module, SMP is important to render dEsyt functional at the ER-PM interface in Drosophila photoreceptors. The same will be included in the revision.

      5) PLC activation typically couples to rapid signaling and involves hydrolysis of PIP2 and release of Ca2+ from the ER. Mammalian Esyts also require PIP2 for plasma membrane binding (through interactions with C2-domains), so constitutive PLC activity would be expected to impair ESyt localization to MCS. Here, the authors expose flies for days of constant illumination. How does this influence plasma membrane PIP2 levels and could this be of relevance for how data is interpreted?

      This is an interesting question from the reviewer. However, we would like to clarify the fact that constitutive activation of PLC is different from constant activation of PLC during illumination. Flies have robust mechanisms for controlling PLC turnover and PIP2 levels during continuous illumination and Ca2+ is a key regulator of this process; the underlying mechanisms have been described by Raghu and Hardie in multiple past papers (PMID: 11343651, PMID: 15355960). This is why, apart from adaptation, flies grown in constant light for many days do not show electrophysiological defects and neither do they undergo retinal degeneration. We will however measure the kinetics of PIP2 resynthesis in (i) wild type (Day 1 vs Day 6CD vs Day 6CL) and (ii) Control, Rh1>dEsyt and Rh1>dEsytCaBM (Day 1 vs Day6CL). This might reveal some interesting insight into the mutants.

      Do the authors know whether the CaBM mutant has reduced affinity for PIP2?

      The ability of wild type dEsyt to bind PIP2 has not been determined. We will test this and if it does so, the impact of CaBM on PIP2 binding can be tested.

      Minor comments:

      • The overexpression of WT dEsyt had a dramatic impact on MCS density and gap distance, while expression of dEsytCaBM did not. If these contacts are important for photoreceptor function, is it not surprising that such a dramatic change in photoreceptor structure was without effect on function? This should be further discussed.

      The establishment of more contact sites and reduction in contact site distance in Rh1>dEsyt::GFP photoreceptors is likely indicative of the proposed tethering role of the protein at the ER-PM MCS. An increase in contact site density or a reduction in distance need not directly parallel an increase in the levels of MCS proteins that are expressed at these contact sites to enhance the ongoing signal transduction. We will test this idea proposed by the reviewer and include the following data in a revision to strengthen our statement:

      • RDGB levels in control vs Rh1>dEsyt::GFP - Western blot

      • Electroretinograms from the genotypes indicated above as a functional readout of the ongoing signaling cascade.
      • PIP2 kinetics in control vs Rh1>dEsyt::GFP to understand if establishing more contact sites can enhance the replenishment of the lipid at the PM.

      2) How is quantification of MCS density and gap distance influenced by retinal degeneration (e.g. induced by dEsyt KO)?

      Wherever we have analyzed MCS density or gap distance, these experiments have been done in flies at ages prior to the onset of retinal degeneration defined as collapse of the microvilli of the rhabdomere. Therefore, our measurements of MCS density and gap in this paper are not affected by retinal degeneration.

      3) The graphical abstract is a bit confusing. It seems to suggest that changes in dEsyt is a consequence of ageing and does not show any role of this protein in photoreceptor function. I think that the abstract could be improved to more clearly highlight the findings in the manuscript. For example, it doesn't at all show the difference in localization between WT and CaBM.

      We will modify the graphical abstract.

      4) P. 5, line 135 the authors state that "The tethering and lipid transfer activity of mammalian Esyts are reported to be influenced by Ca2+". This is a massive understatement. Ca2+ is a critical regulator of Esyt function in mammalian cells.

      The statement will be modified.

      5) In figure legend 1B and C: correct µM to µm.

      Changes will be incorporated as per the suggestion.

      6) In figure legend 2A: should be red rectangles and not black rectangles.

      Changes will be incorporated as per the suggestion.

      7) In Fig. 2B: specify which isoform of human ESyt that is shown.

      Changes will be incorporated as per the suggestion.

      8) In Fig. 2C: do the authors mean D374 or D384 (as indicated in Fig. 2A)?

      Changes will be incorporated as per the suggestion; the residue is D374.

      Significance

      Light-induced signal transduction in photoreceptor cells involves Ca2+ influx and signaling and also depends on correct formation of ER-plasma membrane contact sites. In mammalian cells, the Esyts (esp. Esyt1 and Esyt2) localize to ER-PM contacts in a Ca2+-dependent manner, and the ion has dual effects in both enriching the protein at the membrane contact sites and in promoting lipid transport. Mammalian Esyts form homo- and heterodimers, and the properties of the dimers depend on their composition (PMID: 26202220). Drosophila only have one Esyt (dEsyt) which is structurally most similar to mammalian Esyt2, and the authors have previously shown how this protein is required for photoreceptor function (PMID: 32716137), although the role of Ca2+ was not investigated in that study. However, an earlier study has shown that mutations of all Ca2+-coordinating residues in dEsyt impair protein function in Drosophila neurons (PMID: 28882990), so a similar Ca2+-dependence in the retina would be expected. The results from the present study confirm the requirement of Ca2+ signaling for dEsyt function, and extend this Ca2+-dependent regulation to also involve photoreceptor-induced Ca2+ signaling, which corroborates many other studies showing the requirement of Ca2+ signaling for the regulation of Esyt function in mammalian cells (e.g. PMID: 23791178; PMID: 27065097; PMID: 29222176; PMID: 26202220; PMID: 24183667; PMID: 30589572). As such, the results from this study represent an incremental step towards understanding Esyt function in vivo. These results would be of greatest interest to researchers working on photoreceptor function, and of some interest to a broader audience working on membrane contact sites and signal transduction. My own background is in mammalian cell biology, with a focus on lipid and Ca2+ signaling and inter-organelle communication. I have limited understanding of the model system used here (Drosophila photoreceptor cells).


      We would like to provide an alternative perspective on the reviewer’s view that “As such, the results from this study represent an incremental step towards understanding Esyt function in vivo.”

      We are well aware of the content in several studies of Esyt in mammalian cells including the ones cited by the reviewer (e.g. PMID: 23791178; PMID: 27065097; PMID: 29222176; PMID: 26202220; PMID: 24183667; PMID: 30589572). These have been cited in our manuscript. However, it is important to recognize that each of these studies is an analysis of the properties of mammalian Esyt as a molecule in the context of Ca2+. However, none of these studies addresses the key question of whether the regulation of Esyt by Ca2+ is important for cellular function or to support cell physiology. The reason for this is quite straightforward and well known in the field. To date, there is no cellular or physiological phenotype that is reported to depend on endogenous Esyt function in mammalian cellular or animal models. As an illustrative example, deletion of all three mammalian Esyt does not affect cell signalling (PMID 23791178) including Ca2+ signalling and a triple knockout of all three Esyt in mice (PMID: 27348751) has no discernable phenotype.

      By contrast, deletion of the single Esyt gene in Drosophila results in robust phenotypes in adult photoreceptors (PMID: 32716137). Using these phenotypes, in this manuscript we study the importance of Ca2+ dependent regulation of cellular functions mediated by dEsyt. Therefore, this study fills an important gap in establishing the mechanism by which dEsyt proteins regulate cellular functions in vivo, in a Ca2+ dependent manner. We respectfully ask that this not be caricatured as an incremental step.


      Reviewer #2

      Evidence, reproducibility and clarity

      Esyt is a C domain (a Ca2+ binding domain) containing protein that localizes to the ER-MCS, playing a role in ER-mitochondria tethering and lipid transfer. At the same time, proteins at the ER-MCS are well-positioned to sense changing levels of Ca2+. Previous studies reported that loss of Esyt in Drosophila causes a loss of ER-PM integrity and retinal degeneration. Here, the authors report the consequence of disrupting the Esyt C domain in Drosophila photoreceptor cells. They used in-silico strategies to identify the Ca2+ contacting residues within the C domain and generated transgenic flies containing either the wild type or the Esyt-CaBM mutants. They show that the wild type transgene rescues several Esyt KO phenotypes in the Drosophila photoreceptors. In some cases, they report dominant negative effects of Esyt-CaBM overexpression.

      This is a straightforward structure-function analysis of the Esyt C domain. Overall, the experiments are well executed. At the same time, a few aspects of the manuscript could be further improved. For example, the authors analyze multiple aspects of photoreceptor integrity. In some cases, they show that the mutant Esyt transgene shows dominant negative effects. In others, there is no evidence or even a partial function. Clarifying these points could be helpful. Below are a few specific points for the authors' consideration:

      Major Comments

      1. RDGB is a protein that localizes to the ER-MCS. Esyt-CABM-GFP expression causes RDGB mis-localization even in the presence of wild type Esyt expression, suggestive of a dominant negative effect (Fig. 4C). But Esyt CaBM-GFP expression doesn't seem to have a dominant negative effect on contact site distance (Fig. 4D). Are the authors not seeing a dominant negative effect because they didn't examine older flies? Or, is there a distinct effect of Esyt CaBM on RDGB localization and contact site distance? If there is a distinct effect, what is the reason?

      As the reviewer correctly mentions, we are not seeing a dominant negative effect of dEsytCaBM::GFP expression on contact site distance because we didn't examine older flies.

      A dominant negative effect of dEsytCaBM on the wild type protein is observed in all phenotypes analyzed. The contact site distance analysis shown in the paper is done on day 1 old constant dark reared flies. The contact site distance exhibited by dEsytCaBM is like that of dEsytKO photoreceptors at day 1 post eclosion. dEsyt deprived photoreceptors are comparable to their wild type counterparts at Day 1 in all aspects of phototransduction (PMID: 32716137). But as a function of age and illumination, the dEsytKO photoreceptors exhibit progressive loss in contact site integrity, followed by induction of retinal degeneration and RDGB mis-localisation (PMID: 32716137). These observations are consistent in dEsytCaBM.

      During the revision, the following experiments will be included to strengthen this statement:

      • Add the MCS density and gap distance in dEsytKO photoreceptors at Day1 in Figure 3C.
      • Electron Microscopy to check MCS density and distance in Rh1>dEsytCaBM at Day 6CL with appropriate control genotypes.

      Esyt-CABM-GFP partially rescues the Esyt KO phenotype in retinal degeneration (Fig 6). This is surprising since cellular assays in Fig 4 show a failure of Esyt-CaBM to localize to ER-MCS. The results here contrast with earlier data showing that Esyt-CABM has dominant negative effects. How will the authors interpret the results? Is it possible that Esyt-CAMB still has some residual Ca2+ binding activity? Alternatively, does this result imply that Esyt can still function (albeit at lower capacity) without binding Ca2+? Is there Esyt function unrelated to ER-MCS site maintenance when it comes to its role in retinal degeneration? A reasonable explanation is warranted.

      Partial rescue of dEsytKO phenotypes in Rh1>dEsytCaBM; dEsytKO photoreceptors indicates that apart from calcium sensing, there might be another function for dEsyt at the ER-PM interface which is yet to be discovered.


      Minor Comments:

      Figure legends refer to "SMC" (I am guessing they are referring to Sub microvillar cisternae) without defining it in the text.

      Changes will be incorporated as per the suggestion.


      Significance

      This study will be of interest to those generally interested in the ER mitochondria contact sites. The main significance here is in dissecting the role of the C-domain within the Esyt protein. The authors demonstrate a physiological role using Drosophila photoreceptors as a model.

      We thank the reviewer for appreciating the significance of our study, which seeks to show the importance of the Ca2+ regulation of dEsyt for in vivo function.

      Reviewer #3

      (Evidence, reproducibility and clarity (Required)):

      Summary

      In the present work, the authors explore the role of Ca2+ binding to Esyt in the regulation of ER-PM contact sites using drosophila photoreceptors as a model system. By expressing in wild type or in EsytKO flies a mutated version of dEsyt which is predicted to lose Ca2+ binding, they highlight a potential role of Ca2+ binding to Esyt in the regulation of ER-PM contact sites density and the development of rhabdomeres. The data clearly show the effect of Esyt mutant during development of photoreceptors in Drosophila. However, as discussed below, one essential missing point is the experimental proof that the mutant has indeed lost its ability to bind Ca2+, and that PIP2 binding is not perturbed.

      Major comments

      1. One major comment is the lack of experimental proof that the EsytCABM mutant is indeed unable to bind Ca2+. The MIB tool only gives a prediction and it is not sufficient to prove their statements throughout the manuscript on the requirement of Ca2+ binding for the regulation of MCS.

      We understand the reviewer’s comment that this manuscript does not contain experimental data demonstrating that dEsytCaBM does not bind Ca2+. However, as Reviewer 4 pointed out, it will be challenging to demonstrate this experimentally. A direct proof would likely come from measurements of the calcium binding affinity of dEsyt (which involves protein purification that is beyond the scope of this work). An indirect demonstration would be any cellular or in vivo experiment or any additional in silico analysis. To provide additional indirect evidence to address this question, we will:

      • Use the AlphaFold model to demonstrate that the arrangement of the calcium binding residues in dEsyt is compatible with Ca2+ binding.
      • Evaluate if the wild type dEsyt is mislocalized in the photoreceptors upon eliminating the calcium entry to these specialized sensory neurons. The localization of wild type dEsyt will be examined in the mutants: norpAP24 (PLC null mutant) and trpl302; trp343 (protein null mutant of TRPL and TRP channels respectively) in which light induced calcium influx is eliminated.

      Moreover, they should check experimentally the potential differences in the capacity of EsytCABM mutant to bind PI(4,5)P2, which can potentially perturb its subcellular localization.

      As recommended by the reviewer, it is important to determine the PIP2 binding capacity of dEsytCaBM. The ability of wild type dEsyt to bind PIP2 has not been determined. We will test this and, if wild type dEsyt does bind PIP2, we will then test the impact of the CaBM mutations on PIP2 binding.

      Figure 1A: the legend on the right side of the scheme is missing. On the left, RDGB and dEsyt don't associate with the PM.

      Changes will be incorporated as per the suggestion.

      line 125: the authors should describe more precisely the Trp mutant that they used.

      The text will be modified.

      Concerning the quantification of MCS density done throughout the paper, can the authors mention what they considered as an MCS, in other words, what distance they defined as the maximal distance between the ER and the PM.

      We used fixation methods that allow enhanced membrane preservation and better visualization of membranes and MCS (PMID: 2496206). Such images allowed us to quantify the fraction of SMC that are present at the base of the microvilli in each ultrathin section of a photoreceptor. The MCS is the dark stretch that can be seen at the base of the rhabdomere in each TEM image (PMID: 32716137). Contact site distance measured is the absolute distance between the visible demarcation of the PM and SMC as indicated by the yellow arrows in Figure 4D iii, vi, and ix.

      Figure 3: the localization of Esyt and EsytCABM in S2R cells and in vivo is not precisely analyzed: a co-staining with PM and ER markers should be added in order to state the localization at ER-PM MCS or at apical PM.

      As suggested, to better understand the compartmental localization of dEsyt in photoreceptors, we will use markers of PM (Rhabdomere) and ER (Sub Microvillar Cisternae) and conduct co-localization assays.

      line 181: the authors should precise in which membrane compartments Esyt is localized.

      The text will be modified.

      line 185-187: the conclusion here doesn't seem to fit the data, as the EsytCABM mutant looks enriched at ER-PM contact sites.

      As previously answered, we will remark on whether there is an enrichment of dEsytCaBM at the ER-PM contact sites following the co-localization experiment that is recommended in Q5.

      a paragraph on the production of Drosophila transgene mutants should be added to the Mat et Med section.

      The text will be added as suggested.

      considering the phenotypes observed for the EsytCABM mutant in vivo, the authors should provide an analysis of the level of expression of the exogenous proteins Esyt and EsytCABM by western blot in the different backgrounds. EsytCABM seems to be expressed at lower levels in Figure 3C.

      As per the suggestion, western blot analysis will be conducted and better representative confocal images depicting the protein levels will be added in the manuscript.

      Fig 4D: considering the perturbation of RDGB localization observed at Day 6, the authors should analyze the organization of MCS by TEM at Day 6, in addition to Day 1.

      We agree that to support the observation of RDGB mis-localization, the decrease in contact site integrity as a function of age and illumination (Day6CL) should be evaluated in Rh1>dEsytCaBM photoreceptors. The manuscript revision will include data from this experiment.

      the EsytCABM mutant exhibits strong dominant negative effects, but rescues completely or partially some of the phenotypes of Esyt KO: could the authors discuss and provide some hypothesis on this apparent discrepancy?

      We are unsure what the reviewer means by “apparent discrepancy”. When dEsytCaBM is expressed in wild type photoreceptors, it exhibits a strong dominant negative effect presumably by inhibiting the function of wild type dEsyt protein.

      dEsytKO is a protein null allele. Therefore, when dEsytCaBM is expressed in the dEsytKO background it does not exert a dominant negative effect as there is no wild type protein to interact with. The partial rescue of dEsytKO phenotypes by Rh1>dEsytCaBM; dEsytKO photoreceptors likely indicates that calcium binding is not the sole factor affecting dEsyt function at the ER-PM interface.

      lines 230-233: the sentence is not clear. I don't see any consistency between data in Figure 5B, showing only very partial rescue by EsytCABM, and the data in Figure 5C (ii) showing complete rescue of RDGB localization by EsytCABM.

      The time point at which RDGB localization was analyzed (six days of continuous light exposure following eclosion) is important in addressing this comment. In the degeneration kinetics depicted in Figure 5B, neurodegeneration begins in both dEsytKO and Rh1>dEsytCaBM on Day 8 post-eclosion; before that, on Day 6, RDGB is mislocalized from the base. However, in Rh1>dEsytCaBM; dEsytKO, the onset of degeneration is delayed: the photoreceptors show intact structure until Day 8 or Day 10, and measurable retinal degeneration begins on Day 12. This may be why RDGB continues to be correctly localized in Rh1>dEsytCaBM; dEsytKO at Day 6CL.

      Figure 6D: could the authors comment the increase of MCS density observed in Esyt-GFP expressing flies.

      Esyt is proposed to function as a tether that connects the ER and PM (PMID: 23791178; PMID: 27065097; PMID: 29222176), bringing them closer together. Based on this idea, perhaps by expressing dEsyt::GFP we are drawing the membranes together thus establishing more MCS.

      on several TEM images, some pictures illustrating different conditions look very similar, as if they were serial cuts: Fig 1B (Day 1 and Day 14), Fig 4D (Rh1 and Rh1>dEsytCABM::GFP), Fig 6B Day 1 and Day 14 and Fig 6C Day 1. Could the authors check if there was a mistake with these pictures?

      The images are not taken from serial sections of the same TEM block as is evident from the arrangement of nucleus of each photoreceptor cell. As mentioned in the figure legends, all experiments are carried out using 3 independent blocks (N=3 fly heads) prepared from each genotype and 10 photoreceptors from each block/ fly retinae are used for quantification of contact site density/ contact site distance. Aside from the arrangement of the accessory cells and cellular nuclei, the TEM images will appear very similar since Drosophila photoreceptor neurons are symmetrically arranged, with around 700–800 ommatidia per eye each comprising 8 photoreceptors.

      Minor comments:

      • lines 84-88 : the sentence is not clear. Besides, the authors should precise what they mean by "extra-cellular Ca2+ influx enhance ER-PM contact sites". Which parameter exactly has been shown to be regulated by Ca2+?

      The paper by Idevall-Hagren et al. proposes that following store operated Ca2+ influx, Esyt1 translocates to ER-PM junctions and the number of ER-PM contact sites increases. Please refer to this section of the publication from Idevall-Hagren et al. (2015) (PMID: 26202220):

      “As detected by TIRF microscopy, the depletion of Ca2+ from the lumen of the ER occurring under these conditions led to a progressive accumulation of ER‐anchored STIM1 at the PM, where it activates Orai Ca2+ channels (Fig 4C). Subsequent addition of 1–10 mM Ca2+ to the extracellular medium, either in the absence or in the presence of SERCA inhibitors, caused a massive increase in cytosolic Ca2+ (SOCE) through the activated Ca2+ channels (Figs 4A and EV4D–G). Such increase induced a very robust translocation of E‐Syt1 to the PM (Figs 4B and EV4D–G), which, in the absence of SERCA inhibition (i.e., when a reversible inhibitor of the SERCA pump had been washed out), preceded the dissociation of STIM1 and the inactivation of SOCE (Fig 4D). Inspection of TIRF microscopy images during the manipulation showed that E‐Syt1 does not form new contacts but populates and expands contacts previously occupied by STIM1.”

      • lines 108-110: can you give the reference?

      Reference for the localization of dEsyt to ER-PM MCS: Nath et al. (PMID: 32716137).

      Reference for the localization of TRP and TRPL at the microvillar plasma membrane: numerous primary research papers have shown this; for example, see review PMID: 11557987, PMID: 22487656.

      • line 189: the authors should summarize the findings in one sentence. "Functional activity" would refer to lipid transfer.

      The text will be modified as per the suggestion.

      Reviewer #3 (Significance (Required)):

      General assessment

      The work relies on a model system that enables the exploration of the role of Esyt in vivo, in a fundamental process highly regulated during development. The data clearly show the effect of Esyt mutant during development of photoreceptors in Drosophila but as discussed before, some experimental evidences are missing to completely prove the statements.

      Advance

      This work brings new insights in the functional role of lipid transfer during development and explores how the dialog between lipid transfer and Ca2+ flux can influence MCS organization. The interesting points that could be explored in the paper are the effects of a Ca2+ influx on Esyt and EsytCABM localization, and on their lipid transfer activity.

      Audience

      This work would be of interest for the membrane contact sites community and for the Developmental biology community.

      We thank the reviewer for highlighting the significance of our work and the clarity of the data. Additional data to address the points they have raised will be provided.

      Reviewer #4 (Evidence, reproducibility and clarity (Required)):

      In this study, Nath et al., aim at understanding the role of dESyt Ca2+ binding activity on ER-PM MCS in D. melanogaster photoreceptors. Using a combination of transmission electron microscopy and fluorescence microscopy, the authors explore the ability of a dESyt mutant, supposedly unable to bind Ca2+ (based on homology with the human ortholog hESyt2), to recapitulate the function of the wild type version of the protein in establishing ER-PM MCS and modulating their density.

      Findings:

      1) MCS density depends on the activity of TRP and TRPL channels in aging photoreceptors.

      2) Mutation of dESyt Ca2+ binding residues (dEsytCaBM::GFP) leads to a gross mis-localization of the protein, even in the presence of the endogenous protein.

      3) Overexpression of the mutant affects the structure of photoreceptors upon constant illumination.

      4) After 6 days of continuous illumination, RDGB is mis-localized in cells overexpressing dEsytCaBM::GFP.

      5) Overexpressed dEsytCaBM::GFP fails to reduce the distance between ER and PM, meaning it fails to establish ER-PM contact sites, while overexpressed dEsyt::GFP shows reduced MCS distance. Overexpressed dEsyt::GFP also leads to a 10% increase in MCS density compared to WT or cells expressing dEsytCaBM::GFP.

      6) dEsytCaBM::GFP is not able to rescue the light dependent retinal degeneration of dESytKO, although it slightly delays the onset, but is able to rescue RDGB localization at day 6 of constant illumination.

      7) Examining MCS density in dESytKO cells, rescues with dEsyt::GFP and dEsytCaBM::GFP show a slightly higher MCS density than dESytKO at day 1. At day 14, ER-PM MCS were non-existent in dESytKO, unchanged in dEsyt::GFP and reduced by 20% in dEsytCaBM::GFP compared to day1.

      Specific comments:

      My field of expertise is biochemistry and structural biology (including cellular cryo-electron tomography), but I have no experience with drosophila biology, so I am not able to judge the drosophila work per se.

      While I find the confocal microscopy experiments compelling, I have some reservations regarding the quantification of the TEM images (MCS distances and density) as it was done manually, and therefore, to some extent subjective, especially, when differences between conditions are in the order of 10%. I would have found the quantification more convincing if done systematically, i.e. segmenting the MCS and computationally measuring distances and densities. Otherwise, the authors could expand a little bit on how their methodology is accurate.

      As the reviewer correctly mentions, the quantification will be more convincing if done systematically, i.e. segmenting the MCS and computationally measuring distances and densities. For MCS measurements, we have experimented with the segmentation method using ImageJ and Imaris. As mentioned in the answer to Q4 of reviewer 3, we used fixation methods that allow enhanced membrane preservation and better visualization of membranes and MCS (Matsumoto‐Suzuki et al, 1989). However, this staining method does not selectively stain the ER which is part of the MCS but all the ER. Due to this, automated segmentation poses significant challenges.

      The primary drawback of the segmentation method is that, once the software is trained to detect distinct cellular compartments, it recognizes all ER membranes, including both the SMC and the ER that is not part of the MCS. As a result, the software's minimum distance calculation may be between PM and SMC or between PM and generic ER, which does not help the analysis we wish to perform. Similarly, to determine the contact site distance in images with obscure ER and PM boundaries, the software uses the border it can identify, which is typically inside the rhabdomere rather than at its edge. For the contact site density measurements, the software is not able to distinguish between ER and pigment granules close to the rhabdomere, as the grayscale values of these two compartments are comparable.

      Advantages of manual approach:

      To account for potential effects of photoreceptor depth on contact site density and distance, we have analyzed TEM sections obtained directly from the nuclear plane of the photoreceptors to calculate both contact site density and distance. Additionally, by utilizing the freehand line tool, manual analysis enables us to define the length of each little section of the MCS and the base of the rhabdomere. The entire length of the MCS at the base is then calculated by adding each segment together. An illustration of how the manual analysis is done will be included as part of methods in the revision.
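
      The manual length measurement described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function names, the coordinate units, and the definition of density as summed MCS length divided by the length of the rhabdomere base are assumptions made for the example.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def polyline_length(points: List[Point]) -> float:
    """Length of a freehand trace approximated as straight pieces between points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def mcs_density(mcs_segments: List[List[Point]], base_trace: List[Point]) -> float:
    """Summed MCS length divided by rhabdomere base length (assumed definition)."""
    mcs_total = sum(polyline_length(segment) for segment in mcs_segments)
    return mcs_total / polyline_length(base_trace)


# Hypothetical traces in arbitrary units: a base of length 10 and two MCS
# segments of lengths 3 and 4 drawn along it.
base = [(0.0, 0.0), (10.0, 0.0)]
segments = [[(0.0, 0.0), (3.0, 0.0)], [(5.0, 0.0), (9.0, 0.0)]]
density = mcs_density(segments, base)
print(density)
```

      In this toy trace the two segments sum to a length of 7 along a base of length 10, so the computed density is 0.7.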

      Another point is whether the levels of expression of dESyt proteins (dESyt-GFP and dESytCABM-GFP) are comparable. In the overexpression experiments, what are the expression levels of the constructs compared to the endogenous protein? The authors should provide e.g. a Western blot.

      As per the suggestion, western blot analysis will be conducted to compare the expression levels of the constructs utilized to the endogenous protein.

      Concerning the modelling, while I do think that the identification of dESyt Ca2+ binding residues is correct (the sequence alignment is convincing and the sequence identity is very high), and that most likely the structural arrangement will be conserved, homology modelling (using MODELLER with a single reference) leads to models highly similar to the input reference (in particular when the sequence identity is very high). Therefore, rmsd will necessarily be low and the side chain arrangement of conserved residues will be identical. This is unlikely to happen, as protein structures will not be identical despite high sequence conservation. In addition, a crystal structure is a snapshot of a protein conformation that is favorable for crystal formation. It would have been more interesting to use an AlphaFold model and show that the arrangement on the residues is compatible with Ca2+ binding (i.e., the C positions are similar).

      We agree with the reviewer that the data presented to demonstrate the inability of dEsytCaBM to bind Ca2+ are inadequate, as also pointed out by the other reviewers. It would be crucial to prove this using multiple approaches. As suggested, an AlphaFold model will be used to address this point.

      Minor comments:

      Line 102: indicate what PI and PA stand for (I don't think that there is a need for acronyms when they are not reused in the text later on).

      Changes will be incorporated as per the suggestion.

      Line 217-219: "When the same experimental set was examined for MCS density, we discovered that the density enhanced by 10% in Rh1>dEsyt::GFP while being comparable between wild type and dEsytCaBM::GFP flies." The authors don't comment on this finding. Does that imply that increase in the protein levels leads to increase in MCS density?

      Yes. An increase in wild type dEsyt protein levels can establish more contact sites as well as reduce the contact site distance. This further supports the protein's role in functional tethering, as mentioned in line 215 and as proposed by previous studies in other models (PMID: 23791178; PMID: 27065097; PMID: 29222176).

      Lines 298-302: "...implying that dEsytCaBM exerts a dominant negative effect on wild type dEsyt. One possible mechanism for the phenotypes exhibited by dEsytCaBM expression in wild type cells is suggested by the findings of a structural and mass spectrometry investigation of hEsyt2 that reveals that the SMP domain dimerizes to create a 90Å long cylinder to facilitate the transfer of lipids (Schauder et al., 2014)." It is not clear to me what the authors suggest here: because of the dimerisation between wild type and mutant, the mutant has a negative effect or that the SMP dimerization is somehow impaired in dEsytCaBM?

      The SMP domains of Esyt proteins have previously been shown to dimerize (PMID: 23791178, PMID: 24847877). They are known to form either homodimers or heterodimers in mammalian systems, where three genes code for the protein (Esyt1, 2 and 3). In Drosophila, since just one gene codes for the protein, our hypothesis is that the product of the functional wild type gene dimerizes with the CaBM mutant protein, which renders the wild type gene product nonfunctional.

      Line 304-305: "...protein expression was restricted to the cell body rather than the presynaptic terminals...". I am not sure that this is correct. The fact that a protein is localizing to a compartment does not mean that its expression is restricted to that compartment (one should measure mRNA levels to conclude this).

      The statement is based on the findings made by Kikuma et al, 2017 (PMID: 28882990) when they tried to understand the role of dEsyt at the NMJs.

      In figure 1B legend, indicate what SMC stands for (the acronym should be indicated in figure 1A legend).

      The text will be added as suggested.

      In figure 2A legend Ca binding in black box but in red boxes in figure.

      Changes will be incorporated as per the suggestion.

      **Referees cross-commenting**

      I agree with the other reviewers that one of the premises of this study relies on the loss of calcium binding by the dESyt mutant and this is not experimentally proven by the authors. However, I find that this will be difficult to prove in vivo. Only measurements of dESyt calcium binding affinity would constitute a direct proof (which requires protein purification). Any in vivo or cellular experiment would be an indirect proof. I believe that based on the high sequence conservation with ESyt proteins, the calcium binding residues have been correctly identified.

      Reviewer #4 (Significance (Required)):

      ESyt proteins are known ER-PM tethers involved in lipid transfer at MCS in a Ca2+ dependent manner. Contrary to yeast and mammals, that have several ESyt orthologs, D. melanogaster has only one ESyt, making it an ideal model to study ESyt function in vivo. It has been previously shown that proper localization of ESyt at MCS depends on Ca2+ concentration: ESyts are anchored to the ER but translocate to the PM in response to elevation of Ca2+ levels in the cytosol (Fernández-Busnadiego et al., 2015). The finding that an ESyt mutant unable to bind calcium is not localized properly is therefore not surprising. The link between RDGB, a protein known to localize at MCS, and ESyt has been shown before but to my knowledge Nath et al., show for the first time that RDGB localization at MCS is directly dependent on the Ca2+ binding activity of ESyt. In addition, the authors convincingly demonstrate that the Ca2+ binding activity of dESyt is necessary to maintain the structure of aging photoreceptors.

      The main finding of this study is that the Ca2+ binding activity of dESyt regulates the density of ER-PM MCS in photoreceptors. If true (see my comment below), that would be a novel finding, although the authors don't propose any mechanistic explanation for this.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Please insert a point-by-point reply describing the revisions that were already carried out and included in the transferred manuscript. If no revisions have been carried out yet, please leave this section empty.

      We haven't made any changes to the manuscript yet. However, we will be able to implement the changes mentioned in the pointwise response to reviewers above.

      4. Description of analyses that authors prefer not to carry out

      Please include a point-by-point response explaining why some of the requested data or additional analyses might not be necessary or cannot be provided within the scope of a revision. This can be due to time or resource limitations or in case of disagreement about the necessity of such additional data given the scope of the study. Please leave empty if not applicable.

      We feel that experiments to directly determine the calcium binding of dEsyt and the loss of this in dEsytCaBM are beyond the scope of this study, because of the substantial work required to heterologously express and purify the protein. We have proposed alternative ways to strengthen this conclusion.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      This study by Paoli et al. used a resonant scanning multiphoton microscope to examine olfactory representation in the projection neurons (PNs) of the honeybee with improved temporal resolution. PNs were classified into 9 groups based on their response patterns. The authors found that excitatory responses in the PNs precede the inhibitory responses by ~40 ms, and ~50% of PN responses contain inhibitory components. They built the neural circuit model of the mushroom body (MB) with evolutionarily conserved features such as sparse representation, global inhibition, and a plasticity rule. This MB model fed with the experimental data could reproduce a number of phenomena observed in experiments using bees and other insects, including dynamical representations of odor onset and offset by different populations of Kenyon cells, prolonged representations of after-smell, different levels of odor-specificity for early/delay conditioning, and shift of behavioral timing in delay conditioning. The trace conditioning was not modeled and tested experimentally. Also, the experimental result itself is largely confirmatory to preceding studies using other organisms. Nonetheless, the experimental data and the model provide a solid basis for future studies.

      We thank the reviewer for summarizing the value of our study and recognizing its generality and significance. As suggested, in a revised version of the manuscript, we will discuss the implication of our approach for the context of trace conditioning. The model we presented hinges on the learning-induced plasticity of KC-to-MBON synapses recruited during the learning window (i.e., the simulated US arrival). In the case of trace conditioning, the model predicts that the behavioral response time should match the expected US arrival. Contrary to this prediction, preliminary analyses of empirical measurements of PER latency upon trace conditioning indicate this is not the case. In a revised version of the manuscript, we will discuss the differences between the predictions of the model and the experimental observations in a trace conditioning paradigm.

      Reviewer #2 (Public Review):

      The study presented by Paoli et al. explores temporal aspects of neuronal encoding of odors and their perception, using bees as a general model for insects. The neuronal encoding of the presence of an odor is not a static representation; rather, its neuronal representation is partly encoded by the temporal order in which parallel olfactory pathways participate and are combined. This aspect is not novel, and its relevance in odor encoding and recognition has been discussed for more than the past 20 years. 

      The temporal richness of the olfactory code and its significance have traditionally been driven by results obtained based on electrophysiological methods with temporal resolution, allowing the identification and timing of the action potentials in the different populations of neurons whose combination encodes the identity of an odor. On the other hand, optophysiological methods that enable spatial resolution and cell identification in odor coding lack the temporal resolution to appreciate the intricacies of olfactory code dynamics. 

      (1) In this context, the main merit of Paoli et al.'s work is achieving an optical recording that allows for spatial registration of olfactory codes with greater temporal detail than the classical method and, at the same time, with greater sensitivity to measure inhibitions as part of the olfactory code. 

      The work clearly demonstrates how the onset and offset of odor stimulation triggers a dynamic code at the level of the first interneurons of the olfactory system that changes at every moment as a natural consequence of the local inhibitory interactions within the first olfactory neuropil, the antennal lobe. This gives rise to the interesting theory that each combination of activated neurons along this temporal sequence corresponds to the perception of a different odor. The extent to which the corresponding postsynaptic layers integrate this temporal information to drive the perception of an odor, or whether this sequence is, in a sense, a journey through different perceptions, is challenging to address experimentally. 

      In their work, the authors propose a computational approach and olfactory learning experiments in bees to address these questions and evaluate whether the sequence of combinations drives a sequence of different perceptions. In my view, it is a highly inspiring piece of work that still leaves several questions unanswered. 

      We thank the reviewer for considering that our work has an inspiring nature. Below we have tried to answer the questions raised by the following comments, and we will include part of these answers in the revised version of our manuscript.

      (2) In my opinion, the detailed temporal profile of the response of projection neurons and their respective probabilities of occurrence provide valuable information for understanding odor coding at the level of neurons transferring information from the antennal lobes to the mushroom bodies. An analysis of these probabilities in each animal, rather than in the population of animals that were measured, would aid in better comprehending the encoding function of such temporal profiles. Being able to identify the involved glomeruli and understanding the extent to which the sequence of patterns and inhibitions is conserved for each odor across different animals, as it is well known for the initial excitatory burst of activity observed in previous studies without the fine temporal detail, would also be highly significant. 

      We thank the reviewer for recognizing the relevance of the findings in understanding the logic of olfactory coding. We agree about the importance of establishing if the different glomerular response profiles are evenly distributed across individuals or have individual biases. In the revised version of the manuscript, we will provide data on the distribution of response profiles for each animal and for different olfactory stimuli. Also, we fully agree on the importance of assessing to what extent such response profiles - largely determined by the local network of AL interneurons - are glomerulus-specific and conserved across individuals.

      In my view, the computational approach serves as a useful tool to inspire future experiments; however, it appears somewhat simplistic in tackling the complexity of the subject. One question that I believe the researchers do not address is to what extent the inhibitions recorded in the projection neurons are integrated by the Kenyon cells and are functional for generating odor-specific patterns at that level. 

      The model we proposed represents, indeed, a simplification of olfactory signal processing throughout the honey bee olfactory circuit. Still, it shows that simple but realistic rules can be sufficient to grasp some fundamental aspects of olfactory coding. However, we agree with the reviewer and believe that such a minimalistic model can provide a basis for designing future experiments in which complexity can be increased by adding relevant features, such as the learning-induced plasticity of PN-to-KC synapses or the divergence of multiple PNs from the same glomerulus to different KCs.

      Concerning the reviewer's question on the involvement of inhibitory inputs in generating odor-specific patterns at the level of the KCs, the short answer is yes: they contribute to the summed input of a target KC, and thus to the odor representation. In designing the model, we considered that a given glomerulus provides maximal input at maximal excitation and minimal input (=0 input) at maximal inhibition. For this reason, an inhibited glomerulus contributes less (to KC action potential probability) than a glomerulus showing baseline activity. This, in turn, contributes less than an excited glomerulus. From the modeling point of view, normalizing the signal between 0 and 1 (i.e., setting maximal inhibition to 0 and maximal excitation to 1) would yield a similar result as with the current approach, where values range from -25% to +30% ΔF/F. We will amend the model's description to clarify this point.
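
      The normalization mentioned above can be illustrated with a short Python sketch. It is only a toy example under stated assumptions: the bounds are taken from the fluorescence range quoted in this response (-25% to +30%), while the function name, the clipping step, and the linear mapping are hypothetical and are not taken from the authors' model code.

```python
# Illustrative sketch only: bounds come from the range quoted in the response;
# the function name, clipping, and linear mapping are assumptions.
DFF_MIN = -25.0  # maximal inhibition, in percent fluorescence change
DFF_MAX = 30.0   # maximal excitation, in percent fluorescence change


def glomerular_input(dff_percent: float) -> float:
    """Map a glomerular response (percent change) linearly onto [0, 1]."""
    clipped = min(max(dff_percent, DFF_MIN), DFF_MAX)
    return (clipped - DFF_MIN) / (DFF_MAX - DFF_MIN)


inhibited = glomerular_input(-25.0)  # maximal inhibition
baseline = glomerular_input(0.0)     # no change from baseline
excited = glomerular_input(30.0)     # maximal excitation
print(inhibited, baseline, excited)
```

      Under this mapping an inhibited glomerulus contributes 0, a glomerulus at baseline contributes an intermediate value (25/55), and a maximally excited glomerulus contributes 1, which preserves the ordering described in the response.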

      Lastly, the behavioral result indicating a difference in conditioned response latency after early or delayed learning protocol is interesting. However, it does not align with the expected time for the neuronal representation that was theoretically rewarded in the delayed protocol. This final result does not support the authors' interpretation regarding the existence of a smell and an after-smell as separate percepts that can serve as conditioned stimuli.

      Considering that our odor stimulus lasted 5 seconds, glomerular activity is highly variable at odor onset (i.e., within the first 1s) because of short excitatory response profiles and the delayed and slower onset of inhibitory responses. After the initial phase, the neural representation of the stimulus becomes more stable. Consequently, a neural signature learned in the case of delay conditioning, i.e., with the US appearing towards the end of the olfactory stimulation (t = 4 - 5s), may present itself much earlier (t = 1.5s), triggering a behavioral response that largely anticipates the expected US arrival time. 

      In the model, we observe an early decrease in action potential probability even in the case of delay conditioning. This occurs because the synapses recruited during the last second of olfactory stimulation (within the learning window during which CS and US overlap) become inactive. Because odorant-induced activity recruits highly overlapping synaptic populations between 1.5 and 5 s from the onset, a learning-induced inactivation of part of these synapses will result in a reduced action-potential probability in the modeled MBON. Importantly, this event will not be governed by time but by the appearance of the learned synaptic configuration. 

      We will add a new section to the revised version of the manuscript to clarify this concept and perform further analyses to characterize the contribution of different response types to the modeled response latency.

      • Overview of Graphs in Computation:

        • Graphs have been successful in domains like shader programming and signal processing.
        • Computation in these systems is usually expressed on nodes with edges representing information flow.
        • Traditional models often have a closed-world environment where node and edge types are pre-defined.
      • Introduction to Scoped Propagators (SPs):

        • SPs are a programming model embedded within existing environments and interfaces.
        • They represent computation as mappings between nodes along edges.
        • SPs reduce the need for a closed environment and add behavior and interactivity to otherwise static systems.
      • Definition and Mechanics:

        • A scoped propagator consists of a function taking a source and target node, returning a partial update to the target.
        • Propagation is triggered by specific events within a defined scope.
        • Four event scopes implemented: change (default), click, tick, and geo.
        • Syntax: scope { property1: value1, property2: value2 }.
      • Event Scopes and Syntax:

        • Example: click {x: from.x + 10, rotation: to.rotation + 1} updates target properties when the source is clicked.
      • Demonstration and Practical Uses:

        • SPs enable the creation of toggles and counters by mapping nodes to themselves.
        • Layout management is simplified as arrows move with nodes.
        • Useful for constraint-based layouts and debugging by transforming node properties.
        • Dynamic behaviors can be created using scopes like tick, which utilize time-based transformations.
      • Behavior Encoding and Side Effects:

        • All behavior is encoded in arrow text, allowing for easy reconstruction from static diagrams.
        • Supports arbitrary JavaScript for side effects, enabling creation of utilities or tools within the environment.
      • Cross-System Integration:

        • SPs can cross boundaries of siloed systems without editing source code.
        • Example: mapping a Petri Net to a chart, demonstrating flexibility in creating mappings between unrelated systems.
      • Complex Example:

        • A small game created with SPs includes joystick control, fish movement, shark behavior, toggle switch, death state, and score counter.
        • The game uses nine arrows to propagate behavior between different node types.
      • Comparison to Prior Work:

        • Differences from Propagator Networks: propagation along edges, scope conditions, arbitrary stateful nodes.
        • Previous work like Holograph influenced the use of the term "propagator."
      • Open Questions and Future Work:

        • Unanswered questions include function reuse, modeling side effects, multi-input-multi-output propagation, and applications to other domains.
        • Formalization of the model and examination of real-world usage are pending tasks.
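
      The mechanics summarized above (a function from source and target to a partial update, fired only for events in a given scope) can be sketched as follows. This is a hypothetical Python rendering for illustration only; the class names `Arrow` and `Canvas` and the `emit` method are assumptions and are not taken from the original implementation, which is described as JavaScript-based:

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Node = Dict[str, float]  # a node is just a bag of properties
Update = Callable[[Node, Node], Dict[str, float]]  # (source, target) -> partial update


@dataclass
class Arrow:
    """An edge carrying a scoped propagator (hypothetical structure)."""
    scope: str  # e.g. "change", "click", "tick", "geo"
    source: Node
    target: Node
    fn: Update


@dataclass
class Canvas:
    arrows: List[Arrow] = field(default_factory=list)

    def emit(self, scope: str, source: Node) -> None:
        """Fire every arrow whose scope and source match the event."""
        for arrow in self.arrows:
            if arrow.scope == scope and arrow.source is source:
                # The function returns only the properties to change;
                # everything else on the target is left untouched.
                arrow.target.update(arrow.fn(arrow.source, arrow.target))


button = {"x": 0.0, "rotation": 0.0}
box = {"x": 5.0, "rotation": 0.0, "width": 40.0}
counter = {"count": 0.0}

canvas = Canvas()
# Mirrors the summary's example: click {x: from.x + 10, rotation: to.rotation + 1}
canvas.arrows.append(Arrow(
    "click", button, box,
    lambda src, dst: {"x": src["x"] + 10, "rotation": dst["rotation"] + 1},
))
# A node mapped to itself behaves as a counter.
canvas.arrows.append(Arrow(
    "click", counter, counter,
    lambda src, dst: {"count": dst["count"] + 1},
))

canvas.emit("click", button)
canvas.emit("click", counter)
canvas.emit("click", counter)
print(box, counter)
```

      Returning a partial update rather than a whole node is what lets an arrow touch `x` and `rotation` while leaving `width` alone.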


    1. Welcome back. In this demo lesson, I want to quickly demonstrate how to use CloudFormation to create some simple resources. So before we start, just make sure you're logged in to the general AWS account and that you've got the Northern Virginia region selected. Once you've got that, just move across to the CloudFormation console.

      So this is the CloudFormation console, and as I discussed in the previous lesson, it works around the concepts of stacks and templates. To get started with CloudFormation, we need to create a stack. When you create a stack, you can use a sample template, and there are lots of different sample templates that AWS makes available. You can create a template in the Designer or upload a ready-made template, and that's what I'm going to do. Now, I've provided a template for you to use, linked to this lesson. So go ahead and click on that link to download the sample template file.

      Once you've downloaded it, you'll need to select 'Upload a template file' and then click 'Choose file'. Locate the template file that you just downloaded; it should be called 'ec2instance.yaml'. Select that and click on 'Open'. Whenever you upload a template to CloudFormation, it's actually uploading the template directly to an S3 bucket that it creates automatically. This is why, when you're using AWS, you may notice lots of buckets with the prefix 'cf-templates' that get created in a region automatically. You can always go ahead and delete these if you want to keep things tidy, but that's where they come from.

      Now, before we upload this, I want to move across to my code editor and step through exactly what this template does. The template uses three of the main components that I've talked about previously. The first one is parameters. There are two parameters for the template: latest AMI and SSH and web location. Let's quickly talk about the latest AMI ID because this is an important one. The type of this parameter is a special type that's actually a really useful feature. What this allows us to do is rather than having to explicitly provide an AMI ID, we can say that we want the latest AMI for a given distribution. In this case, I'm asking for the latest AMI ID for Amazon Linux 2023 in whichever region you apply this template in. By using this style of parameter, the latest AMI ID gets set to the AMI of the latest version of this operating system.
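
      To make the parameter section concrete, here is a rough sketch of what it might look like, written as a Python dictionary that serializes to the JSON form of a template. The parameter names, descriptions, and the CIDR default are assumptions for illustration; the lesson's 'ec2instance.yaml' may differ. The `Type` string is the SSM-backed parameter type that lets CloudFormation resolve an AMI ID at deploy time:

```python
import json

# Parameter names and defaults below are assumptions for illustration;
# the lesson's ec2instance.yaml may use different ones.
parameters = {
    "LatestAmiId": {
        "Description": "Latest Amazon Linux 2023 AMI, resolved per region",
        # SSM-backed parameter type: CloudFormation looks up the value
        # stored under the SSM path given in Default when the stack is created.
        "Type": "AWS::SSM::Parameter::Value<AWS::EC2::Image::Id>",
        "Default": "/aws/service/ami-amazon-linux-latest/al2023-ami-kernel-default-x86_64",
    },
    "SSHandWebLocation": {
        "Description": "IP range allowed to reach the instance",
        "Type": "String",
        "Default": "0.0.0.0/0",
    },
}

template_fragment = json.dumps({"Parameters": parameters}, indent=2)
print(template_fragment)
```

      Because the default is a path rather than a literal AMI ID, the same template can resolve to a different, region-appropriate image wherever it is applied.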

      The final parameter that this template uses is SSH and web location, which is where we can just specify an IP address range that we want to be able to access this EC2 instance. So that's parameters—nothing special, and you'll get more exposure to these as we go through the course. Now we've also got outputs, and outputs are things that are set when the template has been applied successfully. When a stack creates, when it finishes that process, it will have some outputs. I've created outputs so that we get the instance ID, the availability zone that the instance uses—remember EC2 is an AZ service. It’ll also provide the public DNS name for the instance, as well as the public IP address. The way that it sets those is by using what's known as a CloudFormation function.

      So this is Ref, and this is going to reference another part of the CloudFormation template. In this case, it's going to reference a logical resource, the EC2 instance resource. Now, get attribute, or GetAtt, is another function that's a more capable version of Ref. With GetAtt, you still refer to another thing inside the template, but you can pick from different data that that thing generates. For an EC2 instance, the default thing that you can reference is the instance ID, but it also provides additional information: which availability zone it's in, its DNS name, and its public IP. I’ll make sure to include a link in the lesson that details all of the resources that CloudFormation can create, as well as all of the outputs that they generate.
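
      As a hedged illustration of the two functions just described, the outputs section might be sketched like this, again as a Python dictionary in the JSON template form. The output names and the logical ID "Instance" are assumptions; `Ref` and `Fn::GetAtt` are the JSON spellings of the two functions:

```python
import json

# "Instance" is an assumed logical ID for the EC2 instance resource.
outputs = {
    # Ref on an EC2 instance returns its default value: the instance ID.
    "InstanceId": {"Value": {"Ref": "Instance"}},
    # Fn::GetAtt picks a named attribute of the same logical resource.
    "AZ": {"Value": {"Fn::GetAtt": ["Instance", "AvailabilityZone"]}},
    "PublicDNS": {"Value": {"Fn::GetAtt": ["Instance", "PublicDnsName"]}},
    "PublicIP": {"Value": {"Fn::GetAtt": ["Instance", "PublicIp"]}},
}

outputs_fragment = json.dumps({"Outputs": outputs}, indent=2)
print(outputs_fragment)
```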

      The main component of course of this template is the resources component. It creates a number of resources. The bottom two, you don’t have to worry about for now. I’ve included them so I can demonstrate the Session Manager capability of AWS. I'll be talking about that much more later in the course, but what I'm doing is creating an instance role and an instance role profile. You won't know what these are yet, but I’ll be talking about them later in the course. For now, just ignore them. The main two components that we're creating are an EC2 instance and a security group for that instance.

      We’re creating a security group that allows two things into this instance: port 22, which is SSH, and port 80, which is HTTP. So it’s allowing two different types of traffic into whatever the security group is attached to. Then we’re creating the EC2 instance itself. We’ve got the EC2 instance, which is a logical resource, the type being AWS::EC2::Instance, and then the properties for that logical resource, such as the configuration for the instance. We’re setting the type and size of the instance, t2.micro, which will keep it inside the free tier. We’re setting the AMI image ID to use, and it's referencing the parameter, and if you recall, that automatically sets the latest AMI ID. We’re setting the security group, which is referencing the logical resource that we create below, so it creates this security group and then uses it on the instance. Finally, we’re setting the instance profile. Now, that’s related to these two things that I’m not talking about at the bottom. It just sets the instance profile, so it gives us the permission to use Session Manager, which I’ll demonstrate shortly after we implement this.
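
      A rough sketch of the two main resources described above, in the same Python-to-JSON style. All logical IDs and parameter names here are assumptions, the instance profile resource is omitted, and the security group is attached via its `GroupId` attribute, which may not be how the lesson's template references it:

```python
import json

# Logical IDs and parameter names are hypothetical. The instance profile
# ("SessionManagerInstanceProfile") would be defined elsewhere in the template.
resources = {
    "InstanceSecurityGroup": {
        "Type": "AWS::EC2::SecurityGroup",
        "Properties": {
            "GroupDescription": "Allow SSH and HTTP",
            # One ingress rule per port: 22 (SSH) and 80 (HTTP).
            "SecurityGroupIngress": [
                {
                    "IpProtocol": "tcp",
                    "FromPort": port,
                    "ToPort": port,
                    "CidrIp": {"Ref": "SSHandWebLocation"},
                }
                for port in (22, 80)
            ],
        },
    },
    "Instance": {
        "Type": "AWS::EC2::Instance",
        "Properties": {
            "InstanceType": "t2.micro",
            # Resolved from the SSM-backed parameter at deploy time.
            "ImageId": {"Ref": "LatestAmiId"},
            "SecurityGroupIds": [
                {"Fn::GetAtt": ["InstanceSecurityGroup", "GroupId"]}
            ],
            "IamInstanceProfile": {"Ref": "SessionManagerInstanceProfile"},
        },
    },
}

ingress = resources["InstanceSecurityGroup"]["Properties"]["SecurityGroupIngress"]
open_ports = [rule["FromPort"] for rule in ingress]
print(json.dumps({"Resources": resources}, indent=2))
```

      Because the instance refers to the security group by logical ID, CloudFormation can work out that the group has to be created before the instance.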

      There’s nothing too complex about that, and I promise you by the end of the course, and as you get more exposure to CloudFormation, this will make a lot more sense. For now, I just want to use it to illustrate the power of CloudFormation. So I’m going to move back to the console. Before I do this, I’m going to go to services and just open EC2 in a new tab. Once you’ve done that, return to CloudFormation and click on next. We’ll need to name the stack. I’m just going to call it CFN demo one for CloudFormation demo one. Here’s how the parameters are presented to us in the UI. The latest AMI ID is set by default to this value because, if we look at the parameters, it’s got this default value for this parameter. Then SSH and web location also has a default value which is set in the template, and that’s why it’s set in the UI. Leave these two values as default. Once you’ve done that, click on next.

      I’ll be talking more about all of these advanced options later on in the course when I talk about CloudFormation. For now, we’re not going to use any of these, so click on next. On this screen, we need to scroll down to the bottom and check this capabilities box. CloudFormation views certain resources that you can create with it as high-risk. In this case, we're creating an identity, an IAM role. Don't worry, I'll be talking a lot more about what an IAM role is in the next section of the course. Because it's an identity, because it's changing something that provides access to AWS, CloudFormation wants us to explicitly acknowledge that we intend to create this resource. So it’s prompting us for this capability to create this resource. Check this box, it’s fine, and then click on submit. The stack creation process will begin and the status will show create in progress.

      This process might take a few minutes. You’re able to click on refresh here, so this icon on the top right, and this will refresh the list of events. As CloudFormation is creating each physical resource that matches the logical resources in the template, it’s going to create a new event. For each resource, you’ll see a create in progress event when the creation process starts, and then you’ll see another one create complete when it creates successfully. If there are any errors in the template, you might see red text, which will tell you the nature of that error. But because this is a CloudFormation template that I’ve created, there’ll be no errors. After a number of minutes, the stack itself will move from Create in Progress to Create Complete.

      I refreshed a couple more times and we can see that the Session Manager instance profiles moved into the Create Complete status and straight after that it started to create the EC2 instance. We’ve got this additional event line saying Create in Progress, and the resource creation has been initiated. We’re almost at the end of the process now; the EC2 instance is going to be the last thing that the stack will create. At this point, just go ahead and pause the video and wait until both the EC2 instance and the stack itself move into Create Complete. Once both of those move into Create Complete, then you can resume the video and we’re good to continue.

      Another refresh, and we can see that the EC2 instance has now moved into a Create Complete status. Another refresh and the entire stack, CFN demo 1, is now in the create complete state, which means that the creation process has been completed and for every logical resource in the template, it’s created a physical resource. I can click on the outputs tab and see a list of all the outputs that are generated from the stack. You’ll note how they perfectly match the outputs that are listed inside the template. We’ve got instance ID, AZ, public DNS, and public IP. These are exactly the same as the outputs listed inside the CloudFormation template. You’ll see that these have corresponding values: the instance ID, the public DNS of the instance, and the public IP version 4 address of the instance.

      If I click on the resources tab, we’ll be able to see a list of the logical resources defined in the template, along with their corresponding physical resource IDs. For the EC2 instance logical resource, it’s created an instance with this ID. If you click on this physical ID, it will take you to the actual resource inside AWS, in this case, the EC2 instance. Now, before we look at this instance, I’m going to click back on CloudFormation and just click on the stacks clickable link at the top there. Note how I’ve got one stack, which is CFN demo one. I could actually go ahead and click on create stack and create stack with new resources and apply the same template again, and it would create another EC2 instance. That’s one of the powerful features of CloudFormation. You can use the same template and apply it multiple times to create the same set of consistent infrastructure.

      I could also take this template because it's portable, and because it automatically selects the AMI to use, I could apply it in a different region and it would have the same effect. But I’m not going to do that. I’m going to keep things simple for now and move back to the EC2 tab. Now, the one thing I want to demonstrate before I finish up with this lesson is Session Manager. This is an alternative to having to use the key pair and SSH to connect to the instance. What I’m able to do is right-click and hit Connect, and instead of using a standalone SSH client, I can select to use Session Manager. I’ll select that and hit Connect, and that will open a new tab and connect me to this instance without having to use that key pair.

      Now, it connects me using a different shell than I'm used to, so if I type bash, which is the shell that you normally have when you log into an EC2 instance, that should look familiar. I’m able to run normal Linux commands like df -k to list all of the different volumes on the server, or dmesg to get a list of informational outputs for the server. This particular one does need admin permission, so I’ll need to rerun this with sudo and then dmesg. These are all commands that I could run in just the same way if I was connected to the instance using an SSH client and the key pair. Session Manager is just a better way to do it, but it requires certain permissions to be given to the instance. That’s done with an instance role that I’ll be talking all about later on in the course. That is the reason why my CloudFormation template has these two logical resources, because these give the instance the permission to be able to be connected to using Session Manager. It makes it a lot easier to manage EC2 instances.

      So that’s been a demo of how easy it is to create an EC2 instance using CloudFormation. Throughout the course, we'll be using more and more complex examples of CloudFormation. I’ll be using that to show you how powerful the tool is. For now, it’s a really simple example, but it should show how much quicker it is to create this instance using CloudFormation than it was to do it manually. To finish up this lesson, I’m going to move back to the CloudFormation console. I’m going to select this CloudFormation stack and click on Delete. I need to confirm that I want to do this because it’s telling me that deleting this stack will delete all of the stack resources.

      What happens when I do this is that the stack deletes all of the logical resources that it has, and then it deletes all of the corresponding physical resources. This is another benefit of CloudFormation in that it cleans up after itself. If you create a stack and that creates resources, when you delete that stack, it cleans up by deleting those resources. So if I click on Delete Stack Now, which I will do, it starts a delete process, and that’s going to go ahead and remove the EC2 instance that it created. If I select this stack now, I can watch it do that. I can click on Events, and it will tell me exactly what it’s doing. It’s starting off by deleting the EC2 instance. If I move back to the EC2 console and just hit Refresh, we can see how the instance state has moved from running to shutting down.

      Eventually, once the shutdown is completed, it will terminate that instance. It’ll delete the storage, it will stop using the CPU and memory resources. At that point, the account won’t have any more charges. It wouldn’t have done anyway because this demo has been completely within the free tier allocation because I was using a t2.micro instance. But there we go. We can see the instance state has now moved to terminated. Go back to CloudFormation and just refresh this. We’ll see that it’s completed the deletion of all the other resources and then finished off by deleting the stack itself. So that’s the demonstration of CloudFormation. To reaffirm the benefits, it allows us to do automated, consistent provisioning. We can apply the same template and always get the same results. It’s completely automated, repeatable, and portable. Well-designed templates can be used in any AWS region. It’s just a tool that really does allow us to manage infrastructure effectively inside AWS.

    1. A critique on the Mass Media... The problem is that they want the Mass Media system to operate on the code of "True/False" rather than "Known/Unknown"... But if it were to be so, it would not be Mass Media anymore, but rather the Science System.

      For Mass Media to be Mass Media it needs to be concerned with selection and filtering, to condense and make known, not to present "all the facts". Sure, they need to be concerned with truth to a certain degree, but it's not the primary priority.


      This is a reflection based on my knowledge of Luhmann's theory of society as functionally differentiated systems; as explained by Hans-Georg Moeller (Carefree Wandering) on YouTube.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Responses to recommendations

      Reviewer #1 (Recommendations For The Authors):

      Describe more precisely how gene expression graphs are built (tissues, reads counts). For example, how were read counts normalized? Were they from DESeq2 data, which only works by comparing two samples? If so, all samples should be independently compared to a reference and the normalized expression value of the reference will change from sample to sample... thus introducing a pure technical artifact.

      We have added additional information about the normalisation method to the Material and Methods section (Lines 597-598: “Lastly, expression levels shown in figures 2-5 are normalised gene counts produced by DESeq2.”) and figure legends (lines 247, 286, 372, 404: “Gene expression data was generated from whole fish. Expression levels were derived from DESeq2 normalised gene counts.”) to address this recommendation.

      DESeq2 provides a reference-independent normalisation through a median of ratios method (a good explanation can be found here: https://hbctraining.github.io/DGE_workshop/lessons/02_DGE_count_normalization.html). The normalised expression values are independent of any reference, and therefore will not change from sample to sample as suggested in this comment. In contrast, pairwise comparisons are done when analysing significantly differentially expressed genes between two treatments using a Wald test, which is done against a reference and generates log2 fold change information and p-values; however, this is different to the normalisation we described above.
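
      For readers unfamiliar with the median of ratios method, the idea can be sketched in a few lines of plain Python. This is a simplified illustration on toy counts, not DESeq2's code and not the pipeline used in the study; the function names are hypothetical:

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of genes, each a list of raw counts per sample.
    Genes with a zero count in any sample are skipped, because their
    geometric mean is zero and the ratio is undefined.
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for gene in counts:
        if min(gene) <= 0:
            continue
        # The geometric mean across samples acts as a pseudo-reference
        # built from all samples rather than from one chosen sample.
        log_geo_mean = sum(math.log(c) for c in gene) / n_samples
        for j, c in enumerate(gene):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    return [math.exp(median(r)) for r in log_ratios]


def normalise(counts):
    """Divide each sample's counts by that sample's size factor."""
    sf = size_factors(counts)
    return [[c / s for c, s in zip(gene, sf)] for gene in counts]


# Toy data: sample 2 was sequenced twice as deeply as sample 1.
raw = [
    [10, 20],
    [100, 200],
    [50, 100],
    [7, 0],  # skipped when estimating size factors
]
sf = size_factors(raw)
norm = normalise(raw)
print(sf)
print(norm[0])
```

      Each sample receives a single size factor computed against a pseudo-reference derived from all samples, which is why the normalised values do not depend on any one sample being chosen as the reference.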

      Provide bioinformatics workflows and, if possible, the set of parameters used, the computing resources, etc. Were some assembly finishing steps carried out (by long-range PCR?) and experimental validations (especially for allele-specific transcripts, by conventional RT-PCR based on diagnostic mutations)?

      We have added additional information on the bioinformatics workflows where required, including parameters used (Lines 530, 536, 549-551, and 574-583). No finishing steps other than HiC scaffolding were performed. No allele-specific analysis was done as part of this manuscript.

      To further improve transparency, we have also uploaded all the scripts used for this study to https://github.com/R-Huerlimann/Malabar_grouper_genome and the gene models and functional annotation to https://figshare.com/projects/Malabar_grouper_Epinephelus_malabaricus_genome_annotation/199909. This information has been added to the manuscript in lines 600-601 and 609-611.

      Reviewer #3 (Recommendations For The Authors):

      General author response:

      All of this reviewer's recommendations are very relevant and would certainly provide a lot of information, but they would constitute a full project in themselves, as they would require establishing this grouper species as an experimental model in our lab. Currently we only have access to the larval and juvenile stages via a collaboration with the Okinawa Prefectural Sea Farming Center, which is an hour's drive from our lab, and this access is limited to the grouper spawning season. To do everything that is suggested, we would need regular and easy access to the fish. This would require establishing this model in our marine station, which is not possible due to space and time constraints. These groupers grow to a very large size (1-2 m in length, and up to 150 kg in weight) and only mature into males after > 6 years.

      First and foremost, I would advise the authors to extend their TH and cortisol levels measurements to the entire developmental time considered in their analysis.

      For the reasons stated above we could not perform these experiments. We must emphasize that the data regarding TH are available for a closely related species (e.g., Epinephelus coioides, de Jesus et al. 1998) and there is no reason to think that the situation will be drastically different in E. malabaricus. In addition, given that we have now studied several coral reef fish species in the same context (clownfish, surgeonfish, damselfish, gobies) we observed that the transcriptomic data are more robust, more sensitive, and more precise than hormone measurements. 

      Consider carrying out in situ hybridisation of TSH with putative CRH receptors to determine if thyrotrophin could be competent to respond to HPA axis signals.

      We agree that studying the interplay between corticoids and thyroid hormones at the neuroendocrine level would be desirable, and we fully agree with the experiment suggested by the reviewer, but this is impossible in our current situation. We are not working with an established animal model like zebrafish or Xenopus, but with a large, long-lived marine fish that reproduces in spawning aggregations and whose husbandry is notoriously difficult.

      Consider conducting cortisol treatment experiments to functionally determine if indeed cortisol is involved in grouper metamorphosis.

      We tried to do TH and cortisol treatments specifically on the early larval stages corresponding to the early TH peak to see how this would impact the development of the fin spines, but our trials were unsuccessful. The larvae at that stage are extremely fragile and even putting them into small volumes of treatment drugs induced massive mortalities. Again, this would mean establishing this grouper species as a model organism and would require a massive effort to improve larval rearing as discussed above. We feel that our data stands on its own in the meantime and adds valuable information to the existing literature by studying a rarely investigated species.

      Responses to comments

      Reviewer #1 (Public Review):

      Weaknesses:

      The manuscript needs proper editing and is not complete. Some wordings lack precision and make it difficult to follow (e.g. line 98 "we assembled a chromosome-scale genome of ..." should read instead "we assembled a chromosome-scale genome sequence of ..."). Also, panel Figure 2E is missing.

      We made the suggested change of adding “sequence” in lines 32 and 121. Concerning additional changes, we have carefully edited our manuscript and looked for any incomplete sections. Unfortunately, it is difficult to see what other issues are being raised here without any further information. 

      As for panel E of figure 2, it is not missing. The panel is located to the right, just below “Target Cells”.

      The shortcomings of the manuscripts are not limited to the writing style, and important technical and technological information is missing or not clear enough, thereby preventing a proper evaluation of the resolution of the genomic resources provided:

      Several RNASeq libraries from different tissues have been built to help annotate the genome and identify transcribed regions. This is fine. But throughout the manuscript, gene expression changes are summarized into a single panel where it is not clear at all which tissue this comes from (whole embryo or a specific tissue?), or whether it is a cumulative expression level computed across several tissues (and how it was computed), etc. This is essential information needed for data interpretation.

      No fertilised eggs or embryos have been sequenced. The individual tissues derived from juvenile fish were used for the genome annotation only, using ISOseq. The whole larval fish were used for the developmental analysis using RNAseq, as well as the genome annotation. We have added additional information in the figures and text that the results shown are from whole larvae, and added more detail to the material and methods section about which type of sample was analysed in which way.

      Specifically, we have added “Lastly, expression levels shown in figures 2-5 are normalised gene counts produced by DESeq2.” to lines 597-598 in the Material and Methods section, “Gene expression data was generated from whole larvae.” to line 191, and “Gene expression data was generated from whole fish. Expression levels were derived from DESeq2 normalised gene counts.” to the figure legends in lines 247, 286, 372, 404). Additionally, we have added clarifications in lines 489, 497, 530, and 536. 

      The bioinformatic processing, especially of the assembly and annotation, is very poorly described. This is also a sensitive topic, as illustrated by the numerous "assemblathon" and "annotathon" initiatives to evaluate tools and workflows. Importantly, providing configuration files and in-depth description of workflows and parameter settings is highly recommended. This can be made available through data store services, and documents even benefit from DOIs. This provides others with more information to evaluate the resolution of this work. No doubt that it is well done, but especially in the field of genome assembly and annotation, high resolution is VERY cost and time-intensive. Not surprisingly, most projects are conditioned by trade-offs between cost, time, and labor. The authors should provide others with the information needed to evaluate this.

      We have added additional information on parameters used in the genome assembly, annotation and transcriptome analysis in lines 549-551, 577, 579, 580, and 582. Additionally, we have uploaded all scripts to github as outlined in the Code and Data Availability section (lines 599-614).

      The genome assembly did not use a specific workflow (e.g., nextflow), but was done with a simple command and standard parameters in IPA. Scaffolding was carried out by Phase Genomics using their standardised proprietary workflow, of which a detailed description provided by Phase Genomics can be found in the supplementary material.

      Quantifications of T3 and T4 levels look fairly low and not so convincing. The work would clearly benefit from a discussion about why the signal is so low and what are the current technological limitations of these quantifications. This would really help (general) readers.

      The T3/T4 levels are consistent with other published work in fish. In the present manuscript for grouper we have a peak level of 1.2 ng/g (1,200 pg/g) of T4 and 0.06 ng/g (60 pg/g) of T3. This is a higher level of T4 and comparable level of T3 to what was found in convict tang (Holzer et al. 2017; Figure 2) with 30 pg/g of T4 and 100 pg/g of T3. Of course, there are also examples with higher levels, such as clownfish (Roux et al. 2023; Figure 1), with 10 ng/g (10,000 pg/g) of T4 and 2 ng/g (2,000 pg/g) of T3.

      The differences could be due to differences in tissue structure and therefore in hormone extraction efficiency, in hormone measurement protocols, in fish physiology, or in fish size (e.g., weighing tiny grouper larvae is difficult and less precise than in convict tang). What is important is not the absolute level but the relative level, which shows the change across larval stages of a species under identical extraction and measurement protocols. This means our data are internally consistent and coherent with the grouper literature.

      Holzer, Guillaume, et al. "Fish larval recruitment to reefs is a thyroid hormone-mediated metamorphosis sensitive to the pesticide chlorpyrifos." Elife 6 (2017): e27595.

      Roux, Natacha, et al. "The multi-level regulation of clownfish metamorphosis by thyroid hormones." Cell Reports 42.7 (2023).

      Differential analysis highlights up to ~ 15,000 differentially expressed genes (DEG), out of a predicted 26k genes. This corresponds to more than half of all genes. ANOVA-based differential analysis relies on the simple fact that only a minority of genes are DEG. Having >50% DEG is well beyond the validity of the method. This should be addressed, or at least discussed.

      The large number of differentially expressed genes is due to the fact that this is coming from a larval developmental transcriptome going from one day old larva to fully metamorphosed juveniles at around day 60. 

      While DESeq2 does assume that most genes are not differentially expressed, this assumption affects normalisation, not hypothesis testing (Wald test, LRT or ANOVA), and normalisation in DESeq2 is fairly robust to violations of it. According to the author of DESeq2, Michael Love, DESeq2 uses the median ratio for normalisation, and as long as the numbers of up- and down-regulated genes are relatively even, it will be able to handle the data. As part of our general quality control for this project we consulted the MA plots, which do not show any over-representation of up- or down-regulated expression patterns. Additionally, see Michael Love's comment on comparing different tissues, which is also applicable here when comparing vastly different larval stages (https://support.bioconductor.org/p/63630/):

      “For experiments where all genes increase in expression across conditions, the median ratio method will not be able to capture this difference, but this is typically not the case for a tissue comparison, as there are many "housekeeping" genes with relatively similar expression pattern across tissues.”
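
      The point made in the quote can be illustrated with a toy calculation in plain Python. This is a simplified re-implementation of the median of ratios idea on invented counts, not DESeq2 itself and not data from the study. Both samples are given equal sequencing depth, so an ideal size factor would be 1 for each:

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors for a genes x samples count table."""
    n = len(counts[0])
    log_ratios = [[] for _ in range(n)]
    for gene in counts:
        if min(gene) <= 0:
            continue  # geometric mean undefined when any count is zero
        log_gm = sum(math.log(c) for c in gene) / n
        for j, c in enumerate(gene):
            log_ratios[j].append(math.log(c) - log_gm)
    return [math.exp(median(r)) for r in log_ratios]


# Four of five genes change, but ups and downs are balanced.
balanced = [[100, 400], [100, 400], [100, 25], [100, 25], [100, 100]]
# Four of five genes change, all in the same direction.
all_up = [[100, 400], [100, 400], [100, 400], [100, 400], [100, 100]]

sf_balanced = size_factors(balanced)
sf_all_up = size_factors(all_up)
print(sf_balanced)
print(sf_all_up)
```

      In the balanced case the median ratio still falls on the unchanged gene, so the size factors stay at 1 even though most genes differ between samples. In the one-directional case the median is pulled onto the changed genes and the size factors absorb the biological shift, which is the failure mode the quote describes.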

      Reviewer #3 (Public Review):

      Weaknesses:

      However, the authors make substantial considerations that are not proven by experimental or functional data. In fact, this is a descriptive study that does not provide any functional evidence to support the claims made.

      We agree with the reviewer that our paper lacks functional experiments but despite that, the transcriptomic data clearly show the activation of TH and corticoid pathways during two distinct periods: an early activation between D1 and D10, and a second one between D32 and juvenile stage. These data are interesting as they call for further examination of 1) the existence of an early larval developmental step also involving TH and corticosteroids and 2) the possible interaction of corticoids and TH during metamorphosis. This is a question that is certainly not settled yet in teleost fishes and which is of great interest.

      Point 1) is of particular interest and importance, since this early activation (unique to our knowledge in any teleost fish studied so far) raises a lot of new questions and will certainly be scrutinised by other groups in the years to come, therefore ensuring a good citation impact of this study. We hope that the reviewer, while disagreeing with some of our statements, will recognize that our study will be stimulating at that level and that this is what scientific studies should do.

      We acknowledge the descriptive nature of the data and the lack of functional experiments in the Discussion in lines 443 to 445: “This may suggest that in some aspect, cortisol synthesis could work in concert with TH, as has been shown in several different contexts in amphibians, but functional experiments need to be conducted to confirm this hypothesis.” As stated above, doing such functional experiments would require establishing the grouper as an experimental model in our husbandry, which is currently not possible due to the large size of the adult fish.

      The consideration that cortisol is involved in metamorphosis in teleosts has never been shown, and the only example cited by the authors (REF 20) clearly states that cortisol alone does not induce flatfish metamorphosis. In that work, the authors clearly state that in vivo cortisol treatment had no synergistic effect with TH in inducing metamorphosis. Moreover, in Senegalensis, the sole pre-otic CRH neuron number decreases during metamorphosis, further arguing that, at least in flatfish, cortisol is not involved in flatfish metamorphosis (PMID: 25575457).  

      We will do our best to improve the clarity of the revised manuscript to avoid any misunderstanding about our claims. However, we would like to point out the semantic shift in the reviewer's first sentence: “being involved” is not the same as “cortisol alone does not induce”. In ref 20 the authors explicitly wrote that “Cortisol further enhanced the effects of both T4 and T3, but was ineffective in the absence of thyroid hormones” and in our view this indeed corresponds to “being involved in metamorphosis”.

      We are not claiming that cortisol alone is involved in metamorphosis, as the reviewer suggests, but simply that there is a possible involvement of cortisol together with TH in metamorphosis. We stand by this claim, as we did observe an activation of corticoid pathway genes around D32, which is sufficient to say it is involved. We do agree that functional experiments will be needed to properly demonstrate the involvement of corticoids in grouper metamorphosis, but this was not possible in the current study, as it would require setting up a full grouper life cycle in lab conditions, which is beyond the scope of this manuscript.

      We also mentioned in the discussion that the role of corticoids in fish larval development is still debated, and we agree that this remains a contentious issue. We have clarified the Discussion on this point (lines 375-376, lines 439-464).

      We wrote that “There is contrasting evidence of communication between these two pathways during teleost fish larval development with some data suggesting a synergic and other an antagonistic relationship. In terms of synergy, an increase in cortisol level concomitantly with an increase in TH levels has been observed in flatfish [26], golden sea bream [64] and silver sea bream [65]. Cortisol was also shown to enhance in vitro the action of TH on fin ray resorption (phenomenon occurring during flatfish metamorphosis) in flounder[27]. It has also been shown that cortisol regulates local T3 bioavailability in the juvenile sole via regulation of deiodinase 2 in an organ-specific manner [66]. On the antagonistic side, it has been shown that experimentally induced hyperthyroidism in common carp decreases cortisol levels[67], whereas cortisol exposure decreases TH levels in European eel [68]. Given this scattered evidence, the existence of a crosstalk active during teleost larval development and metamorphosis has never been formally demonstrated. The results we obtained in grouper are clearly indicating that HPI axis is activated during both early development and metamorphosis and that cortisol synthesis is activated during early development. This may suggest that in some aspect, cortisol synthesis could work in concert with TH, as has been shown in several different contexts in amphibians [25], but functional experiments need to be conducted to confirm this hypothesis.” In the revised manuscript, we have also added the interesting case of the Senegal sole mentioned by the reviewer.

      In the last revision, we had also added that our results “brought a first insight into the potential role of corticoids in the metamorphosis of E. malabaricus and call for functional experiments directly testing a possible synergy” meaning that we clearly acknowledge that we are only revealing a hypothesis that remains to be tested. We later follow up with a discussion about the most novel observation and focus of our study, the increase in THs and cortisol during early development, which was unexpected and very intriguing. Again, these results suggest that there might be a link between the two, as has been shown in amphibians. This is typically the kind of results that should encourage more investigations into other fish species. Indeed, this has been pointed out by other authors and in particular by Bob Denver (probably the foremost expert on this topic) in Crespi and Denver 2012: “Elevation in HPA/I axis activity has been described prior to Metamorphosis in amphibians and fish, birth in mammals (reviewed in Crespi & Denver 2005a; Wada 2008)”. B. Denver also adds that: “Experiments in which GCs were elevated prior to metamorphosis or prior to hatching or birth (e.g. Weiss, Johnston & Moore 2007) or inhibited by treatments with GC synthesis blockers (e.g. metyrapone) or receptor antagonists (e.g. RU486, Glennemeir & Denver 2002) demonstrate that GCs play a causal role in precipitating these life-history transitions (also reviewed in Crespi & Denver 2005a; Wada 2008).” We believe the reviewer will be convinced by these elements coming from a colleague unanimously respected in the field. 

      Furthermore, the authors need to recognise that the transcriptomic analysis is whole-body and that HPA axis genes are upregulated, which does not mean they are involved in regulating the HPT axis. The authors do not show that in thyrotrophs, any CRH receptor is expressed or in any other HPT axis-relevant cells and that changes in these genes correlate with changes in TSH expression. An in-situ hybridisation experiment showing co-expression on thyrotrophs of HPA genes and TSH could be a good start. However, the best scenario would be conducting cortisol treatment experiments to see if this hormone affects grouper metamorphosis.

      We agree that functional experiments are needed to validate our hypothesis. As the early peaks of expression observed for many genes were very intriguing to us, we did carry out thyroid hormone and goitrogen treatments on young grouper larvae to test their effect on the morphological changes. Unfortunately, such experiments, already tricky on metamorphosing larvae, are even riskier on such tiny individuals just after hatching, and we encountered high mortality rates. We must add that because we cannot establish a full grouper life cycle under lab conditions, we did these experiments in a commercial husbandry system in Japan, which, while excellent, limits the scope of possible experiments. We were thus not able to provide functional validation of our hypothesis. Such experiments would be a full project in themselves, requiring a rearing system suited to both larval survival and the economic constraints of drug treatments. We were further limited by the spawning times of the grouper in the operational aquaculture farm, which are restricted to a short period each year. So even though we strongly agree that such experiments are necessary, we think that they are outside the scope of the present paper and are something future research can explore.

      High TSH and Tg levels usually parallel whole-body TH levels during teleost metamorphosis. However, in this study, high Tg expression levels are only achieved at the juvenile stage, whereas high TSH is achieved at D32, and at the juvenile stage, they are already at their lowest levels.

      This is exactly our point. We observe two peaks in TSH expression, one at D3 and one at D32. The peak at D3 coincides with high thyroid hormone levels on the same day, and while we have not measured TH at D32, existing literature shows that there is a peak in TH during that time (e.g., de Jesus et al., 1998). Similarly, there is a small peak of Tg at D3. Our manuscript focused more on the upregulation of these genes at D3, which has not been reported before in the literature and raised the question of the role of TH so early in the larval development, outside of the metamorphosis period. 

      Regarding the respective levels of TSH and Tg, we would first like to add that their order of appearance before metamorphosis (TSH at D32, Tg after) is consistent with what we would expect. We agree, however, that the strong increase of Tg and TPO expression is later than expected. Therefore, we have added the following sentence in lines 212 to 216: “The respective order of appearance of TSH and Tg (TSH at D32, Tg after) is consistent with what we would expect but a bit later than expected given the morphological transformation. It would be interesting to revisit this in a future series of experiments, with tighter temporal sampling to study how gene expression and morphological transformation aligned.”

      It is very difficult to conclude anything with the TH and cortisol levels measurements. The authors only measured up until D10, whereas they argue that metamorphosis occurs at D32. In this way, these measurements could be more helpful if they focus on the correct developmental time. The data is irrelevant to their hypothesis.

      We respectfully disagree with the reviewer, considering that 1) TH levels have already been investigated in groupers coinciding with pigmentation changes and fin rays resorption (Figure 4 in de Jesus et al, 1998), 2) there is also evidence in numerous fish species that TH level increase is concomitant with increase of TH related genes, and 3) we observed in our data an increase in the expression of TH related genes as well as pigmentation changes and fin rays resorption. Based on our experience in fish metamorphosis and the literature we can say confidently that those observations indicate that metamorphosis is occurring between D32 and the juvenile stage. This clearly shows that our inference is correct. Additionally, we would like to reemphasize that from our experience in several fish species transcriptomic data are more robust and precise than hormone measurements.

      However, as we were surprised by the activation of TH and corticoid pathway genes very early in larval development (at D3), which is clearly outside the metamorphosis period, we decided to measure TH and cortisol levels during this period to determine whether this surprising early activation did indeed correspond to an increase in both TH and cortisol. As such an observation has never been made in other teleost species (to our knowledge), and as we were wondering whether gene activation was accompanied by a hormonal increase, the measurements we made for TH and cortisol between D1 and D10 are relevant. In order to clarify our message further, we have changed some of the mentions of “metamorphosis” to “larval development” throughout the manuscript and added other improvements to avoid any confusion between the two periods we are studying: early larval development (between D1 and D10) and metamorphosis (between D32 and juvenile stage).

      Moreover, as stated in the previous review, a classical sign of teleost metamorphosis is the upregulation of TSHb and Tg, which does not occur at D32 therefore, it is very hard for me to accept that this is the metamorphic stage. With the lack of TH measurements, I cannot agree with the authors. I think this has to be toned down and made clear in the manuscript that D32 might be a putative metamorphic climax but that several aspects of biology work against it. Moreover, in D10, the authors show the highest cortisol level and lowest T4 and T3 levels. These observations are irreconcilable, with cortisol enhancing or participating in TH-driven metamorphosis.

      We thank the reviewer for this comment, but we think that there might be a misunderstanding here. 

      (1) We clearly observed an increase of TSHb (occurring between D18 and juvenile stage) and an increase of tg from D32, which coincide with the activation of other genes involved in the TH pathway (dio2, dio3, and also a strong increase of TRb). All this, put in the context of what we know from previous grouper studies, clearly supports our conclusion that TH-regulated metamorphosis starts at around D32 in grouper. We also observed morphological changes such as fin ray resorption and pigmentation changes between D32 and juvenile stage. Such morphological changes have already been associated with metamorphosis in groupers (De Jesus et al 1998), as they occur during the TH level increase and are also under the control of TH in grouper (De Jesus et al 1998). Based on this study, and on studies conducted on many other teleost species showing that the increase of TH levels is always associated with an activation of TH pathway genes and with morphological and pigmentation changes, we concluded that metamorphosis of E. malabaricus occurs between D32 and juvenile stage. We have improved the clarity of the manuscript in several places to make sure that our conclusion is based on our transcriptomic and morphological data plus the available literature.

      (2) We clearly observed another activation of TH-related genes earlier in development (between D1 and D10, with a surge of trhrs, tg and tpo at D3). As this activation was very unexpected for us, we decided to focus the analysis of TH levels between D1 and D10 and, very interestingly, we observed a high level of T4 at D3, indicating that THs are instrumental very early in the larval development of the Malabar grouper, which has never been shown before. We stated in lines 224-225 that our “data reinforce the existence of two distinct periods of TH signalling activity, one early on at D3 and one late corresponding to classic metamorphosis at D32”. We agree that we could have been clearer and explained that this early activation was very intriguing for us and that we wanted to investigate hormonal levels around that period. We never claimed anywhere in the manuscript, however, that this early developmental period corresponds to metamorphosis. Something else is occurring and both TH and cortisol seem to be involved, but further experiments need to be conducted to understand their role and their possible interaction. We have added corresponding statements in the abstract (lines 39-43) and discussion (lines 447 to 449).

      (3) Finally, regarding the comment about cortisol enhancing or participating in TH-driven metamorphosis, our data clearly showed an activation of the corticoid pathway genes around metamorphosis (between D32 and juvenile stage), suggesting a potential implication of corticoids in metamorphosis, but we agree with the reviewer that further experiments are needed to test that. We never claimed that cortisol was enhancing or participating in metamorphosis; on the contrary, we are “suggesting a possible interaction between TH and corticoid pathway during metamorphosis”. We also say that our “results brought a first insight into the potential role of corticoids in the metamorphosis of E. malabaricus and call for functional experiments directly testing a possible synergy.” Nonetheless, we agree that some parts of our manuscript can be confusing with regard to cortisol synthesis during metamorphosis, as we did not measure cortisol levels between D32 and juvenile stage. We have therefore made changes throughout the Introduction and Discussion to make this clearer.

      Given this, the authors should quantify whole-body TH levels throughout the entire developmental window considered to determine where the peak is observed and how it correlates with the other hormonal genes/systems in the analysis.

      We did not measure TH levels at later stages as they have already been measured during Epinephelus coioides metamorphosis, and the morphological changes observed in that species around the TH peak correspond to what we observed in Epinephelus malabaricus around the peak of expression of TH pathway genes (see De Jesus et al., 1998 General and Comparative Endocrinology, 112:10-16). The main focus of this manuscript is the novel observation of an early activation period at D3, for which we needed TH levels to determine whether they were involved in another early developmental process (not related to metamorphosis). Our hypothesis is that this early activation might be related to the growth of fin rays necessary to enhance floatability during the oceanic larval dispersal. As we may have arrived at this hypothesis too rapidly without setting up the context well enough, we have made changes to the introduction and discussion.

      Even though this is a solid technical paper and the data obtained is excellent, the conclusions drawn by the authors are not supported by their data, and at least hormonal levels should be present in parallel to the transcriptomic data. Furthermore, toning down some affirmations or even considering the different hypotheses available that are different from the ones suggested would be very positive.

      We thank the reviewer for acknowledging the solidity of the methods of our paper and the quality of the results. We agree that there were several parts where our message was unclear. We have addressed these points in the revised version of the manuscript to make sure there is no more confusion between the two distinct periods we studied in this paper (early larval development and metamorphosis). We also made sure that our claims about TH/corticoid interaction during both periods remain hypothetical, as we cannot yet, despite trials, support them with functional experiments.

    1. Reviewer #2 (Public Review):

      Summary:

      In this work, the authors present a biologically plausible, efficient E-I spiking network model and study various aspects of the model and its relation to experimental observations. This includes a derivation of the network into two (E-I) populations, the study of single-neuron perturbations and lateral-inhibition, the study of the effects of adaptation and metabolic cost, and considerations of optimal parameters. From this, they conclude that their work puts forth a plausible implementation of efficient coding that matches several experimental findings, including feature-specific inhibition, tight instantaneous balance, a 4 to 1 ratio of excitatory to inhibitory neurons, and a 3 to 1 ratio of I-I to E-I connectivity strength. It thus argues that some of these observations may come as a direct consequence of efficient coding.

      Strengths:

      While many network implementations of efficient coding have been developed, such normative models are often abstract and lacking sufficient detail to compare directly to experiments. The intention of this work to produce a more plausible and efficient spiking model and compare it with experimental data is important and necessary in order to test these models.

      In rigorously deriving the model with real physical units, this work maps efficient spiking networks onto other more classical biophysical spiking neuron models. It also attempts to compare the model to recent single-neuron perturbation experiments, as well as some long-standing puzzles about neural circuits, such as the presence of separate excitatory and inhibitory neurons, the ratio of excitatory to inhibitory neurons, and E/I balance. One of the primary goals of this paper, to determine if these are merely biological constraints or come from some normative efficient coding objective, is also important.

      Though several of the observations have been reported and studied before (see below), this work arguably studies them in more depth, which could be useful for comparing more directly to experiments.

      Weaknesses:

      Though the text of the paper may suggest otherwise, many of the modeling choices and observations found in the paper have been introduced in previous work on efficient spiking models, thereby making this work somewhat repetitive and incremental at times. This includes the derivation of the network into separate excitatory and inhibitory populations, discussion of physical units, comparison of voltage versus spike-timing correlations, and instantaneous E/I balance, all of which can be found in one of the first efficient spiking network papers (Boerlin et al. 2013), as well as in subsequent papers. Metabolic cost and slow adaptation currents were also presented in a previous study (Gutierrez & Deneve 2019). Though it is perfectly fine and reasonable to build upon these previous studies, the language of the text gives them insufficient credit.

      Furthermore, the paper makes several claims of optimality that are not convincing enough, as they are only verified by a limited parameter sweep of single parameters at a time, are unintuitive and may be in conflict with previous findings of efficient spiking networks. This includes the following. Coding error (RMSE) has a minimum at intermediate metabolic cost (Figure 5B), despite the fact that intuitively, zero metabolic cost would indicate that the network is solely minimizing coding error and that previous work has suggested that additional costs bias the output. Coding error also appears to have a minimum at intermediate values of the ratio of E to I neurons (effectively the number of I neurons) and the number of encoded variables (Figures 6D, 7B). These both have to do with the redundancy in the network (number of neurons for each encoded variable), and previous work suggests that networks can code for arbitrary numbers of variables provided the redundancy is high enough (e.g., Calaim et al. 2022). Lastly, the performance of the E-I variant of the network is shown to be better than that of a single cell type (1CT: Figure 7C, D). Given that the E-I network is performing a similar computation as to the 1CT model but with more neurons (i.e., instead of an E neuron directly providing lateral inhibition to its neighbor, it goes through an interneuron), this is unintuitive and again not supported by previous work. These may be valid emergent properties of the E-I spiking network derived here, but their presentation and description are not sufficient to determine this.

      Alternatively, the methodology of the model suggests that ad hoc modeling choices may be playing a role. For example, an arbitrary weighting of coding error and metabolic cost of 0.7 to 0.3, respectively, is chosen without mention of how this affects the results. Furthermore, the scaling of synaptic weights appears to be controlled separately for each connection type in the network (Table 1), despite the fact that some of these quantities are likely linked in the optimal network derivation. Finally, the optimal threshold and metabolic constants are an order of magnitude larger than the synaptic weights (Table 1). All of these considerations suggest one of the following two possibilities. One, the model has a substantial number of unconstrained parameters to tune, in which case more parameter sweeps would be necessary to definitively make claims of optimality. Or two, parameters are being decoupled from those constrained by the optimal derivation, and the optima simply correspond to the values that should come out of the derivation.

    2. Author response:

      eLife assessment

      This study offers a useful treatment of how the population of excitatory and inhibitory neurons integrates principles of energy efficiency in their coding strategies. The analysis provides a comprehensive characterisation of the model, highlighting the structured connectivity between excitatory and inhibitory neurons. However, the manuscript provides an incomplete motivation for parameter choices. Furthermore, the work is insufficiently contextualized within the literature, and some of the findings appear overlapping and incremental given previous work.

      We thank the Reviewers and the Reviewing Editor for taking the time to provide extremely valuable suggestions and comments, which will help us to substantially improve our paper. In what follows, we summarize our current plan to improve the paper in response to their suggestions.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary: Koren et al. derive and analyse a spiking network model optimised to represent external signals using the minimum number of spikes. Unlike most prior work using a similar setup, the network includes separate populations of excitatory and inhibitory neurons. The authors show that the optimised connectivity has a like-to-like structure, leading to the experimentally observed phenomenon of feature competition. They also characterise the impact of various (hyper)parameters, such as adaptation timescale, ratio of excitatory to inhibitory cells, regularisation strength, and background current. These results add useful biological realism to a particular model of efficient coding. However, not all claims seem fully supported by the evidence. Specifically, several biological features, such as the ratio of excitatory to inhibitory neurons, which the authors claim to explain through efficient coding, might be contingent on arbitrary modelling choices. In addition, earlier work has already established the importance of structured connectivity for feature competition. A clearer presentation of modelling choices, limitations, and prior work could improve the manuscript.

      Thanks for these insights and for this summary of our work.

      Major comments:

      (1) Much is made of the 4:1 ratio between excitatory and inhibitory neurons, which the authors claim to explain through efficient coding. I see two issues with this conclusion: (i) The 4:1 ratio is specific to rodents; humans have an approximate 2:1 ratio (see Fang & Xia et al., Science 2022 and references therein); (ii) the optimal ratio in the model depends on a seemingly arbitrary choice of hyperparameters, particularly the weighting of encoding error versus metabolic cost. This second concern applies to several other results, including the strength of inhibitory versus excitatory synapses. While the model can, therefore, be made consistent with biological data, this requires auxiliary assumptions.

      We will better describe the ratio of numbers of E and I neurons found in real data, as suggested. The first submission already contained an analysis of how this ratio of neuron numbers depends on the weighting of the loss of E and I neurons and on the relative weighting of the encoding error vs the metabolic cost in the loss function (see Fig 6E). We will make sure that these results are suitably expanded and better emphasized in revision. We will also include a new analysis of how the optimal values of other parameters (namely: noise intensity, metabolic constant, ratio of mean I-I to E-I connectivity, time constants of single E and I neurons) depend on the relative weighting of encoding error vs metabolic cost in the loss function.
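      Why an optimum can move with the relative weighting of encoding error vs metabolic cost can be seen in a deliberately simple toy sketch. It is not the network model of the manuscript: the error and cost curves, the function name and the grid are all hypothetical assumptions, chosen only so that the optimum has a closed form to check against.

```python
import numpy as np

def optimal_setting(weight_error, grid):
    """Grid point minimising a weighted sum of a toy error and a toy cost.

    Toy assumptions (not taken from the manuscript): the error falls as
    1/x with a resource-like parameter x, and the cost grows linearly in x.
    """
    error = 1.0 / grid
    cost = grid
    loss = weight_error * error + (1.0 - weight_error) * cost
    return grid[np.argmin(loss)]

# Hypothetical parameter grid with a step of 0.001.
grid = np.linspace(0.1, 5.0, 4901)

# Sweep the weighting instead of fixing it at a single value such as 0.7.
weights = (0.5, 0.7, 0.9)
optima = {w: optimal_setting(w, grid) for w in weights}

for w in weights:
    # For these toy curves the analytical optimum is sqrt(w / (1 - w)).
    print(w, optima[w], np.sqrt(w / (1.0 - w)))
```

      For these toy curves the optimum shifts from 1 to about 1.53 to 3 as the weight on the error rises from 0.5 to 0.7 to 0.9, so an optimum reported at a single weighting is informative mainly alongside a sweep of this kind.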

      (2) A growing body of evidence supports the importance of structured E-I and I-E connectivity for feature selectivity and response to perturbations. For example, this is a major conclusion from the Oldenburg paper (reference 62 in the manuscript), which includes extensive modelling work. Similar conclusions can be found in work from Znamenskiy and colleagues (experiments and spiking network model; bioRxiv 2018, Neuron 2023 (ref. 82)), Sadeh & Clopath (rate network; eLife, 2020), and Mackwood et al. (rate network with plasticity; eLife, 2021). The current manuscript adds to this evidence by showing that (a particular implementation of) efficient coding in spiking networks leads to structured connectivity. The fact that this structured connectivity then explains perturbation responses is, in the light of earlier findings, not new.

      We agree that the main contribution of our manuscript in this respect is to show how efficient coding in spiking networks can lead to structured connectivity similar to that proposed in the above papers. We apologize if this was not clear enough in the previous version. We will make it clearer in revision. We nevertheless think it useful to report the effects of perturbations within this network, because the structure derived in our network is not identical to those studied in the above papers, and because these results give information about how lateral inhibition works in this network. Thus, we will keep presenting them in the revised version, although we will de-emphasize and simplify their presentation to give more emphasis to the novelty of the derivation of this connectivity rule from the principles of efficient coding.

      (3) The model's limitations are hard to discern, being relegated to the manuscript's last and rather equivocal paragraph. For instance, the lack of recurrent excitation, crucial in neural dynamics and computation, likely influences the results: neuronal time constants must be as large as the target readout (Figure 4), presumably because the network cannot integrate the signal without recurrent excitation. However, this and other results are not presented in tandem with relevant caveats.

      We will improve the Limitations paragraph in Discussion, and also anticipate caveats in tandem with results when needed, as suggested.

      (4) On repeated occasions, results from the model are referred to as predictions claimed to match the data. A prediction is a statement about what will happen in the future - but most of the "predictions" from the model are actually findings that broadly match earlier experimental results, making them "postdictions".

      This distinction is important: compared to postdictions, predictions are a much stronger test because they are falsifiable. This is especially relevant given (my impression) that key parameters of the model were tweaked to match the data.

      We will better distinguish between pre- and post-dictions in revision.

      Reviewer #2 (Public Review):

      Summary: In this work, the authors present a biologically plausible, efficient E-I spiking network model and study various aspects of the model and its relation to experimental observations. This includes a derivation of the network into two (E-I) populations, the study of single-neuron perturbations and lateral-inhibition, the study of the effects of adaptation and metabolic cost, and considerations of optimal parameters. From this, they conclude that their work puts forth a plausible implementation of efficient coding that matches several experimental findings, including feature-specific inhibition, tight instantaneous balance, a 4 to 1 ratio of excitatory to inhibitory neurons, and a 3 to 1 ratio of I-I to E-I connectivity strength. It thus argues that some of these observations may come as a direct consequence of efficient coding.

      Strengths:

      While many network implementations of efficient coding have been developed, such normative models are often abstract and lacking sufficient detail to compare directly to experiments. The intention of this work to produce a more plausible and efficient spiking model and compare it with experimental data is important and necessary in order to test these models.

      In rigorously deriving the model with real physical units, this work maps efficient spiking networks onto other more classical biophysical spiking neuron models. It also attempts to compare the model to recent single-neuron perturbation experiments, as well as some long-standing puzzles about neural circuits, such as the presence of separate excitatory and inhibitory neurons, the ratio of excitatory to inhibitory neurons, and E/I balance. One of the primary goals of this paper, to determine if these are merely biological constraints or come from some normative efficient coding objective, is also important.

      Though several of the observations have been reported and studied before (see below), this work arguably studies them in more depth, which could be useful for comparing more directly to experiments.

      Thanks for these insights and for the kind words of appreciation of the strengths of our work.

      Weaknesses:

      Though the text of the paper may suggest otherwise, many of the modeling choices and observations found in the paper have been introduced in previous work on efficient spiking models, thereby making this work somewhat repetitive and incremental at times. This includes the derivation of the network into separate excitatory and inhibitory populations, discussion of physical units, comparison of voltage versus spike-timing correlations, and instantaneous E/I balance, all of which can be found in one of the first efficient spiking network papers (Boerlin et al. 2013), as well as in subsequent papers. Metabolic cost and slow adaptation currents were also presented in a previous study (Gutierrez & Deneve 2019). Though it is perfectly fine and reasonable to build upon these previous studies, the language of the text gives them insufficient credit.

      We will improve the text to make sure that credit to previous studies is more precisely and more clearly given.

      Furthermore, the paper makes several claims of optimality that are not convincing enough, as they are only verified by a limited parameter sweep of single parameters at a time, are unintuitive and may be in conflict with previous findings of efficient spiking networks. This includes the following. Coding error (RMSE) has a minimum at intermediate metabolic cost (Figure 5B), despite the fact that intuitively, zero metabolic cost would indicate that the network is solely minimizing coding error and that previous work has suggested that additional costs bias the output. Coding error also appears to have a minimum at intermediate values of the ratio of E to I neurons (effectively the number of I neurons) and the number of encoded variables (Figures 6D, 7B). These both have to do with the redundancy in the network (number of neurons for each encoded variable), and previous work suggests that networks can code for arbitrary numbers of variables provided the redundancy is high enough (e.g., Calaim et al. 2022). Lastly, the performance of the E-I variant of the network is shown to be better than that of a single cell type (1CT: Figure 7C, D). Given that the E-I network is performing a similar computation as to the 1CT model but with more neurons (i.e., instead of an E neuron directly providing lateral inhibition to its neighbor, it goes through an interneuron), this is unintuitive and again not supported by previous work. These may be valid emergent properties of the E-I spiking network derived here, but their presentation and description are not sufficient to determine this.

      We are addressing this issue in two ways. First, we will present results of joint sweeps over pairs of parameters whose joint variation is expected to influence optimality in a way that cannot be understood by varying one parameter at a time. Namely, we plan to vary jointly the noise intensity and the metabolic constant, as well as the ratio of E to I neuron numbers and the ratio of mean I-I to E-I connectivity. Second, we will identify a reasonable and realistic range of variation for each individual parameter, perform a Monte Carlo search for the optimal point within this range, and compare the results with the understanding gained from varying one or two parameters at a time. We will also add the suggested citation to Calaim et al. 2022 in regard to the points discussed above.
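      The two search strategies named in this reply can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `toy_loss` is a hypothetical stand-in for a simulated network loss, and the parameter names, grids, and bounds are not taken from the manuscript. The interaction term in `toy_loss` is what makes the joint optimum depend on both parameters at once.

```python
import itertools
import random


def toy_loss(noise, mu):
    """Hypothetical stand-in for a simulated network loss (not the authors' model)."""
    return (noise - 2.0) ** 2 + (mu - 1.0) ** 2 + 0.5 * noise * mu


def grid_search(noise_values, mu_values, loss_fn):
    """Joint sweep: evaluate every (noise, mu) pair and return the best pair."""
    return min(itertools.product(noise_values, mu_values), key=lambda p: loss_fn(*p))


def random_search(bounds, loss_fn, n_samples, seed=0):
    """Monte Carlo search: sample uniformly inside the bounds, keep the best sample."""
    rng = random.Random(seed)
    samples = [tuple(rng.uniform(lo, hi) for lo, hi in bounds) for _ in range(n_samples)]
    return min(samples, key=lambda p: loss_fn(*p))


# Illustrative grids and bounds only.
best_grid = grid_search([0.0, 1.0, 2.0, 3.0], [0.0, 0.5, 1.0, 1.5], toy_loss)
best_random = random_search([(0.0, 3.0), (0.0, 1.5)], toy_loss, n_samples=2000)
```

      Comparing `best_grid` with `best_random` is one way to check whether a coarse sweep and a random search agree on where the optimum lies.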

      We will improve the comparison between the Excitatory-Inhibitory and the 1-Cell-Type model (see reply to the suggestions of Referee 3 for more details).

      Alternatively, the methodology of the model suggests that ad hoc modeling choices may be playing a role. For example, an arbitrary weighting of coding error and metabolic cost of 0.7 to 0.3, respectively, is chosen without mention of how this affects the results. Furthermore, the scaling of synaptic weights appears to be controlled separately for each connection type in the network (Table 1), despite the fact that some of these quantities are likely linked in the optimal network derivation. Finally, the optimal threshold and metabolic constants are an order of magnitude larger than the synaptic weights (Table 1). All of these considerations suggest one of the following two possibilities. One, the model has a substantial number of unconstrained parameters to tune, in which case more parameter sweeps would be necessary to definitively make claims of optimality. Or two, parameters are being decoupled from those constrained by the optimal derivation, and the optima simply correspond to the values that should come out of the derivation.

      In the previously submitted manuscript we presented both the encoding error and the metabolic cost separately as a function of the parameters, so that readers could get an understanding of how stable optimal parameters would be to the change of the relative weighting of encoding error and metabolic cost. We will improve this work by adding the suggested calculations to provide quantitative measures of the dependence of the optimal network parameters and configurations on this relative weighting.
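      The dependence of the optimum on the relative weighting can be made concrete with a small sketch. The error and cost values below are made-up illustrative numbers, not results from the manuscript; only the 0.7 to 0.3 weighting comes from the review. The function names are hypothetical.

```python
def weighted_loss(error, cost, g):
    """Convex combination of encoding error and metabolic cost with weight g on the error."""
    return g * error + (1.0 - g) * cost


def best_parameter(params, errors, costs, g):
    """Return the parameter value that minimizes the weighted loss for a given g."""
    losses = [weighted_loss(e, c, g) for e, c in zip(errors, costs)]
    best_index = min(range(len(params)), key=losses.__getitem__)
    return params[best_index]


# Illustrative toy curves: error falls and cost rises as the parameter grows.
params = [1, 2, 3, 4]
errors = [0.9, 0.5, 0.3, 0.25]
costs = [0.1, 0.2, 0.5, 0.9]

optimum_by_weight = {g: best_parameter(params, errors, costs, g) for g in (0.0, 0.3, 0.7, 1.0)}
```

      In this toy case the selected parameter moves from the cheapest setting to the most accurate one as `g` increases, which is the kind of sensitivity the reviewer asks to see reported.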

      Reviewer #3 (Public Review):

      Summary: In their paper the authors tackle three things at once in a theoretical model: how can spiking neural networks perform efficient coding, how can such networks limit the energy use at the same time, and how can this be done in a more biologically realistic way than previous work?

      They start by working from a long-running theory on how networks operating in a precisely balanced state can perform efficient coding. First, they assume split networks of excitatory (E) and inhibitory (I) neurons. The E neurons have the task to represent some lower dimensional input signal, and the I neurons have the task to represent the signal represented by the E neurons. Additionally, the E and I populations should minimize an energy cost represented by the sum of all spikes. All this results in two loss functions for the E and I populations, and the networks are then derived by assuming E and I neurons should only spike if this improves their respective loss. This results in networks of spiking neurons that live in a balanced state, and can accurately represent the network inputs.

      They then investigate in-depth different aspects of the resulting networks, such as responses to perturbations, the effect of following Dale's law, spiking statistics, the excitation (E)/inhibition (I) balance, optimal E/I cell ratios, and others. Overall, they expand on previous work by taking a more biological angle on the theory and showing the networks can operate in a biologically realistic regime.
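      The spiking rule in this summary (a neuron fires only if the spike lowers the loss) can be sketched in a heavily simplified form. The sketch below assumes a single population, a single time step, a squared coding error, and a metabolic cost of `mu` per spike; it does not reproduce the separate E and I losses or the dynamics of the paper, and all names are hypothetical.

```python
def loss(signal, estimate, n_spikes, mu):
    """Squared coding error plus a metabolic cost proportional to the spike count."""
    error = sum((s - e) ** 2 for s, e in zip(signal, estimate))
    return error + mu * n_spikes


def greedy_spike(signal, estimate, decoding_weights, mu):
    """Return the index of the neuron whose spike lowers the loss the most, or None."""
    best_index = None
    best_loss = loss(signal, estimate, 0, mu)  # loss if nobody spikes
    for i, weights in enumerate(decoding_weights):
        # A spike from neuron i shifts the estimate by its decoding weights.
        candidate = [e + w for e, w in zip(estimate, weights)]
        candidate_loss = loss(signal, candidate, 1, mu)
        if candidate_loss < best_loss:
            best_index, best_loss = i, candidate_loss
    return best_index


weights = [[0.5, 0.0], [-0.5, 0.0], [0.0, 0.5]]
chosen = greedy_spike([1.0, 0.0], [0.0, 0.0], weights, mu=0.1)
silent = greedy_spike([0.0, 0.0], [0.0, 0.0], weights, mu=0.1)
```

      When the estimate already matches the signal, every candidate spike only adds error and cost, so no neuron fires.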

      Strengths:

      (1) The authors take a much more biological angle on the efficient spiking networks theory than previous work, which is an essential contribution to the field.

      (2) They make a very extensive investigation of many aspects of the network in this context, and do so thoroughly.

      (3) They put sensible constraints on their networks, while still maintaining the good properties these networks should have.

      Thanks for this summary and for these kind words of appreciation of the strengths of our work.

      Weaknesses:

      (1) The paper has somewhat overstated the significance of their theoretical contributions, and should make much clearer what aspects of the derivations are novel. Large parts were done in very similar ways in previous papers. Specifically: the split into E and I neurons was also done in Boerlin et al (2008) and in Barrett et al (2016). Defining the networks in terms of realistic units was already done by Boerlin et al (2008). It would also be worth it to discuss Barrett et al (2016) specifically more, as there they also use split E/I networks and perform biologically relevant experiments.

      We will improve the text to make sure that credit to previous studies is more precisely and more clearly given.

      (2) It is not clear from an optimization perspective why the split into E and I neurons and following Dale's law would be beneficial. While the constraints of Dale's law are sensible (splitting the population in E and I neurons, and removing any non-Dalian connection), they are imposed from biology and not from any coding principles. A discussion of how this could be done would be much appreciated, and in the main text, this should be made clear.

      We indeed removed non-Dalian connections because having only connections respecting Dale’s law is a major constraint for biological plausibility. Our logic was to consider efficient coding within the space of networks that satisfy this (and other) biological plausibility constraints. We did not intend to claim that removing the non-Dalian connections was the result of an analytical optimization. However, to get better insights into how Dale’s Law constrains or influences the design of efficient networks, we added a comparison of the coding properties of networks that either do or do not satisfy Dale’s law. We apologize if this was not sufficiently clear in the previous version and we will clarify this in revision. 

      (3) Related to the previous point, the claim that the network with split E and I neurons has a lower average loss than a 1 cell-type (1-CT) network seems incorrect to me. Only the E population coding error should be compared to the 1-CT network loss, or the sum of the E and I populations (not their average). In my author recommendations, I go more in-depth on this point.

      We will perform the suggested detailed comparisons between the network loss in the 1CT-model and E-I model and then revise or refine conclusions if and as needed, according to the results we will obtain.

      (4) While the paper is supposed to bring the balanced spiking networks they consider in a more experimentally relevant context, for experimental audiences I don't think it is easy to follow how the model works, and I recommend reworking both the main text and methods to improve on that aspect.

      We will try to make the presentation of the model more accessible to a non-computational audience.

      Assessment and context: Overall, although much of the underlying theory is not necessarily new, the work provides an important addition to the field. The authors succeeded well in their goal of making the networks more biologically realistic, and incorporating aspects of energy efficiency. For computational neuroscientists, this paper is a good example of how to build models that link well to experimental knowledge and constraints, while still being computationally and mathematically tractable. For experimental readers, the model provides a clearer link between efficient coding spiking networks to known experimental constraints and provides a few predictions.

      Thanks for these kind words. We will make sure that these points emerge more clearly and in a more accessible way from the revised paper.

    1. Reviewer #2 (Public Review):

      Summary:

      This paper attempts to examine how rare, extreme events impact decision-making in rats. The paper used an extensive behavioural study with rats to evaluate how the probability and magnitude of outcomes impact preference. The paper, however, provides limited evidence for the conclusions because the design did not allow for the isolation of the rare, extreme events in choice. There are many confounding factors, including the outcome variance and presence of less-rare, and less-extreme outcomes in the same conditions.

      Strengths:

      (1) The major strength of the paper is the significant volume of behavioural data with a reasonable sample size of 20 rats.

      (2) The paper attempts to examine losses with rats (a notoriously tricky problem with non-human animals) by substituting time-outs as a proxy for losses. This allows for mixed gambles that have both gain and loss possible outcomes.

      (3) The paper integrates both a behavioural and a modelling approach to get at the factors that drive decision-making.

      (4) The paper takes seriously the question of what it means for an event to be rare, pushing to less frequent outcomes than usually used with non-human animals.

      Weaknesses:

      (1) The primary issue with this work is that the primary experimental manipulation fails to isolate the rare, extreme events in choice. As I understand the task, in all the conditions with a rare extreme event (e.g., 80 pellets with probability epsilon), there is also a less-rare, less-extreme event (e.g., 12 pellets with probability 5). In addition, the variance differs between the two conditions. So, any impact attributable to the rare, extreme event could be due to the less rare event or due to the difference in the variance. The design does not support the conclusions. Finally, by deliberately confounding rarity and extremity, the design does not allow for assessing the impact of either aspect.

      (2) The RL-modelling work also fails to show a specific impact of the rare extreme event. As best as I can understand Eq 2, the model provides a free parameter that adds a bonus to the value of either the two options with high-variance gains (A and V in the paper) or to the two options with high-variance losses (F and V in the paper). This parameter only depends on whether this option could have possibly yielded the rare, extreme outcome (i.e., based on the generative probability) and was not connected to its actual appearance. That makes it a free parameter that just bumps up (or down) the probability of selecting a pair of options. In the case of the "black swan" or high-variance loss conditions, this seems very much like a loss aversion parameter, but an additive one instead of a multiplicative one.

      (3) The paper presented the methods and results with lots of neologisms and fairly obscure jargon (e.g., fragility, total REE sensitivity). That made it very hard to decipher exactly what was done and what was found. For example, on p. 4, the use of concave and convex was very hard to decipher; the text even has to repeat itself 3 times (i.e., "to repeat" and "in other words") and is still not clear. It would be much clearer (and probably accurate) to say that the options varied along the variance dimension, separately for gains and losses. Option A was low-variance gains and losses. Option B was low-variance losses and high-variance gains. Option C was high-variance losses and low-variance gains, and Option D was high-variance losses and gains. That tells much more clearly what the animals experienced without the reader having to master a set of new terminologies around fragility and robustness, which brings a set of theoretical assumptions unnecessarily into the description of the experimental design. In terms of results, "Black Swan" avoidance is more simply known as risk aversion for losses.

      (4) Were the probabilities shuffled or truly random (they seem to be fixed sequences, so neither)? What were the experienced probabilities? Given the fixed sequences, these experienced ("ex-post") probabilities could differ tremendously from the scheduled ("ex-ante") probabilities. It's quite possible that an animal never experienced the rare, extreme event for a specific option. It's even possible (if they only picked it on the 10th/60th choices by chance) that they only ever experienced that rare extreme event. This cannot be known given the information provided. The Supplemental info on p. 55 only gives gross overall numbers but does not indicate what the rats experienced for each choice/option, which is what matters here. A simple table that indicates, for each of the 4 options, how often they were selected and how often the animals experienced each of the 6-8 possible outcomes would make it much clearer how closely the experience matched the planned outcomes. In addition, by restricting the rare outcome to either the 10th or 60th activations in a session, these are not random. Did the animals learn this association?

      (5) The choice data are only presented in an overprocessed fashion with a sum and a difference (in both figures and tables). The basic datum (probability/frequency of selecting each of the 4 options) is not provided directly, even if it can theoretically be inferred from the sum and the difference. To understand what the rats actually do, we first need to see how often they select each option, without these transformations.

      (6) There is insufficient detail provided on the inferential statistical tests (e.g., no degrees of freedom or effect sizes), and only limited information on exactly what tests were run and how (bootstrapping, but little detail). Without code or data (only summary information is provided in the supplement), this is difficult to evaluate. In addition, the studies seem not to be pre-registered in any way, leaving many researcher degrees of freedom. Were any alternative analysis pipelines attempted? Similarly, there were many sub-groupings of the animals, and then comparisons between them. Were these post-hoc?

      (7) On p. 17, there is an attempt to look at the impact of a rare, extreme event by plotting a measure of preference for the 10 trials before/after the rare, extreme event. In the human literature, the main impact of experiencing a rare, extreme event is what is known as the wavy recency effect (See Plonsky et al. 2015 in Psych Review for example). What this means is that there tends to be some immediate negative recency (e.g., avoiding a rare gain) followed by positive recency (e.g., chasing the rare gain). Using a 10-trial window would thus obscure any impact of this rare, extreme event. An analysis that looks at a time course trial-by-trial could reveal any impact.

      (8) As I understood the method (p. 31), the assignment of options to physical locations was not random or counterbalanced, but deliberately biased to have one of the options in the preferred location. This would seem to create a bias towards a particular option and a bias away from the other options, which confounds the preference data in subsequent analyses.

      (9) Are delays really losses? This is a big assumption. Magnitude and delay are different aspects of experience, which are not necessarily commensurable and can be manipulated independently. And, for the model, how were these delays transformed into outcomes for the model? Eq 1 skips over that. Is there an assumption of linearity? In addition, I was not wholly clear if the delays meant fewer trials in a session or if the delays merely extended the session and meant longer delays until the next choice period.

      (10) The paper does not represent the existing literature on human risky decision-making (with and without rare events) accurately enough. Here are a few examples of misrepresented and/or missing literature:

      - Most studies on decision-making do not only rely on p > 10% (as per p. 2). Maybe that is true with animals, but not a fair statement generally. Some do, and some don't. There is substantial literature looking at rarer events in both descriptions (most famously with Kahneman & Tversky's work), but also in experience (which is alluded to in reference 19). That reference is not only about the situation when choices are not repeated (e.g. the sampling paradigm), but also partial feedback and full-feedback situations.

      The literature on learning from rewarding experiences in humans is obliquely referenced but not really incorporated. In short, there are two main findings: first, people underweight rare events in experience; second, people overweight extreme outcomes in experience (both contrary to description). Some related papers are cited, but their content is not used or incorporated into the logic of the manuscript.

      One recent study systematically examined rarity and extremity in human risky decision-making, which seems very relevant here: Mason et al. (2024). Rare and extreme outcomes in risky choice. Psychonomic Bulletin & Review, 31, 1301-1308.

      There is a fair bit of research on the human perception of the risk of rare events (including from experience) and important events like climate. One notable paper is Newell et al (2015) in Nature Climate Change.
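      The gap between scheduled and experienced probabilities raised in weakness (4) can be illustrated with a short sketch. It assumes, for illustration only, that the rare outcome is delivered on the 10th and 60th activation of a given option; the review does not settle whether activations are counted per option or per session, and the function name is hypothetical.

```python
def experienced_rare_rate(n_activations, rare_positions=(10, 60)):
    """Fraction of an option's activations that delivered the rare outcome.

    Assumes the rare outcome is tied to fixed activation counts (an assumption).
    """
    if n_activations <= 0:
        return None  # option never chosen, so nothing was experienced
    hits = sum(1 for position in rare_positions if position <= n_activations)
    return hits / n_activations


# An animal that picks the option 9 times never meets the rare outcome,
# while one that picks it exactly 10 times experiences it at a rate of 0.1.
rates = {n: experienced_rare_rate(n) for n in (0, 9, 10, 60)}
```

      Tabulating such rates per animal and per option would give the kind of experienced-outcome table the reviewer requests.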

    1. Reviewer #1 (Public Review):

      Summary:

      In this study, the authors identified and described the transcriptional trajectories leading to CMs during early mouse development, and characterized the epigenetic landscapes that underlie early mesodermal lineage specification.

      The authors identified two transcriptomic trajectories from a mesodermal population to cardiomyocytes, the MJH and PSH trajectories. These trajectories are relevant to the current model for the First Heart Field (FHF) and the Second Heart Field (SHF) differentiation. Then, the authors characterized both gene expression and enhancer activity of the MJH and PSH trajectories, using a multiomics analysis. They highlighted the role of Gata4, Hand1, Foxf1, and Tead4 in the specification of the MJH trajectory. Finally, they performed a focused analysis of the role of Hand1 and Foxf1 in the MJH trajectory, showing their mutual regulation and their requirement for cardiac lineage specification.

      Strengths:

      The authors performed an extensive transcriptional and epigenetic analysis of early cardiac lineage specification and differentiation which will be of interest to investigators in the field of cardiac development and congenital heart disease. The authors considered the impact of the loss of Hand1 and Foxf1 in-vitro and Hand1 in-vivo.

      Weaknesses:

      The authors used previously published scRNA-seq data to generate two described transcriptomic trajectories.

      (1) Details of the re-analysis step should be added, including a careful characterization of the different clusters and marker genes, more details on the WOT analysis, and details on the time stamp distribution along the different pseudotimes. These details would be important to allow readers to gain confidence that the two major trajectories identified are realistic interpretations of the input data.

      The authors have also renamed the cardiac trajectories/lineages, departing from the convention applied in hundreds of papers, making the interpretation of their results challenging.

      (2) The concept of "reverse reasoning" applied to the Waddington-OT package for directional mass transfer is not adequately explained. While the authors correctly acknowledged Waddington-OT's ability to model cell transitions from ancestors to descendants (using optimal transport theory), the justification for using a "reverse reasoning" approach is missing. Clarifying the rationale behind this strategy would be beneficial.

      (3) As the authors used the EEM cell cluster as a starting point to build the MJH trajectory, it's unclear whether this trajectory truly represents the cardiac differentiation trajectory of the FHF progenitors:

      - This strategy infers that the FHF progenitors are mixed in the same cluster as the extra-embryonic mesoderm, but no specific characterization of potential different cell populations included in this cluster was performed to confirm this.

      - The authors identified the EEM cluster as a Juxta-cardiac field, without showing the expression of the principal marker Mab21l2 per cluster and/or on UMAPs.

      - As the FHF progenitors arise earlier than the Juxta-cardiac field cells, it must be possible to identify an early FHF progenitor population (Nkx2-5+; Mab21l2-) using the time stamp. It would be more accurate to use this FHF cluster as a starting point than the EEM cluster to infer the FHF cardiac differentiation trajectory.

      These concerns call into question the overall veracity of the trajectory analysis, and in fact, the discrepancies with prior published heart field trajectories are noted but the authors fail to validate their new interpretation. Because their trajectories are followed for the remainder of the paper, many of the interpretations and claims in the paper may be misleading. For example, these trajectories are used subsequently for annotation of the multiomic data, but any errors in the initial trajectories could result in errors in multiomic annotation, etc, etc.

      (4) As mentioned in the discussion, the authors identified the MJH and PSH trajectories as non-overlapping. But, the authors did not discuss major previously published data showing that both FHF and SHF arise from a common transcriptomic progenitor state in the primitive streak (DOI: 10.1126/science.aao4174; DOI: 10.1007/s11886-022-01681-w). The authors should consider and discuss the specifics of why they obtained two completely separate trajectories from the beginning, how these observations conflict with prior published work, and what efforts they have made at validation.

      (5) Figures 1D and E are confusing, as it's unclear why the authors selected only cells at E7.0. Also, panels 1D 'Trajectory' and 'Pseudotime' suggest that the CM trajectory moves from the PSH cells to the MJH. This result is confusing, and the authors should explain this observation.

      (6) Regarding the PSH trajectory, it's unclear how the authors can obtain a full cardiac differentiation trajectory from the SHF progenitors as the SHF-derived cardiomyocytes are just starting to invade the heart tube at E8.5 (DOI: 10.7554/eLife.30668).

      The points above note some of the discrepancies between the authors' trajectory analysis and the historical cardiac development literature. Overall, these discrepancies are glossed over and not adequately validated.

      (7) The authors mention analyzing "activated/inhibited genes" from Peng et al. 2019 but didn't specify when Peng's data was collected. Is it temporally relevant to the current study? How can "later stage" pathway enrichment be interpreted in the context of early-stage gene expression?

      (8) Motif enrichment: cluster-specific DAEs were analyzed for motifs, but the authors list specific TFs rather than TF families, which is all that motif enrichment can provide. The authors should either list TF families or state clearly that the specific TFs they list were not validated beyond motifs.

      (9) The core regulatory network is purely predictive. The authors again should refrain from language implying that the TFs in the CRN have any validated role.

      Regarding the in vivo analysis of Hand1 CKO embryos, Figures 6 and 7:

      (10) How can the authors explain the presence of a heart tube in the E9.5 Hand1 CKO embryos (Figure 6B) if, following the authors' model, the FHF/Juxta-cardiac field trajectory is disrupted by Hand1 CKO? A more detailed analysis of the cardiac phenotype of Hand1 CKO embryos would help to assess this question.

      (11) The cell proportion differences observed between Ctrl and Hand1 CKO in Figure 6D need to be replicated and an appropriate statistical analysis must be performed to definitely conclude the impact of Hand1 CKO on cell proportions.

      (12) The in-vitro cell differentiations are unlikely to recapitulate the complexity of the heart fields in-vivo, but they are analyzed and interpreted as if they do.

      (13) The schematic summary of Figure 7F is confusing and should be adjusted based on the following considerations:

      (a) the 'Wild-type' side presents 3 main trajectories (SHF, Early HT and JCF), but uses a 2-color code and the authors described only two trajectories everywhere else in the article (aka MJH and PSH). It's unclear how the SHF trajectory (blue line) can contribute to the Early HT, when the Early HT is supposed to be FHF-associated only (DOI: 10.7554/eLife.30668). As mentioned previously in Major comment 3., this model suggests a distinction between FHF and JCF trajectories, which is not investigated in the article.

      (b) the color code suggests that the MJH (FHF-related) trajectory will give rise to the right ventricle and outflow tract (green line), which is contrary to current knowledge.

      Minor comments:

      (1) How were genes selected to generate Figure 1F? Is this a list of top differentially expressed genes over each pseudotime and/or between pseudotimes?

      (2) Regarding Figure 1G, it's unclear how inhibited signaling can have an increased expression of underlying genes over pseudotimes. Can the authors give more details about this analysis and results?

      (3) How do the authors explain the visible Hand1 expression in Hand1 CKO in Figure S7C 'EEM markers'? Is this an expected expression in terms of RNA which is not converted into proteins?

      (4) The authors do not address the potential presence of doublets (merged cells) within their newly generated dataset. While they mention using "SCTransform" for normalization and artifact removal, it's unclear if doublet removal was explicitly performed.

    1. You will now be able to put into practice what you have just learned about links. Exceptionally for this exercise, the page  a-propos.html  has been moved into a folder  dossier-demo  so that you can test your understanding of relative links.

      How do I open the dossier-demo folder in Visual Studio Code? Please help!
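      How a relative link resolves once a page moves into a subfolder can be checked with a small sketch. The base address `https://example.com/site/` and the page `index.html` are hypothetical placeholders, not part of the course material; only `a-propos.html` and `dossier-demo` come from the exercise text.

```python
from urllib.parse import urljoin


def resolve(page_url, href):
    """Resolve a relative href against the URL of the page that contains it."""
    return urljoin(page_url, href)


# Hypothetical site root used only for illustration.
home = "https://example.com/site/index.html"
about = "https://example.com/site/dossier-demo/a-propos.html"

# From the home page, the link must go down into the folder.
to_about = resolve(home, "dossier-demo/a-propos.html")

# From the moved page, the link back must climb one level with "..".
to_home = resolve(about, "../index.html")
```

      The same rule applies to an `href` attribute in HTML: the path is interpreted relative to the folder of the page that contains the link.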

    1. Every time you want to update your app, save the source file. When you do that, Streamlit detects if there is a change and asks you whether you want to rerun your app. Choose "Always rerun" at the top-right of your screen to automatically update your app every time you change its source code.

      Save and auto update

    1. heating blocked by energy optimising

      Error code: E2 - Time out of external power optimising system - Heating blocked by the external energy optimising system for longer than 2 minutes.

    2. change polarity of mains supply

      Error code: PoL (in time display) / CHnG (in cabinet display)

      • Change Polarity Phase / Neutral (only gas units) (Flame sensing electrode requires mains supply to be the correct polarity)
    1. Reviewer #1 (Public Review):

      Summary:

      In this paper, the authors had 2 aims:

      (1) Measure macaques' aversion to sand and see if its removal is intentional, as sand likely causes an unpleasant sensation and tooth damage.

      (2) Test whether monkeys engage in suboptimal behavior by cleaning foods beyond the point of diminishing returns, and see if this is related to individual traits such as sex and rank, and to behavioral technique.

      They attempted to achieve these aims through a combination of geochemical analysis of sand, field experiments, and comparing predictions to an analytical model.

      The authors' conclusions were that they verified a long-standing assumption that monkeys have an aversion to sand as it contains many potentially damaging fine-grained silicates and that removing it via brushing or washing is intentional.

      They also concluded that monkeys will clean food for longer than is necessary, i.e. beyond the point of diminishing returns, and that this is rank-dependent.

      High and low-ranking monkeys tended not to wash their food, but instead over-brushed it, potentially to minimize handling time and maximize caloric intake, despite the long-term cumulative costs of sand.

      This was interpreted through the *disposable soma hypothesis*, where dominants maximize immediate needs to maintain rank and increase reproductive success at the potential expense of long-term health and survival.

      Strengths:

      The field experiment seemed well-designed, and their quantification of physical and mineral properties of quartz particles (relative to human detection thresholds) seemed good relative to their feret diameter and particle circularity (to a reviewer who is not an expert in sand). The *Rank Determination* and *Measuring Sand* sections were clear.

      In achieving Aim 1, the authors validated a commonly interpreted, but unmeasured function, of macaque and primate behavior-- a key study/finding in primate food processing and cultural transmission research.

      I commend their approach in developing a quantitative model to generate predictions to compare to empirical data for their second aim.

      This is something others should strive for.

      I really appreciated the historical context of this paper in the introduction, and found it very enjoyable and easy to read.

      I do think that interpreting these results in the context of the *disposable soma hypothesis* and the potential implications in the *paleolithic matters* section about interpreting dental wear in the fossil record are worthwhile.

      Weaknesses:

      Most of the weaknesses in this paper lie in statistical methods, visualization, and a missing connection to the marginal value theorem and optimal foraging theory.

      I think all of these weaknesses are solvable.

      The data and code were not submitted. Therefore I was unable to better understand the simulation or to provide useful feedback on the stats, the connection between the two, and its relevance to the broader community.

      (1) Statistics:

      (a) AIC and outcome distributions

      The use of AIC for hierarchical models, and models with different outcome distributions brought up several concerns.

      The authors appear to use AIC to help inform which model to use for their primary analyses in Tables S1 and S2. It is unclear which of these models are analyzed in Tables S3 and S4.

      AIC should not be used on hierarchical models, and something like WAIC (or DIC which has other caveats) would be more appropriate.

      Also, using information criteria on Mixture Models like Negative Binomials (aka Gamma-Poisson) should be done with extreme caution, or not at all, as the values are highly sensitive to the data structure.

      Some researchers also say that information criteria should not be used to compare models with different outcome distributions - although this might be slightly less of a concern as all of your models are essentially variations on a Poisson GLM.

      Discussion on this can be found in McElreath Statistical Rethinking (Section 12.1.3) and Gelman et al. BDA3 (Chapter 7).

      Choosing an outcome distribution, based on your understanding of the data generating process is a better approach than relying on AIC, especially in this context where it can be misleading.

      (b) Zeros

      I also had some concerns about how zeros were treated in the models.

      In lines 217-218, they mentioned that "if a monkey consumed a cucumber slice without brushing or washing it, the zero-second duration was included in both GLMMs."

      This zero implies no processing and should not be treated as a length 0 duration of processing.

      This suggests to me that a zero-inflated Poisson or zero-inflated negative binomial would be the best choice for modelling the data, as it is essentially a 2-step process:<br /> (i) Do they process the cucumber at all?<br /> (ii) If so, do they wash or brush, and how is this predicted by rank and treatment?
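      The two-step process described above can be sketched as an intercept-only zero-inflated Poisson fitted by expectation-maximisation. This is a minimal illustration, not the authors' analysis: the function names, the simulated data, and the parameter values (a 30% chance of no processing and a mean of 4 seconds otherwise) are assumptions, durations are treated as whole-second counts, and the predictors of interest (rank, treatment) are left out for brevity.

```python
import math
import random


def zip_log_likelihood(counts, pi, lam):
    """Log-likelihood of a zero-inflated Poisson.

    With probability pi the outcome is a structural zero (no processing);
    otherwise it is drawn from Poisson(lam).
    """
    ll = 0.0
    for y in counts:
        if y == 0:
            ll += math.log(pi + (1.0 - pi) * math.exp(-lam))
        else:
            ll += math.log(1.0 - pi) - lam + y * math.log(lam) - math.lgamma(y + 1)
    return ll


def fit_zip(counts, n_iter=200):
    """Estimate pi and lam by expectation-maximisation."""
    n = len(counts)
    total = sum(counts)
    n_zero = sum(1 for y in counts if y == 0)
    pi, lam = 0.5 * n_zero / n, max(total / n, 1e-6)
    for _ in range(n_iter):
        # E-step: probability that an observed zero is a structural zero.
        p_zero = pi + (1.0 - pi) * math.exp(-lam)
        z = pi / p_zero
        # M-step: update the mixing weight and the Poisson mean.
        expected_structural = n_zero * z
        pi = expected_structural / n
        lam = total / (n - expected_structural)
    return pi, lam


def sample_poisson(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


# Hypothetical data: 30% of slices are eaten unprocessed; the rest are
# processed for a Poisson-distributed number of seconds with mean 4.
rng = random.Random(1)
simulated = [0 if rng.random() < 0.3 else sample_poisson(rng, 4.0) for _ in range(5000)]
pi_hat, lam_hat = fit_zip(simulated)
print(f"estimated P(no processing) = {pi_hat:.3f}, mean duration = {lam_hat:.3f}")
```

      In a full analysis, both the zero component and the count component would carry predictors and random effects, which calls for a dedicated mixed-model library rather than this hand-rolled fit.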

      (2) Absence of Links to Foraging Theory

      Optimal cleaning time model: the optimality model was not well described including how it was programmed. Better description and documentation of this model, along with code (Mathematica judging from the plot?) is needed.

      There seems to be much conceptual and theoretical overlap with foraging theory models that was not well described, namely the *marginal value theorem (Charnov 1976; Krebs et al. 1974) and its subsequent advances* (see https://doi.org/10.1016/j.jaa.2016.03.002 and https://doi.org/10.1086/283929 for examples).

      In the suggestions, I attached the R code where I replicated their model to show that it is *mathematically identical to the marginal value theorem*. This was not mentioned at all in the text or citations.

      This literature has been well studied since the 1970s, and there is a history of studies that compare behavior to an optimality model and find instances where animals conform to or diverge from its predictions (https://doi.org/10.1146/annurev.es.15.110184.002515). This link should be highlighted, and interpreting the results in that theoretical context will make them more broadly applicable to behavioral ecologists.
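      For readers unfamiliar with the marginal value theorem, a minimal sketch follows. It is not the manuscript's model or the reviewer's R code. It assumes a saturating removal curve f(t) = t / (t + h), where h is a half-saturation constant, and a fixed time cost c per food item; both the functional form and the numeric values are illustrative assumptions. Under them, the long-run rate f(t) / (c + t) is maximised where the marginal gain f'(t) equals the average rate, which gives t* = sqrt(c * h).

```python
import math


def removal_fraction(t, h):
    """Assumed saturating curve: fraction of sand removed after t seconds."""
    return t / (t + h)


def gain_rate(t, c, h):
    """Long-run rate of gain: benefit per item over total time per item."""
    return removal_fraction(t, h) / (c + t)


def marginal_gain(t, h):
    """Derivative of removal_fraction with respect to t."""
    return h / (t + h) ** 2


def optimal_cleaning_time(c, h):
    """Closed-form optimum for this curve, from marginal gain = average rate."""
    return math.sqrt(c * h)


def numeric_optimum(c, h, t_max=60.0, step=0.01):
    """Grid search over cleaning times as a check on the closed form."""
    best_t, best_rate = 0.0, 0.0
    for i in range(1, int(t_max / step) + 1):
        t = i * step
        rate = gain_rate(t, c, h)
        if rate > best_rate:
            best_t, best_rate = t, rate
    return best_t


# Hypothetical values, chosen only to make the arithmetic easy to follow.
c, h = 10.0, 2.5
t_closed = optimal_cleaning_time(c, h)
t_grid = numeric_optimum(c, h)
print(f"closed form: {t_closed:.2f} s, grid search: {t_grid:.2f} s")
```

      Cleaning for longer than t* under these assumptions lowers the long-run rate, which is the sense in which observed cleaning times can be compared with a give-up time.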

      The data was subsetted to include instances where there were < 3 monkeys present to avoid confounds of rank, but it is important to know that optimal behavior might vary by individual, and can change in a social context depending on rank (see https://doi.org/10.1016/j.tree.2022.06.010). Discussion of this, and further exploration of it in the data would strengthen the overall contribution of this manuscript to the field, but I understand that the researchers wish to avoid that in this paper for it is a complex topic, which this dataset is uniquely suited to address.

      (3) Interpretation and validity of model relative to data

      In lines 92-102, they present summary statistics (I think) showing that time spent brushing and washing is consistent with washing or brushing to remove sand.

      In the **mitigating tooth wear** section (line 73) and corresponding Figure S1 showing surface sand removed, more detail about how these numbers were acquired, and statistical modelling, is needed.

      This is important as uncertainty and measurement error around these metrics are key to the central finding and interpretation of Aim 2 in this paper.

      It appears that the researchers simulated the monkey's brushing and washing behaviors (similar to https://doi.org/10.1007/s10071-009-0230-3).

      How many researchers simulated monkey behavior and how many times?

      What are the repeat points in Figure S1?

      What is the number of trials or number of people?

      This effect appears stronger for washing than brushing as well - if so, why?

      More info about this data, and the uncertainty in this is important, as it is key to the second central claim of this paper.

      The estimates of removing between 76% +/- 7 and 93% +/- 4 of sand (visualized in Figure S1), are statistical estimates.

      I would find the argument more convincing if the conclusion that food processing continues for too long, beyond the point of diminishing returns, still held after propagating the uncertainty in handling time, sand removal rates, and the corresponding half-saturation constants.<br /> It is very possible that after accounting for uncertainty and variation in handling time and removal rates, the second result may not hold true.

      I was not able to convince myself of this via reanalysis as the description of the data in the text was not enough to simulate it myself.

      Essentially, this would imply that in Figure 3 the predicted value would have some variation around it (informed by boundary conditions of time being positive, and percents having floors and ceilings) and that a range of predicted cleaning times (optimal give-up times) would be plotted in Figure 3.

      This could be accomplished with a Bayesian approach, or by simply plotting multiple predictions given some confidence interval around c and h.
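      One simple way to produce such a range is Monte Carlo propagation, sketched below. The sketch assumes that c and h enter the prediction through t* = sqrt(c * h) (a stand-in for the manuscript's model, which is not reproduced here) and that each parameter is uncertain on a log-normal scale, which keeps the draws positive. The medians and spreads are hypothetical, not estimates from the data.

```python
import math
import random


def optimal_cleaning_time(c, h):
    """Assumed prediction rule: optimal give-up time for cost c and constant h."""
    return math.sqrt(c * h)


def percentile(sorted_values, q):
    """Nearest-rank percentile of an already sorted list, with q in [0, 1]."""
    index = int(round(q * (len(sorted_values) - 1)))
    return sorted_values[index]


def propagate(c_median, h_median, c_sigma, h_sigma, n_draws=10000, seed=0):
    """Draw c and h from log-normal distributions and summarise the predictions."""
    rng = random.Random(seed)
    draws = sorted(
        optimal_cleaning_time(
            rng.lognormvariate(math.log(c_median), c_sigma),
            rng.lognormvariate(math.log(h_median), h_sigma),
        )
        for _ in range(n_draws)
    )
    return percentile(draws, 0.025), percentile(draws, 0.5), percentile(draws, 0.975)


# Hypothetical medians and log-scale standard deviations.
low, mid, high = propagate(10.0, 2.5, 0.2, 0.2)
print(f"predicted cleaning time: {mid:.2f} s (95% interval {low:.2f} to {high:.2f} s)")
```

      The observed cleaning times could then be compared with the interval rather than with a single predicted value.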

    1. the

      Dear Writer, I understand that your concept is the idea of code-switching and the problems that come with it. You expose the reader to this idea through experiences in your life. It is relevant that you explained why people choose to code-switch, but also why it is important to be yourself. Your main idea doesn’t rely solely on your experiences, but also on the experiences of many over generations. One thing you have done very well is using real experiences to show a bigger idea. This helps show the many aspects of your topic. It also adds a direct human connection, so even if the reader hasn’t directly experienced code-switching, they understand what it entails and why it is done. You do a great job sustaining your main idea throughout the paper, while also adding more aspects to it. A great example of this is when you state that people with differences can co-exist without needing to suppress those differences, and that code-switching can be used to communicate effectively with different people without being a permanent change. It’s also great that you then developed this idea further by tying it to Dartmouth. One thing you could improve is to provide an example pertaining specifically to Dartmouth. This would allow a reader to make more connections. Another is to develop the future actions of Dartmouth further, meaning that you can explain specific steps the school can take in the future that would allow more of this “good” code-switching, while also greatly diminishing the need for students to code-switch in a “bad” way. Finally, you could break up a few sentences that use a lot of commas. They do make sense and have a logical progression, but as shorter statements they may hold more weight. Before reading, I was unaware of the extent of the role that code-switching plays in today’s world, simply because it is something that hasn’t affected me very much. However, now I am more aware of what code-switching means and how it can affect people’s lives.

    1. Users working with Microsoft Office 365, and therefore Outlook, very often noticed that login was not possible. Closer analysis showed that the MS/Bing crawlers are particularly persistent and repeatedly call the reset links, regardless of server configuration or the like. For this reason, a text field was implemented in the backend via the Drupal State API, in which selected user agents (always one per line) can be entered. 'Shy One Time' checks requests against these entries; on a match, the request is redirected to the login form with a 302 status code and the reset link is not invalidated.
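      The check described above can be sketched in a language-neutral way. The Python below is not the module's code (the module itself is a Drupal, and therefore PHP, project); the function names, the case-insensitive substring match, the /user/login path, and the returned dictionary are all assumptions made for illustration.

```python
def parse_blocked_agents(text):
    """Split the backend text field into one lower-cased user agent per line."""
    return [line.strip().lower() for line in text.splitlines() if line.strip()]


def handle_reset_request(user_agent, blocked_agents):
    """Decide what to do with a request for a one-time reset link.

    A request from a listed user agent is redirected to the login form and
    leaves the link valid; any other request consumes the link as usual.
    """
    agent = (user_agent or "").lower()
    if any(blocked in agent for blocked in blocked_agents):
        # Assumed login path and response shape, for illustration only.
        return {"status": 302, "location": "/user/login", "invalidate_link": False}
    return {"status": 200, "location": None, "invalidate_link": True}


blocked = parse_blocked_agents("bingbot\n\nBingPreview\n")
crawler = handle_reset_request("Mozilla/5.0 (compatible; bingbot/2.0)", blocked)
person = handle_reset_request("Mozilla/5.0 (X11; Linux x86_64) Firefox/128.0", blocked)
print(crawler["status"], person["invalidate_link"])
```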
    1. Scientific research’s dependence on falsifiability allows for great confidence in the information that it produces. Typically, by the time information is accepted by the scientific community, it has been tested repeatedly.

      I'm not surprised that scientific research depends on falsifiability for reliable results. The link I supplied discusses how the theory of gravity was repeatedly tested and challenged before becoming widely accepted. https://www.cfa.harvard.edu/research/science-field/einsteins-theory-gravitation

    1. Transclusion facilitates modular design (using the "single source of truth" model, whether in data, code, or content): a resource is stored once and distributed for reuse in multiple documents. Updates or corrections to a resource are then reflected in any referencing documents.
    1. First, the complexity of modern federal criminal law, codified in several thousand sections of the United States Code and the virtually infinite variety of factual circumstances that might trigger an investigation into a possible violation of the law, make it difficult for anyone to know, in advance, just when a particular set of statements might later appear (to a prosecutor) to be relevant to some such investigation.

      If the federal government had access to every email you’ve ever written and every phone call you’ve ever made, it’s almost certain that they could find something you’ve done which violates a provision in the 27,000 pages of federal statutes or 10,000 administrative regulations. You probably do have something to hide, you just don’t know it yet.

    1. Reviewer #2 (Public Review):

      Summary:

      The manuscript by Kelbert et al. presents results on the involvement of the yeast transcription factor Sfp1 in the stabilisation of transcripts whose synthesis it stimulates. Sfp1 is known to affect the synthesis of a number of important cellular transcripts, such as many of those that code for ribosomal proteins. The hypothesis that a transcription factor can remain bound to the nascent transcript and affect its cytoplasmic half-life is attractive. However, the association of Sfp1 with cytoplasmic transcripts remains to be validated, as explained in the following comments:

      A two-hybrid based assay for protein-protein interactions identified Sfp1, a transcription factor known for its effects on ribosomal protein gene expression, as interacting with Rpb4, a subunit of RNA polymerase II. Classical two-hybrid experiments depend on the presence of the tested proteins in the nucleus of yeast cells, suggesting that the observed interaction occurs in the nucleus. Unfortunately, the two-hybrid method cannot determine whether the interaction is direct or mediated by nucleic acids. The revised version of the manuscript now states that the observed interaction could be indirect.

      To understand to which RNA Sfp1 might bind, the authors used an N-terminally tagged fusion protein in a cross-linking and purification experiment. This method identified 264 transcripts for which the CRAC signal was considered positive and which mostly correspond to abundant mRNAs, including 74 ribosomal protein mRNAs or metabolic enzyme-abundant mRNAs such as PGK1. The authors did not provide evidence for the specificity of the observed CRAC signal, in particular what would be the background of a similar experiment performed without UV cross-linking. This is crucial, as Figure S2G shows very localized and sharp peaks for the CRAC signal, often associated with over-amplification of weak signal during sequencing library preparation.

      In a validation experiment, the presence of several mRNAs in a purified SFP1 fraction was measured at levels that reflect the relative levels of RNA in a total RNA extract. Negative controls showing that abundant mRNAs not found in the CRAC experiment were clearly depleted from the purified fraction with Sfp1 would be crucial to assess the specificity of the observed protein-RNA interactions (to complement Fig. 2D). The CRAC-selected mRNAs were enriched for genes whose expression was previously shown to be upregulated upon Sfp1 overexpression (Albert et al., 2019). The presence of unspliced RPL30 pre-mRNA in the Sfp1 purification was interpreted as a sign of co-transcriptional assembly of Sfp1 into mRNA, but in the absence of valid negative controls, this hypothesis would require further experimental validation. Also, whether the fraction of mRNA bound by Sfp1 is nuclear or cytoplasmic is unclear.

      To address the important question of whether co-transcriptional assembly of Sfp1 with transcripts could alter their stability, the authors first used a reporter system in which the RPL30 transcription unit is transferred to vectors under different transcriptional contexts, as previously described by the Choder laboratory (Bregman et al. 2011). While RPL30 expressed under an ACT1 promoter was barely detectable, the highest levels of RNA were observed in the context of the native upstream RPL30 sequence when Rap1 binding sites were also present. Sfp1 showed better association with reporter mRNAs containing Rap1 binding sites in the promoter region. Removal of the Rap1 binding sites from the reporter vector also led to a drastic decrease in reporter mRNA levels. Co-purification of reporter RNA with Sfp1 was only observed when Rap1 binding sites were included in the reporter. Negative controls for all the purification experiments might be useful.

      To complement the biochemical data presented in the first part of the manuscript, the authors turned to the deletion or rapid depletion of SFP1 and used labelling experiments to assess changes in the rate of synthesis, abundance and decay of mRNAs under these conditions. An important observation was that in the absence of Sfp1, mRNAs encoding ribosomal protein genes not only had a reduced synthesis rate, but also an increased degradation rate. This important observation needs careful validation, as genomic run-on experiments were used to measure half-lives, and this particular method was found to give results that correlated poorly with other measures of half-life in yeast (e.g. Chappelboim et al., 2022 for a comparison). As an additional validation, a temperature shift to 42 °C was used to show that, for specific ribosomal protein mRNA, the degradation was faster, assuming that transcription stops at that temperature. It would be important to cite and discuss the work from the Tollervey laboratory showing that a temperature shift to 42 °C leads to a strong and specific decrease in ribosomal protein mRNA levels, probably through an accelerated RNA degradation (Bresson et al., Mol Cell 2020, e.g. Fig 5E). Finally, the conclusion that mRNA deadenylation rate is altered in the absence of Sfp1, is difficult to assess from the presented results (Fig. 3D).

      The effects of SFP1 on transcription were investigated by chromatin purification with Rpb3, a subunit of RNA polymerase, and the results were compared with synthesis rates determined by genomic run-on experiments. The decrease in polII presence on transcripts in the absence of SFP1 was not accompanied by a marked decrease in transcript output, suggesting an effect of Sfp1 in ensuring robust transcription and avoiding RNA polymerase backtracking. To further investigate the phenotypes associated with the depletion or absence of Sfp1, the authors examined the presence of Rpb4 along transcription units compared to Rpb3. An effect of Sfp1 deficiency was that this ratio, which decreased from the start of transcription towards the end of transcripts, increased slightly. To what extent this result is important for the main message of the manuscript is unclear.

      Suggestions: a) please clearly indicate in the figures when they correspond to reanalyses of published results. b) In table S2, it would be important to mention what the results represent and what statistics were used for the selection of "positive" hits.

      Strengths:

      - Diversity of experimental approaches used.
      - Validation of large-scale results with appropriate reporters.

      Weaknesses:

      - Lack of controls for the CRAC results and lack of negative controls for the co-purification experiments that were used to validate specific mRNA targets potentially bound by Sfp1.
      - Several conclusions are derived from complex correlative analyses that fully depend on the validity of the aforementioned Sfp1-mRNA interactions.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review): 

      Summary: 

      This manuscript builds upon the authors' previous work on the cross-talk between transcription initiation and post-transcriptional events in yeast gene expression. These prior studies identified an mRNA 'imprinting' phenomenon linked to genes activated by the Rap1 transcription factor (TF), a surprising role for the Sfp1 TF in promoting RNA polymerase II (RNAPII) backtracking, and a role for the non-essential RNAPII subunits Rpb4/7 in the regulation of mRNA decay and translation. Here the authors aimed to extend these observations to provide a more coherent picture of the role of Sfp1 in transcription initiation and subsequent steps in gene expression. They provide evidence for (1) a physical interaction between Sfp1 and Rpb4, (2) Sfp1 binding and stabilization of mRNAs derived from genes whose promoters are bound by both Rap1 and Sfp1 and (3) an effect of Sfp1 on Rpb4 binding or conformation during transcription elongation. 

      Strengths: 

      This study provides evidence that a TF (yeast Sfp1), in addition to stimulating transcription initiation, can at some target genes interact with their mRNA transcripts and promote their stability. Sfp1 thus has a positive effect on two distinct regulatory steps. Furthermore, evidence is presented indicating that strong Sfp1 mRNA association requires both Rap1 and Sfp1 promoter binding and is increased at a sequence motif near the polyA track of many target mRNAs. Finally, they provide compelling evidence that Sfp1-bound mRNAs have higher levels of RNAPII backtracking and altered Rpb4 association or conformation compared to those not bound by Sfp1. 

      Weaknesses: 

      The Sfp1-Rpb4 association is supported only by a two-hybrid assay that is poorly described and lacks an important control. Furthermore, there is no evidence that this interaction is direct, nor are the interaction domains on either protein identified (or mutated to address function). 

      Indeed, our two-hybrid, immunoprecipitation and imaging results do not allow us to conclusively discern whether the interaction between Rpb4 and Sfp1 is direct or indirect. While the interaction is significant, we consider the direct versus indirect distinction to be of secondary importance in the context of this paper. In the current text we indicate that 'our two hybrid, immunoprecipitation and imaging results do not differentiate between a direct or indirect interactions' (see page 6, sentences highlighted in blue).

      The contention that Sfp1 nuclear export to the cytoplasm is transcription-dependent is not well supported by the experiments shown, which are not properly described in the text and are not accompanied by any primary data. 

      This section has been re-written for better clarity (see page 7). We note that this assay was originally developed and published by Lee, M. S., M. Henry, and P. A. Silver in their 1996 paper in G&D and has since been reported in numerous subsequent studies. Reassuringly, our conclusion is bolstered by the observation that Sfp1 binds to Pol II transcripts co-transcriptionally, suggesting that Sfp1 is exported in the context of the mRNA.

      The presence of Sfp1 in P-bodies is of unclear relevance and the authors do not ask whether Sfp1-bound mRNAs are also present in these condensates. 

      P-bodies consist of both RNA and proteins (reviewed in doi: 10.1021/acs.biochem.7b01162). The significance of this experiment lies in its contribution to further confirming the co-localization of Sfp1 with mRNAs and Rpb4. This observation could also yield valuable insights for future investigations into the role of Sfp1.

      Further analysis of Sfp1-bound mRNAs would be of interest, particularly to address the question of whether those from ribosomal protein genes and other growth-related genes that are known to display Sfp1 binding in their promoters are regulated (either stabilized or destabilized) by Sfp1. 

      Fig. 4A, C and D show that RP mRNAs become destabilized in sfp1Δ cells.

      The authors need to discuss, and ideally address, the apparent paradox that their previous findings showed that Rap1 acts to destabilize its downstream transcripts, i.e. that it has the opposite effect of Sfp1 shown here. 

      We would like to thank Reviewer 1 for this valuable comment. In the revised paper, we discuss our hypothesis that Rap1 is likely responsible for regulating the imprinting of other proteins, such as Rpb4, that in turn lead to the destabilization of mRNAs. See the blue paragraph on page 20.

      Finally, recent studies indicate that the drugs used here to measure mRNA stability induce a strong stress response accompanied by rapid and complex effects on transcription. Their relevance to mRNA stability in unstressed cells is questionable. 

      Half-lives (HLs) were determined mainly by GRO analysis of optimally proliferating cells. This method does not require any drug or stressful treatment. The results obtained by this method were consistent with those obtained after thiolutin addition. Using both methods, we found that disruption of Sfp1 results in substantial mRNA destabilization. Nevertheless, in our revised manuscript, we show results obtained by subjecting cells to a temperature shift to 42°C, a natural method to inhibit transcription. This approach to determining half-lives has been previously reported in our publications, such as Lotan et al. (2005, 2007) and Goler Baron et al. (2008). This may rule out effects of the drug on half-lives. Admittedly, this assay determines HLs under heat stress, so it demonstrates that Sfp1 stabilizes mRNAs at least during heat shock. Since the results are similar to those obtained by the GRO method at 30°C, we concluded that Sfp1 stabilizes mRNAs under both optimal and heat-shock conditions.

      Reviewer #2 (Public Review): 

      Summary: 

      The manuscript by Kelbert et al. presents results on the involvement of the yeast transcription factor Sfp1 in the stabilisation of transcripts whose synthesis it stimulates. Sfp1 is known to affect the synthesis of a number of important cellular transcripts, such as many of those that code for ribosomal proteins. The hypothesis that a transcription factor can remain bound to the nascent transcript and affect its cytoplasmic half-life is attractive, but the methods used to demonstrate the half-life effects and the association of Sfp1 with cytoplasmic transcripts remain to be fully validated, as explained in my comments on the results below: 

      Comments on methodology and results: 

      (1) A two-hybrid-based assay for protein-protein interactions identified Sfp1, a transcription factor known for its effects on ribosomal protein gene expression, as interacting with Rpb4, a subunit of RNA polymerase II. Classical two-hybrid experiments depend on the presence of the tested proteins in the nucleus of yeast cells, suggesting that the observed interaction occurs in the nucleus. Unfortunately, the two-hybrid method cannot determine whether the interaction is direct or mediated by nucleic acids. 

      Indeed, our two-hybrid, immunoprecipitation and imaging results do not allow us to conclusively discern whether the interaction between Rpb4 and Sfp1 is direct or indirect. While the interaction is significant, we consider the direct versus indirect distinction to be of secondary importance in the context of this paper. In the current text we indicate that 'our two hybrid, immunoprecipitation and imaging results do not differentiate between a direct or indirect interactions' (see page 6).

      (2) Inactivation of nup49, a component of the nuclear pore complex, resulted in the redistribution of GFP-Sfp1 into the cytoplasm at the temperature non-permissive for the nup49-313 strain, suggesting that GFP-Sfp1 is a nucleo-cytoplasmic shuttling protein. This observation confirmed the dynamic nature of the nucleo-cytoplasmic distribution of Sfp1. For example, a similar redistribution to the cytoplasm was previously reported following rapamycin treatment and under starvation (Marion et al., PNAS 2004). In conjunction with the observation of an interaction with Rpb4, the authors observed slower nuclear import kinetics for GFP-Sfp1 in the absence of Rpb4 when cells were transferred to a glucose-containing medium after a period of starvation. Since the redistribution of GFP-Sfp1 was abolished in an rpb1-1/nup49-313 double mutant, the authors concluded that Sfp1 localisation to the cytoplasm depends on transcription. The double mutant yeast cells may show a variety of non-specific effects at the restrictive temperature, and whether transcription is required for Sfp1 cytoplasmic localisation remains incompletely demonstrated. 

      We agree with Reviewer 2 that any heat inactivation of a temperature-sensitive (ts) protein can lead to non-specific effects. It is evident that nup49-313 does not prevent Sfp1 export to the cytoplasm. In the case of rpb1-1, these non-specific effects are expected due to transcriptional arrest, which can eventually result in a reduction in protein content. However, this process takes some time, while the impact on export is more rapid. It is worth noting that this assay was developed and previously published by Pam Silver (Henry and Silver G&D 1996) and has been reported in many subsequent papers. Importantly, our conclusion is supported by the observation that Sfp1 binds both nascent RNA (co-transcriptionally) and mature mRNA (cytoplasmic). These observations, along with the reduced mRNA export upon transcription blocking, are consistent with our proposal that Sfp1 is exported in association with mRNA.

      (3) Under starvation conditions, which led to the presence of Sfp1 in the cytoplasm and have previously been correlated with a decrease in the transcription of Sfp1 target genes, the authors observed that a plasmid-based expressed GFP-Sfp1 accumulated in cytoplasmic foci. These foci were also labelled by P-body markers such as Dcp2 and Lsm1. The quality of the microscopic images provided does not allow to determine whether Rpb4-RFP colocalises with GFP-Sfp1. 

      The figure in the submitted PDF is of low quality. We believe that the high-quality figure in the final submission is convincing.

      (4) To understand to which RNA Sfp1 might bind, the authors used an N-terminally tagged fusion protein in a cross-linking and purification experiment. This method identified 264 transcripts for which the CRAC signal was considered positive and which mostly correspond to abundant mRNAs, including 74 ribosomal protein mRNAs or metabolic enzyme-abundant mRNAs such as PGK1. The authors did not provide evidence for the specificity of the observed CRAC signal, in particular, what would be the background of a similar experiment performed without UV cross-linking. In a validation experiment, the presence of several mRNAs in a purified SFP1 fraction was measured at levels that reflect the relative levels of RNA in a total RNA extract. Negative controls showing that abundant mRNAs not found in the CRAC experiment were clearly depleted from the purified fraction with Sfp1 would be crucial to assessing the specificity of the observed protein-RNA interactions. The NON-CRAC+ selected mRNAs were enriched for genes whose expression was previously shown to be upregulated upon Sfp1 overexpression (Albert et al., 2019). The presence of unspliced RPL30 pre-mRNA in the Sfp1 purification was interpreted as a sign of co-transcriptional assembly of Sfp1 into mRNA, but in the absence of valid negative controls, this hypothesis would require further experimental validation.

      We would like to thank Reviewer 2 for bringing this issue up, as it helped us to clarify it in the revised paper.

      First, we emphasized in the Discussion that many CRAC+ genes do not fall into the category of highly transcribed genes. Please see more detailed discussion below.

      Secondly, we examined various features of the 264 genes - classified as CRAC+ - to estimate their specificity and biological significance. Our various experiments revealed that the CRAC+ genes represent a distinct group with many unique features.

      The biological significance of the 264 CRAC+ mRNAs was demonstrated by various experiments; all are inconsistent with technical flaws. In fact, all the experiments and analyses that we have pursued indicate the unique nature of the CRAC+ genes. Some examples are:

      (1) Fig. 2A and B show that most reads of CRAC+ mRNAs were mapped to a specific location, close to the pA sites.

      (2) Fig. 2C shows that most reads of CRAC+ mRNAs were mapped to a specific RNA motif located near the 3’ ends of the mRNAs.

      (3) Most RiBi CRAC+ promoters contain Rap1 binding sites (p = 1.9 × 10⁻²²), while the vast majority of RiBi non-CRAC+ promoters do not (Fig. 3C).

      (4) Fig. 4A shows that RiBi CRAC+ mRNAs become destabilized due to Sfp1 deletion, whereas RiBi non-CRAC+ mRNAs do not. Fig. 4B shows similar results due to Sfp1 depletion.

      (5) Fig. 6B shows that the impact of Sfp1 on backtracking is substantially higher for CRAC+ than for non-CRAC+ genes. This is most clearly visible in RiBi genes.

      (6) Fig. 7A shows that the Sfp1-dependent changes along the transcription units are substantially more pronounced for CRAC+ than for non-CRAC+ genes.

      (7) In Fig. S4B, the chromatin binding profile of Sfp1 is shown to be different for CRAC+ and non-CRAC+ genes.

      Taken together, the many unique features, in fact, any feature that we examined, indicate the specificity and significance of this group, demonstrating that our CRAC results are biologically significant.

      Most importantly, these genes do not all fall into the category of highly transcribed genes.  On the contrary, as depicted in Figure 6A (green dots), it is evident that CRAC+ genes exhibit a diverse range of Rpb3 ChIP and GRO signals. Furthermore, as illustrated in Figure 7A, when comparing CRAC+ to Q1 (the most highly transcribed genes), it becomes evident that the Rpb4/Rpb3 profile of CRAC+ genes behaves differently from the Q1 group. Evidently, despite the heterogeneous transcription of CRAC+ genes (as mentioned above), the Rpb4/Rpb3 profile decreases more substantially than that of the highly transcribed genes (Q1).  Moreover, despite similar expression levels among all RiBi mRNAs, only a portion of them binds Sfp1.

      Thus, all our results indicate that CRAC+ genes represent a biologically significant group, irrespective of the expression of its members. In response to this comment, we included a new paragraph discussing the validity of our conclusions. See page 18, blue paragraph.

      (5) To address the important question of whether co-transcriptional assembly of Sfp1 with transcripts could alter their stability, the authors first used a reporter system in which the RPL30 transcription unit is transferred to vectors under different transcriptional contexts, as previously described by the Choder laboratory (Bregman et al. 2011). While RPL30 expressed under an ACT1 promoter was barely detectable, the highest levels of RNA were observed in the context of the native upstream RPL30 sequence when Rap1 binding sites were also present. Sfp1 showed better association with reporter mRNAs containing Rap1 binding sites in the promoter region. However, removal of the Rap1 binding sites from the reporter vector also led to a drastic decrease in reporter mRNA levels. Whether the fraction of co-purified RNA is nuclear and co-transcriptional or not cannot be inferred from these results. 

      The proposed co-transcriptional binding of Sfp1 is based on the findings presented in Figure 5C and Figure S2D, as well as the observed binding of Sfp1 to transcripts containing introns, as shown in Figures 2D and 3B.  The results of Fig. 3 led us to the assertion that the "RNA-binding capacity of Sfp1 is regulated by Rap1-binding sites located at the promoter." We maintain our stance on this conclusion. Indeed, the Rap1 binding site does impact mRNA levels, as highlighted by Reviewer 2. However, "construct E," which possesses a promoter with a Rap1 binding site, exhibits lower transcript levels compared to "construct F," which lacks such a binding site in its promoter. Despite this difference in transcript levels, Sfp1 was able to pull down the former transcript but not the latter, even though expression of the former gene is relatively low. Thus, the results appear to be more reliant on the specific capacity of Sfp1 to interact with the transcript rather than on the transcript's expression level.

      (6) To complement the biochemical data presented in the first part of the manuscript, the authors turned to the deletion or rapid depletion of SFP1 and used labelling experiments to assess changes in the rate of synthesis, abundance, and decay of mRNAs under these conditions. An important observation was that in the absence of Sfp1, mRNAs encoding ribosomal protein genes not only had a reduced synthesis rate but also an increased degradation rate. This important observation needs careful validation, as genomic run-on experiments were used to measure half-lives, and this particular method was found to give results that correlated poorly with other measures of half-life in yeast (e.g. Chappelboim et al., 2022 for a comparison). Similarly, the use of thiolutin to block transcription as a method of assessing mRNA half-life has been reported to be problematic, as thiolutin can specifically inhibit the degradation of ribosomal protein mRNA (Pelechano & Perez-Ortin, 2008). Specific repressible reporters, such as those used by Baudrimont et al. (2017), would need to be tested to validate the effect of Sfp1 on the half-life of specific mRNAs. Also, it would be very difficult to infer from the images presented whether the rate of deadenylation is altered by Sfp1.

      Various methods exist for assessing mRNA half-lives (HLs), and each of them carries its own set of challenges and biases. Consequently, it is problematic to directly compare HL values of a specific mRNA when different methods are employed. The superiority of one particular method over others remains unclear (in my opinion). However, these methods are highly reliable when comparing different strains under identical conditions using a single method.

      Estimating HLs through the GRO approach is a non-invasive method, applied to optimally proliferating cells, which has been employed in numerous publications. While no method is without its limitations, our experience over the years has reassured us that this approach is among the most dependable. Our HL determination using thiolutin to block transcription provided results that were consistent with the values obtained by the GRO approach.

      Nevertheless, in our revised manuscript, we supplemented the HL data obtained with thiolutin with results obtained by subjecting cells to a temperature shift to 42°C, a natural method to block transcription in wild-type (WT) cells. This approach to determining HLs has been previously reported in our publications, such as Lotan et al. (2005, 2007) and Goler Baron et al. (2008). The new results are shown in Fig. S3B. They are consistent with our conclusion that Sfp1 stabilizes mRNAs.

      Using a repressible promoter to determine mRNA HL is, unfortunately, not suitable in this paper because the promoter itself is involved in HL regulation. This observation is supported by Bregman et al. (2011) and depicted in Fig. 3, which illustrates that the promoter is critical for mRNA imprinting, consequently regulating HL.

      (7) The effects of SFP1 on transcription were investigated by chromatin purification with Rpb3, a subunit of RNA polymerase, and the results were compared with synthesis rates determined by genomic run-on experiments. The decrease in polII presence on transcripts in the absence of SFP1 was not accompanied by a marked decrease in transcript output, suggesting an effect of Sfp1 in ensuring robust transcription and avoiding RNA polymerase backtracking. To further investigate the phenotypes associated with the depletion or absence of Sfp1, the authors examined the presence of Rpb4 along transcription units compared to Rpb3. One effect of sfp1 deficiency was that this ratio, which decreased from the start of transcription towards the end of transcripts, increased slightly. The results presented are largely correlative and could arise from the focus on very specific types of mRNAs, such as those of ribosomal protein genes, which are sensitive to stress and are targeted by very active RNA degradation mechanisms activated, for example, under heat stress (Bresson et al., 2020). 

      Figure 7A illustrates a significant reduction in Rpb4/Rpb3 ratios along the transcription unit in WT cells. This reduction is notably more pronounced in CRAC+ genes compared to the highly transcribed quartile (Q1), which includes all ribosomal protein (RP) genes, and it is completely absent in sfp1∆ cells. Furthermore, it's important to highlight that the CRAC+ gene group displays a wide range of transcription rates, as measured by either Rpb3 ChIP or GRO (Figure 6A). Given these observations, we do not think that heightened sensitivity of RP mRNA degradation in response to stress is responsible for the pronounced difference in the configuration of the Pol II elongation complex that is detected in CRAC+ genes, mainly because this experiment was performed under standard (non-stress) culture conditions.

      Correlative studies are particularly informative when a gene mutation eliminates a correlation, and this is precisely the type of study depicted in Figure 7B-C. The correlations shown in these panels are dependent on Sfp1. Indeed, RP genes are sensitive to stress. However, we used non-stressed conditions. Furthermore, CRAC+ genes did not display any apparent unusual destabilization but rather exhibited higher (not lower) mRNA stability compared to non-CRAC+ genes (Figure 7C).

    1. We decided that we would like to see better documented code included within web pages for convenient browsing. The motivation behind this peculiar aim is to be able to include high quality documentation alongside working code, hopefully making it easier for programmers to produce more maintainable, readable programs.
    1. nisation of the CAP4000 configuration tools. Transition to a new web application

      Author: Joshua JOURDAM. Publication date: 17 July 2024.

      Abstract (on the back or at the start of the thesis; summary, business card, English / French, ~15 lines, 3 to 5 keywords)

      The constant renewal of technologies in industry is an increasingly prominent phenomenon today. Companies that want to remain competitive in the market must adapt to these changes and innovate continuously. In this context, this project aims to develop a new solution that meets the current needs of industrial customers. The project covers the design and implementation of a web platform that facilitates the management of maintenance operations and the monitoring of SDEL Contrôle Commande equipment.

      Keywords: Embedded computer (calculateur), Web development, REST API, Authentication

      Table of contents

      1 Environment and Context: 1.1 Company, department and position; 1.1.1 Vinci; 1.1.2 Vinci Energies; 1.1.3 Company hierarchy and operation; 1.1.4 SDEL Contrôle Commande; 1.1.5 The Research and Development department; 1.2 Project context

      2 Problem statement

      3 Goals and Objectives: 3.1 Strategic and operational objectives; 3.1.1 Select an optimal development framework; 3.1.2 Prepare the technological transition; 3.1.3 Develop the new web application; 3.1.4 Objective 4: Develop project management and technical skills

      4 Approach: 4.1 General approach; 4.2 Methodology, techniques and technologies; 4.3 Risk analysis; 4.4 Stakeholders; 4.5 Work packages; 4.6 Planned schedule; 4.7 Actual schedule; 4.7.1 Milestones; 4.7.2 Deliverables; 4.8 Budget

      5 Results: 5.1 Evaluation of technologies on the market; 5.2 Transition; 5.2.1 OpenAPI; 5.2.2 Authentication; 5.2.3 Supervision POC (internship); 5.3 Development; 5.3.1 Application specification; 5.3.2 How the solution works; 5.3.3 Architecture/Workspace (explain how a monorepo works); 5.3.4 Libraries (zod, react-hook-form, react-query, react-router, orval, react-testing-library, vitest); 5.3.5 Error and loading-state handling; 5.3.6 Authentication and authorization; 5.3.7 Application; 5.3.8 Modules; 5.3.9 Tests; 5.3.10 Performance; 5.3.11 Application

      6 Conclusion: 6.1 Review

      7 Appendices: 7.1 References; 7.2 Skills grid

      Acronyms: HSP, PRP, REST, API, SDELCC, JWT, HTTP, HTTPS, HMAC, WBS, CSR, SPA

      Acknowledgments

      This project is part of my apprenticeship at SDEL Contrôle Commande. SDEL Contrôle Commande belongs to the Energies subsidiary of the VINCI group. Drawing on its technical expertise, SDEL Contrôle Commande supports operators of energy transmission and distribution networks. I joined the company as an apprentice software engineer. I was assigned to the Research and Development department, reporting to Mr Sébastien BARRE, head of software development. The Research and Development department is responsible for designing and developing embedded computers used mainly in the energy sector, in particular in the transformer substations of the French power grid. It is also responsible for maintaining existing products. This maintenance can extend over long periods, on the order of 20 years. The project is part of a broader effort to modernize the software and tools embedded in our equipment. This modernization aims to offer more ergonomic tools and to meet stricter cybersecurity constraints. The project aims to develop a new administration and configuration application for SDEL automation controllers. It takes place during the final year of my apprenticeship engineering degree, from June 2023 to August 2024.

      This report is divided into 3 parts:

      Environment and context: First, I will present the context of my company, then my position and my roles within it. I will also present the context, the specific problem, and the goals and objectives of the project.

      Implementation and analysis of results: In this part, I will cover the chosen working methodology and the tools used to keep the project on track. I will then present the results obtained, noting the gaps with respect to the stated objectives.

      Review and outlook: Finally, I will review the project and discuss how it could evolve. I will also present the skills acquired and what this project contributed to my training. To conclude, I will present my future career prospects.

      1 Environment and Context

      (To do: develop the company context, Vinci and its hierarchy (energies, omexom), our markets (where), and what to remember from each chapter (bullet list).)

      1.1 Company, department and position

      1.1.1 Vinci

      Figure 1: VINCI logo

      Vinci is a French multinational company specializing in construction and concessions. Founded in 1899 under the name Société Générale d’Entreprises (SGE), it became Vinci in 2000. Vinci is one of the largest construction and concessions companies in the world, with diversified activities in construction, infrastructure, energy services and infrastructure management.

      Business sectors:

      Construction: Vinci Construction specializes in building, public works, civil engineering, foundations and transport infrastructure.

      Concessions: Vinci Autoroutes and Vinci Airports operate motorway networks and airports in several countries.

      Energy: Vinci Energies works in energy infrastructure, industry, information technology and the energy transition.

      Real estate: Vinci Immobilier is active in residential, office and commercial property development.

      [Diagram: VINCI, branching into Vinci Energies, Vinci Construction, Vinci Autoroutes, Vinci Airports and Vinci Immobilier]

      1.1.2 Vinci Energies

      Vinci Energies, a subsidiary of Vinci, specializes in 
services énergétiques et les technologies de l’information. Elle propose des solutions dans les domaines de l’énergie, des technologies de l’information, et des télécommunications. Ses principales activités incluent : Infrastructure Énergie : Gestion des réseaux électriques et des infrastructures de distribution d’énergie. Industrie : Optimisation des processus industriels et amélioration de l’efficacité énergétique. TIC (Technologies de l’Information et de la Communication) : Solutions pour les systèmes d’information et les télécommunications. Transition Énergétique et Environnementale : Développement de solutions pour une énergie plus durable et respectueuse de l’environnement. Vinci Energies, une division du groupe VINCI, regroupe plusieurs marques spécialisées dans divers domaines des services énergétiques et des technologies de l’information. Les cinq principales marques de Vinci Energies sont : Actemium : Spécialisée dans les solutions et les services pour les processus industriels, couvrant l’ensemble du cycle de vie des installations industrielles. Axians : Focalisée sur les technologies de l’information et de la communication (TIC), offrant des solutions pour les infrastructures IT, la cybersécurité, le cloud, les réseaux et la collaboration. Cegelec : Fournit des services et des solutions en ingénierie électrique et maintenance pour les infrastructures et les bâtiments. Omexom : Se concentre sur les infrastructures énergétiques, notamment les réseaux de transport et de distribution d’électricité, les énergies renouvelables et les systèmes de stockage d’énergie. VINCI Facilities : Offre des services de gestion et de maintenance des bâtiments, incluant des solutions de facility management intégrées pour optimiser les performances des installations. 1.1.3 Hiérarchie et Fonctionnement de l’Entreprise 1.1.3.1 Organisation Vinci est organisée en plusieurs divisions opérationnelles, chacune ayant une structure hiérarchique propre. 
The hierarchy of Vinci and its subsidiaries, such as Vinci Energies, is generally structured as follows:

- Board of Directors: the company's highest body, responsible for overall strategy and for overseeing executive management.
- General Management: the Chairman and Chief Executive Officer (CEO) and the other members of executive management, responsible for implementing the strategy and running the company day to day.
- Division/Branch Directors: each responsible for a specific branch (for example, Vinci Construction or Vinci Energies).
- Business Line/Entity Directors: supervise subdivisions or specific entities within each branch, such as geographic regions or specialized fields.
- Project Managers and Operational Managers: responsible for the day-to-day management of projects and field teams.
- Operational Teams: the employees who work directly on projects, including engineers, technicians, workers, and other specialized professionals.

1.1.3.2 Operation

- Decentralization: Vinci favors a decentralized approach that gives its divisions and subsidiaries a large degree of autonomy. This supports responsiveness and adaptation to local markets.
- Innovation: the company emphasizes technological innovation and energy efficiency, supporting research and development projects to anticipate future needs.
- Social and environmental responsibility: Vinci is committed to sustainable development, aiming to reduce its ecological footprint and improve its environmental performance.

In summary, Vinci is a diversified company with a well-defined hierarchical structure, which supports a decentralized and innovative approach to meeting the needs of its markets in construction, concessions, and energy services.
1.1.4 SDEL Contrôle Commande

Figure 2: Main entrance of SDEL Contrôle Commande

SDEL Contrôle Commande belongs to VINCI Energies, the energy subsidiary of the VINCI group. VINCI Energies comprises several brands, such as Omexom, Actemium, and Axians, each specialized in a specific field of activity. Together they allow VINCI Energies to offer a complete range of energy services and solutions.

SDEL Contrôle Commande is a single-site company based in Saint-Aignan-Grandlieu. Within the Omexom network, SDEL Contrôle Commande handles turnkey project management, design, engineering, integration, installation, configuration, testing, commissioning, and maintenance of control-command systems for electrical substations and of automation systems.

Figure 3: SDEL tranches

With more than 300 employees, the company has more than 50 years of experience in control-command systems. Drawing on this technical expertise, SDEL Contrôle Commande supports the operators of electricity transmission and distribution networks. Our main customers are RTE and Enedis, to whom we offer a tailored range of products for the substations of the French power grid. We also supply control-command systems to Thales for the control of marine batteries and to the RATP for the supervision of the Paris metro ventilation system.

[Customer logos: RTE, Enedis, Thales, RATP, EDF]

Our main competitors are Actia Telecom, SCLE, and Eiffage Energie. They offer a similar range of products on the energy transmission and distribution market. A few companies within the VINCI group also compete directly with SDEL Contrôle Commande on certain tenders.
[Pie chart: Energy 85%, Defense 10%, Transport 4%, Other industries 1%]

Figure 4: Breakdown of SDEL Contrôle Commande's markets

[Mind map, root: SDEL Contrôle Commande; nodes: Research and Development, Engineering, Innovation, Business management, Industrial strategy, Design office, Integration, Logistics, Purchasing chain, Procurement, Testing, Commissioning, Works, Interventions]

Figure 5: The departments of SDEL Contrôle Commande

1.1.5 The Research and Development Department

SDEL Contrôle Commande offers technical solutions tailored to its customers' needs through its Research and Development, Engineering, and Innovation departments. In addition to our in-depth knowledge of the various manufacturers, our expertise in the application domain lets us meet needs in engineering, sizing, configuration, and deployment of protection and control equipment for electricity transmission and distribution networks.

The Research and Development department has 16 members. It specializes in the design of embedded computers and other equipment used in electrical distribution networks.
#fig-software-team-mermaid{font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:16px;fill:#ccc;}#fig-software-team-mermaid .error-icon{fill:#a44141;}#fig-software-team-mermaid .error-text{fill:#ddd;stroke:#ddd;}#fig-software-team-mermaid .edge-thickness-normal{stroke-width:2px;}#fig-software-team-mermaid .edge-thickness-thick{stroke-width:3.5px;}#fig-software-team-mermaid .edge-pattern-solid{stroke-dasharray:0;}#fig-software-team-mermaid .edge-pattern-dashed{stroke-dasharray:3;}#fig-software-team-mermaid .edge-pattern-dotted{stroke-dasharray:2;}#fig-software-team-mermaid .marker{fill:lightgrey;stroke:lightgrey;}#fig-software-team-mermaid .marker.cross{stroke:lightgrey;}#fig-software-team-mermaid svg{font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:16px;}#fig-software-team-mermaid .edge{stroke-width:3;}#fig-software-team-mermaid .section--1 rect,#fig-software-team-mermaid .section--1 path,#fig-software-team-mermaid .section--1 circle,#fig-software-team-mermaid .section--1 polygon,#fig-software-team-mermaid .section--1 path{fill:#1f2020;}#fig-software-team-mermaid .section--1 text{fill:lightgrey;}#fig-software-team-mermaid .node-icon--1{font-size:40px;color:lightgrey;}#fig-software-team-mermaid .section-edge--1{stroke:#1f2020;}#fig-software-team-mermaid .edge-depth--1{stroke-width:17;}#fig-software-team-mermaid .section--1 line{stroke:#e0dfdf;stroke-width:3;}#fig-software-team-mermaid .disabled,#fig-software-team-mermaid .disabled circle,#fig-software-team-mermaid .disabled text{fill:lightgray;}#fig-software-team-mermaid .disabled text{fill:#efefef;}#fig-software-team-mermaid .section-0 rect,#fig-software-team-mermaid .section-0 path,#fig-software-team-mermaid .section-0 circle,#fig-software-team-mermaid .section-0 polygon,#fig-software-team-mermaid .section-0 path{fill:#0b0000;}#fig-software-team-mermaid .section-0 text{fill:lightgrey;}#fig-software-team-mermaid .node-icon-0{font-size:40px;color:lightgrey;}#fig-software-team-mermaid 
.section-edge-0{stroke:#0b0000;}#fig-software-team-mermaid .edge-depth-0{stroke-width:14;}#fig-software-team-mermaid .section-0 line{stroke:#f4ffff;stroke-width:3;}#fig-software-team-mermaid .disabled,#fig-software-team-mermaid .disabled circle,#fig-software-team-mermaid .disabled text{fill:lightgray;}#fig-software-team-mermaid .disabled text{fill:#efefef;}#fig-software-team-mermaid .section-1 rect,#fig-software-team-mermaid .section-1 path,#fig-software-team-mermaid .section-1 circle,#fig-software-team-mermaid .section-1 polygon,#fig-software-team-mermaid .section-1 path{fill:#4d1037;}#fig-software-team-mermaid .section-1 text{fill:lightgrey;}#fig-software-team-mermaid .node-icon-1{font-size:40px;color:lightgrey;}#fig-software-team-mermaid .section-edge-1{stroke:#4d1037;}#fig-software-team-mermaid .edge-depth-1{stroke-width:11;}#fig-software-team-mermaid .section-1 line{stroke:#b2efc8;stroke-width:3;}#fig-software-team-mermaid .disabled,#fig-software-team-mermaid .disabled circle,#fig-software-team-mermaid .disabled text{fill:lightgray;}#fig-software-team-mermaid .disabled text{fill:#efefef;}#fig-software-team-mermaid .section-2 rect,#fig-software-team-mermaid .section-2 path,#fig-software-team-mermaid .section-2 circle,#fig-software-team-mermaid .section-2 polygon,#fig-software-team-mermaid .section-2 path{fill:#3f5258;}#fig-software-team-mermaid .section-2 text{fill:lightgrey;}#fig-software-team-mermaid .node-icon-2{font-size:40px;color:lightgrey;}#fig-software-team-mermaid .section-edge-2{stroke:#3f5258;}#fig-software-team-mermaid .edge-depth-2{stroke-width:8;}#fig-software-team-mermaid .section-2 line{stroke:#c0ada7;stroke-width:3;}#fig-software-team-mermaid .disabled,#fig-software-team-mermaid .disabled circle,#fig-software-team-mermaid .disabled text{fill:lightgray;}#fig-software-team-mermaid .disabled text{fill:#efefef;}#fig-software-team-mermaid .section-3 rect,#fig-software-team-mermaid .section-3 path,#fig-software-team-mermaid .section-3 
circle,#fig-software-team-mermaid .section-3 polygon,#fig-software-team-mermaid .section-3 path{fill:#4f2f1b;}#fig-software-team-mermaid .section-3 text{fill:lightgrey;}#fig-software-team-mermaid .node-icon-3{font-size:40px;color:lightgrey;}#fig-software-team-mermaid .section-edge-3{stroke:#4f2f1b;}#fig-software-team-mermaid .edge-depth-3{stroke-width:5;}#fig-software-team-mermaid .section-3 line{stroke:#b0d0e4;stroke-width:3;}#fig-software-team-mermaid .disabled,#fig-software-team-mermaid .disabled circle,#fig-software-team-mermaid .disabled text{fill:lightgray;}#fig-software-team-mermaid .disabled text{fill:#efefef;}#fig-software-team-mermaid .section-4 rect,#fig-software-team-mermaid .section-4 path,#fig-software-team-mermaid .section-4 circle,#fig-software-team-mermaid .section-4 polygon,#fig-software-team-mermaid .section-4 path{fill:#6e0a0a;}#fig-software-team-mermaid .section-4 text{fill:lightgrey;}#fig-software-team-mermaid .node-icon-4{font-size:40px;color:lightgrey;}#fig-software-team-mermaid .section-edge-4{stroke:#6e0a0a;}#fig-software-team-mermaid .edge-depth-4{stroke-width:2;}#fig-software-team-mermaid .section-4 line{stroke:#91f5f5;stroke-width:3;}#fig-software-team-mermaid .disabled,#fig-software-team-mermaid .disabled circle,#fig-software-team-mermaid .disabled text{fill:lightgray;}#fig-software-team-mermaid .disabled text{fill:#efefef;}#fig-software-team-mermaid .section-5 rect,#fig-software-team-mermaid .section-5 path,#fig-software-team-mermaid .section-5 circle,#fig-software-team-mermaid .section-5 polygon,#fig-software-team-mermaid .section-5 path{fill:#3b0048;}#fig-software-team-mermaid .section-5 text{fill:lightgrey;}#fig-software-team-mermaid .node-icon-5{font-size:40px;color:lightgrey;}#fig-software-team-mermaid .section-edge-5{stroke:#3b0048;}#fig-software-team-mermaid .edge-depth-5{stroke-width:-1;}#fig-software-team-mermaid .section-5 line{stroke:#c4ffb7;stroke-width:3;}#fig-software-team-mermaid .disabled,#fig-software-team-mermaid 
.disabled circle,#fig-software-team-mermaid .disabled text{fill:lightgray;}#fig-software-team-mermaid .disabled text{fill:#efefef;}#fig-software-team-mermaid .section-6 rect,#fig-software-team-mermaid .section-6 path,#fig-software-team-mermaid .section-6 circle,#fig-software-team-mermaid .section-6 polygon,#fig-software-team-mermaid .section-6 path{fill:#995a01;}#fig-software-team-mermaid .section-6 text{fill:lightgrey;}#fig-software-team-mermaid .node-icon-6{font-size:40px;color:lightgrey;}#fig-software-team-mermaid .section-edge-6{stroke:#995a01;}#fig-software-team-mermaid .edge-depth-6{stroke-width:-4;}#fig-software-team-mermaid .section-6 line{stroke:#66a5fe;stroke-width:3;}#fig-software-team-mermaid .disabled,#fig-software-team-mermaid .disabled circle,#fig-software-team-mermaid .disabled text{fill:lightgray;}#fig-software-team-mermaid .disabled text{fill:#efefef;}#fig-software-team-mermaid .section-7 rect,#fig-software-team-mermaid .section-7 path,#fig-software-team-mermaid .section-7 circle,#fig-software-team-mermaid .section-7 polygon,#fig-software-team-mermaid .section-7 path{fill:#154706;}#fig-software-team-mermaid .section-7 text{fill:lightgrey;}#fig-software-team-mermaid .node-icon-7{font-size:40px;color:lightgrey;}#fig-software-team-mermaid .section-edge-7{stroke:#154706;}#fig-software-team-mermaid .edge-depth-7{stroke-width:-7;}#fig-software-team-mermaid .section-7 line{stroke:#eab8f9;stroke-width:3;}#fig-software-team-mermaid .disabled,#fig-software-team-mermaid .disabled circle,#fig-software-team-mermaid .disabled text{fill:lightgray;}#fig-software-team-mermaid .disabled text{fill:#efefef;}#fig-software-team-mermaid .section-8 rect,#fig-software-team-mermaid .section-8 path,#fig-software-team-mermaid .section-8 circle,#fig-software-team-mermaid .section-8 polygon,#fig-software-team-mermaid .section-8 path{fill:#161722;}#fig-software-team-mermaid .section-8 text{fill:lightgrey;}#fig-software-team-mermaid 
.node-icon-8{font-size:40px;color:lightgrey;}#fig-software-team-mermaid .section-edge-8{stroke:#161722;}#fig-software-team-mermaid .edge-depth-8{stroke-width:-10;}#fig-software-team-mermaid .section-8 line{stroke:#e9e8dd;stroke-width:3;}#fig-software-team-mermaid .disabled,#fig-software-team-mermaid .disabled circle,#fig-software-team-mermaid .disabled text{fill:lightgray;}#fig-software-team-mermaid .disabled text{fill:#efefef;}#fig-software-team-mermaid .section-9 rect,#fig-software-team-mermaid .section-9 path,#fig-software-team-mermaid .section-9 circle,#fig-software-team-mermaid .section-9 polygon,#fig-software-team-mermaid .section-9 path{fill:#00296f;}#fig-software-team-mermaid .section-9 text{fill:lightgrey;}#fig-software-team-mermaid .node-icon-9{font-size:40px;color:lightgrey;}#fig-software-team-mermaid .section-edge-9{stroke:#00296f;}#fig-software-team-mermaid .edge-depth-9{stroke-width:-13;}#fig-software-team-mermaid .section-9 line{stroke:#ffd690;stroke-width:3;}#fig-software-team-mermaid .disabled,#fig-software-team-mermaid .disabled circle,#fig-software-team-mermaid .disabled text{fill:lightgray;}#fig-software-team-mermaid .disabled text{fill:#efefef;}#fig-software-team-mermaid .section-10 rect,#fig-software-team-mermaid .section-10 path,#fig-software-team-mermaid .section-10 circle,#fig-software-team-mermaid .section-10 polygon,#fig-software-team-mermaid .section-10 path{fill:#01629c;}#fig-software-team-mermaid .section-10 text{fill:lightgrey;}#fig-software-team-mermaid .node-icon-10{font-size:40px;color:lightgrey;}#fig-software-team-mermaid .section-edge-10{stroke:#01629c;}#fig-software-team-mermaid .edge-depth-10{stroke-width:-16;}#fig-software-team-mermaid .section-10 line{stroke:#fe9d63;stroke-width:3;}#fig-software-team-mermaid .disabled,#fig-software-team-mermaid .disabled circle,#fig-software-team-mermaid .disabled text{fill:lightgray;}#fig-software-team-mermaid .disabled text{fill:#efefef;}#fig-software-team-mermaid .section-root 
Name | Area | Position
BARRE Sebastien | Group leader | Engineer
BAUDOUIN Jean Louis | HMI | Technician
JANNIERE Sylvain | CAP4000 / HMI | Engineer
CECILLON Lucas | CAP4000 / Cybersecurity | Apprentice
DUPONT David | HMI | Engineer
HELARD Florent | CAP4000 | Engineer
MINIER Bertrand | Yocto / CAP4000 | Engineer
JOURDAM Joshua | Yocto / HMI | Apprentice

Figure 6: Software team organization chart

Mr. Sébastien BARRE is responsible for software development in the Research and Development department. The software team works on the applications and the operating system of the equipment we develop. This hardware is sold as part of the products and services supplied by SDEL Contrôle Commande, for example when control and command cabinets are sold or when an electrical substation is installed.

Figure 7: DigiBOX, the SDELCC controller dedicated to electrical substation applications

Applications:
- Substation supervision system
- Telecontrol gateway
- Local substation mimic display
- State logger (consignateur d'état)

In practice: embedded, Linux, Yocto.

As an apprentice, my role is to contribute to the development of our software. I am also called upon to test and study the integration of new technologies into the products.
My work includes studies, data analysis, prototyping, setting up tests, and assisting technicians and engineers in their work. I also write documentation and communicate the results of my work. I get to work on concrete projects and to develop my skills, mainly in embedded computing. As my skills have grown over the course of my training, I now work on more complex projects and lead the present project.

1.2 Project context

CAP4000 is a modular software base used by all the equipment developed by the Research and Development department. It is an application that runs permanently. It manages the main features of our products by interfacing with components such as:
- Operating system (network, power supply, processes, ...)
- Peripherals (display, expansion cards, USB key, ...)
- Automation engine
- Front panel (LEDs, buttons, ...)

Our equipment is highly configurable. To make it easier for our customers to use, equipment administration must be possible remotely. These tasks are carried out with the legacy configuration tools, which are bundled in a Windows desktop application.

Figure 8: Configuration utility

Since 2020 the software team has been working on replacing this application. The goal is to modernize our tools in order to simplify equipment management and maintenance and to bring stricter cybersecurity constraints into those processes. To modernize the way a controller's resources are accessed, a REST API has been implemented in the CAP4000 software.
The aim is to provide a reliable and secure way of communicating with a device, so that tools based on web technologies can be built for SDELCC products.

API (Application Programming Interface)

APIs (Application Programming Interfaces) allow two computers to communicate with each other. Think of it as using a website, except that instead of clicking buttons you write code that explicitly requests data from a server. A so-called RESTful API follows a set of rules and constraints imposed by the REST (REpresentational State Transfer) architecture. A REST API runs over HTTP (Hypertext Transfer Protocol). It exposes resources (data) that are accessible through unique URLs, and a request gives access to those resources. Each request method maps to one CRUD operation (POST creates, GET reads, PATCH updates, DELETE deletes), and every request follows a specific format: method, URL, headers (metadata) and body (data). A client sends a request to the server via a URL; the server executes code (typically accessing a database) and formats the data into a response carrying a status code that indicates success, a client error or a server error. REST APIs are stateless, meaning that each interaction is independent of the previous ones, which makes applications predictable and reliable. REST APIs have been the de facto standard for web API development since the early 2000s.
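To make the request format described above more concrete, here is a minimal Python sketch that assembles the four parts of a REST request (method, URL, headers, body) without sending anything, and sorts status codes into the three outcomes mentioned. The device address `192.0.2.10`, the `network` resource and the `dhcp` field are hypothetical and only serve the illustration; they are not taken from the CAP4000 API.

```python
import json
from urllib.request import Request

# CRUD operation -> HTTP method, following the mapping described in the text.
CRUD_TO_HTTP = {"create": "POST", "read": "GET", "update": "PATCH", "delete": "DELETE"}


def build_request(base_url, operation, resource, payload=None):
    """Assemble (without sending) the method, URL, headers and body of a request."""
    body = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Accept": "application/json"}
    if body is not None:
        headers["Content-Type"] = "application/json"
    return Request(
        url=f"{base_url}/{resource}",
        data=body,
        headers=headers,
        method=CRUD_TO_HTTP[operation],
    )


def classify_status(code):
    """Sort an HTTP status code into the outcomes named in the text."""
    if 200 <= code < 300:
        return "success"
    if 400 <= code < 500:
        return "client error"
    if 500 <= code < 600:
        return "server error"
    return "other"


# Hypothetical device address, resource name and payload (illustration only).
request = build_request("http://192.0.2.10/api/v1", "update", "network", {"dhcp": True})
print(request.get_method(), request.full_url)  # PATCH http://192.0.2.10/api/v1/network
print(classify_status(404))                    # client error
```

Because the request is stateless, everything the server needs is contained in these four parts; nothing depends on an earlier call.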
[Diagram elements: on the computer, the user visits the web application in the browser; the web application sends GET/POST/PUT/DELETE requests to the REST API of CAP4000 on the device and receives JSON responses; module X reads and writes the database and executes procedures.]

Figure 9: How the CAP4000 REST API is used

The main purpose of this API is to allow equipment to be configured and managed remotely. It can be used to retrieve information about the state of the device and of its components. It can also be used to modify the device configuration and to trigger procedures such as a reboot. For example, the version of the software installed on the device can be retrieved with the following HTTP GET request: http://<ip-equipement>/api/v1/identification.

identification

    {
      "CAP4000": {
        "IndiceValidation": 0,
        "Nom": "DigiBOX",
        "VersionCorrectif": 1,
        "VersionMajeure": 6,
        "VersionMineure": 0,
        "VersionProtocole": 2
      },
      "OS": {
        "Nom": "digibox-debug-os",
        "VersionCorrectif": 5,
        "VersionMajeure": 0,
        "VersionMineure": 4
      },
      "Produit": {
        "Nom": "digibox-debug",
        "VersionCorrectif": 1,
        "VersionMajeure": 2,
        "VersionMineure": 0
      }
    }

An experimental application has been developed.
This application is available for two platforms: Windows and Web. It communicates with the CAP4000 REST API. The application was developed with the Qt framework, which we already use on other projects; it was chosen to build on the department's existing expertise. The application is split into modules, each of which represents one feature in the interface (debug, network configuration, configuration import, ...).

Qt (pronounced "cute")

Qt is a cross-platform application development framework based on the C++ language. It is used to build software with a graphical user interface. Applications can run on different operating systems such as Windows, macOS and Linux, and also in a browser, without major changes to the source code. To run in a browser, a Qt application must be compiled to WebAssembly. WebAssembly is a binary format that allows low-level code to be executed portably and securely in modern web browsers. It is designed to complement the languages traditionally used on the web, such as JavaScript, by offering higher performance for web applications.

This experimental application is intended for internal use by SDEL Contrôle Commande staff and by certain customers who wish to evaluate it.

Figure 10: Web interface of the DigiBox product (main menu)

It includes features such as:
- Network interface configuration
- Certificate management
- Status of the input/output cards
- Display of debugging information
- Power management
- Configuration import
- ...

The application handles the operating system users and the CAP4000 roles. Access to certain features can therefore be restricted.
For example, a user with the "observateur" (observer) role cannot access the network configuration of the device. The architecture of this solution can be represented by the following technology stack:

Table 1: Technology stack

Frontend | API | Backend
Qt | REST API | CAP4000

Technology stack: what is it?

A technology stack can be divided into three categories:

Frontend: The frontend layer includes the tools required to build a human-machine interface for end users. It can be developed to run in a web browser or on an operating system (Windows, Linux, macOS, Android, iOS, ...).

Backend: The backend layer includes a server runtime (execution environment), generally accompanied by a database. It handles the data and the business logic of the application (user management, permissions, features, ...).

API: The API layer connects the frontend and the backend (REST, GraphQL). It also handles integration with third-party services (payment, identity management, messaging, ...).

The web application is accessed locally by connecting to a device through its IP address from a web browser. It is compatible with modern browsers (Chrome, Firefox, Edge, Safari, ...). The application is therefore distributed directly within the operating system of a device and requires no installation on our customers' workstations. Each type of device has its own version of the application, with more or fewer features.

The rest of this report presents the successive stages of the project. We begin with the problem statement and the project objectives. We then detail the methodology used to carry out the project. Finally, we present the results obtained and the outlook for the rest of the project.
To do: discuss free / open-source licenses (https://www.diatem.net/les-licences-open-source/)

2 Problem statement

Today, we want to integrate the web application into all the products we sell.

Software is distributed under several types of license. Open-source licenses allow developers to use, modify and redistribute the software, subject to certain rules. There are many open-source licenses, each with its own terms of use. These terms must be taken into account before selling a product that integrates an open-source component.

The Qt framework is owned by The Qt Company and is distributed under several licenses. Until now we have used the open-source version of this software to market our products. However, this version does not allow an application to be distributed in WebAssembly format without also distributing its source code. VINCI management does not wish to publish the source code of its applications. We have also ruled out paying for commercial licenses. Marketing products that include the web application in its current form is therefore not an option.

The present project aims to develop a new application based on a different technology. We want a web graphical interface that relies on communicating with a controller through the REST API. We want to use an open-source framework that is entirely free of charge and suited to embedded constraints. This project will also allow us to explore and build skills in web technologies, and to consolidate our solution with durable technologies that permit our applications to be distributed under a proprietary license.
Problem statement

How can we judiciously choose an open-source framework for developing a new web application integrated across the entire product range, while respecting the confidentiality constraints imposed by company management and guaranteeing that the applications can be distributed under a proprietary license?

This project will affect the users of the Qt application. As that application was still experimental, only the developers of the Research and Development department and various internal stakeholders (testing, design office, etc.) will be affected. Customers who already use the application will be limited to the Windows platform. Once the next application has been developed, they will be able to upgrade to the new system version of their product. The Qt application will continue to be developed by Jean-Louis Baudoin.

Another working group has managed to reproduce the behaviour of the web application by exploring alternatives to WebAssembly. This is done by installing the desktop application directly on the controllers and sharing the desktop environment through VNC-type technologies. This solution is quick to set up, but it will remain temporary because of its poor performance, and it will be permanently replaced once the future application is released.
3 Goals and objectives

Choose a framework
- Draw up a list of at least 10 criteria for comparing frameworks
- Evaluate and test at least three alternative frameworks against the list of criteria
- Build a working prototype (POC) comprising a menu and 3 modules

Prepare the transition
- Upgrade the API specification to comply with the OpenAPI 3 standard
- Change the request authentication method to simplify the implementation of the API in the new application
- Change the authentication system to guarantee the security of the API
- Complete one week of self-training on the chosen framework
- Draw up an ordered list of modules to port, create or remove, including at least the existing modules

Develop the new web application
- Develop at least 20 modules
- Reach performance at least equivalent to that of the current application (maximum degradation of 5%)

Acquire the skills of a junior engineer
- Put in place a project management approach that makes it possible to reach the goals and objectives set
- Provide all the deliverables requested by ESEO
- Obtain a mark above 14 in the 3 PING assessments

Proposed alternative structure for the objectives, with more detailed wording:

3.1 Strategic and Operational Objectives

3.1.1 Select an Optimal Development Framework

Identify and choose a framework suited to the specific needs of the project.

- Draw up a list of selection criteria
  Objective: Develop a detailed list of at least 10 relevant criteria for comparing frameworks.
  Timeframe: 2 weeks.
  Description: Include criteria such as performance, scalability, ease of integration, community support, documentation and security.
- Evaluate and test alternative frameworks
  Objective: Evaluate at least three alternative frameworks against the established list of criteria.
  Timeframe: 4 weeks.
  Description: Carry out hands-on tests of each framework to check that it meets the criteria, and document the results.
- Develop a functional prototype
  Objective: Build a proof of concept (POC) comprising a main menu and three functional modules.
  Timeframe: 6 weeks.
  Description: Use the selected framework to develop a prototype that demonstrates its ability to meet the project requirements.

3.1.2 Prepare the Technology Transition

Ensure a smooth and secure transition to the new framework.

- Upgrade the API specification
  Objective: Bring the API specification into line with the OpenAPI 3 standard.
  Timeframe: 2 weeks.
  Description: Review and modify the current specification to ensure compatibility and good practice.
- Modify the authentication system
  Objective: Put in place a robust authentication system to guarantee the security of the API.
  Timeframe: 3 weeks.
  Description: Integrate modern authentication methods (for example, OAuth2) to strengthen security and simplify implementation.
- Train the team on the new framework
  Objective: Complete one week of self-training for the whole team on the selected framework.
  Timeframe: 1 week.
  Description: Use online resources and in-house training to acquire the necessary skills.
- Draw up a list of modules to migrate
  Objective: Create an ordered list of the existing modules to port, create or remove.
  Timeframe: 2 weeks.
  Description: Prioritize the modules by importance and complexity, including an estimate of the effort required for each one.

Minimum viable product: list of modules

3.1.3 Develop the New Web Application

Design and deploy a new web application that is performant and functional.
- Develop the required modules
  Objective: Design and develop at least 20 functional modules for the new application.
  Timeframe: 3 months.
  Description: Each module must be tested and validated against the quality and performance criteria.
- Optimize application performance
  Objective: Ensure that the performance of the new application is no more than 5% worse than that of the current application.
  Timeframe: 1 month.
  Description: Run regular performance tests and optimize the code and the infrastructure accordingly.

3.1.4 Develop Project Management and Technical Skills

Goal: Acquire and demonstrate the skills needed to succeed in the project.

- Put a project management approach in place
  Objective: Develop a structured project management approach to reach the objectives set.
  Timeframe: 1 week.
  Description: Use project management tools (such as Jira or Trello) to track tasks, progress and deadlines.
- Deliver the required deliverables
  Objective: Produce and deliver all the deliverables required by ESEO.
  Timeframe: According to the established deadlines.
  Description: Ensure the quality and completeness of all documents and deliverables.
- Obtain high marks in the PING assessments
  Objective: Obtain a mark above 14/20 in the three PING assessments.
  Timeframe: Next assessment session.
  Description: Prepare and revise the assessed subjects to ensure the best possible performance.

4 Approach

Notes:
- How the R&D department works
- Basic project management tools
- Redmine
- Development sheet

Tools:
- Project management
- Version control
- Development environment
- Development products
- Programming languages

4.1 General approach

The general approach of the project is defined by the present document. This document defines the expected goals and objectives. It also establishes an action plan to achieve them.
It sets up a communication process to ensure that information circulates. Finally, it identifies the risks and potential problems that could arise and establishes solutions to address them.

4.2 Methodology, techniques and technologies

Software development follows several essential stages that ensure the software is built and works correctly.

- Requirements analysis: Identify and define the needs of the users and the objectives of the software. The functional and non-functional requirements are specified, along with the project constraints.
- Design: Create a software architecture based on the requirements analysis. This involves producing diagrams, schematics and models that guide the construction of the software.
- Tests: Check the quality and conformity of the software. Unit, integration and validation tests are carried out to detect and correct errors and bugs.
- Integration: Assemble the different parts of the software into a functional and coherent whole. This includes adding further features and optimizing performance.
- Deployment: Make the software available to end users. This may involve installation on servers, distribution of installation files or publication on online platforms.
- Maintenance: Keep the software in working order. Errors are corrected, updates are made, new features may be added and improvements are brought in to meet the changing needs of users.

In this project, an agile project management approach will be used. This approach will allow changes and unforeseen events to be handled better, while limiting the tunnel effect (long stretches of work with no visible result).
Agile methodology

The agile methodology is a project management approach characterized by its flexibility, its continuous collaboration with stakeholders and its ability to adapt to change throughout the development cycle. Agile methodologies emphasize iterative and incremental delivery of the product, favouring short development cycles and frequent user feedback. Agility is often used in software development, but it can also be applied to other fields. For more information on agile methodologies, see https://www.atlassian.com/fr/agile.

The development phase will be divided into sprints. Each period spent at the company (2 to 4 weeks) will contain a single sprint. This segmentation will allow regular progress checks and foster continuous dialogue.

Sprint

A sprint in agile development is an iterative and incremental work cycle that allows a team to develop and deliver features regularly, while remaining flexible and adapting to changes and customer feedback. A sprint generally includes the following steps:
- Task planning at the start of the sprint
- Continuous development and testing
- Reviews and retrospectives at the end of the sprint
- Potential software delivery

The project will use a backlog to collect, organize and prioritize all the work to be done. Each feature to be developed will be specified as a user story and added to the backlog as a ticket. A user story is a technique for describing the customer's needs from the customer's point of view. The purpose of each user story is to describe a software feature in a simple and understandable way, focusing on the value delivered to the end user. Each user story may depend on several development tasks, which will be attached to its ticket.
At the start of each sprint, the backlog must be completed by adding the tasks to be handled over the coming three weeks. If necessary, the feature to be implemented may be modelled in order to meet the specifications established beforehand. Each sprint must include the specification, development, testing and documentation of one or more user stories.

A meeting will be held at the end of each sprint to track progress and take a critical look at the work done. The purpose of these meetings is to reschedule the tickets that could not be handled and to review what worked and what did not, in order to organize future sprints better. These meetings will be based on a single tracking document, which will be updated during each sprint.

UML will be the preferred notation for modelling the various aspects of the project, such as use cases, class diagrams and sequence diagrams. This will make it possible to describe the needs of the project more precisely and to visualize the parts involved more clearly.
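The backlog workflow described in this section (user stories stored as tickets, each with attached development tasks, and unfinished tickets rescheduled at the end-of-sprint meeting) can be sketched as a small data structure. This is only an illustration under stated assumptions: the numeric `priority` field, the `capacity` parameter and the example stories are hypothetical and are not taken from the project's actual tracking tool.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    title: str
    done: bool = False


@dataclass
class UserStory:
    """A backlog ticket: one feature described from the user's point of view."""
    title: str
    priority: int  # assumption: a lower number means the story is handled first
    tasks: list[Task] = field(default_factory=list)

    @property
    def done(self) -> bool:
        # A story without tasks is not considered finished.
        return bool(self.tasks) and all(task.done for task in self.tasks)


def plan_sprint(backlog: list[UserStory], capacity: int) -> list[UserStory]:
    """Select the highest-priority unfinished stories, up to `capacity` stories."""
    pending = [story for story in backlog if not story.done]
    return sorted(pending, key=lambda story: story.priority)[:capacity]


def carry_over(sprint: list[UserStory]) -> list[UserStory]:
    """Stories to reschedule at the end-of-sprint meeting."""
    return [story for story in sprint if not story.done]


# Hypothetical stories, loosely named after features listed earlier in the report.
backlog = [
    UserStory("Import a configuration file", priority=2, tasks=[Task("Upload form")]),
    UserStory("Configure network interfaces", priority=1,
              tasks=[Task("Read settings"), Task("Write settings")]),
    UserStory("Show debug information", priority=3, tasks=[Task("Log viewer")]),
]

sprint = plan_sprint(backlog, capacity=2)
for task in sprint[0].tasks:  # the first selected story is completed during the sprint
    task.done = True
print([story.title for story in carry_over(sprint)])  # ['Import a configuration file']
```

Keeping tasks attached to their story makes the end-of-sprint review a simple filter: a story is carried over as soon as one of its tasks is still open.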
4.3 Risk analysis

Table 2: SWOT (Strengths, Weaknesses, Opportunities, Threats)

Strengths (internal, helpful)
- Keeping the source code confidential
- Long-term cost savings
- Extending and strengthening the technical skills of the development team

Weaknesses (internal, harmful)
- Costly in time and resources
- May not meet the needs
- Difficulty in learning a new framework

Opportunities (external, helpful)
- New development opportunities for the company
- Broadening its field of expertise
- Benefiting from the expertise of the open-source community

Threats (external, harmful)
- Prohibitive license costs or restrictions on use
- Difficulty in finding an alternative that is compatible with the embedded target
- Development delays caused by learning a new framework

Risks and mitigation actions

Lack of skills in the new framework and the associated technologies
- Establish a schedule for learning and implementing the new framework
- Offer training to the development team to help with learning the new framework
- Mobilize additional resources to keep to the deadlines

Limited framework community
- Research developer communities to identify the most popular frameworks
- Assess the size of the community and the frequency of updates
- Assess the quality of the documentation and examples

Lack of resources
- Enlarge the development team to fill the gaps
- Extend the project deadlines
- Reduce the features to be developed

Incompatibility with the embedded target
- Examine the requirements of the embedded target and the specifications of the alternative
- Run compatibility tests to check whether the alternative meets the requirements of the embedded target

Resistance to change
- Communicate the advantages of the chosen framework
- Set up training for developers who do not know the new framework

Difficulty in finding a viable alternative
- Carry out a requirements analysis to identify the essential characteristics the alternative must have
- Research the license costs of the different alternatives

Quadrant chart risk analysis

4.4 Stakeholders

Table 3: Stakeholders

Actor | Role | Description
Joshua Jourdam | Project manager, developer | Provides the project deliverable. Leads and steers the project. Runs the meetings. Main developer.
Sébastien Barré | Project sponsor, client | Has overall responsibility for the project and supports the project manager. Takes the important decisions and arbitrates between options, particularly in a crisis. Determines what is expected of the project and the related requirements. Will validate that the deliverable meets his expectations and requirements.
Operators, testers, developers | End users | People who will use the product or service day to day. Express their support for or dissatisfaction with the project.
Jean-Louis Baudoin | Key user | Business expert and developer of the Qt application. Will be the main contact for gathering information about the current application.
David Dupond | Key user | License expert; can validate the license of the new tool.
Lucas Cecillon | Key user | Cybersecurity expert; can be consulted on security issues concerning the application and the API.
Interns | Support | May reinforce the project team, particularly during the summer periods.
4.5 Work packages

[Figure: WBS (Work Breakdown Structure), source file ./graphics/wbs.mmd]

4.6 Forecast schedule
.tick{stroke:lightgrey;opacity:0.8;shape-rendering:crispEdges;}#fig-forecast-schedule-mermaid .grid .tick text{font-family:"trebuchet ms",verdana,arial,sans-serif;fill:#ccc;}#fig-forecast-schedule-mermaid .grid path{stroke-width:0;}#fig-forecast-schedule-mermaid .today{fill:none;stroke:#DB5757;stroke-width:2px;}#fig-forecast-schedule-mermaid .task{stroke-width:2;}#fig-forecast-schedule-mermaid .taskText{text-anchor:middle;font-family:'trebuchet ms',verdana,arial,sans-serif;font-family:var(--mermaid-font-family);}#fig-forecast-schedule-mermaid .taskTextOutsideRight{fill:hsl(28.5714285714, 17.3553719008%, 86.2745098039%);text-anchor:start;font-family:'trebuchet ms',verdana,arial,sans-serif;font-family:var(--mermaid-font-family);}#fig-forecast-schedule-mermaid .taskTextOutsideLeft{fill:hsl(28.5714285714, 17.3553719008%, 86.2745098039%);text-anchor:end;}#fig-forecast-schedule-mermaid .task.clickable{cursor:pointer;}#fig-forecast-schedule-mermaid .taskText.clickable{cursor:pointer;fill:#003163!important;font-weight:bold;}#fig-forecast-schedule-mermaid .taskTextOutsideLeft.clickable{cursor:pointer;fill:#003163!important;font-weight:bold;}#fig-forecast-schedule-mermaid .taskTextOutsideRight.clickable{cursor:pointer;fill:#003163!important;font-weight:bold;}#fig-forecast-schedule-mermaid .taskText0,#fig-forecast-schedule-mermaid .taskText1,#fig-forecast-schedule-mermaid .taskText2,#fig-forecast-schedule-mermaid .taskText3{fill:hsl(28.5714285714, 17.3553719008%, 86.2745098039%);}#fig-forecast-schedule-mermaid .task0,#fig-forecast-schedule-mermaid .task1,#fig-forecast-schedule-mermaid .task2,#fig-forecast-schedule-mermaid .task3{fill:hsl(180, 1.5873015873%, 35.3529411765%);stroke:#ffffff;}#fig-forecast-schedule-mermaid .taskTextOutside0,#fig-forecast-schedule-mermaid .taskTextOutside2{fill:lightgrey;}#fig-forecast-schedule-mermaid .taskTextOutside1,#fig-forecast-schedule-mermaid .taskTextOutside3{fill:lightgrey;}#fig-forecast-schedule-mermaid 
.active0,#fig-forecast-schedule-mermaid .active1,#fig-forecast-schedule-mermaid .active2,#fig-forecast-schedule-mermaid .active3{fill:#81B1DB;stroke:#ffffff;}#fig-forecast-schedule-mermaid .activeText0,#fig-forecast-schedule-mermaid .activeText1,#fig-forecast-schedule-mermaid .activeText2,#fig-forecast-schedule-mermaid .activeText3{fill:hsl(28.5714285714, 17.3553719008%, 86.2745098039%)!important;}#fig-forecast-schedule-mermaid .done0,#fig-forecast-schedule-mermaid .done1,#fig-forecast-schedule-mermaid .done2,#fig-forecast-schedule-mermaid .done3{stroke:grey;fill:lightgrey;stroke-width:2;}#fig-forecast-schedule-mermaid .doneText0,#fig-forecast-schedule-mermaid .doneText1,#fig-forecast-schedule-mermaid .doneText2,#fig-forecast-schedule-mermaid .doneText3{fill:hsl(28.5714285714, 17.3553719008%, 86.2745098039%)!important;}#fig-forecast-schedule-mermaid .crit0,#fig-forecast-schedule-mermaid .crit1,#fig-forecast-schedule-mermaid .crit2,#fig-forecast-schedule-mermaid .crit3{stroke:#E83737;fill:#E83737;stroke-width:2;}#fig-forecast-schedule-mermaid .activeCrit0,#fig-forecast-schedule-mermaid .activeCrit1,#fig-forecast-schedule-mermaid .activeCrit2,#fig-forecast-schedule-mermaid .activeCrit3{stroke:#E83737;fill:#81B1DB;stroke-width:2;}#fig-forecast-schedule-mermaid .doneCrit0,#fig-forecast-schedule-mermaid .doneCrit1,#fig-forecast-schedule-mermaid .doneCrit2,#fig-forecast-schedule-mermaid .doneCrit3{stroke:#E83737;fill:lightgrey;stroke-width:2;cursor:pointer;shape-rendering:crispEdges;}#fig-forecast-schedule-mermaid .milestone{transform:rotate(45deg) scale(0.8,0.8);}#fig-forecast-schedule-mermaid .milestoneText{font-style:italic;}#fig-forecast-schedule-mermaid .doneCritText0,#fig-forecast-schedule-mermaid .doneCritText1,#fig-forecast-schedule-mermaid .doneCritText2,#fig-forecast-schedule-mermaid .doneCritText3{fill:hsl(28.5714285714, 17.3553719008%, 86.2745098039%)!important;}#fig-forecast-schedule-mermaid .activeCritText0,#fig-forecast-schedule-mermaid 
.activeCritText1,#fig-forecast-schedule-mermaid .activeCritText2,#fig-forecast-schedule-mermaid .activeCritText3{fill:hsl(28.5714285714, 17.3553719008%, 86.2745098039%)!important;}#fig-forecast-schedule-mermaid .titleText{text-anchor:middle;font-size:18px;fill:#ccc;font-family:'trebuchet ms',verdana,arial,sans-serif;font-family:var(--mermaid-font-family);}#fig-forecast-schedule-mermaid :root{--mermaid-font-family:"trebuchet ms",verdana,arial,sans-serif;} 01/06 01/07 01/08 01/09 01/10 01/11 01/12 01/01 01/02 01/03 01/04 01/05 01/06 01/07 01/08 01/09Réunion de lancement Collecte informations Rapport avant projet Réunion premier choix OpenAPI Soutenance avant projet Authentification FormationPOC Réunion de validation Feuille de route Sprints Rapport intermédiaire Soutenance intermédiaire Réunion de clôture Rapport final Soutenance final EvaluationPréparationDéveloppementEseo Figure 11: Diagramme de GANTT prévisionnel 4.7 Planning effectif #fig-actual-schedule-mermaid{font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:16px;fill:#ccc;}#fig-actual-schedule-mermaid .error-icon{fill:#a44141;}#fig-actual-schedule-mermaid .error-text{fill:#ddd;stroke:#ddd;}#fig-actual-schedule-mermaid .edge-thickness-normal{stroke-width:2px;}#fig-actual-schedule-mermaid .edge-thickness-thick{stroke-width:3.5px;}#fig-actual-schedule-mermaid .edge-pattern-solid{stroke-dasharray:0;}#fig-actual-schedule-mermaid .edge-pattern-dashed{stroke-dasharray:3;}#fig-actual-schedule-mermaid .edge-pattern-dotted{stroke-dasharray:2;}#fig-actual-schedule-mermaid .marker{fill:lightgrey;stroke:lightgrey;}#fig-actual-schedule-mermaid .marker.cross{stroke:lightgrey;}#fig-actual-schedule-mermaid svg{font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:16px;}#fig-actual-schedule-mermaid .mermaid-main-font{font-family:"trebuchet ms",verdana,arial,sans-serif;font-family:var(--mermaid-font-family);}#fig-actual-schedule-mermaid .exclude-range{fill:hsl(52.9411764706, 28.813559322%, 
48.431372549%);}#fig-actual-schedule-mermaid .section{stroke:none;opacity:0.2;}#fig-actual-schedule-mermaid .section0{fill:hsl(52.9411764706, 28.813559322%, 58.431372549%);}#fig-actual-schedule-mermaid .section2{fill:#EAE8D9;}#fig-actual-schedule-mermaid .section1,#fig-actual-schedule-mermaid .section3{fill:#333;opacity:0.2;}#fig-actual-schedule-mermaid .sectionTitle0{fill:#F9FFFE;}#fig-actual-schedule-mermaid .sectionTitle1{fill:#F9FFFE;}#fig-actual-schedule-mermaid .sectionTitle2{fill:#F9FFFE;}#fig-actual-schedule-mermaid .sectionTitle3{fill:#F9FFFE;}#fig-actual-schedule-mermaid .sectionTitle{text-anchor:start;font-family:'trebuchet ms',verdana,arial,sans-serif;font-family:var(--mermaid-font-family);}#fig-actual-schedule-mermaid .grid .tick{stroke:lightgrey;opacity:0.8;shape-rendering:crispEdges;}#fig-actual-schedule-mermaid .grid .tick text{font-family:"trebuchet ms",verdana,arial,sans-serif;fill:#ccc;}#fig-actual-schedule-mermaid .grid path{stroke-width:0;}#fig-actual-schedule-mermaid .today{fill:none;stroke:#DB5757;stroke-width:2px;}#fig-actual-schedule-mermaid .task{stroke-width:2;}#fig-actual-schedule-mermaid .taskText{text-anchor:middle;font-family:'trebuchet ms',verdana,arial,sans-serif;font-family:var(--mermaid-font-family);}#fig-actual-schedule-mermaid .taskTextOutsideRight{fill:hsl(28.5714285714, 17.3553719008%, 86.2745098039%);text-anchor:start;font-family:'trebuchet ms',verdana,arial,sans-serif;font-family:var(--mermaid-font-family);}#fig-actual-schedule-mermaid .taskTextOutsideLeft{fill:hsl(28.5714285714, 17.3553719008%, 86.2745098039%);text-anchor:end;}#fig-actual-schedule-mermaid .task.clickable{cursor:pointer;}#fig-actual-schedule-mermaid .taskText.clickable{cursor:pointer;fill:#003163!important;font-weight:bold;}#fig-actual-schedule-mermaid .taskTextOutsideLeft.clickable{cursor:pointer;fill:#003163!important;font-weight:bold;}#fig-actual-schedule-mermaid 
.taskTextOutsideRight.clickable{cursor:pointer;fill:#003163!important;font-weight:bold;}#fig-actual-schedule-mermaid .taskText0,#fig-actual-schedule-mermaid .taskText1,#fig-actual-schedule-mermaid .taskText2,#fig-actual-schedule-mermaid .taskText3{fill:hsl(28.5714285714, 17.3553719008%, 86.2745098039%);}#fig-actual-schedule-mermaid .task0,#fig-actual-schedule-mermaid .task1,#fig-actual-schedule-mermaid .task2,#fig-actual-schedule-mermaid .task3{fill:hsl(180, 1.5873015873%, 35.3529411765%);stroke:#ffffff;}#fig-actual-schedule-mermaid .taskTextOutside0,#fig-actual-schedule-mermaid .taskTextOutside2{fill:lightgrey;}#fig-actual-schedule-mermaid .taskTextOutside1,#fig-actual-schedule-mermaid .taskTextOutside3{fill:lightgrey;}#fig-actual-schedule-mermaid .active0,#fig-actual-schedule-mermaid .active1,#fig-actual-schedule-mermaid .active2,#fig-actual-schedule-mermaid .active3{fill:#81B1DB;stroke:#ffffff;}#fig-actual-schedule-mermaid .activeText0,#fig-actual-schedule-mermaid .activeText1,#fig-actual-schedule-mermaid .activeText2,#fig-actual-schedule-mermaid .activeText3{fill:hsl(28.5714285714, 17.3553719008%, 86.2745098039%)!important;}#fig-actual-schedule-mermaid .done0,#fig-actual-schedule-mermaid .done1,#fig-actual-schedule-mermaid .done2,#fig-actual-schedule-mermaid .done3{stroke:grey;fill:lightgrey;stroke-width:2;}#fig-actual-schedule-mermaid .doneText0,#fig-actual-schedule-mermaid .doneText1,#fig-actual-schedule-mermaid .doneText2,#fig-actual-schedule-mermaid .doneText3{fill:hsl(28.5714285714, 17.3553719008%, 86.2745098039%)!important;}#fig-actual-schedule-mermaid .crit0,#fig-actual-schedule-mermaid .crit1,#fig-actual-schedule-mermaid .crit2,#fig-actual-schedule-mermaid .crit3{stroke:#E83737;fill:#E83737;stroke-width:2;}#fig-actual-schedule-mermaid .activeCrit0,#fig-actual-schedule-mermaid .activeCrit1,#fig-actual-schedule-mermaid .activeCrit2,#fig-actual-schedule-mermaid .activeCrit3{stroke:#E83737;fill:#81B1DB;stroke-width:2;}#fig-actual-schedule-mermaid 
.doneCrit0,#fig-actual-schedule-mermaid .doneCrit1,#fig-actual-schedule-mermaid .doneCrit2,#fig-actual-schedule-mermaid .doneCrit3{stroke:#E83737;fill:lightgrey;stroke-width:2;cursor:pointer;shape-rendering:crispEdges;}#fig-actual-schedule-mermaid .milestone{transform:rotate(45deg) scale(0.8,0.8);}#fig-actual-schedule-mermaid .milestoneText{font-style:italic;}#fig-actual-schedule-mermaid .doneCritText0,#fig-actual-schedule-mermaid .doneCritText1,#fig-actual-schedule-mermaid .doneCritText2,#fig-actual-schedule-mermaid .doneCritText3{fill:hsl(28.5714285714, 17.3553719008%, 86.2745098039%)!important;}#fig-actual-schedule-mermaid .activeCritText0,#fig-actual-schedule-mermaid .activeCritText1,#fig-actual-schedule-mermaid .activeCritText2,#fig-actual-schedule-mermaid .activeCritText3{fill:hsl(28.5714285714, 17.3553719008%, 86.2745098039%)!important;}#fig-actual-schedule-mermaid .titleText{text-anchor:middle;font-size:18px;fill:#ccc;font-family:'trebuchet ms',verdana,arial,sans-serif;font-family:var(--mermaid-font-family);}#fig-actual-schedule-mermaid :root{--mermaid-font-family:"trebuchet ms",verdana,arial,sans-serif;} 01/06 01/07 01/08 01/09 01/10 01/11 01/12 01/01 01/02 01/03 01/04 01/05 01/06 01/07 01/08 01/09Réunion de lancement Recherche technologies Rapport avant projet Soutenance avant projet POC Spécification API Spécification fonctionelle Présentation des frameworks Présentation application React / Choix définitif Création du projet Sprint 1 Point authentification Authentification Point stratégie de tests Validation specification API Fiche de développement V1 Test déploiement sur cible Sprint 2 Sprint 3 Soutenance intermédiaire Sprint 4 Formation React Introduction plateforme de développementSprint 5 Rapport final Soutenance finale EvaluationPréparationDéveloppementEseo Figure 12: Diagramme de GANTT effectif 4.7.1 Jalons Réunion de lancement : Cadre le projet, enjeux, objectifs, finalité, jalons projet, risques identifiés, calendrier projet, … Réunion premier 
choix : Choix du framework pour commencer le POC Réunion de validation : Validation définitive du choix du framework, planification des étapes de développement Réunion en début de sprint : Tri, organisation, revue des tâches à réaliser Réunion de fin de sprint: Bilan, présentation des tâches réalisés, revue des spécification, replanification des tâches non réalisées, … Réunion de clôture : Fin du projet, assure une transition pour la maintenance et l’exploitation du projet 4.7.2 Livrables Document de comparaison des frameworks avec une évaluation quantitative de chaque critère, basée sur une analyse détaillée. Document de spécification, détaillant les fonctionnalités, les exigences système, les ressources requises et les performances attendues. Prototype (POC) fonctionnel de l’interface web, respectant toutes les spécifications documentées. Feuille de route pour la mise en place du framework, détaillant les étapes à suivre pour la pour le développement des fonctionnalités, les ressources requises et les délais. Documentation développeur qui comprends un guide de démarrage et les bonnes pratiques de développement. Documentation utilisateur qui comprends un guide d’utilisation et les bonnes pratiques d’utilisation. Rapports et soutenances de projet. 4.8 Budget La durée du projet PING est estimée à 500 heures minimum. Ce projet devrait nécessiter plus de ressources. Le temps total nécessaire à la réalisation du projet est difficilement quantifiable pour le moment. Étant donné que le projet est cadré pour une seule personne, le coût peut être calculé en utilisant le tarif horaire du salaire brut annuel moyen d’un ingénieur débutant. Cela représente environ 10 000 €. Des coûts d’intervention de développement externes supplémentaires ponctuels peuvent s’ajouter, mais ils sont plus difficiles à chiffrer. La mobilisation de certains membres de l’équipe logiciel lors des réunions peut également être prise en compte. 
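As a rough check of the budget figures above, the 10,000 € estimate and the 500-hour minimum together imply an hourly rate of about 20 €. The symbols $C$ (cost), $T$ (time) and $r$ (hourly rate) are introduced here for illustration only and do not appear in the original estimate:

$$
C \approx T \times r, \qquad r \approx \frac{10\,000\ \text{EUR}}{500\ \text{h}} = 20\ \text{EUR/h}
$$

Under this assumption, every hour beyond the 500-hour minimum would add to the cost at the same rate.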
5 Results

5.1 Evaluation of the technologies on the market

The first phase of the project aimed to find an alternative to the Qt framework for developing applications for the web platform. To do so, I drew up a list of criteria for evaluating each solution.

Development needs / Functional needs:

- Programming language
- Community and support
- Closeness to the existing solution
- Release date
- Documentation
- Scalability
- License
- Tooling
- Design pattern
- Platforms
- Popularity
- Possible obstacles

I identified all the technologies that can be used to develop applications for the web platform. They fall into two categories: native and cross-platform. The first category covers all the frameworks based on the JavaScript language. The second brings together more diverse solutions based on other languages such as C#, Dart or C++. These technologies can target several platforms (web, desktop, mobile) from a single code base.

This first breakdown led me to test five solutions.

Native:

- React
- Vue
- Angular

Cross-platform:

- Flutter
- Uno Platform

I defined a single test procedure. The goal was to understand the basic operation and architecture of each framework, as well as the language it is built on. The test consisted of building a page that displays data received from the REST API and a form for changing the equipment's host name.

From this short study, I concluded that the cross-platform frameworks are less suited to our needs. They are more complex and do not integrate fully with the features of a web browser. The native solutions appeared simpler to use.

The next step was to build a more complete POC covering the features of some modules of the Qt application.
I opted for React, the most widely used framework. I actively supervised the internship of Romain LeDivenah, who took charge of building this POC. Although I was a beginner myself, I was able to guide Romain in getting started with React. Over the month of supervision, the goal was to develop a working prototype that met the defined criteria and included a menu and three distinct modules.

5.2 Transition

5.2.1 OpenAPI

The API specification is defined in a Word document that evolves as features are added. This documentation can be improved. Specialized tools that follow industry standards exist; they can generate documentation, models and code from a description file.

I wanted to migrate our REST API specification to the OpenAPI 3 standard, so that the specification follows rules that are current in the industry. Many tools rely on this standard, and I wanted to integrate some of them as well.

Romain LeDivenah's internship allowed me to work on other aspects in parallel, such as the API specification and the integration of tools.

I wrote the REST API specification following the OpenAPI 3.0.0 standard. I also set up Swagger UI to provide interactive API documentation generated from this specification. I then used OpenAPI Generator to generate a JavaScript client SDK. This step will simplify the use of the REST API in the application to be developed.

The last tool I deployed is Stoplight Prism, an open-source HTTP mock server that emulates the behavior of our API from a set of examples that can be defined in the specification.
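To make the idea of examples embedded in the specification more concrete, here is a minimal sketch of an OpenAPI 3.0.0 description written as a TypeScript object, with a helper that looks up the example a mock server could return. The `/hostname` path, its fields and the helper names are hypothetical and chosen only for illustration; they are not taken from the actual API specification, and the lookup is a simplification rather than Prism's real behavior.

```typescript
// Minimal structural types for the parts of an OpenAPI 3.0 document used here.
type HttpMethod = "get" | "put" | "post" | "delete";
interface MediaType { example?: unknown }
interface ResponseObject { description: string; content?: Record<string, MediaType> }
interface Operation { summary: string; responses: Record<string, ResponseObject> }
type PathItem = Partial<Record<HttpMethod, Operation>>;
interface OpenApiDocument {
  openapi: string;
  info: { title: string; version: string };
  paths: Record<string, PathItem>;
}

// Hypothetical description: one resource with a read and an update operation.
const spec: OpenApiDocument = {
  openapi: "3.0.0",
  info: { title: "Equipment REST API (illustrative)", version: "0.0.0" },
  paths: {
    "/hostname": {
      get: {
        summary: "Read the equipment host name",
        responses: {
          "200": {
            description: "Current host name",
            content: { "application/json": { example: { hostname: "device-01" } } },
          },
        },
      },
      put: {
        summary: "Change the equipment host name",
        responses: { "204": { description: "Host name updated" } },
      },
    },
  },
};

// Lists every operation as "METHOD path", the kind of index a documentation tool displays.
function listOperations(doc: OpenApiDocument): string[] {
  const operations: string[] = [];
  for (const path of Object.keys(doc.paths)) {
    for (const method of Object.keys(doc.paths[path])) {
      operations.push(method.toUpperCase() + " " + path);
    }
  }
  return operations;
}

// Returns the JSON example declared for an operation and status, or undefined if none exists.
function exampleResponse(
  doc: OpenApiDocument,
  path: string,
  method: HttpMethod,
  status: string
): unknown {
  const pathItem = doc.paths[path];
  if (pathItem === undefined) return undefined;
  const operation = pathItem[method];
  if (operation === undefined) return undefined;
  const response = operation.responses[status];
  if (response === undefined || response.content === undefined) return undefined;
  const media = response.content["application/json"];
  return media === undefined ? undefined : media.example;
}
```

Keeping the examples inside the description means the documentation, the generated client and the mock responses all derive from the same file.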
This tool makes it possible to test the API without having any equipment at hand, or to develop the graphical part before, or in parallel with, the integration of a new feature into the CAP4000.

Example of documentation generated with Swagger UI

5.2.2 Authentication

API authentication is based on a shared secret. This solution could be used with the Qt application because it was compiled to a binary format. It raises major issues, however, particularly in the context of JavaScript. Because JavaScript is an interpreted language, its source code is generally sent to the client, which exposes the shared secret inside the code. This exposure is a serious threat to the security of the API: a malicious user could access the source code, retrieve the shared secret and compromise authentication.

Using the (unsecured) HTTP protocol to communicate with the API is an additional risk. Data sent over the network is not encrypted, which allows an attacker to intercept requests and obtain the shared secret.

The vulnerability of this approach highlights the need for more secure strategies, such as token-based authentication and the use of HTTPS to communicate with the API.

Lucas CECILLON, the cybersecurity expert, was able to work on this part by specifying and implementing a new authentication mechanism based on a JWT and on HTTPS. This mechanism also introduced other issues, such as managing the certificates used to encrypt the HTTPS connection on an equipment.

The HTTP protocol is used to communicate with the REST API. Authentication is based on a shared secret. This secret is stored in the source code of the Qt application.
Authentication relies on several safeguards:

- Request signing with an HMAC (based on the shared secret)
- A timestamp to prevent replay attacks

Unlike the Qt web HMI, which uses WebAssembly (the project is compiled and then executed directly by the browser), React sends its sources (JS, HTML, CSS) to the browser, where they are executed. The secret is therefore easy to intercept and visible to anyone. It is consequently essential to manage the user session more securely.

Several solutions were considered for securing authentication:

- JWT
- Bearer
- Session cookie
- Basic

The solution put in place, which follows the recommendations of ANSSI, revolves around three points:

- Use of the HTTPS protocol to secure exchanges
- Use of JWT (JSON Web Token) for authentication
- Certificate management to encrypt communications

5.2.2.1 Protection measures

- SSL
- CSRF
- CSP
- CORS
- Cookie protection
- HSTS
- XSS
- Token refresh
- Cryptography
- Replay

5.2.2.2 HTTPS

According to the study [DR01], communication between the client and the API must use the HTTPS protocol (HTTP secured with TLS). This means that every call to the API must go through the HTTPS port. For example: https://192.168.0.1:3001/api/v3/test.

Likewise, the user must only be able to reach the web application over HTTPS. No access over HTTP will be possible.

All of this requires proper certificate management on the controller, to prevent access problems caused by corrupted or expired certificates.

5.2.2.3 JWT

From now on, request authentication will be based on access tokens. These tokens will be generated by the API and returned once authentication has succeeded.
From then on, for every request until the client logs out, the token must be included in the request sent to the API. Before processing a request, the API will systematically check the content and integrity of the token as follows:

• Is it valid?
  ◦ SHA256(header + "." + payload) == signature
• Has it expired?
  ◦ "exp" value
• Is the user allowed to make this request?
  ◦ "role" value

According to the study [DR01], the API must return the token in a secure cookie. The parameters to apply to the cookie are:

• name: name (name of the cookie)
• value: value (value of the token)
• domain: domain (domain name of the site, or IP address)
• path: "/" (cookie path)
• sameSite: strict (isolation policy)
• httpOnly: true (not accessible from client-side code)
• session: true (session cookie)
• secure: true (restricted to secure channels)

The API must set the cookie domain according to the IP address on which it was contacted.

5.2.2.4 Certificates

Introducing a new version of the HMI and of the API that communicate exclusively over HTTPS raises questions about certificate management on the controller. If the HMI user manages to upload a corrupted private key/certificate pair to the controller, the web server will no longer be reachable over HTTPS. This would prevent any configuration by the client, which is not acceptable.

To avoid this situation, a robust TLS configuration management solution is essential. It rests on three main pillars: automatic generation of self-signed fallback certificates, secure import of certificates, and secure use of certificates.
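Coming back to the token checks and cookie parameters listed in section 5.2.2.3, the following sketch shows one possible ordering of the three checks and one way of serializing the cookie attributes. It is an illustration under assumptions, not the implemented mechanism: the function names, the result labels and the cookie name are hypothetical, and cryptographic signature verification is deliberately left to a dedicated JWT library and represented here by a boolean.

```typescript
// Claims read from the token payload; "exp" and "role" are the fields named in the text.
interface TokenClaims {
  exp: number; // expiry time, in seconds since the Unix epoch
  role: string;
}

type TokenCheck = "ok" | "invalid-signature" | "expired" | "forbidden";

// Applies the three checks in order: validity, expiry, then authorization.
// `signatureValid` stands for the result of a real signature verification,
// which this sketch does not perform.
function checkToken(
  signatureValid: boolean,
  claims: TokenClaims,
  nowSeconds: number,
  allowedRoles: string[]
): TokenCheck {
  if (!signatureValid) return "invalid-signature";
  if (nowSeconds >= claims.exp) return "expired";
  if (allowedRoles.indexOf(claims.role) === -1) return "forbidden";
  return "ok";
}

// Builds a Set-Cookie header value with the attributes listed above.
// Omitting Expires and Max-Age is what makes it a session cookie.
function buildSessionCookie(name: string, token: string, domain: string): string {
  return [
    name + "=" + token,
    "Domain=" + domain,
    "Path=/",
    "SameSite=Strict",
    "HttpOnly",
    "Secure",
  ].join("; ");
}
```

Checking the signature first means that the `exp` and `role` values are only trusted once the token is known to be authentic.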
Flowchart: fallback certificates

5.2.3 POC: internship supervision

5.3 Development

5.3.1 Application specification

Figure 13: Redmine, CAP4000 web HMI

Figure 14: User story: configuration of the network interfaces

The Qt application has no specification of its own. I used Redmine to create a backlog of all the features to be developed. I also used it to write user stories describing these features. The user stories are used to define the acceptance criteria of the features and to track development progress.

Evolution of the specification:

- Before / after
- Comparison
- Usefulness
- Changes

5.3.2 How the solution works

The same principle is kept; see the context diagram.

How React works: TODO

5.3.3 Architecture/Workspace

- Explain how a monorepo works
- CI/CD (commitlint, semantic versioning, changelog)
- Development (branches, pull requests)
- SVN / Git comparison -> GitLab

I created the project environment in a "monorepo", a single repository that can hold several projects. I chose this approach because we want to develop one application per product. As with the Qt application, the code will be split into modules that may or may not be included in a given product's application.

I also put in place a code quality process and development methodologies. The application adopts semantic versioning to manage project versions. I prepared a changelog to record every change to the application. Finally, I integrated commitlint to check that commit messages follow a writing convention.

The project is versioned with the Subversion configuration management tool. It will follow a branch-based development workflow: each feature will be developed in a dedicated branch.
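The commit convention and semantic versioning mentioned above can work together: if commit messages follow a structured format, the next version number can be derived from them. The sketch below assumes the Conventional Commits format (`type(scope): subject`) and the usual mapping of `fix` to a patch bump and `feat` to a minor bump; the function names and sample messages are hypothetical, and the project's actual commitlint rules are not described in this report.

```typescript
type Bump = "major" | "minor" | "patch" | "none";

// Header format assumed here: type, optional "(scope)", optional "!", then ": subject".
const COMMIT_HEADER = /^(\w+)(\([^)]+\))?(!)?: .+/;

// Returns true when the first line of the message follows the assumed convention.
function isConventional(message: string): boolean {
  return COMMIT_HEADER.test(message.split("\n")[0]);
}

// Maps one commit message to the version bump it calls for.
function bumpFor(message: string): Bump {
  const match = COMMIT_HEADER.exec(message.split("\n")[0]);
  if (match === null) return "none";
  if (match[3] === "!" || message.indexOf("BREAKING CHANGE:") !== -1) return "major";
  if (match[1] === "feat") return "minor";
  if (match[1] === "fix") return "patch";
  return "none";
}

// Computes the next "major.minor.patch" version from the strongest bump found.
// Simplification: versions below 1.0.0 are treated like any other version.
function nextVersion(current: string, messages: string[]): string {
  const rank: Record<Bump, number> = { none: 0, patch: 1, minor: 2, major: 3 };
  let strongest: Bump = "none";
  for (const message of messages) {
    const bump = bumpFor(message);
    if (rank[bump] > rank[strongest]) strongest = bump;
  }
  const parts = current.split(".").map(Number);
  if (strongest === "major") return (parts[0] + 1) + ".0.0";
  if (strongest === "minor") return parts[0] + "." + (parts[1] + 1) + ".0";
  if (strongest === "patch") return parts[0] + "." + parts[1] + "." + (parts[2] + 1);
  return current;
}
```

For instance, `nextVersion("0.1.0", ["fix(log): handle empty journal", "feat(network): add IP form"])` yields `"0.2.0"`, because the feature commit outranks the fix.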
Once a feature is complete, its branch is merged into the main branch. This approach makes it possible to work on several features in parallel while keeping the main branch stable.

Git graph: branches main, fea-network and fix-log; tags V0.1 and V0.2

5.3.4 Libraries

(zod, react-hook-form, react-query, react-router, orval, react-testing-library, vitest)

5.3.4.1 Routing

- Definition
- Modules
- "Plug-in" management
- Multi-app

5.3.5 Error and loading-state handling

5.3.6 Authentication and authorization

- Users and roles: principles
- Note
- Existing solution: HTTP / auth headers
- Problem
- Solutions
- HTTPS/JWT choice
- Re-authentication
- Flow
- Dashboard (Lucas)

5.3.6.1 RBAC

5.3.7 Application

- Primitives/Core
- Routing
- Navigation
- Layout
- Home page

Screenshots: Qt / React

5.3.8 Modules

I studied the Qt application to understand how it works and how it is structured. I then identified its modules and the features they provide, and grouped together modules that share similar features. Finally, I drew up a list that prioritizes the modules by importance. This list is shown in the table below.

Table 4: Module priority order

| Priority | Qt modules (upper case) and corresponding React modules |
|---|---|
| 0 | IDENTIFICATION Shell; REDEMARRAGE CALCULATEUR Shell; DECONNEXION Shell |
| 1 | ETHERNET ADRESSES IP Network NODES; CARTES ETAT DES CARTES Cards ETAT DES E/S; DATE/HEURE ETAT SYNCHRO Date and time MODE SYNCHRO FUSEAU HORAIRE DATE / HEURE; JOURNAL JOURNAL Log RAZ DU JOURNAL |
| 2 | CONFIGURATION IMPORT CONFIGURATION Configuration EXPORT CONFIGURATION; MOTS DE PASSE Users; CERT. EQUIPEMENT Certificates; CERTIFICATS CA Certificate authorities |
| 3 | LISTE MNEMOS Mnemonics; SYS LOG SYSLOG; SNMP SNMP |
| 4 | SERVEUR IEC 60870 IEC 61850 CLIENT IEC 61850 SERVEUR MODBUS MODBUS MAITRE MODBUS MODBUS ESCLAVE MQTT MQTT |
| 5 | FICHIER File; IMPRIMANTES Printers; DEBOGAGE Debugging |

5.3.8.1 Network

Screenshots: Qt / React

The first module I worked on is the network module. It is used to configure the IP addresses of the equipment's network interfaces. It is also used to configure the network nodes, which are remote devices that can be connected to the main equipment. This module is essential for configuring the equipment and for ensuring communication with the other devices on the network.
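As an illustration of the kind of check a network configuration form needs before sending data to the API, here is a minimal sketch that validates an IPv4 address and a subnet mask. The field names, error messages and validation rules are assumptions made for this example; the actual rules applied by the network module are not described in this report.

```typescript
// Parses dotted-decimal IPv4 text into a number, or returns null if it is malformed.
// Octets with leading zeros (for example "01") are rejected.
function parseIpv4(text: string): number | null {
  const parts = text.split(".");
  if (parts.length !== 4) return null;
  let value = 0;
  for (const part of parts) {
    if (!/^(0|[1-9]\d{0,2})$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    value = value * 256 + octet;
  }
  return value;
}

// A subnet mask is accepted when its set bits are contiguous from the left.
// The all-zero mask is rejected here as a design choice for interface configuration.
function isContiguousMask(mask: number): boolean {
  if (mask === 0) return false;
  const inverted = 0xffffffff - mask;
  // For a contiguous mask, inverted + 1 is a power of two and shares no bit with inverted.
  return ((inverted + 1) & inverted) === 0;
}

// Hypothetical shape of one interface entry in the form.
interface InterfaceConfig {
  address: string;
  netmask: string;
}

// Returns a list of human-readable errors; an empty list means the entry is valid.
function validateInterface(config: InterfaceConfig): string[] {
  const errors: string[] = [];
  if (parseIpv4(config.address) === null) {
    errors.push("address: not a valid IPv4 address");
  }
  const mask = parseIpv4(config.netmask);
  if (mask === null) {
    errors.push("netmask: not a valid IPv4 address");
  } else if (!isContiguousMask(mask)) {
    errors.push("netmask: bits are not contiguous");
  }
  return errors;
}
```

In the React application, a schema library such as zod could express the same rules declaratively; the plain functions above only make the logic explicit.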
HSR and PRP TODO: explain HSR and PRP https://fr.belden.com/solutions/high-availability-seamless-redundancy 5.3.8.2 Log (Journal) 5.3.8.3 Date and time (Date et heure) 5.3.8.4 Boards (Cartes) 5.3.8.5 Configuration 5.3.8.6 Users (Utilisateurs) 5.3.8.7 Certificates (Certificats) 5.3.8.8 Certificate authorities (Autorités de certification) 5.3.8.9 Mnemonics (Mnémoniques) 5.3.9 Tests 5.3.10 Performance The performance of both applications was evaluated with Chrome's Lighthouse tool, an open-source automated tool for measuring the quality of web pages. Its reports help identify performance, accessibility and compatibility problems. [Figures: Qt application report; React application report] In addition, the tool does not correctly measure initial load times when WebAssembly (Qt) applications start up. The Qt application takes about 15 seconds to load, whereas the React application loads almost instantly (around 1 second). Finally, the tool does not reflect the user experience. The React application is smoother and more responsive, and its cache layer also limits data reloading. The Qt application's interface is simpler but does not allow fast navigation between pages. 
5.3.11 Application 5.3.11.1 Sprint 1 Self-training Application development Shell and layout Network module Search for a test suite Integration of relevant libraries Routing and navigation: react-router Client cache: react-query 5.3.11.2 Sprint 2 Application development Log module Date and time module Boards module Refactoring of the modules already developed (modules 1 to 4) Delegation of form handling Data validation: zod Form handling: react-hook-form Setup of multi-application management Setup of authentication and authorization principles with multi-user management Setup of component tests with cypress Conformance Non-regression 5.3.11.3 Sprint 3 Writing tests for modules 1 to 4 Deployment test on target with a new web server Issues related to HMI development Layout Front/back Authentication Data exchange Technologies Rapid evolution What is React? How it works => a single responsibility => rendering; a very large ecosystem, with the need to adopt other libraries to extend functionality and ensure code quality 6 Conclusion In conclusion, the project, started in June 2023, has delivered a more ergonomic and more secure configuration application for SDEL controllers. The new features and interfaces were designed together with end users, and the positive feedback attests to the effectiveness of this user-centred approach. This new application extends our service offering and strengthens our position in the energy market, where cybersecurity constraints have grown more demanding over the years. Vigilance remains necessary, and proactive post-deployment management is recommended to keep adapting to technological change and to new threats. 
This project is thus an important milestone on our path towards innovation and the longevity of our systems. Today the application is used daily by our customers' maintenance operators and by the in-house test teams. On a personal level, this project concludes my engineering training. It marks the end of my studies and allowed me to put into practice the knowledge acquired during these three years of apprenticeship. Technically and scientifically, the project allowed me to consolidate my computing skills. I discovered new technologies and deepened my knowledge of web development. In terms of skills growth, I saw a clear improvement in my understanding of web applications and of the constraints that apply to embedded systems and private networks. Economically, costing the project and drawing up a roadmap gave me practical experience of how these processes work and why they are needed in a company. This made me aware of the financial implications of the choices made as project lead and steered my thinking towards a more strategic and pragmatic approach. Organizationally, managing the project confronted me with challenges in coordination, planning and task tracking. I was able to apply the agile methods learned in class. I also became more autonomous. I was sometimes too distant at the start, but as the project advanced I improved how I passed information on to the various project stakeholders. These organizational skills will undoubtedly benefit my future career. As for my career plans, this experience confirmed my passion for software development and clarified the next steps of my career. 
My goal is to move towards software development, particularly web and full-stack development. To conclude, this experience was formative and enriching in many respects. It shaped my professional identity and strengthened my determination to excel in computing. I am ready to take on new challenges and to contribute significantly to future projects. 6.1 Review The project began with setting up the working environment and self-directed learning of the technologies, with few in-house references. About 600 to 700 hours have been devoted to the project, with half of it already completed. Most objectives of the first phases have been met. Authentication remains to be validated in the next sprint. Refactorings were needed as learning progressed. The project has therefore reached significant milestones, but finishing it still requires substantial effort: about 600 hours of work remain to complete the development of the remaining modules. 7 Appendices 7.1 References 7.2 Skills grid 7.3 Acknowledgments I am grateful for the insightful comments offered by the anonymous peer reviewers at Books & Texts. The generosity and expertise of one and all have improved this study in innumerable ways and saved me from many errors; those that inevitably remain are entirely my own responsibility.

      Not great.

    Annotators

    1. eLife assessment

      This useful study reports machine learning models derived from large-scale data to predict the risk of post-stroke epilepsy. The evidence supporting the conclusions is, however, incomplete, as many critical methodological aspects have been omitted or described too briefly, the analysis of the results is not complete, and the dataset and code have not been disclosed, which represents an obstacle to reproducibility. The study may be of some interest in the field of clinical neurology.

    2. Reviewer #3 (Public Review):

      Summary:

      The authors report the performance of a series of machine learning models inferred from a large-scale dataset and externally validated with an independent cohort of patients, to predict the risk of post-stroke epilepsy. Some of the reported models have very good explicative and predictive performances.

      Strengths:

      The models have been derived from real-world large-scale data.

      Performances of the best-performing models are very good according to the external validation results.

      Early prediction of the risk of post-stroke epilepsy would be of high interest to implement early therapeutic interventions that could improve prognosis.

      Weaknesses:

      There are issues with the readability of the paper. Many abbreviations are not introduced properly and sometimes are written inconsistently. A lot of relevant references are omitted. The methodological descriptions are extremely brief and, sometimes, incomplete.

      The dataset is not disclosed, and neither is the code (although the code is made available upon request). For the sake of reproducibility, unless any bioethical concerns impede it, it would be good to have these data disclosed.

      Although the external validation is appreciated, cross-validation to check the robustness of the models would also be welcome.
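
      The cross-validation suggested above could be sketched as follows. This is a minimal illustration, not the authors' pipeline: `k_fold_indices` is a hypothetical helper, and the sample count, fold count and seed are assumptions chosen for the example.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Split range(n_samples) into k shuffled (train, test) index pairs."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


# Hypothetical usage: 10 patients, 5 folds.
for train_idx, test_idx in k_fold_indices(10, 5):
    # A model would be fitted on train_idx and scored on test_idx here.
    print(sorted(test_idx))
```

      Reporting the spread of the per-fold scores, alongside the external validation, would indicate how sensitive the models are to the particular training sample.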

    1. Right. ALL the blah-blah is just opinion, hypothesis, theory... no matter how many "fact-checkers" or "experts" slap their "official" stamp on it. The only difference from Ms. Faeser is that she has her soldiers, who enforce her opinions by force... if I had soldiers too, the world would look different. Equality of arms vs. monopoly.

      banned from https://old.reddit.com/r/DEgegenRechts

    1. Reviewer #1 (Public Review):

      This work seeks to understand how behaviour-related information is represented in the neural activity of the primate motor cortex. To this end, a statistical model of neural activity is presented that enables a non-linear separation of behaviour-related from unrelated activity. As a generative model, it enables the separate analysis of these two activity modes, here primarily done by assessing the decoding performance of hand movements the monkeys perform in the experiments. Several lines of analysis are presented to show that while the neurons with significant tuning to movements strongly contribute to the behaviourally-relevant activity subspace, less or un-tuned neurons also carry decodable information. It is further shown that the discovered subspaces enable linear decoding, leading the authors to conclude that motor cortex read-out can be linear.

      Strengths:

      In my opinion, using an expressive generative model to analyse neural state spaces is an interesting approach to understanding neural population coding. While potentially sacrificing interpretability, this approach allows capturing both redundancies and synergies in the code as done in this paper. The model presented here is a natural non-linear extension of a previous linear model (PSID) and uses weak supervision in a manner similar to a previous non-linear model (TNDM).

      Weaknesses:

      This revised version provides additional evidence to support the authors' claims regarding model performance and interpretation of the structure of the resulting latent spaces, in particular the distributed neural code over the whole recorded population, not just the well-tuned neurons. The improved ability to linearly decode behaviour from the relevant subspace and the analysis of the linear subspace projections in my opinion convincingly demonstrate that the model picks up behaviour-relevant dynamics, and that these are distributed widely across the population. As reviewer 3 also points out, I would, however, caution against interpreting this as evidence for linear read-out of the motor system - your model performs a non-linear transformation, and while this is indeed linearly decodable, the motor system would need to do something similar first to achieve the same. In fact to me it seems to show the opposite, that behaviour-related information may not be generally accessible to linear decoders (including to down-stream brain areas).

      As in my initial review, I would also caution against making strong claims about identifiability although this work and TNDM seem to show that in practice such methods work quite well. CEBRA, in contrast, offers some theoretical guarantees, but it is not a generative model, so would not allow the type of analysis done in this paper. In your model there is a parameter \alpha to balance between neural and behaviour reconstruction. This seems very similar to TNDM and has to be optimised - if this is correct, then there is manual intervention required to identify a good model.

      Somewhat related, I also found that the now comprehensive comparison with related models shows that using decoding performance (R2) as a metric for model comparison may be problematic: the R2 values reported in Figure 2 (e.g. the MC_RTT dataset) should be compared to the values reported in the neural latent benchmark, which represent well-tuned models (e.g. AutoLFADS). The numbers (difficult to see, a table with numbers in the appendix would be useful, see: https://eval.ai/web/challenges/challenge-page/1256/leaderboard) seem lower than what can be obtained with models without latent space disentanglement. While this does not necessarily invalidate the conclusions drawn here, it shows that decoding performance can depend on a variety of model choices, and may not be ideal to discriminate between models. I'm also surprised by the low neural R2 for LFADS (I assume this is condition-averaged) - LFADS tends to perform very well on this metric.

      One statement I still cannot follow is how the prior of the variational distribution is modelled. You say you depart from the usual Gaussian prior, but equation 7 seems to suggest there is a normal prior. Are the parameters of this distribution learned? As I pointed out earlier, however, I suspect this may not matter much as you give the prior a very low weight. I am also still not sure how you generate a sample from the variational distribution: do you just draw one for each pass?

      Summary:

      This paper presents a very interesting analysis, but some concerns remain that mainly stem from the complexity of deep learning models. It would be good to acknowledge these as readers without relevant background need to understand where the possible caveats are.

    2. Reviewer #4 (Public Review):

      I am a new reviewer for this manuscript, which has been reviewed before. The authors provide a variational autoencoder that has three objectives in the loss: linear reconstruction of behavior from embeddings, reconstruction of neural data, and KL divergence term related to the variational model elements. They take the output of the VAE as the "behaviorally relevant" part of neural data and call the residual "behaviorally irrelevant". Results aim to inspect the linear versus nonlinear behavior decoding using the original raw neural data versus the inferred behaviorally relevant and irrelevant parts of the signal.
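
      The three-term objective described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the manuscript's implementation: `linear_readout` and `dvae_style_loss` are hypothetical names, the squared-error terms and the standard normal prior are assumed, and `alpha` and `beta` are assumed weighting hyperparameters.

```python
import math


def linear_readout(z, weights, bias):
    """Affine map from an embedding z to predicted behaviour."""
    return [sum(w * zi for w, zi in zip(row, z)) + b
            for row, b in zip(weights, bias)]


def dvae_style_loss(x, x_hat, y, y_hat, mu, log_var, alpha=1.0, beta=1.0):
    """Neural reconstruction + alpha * behaviour reconstruction + beta * KL.

    All arguments are plain lists of floats. alpha and beta are assumed
    weights, not values taken from the manuscript.
    """
    neural = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    behaviour = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    # Closed-form KL(N(mu, diag(exp(log_var))) || N(0, I)).
    kl = 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                   for m, lv in zip(mu, log_var))
    return neural + alpha * behaviour + beta * kl
```

      Because the behaviour term is computed from an affine readout of the embedding, minimising this kind of loss pushes the embedding towards being linearly decodable by construction.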

      Overall, studying neural computations that are behaviorally relevant or not is an important problem, which several previous studies have explored (for example PSID in (Sani et al. 2021), TNDM in (Hurwitz et al. 2021), TAME-GP in (Balzani et al. 2023), pi-VAE in (Zhou and Wei 2020), and dPCA in (Kobak et al. 2016), etc). However, this manuscript does not properly put their work in the context of such prior works. For example, the abstract states "One solution is to accurately separate behaviorally-relevant and irrelevant signals, but this approach remains elusive", which is not the case given that these prior works have done that. The same is true for various claims in the main text, for example "Furthermore, we found that the dimensionality of primary subspace of raw signals (26, 64, and 45 for datasets A, B, and C) is significantly higher than that of behaviorally-relevant signals (7, 13, and 9), indicating that using raw signals to estimate the neural dimensionality of behaviors leads to an overestimation" (line 321). This finding was presented in (Sani et al. 2021) and (Hurwitz et al. 2021), which is not clarified here. This issue of putting the work in context has been brought up by other reviewers previously but seems to remain largely unaddressed. The introduction is inaccurate also in that it mixes up methods that were designed for separation of behaviorally relevant information with those that are unsupervised and do not aim to do so (e.g., LFADS). The introduction should be significantly revised to explicitly discuss prior models/works that specifically formulated this behavior separation and what these prior studies found, and how this study differs.

      Beyond the above, some of the main claims/conclusions made by the manuscript are not properly supported by the analyses and results, which has also been brought up by other reviewers but not fully addressed. First, the analyses here do not support the linear readout from the motor cortex because i) by construction, the VAE here is trained to have a linear readout from its embedding in its loss, which can bias its outputs toward doing well with a linear decoder/readout, and ii) the overall mapping from neural data to behavior includes both the VAE and the linear readout and thus is always nonlinear (even when a linear Kalman filter is used for decoding). This claim is also vague as there is no definition of readout from "motor cortex" or what it means. Why is the readout from the bottleneck of this particular VAE the readout of motor cortex? Second, other claims about properties of individual neurons are also confounded because the VAE is a population-level model that extracts the bottleneck from all neurons. Thus, information can leak from any set of neurons to other sets of neurons during the inference of behaviorally relevant parts of signals. Overall, the results do not convincingly support the claims, and thus the claims should be carefully revised and significantly tempered to avoid misinterpretation by readers.
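
      Point ii) above can be illustrated with a toy example. This is only a sketch: the `tanh` encoder stands in for any nonlinear distillation step and is an assumption, not the manuscript's model.

```python
import math


def encoder(x):
    """Stand-in for a nonlinear distillation step (tanh chosen for illustration)."""
    return math.tanh(x)


def readout(z, w=2.0):
    """Linear readout from the distilled signal to behaviour."""
    return w * z


def full_mapping(x):
    """Neural data -> distilled signal -> behaviour."""
    return readout(encoder(x))


a, b = 1.0, 2.0
# The readout alone is additive in z ...
print(math.isclose(readout(a + b), readout(a) + readout(b)))
# ... but the composed mapping is not additive in x.
print(math.isclose(full_mapping(a + b), full_mapping(a) + full_mapping(b)))
```

      The first check prints True and the second False: a linear last stage does not make the mapping from the raw signal to behaviour linear.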

      Below I briefly expand on these as well as other issues, and provide suggestions:

      (1) Claims about linearity of "motor cortex" readout are not supported by results yet stated even in the abstract. Instead, what the results support is that for decoding behavior from the output of the dVAE model -- that is trained specifically to have a linear behavior readout from its embedding -- a nonlinear readout does not help. This result can be biased by the very construction of the dVAE's loss that encourages a linear readout/decoding from embeddings and thus does not imply a finding about motor cortex.

      (2) Related to the above, it is unclear what the manuscript means by readout from motor cortex. A clearer definition of "readout" (a mapping from what to what?) in general is needed. The mapping that the linearity/nonlinearity claims refer to is from the *inferred* behaviorally relevant neural signals, which themselves are inferred nonlinearly using the VAE. This should be explicitly clarified in all claims, i.e., that only the mapping from distilled signals to behavior is linear, not the whole mapping from neural data to behavior. Again, to say the readout from motor cortex is linear is not supported, including in the abstract.

      (3) Claims about individual neurons are also confounded. The d-VAE distillation process is a population-level embedding, so the individual distilled neurons are not obtainable on their own without using the population data. This population-level approach also raises the possibility that information can leak from one neuron to another during distillation, which is indeed what the authors hope would recover true information about individual neurons that wasn't there in the recording (the pixel denoising example). The authors acknowledge the possibility that information could leak to a neuron that didn't truly have that information and try to rule it out to some extent with some simulations and by comparing the distilled behaviorally relevant signals to the original neural signals. But ultimately, the distilled signals are different enough from the original signals to substantially improve decoding of low information neurons, and one cannot be sure if all of the information in distilled signals from any individual neuron truly belongs to that neuron. It is still quite likely that some of the improved behavior prediction of the distilled version of low-information neurons is due to leakage of behaviorally relevant information from other neurons, not the former's inherent behavioral information. This should be explicitly acknowledged in the manuscript.

      (4) Given the nuances involved in appropriate comparisons across methods and since two of the datasets are public, the authors should provide their complete code (not just the dVAE method code), including the code for data loading, data preprocessing, model fitting and model evaluation for all methods and public datasets. This will alleviate concerns and allow readers to confirm conclusions (e.g., figure 2) for themselves down the line.

      (5) Related to 1) above, the authors should explore the results if the affine network h(.) (from embedding to behavior) was replaced with a nonlinear ANN. Perhaps linear decoders would no longer be as close to nonlinear decoders. Regardless, the claim of linearity should be revised as described in 1) and 2) above, and all caveats should be discussed.

      (6) The beginning of the section on the "smaller R2 neurons" should clearly define what R2 is being discussed. Based on the response to previous reviewers, this R2 "signifies the proportion of neuronal activity variance explained by the linear encoding model, calculated using raw signals". This should be mentioned and made clear in the main text whenever this R2 is referred to.

      (7) Various terms require clear definitions. The authors sometimes use vague terminology (e.g., "useless") without a clear definition. Similarly, discussions regarding dimensionality could benefit from more precise definitions. How is neural dimensionality defined? For example, how is "neural dimensionality of specific behaviors" (line 590) defined? Related to this, I agree with Reviewer 2 that a clear definition of irrelevant should be mentioned that clarifies that relevance is roughly taken as "correlated or predictive with a fixed time lag". The analyses do not explore relevance with arbitrary time lags between neural and behavior data.

      (8) CEBRA itself doesn't provide a neural reconstruction from its embeddings, but one could obtain one via a regression from extracted CEBRA embeddings to neural data. In addition to decoding results of CEBRA (figure S3), the neural reconstruction of CEBRA should be computed and CEBRA should be added to Figure 2 to see how the behaviorally relevant and irrelevant signals from CEBRA compare to other methods.
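
      The R2 referred to in point (6), the proportion of variance explained by a model's prediction, could be computed as in this minimal sketch. `r_squared` is a hypothetical helper, and the firing rates in the usage lines are made-up numbers.

```python
def r_squared(observed, predicted):
    """Proportion of variance in `observed` explained by `predicted`."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(observed) / len(observed)
    ss_tot = sum((o - mean) ** 2 for o in observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    if ss_tot == 0:
        raise ValueError("observed signal has zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical raw firing rates and linear-encoding-model predictions.
raw_rates = [4.0, 7.0, 5.0, 9.0]
model_rates = [4.5, 6.5, 5.5, 8.5]
print(r_squared(raw_rates, model_rates))
```

      Stating in the text which signal plays the role of `observed` (raw or distilled) removes the ambiguity the point describes.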

      References:

      Kobak, Dmitry, Wieland Brendel, Christos Constantinidis, Claudia E Feierstein, Adam Kepecs, Zachary F Mainen, Xue-Lian Qi, Ranulfo Romo, Naoshige Uchida, and Christian K Machens. 2016. "Demixed Principal Component Analysis of Neural Population Data." Edited by Mark CW van Rossum. eLife 5 (April): e10989. https://doi.org/10.7554/eLife.10989.

      Sani, Omid G., Hamidreza Abbaspourazad, Yan T. Wong, Bijan Pesaran, and Maryam M. Shanechi. 2021. "Modeling Behaviorally Relevant Neural Dynamics Enabled by Preferential Subspace Identification." Nature Neuroscience 24 (1): 140-49. https://doi.org/10.1038/s41593-020-00733-0.

      Zhou, Ding, and Xue-Xin Wei. 2020. "Learning Identifiable and Interpretable Latent Models of High-Dimensional Neural Activity Using Pi-VAE." In Advances in Neural Information Processing Systems, 33:7234-47. Curran Associates, Inc. https://proceedings.neurips.cc/paper/2020/hash/510f2318f324cf07fce24c3a4b89c771-Abstract.html.

      Hurwitz, Cole, Akash Srivastava, Kai Xu, Justin Jude, Matthew Perich, Lee Miller, and Matthias Hennig. 2021. "Targeted Neural Dynamical Modeling." In Advances in Neural Information Processing Systems. Vol. 34. https://proceedings.neurips.cc/paper/2021/hash/f5cfbc876972bd0d031c8abc37344c28-Abstract.html.

      Balzani, Edoardo, Jean-Paul G. Noel, Pedro Herrero-Vidal, Dora E. Angelaki, and Cristina Savin. 2023. "A Probabilistic Framework for Task-Aligned Intra- and Inter-Area Neural Manifold Estimation." In . https://openreview.net/forum?id=kt-dcBQcSA.

    1. Reviewer #3 (Public Review):

      Summary:

      The authors used recurrent neural network modelling of spatial navigation tasks to investigate border and place cell behaviour during remapping phenomena.

      Strengths:

      The neural network training seemed for the most part (see comments later) well-performed, and the analyses used to make the points were thorough.

      The paper and ideas were well explained.

      Figure 4 contained some interesting and strong evidence for map-like generalisation as environmental geometry was warped.

      Figure 7 was striking, and potentially very interesting.

      It was impressive that the RNN path-integration error stayed low for so long (Fig A1), given that normally networks that only work with dead-reckoning have errors that compound. I would have loved to know how the network was doing this, given that borders did not provide sensory input to the network. I could not think of many other plausible explanations... It would be even more impressive if it was preserved when the network was slightly noisy.

      Weaknesses:

      I felt that the stated neuroscience interpretations were not well supported by the presented evidence, for a few reasons I'll now detail.

      First, I was unconvinced by the interpretation of the reported recurrent cells as border cells. An equally likely hypothesis seemed to be that they were position cells that linearly encode the x and y position, which, when your environment only contains external linear boundaries, look the same. As in figure 4, in environments with internal boundaries the cells do not encode them, they encode (x,y) position. Further, if I'm not misunderstanding, there is, throughout, a confusing case of broken symmetry. The cells appear to code not for any random linear direction, but for either the x or y axis (i.e. there are x cells and y cells). These look like border cells in environments in which the boundaries are external only, and align with the axes (like square and rectangular ones), but the same also appears to be true in the rotationally symmetric circular environment, which strikes me as very odd. I can't think of a good reason why the cells in circular environments should care about the particular choice of (x,y) axes... unless the choice of position encoding scheme is leaking influence throughout. A good test of this would be differently oriented (45 degree rotated square) or more geometrically complicated (two diamonds connected) environments in which the difference between a pure (x,y) code and a border code is more obvious.
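
      The proposed rotated-square test can be illustrated with a toy calculation. This is a sketch under assumptions: `x_cell` is a hypothetical unit with a purely linear x code, and the diamond has vertices at (±1, 0) and (0, ±1).

```python
def x_cell(x, y):
    """Hypothetical unit whose rate depends linearly on x only."""
    return x


def rate_range_along_wall(start, end, n=101):
    """Max minus min rate of x_cell sampled along a straight wall."""
    rates = []
    for i in range(n):
        t = i / (n - 1)
        px = start[0] + t * (end[0] - start[0])
        py = start[1] + t * (end[1] - start[1])
        rates.append(x_cell(px, py))
    return max(rates) - min(rates)


# East wall of an axis-aligned unit square: the rate is constant along it.
print(rate_range_along_wall((1.0, 0.0), (1.0, 1.0)))
# One wall of a square rotated by 45 degrees: the rate varies along it.
print(rate_range_along_wall((1.0, 0.0), (0.0, 1.0)))
```

      A true border cell would fire uniformly along the rotated wall as well, so the two codes become distinguishable once the walls are no longer aligned with the axes.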

      Next, the decoding mechanism used seems to have forced the representation to learn place cells (no other cell type is going to be usefully decodable?). That is, in itself, not a problem. It just changes the interpretation of the results. To be a normative interpretation for place cells you need to show some evidence that this decoding mechanism is relevant for the brain, since this seems to be where they are coming from in this model. Instead, this is a model with place cells built into it, which can then be used for studying things like remapping, which is a reasonable stance.

      However, the remapping results were also puzzling. The authors present convincing evidence that the recurrent units effectively form 6 different maps of the 6 different environments (e.g. the sparsity of the code, or fig 6a), with the place cells remapping between environments. Yet, as the authors point out, in neural data the finding is that some cells generalise their co-firing patterns across environments (e.g. grid cells, border cells), while place cells remap, making it unclear what correspondence to make between the authors' network and the brain. There are existing normative models that capture both entorhinal's consistent and hippocampus' less consistent neural remapping behaviour (Whittington et al. and probably others), so what have we then learnt from this exercise?

      One striking result was figure 7, the hexagonal arrangement of place cell centres. I had one question that I couldn't find the answer to in the paper, which would change my interpretation. Are place cell centres within a single cluster of points in figure 7a, for example, from one cell across the 100 trajectories, or from many? If each cluster belongs to a different place cell then the interpretation seems like some kind of optimal packing/coding of 2D space by a set of place cells, an interesting prediction. If multiple place cells fall within a single cluster then that's a very puzzling suggestion about the grouping of place cells into these discrete clusters. From figure 7c I guess that the former is the likely interpretation, from the fact that clusters appear to maintain the same colour, and are unlikely to be co-remapping place cells, but I would like to know for sure!

      I felt that the neural data analysis was unconvincing. Most notably, the statistical effect was found in only one of seven animals. Random noise is likely to pass statistical tests 1 in 20 times (at 0.05 p value), this seems like it could have been something similar? Further, the data was compared to a null model in which place cell fields were randomly distributed. The authors claim place cell fields have two properties that the random model doesn't: (1) clustering to edges (as experimentally reported) and (2) much more provocatively, a hexagonal lattice arrangement. The test seems to conflate the two; I think that nearby ball radii could be overrepresented, as in figure 7f, due to either effect. I would have liked to see a computation of the statistic for a null model in which place cells were random but with a bias towards the boundaries of the environment that matches the observed changing density, to distinguish these two hypotheses.
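
      The "1 in 20" argument can be made concrete for seven animals. This is a sketch that assumes the seven tests are independent; the Bonferroni threshold is included only as one common correction and is not something the review proposes.

```python
def prob_at_least_one_false_positive(n_tests, alpha=0.05):
    """Chance that at least one of n independent null tests gives p < alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test threshold keeping the family-wise error rate at or below alpha."""
    return alpha / n_tests


print(round(prob_at_least_one_false_positive(7), 3))  # 0.302
print(round(bonferroni_threshold(7), 4))              # 0.0071
```

      Under these assumptions there is roughly a 30% chance that at least one of seven animals passes at p < 0.05 even when no effect exists.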

      Some smaller weaknesses:
      - Had the models trained to convergence? From the loss plot it seemed like not, and when regularisers are included, recent work (grokking phenomena, e.g. Nanda et al. 2023) has shown the importance of letting the regulariser minimise completely to see the resulting effect. Otherwise you are interpreting representations that are likely still being learnt, a dangerous business.
      - Since RNNs are nonlinear, it seems that eigenvalues larger than 1 don't necessarily mean unstable?
      - Why do you not include a bias in the networks? ReLU networks without bias are not universal function approximators, so it is a real change in architecture that doesn't seem to have any positives?
      - The claim that this work provided a mathematical formalism of the intuitive idea of a cognitive map seems strange, given that upwards of 10 of the works this paper cites also mathematically formalise a cognitive map into a similar integration loss for a neural network.
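
      The eigenvalue point can be illustrated with a one-unit toy network. This is a sketch, not the paper's architecture: the scalar weight of -2 is an assumption chosen so that the single eigenvalue has magnitude greater than 1.

```python
def relu(v):
    return max(0.0, v)


def simulate(step, h0, n_steps=10):
    """Iterate h <- step(h) and return the trajectory including h0."""
    trajectory = [h0]
    for _ in range(n_steps):
        trajectory.append(step(trajectory[-1]))
    return trajectory


w = -2.0  # recurrent weight; its only eigenvalue has magnitude 2 > 1

linear = simulate(lambda h: w * h, 1.0)
rectified = simulate(lambda h: relu(w * h), 1.0)

print(abs(linear[-1]))  # 1024.0: the linear system diverges
print(rectified[-1])    # 0.0: the ReLU system settles at the origin
```

      The same weight gives a divergent linear system and a stable rectified one, so the spectrum of the recurrent weights alone does not settle the stability question.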

      Aim Achieved? Impact/Utility/Context of Work

      Given the listed weaknesses, I think this was a thorough exploration of how this network with these losses is able to path-integrate its position and remap. This is useful, it is good to know how another neural network with slightly different constraints learns to perform these behaviours. That said, I do not think the link to neuroscience was convincing, and as such, it has not achieved its stated aim of explaining these phenomena in biology. The mechanism for remapping in the entorhinal module seemed fundamentally different to the brain's, instead using completely disjoint maps; the recurrent cell types described seemed to match no described cell type (no bad thing in itself, but it does limit the permissible neuroscience claims) either in tuning or remapping properties, with a potentially worrying link between an arbitrary encoding choice and the responses; and the striking place cell prediction was unconvincingly matched by neural data. Further, this is a busy field in which many remapping results have been shown before by similar models, limiting the impact of this work. For example, George et al. and Whittington et al. show remapping of place cells across environments; Whittington et al. study remapping of entorhinal codes; and Rajkumar Vasudeva et al. 2022 show similar place cell stretching results under environmental shifts. As such, this paper's contribution is muddied significantly.

  5. notebooksharing.space notebooksharing.space
    1. We find that varying the use of code-level optimizations impacts performance significantly more than varying whether the PPO or TRPO step is used.

      Writing better code had a bigger impact than the difference in the algorithm!

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This study defines a fundamental aspect of protein kinase signalling in the protist parasite Toxoplasma gondii that is required for acute and chronic infections. The authors provide compelling evidence for the role of SPARK/SPARKEL kinases in regulating cAMP/cGMP signalling, although evidence linking the loss of these kinases to changes in the phosphoproteome is incomplete. Overall, this study will be of great interest to those who study Toxoplasma and related apicomplexan parasites.

      We thank the reviewers for their thoughtful and positive evaluation of our work. Below, we have addressed all of the public reviews and recommendations for the authors in point-by-point responses. Additionally, we include with this resubmission RT-qPCR data where we observe no significant change in transcript levels for the relevant AGC kinases, supporting the hypothesis that SPARK/SPARKEL–regulation is post-translational.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Herneisen et al characterise the Toxoplasma PDK1 orthologue SPARK and an associated protein SPARKEL in controlling important fate decisions in Toxoplasma. Over recent years this group and others have characterised the role of cAMP and cGMP signalling in negatively and positively regulating egress, motility, and invasion, respectively. This manuscript furthers this work by showing that SPARK and SPARKEL likely act upstream, or at least control the levels of the cAMP and cGMP-dependent kinases PKA and PKG, respectively, thus controlling the transition of intracellular replicating parasites into extracellular motile forms (and back again).

      The authors use quantitative (phospho)proteomic techniques to elegantly demonstrate the upstream role of SPARK in controlling cAMP and cGMP pathways. They use sophisticated analysis techniques (at least for parasitology) to show the functional association between cGMP and cAMP signalling pathways. They therefore begin to unify our understanding of the complicated signalling pathways used by Toxoplasma to control key regulatory processes that control the activation and suppression of motility. The authors then use molecular and cellular assays on a range of generated transgenic lines to back up their observations made by quantitative proteomics that are clear in their design and approach.

      The authors then extend their work by showing that SPARK/SPARKEL also control PKAc3 function. PKAc3 has previously been shown to negatively regulate differentiation into bradyzoite forms and this work backs up and extends this finding to show that SPARK also controls this. The authors conclude that SPARK could act as a central node of regulation of the asexual stage, keeping parasites in their lytic cell growth and preventing differentiation. Whether this is true is beyond the scope of this paper and will have to be determined at a later date.

      Strengths:

      This is an exceptional body of work. It is elegantly performed, with state-of-the-art proteomic methodologies carefully being applied to Toxoplasma. Observations from the proteomic datasets are masterfully backed up with validation using quantitative molecular and cellular biology assays.

      The paper is carefully and concisely written and is not overreaching in its conclusions. This work and its analysis set a new benchmark for the use of proteomics and molecular genetics in apicomplexan parasites.

      Weaknesses:

      This reviewer did not identify any weaknesses.

      Reviewer #2 (Public Review):

      Summary:

      The manuscript by Herneisen et al. examines the Toxoplasma SPARK kinase orthologous to mammalian PDK1 kinase. The extracellular signals trigger cascades of the second messengers and play a central role in the apicomplexan parasites' survival. In Toxoplasma, these cascades regulate active replication of the tachyzoites, which manifests as acute toxoplasmosis, or the development into drug-resilient bradyzoites characteristic of the chronic stage of the disease. This study focuses on the poorly understood signaling mechanisms acting upstream of such second messenger kinases as PKA and PKG. The authors showed that similar to PDK1, Toxoplasma SPARK appears to regulate several AGC kinases.

      Strengths:

      The study demonstrated a strong association of the SPARK kinase with an elongin-like SPARKEL factor and an uncharacterized AGC kinase. Using a set of standard assays, the authors determined the SPARK/SPARKEL role in parasite egress and invasion. Finally, the study presented evidence of the SPARK/SPARKEL involvement in the bradyzoite differentiation.

      Weaknesses:

      Although the study can potentially uncover essential sensing mechanisms operating in Toxoplasma, the evidence of the SPARK/SPARKEL mechanisms is weak. Specifically, due to incomplete data analysis, the SPARK/SPARKEL-dependent phosphoregulation of AGC kinases cannot be evaluated. The manuscript requires better organization and lacks guidance on the described experiments. Although the study is built on advanced genetics, at times, it is unnecessarily complicated, raising doubts rather than benefiting the study.

      The evidence for the SPARK/SPARKEL interaction is demonstrated through diverse experimental approaches that are internally consistent. Five separate mass spectrometry experiments, with replicates and appropriate controls, with tags on either SPARK or SPARKEL, showed that SPARK and SPARKEL form a strong interaction (Figure 1A, 1D, 1E; Figure 1—figure supplement 1). Global mass spectrometry experiments assessing the impact of  SPARK or SPARKEL depletion showed similar features (a reduction in PKG and PKA abundance and up-regulation of bradyzoite-associated proteins; Figure 3C–D). The phenotypes associated with SPARK and SPARKEL depletion phenocopy one another in all cell biological assays we tested (Figure 2A, 2D and PMID: 35484233; Figure 2E–J; Figure 4E–F; Figure 6A–B). Measuring the abundance of SPARK and SPARKEL in unenriched samples was challenging, but immunoblotting and proteomics suggest that depletion of one factor leads to down-regulation of the other (Figure 2B, 2C; Figure 3—figure supplement 1), which explains the genetic and cell biological phenocopying described above. We note that “further biochemical studies are required to discern the regulatory interactions between SPARK and SPARKEL” (first submission lines 590-591) and are beyond the scope of this work.

      The evidence for SPARK/SPARKEL regulation of AGC kinase activity is demonstrated through diverse experimental approaches that are also internally consistent. PKA C1 and PKG abundance levels decrease in parasites depleted of SPARK/SPARKEL, as measured by mass spectrometry (Figure 3A and 3C) and cell-based assays for PKA C1/R (Figure 4D–F). Comparisons of the global SPARK-, PKA R-, PKG-, and PKA C3-depleted phosphoproteomes suggest that PKA and PKG activity is reduced upon SPARK depletion whereas the activity of an unrelated factor (PP1) is unaffected (Figure 4G–H, Figure 4—figure supplement 1, Figure 5D–E, Figure 7I–J). Parasites depleted of SPARK are hypersensitized to a PKG inhibitor (Figure 5B–C). SPARK, PKA, and PKG are proximal in cellulo (Figure 3I) and SPARK co-purifies with PKA C3 (Figure 7A). The kinetic-phase phenotypes associated with SPARK and SPARKEL depletion (PMID: 32379047, Figure 2A, 2D–2J) are consistent with reduced PKG activity (PMID: 28465425) and only develop after PKG has been depleted as shown by proteomics experiments (Figure 2E-J and Figure 3C). Other studies have shown that the effects of reduced PKG activity are dominant to reduced PKA C1 activity (PMID: 29030485). The replicative-phase phenotypes associated with SPARK and SPARKEL depletion are consistent with reduced PKA C3 activity (PMID: 27247232 and herein). Mechanistically, PKG and PKA C1 activity must be lower in SPARK-depleted parasites because the abundances of these kinases are lower (Figure 3A, 3C). The mechanism of regulation may be more complex in the case of PKA C3, as SPARK depletion did not cause a reduction in PKA C3 abundance as measured by cellular assays (Figure 7B–F), but PKA C3 activity decreased (Figure 7I–K). 
      We concede that multiple mechanisms may lead to the reduction in PKA C1 and PKG abundances, such as decreased activation loop phosphorylation and autophosphorylation at other stabilizing sites or enhanced ubiquitin ligase activity leading to active degradation of the kinases; we have moved speculation regarding such mechanisms to the Discussion.

      Although the reviewer commented that the manuscript “requires better organization” in the public review, no specific recommendations were provided to the authors. Therefore, we did not change the organization of the manuscript. We added an additional paragraph to the Discussion to reiterate key findings: “A prior study identified SPARK as a regulator of parasite invasion and egress following 24 hours of kinase depletion (Smith et al., 2022). Unexpectedly, we observed that three hours of SPARK or SPARKEL depletion were insufficient to impact T. gondii motility or calcium-dependent signaling, indicating that the phenotypes associated with SPARK and SPARKEL depletion develop over time. Quantitative proteomics revealed that PKA and PKG abundances began to decrease after more than three hours of SPARK depletion. Proximity labeling experiments also suggested that SPARK, PKA, and PKG are spatially associated within the parasite cell. We propose a model in which SPARK down-regulation coincides with reduced PKG and PKA activity due to diminished protein levels.” This work built upon genetic and proteomic approaches recently described by our group, which we cited in the text and extensive methods section. We added additional experimental detail where noted in the reviewer’s recommendations to the authors.

      The study utilizes advanced genetics because biochemical tools for eukaryotic parasites are limited. For example, no antibodies for T. gondii SPARK, PKA subunits, or PKG exist; to say nothing of phosphosite-specific antibodies, which are common in the mammalian cell signaling field. Therefore, to measure the relationship between SPARK, SPARKEL, and PKA subunits, we had to generate strains in which multiple proteins were tagged with epitopes for downstream analysis. The genetic experiments included appropriate controls and were internally consistent with results obtained using orthogonal approaches, such as mass spectrometry.

      Reviewer #3 (Public Review):

      Summary:

      This paper focuses on the roles of a toxoplasma protein (SPARKEL) with homology to an elongin C and the kinase SPARK that it interacts with. They demonstrate that the two proteins regulate the abundance of PKA and PKG, and that depletion of SPARKEL reduces invasion and egress (previously shown with SPARK), and that their loss also triggers spontaneous bradyzoite differentiation. The data are overall very convincing and will be of high interest to those who study Toxoplasma and related apicomplexan parasites.

      Strengths:

      The study is very well executed with appropriate controls. The manuscript is also very well and clearly written. Overall, the work clearly demonstrates that SPARK/SPARKEL regulate invasion and egress and that their loss triggers differentiation.

      Weaknesses:

      (1) The authors fail to discriminate between SPARK/SPARKEL acting as negative regulators of differentiation as a result of an active role in regulating stage-specific transcription/translation or as a consequence of a stress response activated when either is depleted

      We demonstrate a novel function for SPARK and SPARKEL as negative regulators of differentiation. The pathways leading to differentiation are being actively studied. Up-regulation of a positive transcriptional regulator of chronic differentiation, BFD1, is sufficient to trigger differentiation in vitro in the absence of other stressful growth conditions (PMID: 31955846). SPARK or SPARKEL depletion results in up-regulation of proteins that are up-regulated upon BFD1 overexpression. Whether BFD1 overexpression or SPARK and SPARKEL depletion triggers cellular stress pathways is beyond the scope of the current work, which focused instead on the immediate effect of these pathways on AGC kinases. Study of the effect of the various kinases on the parasite phosphoproteome shows that the putative targets of PKA C3 are specifically downregulated upon SPARK knockdown, indicating PKA C3 activity is indeed decreased in the latter condition.

      (2) The function of SPARKEL has not been addressed. In mammalian cells, Elongin C is part of an E3 ubiquitin ligase complex that regulates transcription and other processes. From what I can tell from the proteomic data, homologs of the Elongin B/C complex were not identified. This is an important issue as the authors find that PKG and PKA protein levels are reduced in the knockdown strains

      Our experiments suggest that SPARK and SPARKEL form a complex, and down-regulation of one complex member leads to down-regulation of the other. Thus in all tested assays, knockdown of SPARK and SPARKEL phenocopy one another. Further biochemical and structural work will be required to determine the mechanism by which SPARKEL regulates SPARK.

      Nearly all studies of the function of elongin C have been conducted in mammalian cells. Proteins with elongin C domains may serve alternative and unexplored functions in unicellular eukaryotes. We searched for the presence of Elongin A/B and known Elongin C complex members in the T. gondii genome and were unable to identify orthologs, explaining why these proteins were not identified in mass spectrometry experiments. Please see our response in Recommendations for the Authors, Reviewer 3 point 2.

      Beyond the concerns raised by the review team, we have identified and corrected the following errors or omissions in the first submission of the manuscript:

      - Line 176 of the first submission referred to a “peptide sequence match (PSM)”, which we have changed to “peptide-spectrum match”.

      - We recolored and relabeled the lines in Figure 5A so that it is easier to match a specific peptide with a specific line; and also corrected a mislabeling.

      - Figure 7B SPARK panel was incorrectly centered. The raw files can be viewed in Figure 7—source data 2.

      - Figure 7—figure supplement 1D was missing an x-axis label.

      - Line 1172 referred to “Supplementary File X”, which we corrected to “Supplementary File 3”.

      - We have updated references to preprints that have since been published, including PMID: 38093015, 37933960, 37966241, and 37610220.

      Editors' comments:

      The proteomics data reported in this study underpin the major findings and are very comprehensive. As noted in the reviews, it is strongly recommended that the authors normalize the levels of detected phosphopeptides against the levels of the parent protein in the different mutant lines in order to identify changes in protein phosphorylation that are linked to protein kinase activity rather than protein degradation. A focus on changes that occur at early time points following protein knock-down may also help to identify the main targets of each kinase.

      Please see our response to Reviewer 2 Recommendations for the Authors, points 1 and 2.

      Reviewer #1 (Recommendations For The Authors):

      During my reading, I only found one small mistake. In Figure 7F, the x-axis is missing the word 'PKA'.

      We have updated the x-axis to read “SPARK-AID/PKA C3-mNG (h. + IAA)”.

      All information, code, and reagents are clearly explained.

      Reviewer #2 (Recommendations For The Authors):

      How the phosphoproteome was analyzed needs to be clarified. The normalization step, computing the ratio of the phosphopeptide to the protein (peptide) intensity, appears omitted. It is the most critical step of the analysis. The minor shifts between protein and phosphosite intensity seem negligible, as seen in Figure 4 AB. The significant changes can only be deduced by calculating this ratio. In the current state, the presented results are inconclusive. The manuscript contains overreaching and often unsupported statements because the data has not been appropriately filtered. Related to this topic, it is advisable to use well-accepted terminology and complete words when describing proteome and phosphoproteome. The interexchange of a "peptide" and a "phosphopeptide" in the text confuses and misleads.

      To clarify the phosphoproteome analysis:

      We cite a previous description of the phosphoproteomics sample preparation workflow (lines 1124-1125 of the first submission for example). Our quantitative phosphoproteomics experiments comprise two datasets generated from the same multiplexed samples. The samples were split at the point of phosphopeptide enrichment. Ninety-five percent of the samples were subjected to phosphopeptide enrichment (titanium dioxide followed by nickel affinity chromatography; “enriched samples”). Five percent of the samples were reserved as a reference for the non-enriched proteome (“non-enriched samples”). To clarify this point, we have added the sentences “Approximately 95% of the proteomics sample was used for phosphopeptide enrichment” and “The remaining 5% of the sample was not subjected to the phosphopeptide enrichment protocol” to the Methods sections, after describing the multiplexing steps.

      The samples were fractionated separately and run separately on an LC-MS system, which is described in the Methods section, for example lines 1130-1149 of the first submission. Raw files of the phosphopeptide-enriched and unenriched samples were analyzed separately, which is described in the Methods section, for example lines 1151-1158 of the first submission. To clarify this point, we have added the sentence “Raw files of the phosphopeptide-enriched and unenriched samples were analyzed separately” to the Methods sections. Many of the search parameters and descriptions of normalization and protein abundances were described in lines 1085-1093 of the first submission in reference to the 24h SPARK depletion proteome. We added this information to the description of the SPARK depletion time course phosphoproteome data analysis: “The allowed mass tolerance for precursor and fragment ions was 10 ppm and 0.02 Da, respectively. False discovery was assessed using Percolator with a concatenated target/decoy strategy using a strict FDR of 0.01, relaxed FDR of 0.05, and maximum Delta CN of 0.05. Only unique peptide quantification values were used. Co-isolation and signal-to-noise thresholds were set to 50% and 10, respectively. Normalization was performed according to total peptide amount. In the case of the unenriched samples, protein abundances were calculated from summation of non-phosphopeptide abundances.”

      We hope that this clarifies how the unenriched sample protein-level abundances were calculated. When we discuss “protein abundance”, we are referencing the unenriched sample summed non-phosphopeptide abundance. Our phosphoproteome analysis was based only on phosphopeptides, as our phosphopeptide enrichment resulted in 99% efficiency, and peptides lacking phosphorylation sites were filtered out before subsequent analyses. We used “peptide” and “phosphopeptide” interchangeably because the only peptide-level analysis performed was based on phosphopeptide abundances. We have changed any mention of “peptide” to “phosphopeptide” in the main text. 

      “The normalization step, computing the ratio of the phosphopeptide to the protein (peptide) intensity, appears omitted. It is the most critical step of the analysis.”:

      Unlike common differential gene expression analysis pipelines, proteomics analysis pipelines are not settled. Many analyses do not perform peptide-to-parent-protein corrections; some normalize phosphopeptide abundances to parent protein abundances calculated from summing non-phosphopeptides or a combination of phosphopeptide and non-phosphopeptides on an ad hoc basis; some calculate global normalization factors based on regressions of protein and phosphopeptide abundances or other pairwise comparisons. A caveat of protein normalization of phosphopeptides is that it over-corrects cases in which protein abundance and phosphorylation are interdependent, as is the case for auto-phosphorylation and some activation loop phosphorylations (PMID: 37394063). We used the approach that retained the greatest complexity of the data, which is to not normalize abundances across different mass spectrometry experiments and discard information that was not in the overlap. We have updated Supplementary File 3.3 to include protein-level quantification values (from Supplementary File 3.2) if measured.

      We clarified that the phosphopeptide abundances and protein-level abundances were derived from different datasets that were each internally normalized (globally centered by total peptide amount). Protein-level abundances were summed from non-phosphopeptide abundances. The calculated log2 changes are based on the globally centered data within each dataset. We analyzed the kinetic profiles of changing phosphopeptide abundances relative to a control using approaches similar to those described for several recent temporally resolved T. gondii phosphoproteomes (e.g. PMID: 37933960, 35976251, 36265000, 29141230) and as described in the Methods. The approach does not first correct for unenriched-sample parent protein abundance—in some applications, unenriched samples are not collected at all; instead, phosphopeptide ratios are median-normalized to non-phosphopeptide ratios (quantified due to inefficient phosphopeptide enrichment) and are individually tested against the null distribution of non-phosphopeptide ratios (e.g. PMID: 36265000, 29141230). We did not use this approach because our phosphopeptide enrichment was 99% efficient (18518 phosphopeptides of 18758 peptides with quantification values). In several cases using our approach, parent protein abundance is not quantified in the unenriched proteome dataset, but phosphopeptides are reliably quantified in the enriched proteome dataset. We note that phosphopeptide abundance changes can be difficult to interpret in such cases, e.g. in the first submission lines 178-186 and 193-194. We have added similar text to the results noting that in the case of PKA and PKG, both unenriched parent protein and enriched phosphopeptide abundances decreased (see below). We have also moved speculation about whether SPARK phosphorylates the activation loop of PKA and PKG, or whether the down-regulation of PKA and PKG arises from indirect effects, to the Discussion.
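
      To make the quantities in this discussion concrete, the sketch below works through them on made-up numbers. It is not the authors' pipeline: the abundance values, site names, and function names are hypothetical, and only the peptide counts (18518 of 18758) come from the response. It computes the enrichment efficiency, centres each channel by total peptide amount, takes log2 changes against the control, and then shows the optional parent-protein correction that the reviewer requested and the authors chose not to apply.

```python
import math


def enrichment_efficiency(n_phosphopeptides, n_peptides):
    """Fraction of quantified peptides that carry a phosphorylation site."""
    return n_phosphopeptides / n_peptides


def normalize_by_total(channels):
    """Scale every channel so its summed abundance equals the mean channel total."""
    totals = {name: sum(values.values()) for name, values in channels.items()}
    target = sum(totals.values()) / len(totals)
    return {
        name: {pep: v * target / totals[name] for pep, v in values.items()}
        for name, values in channels.items()
    }


def log2_fold_change(treated, control):
    """Per-feature log2(treated / control) for features present in both."""
    return {k: math.log2(treated[k] / control[k]) for k in treated if k in control}


efficiency = enrichment_efficiency(18518, 18758)  # counts quoted in the response

# Hypothetical abundances: one enriched and one unenriched dataset,
# each normalised internally, as described in the response.
enriched = normalize_by_total({
    "0h": {"siteA": 100.0, "siteB": 100.0, "siteC": 100.0},
    "24h": {"siteA": 50.0, "siteB": 200.0, "siteC": 200.0},
})
unenriched = normalize_by_total({
    "0h": {"protA": 80.0, "protB": 120.0},
    "24h": {"protA": 40.0, "protB": 160.0},
})

site_lfc = log2_fold_change(enriched["24h"], enriched["0h"])
protein_lfc = log2_fold_change(unenriched["24h"], unenriched["0h"])

# Optional parent-protein correction (the step under debate): subtract the
# parent protein's log2 change from the phosphosite's log2 change.
corrected_siteA = site_lfc["siteA"] - protein_lfc["protA"]

print(f"enrichment efficiency: {100 * efficiency:.1f}%")
print(site_lfc, protein_lfc, corrected_siteA)
```

      The correction is only defined where the parent protein was quantified in the unenriched dataset, and it can over-correct when phosphorylation and protein stability are coupled, which is the caveat raised above.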

      We have moved comparisons of protein and phosphopeptide abundances from the Results to the Discussion. We added the following sentences to the result section Clustering of phosphopeptide kinetics identifies seven response signatures: “Because non-phosphopeptide and phosphopeptide abundances were quantified in different mass spectrometry experiments, it is challenging to compare the rates of phosphopeptide and parent protein abundance changes, especially when phosphorylation status and protein stability are interconnected. In general, both PKA C1, PKA R, and PKG protein and phosphosite abundances decreased following SPARK depletion (Figure 3—figure supplement 1), as discussed further below. We also observed down-regulation of phosphosite and protein abundances of a MIF4G domain protein.” Figure 3—figure supplement 1E is a new panel that shows PKA C1, PKA R, and PKG phosphopeptide and parent protein abundances along with global changes in phosphopeptide and parent protein abundances in the cases which both were quantified. We changed lines 278-282 in the first submission to “The SPARK depletion time course phosphoproteome showed a reduction in the abundance of PKA C1 T190 and T341, which are located in the activation loop and C-terminal tail, respectively (Figure 4A). Several phosphosites residing in the N terminus of PKA R (e.g. S17, S27, and S94) also decreased following SPARK depletion (Figure 4B).” We changed lines 313-315 in the first submission to “The SPARK depletion time course phosphoproteome showed a reduction in the abundance of several phosphosites residing in the N terminus of PKG as well as T838, which corresponds to the activation loop (Figure 5A). By contrast, S105 did not greatly decrease, and S40 abundance slightly increased.”

      The description of experiments should be more detailed. For example, the 3, 8, and 24 h treatments were used reversely; thus, they should be emphasized as time points before natural egress. Consequently, it seems that 3h treatment should be prioritized, given the SPARK/SPARKEL role in egress/invasion. Unexpectedly, the study draws more attention to a 24-hour treatment. If the AID-SPARK/SPARKEL is eliminated within 1h, parasites undoubtedly accumulate numerous secondary defects during a prolonged 23h deprivation. Since the SPARK pathway activates kinase/phosphatase cascades, the 24h data is likely overwhelmed with the consequences of the long-term complex degradation, making it a poor source of the putative SPARK substrates. Likewise, the downregulation of PKA observed in the 8 hours after SPARK depletion may be an indirect effect of the SPARK degradation. The direct effects and immediate substrates should be detectable within 2-3h of auxin treatment of the nearly egressing cultures.

      The first submission described how parasites were harvested at 32 hours post-infection with 0, 3, 8, or 24 hours of IAA treatment (lines 157-160, 1097-1110, and Figure 3B). To reiterate this experimental detail, we have added “harvested 32 hours post-infection” to the sentence “...quantitative proteomics with tandem mass tag multiplexing that included samples with 0, 3, 8, and 24 hours of SPARK or SPARKEL depletion” and similarly in the figure legend. The time points are unrelated to natural egress because the experiment was terminated at 32 hours post-infection, which is earlier than the window typically used to study natural egress under these conditions (40-48 hours post-infection). We chose to terminate the experiment before natural egress to better localize phosphopeptide changes related to SPARK depletion. The phosphoproteome undergoes dramatic reorganization during egress due to the activity of myriad kinases and phosphatases (see PMID: 35976251, 37933960, and 36265000), which would have likely complicated the signal.
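
      The timing can be laid out explicitly. The short sketch below derives when IAA would have been added for each depletion duration, given the common 32-hour harvest, and compares the harvest with the 40-48 hour egress window mentioned above. The variable names are illustrative.

```python
HARVEST_HPI = 32                   # all samples harvested 32 hours post-infection
IAA_DURATIONS_H = [0, 3, 8, 24]    # hours of depletion before harvest
NATURAL_EGRESS_WINDOW_HPI = (40, 48)

# Hours post-infection at which IAA is added for each depletion duration.
iaa_start_hpi = {d: HARVEST_HPI - d for d in IAA_DURATIONS_H}

# Gap between harvest and the start of the typical natural egress window.
hours_before_egress_window = NATURAL_EGRESS_WINDOW_HPI[0] - HARVEST_HPI

for duration, start in iaa_start_hpi.items():
    print(f"{duration:>2} h depletion: IAA added at {start} hpi, harvested at {HARVEST_HPI} hpi")
print(f"harvest precedes the egress window by {hours_before_egress_window} h")
```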

      A pivotal result motivating time-course experiments and analysis was that SPARK/SPARKEL's role in egress and invasion emerges only after an extended depletion period (Figure 2E–J, first submission lines 126-145). The 24h depletion was used in the experimental system that first identified SPARK as a regulator of egress, which motivated our initial experiments, as stated in the first submission lines 126-144 and 149-151. We draw attention to the observation that SPARK and SPARKEL phenotypes develop over time in the first submission, lines 137-145. The role for SPARK/SPARKEL in egress/invasion does not manifest at 3h depletion; it manifests at 24h depletion. To ensure that this point is not overlooked by the reader, we have created a new heading in the Results section (SPARK and SPARKEL depletion phenotypes develop over time) for the paragraph that was previously lines 137-145. The remainder of the manuscript integrates data from proteomic, genetic, and cell-based assays across temporal dimensions to build a working model of how the phenotypes associated with SPARK depletion develop over time.

      Underpinning this comment is an assumption that phosphopeptides that decrease the most rapidly following a kinase’s depletion are direct substrates, whereas phosphopeptides that decrease with slower kinetics are not. This is not always the case. Consider a kinase that phosphorylates sites on substrate A and substrate B. The site on substrate A is also the target of a phosphatase, whereas the site on substrate B is recalcitrant to phosphatase activity. If the kinase were inhibited, then the site on substrate A would be actively dephosphorylated. As measured by a phosphoproteomics experiment, the abundance of the substrate A phosphopeptide would drop rapidly due to the inactivity of the kinase and activity of the phosphatase. In the text, we called such sites “constitutively regulated” or dynamic—they are actively dephosphorylated and phosphorylated within a short timeframe. The phosphosite on substrate B is comparatively static; once it is phosphorylated by the kinase, it is unaffected by subsequent inhibition of the kinase. Only newly synthesized substrate B molecules would be affected by kinase inhibition. As measured by a phosphoproteomics experiment, the abundance of the substrate B phosphopeptide would drop more gradually after kinase inhibition, as the unphosphorylated peptide is found only on newly synthesized proteins that were not previously exposed to kinase activity. An example of the scenario described for substrate A would be that of yeast Cdk1 T14/Y15, which is phosphorylated by Wee1 and dephosphorylated by Cdc25 (e.g. PMID: 7880537). An example of the scenario described for substrate B would be that of the human PKA C activation loop T197, which is phosphorylated by PDK1 and is phosphatase-resistant under physiological conditions (e.g. PMID: 22493239, 15533936).
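
      The substrate A versus substrate B argument can be sketched as a minimal first-order model. Everything numerical here is an assumption made for illustration: the kinase is taken to be lost completely and instantly at time zero, both sites start fully phosphorylated, substrate A is dephosphorylated at a hypothetical rate of 2 per hour, and substrate B loses phosphorylated molecules only through protein turnover at a hypothetical rate of 0.1 per hour.

```python
import math


def phospho_fraction_remaining(rate_per_hour, hours):
    """Remaining phosphorylated fraction under first-order loss, exp(-k * t)."""
    return math.exp(-rate_per_hour * hours)


K_PHOSPHATASE_A = 2.0  # assumed: site on substrate A is actively dephosphorylated
K_TURNOVER_B = 0.1     # assumed: site on substrate B is lost only as old protein is replaced

time_points_h = [3, 8, 24]
substrate_a = {t: phospho_fraction_remaining(K_PHOSPHATASE_A, t) for t in time_points_h}
substrate_b = {t: phospho_fraction_remaining(K_TURNOVER_B, t) for t in time_points_h}

for t in time_points_h:
    print(f"{t:>2} h  A: {substrate_a[t]:.4f}  B: {substrate_b[t]:.4f}")
```

      With these assumed rates, the site on substrate A is almost gone by 3 hours, while the site on substrate B is still mostly phosphorylated at 3 hours and declines substantially only by 24 hours, even though both are direct targets of the same kinase in this model.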

      Both substrate A and B may be “direct” and functionally relevant targets of the kinase. Categorizing substrates as “immediate” is comparatively less informative in this context (although it may be relevant when studying fast, synchronized processes with high temporal resolution, such as induced Plasmodium spp. gametocyte activation or stimulation of T. gondii secretion). Furthermore, our earlier experiments had shown that the role for SPARK/SPARKEL in motility manifests after 3h depletion and is complete by 24h depletion. By this logic, we were most interested in the candidates showing differences at these time points. We conducted proximity labeling experiments to identify the overlap of proteins that exhibited SPARK-dependent decreases in the global proteomics and were also proximal to SPARK in space (first submission Figure 3I and lines 260-275), thus revealing a prioritized list of candidates, which included PKG and PKA. When technically feasible, we included a temporal dimension to follow-up experiments, rather than relying on a 24h terminal comparison (e.g. Figure 4E–H, Figure 5D–E, Figure 7D–F, Figure 7I–K; all first submission).

      Fig2 (B and C). What antibodies had been used to detect tagged proteins? There is a concern regarding the use of multiple tags attached to the same protein to the point that it doubles the size of the studied protein. The switch of the mobility of the SPARK and SPARKEL on the WB due to a change in MW adds to the confusion. Furthermore, the study did not use all the fused epitopes (e.g., HA). At the same time, the same V5 tag was used to detect two factors in the same parasite. Although the controls are provided, it does not eliminate the possibility that the second band on the WB results from one protein degradation rather than the presence of two individual proteins. Different tags should be used to confirm the co-expression of two proteins. Panel E is missing the X-axis label.

      Figure 2B was incorrectly labeled; the labels corresponding to SPARK and SPARKEL were switched. We corrected this error in the revised figures. The antibodies used were mouse monoclonal anti-V5 as described in the key resources table of the first submission. We added “V5” to Figure 2A and 2B. Regarding the effect of the tagging payload attached to the proteins, we have included in all assays a control relative to a parental strain (TIR1) without a tagging payload, and additionally included internal controls within tagged strains to calculate dependency of a phenotype on IAA treatment. The western blots in Figure 2B and 2C are from two different strains and experiments. The strains and experiments are described in the first submission main text (lines 113-124), the figure legend (lines 1847-1850), the key resources table, and the methods (lines 650-664, 872-891). A description of the SPARK-AID/SPARKEL-mNG strain was included in the key resources table but omitted in the methods. We therefore added the following section to the Methods:

      “SPARKEL-V5-mNG-Ty/SPARK-V5-mAID-HA/RHΔku80Δhxgprt/TIR1

      The HiT vector cutting unit gBlock for SPARKEL (P1) was cloned into the pALH193 HiT empty vector. The vector was linearized with BsaI and co-transfected with the pSS014 Cas9 expression plasmid into SPARK-V5-mAID-HA/RHΔku80Δhxgprt/TIR1 parasites. Clones were selected with 1 µM pyrimethamine and isolated via limiting dilution to generate the SPARKEL-V5-mNG-Ty/SPARK-V5-mAID-HA/RHΔku80Δhxgprt/TIR1 strain. Clones were verified by PCR amplification and sequencing of the junction between the 3′ end of SPARKEL (5′-GGGAGGCCACAACGGCGC-3′) and 5′ end of the protein tag (5′-gggggtcggtcatgttacgt-3′).”

      To clarify the expected MW of each species, we have added the following text to the Methods:

      “The expected molecular weight of SPARKEL-V5-HaloTag-mAID-Ty is 66 kDa, from the 42.7 kDa tagging payload and 23.3 kDa protein sequence. The expected molecular weight of SPARK-V5-mCherry-HA is 89.7 kDa, from the 31.9 kDa tagging payload and 57.8 kDa protein sequence. The expected molecular weight of SPARK-V5-mAID-HA is 71.3 kDa, from the 13.5 kDa tagging payload and 57.8 kDa protein sequence. The expected molecular weight of SPARKEL-V5-mNG-Ty is 55.2 kDa, from the 31.9 kDa tagging payload and 23.3 kDa protein sequence.”
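      The stated totals can be checked by adding each tagging payload to the corresponding protein mass. The short sketch below only re-does that arithmetic; the numbers are copied from the Methods text quoted above, and the dictionary layout and function name are illustrative assumptions.

```python
# All values in kDa, copied from the quoted Methods text:
# construct name -> (tagging payload, protein sequence, stated total)
CONSTRUCTS = {
    "SPARKEL-V5-HaloTag-mAID-Ty": (42.7, 23.3, 66.0),
    "SPARK-V5-mCherry-HA": (31.9, 57.8, 89.7),
    "SPARK-V5-mAID-HA": (13.5, 57.8, 71.3),
    "SPARKEL-V5-mNG-Ty": (31.9, 23.3, 55.2),
}


def expected_mw(payload_kda, protein_kda):
    """Sum payload and protein mass, rounded to one decimal place."""
    return round(payload_kda + protein_kda, 1)


for name, (payload, protein, stated) in CONSTRUCTS.items():
    total = expected_mw(payload, protein)
    status = "matches" if total == stated else "differs from"
    print(f"{name}: {total} kDa ({status} stated {stated} kDa)")
```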

      SPARK and SPARKEL are lowly expressed, which may have been compounded by basal degradation due to the AID tag (see for example Figure 3—figure supplement 1D of the first submission). We attempted several immunoblot conditions and antibodies, and only the V5 antibody proved effective in recognizing these proteins above the limit of detection. For this reason, we included an additional single-tagged control in each immunoblot experiment. Uncropped images of the blots are included in the first submission as Figure 2—figure supplement 1D and E and as Figure 2 source data. We added the following statement to the results section of the text:

      “However, SPARK and SPARKEL abundances are low and approach the limit of detection. We could only detect each protein by the V5 epitope. Although our experiments included single-tagged controls, we cannot formally eliminate the possibility that SPARK-AID yields degradation products that run at the expected molecular weight of SPARKEL. More sensitive methods, such as targeted mass spectrometry, may be required to measure the absolute abundance and stoichiometries of SPARK and SPARKEL.”

      We added “h +IAA” to the x-axis of panel 2E.

      Fig. 3. There is plentiful proteomic data on the factor-depleted parasites. Can it be used to confirm the co-degradation of the SPARK/SPARKEL complex components? This figure mainly includes quality control data that can be moved to Supplement. Did you detect SPARKEL in the TurboID experiment described in panel I? The plot shows only an AGC kinase.

      SPARK and SPARKEL are lowly expressed, and we often do not detect SPARK or SPARKEL peptides with quantification values in complex samples (such as global depletion proteomes and phosphoproteomes; IPs and streptavidin pull-downs are comparatively less complex, with IPs being the least complex samples). We discussed this caveat in the first submission lines 178-186. To additionally clarify this point, we have added “We were unable to measure SPARK or SPARKEL abundances in this proteome” earlier in the text.

      We consider the figure panels relevant to the discussion in the text.

      SPARKEL was not quantified in the SPARK-TurboID experiment (Supplementary File 2). We have added “SPARKEL was not quantified in this experiment” to the text. “Not quantified” is a different outcome from “quantified but not enriched”. The interaction between SPARK and SPARKEL is supported by five other independent interaction experiments in which SPARKEL was quantified (Figure 1A, 1D, 1E; and Figure 1—figure supplement 1). The added insight from the SPARK proximity labeling experiments comes from integration with the global proteomics, which suggests that AGC kinases are in proximity to SPARK and exhibit SPARK-dependent stability and hence activity. The logic of the proximity labeling experiment is described in lines 258-275 of the first submission.

      Fig. 6G is missing a ΔBFD1 control for unbiased evaluation of the SPARK KD effect.

      The logic of this experiment was to evaluate whether excess differentiation caused by SPARK and PKA C3 depletion (Figure 6A and 6B) was dependent on the BFD1 circuit. The ∆bfd1 phenotype is well-established under these experimental conditions: parasites lacking BFD1 do not differentiate under spontaneous or alkaline conditions (e.g. PMID: 31955846, 37081202, 37770433). Parasites lacking BFD1 do not differentiate when SPARK and PKA C3 are depleted, suggesting that differentiation caused by SPARK or PKA C3 depletion occurs through the BFD1 circuit. If differentiation caused by SPARK or PKA C3 depletion did not depend on the BFD1 circuit, we might have observed differentiation in the SPARK- and PKA C3-AID/∆bfd1 mutants.

      To clarify this point, we have changed the first sentences of the last paragraph in the results section Depletion of SPARK, SPARKEL, or PKA C3 promotes chronic differentiation: “To assess whether excess differentiation caused by SPARK and PKA C3 depletion is dependent on a previously characterized transcriptional regulator of differentiation, BFD1 (Waldman et al., 2020), we knocked out the BFD1 CDS with a sortable dTomato cassette in the SPARK- and PKA C3-AID strains (Figure 6–figure supplement 1). The resulting SPARK- and PKA C3-AID/∆bfd1 mutants failed to undergo differentiation as measured by cyst wall staining (Figure 6G–H), suggesting that differentiation caused by depletion of these kinases depends on the BFD1 circuit.”

      Lines 239-242. The logic behind the categories of "constitutively regulated sites" and "newly synthesized proteins dependent on SPARK activation" is odd. The former (3h treatment) represents the SPARK-specific events (even though it should be shortened to 1-2h), while an 8h treatment is already contaminated with secondary effects. Since Toxoplasma divides asynchronously, the "newly synthesized" proteins will be present at the time. Also, the protein phosphorylation does not always lead to substrate activation; it can be repressive, too.

      We describe the logic in response to a comment above (substrate A vs. substrate B). It is correct that T. gondii divides asynchronously, with a cell cycle of approximately 8 hours, and 60% of parasites in G1 at a given time (PMID: 11420103). The proteomics experiments measure peptide and protein abundances at a population level. Newly synthesized proteins will be present at all time points; but the proportion of proteins synthesized after SPARK depletion relative to proteins synthesized before SPARK depletion will increase over time.
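      The population-level argument can be made concrete with a deliberately simple model. The sketch below is an assumption-laden illustration rather than anything measured in the study: it supposes that total protein doubles once per cell cycle (taken as 8 hours, following the figure cited above) and that protein made before depletion is stable. The function name is hypothetical.

```python
def fraction_synthesized_after(t_hours, doubling_time_hours=8.0):
    """Share of the protein pool made after depletion began at t = 0.

    Assumes exponential growth of total protein with the given doubling
    time and no degradation of protein synthesized before t = 0.
    """
    return 1.0 - 2.0 ** (-t_hours / doubling_time_hours)


for t in (0, 3, 8, 24):
    print(f"{t:>2} h after depletion: {fraction_synthesized_after(t):.2f}")
```

      Under these assumptions newly made protein is present at every time point after depletion, but it only becomes the majority of the pool after roughly one cell cycle, which is consistent with slower kinetics for phosphatase-resistant sites.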

      We moved lines 238-243 from the first submission to the Discussion.

      It is accurate that phosphorylation does not always lead to substrate activation; it can also be repressive or not change substrate behavior. However, in the case of protein kinases, activation loop phosphorylation is highly correlated with activation (e.g. PMID: 15350212, 31521607).

      Line 250-252: Because the SPARK degradation did not affect intracellular replication, SPARK is unlikely to affect cell cycle-specific phosphorylation.

      To parallel the prior sentences describing different SPARK-dependent down-regulated clusters, we truncated this sentence to “The final cluster of depleted phosphopeptides, Cluster 4, only exhibits down-regulation at 8h of IAA treatment.”

      SPARKEL depletion did not significantly affect intracellular replication under the time frames investigated here (approximately 25 hours post-invasion; Figure 2D). A prior study reported that SPARK depletion did not affect intracellular replication measured on a similar timescale (PMID: 35484233).

      The opening sentence of the Discussion: Typically, we refer to the newly discovered proteins as the orthologs of the previously discovered counterparts and not the vice versa. Thus, calling Toxoplasma SPARK the ortholog of mammalian PDK1 would be more appropriate.

      We changed the opening sentence of the Discussion to “SPARK is an ortholog of PDK1, which is considered a key regulator of AGC kinases”.

      Reviewer #3 (Recommendations For The Authors):

      (1) Authors should show alignment of SPARKEL with Elongin C. Are key residues conserved?

      We have added an alignment of the SKP1/BTB/POZ domains of Homo sapiens elongin C, S. cerevisiae elongin C, and T. gondii SPARKEL as Figure 1—figure supplement 1B. This panel highlights elongin B interface, cullin binding sites, and target protein binding sites based on the human elongin C annotation. As discussed below, these interfaces may not be functionally conserved in T. gondii. Ultimately, future mechanistic and structural studies beyond the scope of the current work will be required to determine how SPARK and SPARKEL physically interact. The Discussion states, “further biochemical studies are required to discern the regulatory interactions between SPARK and SPARKEL” (lines 590-591).

      (2) The failure to identify other Elongin B/C complex members should be addressed by direct IP analysis.

      Indeed, elongin C has traditionally been characterized as a component of multisubunit complexes comprising Elongin A/B/C or Elongin BC/cullin/SOCS that regulate transcription or function as ubiquitin ligases, respectively (for a review, PMID: 22649776). We see two major issues when attempting to generalize these results to apicomplexan parasites. First, nearly all studies of the function of elongin C have been conducted in a single eukaryotic supergroup (the opisthokonts, including yeast and metazoans). The majority of eukaryotic diversity exists in other supergroups, including the SAR supergroup to which apicomplexans such as T. gondii belong (PMID: 31606140). Proteins with elongin C domains may serve alternative and unexplored functions in non-opisthokont unicellular eukaryotes. Second (in support of the first), we were unable to find orthologs of many of the opisthokont complex members in T. gondii, as systematically described below.

      By BLAST, the most similar protein to SPARKEL in S. cerevisiae is ELC1 (YPL046C), with an E value of 0.003. The next most similar protein was SCF ubiquitin ligase subunit SKP1 (YDR328C) with an E value of 0.62. ELC1 is 99 amino acids long. The Elongin C (IPR039948) and SKP1/BTB/POZ superfamily domains (IPR011333) span most of this sequence. SPARKEL is 216 amino acids long; the Elongin C and SKP1/BTB/POZ superfamily domains occupy the C-terminal half of the protein. The N-terminal domain of SPARKEL may be important for its function; however, future work is required to address this hypothesis.

      Elongin B: Elongin B is not found universally even among opisthokonts; fungi and choanoflagellates lack obvious orthologs. The most similar T. gondii protein to human Elongin B (Q15370) by BLAST is TGME49_223125 (E = 0.017), an apicoplast ubiquitin-like protein PUBL (PMID: 28655825, 33053376). TGME49_223125 has a C-terminal ubiquitin-like domain (IPR000626) but no ELOB domain (IPR039049); indeed, no T. gondii protein has an ELOB domain that can be identified by sequence searching. Given the lack of similarity between EloB and TGME49_223125, as well as this protein’s possible red algal endosymbiont origin, we consider it an unlikely ortholog of EloB and topologically unlikely to interact with the SPARK/SPARKEL complex. We did not detect TGME49_223125 in SPARK or SPARKEL IPs (Supplementary File 1).

      Elongin A: T. gondii appears to lack a human elongin A ortholog (Q14241) on the basis of sequence similarity. The most similar T. gondii protein to yeast Elongin A (O59671) by BLAST is TGME49_299230 (E = 0.022). Yeast EloA is 263 amino acids. TGME49_299230 is 1101 amino acids and does not have an EloA domain (IPR010684), suggesting it is not a true EloA ortholog.

      Suppressor of cytokine signaling (SOCS): T. gondii appears to lack human SOCS1 or SOCS2 orthologs (O15524 and O14508) on the basis of sequence similarity. We were unable to identify T. gondii proteins with SOCS domains (PF07525, SM00253, SM00969, and SSF158235).

      Von Hippel-Lindau tumor suppressor (VHL): T. gondii appears to lack a human VHL ortholog (P40337) on the basis of sequence similarity. We were unable to identify T. gondii proteins with VHL domains (IPR024048, IPR024053, PF01847, and SSF49468).

      Cul-2/5: Cullins appeared early in the eukaryotic radiation (PMID: 21554755), and thus T. gondii possesses several. Since the ELC complex has been best characterized with human cullin-2 (Q13617) and cullin-5 (Q93034), we searched for orthologs of these proteins and identified TGME49_289310, TGME49_289310, and TGME49_316660. TGME49_289310 functionally resembles cullin-1 of the SCF complex (PMID: 31348812). None of these proteins were enriched in the SPARK or SPARKEL IPs (Supplementary Table 1).

      Rbx1: We searched for human Rbx1 orthologs (P62877) and identified TGME49_213690, which functionally resembles Rbx1 of the SCF complex (PMID: 31348812); as well as several other RING proteins (TGME49_267520, TGME49_277740, TGME49_261990, and TGME49_232160) that were not found in the SPARK or SPARKEL IPs (Supplementary File 1).

      Rbx2: We searched for human Rbx2 orthologs (Q9UBF6) and identified several RING proteins (TGME49_285190, TGME49_254700, TGME49_292340, TGME49_226740, TGME49_244610, and TGME49_304460) that were not found in the SPARK or SPARKEL IPs (Supplementary File 1). No T. gondii protein has an Rbx2 domain (cd16466) that can be identified by sequence searching.

      In conclusion, we conducted “direct IP analysis” (Figure 1A, 1D; Figure 1-supplement 1A) of the SPARK and SPARKEL complex in the first submission of the manuscript. The observation that SPARK and SPARKEL form strong interactions was validated in cellulo via proximity labeling (Figure 1E; Figure 1-supplement 1B) in the first submission of the manuscript. These results are described together in the results section SPARK complexes with an elongin-like protein, SPARKEL (lines 75-110, first submission of manuscript). The failure to identify an interaction between SPARKEL and Elongin B/C complex members in T. gondii may be due to the observation that Elongin B and several ELC complex members do not exist in most eukaryotes, including T. gondii. We added the sentences “The function of proteins with Elongin C-like domains has not been widely investigated in unicellular eukaryotes” to the Results and “However, the SPARK and SPARKEL IPs and proximity experiments failed to identify obvious components of ubiquitin ligase complexes” to the Discussion.

      (3) PKA and PKG half-lives should be measured as well as their transcript abundances.

      The finding that PKA C1 and PKG protein abundances decreased upon SPARK/SPARKEL depletion was internally consistent across experiments. This down-regulation may be due to transcriptional, translational, or post-translational mechanisms. We measured PKG and PKA C1 transcript abundances in SPARK-AID and TIR1 parasites after 24 hours of IAA treatment using RT-qPCR. We did not detect significant differences in transcript levels of the queried kinases. These findings suggest that SPARK depletion leads to PKG and PKA down-regulation through post-transcriptional mechanisms. Translational control is normally enacted globally, for example through regulation of eukaryotic translation factors (PMID: 15459663). The rapid and specific down-regulation of PKG and PKA C1 would suggest that the kinase abundance levels are regulated by non-global translational mechanisms (e.g. mRNA-specific) or rather post-translational mechanisms.

      Substantial additional work is required to determine protein half-lives in eukaryotic parasites. In our discussion of possible mechanisms and models, we were agnostic as to the cause of reduced PKG and PKA abundances upon SPARK depletion. We note in the discussion, “The cause for reduction of PKA C1 and PKG levels requires further study” (lines 541-542).

    1. Reviewer #2 (Public Review):

      Summary:

      This study comprehensively presents data from single nuclei sequencing of Heigai pig skeletal muscle in response to conjugated linoleic acid supplementation. The authors identify changes in myofiber type and adipocyte subpopulations induced by linoleic acid at a depth previously unobserved. The authors show that linoleic acid supplementation decreased the total myofiber count, specifically reducing type II muscle fiber types (IIB), myotendinous junctions, and neuromuscular junctions, whereas type I muscle fibers are increased. Moreover, the authors identify changes in adipocyte pools, specifically in a population marked by SCD1/DGAT2. To validate the skeletal muscle remodeling in response to linoleic acid supplementation, the authors compare transcriptomics data from Laiwu pigs, a model of high intramuscular fat, to Heigai pigs. The results verify changes in adipocyte subpopulations when pigs have higher intramuscular fat, either genetically or diet-induced. Targeted examination using cell-cell communication network analysis revealed associations of high intramuscular fat with fibro-adipogenic progenitors (FAPs). The authors then conclude that conjugated linoleic acid induces FAPs towards adipogenic commitment. Specifically, they show that linoleic acid stimulates FAPs to become SCD1/DGAT2+ adipocytes via JNK signaling. The authors conclude that their findings demonstrate the effects of conjugated linoleic acid on skeletal muscle fat formation in pigs, which could serve as a model for studying human skeletal muscle diseases.

      Strengths:

      The comprehensive data analysis provides information on conjugated linoleic acid effects on pig skeletal muscle and organ function. The notion that linoleic acid induces skeletal muscle composition and fat accumulation is considered a strength and demonstrates the effect of dietary interactions on organ remodeling. This could have implications for the pig farming industry to promote muscle marbling. Additionally, these data may inform the remodeling of human skeletal muscle under dietary behaviors, such as elimination and supplementation diets and chronic overnutrition of nutrient-poor diets. However, the biggest strength resides in thorough data collection at the single nuclei level, which was extrapolated to other types of Chinese pigs.

      Weaknesses:

      While the authors generated a sizeable comprehensive dataset, cellular and molecular validation needs to be improved. For example, the single nuclei data suggest changes in myofiber type after linoleic acid supplementation, yet these data are not validated by other methodologies. Similarly, the authors suggest that linoleic acid alters adipocyte populations, FAPs, and preadipocytes; however, no cellular and molecular analysis was performed to reveal if these trajectories indeed apply. Attempts to identify JNK signaling pathways appear superficial and do not delve deeper into mechanistic action or transcriptional regulation. Notably, a variety of single cell studies have been performed on mouse/human skeletal muscle and adipose tissues. Yet, the authors need to discuss how the populations they have identified support the existing literature on cell-type populations in skeletal muscle. Moreover, the authors nicely incorporate the two pig models into their results, but the authors only examine one muscle group. It would be interesting to know whether other muscle groups respond similarly or differently to linoleic acid supplementation. Further, it was unclear whether Heigai and Laiwu pigs were both fed conjugated linoleic acid or whether the comparison was between linoleic acid-fed Heigai pigs and Laiwu pigs (as a model of high intramuscular fat). With this in mind, the authors do not discuss how their results could be implicated in human and pig nutrition, such as desirability and cost-effectiveness for pig farmers and human diets high in linoleic acid. Notably, while single nuclei data is comprehensive, there needs to be a statement on data deposition and code availability, allowing others access to these datasets. Moreover, the experimental designs do not denote the conjugated linoleic acid supplementation duration. Several immunostainings performed could be quantified to validate statements. This reviewer also found the Nile Red staining hard to interpret visually, and it did not appear to support the conclusions convincingly. Within Figure 7, several letters (assuming they represent statistical significance) are present on the graphs but are not denoted within the figure legend.

    1. Reviewer #2 (Public Review):

      Summary:

      The study investigates the brain's functional connectivity (FC) dynamics across different timescales using simultaneous recordings of intracranial EEG/source-localized EEG and fMRI. The primary research goal was to determine which of three convergence/divergence scenarios is the most likely to occur.

      The results indicate that despite similar FC patterns found in different data modalities, the time points were not aligned, indicating spatial convergence but temporal divergence.

      The researchers also found that FC patterns in different frequencies do not overlap significantly, emphasizing the multi-frequency nature of brain connectivity. Such asynchronous activity across frequency bands supports the idea of multiple connectivity states that operate independently and are organized into a multiplex system.

      Strengths:

      The data supporting the authors' claims are convincing and come from simultaneous recordings of fMRI and iEEG/EEG, which has been recently developed and adapted.

      The analysis methods are solid and involve a novel approach to analyzing the co-occurrence of FC patterns across modalities (cross-modal recurrence plot, CRP) and robust statistics, including replication of the main results using multiple operationalizations of the functional connectome (e.g., amplitude, orthogonalized, and phase-based coupling).

      In addition, the authors provided a detailed interpretation of the results, placing them in the context of recent advances and understanding of the relationships between functional connectivity and cognitive states.

      Weaknesses:

      Despite the impressive work, the paper still lacks some analyses to make it complete.

      Firstly, the effect of the window size is unclear, especially in the case of different frequencies where the number of cycles that fall in a window will vary drastically. A typical oscillation lasts just a few cycles (see Myrov et al., 2024), and brain states are usually short-lived because of meta-stability (see Roberts et al., 2019).

      Secondly, the authors didn't examine frequencies lower than 1 Hz despite similarities between fMRI and infra-slow oscillations found in prior literature (see Palva et al., 2014; Zhang et al., 2023).

      On a minor note, the phase-locking value (PLV) is positively biased for EEG data (see Palva et al., 2018) and a different metric for phase coupling could be a more appropriate choice (e.g., iPLV/wPLI, see Vinck et al., 2011). The repository with the code is also unavailable.
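      The difference between PLV and its imaginary-part variant can be illustrated with a toy example. This is a minimal sketch using the standard definitions (PLV as the magnitude of the mean phase-difference vector, iPLV as the magnitude of its imaginary part); it is not the code used in the study. The phase series are synthetic, the function names are assumptions, and wPLI is not implemented here because it is defined on the cross-spectrum, which needs amplitudes as well as phases.

```python
import cmath
import math


def mean_phase_vector(phase_x, phase_y):
    """Average of unit vectors exp(i * (phase_x - phase_y)) across samples."""
    vectors = [cmath.exp(1j * (px - py)) for px, py in zip(phase_x, phase_y)]
    return sum(vectors) / len(vectors)


def plv(phase_x, phase_y):
    """Phase-locking value: magnitude of the mean phase-difference vector."""
    return abs(mean_phase_vector(phase_x, phase_y))


def iplv(phase_x, phase_y):
    """Imaginary PLV: ignores coupling at zero (or half-cycle) phase lag."""
    return abs(mean_phase_vector(phase_x, phase_y).imag)


# Synthetic phases (radians), for illustration only.
phase_x = [2 * math.pi * 0.05 * i for i in range(200)]
zero_lag = list(phase_x)                     # identical phases, as from signal mixing
lagged = [p - math.pi / 2 for p in phase_x]  # constant quarter-cycle lag

print("zero lag:", plv(phase_x, zero_lag), iplv(phase_x, zero_lag))
print("lagged:  ", plv(phase_x, lagged), iplv(phase_x, lagged))
```

      In this toy case PLV is maximal for both pairs, whereas iPLV is zero for the zero-lag pair, which is why zero-lag-insensitive metrics are often preferred when signal mixing between channels is a concern.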

    1. Reviewer #2 (Public Review):

      MotorNet aims to provide a unified interface where the trained RNN controller exists within the same TensorFlow environment as the end effectors being controlled. This architecture provides a much simpler interface for the researcher to develop and iterate through computational hypotheses. In addition, the authors have built a set of biomechanically realistic end effectors (e.g., a 2 joint arm model with realistic muscles) within TensorFlow that are fully differentiable.

      MotorNet will prove a highly useful starting point for researchers interested in exploring the challenges of controlling movement with realistic muscle and joint dynamics. The architecture features a conveniently modular design and the inclusion of simpler arm models provides an approachable learning curve. Other state-of-the-art simulation engines offer realistic models of muscles and multi-joint arms and afford more complex object manipulation and contact dynamics than MotorNet. However, MotorNet's approach allows for direct optimization of the controller network via gradient descent rather than reinforcement learning, which is a compromise currently required when using other simulation engines (as these engines' code cannot be differentiated through).
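      What it means to optimize a controller by differentiating through the simulated plant can be shown with a toy example. The sketch below does not use MotorNet's API or any deep learning library: it is a hypothetical one-dimensional point mass with a single proportional gain, and the gradient is obtained by propagating sensitivities through every Euler step by hand. All names and parameter values are assumptions made for illustration.

```python
def rollout(k, target=1.0, b=2.0, dt=0.1, steps=50):
    """Simulate a damped 1-D point mass under proportional control.

    Returns the mean squared position error and its exact derivative with
    respect to the gain k, computed by carrying d(state)/dk through each
    simulation step (forward-mode differentiation of the rollout).
    """
    x, v = 0.0, 0.0    # position and velocity
    dx, dv = 0.0, 0.0  # sensitivities dx/dk and dv/dk
    loss, dloss = 0.0, 0.0
    for _ in range(steps):
        err = target - x
        u = k * err        # controller output (force)
        du = err - k * dx  # du/dk by the product rule
        # Explicit Euler update of the state and of its sensitivities.
        x, v, dx, dv = (x + dt * v,
                        v + dt * (u - b * v),
                        dx + dt * dv,
                        dv + dt * (du - b * dv))
        loss += (x - target) ** 2 / steps
        dloss += 2.0 * (x - target) * dx / steps
    return loss, dloss


def train(k=0.5, learning_rate=0.2, iterations=50):
    """Plain gradient descent on the gain, using the rollout gradient."""
    for _ in range(iterations):
        _, grad = rollout(k)
        k -= learning_rate * grad
    return k


loss_before, _ = rollout(0.5)
k_trained = train()
loss_after, _ = rollout(k_trained)
print(f"gain 0.5 -> {k_trained:.2f}, loss {loss_before:.3f} -> {loss_after:.3f}")
```

      With a plant that cannot be differentiated, the same gain would have to be tuned from scalar returns alone, for example by reinforcement learning or finite-difference probing; automatic differentiation frameworks perform the sensitivity bookkeeping shown here for networks with many parameters.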

      The paper has been reorganized to provide clearer signposts to guide the reader. Importantly, the software has been rewritten atop PyTorch which is increasingly popular in ML and computational neuroscience research.

    2. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public Review):

      Summary:

      Codol et al. present a toolbox that allows simulating biomechanically realistic effectors and training Artificial Neural Networks (ANNs) to control them. The paper provides a detailed explanation of how the toolbox is structured and several examples that demonstrate its usefulness.

      Main comments:

      (1) The paper is well written and easy to follow. The schematics help in understanding how the toolbox works and the examples provide an idea of the results that the user can obtain.

      We thank the reviewer for this comment.

      (2) As I understand it, the main purpose of the paper should be to facilitate the usage of the toolbox. For this reason, I have missed a more explicit link to the actual code. As I see it, researchers will read this paper to figure out whether they can use MotorNet to simulate their experiments, and how they should proceed if they decide to use it. I'd say the paper provides an answer to the first question and assures that the toolbox is very easy to install and use. Maybe the authors could support this claim by adding "snippets" of code that show the key steps in building an actual example.

      This is an important point, which we also considered when writing this paper. We instead decided to focus on the first question, because it is easier to illustrate the scientific use of the toolbox using code or interactive (Jupyter) notebooks than in a publication format. We find the “how to proceed” aspect of the toolbox can more easily and comprehensively be covered using online, interactive tutorials. Additionally, this allows us to update these tutorials as the toolbox evolves over different versions, while it is more difficult to update a scientific article. Consequently, we explicitly avoided code snippets in the article itself. However, we appreciate that the paper would gain in clarity if this was more explicitly stated early. We have modified the paper to include a pointer to where to find tutorials online. We added this to the last paragraph of the introduction section:

      The interested reader may consult the full API documentation, including interactive tutorials on the toolbox website at https://motornet.org.

      (3) The results provided in Figures 1, 4, 5 and 6 are useful, because they provide examples of the type of things one can do with the toolbox. I have a few comments that might help improving them:

      a. The examples in Figures 1 and 5 seem a bit redundant (same effector, similar task). Maybe the authors could show an example with a different effector or task? (see point 4).

      The effectors from figures 1 and 5 are indeed very similar. However, the tasks in figures 1 and 5 present some important differences. The training procedure in figure 1 never includes any perturbations, while the one from figure 5 includes a wide range of perturbations of different magnitudes, timing and directions. The evaluation procedure of figure 1 includes center-out reaches with permanent viscous (proportional to velocity) external dynamics, while that of figure 5 includes fixed, transient, square-shaped perturbations orthogonal to the reach direction. Finally, the networks in figure 1 undergo a second training procedure after evaluation while the networks of figure 5 do not.
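      The two kinds of external load contrasted above can be written as simple force functions. This is a hedged sketch and not the toolbox's implementation: the function names, the scalar viscous gain, and the choice of a +90 degree rotation for the orthogonal direction are all assumptions made for illustration.

```python
import math


def viscous_force(velocity, gain):
    """Permanent load proportional to the current 2-D velocity."""
    vx, vy = velocity
    return (gain * vx, gain * vy)


def square_pulse_force(t, reach_direction, magnitude, onset, duration):
    """Transient load of fixed size, orthogonal to the reach direction.

    The force is nonzero only for onset <= t < onset + duration.
    """
    dx, dy = reach_direction
    norm = math.hypot(dx, dy)
    # Unit vector obtained by rotating the reach direction by +90 degrees.
    ox, oy = -dy / norm, dx / norm
    scale = magnitude if onset <= t < onset + duration else 0.0
    return (scale * ox, scale * oy)


# Hypothetical usage: a rightward reach, pulse active from 0.1 s to 0.2 s.
print(viscous_force((0.3, 0.0), gain=-5.0))
print(square_pulse_force(0.15, (1.0, 0.0), magnitude=2.0, onset=0.1, duration=0.1))
```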

      While we agree that some variation of effectors would be beneficial, we do show examples of a point-mass effector in figure 6. Overall, figure 5 shows a task that is quite different from that of figure 1 with a similar effector, while the opposite is true for figure 6. We have modified the text to clarify this for the reader, by adding the following.

      End of 1st paragraph, section 2.4.

      Therefore, the training protocol used for this task largely differed from section 2.1 in that the networks are exposed to a wide range of mechanical perturbations with varying characteristics.

      1st paragraph of section 2.5

      […] this asymmetrical representation of PMDs during reaching movements did not occur when RNNs were trained to control an effector that lacked the geometrical properties of an arm such as illustrated in Figure 4c-e and section 2.1.

      b. I missed a discussion on the relevance of the results shown in Figure 4. The moment arms are barely mentioned outside section 2.3. Are these results new? How can they help with motor control research?

      We thank the reviewer for this comment. This relates to a point from reviewer 2 indicating that the purpose of each section was sometimes difficult to grasp as one reads. Section 2.3 explains the biomechanical properties that the toolbox implements to improve realism of the effector. They are not new results in the sense that other toolboxes implement these features (though not in differentiable formats) and these properties of biological muscles are empirically well-established. However, they are important to understand what the toolbox provides, and consequently what constraints networks must accommodate to learn efficient control policies. An example of this is the results in figure 6, where a simple effector versus a more biomechanically complex effector will yield different neural representations.

      Regarding the manuscript itself, we agree that more clarity on the goal of every paragraph may improve the reader’s experience. Consequently, we made sure to specify such goals at the start of each section. In particular, we clarify the purpose of section 2.3 by adding several sentences on this at the end of the first paragraph in that section. We also now clearly state the purpose of section 2.3 with the results of figure 6 and reference figure 4 in that section.

      c. The results in Figure 6 are important, since one key asset of ANNs is that they provide access to the activity of the whole population of units that produces a given behavior. For this reason, I think it would be interesting to show the actual "empirical observations" that the results shown in Fig. 6 are replicating, hence allowing a direct comparison between the results obtained for biological and simulated neurons.

      These empirical observations are available from previous electrophysiological and modelling work. In particular, polar histograms across reaching directions like panel C are displayed in figures 2 and 3 of Scott, Gribble, Graham, Cabel (2001, Nature). Colormaps of modelled unit activity across time and reaching directions like panel F are also displayed in figure 2 of Lillicrap, Scott (2013, Neuron). Electrophysiological recordings of M1 neurons during a similar task in non-human primates can also be seen in “Preserved neural population dynamics across animals performing similar behaviour”, figure 2B (https://doi.org/10.1101/2022.09.26.509498), and in “Nonlinear manifolds underlie neural population activity during behaviour”, figure 2B as well (https://doi.org/10.1101/2023.07.18.549575). Note that these two pre-prints use the same dataset.

      We have added these citations to the text and made it explicit that they contain visualizations of similar modelling and empirical data for comparison:

      This heterogeneous set of responses matches empirical observations in non-human primate primary motor cortex recordings (Churchland & Shenoy, 2007; Michaels et al., 2016) and replicates similar visualizations from previously published work (Fortunato et al., 2023; Lillicrap & Scott, 2013; Safaie et al., 2023).

      (4) All examples in the paper use the arm26 plant as effector. Although the authors say that "users can easily declare their own custom-made effector and task objects if desired by subclassing the base Plant and Task class, respectively", this does not sound straightforward. Table 1 does not really clarify how to do it. Maybe an example that shows the actual code (see point 2) that creates a new plant (e.g. the 3-joint arm in Figure 7) would be useful.

      Subclassing is a general Python mechanism rather than something specific to MotorNet, as Python is an object-oriented language. Many Python tutorials therefore cover subclassing in the general sense and would be beneficial for that purpose. We have amended the main text to make this clearer to the reader.

      Subclassing a MotorNet object, in a more specific sense, requires overwriting some methods from the base MotorNet classes (e.g., Effector or Environment classes, which correspond to the original Plant and Task objects, respectively). Since we made the decision (mentioned above) to not include code in the main text, we added tutorials to the online documentation, which include dedicated tutorials on subclassing MotorNet classes. For instance, this tutorial showcases how to subclass Environment classes:

      https://colab.research.google.com/github/OlivierCodol/MotorNet/blob/master/examples/3-environments.ipynb
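      For readers unfamiliar with the general mechanism, the following is a minimal sketch of subclassing in plain Python. `BaseEffector` and `PointMass` are hypothetical stand-ins written for this illustration; they are not the MotorNet classes, and their method names are assumptions. The linked tutorial remains the reference for the real API.

```python
class BaseEffector:
    """Hypothetical base class (not the MotorNet API)."""

    def __init__(self, dt=0.01):
        self.dt = dt
        self.position = 0.0
        self.velocity = 0.0

    def acceleration(self, force):
        # Subclasses are expected to overwrite this method with their own dynamics.
        raise NotImplementedError

    def step(self, force):
        # Shared Euler integration, inherited unchanged by every subclass.
        self.velocity += self.acceleration(force) * self.dt
        self.position += self.velocity * self.dt
        return self.position


class PointMass(BaseEffector):
    """Custom effector: only the dynamics are overwritten."""

    def __init__(self, mass=1.0, dt=0.01):
        super().__init__(dt=dt)  # reuse the base-class initialisation
        self.mass = mass

    def acceleration(self, force):
        return force / self.mass


effector = PointMass(mass=2.0)
for _ in range(100):
    effector.step(force=1.0)
```

      The subclass supplies only what differs (here, the acceleration rule) and inherits the integration loop, which is the pattern the response describes as overwriting some methods of a base class.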

      (5) One potential limitation of the toolbox is that it is based on Tensorflow, when the field of Computational Neuroscience seems to be, or at least that's my impression, transitioning to pyTorch. How easy would it be to translate MotorNet to pyTorch? Maybe the authors could comment on this in the discussion.

      We have received a significant amount of feedback asking for a PyTorch implementation of the toolbox. Consequently, we decided to enact this, and the next version of the toolbox will be exclusively in PyTorch. We will maintain the Application Programming Interface (API) and tutorial documentation for the TensorFlow version of the toolbox on the online website. However, going forward we will focus exclusively on bug-fixing and expanding from the latest version of MotorNet, which will be in PyTorch. We now believe that the greater popularity of PyTorch in the academic community makes that choice more sustainable while helping a greater proportion of research projects.

      These changes led to a significant alteration of the MotorNet structure, which is reflected by changes made throughout the manuscript, notably in Figure 3 and Table 1.

      (6) Supervised learning (SL) is widely used in Systems Neuroscience, especially because it is faster than reinforcement learning (RL). Thus, providing the possibility of training the ANNs with SL is an important asset of the toolbox. However, SL is not always ideal, especially when the optimal strategy is not known or when there are different alternative strategies and we want to know which is the one preferred by the subject. For instance, would it be possible to implement a setup in which the ANN has to choose between 2 different paths to reach a target? (e.g. Kaufman et al. 2015 eLife). In such a scenario, RL seems to be a more natural option. Would it be easy to extend MotorNet so it allows training with RL? Maybe the authors could comment on this in the discussion.

      The new implementation of MotorNet that relies on PyTorch is already standardized to use an API that is compatible with Gymnasium. Gymnasium is a standard and popular interfacing toolbox used to link RL agents to environments. It is very well-documented and widely used, which will ensure that users who wish to employ RL to control MotorNet environments will be able to do so relatively effortlessly. We have added this point to accurately reflect the updated implementation, so users are aware that it is now a feature of the toolbox (new section 3.2.4.).
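      To make the interface concrete, here is a minimal sketch of an environment that follows the Gymnasium calling convention, in which `reset` returns `(observation, info)` and `step` returns `(observation, reward, terminated, truncated, info)`. The `ReachEnv` class, its 1-D dynamics, and the reward are hypothetical and written in plain Python for illustration; this is not a MotorNet environment and does not import Gymnasium.

```python
class ReachEnv:
    """Hypothetical 1-D reaching task exposing Gymnasium-style reset/step methods."""

    def __init__(self, target=1.0, dt=0.1, max_steps=50):
        self.target = target
        self.dt = dt
        self.max_steps = max_steps
        self.position = 0.0
        self.steps = 0

    def reset(self, seed=None, options=None):
        # This toy task is deterministic, so the seed is accepted but unused.
        self.position = 0.0
        self.steps = 0
        return self.position, {}  # (observation, info)

    def step(self, action):
        self.position += action * self.dt
        self.steps += 1
        distance = abs(self.target - self.position)
        reward = -distance
        terminated = distance < 0.05              # task solved
        truncated = self.steps >= self.max_steps  # time limit reached
        return self.position, reward, terminated, truncated, {}


env = ReachEnv()
obs, info = env.reset(seed=0)
done = False
while not done:
    # Placeholder proportional policy; an RL agent would choose the action here.
    action = 2.0 * (env.target - obs)
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated
```

      Because an RL library only needs these two methods and their return signatures, any agent written against that convention could in principle replace the placeholder policy in the loop.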

      Impact:

      MotorNet aims at simplifying the process of simulating complex experimental setups to rapidly test hypotheses about how the brain produces a specific movement. By providing an end-to-end pipeline to train ANNs on the simulated setup, it can greatly help guide experimenters to decide where to focus their experimental efforts.

      Additional context:

      Since the main result is a toolbox, the paper is complemented by a GitHub repository and a documentation webpage. Both the repository and the webpage are well organized and easy to navigate. The webpage walks the user through the installation of the toolbox and the building of the effectors and the ANNs.

      Reviewer #2 (Public Review):

      MotorNet aims to provide a unified interface where the trained RNN controller exists within the same TensorFlow environment as the end effectors being controlled. This architecture provides a much simpler interface for the researcher to develop and iterate through computational hypotheses. In addition, the authors have built a set of biomechanically realistic end effectors (e.g., a 2-joint arm model with realistic muscles) within TensorFlow that are fully differentiable.

      MotorNet will prove a highly useful starting point for researchers interested in exploring the challenges of controlling movement with realistic muscle and joint dynamics. The architecture features a conveniently modular design and the inclusion of simpler arm models provides an approachable learning curve. Other state-of-the-art simulation engines offer realistic models of muscles and multi-joint arms and afford more complex object manipulation and contact dynamics than MotorNet. However, MotorNet's approach allows for direct optimization of the controller network via gradient descent rather than reinforcement learning, which is a compromise currently required when using other simulation engines (as these engines' code cannot be differentiated through).
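      The idea of optimizing a controller by differentiating through the plant can be sketched on a toy problem. The example below is an assumption-laden illustration, not MotorNet code: the plant is a 1-D integrator, the controller is a single proportional gain, and the derivative is propagated by hand through every simulation step instead of by an automatic differentiation library.

```python
def rollout(gain, target=1.0, dt=0.1, n_steps=10):
    """Simulate a 1-D plant driven by a proportional controller.

    Returns the final squared error and its exact derivative with respect
    to `gain`, obtained by carrying the sensitivity dx/dgain through
    every plant update.
    """
    x, dx_dgain = 0.0, 0.0
    for _ in range(n_steps):
        error = target - x
        # Plant update is x <- x + dt * gain * error.
        # Differentiate it (product rule) before overwriting x.
        dx_dgain = dx_dgain + dt * (error - gain * dx_dgain)
        x = x + dt * gain * error
    final_error = target - x
    loss = final_error ** 2
    dloss_dgain = -2.0 * final_error * dx_dgain
    return loss, dloss_dgain


gain = 1.0
initial_loss, _ = rollout(gain)
for _ in range(50):
    loss, grad = rollout(gain)
    gain -= 5.0 * grad  # plain gradient descent; learning rate chosen by hand
final_loss, _ = rollout(gain)
```

      The gradient is available only because every step of the plant is a differentiable expression. With a black-box simulator the same controller would have to be tuned from sampled outcomes, which is the situation the reviewer describes for other engines.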

      The paper could be reorganized to provide clearer signposts as to what role each section plays (e.g., that the explanation of the moment arms of different joint models serves to illustrate the complexity of realistic biomechanics, rather than a novel discovery/exposition of this manuscript). Also, if possible, it would be valuable if the authors could provide more insight into whether gradient descent finds qualitatively different solutions to RL or other non-gradient-based methods. This would strengthen the argument that a fully differentiable plant is useful beyond improving training time / computational power required (although this is a sufficiently important rationale per se).

      We thank the reviewer for these comments. We agree that more clarity on the section goals may improve the reader’s experience and have ensured this is the case throughout the manuscript. In particular, we added the following to the first paragraph of section 2.3, for which an explicit goal was most missing:

      In this section we illustrate some of these biomechanical properties displayed by MotorNet effectors using specific examples. These properties are well-characterised in the biology and are often implemented in realistic biomechanical simulation software.

      Regarding the potential difference in solutions obtained from reinforcement or supervised learning, establishing this conclusively would represent a non-trivial amount of work and so may not be within the scope of the current article. We do appreciate, however, that in some situations RL may be a more fitting approach to a given task design. In relation to this point, we now specify in the discussion that the new API can accommodate interfacing with reinforcement learning toolboxes for those who may want to pursue this type of policy training approach when appropriate (new section 3.2.4.).

      Reviewer #3 (Public Review):

      Artificial neural networks have developed into a new research tool across various disciplines of neuroscience. However, specifically for studying neural control of movement it was extremely difficult to train those models, as they require not only simulating the neural network, but also the body parts one is interested in studying. The authors provide a solution to this problem which is built upon one of the main software packages used for deep learning (Tensorflow). This allows them to make use of state-of-the-art tools for training neural networks.

      They show that their toolbox is able to (re-)produce several commonly studied experiments e.g., planar reaching with and without loads. The toolbox is described in sufficient detail to get an overview of the functionality and the current state of what can be done with it. Although the authors state that only a few lines of code can reproduce such an experiment, they unfortunately don't provide any source code to reproduce their results (nor is it given in the respective repository).

      The possibility of adding code snippets to the article is something we originally considered, and which aligns with comment two from reviewer one (see above). Hopefully this provides a good overview of the motivation behind our choice not to add code to the article.

      The modularity of the presented toolbox makes it easy to exchange or modify single parts of an experiment e.g., the task or the neural network used as a controller. Together with the open-source nature of the toolbox, this will facilitate sharing and reproducibility across research labs.

      I can see how this paper can enable a whole set of new studies on neural control of movement and accelerate the turnover time for new ideas or hypotheses, as stated in the first paragraph of the Discussion section. Having such a low effort to run computational experiments will definitely be beneficial for the field of neural control of movement.

      We thank the reviewer for these comments.

    1. styled the image with a border and a little padding, and the bar with a background color, which displays like this

      What HTML code was used to obtain this?

    1. Summary of "Flecs v4.0 is out!" by Sander Mertens

      What is Flecs?
      - Flecs is an Entity Component System (ECS) for C and C++ designed for building games, simulations, and other applications.
      - “Store data for millions of entities in data structures optimized for CPU cache efficiency and composition-first design.”
      - “Find entities for game systems with a high performance query engine that can run in time critical game loops.”
      - “Run code using a multithreaded scheduler that seamlessly combines game systems from reusable modules.”
      - “Builtin support for hierarchies, prefabs and more with entity relationships which speed up game code and reduce boiler plate.”
      - “An ecosystem of tools and addons to profile, visualize, document and debug projects.”
      - Open-source under the MIT license.
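      For readers new to the pattern, the sketch below shows the bare entity/component/system idea in Python. It is a toy written for illustration only: the `World`, `entity`, and `query` names are hypothetical, it is not the Flecs API (Flecs is a C and C++ library), and its dictionary storage has none of the cache-efficient layout the summary describes.

```python
from dataclasses import dataclass


@dataclass
class Position:
    x: float
    y: float


@dataclass
class Velocity:
    dx: float
    dy: float


class World:
    """Toy ECS world: entities are integer ids, components live in per-type tables."""

    def __init__(self):
        self._next_id = 0
        self._tables = {}  # component type -> {entity id: component instance}

    def entity(self, *components):
        entity_id = self._next_id
        self._next_id += 1
        for component in components:
            self._tables.setdefault(type(component), {})[entity_id] = component
        return entity_id

    def query(self, *types):
        """Yield (entity id, components...) for entities that have every requested type."""
        tables = [self._tables.get(t, {}) for t in types]
        matching = set.intersection(*(set(table) for table in tables))
        for entity_id in sorted(matching):
            yield (entity_id, *(table[entity_id] for table in tables))


def move_system(world, dt):
    # A "system" is just a function that runs over the result of a query.
    for _, pos, vel in world.query(Position, Velocity):
        pos.x += vel.dx * dt
        pos.y += vel.dy * dt


world = World()
moving = world.entity(Position(0.0, 0.0), Velocity(1.0, 2.0))
static = world.entity(Position(5.0, 5.0))
move_system(world, dt=0.5)
```

      Only the entity that has both components is matched by the query, so the entity without a `Velocity` is left untouched. Behaviour is composed from data in this way instead of being attached to class hierarchies.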

      Release Highlights for Flecs v4.0:
      - Over 1700 new commits, totaling upwards of 4700 commits.
      - “More than 1700 new commits got added since v3 with the repository now having upwards of 4700 commits in total.”
      - Closed and merged 400+ issues and PRs from community members.
      - “More than 400 issues and PRs submitted by dozens of community members got closed and merged.”
      - Discord community grew to over 2300 members, GitHub stars doubled from 2900 to 5800.
      - Test cases increased from 4400 to 8500, with test code growing from 130K to 240K lines.

      Adoption of Flecs:
      - Used by both small and large projects, including the highly anticipated game Hytale.
      - “Flecs provides the backbone of the Hytale Game Engine. Its flexibility has allowed us to build highly varied gameplay while supporting our vision for empowering Creators.”
      - Tempest Rising uses Flecs to manage high counts of units and spatial queries.
      - “We are using it [Flecs] mostly to leverage high count of units. Movement (forces / avoidance), collisions, systems that rely on spatial queries, some gameplay related stuff.”
      - Smaller games like Tome Tumble Tournament use Flecs for movement rules.

      Language Support and Community Contributions:
      - Flecs Rust binding released, actively developed and in alpha.
      - “An enormous amount of effort went into porting over all of the APIs, including relationships, and writing the documentation, examples, and tests.”
      - Flecs.NET (C#) has become the de facto C# binding.
      - “The binding closely mirrors the C++ API, and comes bundled with documentation, examples and tests.”

      New Features in v4.0:
      - Unified query API simplifies usage and enhances functionality.
      - “The filter, query, and rule implementations now have been unified into a single query API.”
      - Explorer v4 offers a revamped interface and new tools.
      - “The v4 explorer has a few new tricks up its sleeve, such as a utility to capture commands, editing multiple Flecs scripts at the same time, the ability to add & remove components, and new tools to inspect queries, systems and observers.”
      - Flecs Script for easy entity and component creation, with improved syntax and faster template engine.
      - “Flecs Script got completely overhauled, with an improved syntax, more powerful APIs and a much faster template engine.”
      - Sparse components for stable component pointers and performance gains.
      - “Sparse components don’t move when entities are moved between archetypes. Besides being good for performance, this also means that Flecs now supports components that aren’t movable!”
      - Overhauled demos showcasing new features and enhanced graphics.
      - “The Tower Defense demo has been overhauled for v4 to better showcase Flecs features, while also quadrupling the scale of the scene!”
      - Improved inheritance model, now opt-in for better performance.
      - “When a prefab is instantiated in v4, components are by default copied to the instance.”
      - Member queries reduce overhead and simplify relationships.
      - “In v4 queries can directly query entity members as if they were relationship targets, which is like having relationships without the fragmentation!”
      - Flecs Remote API for connecting to Flecs applications remotely.
      - “The new Flecs Remote API includes a simpler JSON format, a new REST API with a cleaner design, and a new JavaScript library for the development of web clients that use Flecs data.”

      Documentation and Future Directions:
      - Improved and expanded documentation covering new features in-depth.
      - “Several weeks of the v4 release cycle were spent on improving the documentation and making sure it’s up to date.”
      - Future updates to include reactivity frameworks, dense tree storage, dense/sparse tables, pluggable storages, and a node API.
      - “Reactivity... dense tree storage... dense/sparse tables... pluggable storages... node API...”

      Community Acknowledgment:
      - Special thanks to community members and sponsors who contributed to the development and support of Flecs v4.
      - “A special thanks to everyone that contributed PRs and helped with the development of Flecs v4 features.”

      This summary encapsulates the key updates, features, and community efforts surrounding the release of Flecs v4.0, highlighting its impact and future potential.

    1. Advanced statistical R algorithms are invoked through a dedicated R installation

      From the figure below, it looks like these statistical tests would be readily available in Python, or easy to write functions for? Code that depends on two different languages is difficult to maintain, if only because of installation and maintenance issues.

    1. the final result of the Robbie Lens site.

      I don't get the same result for the width of the form fields for the name and email. I have rechecked my code, but nothing helps.

    1. (https://github.com/BIRDSgroup/Disease-Disease-Interaction)

      Thank you for providing your code! I was curious if you would be willing to add a conda environment or other installation instructions documenting your dependencies. It looked to me like the dependencies are called inline in your analysis script. In your pipeline script, jtools is also commented out (and there was a lot of commented-out code in general), so I wasn't sure whether these things were required or not.

    1. for poster it .... you know "most wanted"

      Ceph's Leadership Team

      The leaders, innovators and talented intellects behind Ceph.

      The Ceph Leadership Team is comprised of the key technical players who manage the community and oversee the advancement of Ceph.

      Find out more about each key player and their area of expertise, or see members who have received the tentacle award!

      Component Leads

      | Adam King | Orchestration / cephadm |
      | Casey Bodley | RGW |
      | Venky Shankar | CephFS |
      | Ilya Dryomov | RBD |
      | Matan Breizman | Crimson |
      | Yingxin Cheng | Seastore |
      | Neha Ojha | RADOS |
      | Nizamudeen A | Dashboard |
      | Sage Weil | Founder |

      Other members

      | Matt Benjamin | RGW |
      | Zac Dover | Documentation |
      | Ken Dreyer | Packaging |
      | Josh Durgin | RADOS |
      | Gregory Farnum | CephFS / RADOS |
      | Igor Fedotov | BlueStore |
      | Dan Mick | build/test/lab |
      | Xiubo Li | Linux Kernel Integration |
      | Mark Nelson | Performance / CBT |
      | Myoungwon Oh | SeaStore |
      | David Orman | User |
      | Casey Cain | Community |
      | Yehuda Sadeh | RGW |
      | Dan van der Ster | User |
      | Haomai Wang | Async Messenger |
      | Yuri Weinstein | Release Management / Testing |
      | Xie Xingguo | RADOS |
      | Vikhyat Umrao | RADOS / Performance at Scale / User Workloads |

      Core Team

      Patrick Donnelly

      IRC: pdonnell

      Patrick Donnelly is a software engineer at Red Hat, Inc. currently working on the Ceph distributed file system. In 2016 he completed his Ph.D. in computer science at the University of Notre Dame with a dissertation on the topic of file transfer management in active storage cluster file systems.

      Ilya Dryomov

      IRC: dis

      Ilya has been working on Ceph since 2013, originally focusing on the Linux kernel RBD driver. Currently he serves as the technical lead for RBD and the maintainer of the Linux kernel client and also contributes to RADOS primarily in messenger and cephx areas. Previously he was involved with Btrfs and HAMMER file systems.

      Neha Ojha

      IRC: neha

      Neha is a Principal Software Engineer at Red Hat. She is the project technical lead for the core team focusing on RADOS. Neha holds a Master's degree in Computer Science from the University of California, Santa Cruz.

      Ernesto Puerta

      IRC: epuertat

      Ernesto is the Ceph Dashboard component lead. He previously worked at Telefonica R&D, Alcatel-Lucent, Bell Labs, and Nokia, where he first came to know about Ceph, for a Cloud Video Storage project back in 2015. After that stimulating experience, he joined Red Hat in 2018 and has since contributed to the Ceph Dashboard project, trying to apply his expertise as a Ceph user. Ernesto holds a master's degree in Telecommunications Engineering from Universidad Politécnica de Madrid and currently lives in that very city, famous for its fried calamari sandwich.

      Yehuda Sadeh

      IRC: yehudasa

      Yehuda has been involved in Ceph since 2008, and has been working on various related projects and subsystems. He is the original developer of the RADOS Gateway (RGW), which he currently co-leads as part of his work at Red Hat. He also worked on multiple other Ceph projects, such as the Linux kernel Ceph filesystem module, and RBD. Other notable Ceph modules that he initiated along with Sage Weil are the Linux kernel RBD module, the RADOS object classes, and the cephx authentication. Before joining Ceph, Yehuda worked in various startup companies, where he developed various storage and networking solutions. He holds a Bachelor's degree in Communication Systems Engineering from Ben Gurion University, and a Master's degree in Computer Engineering from Tel Aviv University.

      Sage Weil

      IRC: sage

      Sage Weil is the founder of Ceph. He also was the creator of WebRing, a co-founder of Los Angeles-based hosting company DreamHost, and the founder and CTO of Inktank. Weil earned a Bachelor of Science in computer science from Harvey Mudd College in 2000 and completed his PhD in 2007 at the University of California, Santa Cruz working with Prof. Scott Brandt on consistency protocols, data distribution (CRUSH), and the metadata manager in the Ceph distributed file system. [wikipedia]

      Maintainers

      Matt Benjamin

      IRC: mattbenjamin

      Matt Benjamin has been working professionally with Linux and open source software since 1994. He is a contributor to a variety of open source software packages and tools. He co-founded The Linux Box corporation in Ann Arbor, MI, has held a developer position with Comshare, Inc, and has also been a consultant with Integrated Micro Systems. Matt holds a master's degree from the University of Michigan, and a bachelor's degree (Summa Cum Laude and Phi Beta Kappa) from the University of Missouri.

      Zac Dover

      IRC: zdover

      Zac Dover is the Ceph upstream technical writer. He has been involved in open source software since the 1990s, and worked at Red Hat for seven years. Zac runs the monthly DocuBetter meeting and is (as of 2021) engaged in an ineffably tedious line-by-line edit of the Ceph documentation. Zac encourages you to write to him with your complaints about the Ceph documentation.

      Mark Nelson

      IRC: nhm

      Mark joined the Ceph team in January 2012 and has 12 years of experience in distributed systems, HPC, and bioinformatics. Mark works on Ceph performance analysis and is the primary author of the Ceph Benchmarking Toolkit. He runs the weekly Ceph performance meeting and is currently focused on research and development of Ceph's next-generation object store.

      Mike Perez

      IRC: thingee

      Mike is the Ceph Community Manager at Red Hat. Being a contributing member of OpenStack since 2010, he has served as a core developer for the OpenStack block storage project Cinder and as a PTL for the Kilo and Liberty releases. During some of this time, he worked for DreamHost in helping with their OpenStack public cloud, one of the first large production deployments of Ceph, and helping with integrating a variety of block storage solutions like Ceph in Cinder. He later joined the OpenStack Foundation to help with the success of cross-project initiatives and the overall quality and health of the project and community.

      When Mike is not trying to make a computer work, he can be found: dancing, getting his hair done, playing with modular synthesizers, and doing karaoke for your entertainment.

    1. gaping security issue

    1. Summary of Joe Armstrong's Interview on Erlang

      Introduction and Current Involvement:
      - Joe Armstrong is the principal inventor of Erlang and coined the term "Concurrency Oriented Programming".
      - "Today I go round and give talks about Erlang, promoting Erlang - that's one side of what I do."

      Companies Using Erlang: - Erlang is used by Kreditor (financials), TLF (network management systems), and Synapse (mobile phone provisioning) in Sweden. - "Each one of them employs about 30 people and they are probably market leading in each of their areas, very niched areas."

      Popularity and Strength of Erlang in Concurrency: - Ralph Johnson noted Erlang’s superiority in handling concurrency, allowing millions of processes compared to 10,000-20,000 in other languages. - "In Erlang the notion of a process is part of a programming language, is not part of the operating system."

      Theoretical Basis: - Erlang is based on the Actors model of computation and is a pure message-passing language. - "The theoretical basis would be Actors model of computation, Carl Hewitt."
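      The message-passing style described in the points above can be illustrated outside Erlang. What follows is a minimal Python sketch, not Erlang code and not a description of how the Erlang runtime implements processes: each "actor" here is an ordinary thread with a private mailbox, which is far heavier than an Erlang process. The names `Actor`, `counter_behaviour`, and `demo` are hypothetical and were chosen for this example only.

```python
import queue
import threading


class Actor:
    """A tiny actor: private state, a mailbox, and a thread that drains it."""

    def __init__(self, behaviour, state):
        self._mailbox = queue.Queue()
        self._behaviour = behaviour
        self._state = state
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, message):
        """Asynchronous send: enqueue the message and return immediately."""
        self._mailbox.put(message)

    def _run(self):
        while True:
            message = self._mailbox.get()
            if message == "stop":
                break
            # The state is only ever touched by the actor's own thread.
            self._state = self._behaviour(self._state, message)

    def join(self):
        self._thread.join()


def counter_behaviour(count, message):
    """Handle one message and return the new state."""
    kind, payload = message
    if kind == "add":
        return count + payload
    if kind == "get":
        # The payload is a reply mailbox; answer by sending a message back.
        payload.put(count)
    return count


def demo():
    counter = Actor(counter_behaviour, 0)
    for n in (1, 2, 3):
        counter.send(("add", n))
    reply = queue.Queue()
    counter.send(("get", reply))
    total = reply.get(timeout=5)
    counter.send("stop")
    counter.join()
    return total
```

      The caller never reads the counter's state directly; it can only send messages and wait for a reply, which is the property the summary attributes to a pure message-passing language.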

      Development and Changes in Erlang: - Future changes will be minimal to avoid breaking legacy code, focusing mainly on libraries rather than syntax. - "I think we'll see very few changes in the language itself. We'll see changes to the libraries and things like that."

      Comparison with Object-Oriented Programming (OOP): - Armstrong criticizes OOP for its complexity and inefficiency, favoring Erlang’s messaging model for true object-oriented behavior. - "Erlang is actually more object oriented, truer to the spirit of pure object orientation than all object-oriented languages."

      Garbage Collection and Multicore Processing: - Erlang’s soft real-time behavior minimizes issues with garbage collection, even in multicore environments. - "It's extremely unusual that Erlang programs are bothered by garbage collection issues."

      Interfacing with Other Languages: - Erlang deliberately avoids linking with other languages’ memory spaces to ensure fault tolerance. - "Erlang is built for fault tolerant systems and therefore it does not allow you to link anything into the same memory space."

      Philosophy of Connecting Components: - Armstrong advocates for simple, message-based connections similar to Unix pipes over complex API integrations. - "There is an easy and a difficult way to connect components together, and the easy way - the prime example is the Unix pipe mechanism."

      High-Performance Erlang (HiPE): - HiPE compiles Erlang to native code, enhancing performance. - "Yes, this is high performance Erlang, done at the university of Uppsala."

      Advantages of a Register Machine: - Erlang’s VM is a register machine, which is more efficient than a stack machine. - "It's better to have a register machine than a stack machine."

      This summary encapsulates the essence of Joe Armstrong's insights on Erlang, its development, advantages, and practical applications, while highlighting key quotes and ideas from the original interview.

    1. Author response:

      Please find below our provisional author response, outlining the revisions we plan to undertake to address the Recommendations received:

      Reviewer #1 (Recommendations For The Authors):

      (1) A set of recent advances have shown that embeddings of unsupervised/self-supervised speech models align with auditory responses to speech in the temporal cortex (e.g. Wav2Vec2: Millet et al NeurIPS 2022; HuBERT: Li et al. Nat Neurosci 2023; Whisper: Goldstein et al. bioRxiv 2023). These models are known to preserve a variety of speech information (phonetics, linguistic information, emotions, speaker identity, etc) and perform well in a variety of downstream tasks. These other models should be evaluated or at least discussed in the study.

      We plan to evaluate two of these other models, Wav2Vec2 and HuBERT, in the brain encoding and RSA parts.

      (2) The test statistics of the results in Fig 1c-e need to be revised. Given that logistic regression is a convex optimization problem typically converging to a global optimum, these multiple initializations of the classifier were likely not entirely independent. Consequently, the reported degrees of freedom and the effect size estimates might not accurately reflect the true variability and independence of the classifier outcomes. A more careful evaluation of these aspects is necessary to ensure the statistical robustness of the results.

      We plan to address this point to ensure the statistical robustness of our results.
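      The convexity argument in point (2) can be checked numerically. The sketch below is an illustration under stated assumptions, not the classifier used in the study: it fits an L2-regularised logistic regression by plain gradient descent on synthetic two-class data (the penalty, learning rate, data, and all function names are assumptions made for this example) and starts from several random initial weight vectors. Because the regularised objective is strictly convex, every run should reach essentially the same weights, so repeated initialisations add almost no independent variability.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(xs, ys, w_init, l2=0.1, lr=0.5, steps=2000):
    """Gradient descent on mean log-loss plus (l2 / 2) * ||w||^2."""
    w = list(w_init)
    n = len(xs)
    for _ in range(steps):
        grad = [l2 * wj for wj in w]
        for x, y in zip(xs, ys):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, x))) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def demo(seed=0, n_inits=3):
    rng = random.Random(seed)
    xs, ys = [], []
    for i in range(40):
        label = i % 2
        centre = 1.0 if label else -1.0
        # Two noisy features plus a constant bias term.
        xs.append([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0), 1.0])
        ys.append(label)
    fits = []
    for _ in range(n_inits):
        w0 = [rng.uniform(-3.0, 3.0) for _ in range(3)]
        fits.append(fit_logistic(xs, ys, w0))
    # Largest coordinate-wise difference between any fit and the first one.
    spread = max(abs(a - b) for w in fits for a, b in zip(w, fits[0]))
    return fits, spread
```

      Calling `demo()` returns the fitted weight vectors and their maximum spread across initialisations. Variability that reflects the data rather than the optimiser would have to come from resampling the training and test sets, for example.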

      (3) In Line 198, the authors discuss the number of dimensions used in their models. To provide a comprehensive comparison, it would be informative to include direct decoding results from the original spectrograms alongside those from the VLS and LIN models. Given the vast diversity in vocal speech characteristics, it is plausible that the speaker identities might correlate with specific speech-related features also represented in both the auditory cortex and the VLS. Therefore, a clearer understanding of the original distribution of voice identities in the untransformed auditory space would be beneficial. This addition would help ascertain the extent to which transformations applied by the VLS or LIN models might be capturing or obscuring relevant auditory information.

      We plan to include direct decoding results from the original spectrograms in addition to those from the VLS and LIN models.

      Reviewer #2 (Recommendations For The Authors):

      We plan to address the following points raised by Reviewer #2:

      (1) English mistakes, rewordings:

      a. L31: 'in voice' > consider rewording (from a voice?).

      b. L33: consider splitting sentence (after interactions).

      c. L39: 'brain' after parentheses.

      d. L45-: certainly DNNs 'as a powerful tool' extend to audio (not just image and video) beyond their use in brain models.

      e. L52: listened to / heard.

      f. L63: use second/s consistently.

      g. L64: the reference to Figure 5D is maybe a bit confusing here in the introduction.

      h. L79-88: this section is formulated in a way that is too detailed for the introduction text (confusing to read). Consider a more general introduction to the VLS concept here and the details of this study later.

      i. L99-: again, I think the experimental details are best saved for later. It's good to provide a feel for the analysis pipeline here, but some of the details provided (number of averages, denoising, preprocessing), are anyway too unspecific to allow the reader to fully follow the analysis.

      We will correct the mistakes, apply the suggested rewordings, and clarify the points raised.

      (2) Clarification.

      • L159: what was the motivation for classifying age as a 2-class classification problem? Rather than more classes or continuous prediction? How did you choose the age split?

      • L263: Is the test of RDM correlation>0 corrected for multiple comparisons across ROIs, subjects, and models?

      • L379: 'these stimuli' - weren't the experimental stimuli different from those used to train the V/AE?

      • L443: what are 'technical issues' that prevented subject 3 from participating in 48 runs??

      • L444: participants were instructed to 'stay in the scanner'!? Do you mean 'stay still', or something?

      • L463: Hearing thresholds of 15 dB: do you mean that all had thresholds lower than 15 dB at all frequencies and at all repeated audiogram measurements?

      • L472: were the 4 category levels balanced across the dataset (in number of occurrences of each category combination)?

      • L482: the test stimuli were selected as having high energy by the amplitude envelope. It is unclear what this means (how is the envelope extracted, what feature of it is used to measure 'high energy'?)

      • L500 was the audio filtered to account for the transfer function of the Sensimetrics headphones?

      • L500: what does 'comfortable level' correspond to and was it set per session (i.e. did it vary across sessions)?

      • L526- does the normalization imply that the reconstructed spectrograms are normalized? Were the reconstructions then scaled to undo the normalization before inversion?

      • L606: does the identity GLM model the denoised betas from the first GLM or simply the BOLD data? The text indicates the latter, but I suspect the former.

      • L704: could you unpack this a bit more? It is not easy to see why you specify the summing in the objective. Shouldn't this just be the ridge objective for a given voxel/ROI? Then you could just state it in matrix notation.

      • L716: you used robust scaling for the classifications in latent space but haven't mentioned scaling here. Are we to assume that the same applies?

      • L720: Pearson correlation as a performance metric and its variance will depend on the choice of test/train split sizes. Can you show that the results generalize beyond your specific choices? Maybe report the explained variance as well to get a better idea of performance.

      • Could you specify (somewhere) the stimulus timing in a run? ISI and stimulus duration are mentioned in different places, but it would be nice to have a summary of the temporal structure of runs.

      We will clarify the points raised.

      Reviewer #3 (Recommendations For The Authors):

      We plan to address the following points raised by Reviewer #3:

      Comments:

      • Code and data are not currently available.

      • In the supplementary material, it would be beneficial to present the different analyses as boxplots, as in the main text, but with the ROIs in the left and right hemispheres separated, to better show potential hemispheric effect. Although this information is available in the Supplementary Tables, it is currently quite tedious to access it.

      • In Figure 3a, it might be beneficial to order the identities by age for each gender in order to more clearly illustrate the structure of the RDMs.

      • In Figure 3b, the variance for the correlations for the aTVA is higher than in other regions, why?

      • Please make sure that all acronyms are defined, and that they are redefined in the figure legends.

      • Gender and age are primarily encoded by different brain regions (Figure 5, pTVA vs aTVA). How does this finding compare with existing literature?

      We will upload the code and the preprocessed data, improve the supplementary material figures, fix Figure 3 according to the Reviewer’s suggestion, and clarify the points raised.

    1. Author response:

      We thank the reviewers for their comments and will revise the manuscript to provide more comprehensive clarifications to aid readers’ understanding of behaviorMate. Additionally, we intend to take several steps which could provide further insights and improve the ease of use for new behaviorMate users: (1) release an expanded and annotated library of existing settings and VR scene files, (2) improve the online documentation of context lists and decorators, which allow behaviorMate to run custom experimental paradigms without writing code, and (3) release online API details of the JSON messaging protocol that is used between behaviorMate, the Arduinos, and the VRMate program, which could be especially helpful to developers interested in expanding or modifying the system. Here we provide a few brief points of clarification to some of the concerns raised by the reviewers.

      Firstly, we clarify the system’s focus on modularity and flexibility. behaviorMate leverages the “Intranet of Things” framework to provide a low-cost platform that relies on asynchronous message passing between independent networked devices. While our current VR implementation typically involves a PC, 2 Arduinos, and an Android device per VR display, the behaviorMate GUI can be configured without editing any source code to listen on additional ports for UDP messages, which will be automatically timestamped and logged. Since the current implementation of the behaviorMate GUI can be configured through the settings file to send and receive JSON-formatted messages on arbitrary ports, third-party devices could be configured to listen and respond to these messages, also without editing the UI source code. More specialized responsibilities or tasks that require higher temporal precision (such as position tracking) are handled by dedicated circuits so as to not overload the general-purpose one. This provides a level of encapsulation/separation of concerns, since components can be optimized for performance of a single task, a feature that is especially desirable given resource limitations on the most common commercially available microcontrollers.
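      As an illustration of the general pattern described above (JSON-formatted messages sent over UDP and timestamped on arrival), here is a minimal Python sketch using only the standard library. It is not behaviorMate code and does not reproduce its protocol: the message fields (`device`, `event`, `count`), the log entry layout, and the function names are all hypothetical, and the example runs entirely on the loopback interface.

```python
import json
import socket
import time


def send_json(message, host, port):
    """Send one JSON-encoded datagram (fire-and-forget)."""
    payload = json.dumps(message).encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


def open_listener(host="127.0.0.1", port=0, timeout=2.0):
    """Bind a UDP socket; port=0 lets the operating system pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    sock.settimeout(timeout)
    return sock


def receive_and_log(sock, log):
    """Receive one datagram, timestamp it on arrival, and append it to the log."""
    data, sender = sock.recvfrom(4096)
    entry = {
        "received_at": time.time(),
        "sender": sender[0],
        "message": json.loads(data.decode("utf-8")),
    }
    log.append(entry)
    return entry


def demo():
    log = []
    listener = open_listener()
    try:
        port = listener.getsockname()[1]
        # Hypothetical message layout, invented for this sketch.
        send_json({"device": "lick_sensor", "event": "lick", "count": 1},
                  "127.0.0.1", port)
        receive_and_log(listener, log)
    finally:
        listener.close()
    return log
```

      Because the sender and listener share nothing but a port number and a message format, a third-party device only needs to emit datagrams in the agreed format to appear in the log. UDP does not guarantee delivery or ordering, so a real deployment would have to decide how to handle dropped messages.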

      A number of methods exist for synchronizing recording devices like microscopes or electrophysiology recordings with behaviorMate’s time-stamped logs of actuators and sensors. For example, the GPIO circuit can be configured to send sync triggers or receive timing signals as input; alternatively, a dedicated circuit could record frame start signals and relay them to the PC to be logged independently of the GPIO (enabling a high-resolution post-hoc alignment of the time stamps). The optimal method to use varies based on the needs of the experiment. For example, if very high temporal precision is needed, such as during electrophysiology experiments, a high-speed data acquisition (DAQ) circuit to capture a fixed interval readout might be beneficial. behaviorMate could still be set up as normal to provide closed and open-loop task control at behaviorally relevant timescales alongside a DAQ circuit recording events at a consistent temporal resolution. While this would increase the relative cost of the recording setup, identical rigs for training animals could still be configured without the DAQ circuit, avoiding the additional cost and complexity.
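      The post-hoc alignment mentioned above can be sketched as a simple lookup, assuming that frame start times and behavioral events have both been logged against the same clock and that the frame times are sorted. The function name and the example timestamps below are hypothetical; this is one possible approach, not the authors' analysis code.

```python
import bisect


def align_events_to_frames(frame_times, event_times):
    """Map each event to the index of the latest frame that started at or
    before it. Events that precede the first frame map to None.

    frame_times must be sorted in ascending order.
    """
    indices = []
    for t in event_times:
        i = bisect.bisect_right(frame_times, t) - 1
        indices.append(i if i >= 0 else None)
    return indices


def demo():
    # Hypothetical timestamps in seconds, for illustration only.
    frame_times = [0.0, 0.033, 0.066, 0.1]
    event_times = [0.01, 0.07, 0.2, -0.5]
    return align_events_to_frames(frame_times, event_times)
```

      Each lookup is a binary search, so aligning many events against a long imaging session stays cheap. If the two streams were recorded on different clocks, the offset and drift between them would have to be estimated first, for instance from shared sync triggers.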

      VRMate provides the interface between Unity and behaviorMate; therefore, using the two systems together means that no Unity or C# programming is necessary. VRMate provides a prespecified set of visual cues that can be scaled in 3 dimensions and have textures applied to them, permitting a wide variety of different scenes to be displayed. All VRMate scene details are additionally logged by behaviorMate to allow for consistency checks across experiments. The VRMate project also includes “editor scripts” that provide a drag-and-drop utility in Unity Editor for developing new scenes. Since the details pertaining to specific scenes and view angle are loaded at runtime via JSON-formatted UDP messages, it is not necessary to recompile VRMate in order to use this feature. Since we send individual position updates to VRMate from the PC, any issues with clock drift would be limited to the refresh rate of the Unity program, which is fast enough to be perceived as instantaneous; we have also thoroughly tested the timing differences between displays using high-speed cameras and found them to be negligible. While we find using 5 separate Android computers to render scenes as described an optimal solution to maximize flexibility, it would also be possible to render all scenes on a single PC to further mitigate this concern depending on experimental demands. Finally, our treadmill implementations of behaviorMate use no monitor displays; however, due to the modular design of behaviorMate, virtual cues could be seamlessly added to any such setup by adding a VR context to the settings files.

      One last point to mention is that our project is not affected by the recent changes in the pricing structure of Unity, since the compiled software does not need to be regenerated to update VR scenes or to implement new task logic, which is handled by the behaviorMate GUI. This means the current state of the VRMate program is robust to any future pricing changes or other restructuring of the Unity program and does not rely on continued support of Unity. Additionally, while the solution presented in VRMate has many benefits, a developer could easily adapt any open-source VR Maze project to receive the UDP-based position updates from behaviorMate or develop their own novel VR solutions. We intend to update the VR section of the manuscript to make all of this information clearer in the document as well as to provide the additional online documentation in the materials linked in the supplemental information.

    2. Reviewer #1 (Public Review):

      Summary:

      Bowler et al. present a thoroughly tested system for modularized behavioral control of navigation-based experiments, particularly suited for pairing with 2-photon imaging but applicable to a variety of techniques. This system, which they name behaviorMate, represents a valuable contribution to the field. As the authors note, behavioral control paradigms vary widely across laboratories in terms of hardware and software utilized and often require specialized technical knowledge to make changes to these systems. Having a standardized, easy-to-implement, and flexible system that can be used by many groups is therefore highly desirable. This work will be of interest to systems neuroscientists looking to integrate flexible head-fixed behavioral control with neural data acquisition.

      Strengths:

      The present manuscript provides compelling evidence of the functionality and applicability of behaviorMate. The authors report benchmark tests for real-time update speed between the animal's movement and the behavioral control, on both the treadmill-based and virtual reality (VR) setups. Further, they nicely demonstrate and quantify reliable hippocampal place cell coding in both setups, using synchronized 2-photon imaging. This place cell characterization also provides a concrete comparison between the place cell properties observed in treadmill-based navigation vs. visual VR in a single study, which itself is a helpful contribution to the field.

      Documentation for installing and operating behaviorMate is available via the authors' lab website and linked in the manuscript.

      Weaknesses:

      The following comments are mostly minor suggestions intended to add clarity to the paper and provide context for its significance.

      (1) As VRMate (a component of behaviorMate) is written using Unity, what is the main advantage of using behaviorMate/VRMate compared to using Unity alone paired with Arduinos (e.g. Campbell et al. 2018), or compared to using an existing toolbox to interface with Unity (e.g. Alsbury-Nealy et al. 2022, DOI: 10.3758/s13428-021-01664-9)? For instance, one disadvantage of using Unity alone is that it requires programming in C# to code the task logic. It was not entirely clear whether VRMate circumvents this disadvantage somehow -- does it allow customization of task logic and scenery in the GUI? Does VRMate add other features and/or usability compared to Unity alone? It would be helpful if the authors could expand on this topic briefly.

      (2) The section on "context lists", lines 163-186, seemed to describe an important component of the system, but this section was challenging to follow and readers may find the terminology confusing. Perhaps this section could benefit from an accompanying figure or flow chart, if these terms are important to understand.

      (2a) Relatedly, "context" is used to refer to both when the animal enters a particular state in the task like a reward zone ("reward context", line 447) and also to describe a set of characteristics of an environment (Figure 3G), akin to how "context" is often used in the navigation literature. To avoid confusion, one possibility would be to use "environment" instead of "context" in Figure 3G, and/or consider using a word like "state" instead of "context" when referring to the activation of different stimuli.

      (3) Given the authors' goal of providing a system that is easily synchronizable with neural data acquisition, especially with 2-photon imaging, I wonder if they could expand on the following features:

      (3a) The authors mention that behaviorMate can send a TTL to trigger scanning on the 2P scope (line 202), which is a very useful feature. Can it also easily generate a TTL for each frame of the VR display and/or each sample of the animal's movement? Such TTLs can be critical for synchronizing the imaging with behavior and accounting for variability in the VR frame rate or sampling rate.

      (3b) Is there a limit to the number of I/O ports on the system? This might be worth explicitly mentioning.

      (3c) In the VR version, if each display is run by a separate Android computer, is there any risk of clock drift between displays? Or is this circumvented by centralized control of the rendering onset via the "real-time computer"?

    3. Reviewer #2 (Public Review):

      Summary:

      The authors present behaviorMate, an open-source behavior recording and control system including a central GUI and compatible treadmill and display components. Notably, the system utilizes the "Intranet of things" scheme and the components communicate through a local network, making the system modular, which in turn allows users to easily configure the setup to suit their experimental needs. Overall, behaviorMate is a valuable resource for researchers performing head-fixed imaging studies, as the commercial alternatives are often expensive and inflexible to modify.

      Strengths and Weaknesses:

      The manuscript presents two major utilities of behaviorMate: (1) as an open-source alternative to commercial behavior apparatus for head-fixed imaging studies, and (2) as a set of generic schema and communication protocols that allows the users to incorporate arbitrary recording and stimulation devices during a head-fixed imaging experiment. I found the first point well-supported and demonstrated in the manuscript. Indeed, the documentation, BOM, CAD files, circuit design, source, and compiled software, along with the manuscript, create an invaluable resource for neuroscience researchers looking to set up a budget-friendly VR and head-fixed imaging rig. Some features of behaviorMate, including the computer vision-based calibration of the treadmill, and the decentralized, Android-based display devices, are very innovative approaches and can be quite useful in practical settings. However, regarding the second point, my concern is that there is not adequate documentation and design flexibility to allow the users to incorporate arbitrary hardware into the system. In particular:

      (1) The central controlling logic is coupled with GUI and an event loop, without a documented plugin system. It's not clear whether arbitrary code can be executed together with the GUI, hence it's not clear how much the functionality of the GUI can be easily extended without substantial change to the source code of the GUI. For example, if the user wants to perform custom real-time analysis on the behavior data (potentially for closed-loop stimulation), it's not clear how to easily incorporate the analysis into the main GUI/control program.

      (2) The JSON messaging protocol lacks API documentation. It's not clear what the exact syntax is, supported key/value pairs, and expected response/behavior of the JSON messages. Hence, it's not clear how to develop new hardware that can communicate with the behaviorMate system.

      (3) It seems the existing control hardware and the JSON messaging only support GPIO/TTL types of input/output, which limits the applicability of the system to more complicated sensor/controller hardware. The authors mentioned that hardware like Arduino natively supports serial protocols like I2C or SPI, but it's not clear how they are handled and translated to JSON messages.

      Additionally, because it's unclear how easy it is to incorporate arbitrary hardware with behaviorMate, the "Intranet of things" approach seems to lose its appeal. Since the manuscript currently focuses mainly on a specific set of hardware designed for a specific type of experiment, it's not clear what the advantages are of implementing communication over a local network as opposed to the typical connections using USB.

      In summary, the manuscript presents a well-developed open-source system for head-fixed imaging experiments with innovative features. The project is a very valuable resource to the neuroscience community. However, some claims in the manuscript regarding the extensibility of the system and protocol may require further development and demonstration.

    1. There also exist well-known vulnerabilities for eBPF programs, which can allow attacks to break container isolation [13] and execute malicious code in the kernel [22]. Since Wattmeter is built on top of eBPF and accesses RAPL information, only privileged users should be allowed to access it.

      So, vulnerable to platypus attack?

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      While there are many models for sequence retrieval, it has been difficult to find models that vary the speed of sequence retrieval dynamically via simple external inputs. While recent works [1,2] have proposed some mechanisms, the authors here propose a different one based on heterogeneous plasticity rules. Temporally symmetric plasticity kernels (that do not distinguish between the order of pre and post spikes, but only their time difference) are expected to give rise to attractor states, asymmetric ones to sequence transitions. The authors incorporate a rate-based, discrete-time analog of these spike-based plasticity rules to learn the connections between neurons (leading to connections similar to Hopfield networks for attractors and sequences). They use either a parametric combination of symmetric and asymmetric learning rules for connections into each neuron, or separate subpopulations having only symmetric or asymmetric learning rules on incoming connections. They find that the latter is conducive to enabling external inputs to control the speed of sequence retrieval.
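      The distinction drawn above between temporally symmetric rules (attractors) and asymmetric rules (sequence transitions) can be illustrated with a deliberately simplified Hopfield-style sketch. It assumes binary ±1 units, a sign update, and random patterns, which is not the rate-based model with heterogeneous plasticity studied in the paper; all function names are hypothetical.

```python
import random


def sign(x):
    return 1 if x >= 0 else -1


def make_patterns(n_units, n_patterns, seed=0):
    rng = random.Random(seed)
    return [[rng.choice((-1, 1)) for _ in range(n_units)]
            for _ in range(n_patterns)]


def hebbian_weights(patterns, shift):
    """Outer-product learning rule.

    shift=0: symmetric rule (pre and post activity from the same pattern).
    shift=1: asymmetric rule (post from pattern mu+1, pre from pattern mu).
    """
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for mu in range(len(patterns) - shift):
        post, pre = patterns[mu + shift], patterns[mu]
        for i in range(n):
            for j in range(n):
                w[i][j] += post[i] * pre[j] / n
    return w


def step(w, state):
    """One synchronous update of all units."""
    return [sign(sum(wij * sj for wij, sj in zip(row, state))) for row in w]


def overlap(a, b):
    """Normalised dot product: 1.0 means identical patterns."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


def demo():
    patterns = make_patterns(n_units=100, n_patterns=4)
    w_sym = hebbian_weights(patterns, shift=0)
    w_asym = hebbian_weights(patterns, shift=1)
    # Symmetric weights: the first pattern should stay put (an attractor).
    stay = overlap(step(w_sym, patterns[0]), patterns[0])
    # Asymmetric weights: the first pattern should map onto the second.
    advance = overlap(step(w_asym, patterns[0]), patterns[1])
    return stay, advance
```

      In this toy setting the symmetric matrix holds the network on the current pattern while the asymmetric matrix pushes it to the next one. This gives some intuition for why weighting the two kinds of input differently could slow down or speed up retrieval.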

      Strengths:

      The authors have expertly characterised the system dynamics using both simulations and theory. How the speed and quality of retrieval vary across phase space has been well-studied. The authors are also able to vary the external inputs to reproduce a preparatory followed by an execution phase of sequence retrieval as seen experimentally in motor control. They also propose a simple reinforcement learning scheme for learning to map the two external inputs to the desired retrieval speed.

      Weaknesses:

      (1) The authors translate spike-based synaptic plasticity rules to a way to learn/set connections for rate units operating in discrete time, similar to their earlier work in [5]. The bio-plausibility issues of learning in [5] carry over here, e.g. the authors ignore any input due to the recurrent connectivity during learning and effectively fix the pre and post rates to the desired ones. While the learning itself is not fully bio-plausible, it does lend itself to writing the final connectivity matrix in a manner that is easier to analyze theoretically.

      We agree with the reviewer that learning is not `fully bio-plausible’. However, we believe that extending the results to a model in which synaptic plasticity depends on recurrent inputs is beyond the scope of this work. We have added a mention of this issue in the Discussion in the revised manuscript.

      (2) While the authors learn to map the set of two external input strengths to speed of retrieval, they still hand-wire one external input to the subpopulation of neurons with temporally symmetric plasticity and the other external input to the other subpopulation with temporally asymmetric plasticity. The authors suggest that these subpopulations might arise due to differences in the parameters of Ca dynamics as in their earlier work [29]. How these two external inputs would connect to neurons differentially based on the plasticity kernel / Ca dynamics parameters of the recurrent connections is still an open question which the authors have not touched upon.

      The issue of how external inputs could self-organize to drive the network to retrieve sequences at appropriate speeds is addressed in the Results section, paragraph `Reward-driven learning’. These inputs are not `hand-wired’ - they are initially random and then acquire the necessary strengths to allow the network to retrieve the sequences at different speeds thanks to a simple reinforcement learning scheme. We have rewritten this section to clarify this issue.

      (3) The authors require that temporally symmetric and asymmetric learning rules be present in the recurrent connections between subpopulations of neurons in the same brain region, i.e. some neurons in the same brain region should have temporally symmetric kernels, while others should have temporally asymmetric ones. The evidence for this seems thin. Though, in the discussion, the authors clarify 'While this heterogeneity has been found so far across structures or across different regions in the same structure, this heterogeneity could also be present within local networks, as current experimental methods for probing plasticity only have access to a single delay between pre and post-synaptic spikes in each recorded neuron, and would therefore miss this heterogeneity'.

      We agree with the reviewer that this is currently an open question. We describe this issue in more detail in the Discussion of the revised manuscript.

      (4) An aspect which the authors have not connected to is one of the authors' earlier works:

      Brunel, N. (2016). Is cortical connectivity optimized for storing information? Nature Neuroscience, 19(5), 749-755. https://doi.org/10.1038/nn.4286, which argues that the experimentally observed over-representation of symmetric synapses suggests that cortical networks are optimized for attractors rather than sequences.

      We thank the reviewer for this suggestion. We have added a paragraph in the discussion that discusses work on statistics of synaptic connectivity in optimal networks. We expect that in networks that contain two subpopulations of neurons, the degree of symmetry should be intermediate between a network storing fixed point attractors exclusively, and a network storing sequences exclusively.

      Despite the above weaknesses, the work is a solid advance in proposing an alternate model for modulating speed of sequence retrieval and extends the use of well-established theoretical tools. This work is expected to spawn further works like extending to a spiking neural network with Dale's law, more realistic learning taking into account recurrent connections during learning, and experimental follow-ups. Thus, I expect this to be an important contribution to the field.

      We thank the reviewer for the insightful comments.

      Reviewer #2 (Public Review):

      Sequences of neural activity underlie most of our behavior. And as experience suggests we are (in most cases) able to flexibly change the speed for our learned behavior which essentially means that brains are able to change the speed at which the sequence is retrieved from the memory. The authors here propose a mechanism by which networks in the brain can learn a sequence of spike patterns and retrieve them at variable speed. At a conceptual level I think the authors have a very nice idea: use of symmetric and asymmetric learning rules to learn the sequences and then use different inputs to neurons with symmetric or asymmetric plasticity to control the retrieval speed. The authors have demonstrated the feasibility of the idea in a rather idealized network model. I think it is important that the idea is demonstrated in more biologically plausible settings (e.g. spiking neurons, a network with exc. and inh. neurons with ongoing activity).

      Summary

      In this manuscript, the authors address the problem of learning and retrieving sequential activity in neuronal networks. In particular, they focus on the problem of how sequence retrieval speed can be controlled.

      They have considered a model with excitatory rate-based neurons. Authors show that when sequences are learned with both temporally symmetric and asymmetric Hebbian plasticity, by modulating the external inputs to the network the sequence retrieval speed can be modulated. With the two types of Hebbian plasticity in the network, sequence learning essentially means that the network has both feedforward and recurrent connections related to the sequence. By giving different amounts of input to the feed-forward and recurrent components of the sequence, authors are able to adjust the speed.

      Strengths

      - Authors solve the problem of sequence retrieval speed control by learning the sequence in both feedforward and recurrent connectivity within a network. It is a very interesting idea for two main reasons: 1. It does not rely on delays or short-term dynamics in neurons/synapses 2. It does not require that the animal is presented with the same sequences multiple times at different speeds. Different inputs to the feedforward and recurrent populations are sufficient to alter the speed. However, the work leaves several issues unaddressed as explained below.

      Weaknesses

      - The main weakness of the paper is that it is mostly driven by a motivation to find a computational solution to the problem of sequence retrieval speed. In most cases they have not provided any arguments about the biological plausibility of the solution they have proposed e.g.:

      - Is there any experimental evidence that some neurons in the network have symmetric Hebbian plasticity and some temporally asymmetric? In the references authors have cited some references to support this. But usually the switch between temporally symmetric and asymmetric rules is dependent on spike patterns used for pairing (e.g. bursts vs single spikes). In the context of this manuscript, it would mean that in the same pattern, some neurons burst and some don't and this is the same for all the patterns in the sequence. As far as I see here authors have assumed a binary pattern of activity which is the same for all neurons that participate in the pattern.

      There is currently only weak evidence for heterogeneity of synaptic plasticity rules within a single network, though there is plenty of evidence for such heterogeneity across networks or across locations within a particular structure (see references in our Discussion). The reviewer suggests another interesting possibility, that the temporal asymmetry could depend on the firing pattern of the post-synaptic neuron. An example of such behavior can be found in a 2006 paper by Wittenberg and Wang, who show that pairing single spikes of pre- and post-synaptic neurons leads to LTD at all time differences in a symmetric fashion, while pairing a pre-synaptic spike with a burst of post-synaptic spikes leads to temporally asymmetric plasticity, with an LTP window at short positive time differences. We now mention this possibility in the Discussion, but we believe that fully exploring this scenario is beyond the scope of the paper.

      - How would external inputs know that they are impinging on a symmetric or asymmetric neuron? Authors have proposed a mechanism to learn these inputs. But that makes the sequence learning problem a two stage problem -- first an animal has to learn the sequence and then it has to learn to modulate the speed of retrieval. It should be possible to find experimental evidence to support this?

      Our model does not assume that the two processes necessarily occur one after the other. Importantly, once the correct external inputs that can modulate sequence retrieval are learned, sequence retrieval modulation will automatically generalize to arbitrary new sequences that are learned by the network.

      - Authors have only considered homogeneous DC input for sequence retrieval. This kind of input is highly unnatural. It would be more plausible if the authors considered fluctuating input which is different from each neuron.

      We have modified Figure 1e and Figure 2c to show the effects of fluctuating inputs on pattern correlations and single unit activity. We find that these inputs do not qualitatively affect our results.

      - All the work is demonstrated using a firing rate based model of only excitatory neurons. I think it is important that some of the key results are demonstrated in a network of both excitatory and inhibitory spiking neurons. As the authors very well know it is not always trivial to extend rate-based models to spiking neurons.

      I think at a conceptual level authors have a very nice idea but it needs to be demonstrated in a more biologically plausible setting (and by that I do not mean biophysical neurons etc.).

      We have included a new section in the discussion with an associated figure (Figure 7) demonstrating that flexible speed control can be achieved in an excitatory-inhibitory (E-I) spiking network containing two excitatory populations with distinct plasticity mechanisms.

      Reviewer #1 (Recommendations For The Authors):

      In the introduction, the authors state: 'symmetric kernels, in which coincident activity leads to strengthening regardless of the order of pre and post-synaptic spikes, have also been observed in multiple contexts with high frequency plasticity induction protocols in cortex [21]'. To my understanding, [21]'s final model 3, ignores LTD if the post-spike also participates in LTP, and only considers nearest-neighbour interactions. Thus, the kernel would not be symmetric. Can the authors clarify what they mean and how their conclusion follows, as [21] does not show any kernels either.

      In this statement, we were not referring to the model in [21], but rather the experimentally observed plasticity kernels at different frequencies. In particular, we were referring to the symmetric kernel that appears in the bottom panel of Figure 7c in that paper.

      The authors should also address the weaknesses mentioned above. They don't need to solve the issues but expand (and maybe indicate resolutions) on these issues in the Discussion.

      For ease of reproducibility, the authors should make their code available as well.

      We intend to publish the code required to reproduce all figures on Github.

      Reviewer #2 (Recommendations For The Authors):

      -  Show the ground state of the network before and after learning.

      We have decided not to include such a figure, as we have not analyzed the learning process, but instead a network with a fixed connectivity matrix which is assumed to be the end result of a learning process.

      -  Authors have only considered a network of excitatory neurons. This does not make sense. I think they should demonstrate a network of both exc. and inh. neurons (spiking neurons) exhibiting ongoing activity.

      See our comment to Reviewer #2 in the previous section.

      -  Show how the sequence dynamics unfolds when we assume a non-zero ongoing activity.

      We are not sure what the reviewer means by 'non-zero ongoing activity'. We now show the dynamics of the network in the presence of noisy inputs, which can represent ongoing activity from other structures (see Fig. 1e and 2c).

      -  From the correlation (==quality) alone it is difficult to judge how well the sequence has been recovered. Authors should consider showing some examples so that the reader can get a visual estimate of what 0.6 quality may mean. High speed is not really associated with high quality (Fig 2b). So it is important to show how the sequence retrieval quality is for non-linear and heterogeneous learning rules.

      We believe that some insight into the relationship between speed and quality for the case of non-linear and heterogeneous learning rules is provided by the correlation plots for chosen input configurations (see Fig. 3a and 5b). We leave a full characterization for future work.

      -  Authors should show how the retrieval and quality of sequences change when they are recovered with positive input, or positive input to one population and negative to another. In the current version sequence retrieval is shown only with negative inputs. This is a somewhat non-biological setting. The inhibitory gating argument (L367-389) is really weak.

      We would like to clarify that with the parameters chosen in this paper, the transfer function has half its maximal rate at zero input. This is due to the fact we chose the threshold to be zero, using the fact that any threshold can be absorbed in the external inputs. Thus, negative inputs really mean sub-threshold inputs, and they are consistent with sub-threshold external excitatory inputs. We have clarified this issue in the revised manuscript.
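
      The threshold argument above can be illustrated with a minimal sketch. It assumes a sigmoidal transfer function; the functional form, the names R_MAX, SIGMA, and transfer, and all numerical values are illustrative assumptions, not taken from the manuscript. The sketch checks that zero input gives half the maximal rate when the threshold is zero, and that a neuron with threshold theta receiving input I behaves like a zero-threshold neuron receiving I - theta.

```python
import math

R_MAX = 1.0   # maximal firing rate (illustrative value)
SIGMA = 0.1   # inverse gain of the sigmoid (illustrative value)


def transfer(total_input, theta=0.0):
    """Sigmoidal transfer function (assumed form, not the manuscript's)."""
    return R_MAX / (1.0 + math.exp(-(total_input - theta) / SIGMA))


# With the threshold set to zero, zero input gives half the maximal rate.
half_rate = transfer(0.0)

# A non-zero threshold can be absorbed into the external input:
# threshold theta with input I is equivalent to threshold 0 with input I - theta.
theta, ext_input = 0.3, 0.2
with_threshold = transfer(ext_input, theta=theta)
absorbed = transfer(ext_input - theta, theta=0.0)
```

      In this picture, a "negative" input in the zero-threshold convention corresponds to a positive but sub-threshold excitatory input in the original convention.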

      -  Authors should demonstrate how the sequence retrieval dynamics is altered when they assume a fluctuating input current for sequence retrieval instead of a homogeneous DC input.

      See our comment to Reviewer #2 in the previous section.

      -  Authors should show what are the differences in synaptic weight distribution for the two types of learning (bi-linear and non-linear). I am curious to know if the difference in the speed in the two cases is related to the weight distribution. In general I think it is a good idea to show the synaptic weight distribution before and after learning.

      As mentioned above, we do not study any learning process, but rather a network with a fixed connectivity matrix, assumed to represent the end result of learning. In this network, the distribution of synaptic weights converges to a Gaussian in the large p and cN limits, independently of the functions f and g, because of the central limit theorem, if there are no sign constraints on weights. In the presence of sign constraints, the distribution is a truncated Gaussian.
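
      The central limit theorem argument can be checked numerically with a small sketch. It assumes Gaussian patterns, weights proportional to a sum over patterns of f(post-synaptic pattern at step mu+1) times g(pre-synaptic pattern at step mu), full connectivity, and no sign constraints. The step-like f and g below, the sizes P and N, and the normalization are hypothetical choices made only for illustration.

```python
import math
import random
import statistics

random.seed(0)
P, N = 1000, 40  # number of patterns and neurons (illustrative sizes)


def f(x):
    """Hypothetical post-synaptic nonlinearity."""
    return 1.0 if x > 0.5 else -0.4


def g(x):
    """Hypothetical pre-synaptic nonlinearity."""
    return 1.0 if x > 0.5 else -0.4


# Gaussian patterns xi[mu][i] for pattern mu and neuron i.
xi = [[random.gauss(0.0, 1.0) for _ in range(N)] for _ in range(P)]
post = [[f(x) for x in row] for row in xi]
pre = [[g(x) for x in row] for row in xi]

# Sequence-storing weights: W_ij ~ sum_mu f(xi_i^{mu+1}) * g(xi_j^mu), i != j.
weights = []
for i in range(N):
    for j in range(N):
        if i == j:
            continue
        total = sum(post[mu + 1][i] * pre[mu][j] for mu in range(P - 1))
        weights.append(total / math.sqrt(P))

# Standardize and compute skewness and excess kurtosis (both 0 for a Gaussian).
mean = statistics.fmean(weights)
sd = statistics.pstdev(weights)
z = [(w - mean) / sd for w in weights]
skewness = statistics.fmean([v ** 3 for v in z])
excess_kurtosis = statistics.fmean([v ** 4 for v in z]) - 3.0
```

      Both statistics should come out close to zero even though f and g are strongly non-linear, which is the sense in which the weight distribution is insensitive to the choice of f and g when p is large.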

      -  I suggest the use of a monochromatic color scale for figure 2b and 3b.

      Figure 3: The sentence describing panel 2 seems incomplete.

      Also explain why there is a non-monotonic relationship between I_s and speed for some values of I_a in 3b.

      There is a non-monotonic relationship for retrieval quality, not speed. We have clarified this in the manuscript text, but don’t currently have an explanation for why this phenomenon occurs for these specific values of I_a.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      (1) Line 56: replace "pyomastitis" with "pyogenic skin infections".

      Corrected.

      (2) Line 58: replace "basal strains" with "ancestral strains".

      Corrected.

      (3) Line 62: population structure impacts gene acquisition too, however, gene acquisitions can be easier to connect with a phenotype. For example, acquisition of mecA is thought to be adaptive rather than just linked to a successful lineage. This same reasoning applies to resistance-associated mutations such as gyrA mutations in ST22 emergence.

      We completely agree with the reviewer that population structure also impacts gene acquisition. We wanted to convey that connecting the gain or loss of genes to a change in a particular phenotype is much easier than doing the same for a mutation, especially in the presence of strong linkage, and therefore gene-level analysis is the focus of many previous studies. We have rewritten the sentence to better convey this idea:

      “Due to this limitation, studies of emerging strains often focus on gene level analysis such as acquisition of mobile genetic elements or loss of gene function as their effect on phenotype is easier to determine than that of point mutations.”

      (4) Line 112 this might be simply due to the smaller size of the intergenic regions chosen. I suggest to correct for the size of the genome segment considered.

      We thank the reviewer for pointing this out. The smaller size of the intergenic regions was indeed the simple explanation for this observation. We have added the following sentence to the manuscript:

      “This is reflective of the fact that most of S. aureus genome sequence comprises of ORFs e.g. ~84% of TCH1516 genome is part of an ORF.”

      (5) Line 189: please add p values to supp table 2.

      We have added the p and q values from DBGWAS into Supp table 2. It is under the ‘DBGWAS Result’ sheet.

      (6) Line 227: high entropy indicates that this site is polymorphic, not necessarily that there is selective pressure. In the extreme, this might actually point to a neutral position, since any amino acid could be equally present (see for example https://www.nature.com/articles/s41467-022-31643-3#Sec10 ).

      We agree that high entropy by itself may point to a position under neutral selection, leading to some false positives. However, we focused on positions that were mostly biallelic in CC8 and showed differential prevalence in USA300 vs non-USA300 strains (albeit in the presence of strong linkage disequilibrium), in addition to having high entropy in non-CC8 strains. This helps us filter out positions that were mostly monoallelic or carried only rare mutations while preserving other sites of interest. The approach was able to find the cap5E mutation, which has been associated with disruption of capsule production.
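
      For readers unfamiliar with the entropy measure under discussion, the following is a minimal sketch of per-position Shannon entropy on a toy amino-acid alignment. The function name column_entropy and the alignment itself are made up for illustration; the exact entropy calculation used in the manuscript is not specified in this response.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the residues observed at one position."""
    counts = Counter(column)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Toy alignment: each string is one strain, each column one position (made-up data).
alignment = ["MKA", "MKT", "MRT", "MRS"]
entropies = [column_entropy(col) for col in zip(*alignment)]
```

      A monoallelic position has zero entropy, a balanced biallelic position has one bit, and positions with more alleles score higher. As the reviewer notes, a high score alone says nothing about selection, which is why the response combines it with allele-frequency filters.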

      (7) Line 271: show USA500 on the tree.

      Our current study is mostly focused on differences between USA300 and non-USA300 strains and we want to highlight those differences in the tree.

      (8) Line 327: still not possible to infer causality.

      We have changed the language to remove mentions of causality and instead talk about the association of GWAS enriched genes with measured transcriptional changes. The revised sentence now reads:

      “Here, we demonstrated how a model of transcriptional regulation with iModulons can be used to make a headway through the impasse created by the high linkage disequilibrium and identify GWAS-enriched mutations that are also associated with measurable phenotypic changes in the TRN.”

      (9) Line 324: subclades reference.

      We are unsure what this means.

      (10) Line 366: the authors seem to have used a bespoke pan-genome analysis approach. Would they be able to validate it using established tools such as Roary, Pirate or Panaroo? Panaroo in particular appears to have superior accuracy thanks to its pan-genome graph approach (https://github.com/gtonkinhill/panaroo). 

      We have added the results of Roary to our analysis (Figure S1b). The Roary results largely agree with our biggest takeaway from pangenomics, which is that our collection of genomes has good coverage of the CC8 clade at the gene level.

      (11) Line 397: what was the size of the core genome?

      There were 24881 core sites. We have added the number to the manuscript.

      (12) Line 407: please add citation or website for SCCmecFinder.

      The citation of SCCmecFinder (45) is at the end of the sentence.

      (13) Line 421: I was not able to find the code used for this analysis in the github repository provided.

      The code can be found in “notebook/02_Preprocess_DBGWAS.ipynb” within the repo.

      (14) Line 427: this is a very complex analysis for a simple univariate comparison between USA300-vs-non USA300 strains with no correction for population structure. The authors should compare their results with a more established pipeline like Pyseer or Gemma that can handle kmers and show the added value of their approach.

      We wanted to take advantage of DBGWAS’s ability to collapse kmers into unitigs and further collapse significant unitigs within a genetic neighborhood into components. Unfortunately, we found that in many cases, it became difficult to determine the exact mutation that was being enriched e.g. (T234G) without doing lots of manual work. Our network analysis simply parses the DBGWAS graph to automatically extract these mutations, making the results more interpretable. It does not do any additional hypothesis testing.

      We also attempted to pass kmer data into GEMMA but without the compaction provided by DBGWAS the memory required (>168 GB) exceeded what we had available.

      (15) DBGWAS: please indicate DBGWAS version and the options used for kmer size and number of neighbour nodes retained in the subgraph. Also, I assume that no correction for population structure was applied.

      We have added the version and parameters for DBGWAS. The method section now reads:

      “DBGWAS (v0.5.4) was used to enrich mutations unique to USA300 strains using default kmer size of 31 (-k 31) and neighborhood size of 5 (-nh 5). Alleles with frequency less than 0.1 were filtered  (-maf 0.1) and all components enriched with q-values less than 0.05 were documented (-SFF q0.05).”

      (16) Could the authors provide the DBGWAS output for the most significant unitigs in graph format? This would help readers understand the findings.

      The outputs are available in the github repo. The link to this specific data is (https://github.com/sapoudel/USA300GWASPUB/tree/master/data/dbgwas/dbgwas_output/visualisations)

      The text format of the output is part of Supplementary Table 2 under “DBGWAS Result” sheet.

      (17) Line 469: please provide more details on iModulons, it is not enough to simply reference the paper: specific QC criteria, mapping algorithm and parameters, ICA algorithm.

      We have now added a new Supplementary Note 2 section with more details about building iModulons.

      (18) Line 474: what is log-TPM?

      Log-Transcripts per Million. We have added the description in the text.

      (19) Line 479: not sure what "Chapter 3" refers to.

      Thank you for correcting the mistake. The reference has been corrected.

      Reviewer #2 (Recommendations For The Authors):

      Line 45. The introduction is not well-structured, and there is a lack of coherence among the topics pertinent to the research objective. I would recommend rewriting this section addressing the following topics: the challenge of distinguishing lineages within the CC8, especially the CA-MRSA USA300 strains; discussing the state-of-the-art GWAS methodologies, elucidating the main confounding factors in the application of GWAS to bacterial studies, and finally, exploring how current methods aim to address these concerns.

      We would like to thank the reviewer for the suggestions. The main innovation of the paper is using iModulons to find phenotype associated mutations from a set of linked mutations. The challenge of distinguishing CC8 subclades has been largely resolved thanks to efforts by Bowers et al. (PMID: 29720527). We have made some revisions to address the GWAS methodologies (bugwas and DBGWAS), the effect of linkage disequilibrium in interpreting the output of these methods and how combining the results of these association tests with modeling of TRN with iModulons can lead to finding candidate mutations of interest that are linked to specific changes in gene regulation.

      Line 56. Replace "pyomastitis" with "pyomyositis".

      Corrected to “pyogenic skin infections.”

      Lines 71. What do the authors mean by "endemic USA300 strain"?

      We have removed references to endemic strains.

      Line 106. Please verify the number of genomes used in the DBGWAS analysis. In the text, the authors mention that 2038 genomes were utilized. However, in Supplementary Table 1, only 2030 genomes are listed.

      Thank you for catching the discrepancy. We started the analysis with 2037 genomes, including four “spiked-in” reference genomes: USA100 D592 (a CC5 strain used for rooting the CC8 tree), TCH1516 (same accession number as the one used for ICA), COL, and Newman. Before further analysis, we removed 6 genomes for being smaller than 2.5 million base pairs (see preprocessing.ipynb) and the USA100 D592 strain, as it is not part of CC8. This resulted in 2030 genomes being used for DBGWAS. We kept the other 3 spiked-in CC8 genomes to help annotate the unitigs from DBGWAS. Lastly, we removed these three spiked-in CC8 genomes for the pangenomic analysis. To clarify this, we have made the following changes to the text:

      (1) Changed line 106: We downloaded 2033 S. aureus genomes for analysis and excluded six of them with genome length of less than 2.5 million base pairs. The remaining 2027 S. aureus CC8 genomes formed a closed pangenome, suggesting that the sampled genomes mostly captured the gene level variations within the clonal complex (Figure 1a).

      (2) DBGWAS section Line 177: We used 2030 genomes for this analysis; the 2027 genomes in pangenomics analysis above were “spiked” with three well known CC8 genomes- TCH1516, COL, and Newman- to help annotate the DBGWAS unitigs.

      Line 108. Could the authors provide a table with the genes that constitute the core, accessory genome, and unique genes for each of the strains?

      The gene presence/absence tables are very large files, and therefore we have only added them to our GitHub repo. The results can be found in the following file:

      Pangenomics: data/pangenome/Pangenomics/CC8_strain_by_gene.pickle.gz

      Lines 112 and 315. On what basis did the authors decide on the size of the upstream regulatory region? In the search for mutations, they extracted segments of 300 base pairs, whereas, in the search for the Fur binding motif, only 100 base pairs were considered. The RegPrecise database contains regulons for Staphylococcus aureus N315 (https://regprecise.lbl.gov/genome.jsp?genome_id=26), including the Fur regulon with multiple Transcription Factor Binding Sites (TFBSs) that extend beyond the 100 base-pair sequence. I would recommend reconsidering the search within the standardized upstream region of -400 base pairs. In the case of the Fur binding motif search, it might be beneficial to include the TFBSs available in the RegPrecise database.

      For the Fur motif search, we chose 100 base pairs because the Fur motif in non-USA300 strains was within ~20 base pairs of the isdH translation start site (Figure 4C). In this analysis, we were not asking whether any Fur motif exists; we were only checking whether the one proximal to the translation start site exists, as our DBGWAS analysis suggested that this specific region was deleted in USA300 strains.

      Line 175. This work aimed to identify potential mutations associated with the success of a specific lineage rather than a phenotype, where correction for population structure effects is necessary. Would the implementation of the bugwas method in DBGWAS for controlling bacterial population structure not potentially impact the results? How was this issue addressed in your analysis? Would it not be pertinent to run a program without population structure correction to enable a comparison of results?

      We initially tried to use Linear Mixed Models to find kmers that were only enriched in USA300 strains. These efforts were hampered by extreme linkage disequilibrium, which led to high collinearity between kmer abundances, making it extremely difficult to get a good estimate of the coefficients. We also tried to run chi-squared tests individually on each kmer, which led to an unmanageable number (>100k) of kmers that were significantly different. DBGWAS, on the other hand, was able to compress unbranched kmers in the De Bruijn graph into unitigs and further reduce the number of tests by testing at the pattern level instead of the unitig level. We found no straightforward way to run DBGWAS (or GEMMA) without population structure correction. Therefore, it is likely that we are underestimating the number of significant unitigs with this approach.

      Line 189. Please italicize the gene name cap5E.

      Corrected.

      Line 277. Please clarify the QC/QA criteria and curation process employed for the selection of RNA-seq experiments, as this constitutes a crucial step in the reconstruction of the network.

      We have now added a new supplementary material section, Supplementary Note 2 titled “Creating iModulons for CC8 Clade Staphylococcus aureus” with details of QC/QA.

      Line 279. In Supplementary Table 3, please label the first column and standardize the use of either the experiment ID or the run ID. Furthermore, verify the experiment identifiers from rows 19 to 26, as I could not locate them in the SRA database.

      We have changed all accession to experiment ID including rows 19 to 26.

      Lines 290, 330, 424, and 437. Please correct "SCCMec" to "SCCmec IVa" (italicize "mec").

      Corrected.

      Line 298. What is the size of the upstream regulatory region considered for this analysis? It is important to standardize this value for all analyses involving the upstream regulatory region. In this regard, I recommend maintaining a consistent size of -400 base pairs.

      For the Fur motif search, we chose 100 base pairs because the Fur motif in non-USA300 strains was within ~20 base pairs of the isdH translation start site (Figure 4C). In this analysis, we were not asking whether any Fur motif exists; we were only checking whether the one proximal to the translation start site exists, as our DBGWAS analysis suggested that this specific region was deleted in USA300 strains. In our usual analysis, we use -300 base pairs.

      Line 321. The discussion is rather concise and lacks an in-depth comparative perspective with relevant literature on any of the obtained results, whether concerning the proposed methodology or the potential new markers associated with the success of the USA300 lineage. The authors must underscore the method is not applicable to all GWAS analyses, due to the issue of correction for population structure.

      We have now added sections discussing the importance of isdH in S. aureus infection and a section addressing the limitations of the current approach when applied to other GWAS-type studies.

      Line 366. The authors employed the methodology described in the article by Hyun et al. 2022 (https://doi.org/10.1186/s12864-021-08223-8) to construct the pangenome. However, this methodology was designed for comparative analysis of pangenomes across various species, which does not align with the objective of this study, focusing solely on S. aureus genomes. Consequently, it remains unclear to me why the authors made this particular choice and, more importantly, what advantages it offers over well-established tools for individual pangenomes, such as Roary. I would strongly recommend validating the results using at least one established tool.

      With our analysis, we can determine proper thresholds for core/accessory/unique genes based on the observed data (Supplementary Figure 1a). However, we agree that it would be proper to include a more established pangenome package. We have added the results of Roary to our analysis. The Roary results largely agree with our biggest takeaway from pangenomics, which is that our collection of genomes has good coverage of the CC8 clade at the gene level.

      Line 370. Please include the version of CD-HIT that was utilized.

      Added. CD-HIT version 4.6 was used for the analysis.

      Line 372. What tool did the authors use to extract these regions?

      The list of CDS, 5’ and 3’ sequences can be extracted easily with a combination of fasta file and gff file. The gff file was used to find the position of each of these sequences and the sequences were extracted from the fasta file with python scripts.
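
      The extraction described above can be sketched as follows. This is a minimal standard-library illustration, not the authors' script: the function names, the toy FASTA/GFF strings, and the 3 bp upstream window in the example are hypothetical, and only CDS and 5' (upstream) sequences are extracted here.

```python
def read_fasta(text):
    """Parse FASTA text into a dict of {sequence_id: sequence}."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {key: "".join(parts) for key, parts in records.items()}


COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def extract_cds_and_upstream(fasta_text, gff_text, upstream=300):
    """Yield (attributes, cds_seq, upstream_seq) for every CDS row in the GFF."""
    genome = read_fasta(fasta_text)
    for line in gff_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) < 9 or cols[2] != "CDS":
            continue
        contig = genome[cols[0]]
        start, end, strand = int(cols[3]), int(cols[4]), cols[6]
        cds = contig[start - 1:end]  # GFF coordinates are 1-based, inclusive
        if strand == "+":
            up = contig[max(0, start - 1 - upstream):start - 1]
        else:
            # On the minus strand the 5' region lies after the feature end.
            up = revcomp(contig[end:end + upstream])
            cds = revcomp(cds)
        yield cols[8], cds, up


# Toy inputs (made-up sequence and annotations).
TOY_FASTA = ">contig1\nTTTTTATGAAATAGCCCCC\n"
TOY_GFF = (
    "contig1\t.\tCDS\t6\t14\t.\t+\t0\tID=geneA\n"
    "contig1\t.\tCDS\t6\t14\t.\t-\t0\tID=geneB\n"
)
records = list(extract_cds_and_upstream(TOY_FASTA, TOY_GFF, upstream=3))
```

      The upstream window is truncated at contig boundaries rather than padded, which matters for draft assemblies with many short contigs.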

      Line 395. What were the QC/QA criteria used to select the sequences?

      The QC/QA criteria for the sequences are mentioned at the beginning of the Pangenomic analysis subsection and are as follows:

      “Briefly, “complete” or “WGS” samples from CC8/ST8 were downloaded from the PATRIC database. Sequences with lengths that were not within 3 standard deviations of the mean length or those with more than 100 contigs were filtered out.”
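
      The filter quoted above could be expressed as in the sketch below. The record fields, the toy genome list, and the use of the population standard deviation are assumptions for illustration; the quoted text does not say which standard deviation estimator was used.

```python
import statistics


def passes_qc(genomes, max_contigs=100, n_sd=3.0):
    """Keep genomes within n_sd SDs of the mean length and with <= max_contigs contigs."""
    lengths = [g["length"] for g in genomes]
    mean = statistics.fmean(lengths)
    sd = statistics.pstdev(lengths)  # population SD (assumed)
    return [
        g for g in genomes
        if abs(g["length"] - mean) <= n_sd * sd and g["contigs"] <= max_contigs
    ]


# Toy metadata: 20 typical assemblies, one very short one, one highly fragmented one.
genomes = [
    {"id": f"genome_{i}", "length": 2_800_000 + 1000 * i, "contigs": 10}
    for i in range(20)
]
genomes.append({"id": "short_assembly", "length": 1_000_000, "contigs": 5})
genomes.append({"id": "fragmented_assembly", "length": 2_810_000, "contigs": 250})

kept = passes_qc(genomes)
kept_ids = {g["id"] for g in kept}
```

      Note that with very few genomes no sample can lie more than 3 standard deviations from the mean, so a length filter of this kind is only meaningful for collections of the size used here.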

      Line 407. Please correct the tool name to "SCCmecFinder" (italicize "mec").

      The name has been corrected.

      Line 409. I believe BLASTp was run locally, so please specify the version used and the search parameters.

      As corrected further down, we used BLASTn not BLASTp. The version v2.2.31 has been added to the methods section.

      Line 416. There is conflicting information with line 409, which mentions that PVL was identified through a protein BLAST, but right below, it states it was a BLASTn. Please verify which information is correct and consider the previous comment to specify the version and parameters.

      Thank you for catching the discrepancy. We have corrected the text:

      “PVL was detected using nucleotide BLAST.”

      Line 418. Please provide the column identifiers for the Supplementary Table 5 (PVL worksheet).

      Column names are added.

      Line 418. Please remove the repeated word "and" in Supplementary Table 5 (mecA worksheet) and italicize the gene names in this table.

      Corrected.

      Line 419. You can use the abbreviation "SNPs" since it was introduced in line 65.

      Corrected.

      Line 420. In my view, this analysis could benefit from a more detailed and clearer explanation.

      We have added to the explanation. The section now reads:

      “To find the root of the USA300 strains in the phylogenetic tree, the genomes in the tree were first annotated by their PVL and SCCmec status. Then the tree was traversed from leaf to root starting from known USA300 strains – TCH1516 and FPR3757 – while keeping track of the number of descendant genomes from the current root that contained known markers SCCmec IVa and PVL. The node where the number of genomes with the markers started flatlining was marked as the root of USA300.”
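
      One possible reading of this traversal is sketched below on a toy tree. The tree, node names other than TCH1516 and FPR3757, marker assignments, and the stopping rule (stop at the first ancestor whose parent adds no further marker-positive genomes) are all assumptions made for illustration; the authors' actual implementation is in their repository and may define "flatlining" differently.

```python
def leaves_under(node, children):
    """Return the names of all leaves at or below node."""
    kids = children.get(node, [])
    if not kids:
        return [node]
    found = []
    for kid in kids:
        found.extend(leaves_under(kid, children))
    return found


def find_clade_root(start_leaf, parent, children, has_markers):
    """Walk from start_leaf toward the root and return (node, count) at the
    first ancestor whose parent contains no additional marker-positive leaves."""
    node = parent[start_leaf]
    count = sum(has_markers[leaf] for leaf in leaves_under(node, children))
    while node in parent:
        above = parent[node]
        above_count = sum(has_markers[leaf] for leaf in leaves_under(above, children))
        if above_count == count:  # the marker count has flatlined
            return node, count
        node, count = above, above_count
    return node, count


# Toy tree (made up): root -> n1 -> n2 -> n3 -> known USA300 leaves.
children = {
    "root": ["n1", "outgroup"],
    "n1": ["n2", "non_usa300_a"],
    "n2": ["n3", "usa300_c"],
    "n3": ["TCH1516", "FPR3757"],
}
parent = {kid: node for node, kids in children.items() for kid in kids}

# True when a genome carries both SCCmec IVa and PVL (toy annotation).
has_markers = {
    "TCH1516": True,
    "FPR3757": True,
    "usa300_c": True,
    "non_usa300_a": False,
    "outgroup": False,
}

clade_root, n_marked = find_clade_root("TCH1516", parent, children, has_markers)
```

      On real data a strict equality test would stop too early whenever a single marker-negative genome is nested inside the clade, so some tolerance on the flatlining criterion would likely be needed.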

      Line 428. Specify the version and parameters used in the analysis with DBGWAS.

      Added. The text now reads:

      “DBGWAS (v0.5.4) was used to enrich mutations unique to USA300 strains using default kmer size of 31 (-k 31) and neighborhood size of 5 (-nh 5). Alleles with frequency less than 0.1 were filtered  (-maf 0.1) and all components enriched with q-values less than 0.05 were documented (-SFF q0.05).”

      Line 431. What tools were employed to calculate Pearson correlation and distances relative to the reference genome?

      Added. The text now reads:

      “Genome-wide linkage was estimated by Pearson correlation (calculated with built-in Pandas function) of the presence/ absence of enriched kmers and distance was measured based on the kmer alignment to the reference TCH1516 genome as determined by BLASTn.”
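
      The linkage estimate described above can be illustrated with a small sketch. The kmer names, presence/absence vectors, and reference positions are made up, and a hand-written Pearson function stands in for the built-in Pandas correlation the authors mention so that the example needs only the standard library.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Presence (1) / absence (0) of each enriched kmer across six genomes (toy data).
presence = {
    "kmer_a": [1, 1, 1, 0, 0, 0],
    "kmer_b": [1, 1, 1, 0, 0, 0],
    "kmer_c": [1, 0, 1, 0, 1, 0],
}

# Hypothetical alignment start of each kmer on the reference genome (bp).
positions = {"kmer_a": 1200, "kmer_b": 450000, "kmer_c": 1300}

# One row per kmer pair: (kmer1, kmer2, correlation, distance on the reference).
linkage = [
    (a, b, pearson(presence[a], presence[b]), abs(positions[a] - positions[b]))
    for a, b in combinations(sorted(presence), 2)
]
```

      In this toy example kmer_a and kmer_b are perfectly correlated despite lying far apart on the reference, which is the genome-wide linkage pattern the analysis is designed to expose. The distance here ignores chromosome circularity.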

      Line 450. What type of BLAST was used?

      Added. Nucleotide blast was used for all kmer analysis.

      Line 452. I didn't quite understand the reason for making this analysis available in a separate repository. It would be easier for readers looking to reproduce the work if all the codes were in a single repository.

      We kept the repository separate in case we wanted to further develop the network analysis code in the future. We have added the link to the network analysis repository in the README of the publication repo.

      Line 460. Please specify the version and parameters, if run locally, or indicate if a web page was used.

      Corrected to indicate that we used the PATRIC website for this.

      Line 470. Specify the version and provide a detailed account of all parameters used, along with the QC/QA criteria and curation methods applied.

      We have added Supplementary Note 2 with all the details about packages and parameters used to calculate the iModulons.

      Line 479. The phrase "ICA was then run as previously described in chapter 3" does not make sense. Please clarify.

      We have corrected the mistake and added a new supplementary note with details about our ICA run. The line now reads:

      “A detailed version of the methods for RNA-sequencing and ICA analysis is available as Supplementary Note 2. ICA of RNA sequencing data was performed using the pymodulon package.”

      Line 484. Specify the version of CD-HIT.

      Added. The version used was v4.6.

      Line 494. To enable reproducibility, the repository should be better organized, especially the directory containing the code. Numbering each script in the order it was run would assist the reader in comprehending the overall analysis flow and adapting it to their needs. If creating a manual for method usage is not feasible, the code could be more extensively commented on to explain the parameters, choices made, and how these could be modified. The "Data" folder seems to contain some test files, such as those in the "isdh_fimo" folder, so removing test files would aid the understanding of the reader.

      Thank you for the suggestions. We have now numbered the notebooks that generate the figures, added more comments to the code, and removed testing code and test datasets.

      Throughout the article, please correct "SCCMec" to "SCCmec" (italicize "mec").

      Corrected.

    1. Reviewer #1 (Public Review):

      Lu & Golomb combined EEG, artificial neural networks, and multivariate pattern analyses to examine how different visual variables are processed in the brain. The conclusions of the paper are mostly well supported, but some aspects of methods and data analysis would benefit from clarification and potential extensions.

      The authors find that not only real-world size is represented in the brain (which was known), but both retinal size and real-world depth are represented, at different time points or latencies, which may reflect different stages of processing. Prior work has not been able to answer the question of real-world depth due to the stimuli used. The authors made this possible by assessing real-world depth and testing it with appropriate methodology, accounting for retinal and real-world size. The methodological approach combining behavior, RSA, and ANNs is creative and well thought out to appropriately assess the research questions, and the findings may be very compelling if backed up with some clarifications and further analyses.

      The work will be of interest to experimental and computational vision scientists, as well as the broader computational cognitive neuroscience community as the methodology is of interest and the code is or will be made available. The work is important as it is currently not clear what the correspondence between many deep neural network models and the brain is, and this work pushes our knowledge forward on this front. Furthermore, the availability of methods and data will be useful for the scientific community.

      Some analyses are incomplete, which would be improved if the authors showed analyses with other layers of the networks and various additional partial correlation analyses.

      Clarity

      (1) Partial correlations methods incomplete - it is not clear what is being partialled out in each analysis. It is possible to guess sometimes, but it is not entirely clear for each analysis. This is important as it is difficult to assess if the partial correlations are sensible/correct in each case. Also, the Figure 1 caption is short and unclear.

      For example, ANN-EEG partial correlations - "Finally, we directly compared the timepoint-by-timepoint EEG neural RDMs and the ANN RDMs (Figure 3F). The early layer representations of both ResNet and CLIP were significantly correlated with early representations in the human brain" What is being partialled out? Figure 3F says partial correlation.
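
      To make concrete what "partialling out" means in this comment, here is a minimal sketch of a first-order partial correlation between two vectorized RDMs while controlling for a third. The three vectors are hypothetical toy values standing for the upper-triangle entries of 4x4 RDMs, and the use of Pearson correlation is an assumption; the excerpt does not say which correlation measure or which control RDMs the authors used.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def partial_corr(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical vectorized RDMs (six object pairs from a 4x4 matrix).
hypothesis_rdm = [1, 2, 3, 4, 5, 6]  # e.g. a retinal-size RDM to control for
eeg_rdm = [2, 1, 3, 4, 4, 7]         # hypothesis signal plus unrelated structure
ann_rdm = [2, 3, 1, 2, 6, 7]         # hypothesis signal plus other unrelated structure

print("raw correlation:    ", pearson(eeg_rdm, ann_rdm))
print("partial correlation:", partial_corr(eeg_rdm, ann_rdm, hypothesis_rdm))
```

      In this toy case the EEG and ANN vectors correlate only because both contain the hypothesis RDM, so the partial correlation collapses towards zero. This is why a reader needs to know exactly which RDMs were controlled for in each analysis.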

      Issues / open questions

      (2) Semantic representations vs hypothesized (hyp) RDMs (real-world size, etc) - are the representations explained by variables in hyp RDMs or are there semantic representations over and above these? E.g., For ANN correlation with the brain, you could partial out hyp RDMs - and assess whether there is still semantic information left over, or is the variance explained by the hyp RDMs?

      (3) Why only early and late layers? I can see how it's clearer to present the EEG results. However, the many layers in these networks are an opportunity - we can see how simple/complex linear/non-linear the transformation is over layers in these models. It would be very interesting and informative to see if the correlations do in fact linearly increase from early to later layers, or if the story is a bit more complex. If not in the main text, then at least in the supplement.

      (4) Peak latency analysis - Estimating peaks per participant is presumably noisy, so it seems important to show how reliable this is. One option is to find the bootstrapped mean latencies per subject.

      (5) "Due to our calculations being at the object level, if there were more than one of the same objects in an image, we cropped the most complete one to get a more accurate retinal size." Did the EEG experimenters make sure that everyone sat at the same distance from the screen and remained at that distance? This would also affect real-world depth measures.
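
      The bootstrap suggested in point (4) can be sketched as follows. This is one possible reading of the suggestion, not the reviewer's or the authors' procedure: subjects are resampled with replacement, their time courses are averaged, and the peak latency of each resampled mean is recorded. The time axis and the three per-subject time courses are hypothetical toy values, and all function names are illustrative.

```python
import random


def peak_latency(course, times):
    """Time point at which a time course reaches its maximum."""
    peak_index = max(range(len(course)), key=lambda k: course[k])
    return times[peak_index]


def bootstrap_peak_latencies(courses, times, n_boot=1000, seed=0):
    """Peak latency of the mean time course for each bootstrap resample of subjects."""
    rng = random.Random(seed)
    n_subjects = len(courses)
    latencies = []
    for _ in range(n_boot):
        sample = [courses[rng.randrange(n_subjects)] for _ in range(n_subjects)]
        mean_course = [sum(values) / n_subjects for values in zip(*sample)]
        latencies.append(peak_latency(mean_course, times))
    return latencies


def percentile_interval(values, lower=2.5, upper=97.5):
    """Simple percentile interval taken from the sorted bootstrap distribution."""
    ordered = sorted(values)

    def pick(percent):
        return ordered[min(len(ordered) - 1, int(percent / 100 * len(ordered)))]

    return pick(lower), pick(upper)


# Hypothetical data: time axis in ms and one similarity time course per subject.
times = [0, 50, 100, 150, 200]
courses = [
    [0.00, 0.10, 0.30, 0.20, 0.00],
    [0.00, 0.20, 0.25, 0.10, 0.00],
    [0.00, 0.05, 0.10, 0.30, 0.10],
]

latencies = bootstrap_peak_latencies(courses, times, n_boot=200)
print("mean bootstrapped peak latency (ms):", sum(latencies) / len(latencies))
print("95% percentile interval (ms):", percentile_interval(latencies))
```

      A wide interval would indicate that the peak estimate is unstable, which is the reliability concern raised in the comment.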

    1. eLife assessment

      In their manuscript, Cummings et al. use in vitro reconstitution to examine the differential activities of tubulin polyglycylases, providing valuable insights into the enzymatic regulation of microtubule glycylation and its mechanistic role in maintaining cilia function and microtubule dynamics. The convincing evidence, supported by well-designed experiments and appropriate controls, significantly advances our understanding of the tubulin code and its biochemical mechanisms.

    2. Reviewer #2 (Public Review):

      In their manuscript, Cummings et al. focus on the enzymatic activities of TTLL3, TTLL8, and TTLL10, which catalyze the glycylation of tubulin, a crucial posttranslational modification for cilia maintenance and motility. The experiments are beautifully performed, with meticulous attention to detail and the inclusion of appropriate controls, ensuring the reliability of the findings. The authors utilized in vitro reconstitution to demonstrate that TTLL8 functions exclusively as a glycyl initiase, adding monoglycines at multiple positions on both α- and β-tubulin tails. In contrast, TTLL10 acts solely as a tubulin glycyl elongase, extending existing glycine chains. A notable finding is the differential substrate recognition between TTLL glycylases and TTLL glutamylases, highlighting a broader substrate promiscuity in glycylases compared to the more selective glutamylases. This observation aligns with the greater diversification observed among glutamylases. The study reveals a hierarchical mechanism of enzyme recruitment to microtubules, where TTLL10 binding necessitates prior monoglycylation by TTLL8. This binding is progressively inhibited by increasing polyglycine chain length, suggesting a self-regulatory mechanism for polyglycine chain length control. Furthermore, TTLL10 recruitment is enhanced by TTLL6-mediated polyglutamylation, illustrating a complex interplay between different tubulin modifications. In addition, they uncover that polyglutamylation stimulates TTLL10 recruitment without necessarily increasing glycylation on the same tubulin dimer, due to the potential for TTLLs to interact with neighboring tubulin dimers. This mechanism could lead to an enrichment of glycylation on the same microtubule, contributing to the complexity of the tubulin code. The article also addresses a significant challenge in the field: the difficulty of generating microtubules with controlled posttranslational modifications for in vitro studies. 
By identifying the specific modification sites and the interplay between TTLL activities, the authors provide a valuable tool for creating differentially glycylated microtubules. This advancement will facilitate further studies on the effects of glycylation on microtubule-associated proteins and the broader implications of the tubulin code. In summary, this study substantially contributes to our knowledge of posttranslational enzymes and their regulation, offering new insights into the biochemical mechanisms underlying microtubule modifications. The rigorous experimental approach and the novel findings presented make this a pivotal addition to the field of cellular and molecular biology.

    1. const dispatch = useDispatch()

      Just a React hook that returns the store's dispatch function, replicating the functionality of the first code snippet: call store.dispatch(…).

    1. We can live without Google, Facebook, Microsoft, Apple, Amazon. We can write code which is not on GitHub, which doesn’t run on an Amazon server and which is not displayed in a Google browser.

      ..., by means of widening the possibility space.

    2. So, what can we do? In the short term, it’s very simple. If you care about the commons, you should put your work under a strong copyleft license like the AGPL. That way, we will get back to building that commons we lost because of web services. If someone ever complains that a web service broke because of your AGPL code, reply that the whole web service should be under the AGPL too.
    3. When publicly distributed, the open-source code is hidden behind layers of indirection bypassing any packaging/integration effort, relying instead on virtualisation and downloading dependencies on the fly. Thanks to those strategies, corporations could benefit from open source code without any consequence. The open source code is, anyway, mostly hosted and developed on proprietary platforms.
    1. Reviewer #2 (Public Review):

      Summary:

      The manuscript by Kelbert et al. presents results on the involvement of the yeast transcription factor Sfp1 in the stabilisation of transcripts whose synthesis it stimulates. Sfp1 is known to affect the synthesis of a number of important cellular transcripts, such as many of those that code for ribosomal proteins. The hypothesis that a transcription factor can remain bound to the nascent transcript and affect its cytoplasmic half-life is attractive, but the methods used to demonstrate the half-life effects and the association of Sfp1 with cytoplasmic transcripts remain to be fully validated, as explained in my comments on the results below:

      Comments on methodology and results:

      (1) A two-hybrid-based assay for protein-protein interactions identified Sfp1, a transcription factor known for its effects on ribosomal protein gene expression, as interacting with Rpb4, a subunit of RNA polymerase II. Classical two-hybrid experiments depend on the presence of the tested proteins in the nucleus of yeast cells, suggesting that the observed interaction occurs in the nucleus. Unfortunately, the two-hybrid method cannot determine whether the interaction is direct or mediated by nucleic acids.

      (2) Inactivation of nup49, a component of the nuclear pore complex, resulted in the redistribution of GFP-Sfp1 into the cytoplasm at the temperature non-permissive for the nup49-313 strain, suggesting that GFP-Sfp1 is a nucleo-cytoplasmic shuttling protein. This observation confirmed the dynamic nature of the nucleo-cytoplasmic distribution of Sfp1. For example, a similar redistribution to the cytoplasm was previously reported following rapamycin treatment and under starvation (Marion et al., PNAS 2004). In conjunction with the observation of an interaction with Rpb4, the authors observed slower nuclear import kinetics for GFP-Sfp1 in the absence of Rpb4 when cells were transferred to a glucose-containing medium after a period of starvation. Since the redistribution of GFP-Sfp1 was abolished in an rpb1-1/nup49-313 double mutant, the authors concluded that Sfp1 localisation to the cytoplasm depends on transcription. The double mutant yeast cells may show a variety of non-specific effects at the restrictive temperature, and whether transcription is required for Sfp1 cytoplasmic localisation remains incompletely demonstrated.

      (3) Under starvation conditions, which led to the presence of Sfp1 in the cytoplasm and have previously been correlated with a decrease in the transcription of Sfp1 target genes, the authors observed that a plasmid-expressed GFP-Sfp1 accumulated in cytoplasmic foci. These foci were also labelled by P-body markers such as Dcp2 and Lsm1. The quality of the microscopic images provided does not allow one to determine whether Rpb4-RFP colocalises with GFP-Sfp1.

      (4) To understand to which RNA Sfp1 might bind, the authors used an N-terminally tagged fusion protein in a cross-linking and purification experiment. This method identified 264 transcripts for which the CRAC signal was considered positive and which mostly correspond to abundant mRNAs, including 74 ribosomal protein mRNAs or metabolic enzyme-abundant mRNAs such as PGK1. The authors did not provide evidence for the specificity of the observed CRAC signal, in particular, what would be the background of a similar experiment performed without UV cross-linking. In a validation experiment, the presence of several mRNAs in a purified SFP1 fraction was measured at levels that reflect the relative levels of RNA in a total RNA extract. Negative controls showing that abundant mRNAs not found in the CRAC experiment were clearly depleted from the purified fraction with Sfp1 would be crucial to assessing the specificity of the observed protein-RNA interactions. The CRAC-selected mRNAs were enriched for genes whose expression was previously shown to be upregulated upon Sfp1 overexpression (Albert et al., 2019). The presence of unspliced RPL30 pre-mRNA in the Sfp1 purification was interpreted as a sign of co-transcriptional assembly of Sfp1 into mRNA, but in the absence of valid negative controls, this hypothesis would require further experimental validation.

      (5) To address the important question of whether co-transcriptional assembly of Sfp1 with transcripts could alter their stability, the authors first used a reporter system in which the RPL30 transcription unit is transferred to vectors under different transcriptional contexts, as previously described by the Choder laboratory (Bregman et al. 2011). While RPL30 expressed under an ACT1 promoter was barely detectable, the highest levels of RNA were observed in the context of the native upstream RPL30 sequence when Rap1 binding sites were also present. Sfp1 showed better association with reporter mRNAs containing Rap1 binding sites in the promoter region. However, removal of the Rap1 binding sites from the reporter vector also led to a drastic decrease in reporter mRNA levels. Whether the fraction of co-purified RNA is nuclear and co-transcriptional or not cannot be inferred from these results.

      (6) To complement the biochemical data presented in the first part of the manuscript, the authors turned to the deletion or rapid depletion of SFP1 and used labelling experiments to assess changes in the rate of synthesis, abundance, and decay of mRNAs under these conditions. An important observation was that in the absence of Sfp1, mRNAs encoding ribosomal protein genes not only had a reduced synthesis rate but also an increased degradation rate. This important observation needs careful validation, as genomic run-on experiments were used to measure half-lives, and this particular method was found to give results that correlated poorly with other measures of half-life in yeast (e.g. Chappelboim et al., 2022 for a comparison). Similarly, the use of thiolutin to block transcription as a method of assessing mRNA half-life has been reported to be problematic, as thiolutin can specifically inhibit the degradation of ribosomal protein mRNA (Pelechano & Perez-Ortin, 2008). Specific repressible reporters, such as those used by Baudrimont et al. (2017), would need to be tested to validate the effect of Sfp1 on the half-life of specific mRNAs. Also, it would be very difficult to infer from the images presented whether the rate of deadenylation is altered by Sfp1.

      (7) The effects of SFP1 on transcription were investigated by chromatin purification with Rpb3, a subunit of RNA polymerase, and the results were compared with synthesis rates determined by genomic run-on experiments. The decrease in Pol II presence on transcripts in the absence of SFP1 was not accompanied by a marked decrease in transcript output, suggesting an effect of Sfp1 in ensuring robust transcription and avoiding RNA polymerase backtracking. To further investigate the phenotypes associated with the depletion or absence of Sfp1, the authors examined the presence of Rpb4 along transcription units compared to Rpb3. One effect of sfp1 deficiency was that this ratio, which decreased from the start of transcription towards the end of transcripts, increased slightly. The results presented are largely correlative and could arise from the focus on very specific types of mRNAs, such as those of ribosomal protein genes, which are sensitive to stress and are targeted by very active RNA degradation mechanisms activated, for example, under heat stress (Bresson et al., 2020).

      Strengths:

      - Diversity of experimental approaches used
      - Validation of large-scale results with appropriate reporters

      Weaknesses:

      - Choice of evaluation method to test mRNA half-life
      - Lack of controls for the CRAC results

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      The paper combines phenotypic and genomic analyses of the "sheltered load" (i.e. the accumulation of deleterious mutations linked to S-loci that are hidden from selection in the homozygous state) in Arabidopsis. The authors compare results to previous theoretical predictions concerning the extent of the load in dominant vs recessive S-alleles, and further develop exciting theory to reconcile differences between previous theory and observed results.

      Strengths:

      This is a very nice combination of theory and data to address a classical question in the field.

      We thank the reviewer for this positive feedback.

      Weaknesses:

      The "genetic load" is a poorly defined concept in general, and its quantification via the number of putatively deleterious mutations is quite difficult. Furthermore counting up the number of derived mutations at fully constrained nucleotides may not be a great estimate of the load, and certainly does not allow for evaluation of recessivity -- a concept critical to ideas concerning the sheltered load. Alternative approaches - including estimating the severity of mutations - could be helpful as well. This imperfection in available approaches to test theory must be acknowledged more strongly by the authors.

      As suggested by the reviewer, we implemented alternative approaches to estimate the severity of deleterious mutations and now report the results of SNPeff and SIFT4G analyses in Table S6. The results we obtained with these other metrics were overall very similar to those based on our previous counting of mutations at 0-fold and 4-fold degenerate sites. More generally, we tried to improve the presentation of our strategy to estimate the genetic load (clarified in lines 262-268, 271, 292-295, 297). In particular, we made it clear that our population genetic analysis cannot assess the recessivity of the observed mutations (lines 428-434).

      Reviewer #2 (Public Review):

      Summary:

      This study looks into the complex dominance patterns of S-allele incompatibilities in Brassicaceae, through which it attempts to learn more about the sheltering of deleterious load. I found several weak points in the analyses that diminished my excitement about the results. In particular, the way in which deleterious mutations were classified lacked the ability to distinguish the severity of the mutations and thus their expected associated dominance.

      First, we would like to clarify that our goal with this study is NOT to learn something about dominance of the linked deleterious mutations (we cannot). Instead, we compare the accumulation of deleterious mutations linked to dominant vs recessive S-ALLELES, but are agnostic regarding the dominance level of the LINKED mutations themselves. The rationale is that the different intensities of natural selection between dominant vs recessive S-alleles provide a powerful way to examine the process by which deleterious mutations are sheltered in general. We further clarified this aspect on lines 70-73 and 399-401.

      Second, as mentioned above in response to Reviewer 1, we complemented the analysis by predicting the severity of the deleterious mutations by SIFT4G and SNPeff. The results were largely consistent, with the exception that the number of sites included in SIFT4G was low, such that the statistical power was reduced (lines 296-300).

      Furthermore, the simulation approach could have provided this exact sort of insight but was not designed to do so, making this comparison to the empirical data also less than exciting for me.

      As explained above, studying dominance of the linked mutations we observed is an interesting research question (albeit a difficult one), but it was not our goal here. Instead, our study was designed as an empirical test of the predictions presented in Llaurens et al (2009), and we re-analysed some aspects of the model outcome to illustrate our points.

      We now better explain that we based our choice of parameters on the fact that in the theoretical study by Llaurens et al (2009), recessive deleterious mutations are predicted to accumulate in a much more straightforward manner (line 316-318).

      We now dedicate a paragraph of the discussion to explain how our stochastic simulations could be improved, and acknowledge that a full exploration of the interaction between dominance of the S-alleles and dominance of the linked deleterious mutations would be an interesting follow-up - albeit beyond the scope of our study (line 437-441).

      Major and minor comments:

      I think the introduction (or somewhere before we dive into it in the results) of the dominance hierarchy for the S-alleles needs a more in-depth explanation. Not being familiar with this beforehand really made this paper inaccessible to me until I then went to find out more before continuing. I would expect this paper to be broad enough that self-contained information makes it accessible to all readers. For example, lines 110-115 could be in the Introduction.

      We thank the reviewer for this useful remark. We now give a more comprehensive description of the dominance hierarchy and introduce the classes of dominance in A. lyrata already in the introduction, on lines 64-70.

      Along with my above comment, perhaps it is not my place to comment, but I find the paper not of a broad enough scope to be of interest to a broad readership. This S-allele dominance system is more than simple balancing selection, it is a very complex and specific form of dominance between several haplotypes, and the mechanism of dominance does not seem to be genetic. I am not sure that it thus extrapolates to broad comments on general dominance and balancing selection, e.g. it would not be the same as considering inversions and this form of balancing selection where we also expect recessive deleterious mutations to accumulate.

      We disagree with these interpretations by the reviewer, for two reasons:

      First, the mechanism of dominance is actually entirely genetic. In fact, we uncovered some years ago that it is based on the molecular interaction between small non-coding RNAs from dominant alleles and their target sites on recessive alleles (Durand et al. Science 2014, see lines 68-70). If there is something specific with this system, it is that the dominance phenomenon is better understood at the mechanistic level than in most other cases, but the resulting phenomenon in itself (a dominance hierarchy) is rather common.

      Second, the kind of variation in the intensity of linked selection created by this mechanism is actually a general phenomenon, so our results have broad relevance beyond our particular study system. We modified the introduction to explain this point more clearly, highlighting in particular the fact that the situation we study closely resembles the case of sex chromosomes, where X (or Z) chromosomes are genetically recessive and Y (or W) chromosomes are genetically dominant. We cite this example in lines 83-87 of the introduction and also several well-studied other examples on lines 480-489 of the discussion.

      It would have been particularly interesting, or a nice addition, to see deleterious mutations classed by something like SNPeff or GERP where you can have different classes of moderate to severe deleterious variants, which we would expect also to be more recessive the more deleterious they are. In line with my next comment on the simulations, I think relative differences between mutations expected to be more or less dominant may be even more insightful into the process of sheltering which may or may not be going on here.

      We agree with the reviewer, and as detailed above we have now integrated such analyses with SNPeff and SIFT4G (Table S6). These new results reinforce our conclusion that while S-allele dominance influences the fixation of deleterious mutations, it has no effect on their total number. See lines 270-272 and 296-300.

      In the simulations, h=0 and s=0.01 (as in Figure 5) for all deleterious mutations seems overly simplistic, and at the convenient end for realistic dominance. I think besides recessive lethals which we expect to be close to h=0 would have a much larger selection coefficient, and other deleterious mutations would only be partially recessive at such an s value. I expect this would change some of the simulation results seen, though to what degree I am not certain. It would be nice to at least check the same exact results for h=0.3 or 0.2 (or additionally also for recessive lethals, e.g. h=0 and s=-0.9). I would also disagree with the statement in line 677, many studies have shown, particularly those on balancing selection, that partially recessive deleterious mutations are not eliminated by natural selection and do play a role in population genetic dynamics. I am also not surprised that extinction was found for higher s values when the mutation rate for such mutations was very high and the distribution of s values was constant. An influx of such highly deleterious mutations is unlikely to ever let a population survive, yet that does NOT mean that in nature, the rare influx of such mutations does lead to them being sheltered. I find overall that the simulation results contribute very little, to none, to this paper, as without something more realistic, like a simultaneous distribution of s and h values, you cannot say which, if any class of these mutations are the ones expected to accumulate because of S-allele dominance.

      We understand that the previous version of our manuscript was confusing between dominance of the S-alleles and dominance of the linked deleterious mutations. We clarified that our study focuses on the effect of the former only (lines 99, 263-264 and 581-583).

      We agree that a complete exploration of the interaction between dominance of the S-alleles and dominance of the linked mutations being sheltered would have been an asset, but as explained above this is not the focus of our study. The previous work by Llaurens et al (2009) has already established that deleterious mutations can fix within S-allele lineages, especially when linked to dominant S-alleles, and when the number of S-alleles is large. Under the conditions they examined, deleterious mutations were much more strongly eliminated if not fully recessive (h=0 vs h=0.2), so for the present study we decided to simulate fully recessive mutations only. We now formally acknowledge the possibility that some complex interaction may take place between dominance of the S-alleles and dominance of the linked deleterious mutations (lines 440-442). However, as explained above we feel that fully exploring this complex interaction would require a detailed investigation, which is clearly beyond the scope of the present study.
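
      The contrast between fully recessive (h=0) and partially recessive (h=0.2) mutations discussed in this exchange can be illustrated with the textbook one-locus selection recursion. This is a deterministic sketch under random mating with genotype fitnesses 1, 1-hs, and 1-s, using s as a positive cost; it is not the stochastic S-locus model of Llaurens et al. (2009) nor the authors' C code, and the starting frequency and number of generations are arbitrary assumptions.

```python
def next_freq(q, s, h):
    """One generation of selection on a deleterious allele at frequency q.

    Genotype fitnesses: wild-type homozygote 1, heterozygote 1 - h*s,
    mutant homozygote 1 - s (random mating, no mutation, no drift).
    """
    p = 1.0 - q
    mean_fitness = p * p + 2 * p * q * (1 - h * s) + q * q * (1 - s)
    return (q * q * (1 - s) + p * q * (1 - h * s)) / mean_fitness


def iterate(q0, s, h, generations):
    """Allele frequency after a given number of generations of selection."""
    q = q0
    for _ in range(generations):
        q = next_freq(q, s, h)
    return q


# Arbitrary illustration values: start at 10% and follow 1000 generations.
for h in (0.0, 0.2):
    q_final = iterate(q0=0.1, s=0.01, h=h, generations=1000)
    print(f"h={h}: frequency after 1000 generations = {q_final:.4f}")
```

      Even a modest level of dominance exposes the mutation to selection in heterozygotes, so it is purged faster than a fully recessive one. This is consistent with the authors' statement that mutations were much more strongly eliminated when not fully recessive.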

      Rather they only show the disappointing or less exciting result that fully recessive, weakly deleterious mutations (which I again think do not even exist in nature as I said above) have minor, to no effect across the classes of S-allele dominance. They provide no insight into whether any type of recessive deleterious mutation can accumulate under the S-allele dominance hierarchy, and that is the interesting question at hand. I would either remove these simulations or redo them in another approach. The authors never mention what simulation approach was used, so I can only assume this is custom, in-house code. Yet I do not find that code provided on the github page. I do not know if the lack of a distribution for h and s values is then a choice or a programming limitation, but I see it as one that should be overcome if these simulations are meant to be meaningful to the results of the study.

      The code we used (in C) was adapted from the previous study by Llaurens et al. (2009), which at the time was unfortunately not deposited in a data repository. With the agreement of the authors of that study, this code is now available on GitHub:

      (https://github.com/leveveaudrey/model_ssi_Llaurens; line 723).

      It is correct that our simulations were not aimed at determining whether “any type of recessive deleterious mutation can accumulate”, but we strongly believe that they help interpret the observations made in the genomic data.

      Recommendations for the authors:

      Notes from the editor:

      I found Table 1 confusing, with column headings of observed proportion but perhaps numbers reflecting counts.

      Thank you for pointing out this confusion. There was indeed an error in the last column, which we have now corrected.

      I found Figure 2 a bit hard to parse, with the vertical lines being unclear and the x-axis ticks of insufficient resolution to evaluate the physical extent of the signals.

      We increased the size of the x-axis label and detailed it in Figure 2, which is now hopefully clearer. We also increased the size of the vertical lines.

      Finally, I wonder, given the rapid decay of signal in lyrata, whether 25kb is the right choice for evaluating load and whether the pattern may look different on a smaller scale.

      It is true that the signal decays rapidly in A. lyrata, as can be seen in the haplotype structure analysis and in line with our previous analysis of the same populations Le Veve et al (MBE 2023; in this study we explored the effect of the choice of the size of the chromosomal region analyzed; lines 266-269). However, for the sake of comparison, we prefer to stick to the same window size. The fact that we still see an effect of dominance in spite of the lower statistical power associated with the more rapid decay (because a smaller number of genes is expected to be impacted) actually reinforces our conclusions.

      Reviewer #1 (Recommendations For The Authors):

      I have a few additional suggestions to improve the manuscript.

      (1) How does the load linked to the S-locus compare to that observed in other genomic regions? It would be useful to provide a comparison of the results quantified in Figures three and four to comparable genomic regions unlinked to the S-locus. How severe is the linked load?

      This comparison to the genomic background was actually the core of our previous study (Le Veve et al MBE 2023), which was based on the same populations. This analysis revealed that polymorphism at the 0-fold degenerate sites was more than twice as high in the 25kb immediately flanking the S-locus as in a series of 100 unlinked control regions. The main focus of the present study is on the effect of linkage to particular S-alleles (which was not possible previously because haplotypes had to be phased).

      (2) Details of the GLM for data underlying Figures 3 and 4 are somewhat unclear. Is the key explanatory variable (Dominance) treated as continuous? Categorical? Ordinal etc…

      Dominance is considered as a continuous variable. We specify this in line 162 of the results, in the legends of Figures 3 and 4, in the Material and Method (lines 627 and 660) and in the legend of Table S4.

      (3) I had some trouble understanding the two different p-values in columns five and six of table one. Please provide more detail.

      We understand that the two p-values in Table 1 were confusing. The first was related to the binomial test and the second to the permutation test. To be consistent with the rest of the manuscript, we retained only the p-value of the permutation test.

      (4) As mentioned in the "weaknesses" above, the authors should be more clear about what they are quantifying. They are explicitly counting the number of variants at 0-fold degenerate sites as a proxy for the genetic load. How good this proxy is is unclear. The most egregious misstatement here was on line 314 in which they make reference to the "total load." However, this limitation should be acknowledged throughout the manuscript and deserves more attention in the methods and discussion.

      As mentioned above, we now integrate additional methods to define and quantify the load (SIFT4G and SNPeff), which reinforced our previous conclusions (lines 271-272, 297-302).

      We clarified our wording and replaced the mention of “total load” by “mean number of linked deleterious mutations per copy of S-allele” (line 324-325). In the discussion we tried to better explain the limitations of approaches to estimate the genetic load (line 431-437).

      Reviewer #2 (Recommendations For The Authors):

      Line 60, it should be specified that this is only for recessive deleterious mutations.

      Non-recessive deleterious mutations would certainly not be expected to accumulate.

      As explained in detail above, the question of whether and how non-recessive deleterious mutations can accumulate when linked to the S-locus is difficult and would in itself deserve a full treatment, which is clearly beyond the scope of the present study. We clarified this point on line 56.

    1. Reviewer #1 (Public Review):

      Summary:

      This paper presents a cognitive model of out-of-distribution generalisation, where the representational basis is grid-cell codes. In particular, the authors consider the tasks of analogies, addition, and multiplication, and the out-of-distribution tests are shifting or scaling the input domain. The authors utilise grid cell codes, which are multi-scale as well as translationally invariant due to their periodicity. To allow for domain adaptation, the authors use DPP-A which is, in this context, a mechanism of adapting to input scale changes. The authors present simulation results demonstrating that this model can perform out-of-distribution generalisation to input translations and re-scaling, whereas other models fail.

      Strengths:

      This paper makes the point it sets out to - that there are some underlying representational bases, like grid cells, that when combined with a domain adaptation mechanism, like DPP-A, can facilitate out-of-distribution generalisation. I don't have any issues with the technical details.

      Weaknesses:

      The paper does leave open the bigger questions of 1) how one learns a suitable representation basis in the first place, 2) how to have a domain adaptation mechanism that works in more general settings other than adapting to scale. Overall, I'm left wondering whether this model is really quite bespoke or whether there is something really general here. My comments below are trying to understand how general this approach is.

      COMMENTS

      This work relies on being able to map inputs into an appropriate representational space. The inputs were integers so it's easy enough to map them to grid locations. But how does this transfer to making analogies in other spaces? Do the inputs need to be mapped (potentially non-linearly) into a space where everything is linear? In general, what are the properties of the embedding space that allows the grid code to be suitable? It would be helpful to know just how much leg work an embedding model would have to do.

      It's natural that grid cells are great for domain shifts of translation, rescaling, and rotation, because they themselves are multi-scaled and are invariant to translations and rotations. But grid codes aren't going to be great for other types of domain shifts. Are the authors saying that to make analogies grid cells are all you need? If not then what else? And how does this representation get learned? Are there lots of these invariant codes hanging around? And if so how does the appropriate one get chosen for each situation? Some discussion of the points is necessary as otherwise, the model seems somewhat narrow in scope.

      For effective adaptation of scale, the authors needed to use DPP-A. Being that they are relating to brains using grid codes, what processes are implementing DPP-A? Presumably, a computational module that serves the role of DPP-A could be meta-learned? I.e. if they change their task set-up so it gets to see domain shifts in its training data an LSTM or transformer could learn to do this. The presented model comparisons feel a bit of a straw man.

      I couldn't see it explained exactly how R works.

    2. Reviewer #2 (Public Review):

      Summary:

      This paper presents a model of out-of-distribution (OOD) generalization that focuses on modeling an analogy task, in which translation or scaling is tested with training in one part of the space and testing in other areas of the space progressively more distant from the training location. Similar tests were performed on arithmetic including addition and multiplication, and similarly impressive results appear for addition but not multiplication. The authors show that a grid cell coding scheme helps performance on these analogy and arithmetic tasks, but the most dramatic increase in performance is provided by a complex algorithm for determinantal point process attention (DPP-A) based on maximizing the determinant of the covariance matrix of the grid embeddings.

      Strengths:

      The results appear quite impressive. The results for generalization appear quite dramatic when compared to other coding schemes (i.e. one-hot) or when compared to the performance when ablating the DPP-A component but retaining the same inference modules using LSTM or transformers. This appears to be an important result in terms of generalization of results in an analogy space.

      Weaknesses:

      There are a number of ways that its impact and connection to grid cells could be enhanced. From the neuroscience perspective, the major comments concern making a clearer and stronger connection to the actual literature on grid cells and grid cell modeling, and discussing the relationship of the complex DPP-A algorithm to biological circuits.

      Major comments:

      1. They should provide more citations to other groups that have explored analogy using this type of task. Currently, they only cite one paper (Webb et al., 2020) by their own group in their footnote 1 which used the same representation of behavioral tasks for generalization of analogy. It would be useful if they could cite other papers using this simplified representation of analogy and also show the best performance of other algorithms from other groups in their figures, so that there is a sense of how their results compare to the best previous algorithm by other groups in the field (or they can identify which of their comparison algorithms corresponds to the best of previously published work).

      2. While the grid code they use is very standard and based on grid cell researchers (Bicanski and Burgess, 2019), the rest of the algorithm doesn't have a clear claim on biological plausibility. It has become somewhat standard in the field to ignore the problem of how the brain could biologically implement the latest complex algorithm, but it would be useful if they at least mention the problem (or difficulty) of implementing DPP-A in a biological network. In particular, does maximizing the determinant of the covariance matrix of the grid code correspond to something that could be tested experimentally?

      3. Related to major comment 2., it would be very exciting if they could show what the grid code looks like after the attentional modulation inner product x^T w has been implemented. This could be highly useful for experimental researchers trying to connect these theoretical simulation results to data. This would be most intuitive to grid cell researchers if it is plotted in the same format as actual biological experimental data - specifically which grid cell codes get strengthened the most (beyond just the highest frequencies).

      4. To enhance the connection to biological systems, they should cite more of the experimental and modeling work on grid cell coding (for example on page 2 where they mention relational coding by grid cells). Currently, they tend to cite studies of grid cell relational representations that are very indirect in their relationship to grid cell recordings (i.e. indirect fMRI measures by Constantinescu et al., 2016 or the very abstract models by Whittington et al., 2020). They should cite more papers on actual neurophysiological recordings of grid cells that suggest relational/metric representations, and they should cite more of the previous modeling papers that have addressed relational representations. This could include work on using grid cell relational coding to guide spatial behavior (e.g. Erdem and Hasselmo, 2014; Bush, Barry, Manson, Burgess, 2015). This could also include other papers on the grid cell code beyond the paper by Wei et al., 2015 - they could also cite work on the efficiency of coding by Sreenivasan and Fiete and by Mathis, Herz, and Stemmler.

    1. writing code, reviewing code, deploying configs to harden environments, reading CVEs to know just how bad that vulnerability in our environment is and where to prioritize it in patching and what it could affect, trying to make sense of logs to determine if that oddity is an indicator of compromise or not
    1. AmbientLight in Three.js

      Overview: AmbientLight is a type of light in Three.js that illuminates all objects in the scene equally without a specific direction. This light type is generally used to provide a basic level of illumination across the entire scene, helping to ensure that all objects are visible, regardless of their position or orientation.

      Characteristics:

      • Global Illumination: It illuminates all objects equally, which means it doesn't create shadows or highlights.
      • No Shadows: Since AmbientLight does not have a direction, it cannot be used to cast shadows.

      Code Example

      Here's a simple example of how to use AmbientLight in a Three.js scene:

      ```javascript
      // Import Three.js
      import * as THREE from 'three';

      // Create a new scene
      const scene = new THREE.Scene();

      // Create an AmbientLight with a soft white color
      const light = new THREE.AmbientLight(0x404040); // soft white light

      // Add the light to the scene
      scene.add(light);
      ```

      In this example:

      • We create a new THREE.AmbientLight with a color value of 0x404040, which is a soft white light.
      • We add the light to the scene using scene.add(light).

      Constructor

      The AmbientLight constructor in Three.js takes two optional parameters: color and intensity.

      ```javascript
      const light = new THREE.AmbientLight(color, intensity);
      ```

      • color: (optional) The RGB color of the light, represented as an integer. The default value is 0xffffff (white).
      • intensity: (optional) The intensity or strength of the light. The default value is 1.

      Example with Parameters:

      ```javascript
      // Create an AmbientLight with a specific color and intensity
      const light = new THREE.AmbientLight(0xff0000, 0.5); // red light with half intensity

      // Add the light to the scene
      scene.add(light);
      ```

      In this example:

      • We create a new THREE.AmbientLight with a red color (0xff0000) and an intensity of 0.5.
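
      To make the roles of color and intensity concrete, here is a minimal sketch in plain JavaScript of how an ambient term can be reasoned about: the light color is scaled by the intensity and multiplied channel by channel with the surface color. This is a simplified model for intuition only, not Three.js's actual shader code; it ignores color management, tone mapping, and any other lights. The helper names hexToRgb and ambientContribution are hypothetical and are not part of the Three.js API.

      ```javascript
      // Split a 24-bit hex color (e.g. 0xff0000) into normalized [r, g, b] in 0..1.
      function hexToRgb(hex) {
        return [(hex >> 16) & 0xff, (hex >> 8) & 0xff, hex & 0xff].map((c) => c / 255);
      }

      // Simplified ambient term (an assumption for illustration):
      // light color * intensity * surface color, per channel.
      // No direction or surface normal is involved, which is why
      // ambient light alone produces no shading or shadows.
      function ambientContribution(lightHex, intensity, surfaceHex) {
        const light = hexToRgb(lightHex);
        const surface = hexToRgb(surfaceHex);
        return light.map((c, i) => c * intensity * surface[i]);
      }

      // Red light at half intensity falling on a white surface.
      console.log(ambientContribution(0xff0000, 0.5, 0xffffff)); // → [0.5, 0, 0]
      ```

      Because the result does not depend on position or orientation, every surface with the same base color receives the same contribution, which matches the "illuminates all objects equally" behavior described above.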

      Properties

      AmbientLight inherits properties from the base Light class. Some common properties include:

      • color: The color of the light.
      • intensity: The intensity of the light.

      Specific to AmbientLight:

      • .isAmbientLight: This is a read-only boolean property that allows you to check if an object is an instance of AmbientLight.

      Example:

      ```javascript
      if (light.isAmbientLight) {
        console.log('This light is an AmbientLight.');
      }
      ```

      Methods

      AmbientLight also inherits methods from the base Light class. These methods allow you to interact with and manipulate the light in various ways.

      For example, you can set the color and intensity of the light:

      ```javascript
      // Set the color of the light
      light.color.set(0x00ff00); // green light

      // Set the intensity of the light
      light.intensity = 0.8;
      ```

      Source

      The AmbientLight class is defined in src/lights/AmbientLight.js within the Three.js library. This is where the implementation details for the AmbientLight class can be found.

      Summary

      AmbientLight is a basic light source in Three.js that provides global illumination to all objects in the scene without casting shadows. It's useful for ensuring that all objects are visible and can be combined with other types of lights to achieve more complex lighting effects.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review): 

      Wang, He et al have constructed a comprehensive single nucleus atlas for the gills of the deep sea Bathymodioline mussels, which possess intracellular symbionts that provide a key source of carbon and allow them to live in these extreme environments. They provide annotations of the different cell states within the gills, shedding light on how multiple cell types cooperate to give rise to the emergent functions of the composite tissues and the gills as a whole. They pay special attention to characterizing the bacteriocyte cell populations and identifying sets of genes that may play a role in their interaction with the symbiotes. 

      Wang, He et al sample mussels from 3 different environments: animals from their native methane rich environment, animals transplanted to a methane-poor environment to induce starvation and animals that have been starved in the methane-poor environment and then moved back to the methane-rich environment. They demonstrated that starvation had the biggest impact on bacteriocyte transcriptomes. They hypothesize that the up-regulation of genes associated with lysosomal digestion leads to the digestion of the intracellular symbiont during starvation, while the non-starved and reacclimated groups more readily harvest the nutrients from symbiotes without destroying them. Further work exploring the differences in symbiote populations between ecological conditions will further elucidate the dynamic relationship between host and symbiote. This will help disentangle specific changes in transcriptomic state that are due to their changing interactions with the symbiotes from changes associated with other environmental factors. 

      This paper makes available a high quality dataset that is of interest to many disciplines of biology. The unique qualities of this non-model organism and collection of conditions sampled make it of special interest to those studying deep sea adaptation, the impact of environmental perturbation on Bathymodioline mussel populations, and intracellular symbiotes. The authors also use a diverse array of tools to explore and validate their data. 

      Reviewer #2 (Public Review): 

      Wang, He et al. shed insight into the molecular mechanisms of deep-sea chemosymbiosis at the single-cell level. They do so by producing a comprehensive cell atlas of the gill of Gigantidas platifrons, a chemosymbiotic mussel that dominates the deep-sea ecosystem. They uncover novel cell types and find that the gene expression of bacteriocytes, the symbiont-hosting cells, supports two hypotheses of host-symbiont interactions: the "farming" pathway, where symbionts are directly digested, and the "milking" pathway, where nutrients released by the symbionts are used by the host. They perform an in situ transplantation experiment in the deep sea and reveal transitional changes in gene expression that support a model where starvation stress induces bacteriocytes to "farm" their symbionts, while recovery leads to the restoration of the "farming" and "milking" pathways. 

      A major strength of this study includes the successful application of advanced single nucleus techniques to a non-model, deep sea organism that remains challenging to sample. I also applaud the authors for performing an in situ transplantation experiment in a deep sea environment. From gene expression profiles, the authors deftly provide a rich functional description of G. platifrons cell types that is well-contextualized within the unique biology of chemosymbiosis. These findings offer significant insight into the molecular mechanisms of deep-sea host-symbiont ecology, and will serve as a valuable resource for future studies into the striking biology of G. platifrons. 

      The authors' conclusions are generally well-supported by their results. However, I recognize that the difficulty of obtaining deep-sea specimens may have impacted experimental design and no replicates were sampled. 

      It is notable that the Fanmao cells were much more sparsely sampled. It appears that fewer cells were sequenced, resulting in the Starvation and Reconstitution conditions having 2-3x more cells after doublet filtering. These discrepancies also are reflected in the proportion of cells that survived QC, suggesting a distinction in quality or approach. However, the authors provide clear and sufficient evidence via bootstrapping that batch effects between the three samples are negligible. While batch effect does not appear to have affected gene expression profiles, the proportion of cell types may remain sensitive to sampling techniques, and thus interpretation of Fig. S12 must be approached with caution. 

      Reviewer #3 (Public Review): 

      Wang et al. explored the unique biology of the deep-sea mussel Gigantidas platifrons to understand fundamental principles of animal-symbiont relationships. They used single-nucleus RNA sequencing, together with validation and visualization, to characterize many of the important cellular and molecular players that allow these organisms to survive in the deep sea. They demonstrate that a diversity of cell types supports the structure and function of the gill, including bacteriocytes, specialized epithelial cells that host sulfur-oxidizing or methane-oxidizing symbionts, as well as a suite of other cell types including supportive cells, ciliary, and smooth muscle cells. By transplanting mussels from a methane-rich habitat to methane-limited environments, the authors showed that starved mussels may consume endosymbionts, whereas mussels in methane-rich environments upregulated genes involved in glutamate synthesis. These data add to the growing body of literature showing that organisms control their endosymbionts in response to environmental change. 

      The conclusions of the data are well supported. The authors adapted a technique that would have been technically impossible in their field environment by preserving the tissue and then performing nuclear isolation after the fact. The use of single-nucleus sequencing opens the possibility of new cellular and molecular biology that is not possible to study in the field. Additionally, the in-situ data (both WISH and FISH) are high-quality and easy to interpret. The use of cell-type-specific markers along with a symbiont-specific probe was effective. Finally, the SEM and TEM were used convincingly for specific purposes in the case of showing the cilia that may support water movement. 

      The one particular area for future exploration surrounds the concept of a proliferative progenitor population within the gills. The authors recover molecular markers for these putative populations, and future work will uncover whether these are indeed proliferative cells that contribute to symbiont colonization. 

      Overall the significance of this work is identifying the relationship between symbionts and bacteriocytes and how these host bacteriocytes modulate their gene expression in response to environmental change. It will be interesting to see how similar or different these data are across animal phyla. For instance, the work of symbiosis in cnidarians may converge on similar principles, or there may be independent ways in which organisms have been able to solve these problems. 

      We extend our sincere gratitude to all the reviewers for their positive comments and kind words. We highly value the substantial efforts they made in helping us improve and enhance our manuscript. Additionally, we thank the reviewers for pointing out the limitations of our current study, which will guide us in improving our future research.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      This study system is so interesting and this is a truly unique and exciting dataset. Most of my suggestions are aimed at improving readability and making it more accessible for a broader audience, since I predict many fields will find it interesting. 

      Line 60: which species of mussel? Is this the same one? 

      We appreciate the comments from the reviewer. The reference here is to deep-sea bathymodiolin mussels, which, in most cases, possess enlarged gill filaments that accommodate symbionts.

      Line 237-230: citation of previous findings missing 

      We appreciate the comments from the reviewer. After carefully reviewing these paragraphs, we believe that all the previous findings have now been properly cited.

      Line 256: it might be a good idea to give a brief description of what slingshot analysis is here 

      We appreciate the comments from the reviewer. We have revised the corresponding part of our manuscript to make it clear.

      This part of the manuscript now reads: “We performed Slingshot analysis, which uses a cluster-based minimum spanning tree (MST) and a smoothed principal curve to determine the developmental path of cell clusters. The result shows that the PEBZCs might be the origin of all gill epithelial cells, including the other two proliferation cells (VEPC and DEPC) and bacteriocytes (Supplementary Fig. S6).” Lines 203-207 of the revised manuscript.

      Line 289: Wording is a bit confusing- what is meant by morphological analysis?

      We acknowledge that our wording might be a bit confusing here. We are referring to the TEM ultrastructural analysis. Therefore, we have changed “morphological analysis” to “ultrastructural analysis.” Line 231 in the revised manuscript.

      Line 351-354: how did you calculate distances? How many dimensions were used? 

      We calculated the centroid coordinates for each cell type in each state on the 2-dimensional UMAP plot (Fig. 6A). Then, for each cell type, we determined the Euclidean distance between the centroid coordinates of each pair of states. We have revised the manuscript with this more detailed description. Line 292-295 of revised manuscript.
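
      The distance calculation described above can be sketched as follows. This is an illustrative reconstruction in JavaScript, not the authors' code: the record shape ({ cellType, state, umap }), the function names, and the toy coordinates are all assumptions made for the example.

      ```javascript
      // Mean (x, y) of a list of 2-D points.
      function centroid(points) {
        const sum = points.reduce(([sx, sy], [x, y]) => [sx + x, sy + y], [0, 0]);
        return [sum[0] / points.length, sum[1] / points.length];
      }

      // Euclidean distance between two 2-D points.
      function euclidean([x1, y1], [x2, y2]) {
        return Math.hypot(x1 - x2, y1 - y2);
      }

      // cells: array of { cellType, state, umap: [x, y] } (hypothetical record shape).
      // Returns { cellType: { "stateA|stateB": distance between state centroids } }.
      function centroidDistances(cells) {
        const groups = new Map(); // cellType -> Map(state -> list of UMAP points)
        for (const { cellType, state, umap } of cells) {
          if (!groups.has(cellType)) groups.set(cellType, new Map());
          const byState = groups.get(cellType);
          if (!byState.has(state)) byState.set(state, []);
          byState.get(state).push(umap);
        }
        const result = {};
        for (const [cellType, byState] of groups) {
          const states = [...byState.keys()];
          result[cellType] = {};
          // Compare every unordered pair of states within the same cell type.
          for (let i = 0; i < states.length; i++) {
            for (let j = i + 1; j < states.length; j++) {
              const a = centroid(byState.get(states[i]));
              const b = centroid(byState.get(states[j]));
              result[cellType][`${states[i]}|${states[j]}`] = euclidean(a, b);
            }
          }
        }
        return result;
      }

      // Toy coordinates, invented for illustration only.
      const toy = [
        { cellType: 'bacteriocyte', state: 'Fanmao', umap: [0, 0] },
        { cellType: 'bacteriocyte', state: 'Fanmao', umap: [2, 0] },
        { cellType: 'bacteriocyte', state: 'Starvation', umap: [4, 4] },
        { cellType: 'bacteriocyte', state: 'Starvation', umap: [4, 4] },
      ];
      console.log(centroidDistances(toy));
      ```

      In the toy data the two centroids are (1, 0) and (4, 4), so the distance is 5. Note that distances measured on a 2-dimensional UMAP embedding depend on the embedding itself, so a larger value indicates a larger shift in the plot rather than a calibrated transcriptomic difference.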

      Line 462: identify -> identified 

      We apologize for our mistake and appreciate the reviewer’s kind assistance with proofreading. The typo has been corrected in the new version. Line 396 of the revised manuscript.

      Line 509: what does the size of the dot represent? 

      In this context, the color and intensity of each dot represent a specific gene’s expression level in the single-cell cluster. The dot size is universal and therefore does not convey a specific meaning.

      Fig 3A: What is the blue cluster highlighted? 

      We apologize for our mistake. The label for the teal box was missing. We have corrected our mistake in the revised manuscript.

      Fig 3K: Wording in key is confusing. 

      We have modified our description of Figure 3K in the figure legends. Now it reads: “Schematic of water flow agitated by different ciliary cell types. The color of arrowheads corresponds to water flow potentially influenced by specific types of cilia, as indicated by their color code in Figure 3A.” Lines 462-464 in the revised manuscript.

      Fig 5B: which population of mussels was used to take these images? 

      Mussels from the “Fanmao” (methane-rich) site were used to take these images. We have revised our Materials and Methods to make this clear. Lines 602-603 of the revised manuscript.

      Fig 5E,5G,5H: panels not referenced in text 

      We apologize for our mistake and appreciate the reviewer’s thorough reading. This error has been corrected in the new version of the manuscript. Line 233 of the revised manuscript.

      Reviewer #2 (Recommendations For The Authors): 

      Minor comments: 

      Fig. 3A - the teal box in the legend lacks a label 

      We apologize for our mistake. The label for the teal box was missing. We have corrected our mistake in the revised manuscript.

      Reviewer #3 (Recommendations For The Authors): 

      My enthusiasm for the manuscript remains high and I appreciate the authors' care in responding to the various reviewer questions and concerns. 

      In regards to the cell proliferation results, I have modified my public review and look forward to your future work in this area. The data for both pHistone H3 and anti PCNA are compelling! 

      One typo I did catch occurs on line 520. I believe you meant to say "outer" not "otter." 

      We apologize for our mistake and appreciate the reviewer’s kind assistance with proofreading. The typo has been corrected in the new version.

    1. citing a lack of comprehension for AI-generated code translation and “spotting errors in‘foreign’ code” as challenges.

      I wonder if similar challenges can be found in LLM-powered data analysis. Here the problem would be the ease of verification and of spotting inaccuracies in the AI's reasoning behind its decisions.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Response to Reviewers

      We thank the reviewers for their comments and suggestions, which we think are helpful and will improve the manuscript. We intend to address them with the changes and planned revisions below.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Bello et al look at the SNP rs28834970 associated with Alzheimer's disease (AD), with C being the risk allele, on chromatin accessibility and expression of a nearby gene, PTK2B, in microglia. Their contention is that the single SNP affects chromatin accessibility and binding of the transcription factor CEBP[beta] in an intronic region of PTK2B and thereby affects PTK2B expression. I had a few questions that I think are critical to be addressed. Please note that my numbering of panels is based on the figures, not the legends, which do not seem to quite agree with each other. There are also some figure legends that say "IFNg" while the figures say "LPS", which should be fixed.

      We apologise for the mistake in the figure legend that made this confusing, which we have now revised.

      The abstract says that editing a line that is homozygous for protective alleles to homozygous for risk results in "subtle downregulation of PTK2B expression". It isn't clear to me that the presented data fully supports this contention, which is central to the argument of the paper. In figure 2e, the authors show in both RNAseq and ddPCR that there is numerically lower PTK2B expression but this is not indicated to be statistically significant by one-way paired ANOVA. If there is no nominally significant difference in the edited lines, compared to the proposed significant differences in lines carrying the full risk haplotype (figure 1), then it would not seem sensible to ascribe the effects to the single edited base pair.

      We agree with the reviewer that given the effect of the SNP on PTK2B expression in the edited lines is small and only significant in macrophages, we should not interpret the effects to be mediated solely through PTK2B expression, and have substantially reworded the manuscript accordingly.

      Whilst the effects in the eQTL analysis are significant, it is worth noting that this is likely due to the much larger number of donors (133-217) giving greater power to detect the subtle changes in expression (~1.1 to 2 fold in eQTL). This change is of a similar magnitude in our SNP edited lines (~1.2 fold in SNP edited lines) as would be expected of most common regulatory variants so we believe that it could be the primary causal variant. However, we cannot exclude that other variants in the haplotype could contribute to the effect, so have also reworded accordingly to make this clear.

      Given this uncertainty about the overall strength of effect of the single base pair change it would seem important to evaluate the proposed mechanism of CEBPb binding. It wasn't clear whether the ATAC-seq data summarized in the volcano plot in 2C is proposed to be a cause or a consequence of the CEBPb binding change. I am assuming that the 'fold change' estimate here is CC compared to TT, which would be consistent with direction of effect in figure 1, but please clarify.

      We apologise for the mistake in the figure legend that made this confusing, which we have now revised along with clarification in the revised text. It is difficult to be sure whether changes in chromatin accessibility are a cause or consequence of CEBPb binding, but the fact that the binding of CEBPb is increased in the CC allele (Fig 2a, Fig 2c), that the C allele better matches the consensus sequence (Fig 2b) and there is increased chromatin accessibility (Fig 2a, Supp Fig 3b) suggests that CEBPb binding is causing the formation of the region of chromatin accessibility.

      In contrast to the subtle effects at PTK2B, the global transcriptional effects in figure 3 look quite strong. Are any of these changes dependent on PTK2B, that is to say, are they mimicked by partial suppression of PTK2B expression or activity?

      We agree that the downstream effects of the SNP are much stronger than the effects on PTK2B expression, and we have substantially reworded the manuscript to make it clear that we are unsure that the effects of the SNP are all mediated via PTK2B.

      However, we note that there is evidence in the literature of a loss in CCL4 and CCL5 expression upon PTK2B knockout in macrophages (https://www.nature.com/articles/s41467-021-27038-5) and inhibition of PTK2B in monocytes results in a reduction in CCL5 and CXCL1 (https://www.nature.com/articles/s41598-019-44098-2) consistent with our observations.

      Experiments to manipulate PTK2B expression in microglia and readout changes at the RNA level would take a few months to complete, but we would be willing to do this if the reviewer felt this was necessary.

      Finally, in figure 4, it should be clarified as to why lower expression of PTK2B would be expected to have a detrimental effect on Alzheimer's risk. If understood correctly, and again fixing the figure legends would be helpful, the CC edited lines (risk) have lower chemokine induction than the unedited TT lines.

      We apologise for the error in this figure which we have corrected in the revised version. You are correct that the CC lines have a lower chemokine level in both unstimulated and stimulated cells, and we have now discussed further how this may be linked to increased disease risk.

      "Even though overexpression of these chemokines is characteristic of neuroinflammation, correlated with disease progression and found in late stages of AD, knockout of chemokines, such as CCL2, and chemokine receptors, such as CCR2 and CCR5, in mice is associated with increased Aβ deposition and accumulation [47,50-52,107]. It has also been found that patients carrying CCR5Δ32 mutation, which prevents CCR5 surface expression, develop AD at a younger age[108]. Therefore, we hypothesize that in individuals carrying the C/C allele of rs28834970 downregulation of these chemokines in macrophages and microglia harbouring the C/C allele of rs28834970 affects Aβ-induced microglia chemotaxis, leukocytes recruitment and clearance of Aβ, and may increase the risk of developing symptomatic AD"

      Reviewer #1 (Significance (Required)):

      Going from GWAS hits, which represent blocks of high LD inherited variants, to single functional variants is a difficult problem in human genetics. The current paper attempts to isolate the effect of a single variant within an LD block on IPSC derived macrophages and microglia. This idea might be useful in nominating PTK2B as a therapeutic target for AD, although there is some question in my mind as to direction of effect.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      SUMMARY: In this manuscript the authors explore the biological effects of an intronic SNP in the PTK2B gene, previously shown to be associated with late onset Alzheimer's disease (AD) risk. Based on the likely effect of the SNP locus on PTK2B expression in the macrophage lineage, the authors explore the consequences of introducing with the Crispr/Cas9 technique the biallelic SNP base change (C/C vs T/T) in a human IPSC line that is then differentiated into macrophages or microglia. They observe that C/C increases chromatin accessibility and CEBPb binding in comparison to T/T, with a slight decrease in PTK2B expression, significant in macrophages but not in microglia. The authors then investigate the transcriptome changes induced by the C/C mutation and find alteration in many genes, including a decreased expression of a number of cytokine or receptor proteins involved in inflammatory responses. The authors also mention a decreased effect on IFNg-induced reduced mobility but the data are missing (see Figure errors below). Overall the authors propose that the risk SNP is associated with a decreased PTK2B expression and hypothesize a link between this change and a decreased function of macrophages/microglia that may contribute to AD pathology.

      MAJOR COMMENTS

      1- The authors claim that their results show that the investigated SNP has causal effects in "microglial function" (Title) and in Alzheimer's disease (AD) (Abstract, 2nd sentence: "Here we validate a causal single nucleotide polymorphism (SNP) associated with an increased risk of Alzheimer's disease"). The word "causal" is repeated many times. However, the authors should qualify their claim with respect to AD. Their results do show that the SNP has an effect on chromatin accessibility, CEBP binding, PTK2B expression and transcriptome, but the link between these changes is not formally demonstrated and their potential role in AD-like phenotype is not explored. The "causal" role is not formally and logically demonstrated. It remains an interesting, plausible hypothesis and the results provide strong arguments in support of that hypothesis but do not prove it, yet.

      Concerning the title, "causal effects on microglial function" is awkward, anything that has effects is logically "causal" in these effects. The title should be "... has effects on microglial functions" or "... alters microglial function".

      We agree with the reviewer that given the effect of the SNP on PTK2B expression in the edited lines is small and only significant in macrophages, we should not interpret the effects to be mediated solely through PTK2B expression, or that they cause AD. We have substantially reworded the manuscript throughout to account for this.

      2- One major difficulty in the results is to link the slight decrease in PTK2B transcript, which is only significant in macrophages, with the rest of the phenotype. Because what matters to make this link is not the mRNA but the protein, and because mRNA levels are often not strictly correlated with the protein levels, the authors should measure the PTK2B/PYK2 protein levels in their differentiated cell lines in basal conditions and following activation (as they do for other readouts) using immunoblotting. A robust and significant diminution in PYK2 protein would strongly support its role in linking PTK2B expression and transcriptome change.

      We have performed preliminary analyses of PTK2B expression by Western blot in these cell lines after differentiation, but were unable to observe a significant change in abundance in the edited cell lines. This is not unexpected given the results at the RNA level, since the effect size of this common regulatory variant is likely very small (estimated to be ~1.2 fold from the eQTL analysis), and likely within the variability of this assay.

      As mentioned above, we have reworded the manuscript to avoid interpreting that the effects of rs28834970 are mediated solely through effects on PTK2B expression. We think that an experiment to manipulate PTK2B levels (see next point) may be a better way to demonstrate whether these effects are mediated through PTK2B expression.

      An optional additional key experiment would be to reverse the transcriptome phenotype by increasing the expression of PTK2B (e.g. by cDNA transfection). Note that these points are important because an alternative hypothesis to explain the effects of C/C mutation on macrophage function would be that the C/C mutation has a long distance effect on other chromatin regions with key role in regulating these cells.

      We agree that this would be a valuable experiment, and are planning additional experiments to investigate the effect of manipulating PTK2B levels (through knockout) on microglia.

      3- The manuscript contains several errors in the figures and figure legends. In Fig. 2 the legends for the figure items are shuffled. Figure 4 and Supplementary Figure 5 are duplicates of the same one. Consequently important data are not presented.

      We apologise for the errors in these figures that were due to a mistake during uploading where the incorrect versions were used. The legends for figure 2 and panels in figure 4 have now been corrected, and show the effects of rs28834970 on microglial migration and chemokine release in the presence or absence of IFNg.

      4- When the number of replicates is small (e.g. n = 3) it is preferable to use non parametric tests (rank analysis, e.g. Mann Whitney's test) rather than t test. This applies to Figures 2D (current legend 2A), 2E (current legend 2B), Figure 4A-C, Supplementary Figures 2A, 2B. In Supplementary Fig 4E (MARCO) the number of replicates (presumably 3 because based on RNAseq) and the used test are not indicated. Is it the RNAseq statistical analysis?

      We thank the reviewer for this comment. We acknowledge that the t-test may lead to inflated false discovery rates. However, it has been shown that for small sample sizes parametric tests have a power advantage compared to non-parametric ones that may outweigh the possibly exaggerated false positives. See https://genomebiology.biomedcentral.com/articles/10.1186/s13059-022-02648-4#Sec3 which states:

      "In conclusion, when the per-condition sample size is less than 8, parametric methods may be used because their power advantage may outweigh their possibly exaggerated false positives."

      We have also modified the legend of supplementary figure 4E to clarify the number of replicates used.
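      To make the trade-off in comment 4 concrete: with three replicates per group, an exact two-sided Mann-Whitney test cannot return a p-value below 0.1, whatever the data look like, because only 2 of the 20 possible rankings are maximally extreme. A minimal sketch of that arithmetic follows; the function name is illustrative and ties are assumed to be absent.

```python
from math import comb

def min_exact_mwu_p(n1: int, n2: int) -> float:
    """Smallest attainable two-sided p-value of an exact Mann-Whitney U test.

    Assumes no ties: of the C(n1 + n2, n1) equally likely rankings under the
    null, only the two fully separated ones are the most extreme.
    """
    return 2 / comb(n1 + n2, n1)

print(min_exact_mwu_p(3, 3))  # 0.1, so p < 0.05 is unreachable with n = 3 per group
print(min_exact_mwu_p(4, 4))  # 2/70, about 0.029
```

      This floor is one reason a rank test has very little power at n = 3, independently of the argument about false positives quoted above.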

      5- In addition to the above comment on tests, when the number of replicates is small it is not appropriate (and misleading) to show box plots or bars with SEM. In the indicated figures the individual data points should be shown.

      We now show individual replicates on box plots (Figure 2D, 2E and supp figure 4E).

      MINOR COMMENTS:

      a- Macrophages and microglia are very similar cell types. Could the authors comment more on the differences they observe and how they are related to those previously described?

      We have now referenced the original papers and commented on the markers that we see differentially expressed, notably P2RY12 which is a key homeostatic microglia marker that distinguishes these cells from macrophages.

      b- In Fig. 2A CEBPb cut and run plot, the differences are not limited to the SNP immediate vicinity, there are also visible differences between T/T and C/C plots in at least a 40-kb range. Is it due to multiple interactions of CEBPb? How can the point difference have broad consequences? Please explain this potentially interesting and relevant finding.

      Whilst there may be small changes in CEBPb binding at the second intronic PTK2B chromatin peak, this is not statistically significant given the variability between repeats. In fact, the only significant change we see in CEBPb binding genome-wide is at the locus overlapping the SNP (Fig 2c).

      c- Potentially cis-altered genes near the SNP include CHRNA2 and EPHX2 (see Sup. Fig. 3a). Their expression may not be detected in macrophage lineage. If this is the case please indicate in the text, otherwise please include the corresponding data in Sup. Fig. 3b to show the presence or absence of SNP-induced change.

      You are correct that CHRNA2 and EPHX2 are not expressed in our macrophages or microglia, and we have now explicitly stated this in the revised text.

      d- In general the Figures are not of very high quality and are difficult to read or understand without constantly going back and forth to the legends (which are mislabeled in some instances). To improve:

      . Please increase font size whenever possible.

      . Please improve Fig. 1d by indicating the position of the SNP, numbering the exons (an intermediate scale plot may be necessary and lines on bottom trace are hardly visible).

      . Please indicate the correct color code for T/T and C/C in Fig 3a and b, left panels, which currently doesn't match.

      . Please label the Venn's diagrams comparisons in Sup. Fig. 4b.

      . In the text and legends the Figure items are identified with letters in upper case, in the figures they are in lower case. Please be consistent.

      We have improved the resolution of the images in the pdf and Fig 1d has been revised to include the position of the SNP. The colour code for T/T and C/C is correct in fig 3a and 3b, but since the PCA plots are independently created, we would not always expect the position of the T/T and C/C alleles to be the same. The Venn diagrams in Sup Fig 4b have been updated, and the letters for the figure panels made consistently upper case throughout.

      e- In Fig. 2D and 2E, the Y axes should start at zero to avoid artificially increasing the visual differences. If there is a strong reason not to do so (I don't see any here), the Y axis should be clearly interrupted to avoid confusion.

      We have altered this accordingly.

      f- In the introduction the authors provide some background about previous work about the potential role of PTK2B/PYK2 in AD pathophysiology. The cited preclinical results suggest that PTK2B activity could have a deleterious effect (references in the manuscript). In contrast, some other reports (PMID: 29803828, 33718872) suggest a protective effect of PTK2B/PYK2. Because the evidence in the current manuscript suggests that the risk-associated SNP results in a decreased function of PTK2B/PYK2 (through decreased levels), at least in cells of the macrophage lineage, the authors could broaden their discussion to include these results.

      We have now discussed the conflicting evidence in the revised manuscript.

      Reviewer #2 (Significance (Required)):

      ADVANCE: Late onset Alzheimer's disease is a major medical issue. It has a complex genetic risk component with many associated loci identified in GWAS. Most of these have only a small individual impact on the risk. One of the SNPs associated with increased risk (rs28834970) is located in an intron of the PTK2B gene. Although various reports have investigated the role of the PTK2B gene product, the tyrosine kinase PYK2, in several AD models, the possible link with rs28834970, is unclear.

      An important point is to determine whether the T→C SNP corresponding to rs28834970 alters PTK2B expression and how it does so. An alternative hypothesis could be that the SNP has a strong linkage disequilibrium with an unidentified allele in human populations that could be responsible for AD risk. The current manuscript is a significant step forward in addressing that question. By generating a biallelic C/C SNP mutation in a human IPSC line, the current study makes it possible to eliminate such a linked contribution.

      The strength of the manuscript is to show an effect on chromatin accessibility, CEBP binding and possibly PTK2B transcripts. It also provides interesting evidence of a broad effect of the C/C mutation on the transcriptome of macrophage lineage cells. In its current form the manuscript presents weaknesses that could be improved. These flaws include issues with the presentation discussed above and the incomplete demonstration that it is the decrease in PTK2B expression that causes the macrophage/microglia phenotype. If these flaws were overcome the paper would represent a significant advance.

      AUDIENCE: The expected audience is specialized in AD with a possible broader range if all weaknesses are addressed.

      REVIEWER EXPERTISE: Basic science close to the field.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary: In this manuscript the authors explore the biological effects of an intronic SNP in the PTK2B gene, previously shown to be associated with late onset Alzheimer's disease (AD) risk. Based on the likely effect of the SNP locus on PTK2B expression in the macrophage lineage, the authors explore the consequences of introducing with the Crispr/CAS9 technique the biallelic SNP base change (C/C vs T/T) in a human IPSC line that is then differentiated into macrophages or microglia. They observe that C/C increases chromatin accessibility and CEBPb binding in comparison to T/T, with a slight decrease in PTK2B expression, significant in macrophages but not in microglia. The authors then investigate the transcriptome changes induced by the C/C mutation and find alteration in many genes, including a decreased expression of a number of cytokine or receptor proteins involved in inflammatory responses. The authors also mention a decreased effect on IFNg-induced reduced mobility but the data are missing (see Figure errors below). Overall the authors propose that the risk SNP is associated with a decreased PTK2B expression and hypothesize a link between this change and a decreased function of macrophages/microglia that may contribute to AD pathology.

      Major comments:

      1. The authors claim that their results show that the investigated SNP has causal effects in "microglial function" (Title) and in Alzheimer's disease (AD) (Abstract, 2nd sentence: "Here we validate a causal single nucleotide polymorphism (SNP) associated with an increased risk of Alzheimer's disease"). The word "causal" is repeated many times. However, the authors should qualify their claim with respect to AD. Their results do show that the SNP has an effect on chromatin accessibility, CEBP binding, PTK2B expression and transcriptome, but the link between these changes is not formally demonstrated and their potential role in AD-like phenotype is not explored. The "causal" role is not formally and logically demonstrated. It remains an interesting, plausible hypothesis and the results provide strong arguments in support of that hypothesis but do not prove it, yet. Concerning the title, "causal effects on microglial function" is awkward, anything that has effects is logically "causal" in these effects. The title should be "... has effects on microglial functions" or "... alters microglial function".
      2. One major difficulty in the results is to link the slight decrease in PTK2B transcript, which is only significant in macrophages, with the rest of the phenotype. Because what matters to make this link is not the mRNA but the protein, and because mRNA levels are often not strictly correlated with the protein levels, the authors should measure the PTK2B/PYK2 protein levels in their differentiated cell lines in basal conditions and following activation (as they do for other readouts) using immunoblotting. A robust and significant diminution in PYK2 protein would strongly support its role in linking PTK2B expression and transcriptome change. An optional additional key experiment would be to reverse the transcriptome phenotype by increasing the expression of PTK2B (e.g. by cDNA transfection). Note that these points are important because an alternative hypothesis to explain the effects of C/C mutation on macrophage function would be that the C/C mutation has a long distance effect on other chromatin regions with key role in regulating these cells.
      3. The manuscript contains several errors in the figures and figure legends. In Fig. 2 the legends for the figure items are shuffled. Figure 4 and Supplementary Figure 5 are duplicates of the same one. Consequently important data are not presented.
      4. When the number of replicates is small (e.g. n = 3) it is preferable to use non parametric tests (rank analysis, e.g. Mann Whitney's test) rather than t test. This applies to Figures 2D (current legend 2A), 2E (current legend 2B), Figure 4A-C, Supplementary Figures 2A, 2B. In Supplementary Fig 4E (MARCO) the number of replicates (presumably 3 because based on RNAseq) and the used test are not indicated. Is it the RNAseq statistical analysis?
      5. In addition to the above comment on tests, when the number of replicates is small it is not appropriate (and misleading) to show box plots or bars with SEM. In the indicated figures the individual data points should be shown.

      Minor comments:

      • a. Macrophages and microglia are very similar cell types. Could the authors comment more on the differences they observe and how they are related to those previously described?
      • b. In Fig. 2A CEBPb cut and run plot, the differences are not limited to the SNP immediate vicinity, there are also visible differences between T/T and C/C plots in at least a 40-kb range. Is it due to multiple interactions of CEBPb? How can the point difference have broad consequences? Please explain this potentially interesting and relevant finding.
      • c. Potentially cis-altered genes near the SNP include CHRNA2 and EPHX2 (see Sup. Fig. 3a). Their expression may not be detected in macrophage lineage. If this is the case please indicate in the text, otherwise please include the corresponding data in Sup. Fig. 3b to show the presence or absence of SNP-induced change.
      • d. In general the Figures are not of very high quality and are difficult to read or understand without constantly going back and forth to the legends (which are mislabeled in some instances). To improve:
        • Please increase font size whenever possible.
        • Please improve Fig. 1d by indicating the position of the SNP, numbering the exons (an intermediate scale plot may be necessary and lines on bottom trace are hardly visible).
        • Please indicate the correct color code for T/T and C/C in Fig 3a and b, left panels, which currently doesn't match.
        • Please label the Venn's diagrams comparisons in Sup. Fig. 4b.
        • In the text and legends the Figure items are identified with letters in upper case, in the figures they are in lower case. Please be consistent.
      • e. In Fig. 2D and 2E, the Y axes should start at zero to avoid artificially increasing the visual differences. If there is a strong reason not to do so (I don't see any here), the Y axis should be clearly interrupted to avoid confusion.
      • f. In the introduction the authors provide some background about previous work about the potential role of PTK2B/PYK2 in AD pathophysiology. The cited preclinical results suggest that PTK2B activity could have a deleterious effect (references in the manuscript). In contrast, some other reports (PMID: 29803828, 33718872) suggest a protective effect of PTK2B/PYK2. Because the evidence in the current manuscript suggests that the risk-associated SNP results in a decreased function of PTK2B/PYK2 (through decreased levels), at least in cells of the macrophage lineage, the authors could broaden their discussion to include these results.

      Significance

      Advance: Late onset Alzheimer's disease is a major medical issue. It has a complex genetic risk component with many associated loci identified in GWAS. Most of these have only a small individual impact on the risk. One of the SNPs associated with increased risk (rs28834970) is located in an intron of the PTK2B gene. Although various reports have investigated the role of the PTK2B gene product, the tyrosine kinase PYK2, in several AD models, the possible link with rs28834970, is unclear.

      An important point is to determine whether TC SNP corresponding to rs28834970 alters PTK2B expression and how it does so. An alternative hypothesis could be that the SNP has a strong linkage disequilibrium with an unidentified allele in human populations that could be responsible for AD risk. The current manuscript is a significant step forward in addressing that question. By generating a biallelic C/C SNP mutation in a human IPSC line the current study allows to eliminate such linked contribution.

      The strength of the manuscript is to show an effect on chromatin accessibility, CEBP binding and possibly PTK2B transcripts. It also provides interesting evidence of a broad effect of the C/C mutation on the transcriptome of macrophage lineage cells. In its current form the manuscript presents weaknesses that could be improved. These flaws include issues with the presentation discussed above and the incomplete demonstration that it is the decrease in PTK2B expression that causes the macrophage/microglia phenotype. If these flaws were overcome the paper would represent a significant advance.

      Audience: The expected audience is specialized in AD with a possible broader range if all weaknesses are addressed.

      Reviewer Expertise: Basic science close to the field.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      The current study aims to quantify associations between the regular use of proton-pump inhibitors (PPI) - defined as using PPI most days of the week during the last 4 weeks, assessed at a single cross-section in time - and several respiratory outcomes up to several years later. Six respiratory outcomes are included: risk of influenza, pneumonia, COVID-19, and other respiratory tract infections, as well as COVID-19 severity and mortality.

      Strengths:

      Several sensitivity analyses were performed, including i) estimation of the e-value to assess how strong unmeasured confounders should be to explain observed effects, ii) comparison with another drug with a similar indication to potentially reduce (but not eliminate) confounding by indication.

      We are grateful for your pointing out the strengths in our article, particularly the assessment of e-values and the comparison with another medication to mitigate confounding by indication. We extend our sincere gratitude to the reviewer for identifying multiple concerns and offering constructive feedback to help improve our manuscript. We will incorporate these suggestions into our revisions.

      Weaknesses:

      (1) The main exposure of interest seems to be measured at only one time point (at study enrollment), while patients are considered at risk for many years afterwards without knowing their exposure status at the time of experiencing the outcome. As indicated by the authors, PPI are sometimes used for only short amounts of time. It seems biologically implausible that an infection was caused by using PPI for a few weeks many years ago.

      We agree with the reviewer that PPIs are sometimes used for only short amounts of time, as indicated in our manuscript. We acknowledge that it is a limitation of the UK Biobank cohort, and we have discussed this in the discussion section as follows:

      “Given that the PPI exposure was mainly assessed at the baseline recruitment, it was possible that a small proportion of PPI users was misclassified during the follow-up due to the medication discontinuation, which may result in an underestimation of potential risk.” (Page 14, Line 8-10)

      In addition, to alleviate these concerns, we have tested for effect moderation in the subgroup of potential long-term users, defined as participants with indications for PPI use. This information has been included in the discussion section:

      “In addition, no effect moderation was observed in subgroup analyses for the main outcome among PPI users with indications (more likely to regularly use PPIs for a long period) compared to those without indications, indicating the risks remained increased among long-term PPI users.” (Page 14, Line 12-15)

      We hope that in the future, the concerns highlighted by the reviewer can be resolved by utilizing datasets with close follow-up, especially regarding medication use:

      “Since the follow-up prescription data was lacking in our study to precisely identifying the long-term users, further evaluation using cohorts with close follow-up is needed.” (Page 14, Line 15-17)

      (2) Previous studies have shown that by focusing on prevalent users of drugs, one often induces several biases such as collider stratification bias, selection bias through depletion of susceptibles, etc.

      Because of the limitations of data from the UK Biobank, such as the absence of details on initiation of medications and regular monitoring, we were restricted to using a prevalent user design to assess the associations between PPI use and respiratory outcomes. We have discussed it in the limitation section:

      “Given that the PPI exposure was mainly assessed at the baseline recruitment, it was possible that a small proportion of PPI users was misclassified during the follow-up due to the medication discontinuation, which may result in an underestimation of potential risk. However, the prevalent user design could underestimate the actual risks of PPI use for respiratory infections, which indicates the real effect might be stronger [38]……Since the follow-up prescription data was lacking in our study to precisely identifying the long-term users, further evaluation using cohorts with close follow-up is needed.” (Page 14, Line 8-17)

      (3) It seems Kaplan Meier curves are not adjusted for confounding through e.g. inverse probability weighting. As such the KM curves are currently not informative (or the authors need to make clearer that curves are actually adjusted for measured confounding).

      Your kind suggestions are greatly appreciated. We have plotted Kaplan Meier curves adjusted for confounding by inverse probability weighting with the measured confounders according to the reviewer’s advice. The methods and results are demonstrated as follows:

      “The event-free probabilities were compared by Kaplan-Meier survival curves with inverse probability weights adjusting for the measured covariates.” (Page 8, Line 13-15)

      “Regular PPI users had lower event-free probabilities for influenza and pneumonia compared to those of non-users (Supplementary Figure 2 A-B).” (Page 9, Line 21-23)

      “PPI users had lower event-free probabilities for COVID-19 severity and mortality, but not COVID-19 positivity compared to those of non-users (Supplementary Figure 2 C-E).” (Page 10, Line 9-10)
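      For readers who want to see what "Kaplan-Meier curves with inverse probability weights" amounts to, here is a minimal sketch. It is not the authors' code: it assumes the propensity scores have already been estimated from the measured covariates, the function names `iptw` and `weighted_km` are illustrative, and the toy data are hypothetical. Each subject simply contributes their weight, rather than a count of 1, to the number of events and the number at risk.

```python
from collections import defaultdict

def iptw(treated, propensity):
    """Inverse probability of treatment weights: 1/p if treated, 1/(1-p) otherwise."""
    return [1 / p if a else 1 / (1 - p) for a, p in zip(treated, propensity)]

def weighted_km(times, events, weights):
    """Weighted Kaplan-Meier estimate for one exposure group.

    times: follow-up time per subject; events: 1 = event, 0 = censored;
    weights: per-subject weight. Returns [(event_time, survival)].
    """
    d = defaultdict(float)  # weighted number of events at each time
    c = defaultdict(float)  # weighted number leaving the risk set at each time
    for t, e, w in zip(times, events, weights):
        c[t] += w
        if e:
            d[t] += w
    at_risk = sum(weights)
    surv, curve = 1.0, []
    for t in sorted(c):
        if d[t] > 0:
            surv *= 1 - d[t] / at_risk
            curve.append((t, surv))
        at_risk -= c[t]  # events and censored subjects leave after time t
    return curve

# Hypothetical toy data (not from the study): four exposed subjects, all with
# an assumed propensity score of 0.5, so every weight equals 2.
w = iptw([1, 1, 1, 1], [0.5, 0.5, 0.5, 0.5])
print(weighted_km([1, 2, 3, 4], [1, 0, 1, 1], w))
```

      With equal weights the estimate reduces to the ordinary Kaplan-Meier curve, which is a convenient sanity check. In practice the curve is computed separately for users and non-users, and confidence bands need a variance estimator that accounts for the weights.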

      (4) Throughout the manuscript the authors seem to misuse the term multivariate (using one model with e.g. correlated error terms to assess multiple outcomes at once) when they seem to mean multivariable.

      We apologize for confusing the terms “multivariate” and “multivariable” in our previous manuscript. We have corrected the misused terms throughout the manuscript:

      “Univariate and multivariable Cox proportional hazards regression models were utilized to assess the association between regular use of PPIs and the selected outcomes.” (Page 7, Line 19-20)

      “The remaining imbalanced covariates (standardized mean difference ≥ 0.1) after propensity score matching were further adjusted by multivariable Cox regression models to calculate HRs and 95% CIs.” (Page 8, Line 23-25)

      (5) Given multiple outcomes are assessed there is a clear argument for accounting for multiple testing, which following the logic of the authors used in terms of claiming there is no association when results are not significant may change their conclusions. More high-level, the authors should avoid the pitfall of stating there is evidence of absence if there is only an absence of evidence in a better way (no statistically significant association doesn't mean no relationship exists).

      We have revised our interpretation of the results, particularly for those without a statistically significant association, based on the reviewer’s advice, and clearly recognize that the conclusions should be interpreted with caution:

      “In contrast, the risk of COVID-19 infection was not significant with regular PPI use…” (Page 2, Line 11-12)

      “PPI users were associated with a higher risk of influenza (HR 1.74, 95%CI 1.19-2.54), but the risks with pneumonia or COVID-19-related outcomes were not evident.” (Page 2, Line 14-16)

      “…while the effects on pneumonia or COVID-19-related outcomes under PPI use were attenuated when compared to the use of H2RAs.” (Page 2, Line 18-19, in the Abstract)

      “…while their association with pneumonia and COVID-19-related outcomes is diminished after comparison with H2RA use and remains to be further explored.” (Page 15, Line 21-22, in the Conclusion)
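      The reviewer's point about multiple testing can be illustrated with a short sketch. With six outcomes, a Bonferroni correction lowers the per-test threshold from 0.05 to 0.05/6 (about 0.0083); the Holm step-down procedure shown here is a uniformly more powerful alternative that also controls the family-wise error rate. The p-values are hypothetical and are NOT taken from the study, and the function name is illustrative.

```python
def holm_reject(p_values, alpha=0.05):
    """Holm step-down procedure. Returns one boolean per input p-value
    (True = null hypothesis rejected at family-wise level alpha)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    reject = [False] * m
    for rank, i in enumerate(order):
        # The k-th smallest p-value is compared against alpha / (m - k + 1).
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # once one test fails, all larger p-values are retained
    return reject

# Hypothetical p-values for six outcomes (not from the study).
p = [0.001, 0.004, 0.03, 0.02, 0.20, 0.60]
print(holm_reject(p))  # [True, True, False, False, False, False]
```

      In this made-up example, the outcomes with p = 0.02 and p = 0.03 would count as significant at an unadjusted 0.05 level but not after correction, which is the kind of change in conclusions the reviewer has in mind.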

      (6) While the authors claim that the quantitative bias analysis does show results are robust to unmeasured confounding, I would disagree with this. The e-values are around 2 and it is clearly not implausible that there are one or more unmeasured risk factors that together or alone would have such an effect size. Furthermore, if one would use the same (significance) criteria as used by the authors for determining whether an association exists, the required effect size for an unmeasured confounder to render effects 'statistically non-significant' would be even smaller.

      We agree with the reviewer that there might still exist one or more unmeasured risk factors that have effect sizes larger than 2. Hence, we cannot affirm that the findings are robust to unmeasured confounding in the current analysis, which is a limitation of our study. We have deleted the previous statement, and added more discussion in the limitation section:

      “Moreover, patients with exacerbations of respiratory disorders (e.g., asthma, COPD) might suffer from a wide range of gastrointestinal symptoms that lead to the use of PPIs [38]. Due to the lack of data for respiratory severity and close follow-up for medication use, residual confounding might still exist due to the observational nature.” (Page 14, Line 23-27)
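      As a worked illustration of the E-value discussed in point (6), the standard formula of VanderWeele and Ding for a risk ratio RR > 1 is E = RR + sqrt(RR × (RR − 1)). The sketch below applies it to the influenza estimate quoted earlier in this response (HR 1.74, 95% CI 1.19-2.54), treating the hazard ratio as an approximate risk ratio, which is an assumption that is reasonable only for rare outcomes. The function name is illustrative.

```python
from math import sqrt

def e_value(rr: float) -> float:
    """E-value for a risk ratio: RR + sqrt(RR * (RR - 1)).

    Ratios below 1 are inverted first so that the formula applies.
    """
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1 / rr
    return rr + sqrt(rr * (rr - 1))

# HR treated as an approximate risk ratio (assumption; see lead-in).
print(round(e_value(1.74), 2))  # point estimate: 2.87
print(round(e_value(1.19), 2))  # confidence limit closest to the null: 1.67
```

      The second number is the one that matters for the reviewer's argument: an unmeasured confounder associated with both exposure and outcome by a ratio of roughly 1.7 could be enough to move the confidence interval to include the null.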

      (7) Some patients are excluded due to the absence of follow-up, but it is unclear how that is determined. Is there potentially some selection bias underlying this where those who are less healthy stop participating in the UK biobank?

      Thank you for your question. The reasons for the absence of follow-up are mainly classified into five categories, including: (1) Death reported to UK Biobank by a relative; (2) NHS records indicate they are lost to follow-up; (3) NHS records indicate they have left the UK; (4) UK Biobank sources report they have left the UK; (5) Participant has withdrawn consent for future linkage. According to the data from UK Biobank (https://biobank.ndph.ox.ac.uk/showcase/field.cgi?id=190), the major reason for the loss of follow-up among participants is their departure from the UK (84.7% of participants who were lost to follow-up). In addition, not including those who were less healthy in the study might also underestimate the risk, leading to lower estimated effects of PPIs for respiratory infections. We have supplemented this in our revised manuscript:

      “Among them, 1,297 participants without follow-up, which were mainly determined by reported death, departure from the UK, or withdrawn consent, had been removed after initial exclusion.” (Page 4, Line 25-27)

      (8) Given that the exposure is based on self-report how certain can we be that patients e.g. do know that their branded over-the-counter drugs are PPI (e.g. guardium tablets)? Some discussion around this potential issue is lacking.

      Thank you for your concerns. In the data collection by the UK Biobank, the participants can enter the generic or trade name of the treatment on the touchscreen to match the medications they used. We have added this important information to the method section:

      “The exposure of interest was regular use of PPIs. The participants could enter the generic or trade name of the treatment on the touchscreen to match the medications they used (Supplementary Table S1).” (Page 5, Line 6-8)

      We acknowledge that specific information on prescribed or over-the-counter use of medications is lacking in the UK Biobank. We have discussed it in the limitation section:

      “Limitations exist in our study. Information on dose and duration of PPI use, discrimination between prescription and over-the-counter use of PPIs, health-seeking behavior, different types of pneumonia, and pneumococcus vaccination is currently not available from the UK Biobank.” (Page 14, Line 5-8)

      (9) Details about the deprivation index are needed in the main text as this is a UK-specific variable that will be unfamiliar to most readers.

      Thank you for your question on the definition of the deprivation index. We have provided the details about the deprivation index in the manuscript:

      “…socioeconomic status (deprivation index, which was defined using national census information on car ownership, household overcrowding, owner occupation, and unemployment combined for postcode areas of residence)…” (Page 6, Line 14-17)

      (10) It is unclear how variables were coded/incorporated from the main text. More details are required, e.g. was age included as a continuous variable and if so was non-linearity considered and how?

      We apologize for not elucidating how variables were incorporated into the main text. Previously, the linearity between continuous variables and outcomes was assessed by Martingale residuals plots, while the variables detected with non-linearity were regarded as categorical variables for further analyses. For example, after evaluation with the Martingale residuals plot, age demonstrated non-linearity, and we incorporated it as a categorical variable for the analysis of COVID-related mortality.

      We have supplemented the information in the method section:

      “The linearity between continuous variables and outcomes was assessed by Martingale residuals plots, while the variables detected with non-linearity were regarded as categorical variables for further analyses.” (Page 6, Line 28 to Page 7, Line 1)
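      To illustrate the kind of check described above, the sketch below computes martingale residuals from a covariate-free (null) Cox model, which reduce to the event indicator minus the Nelson-Aalen cumulative hazard at each participant's follow-up time. This is an assumption-laden toy in plain Python, not the authors' code (their software and exact procedure are not given here); the function names, the toy data, and the age bins are all hypothetical. Averaging the residuals within bins of a continuous covariate gives a crude version of the residual plot: a roughly straight trend across bins is compatible with a linear term, while a curved trend suggests non-linearity.

```python
from collections import Counter


def null_martingale_residuals(times, events):
    """Martingale residuals from a null Cox model.

    times  : follow-up time for each participant
    events : 1 if the outcome occurred, 0 if censored
    """
    event_counts = Counter(t for t, e in zip(times, events) if e)
    cum_hazard = {}
    running = 0.0
    for t in sorted(set(times)):
        at_risk = sum(1 for u in times if u >= t)
        # Nelson-Aalen increment: events at t divided by number at risk at t.
        running += event_counts.get(t, 0) / at_risk
        cum_hazard[t] = running
    return [e - cum_hazard[t] for t, e in zip(times, events)]


def mean_residual_by_bin(covariate, residuals, edges):
    """Average residual within each [edges[k], edges[k+1]) bin of the covariate."""
    n_bins = len(edges) - 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for x, r in zip(covariate, residuals):
        for k in range(n_bins):
            if edges[k] <= x < edges[k + 1]:
                sums[k] += r
                counts[k] += 1
                break
    return [s / c if c else None for s, c in zip(sums, counts)]


# Hypothetical toy data: five participants.
times = [2, 4, 4, 6, 8]
events = [1, 0, 1, 1, 0]
ages = [45, 52, 58, 63, 69]

residuals = null_martingale_residuals(times, events)
trend = mean_residual_by_bin(ages, residuals, edges=[40, 55, 70])
print(residuals)
print(trend)
```

      In practice a smoothed curve over many participants would replace the two coarse bins used here.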

      (11) The authors state that Schoenfeld residuals were tested, but don't report the test statistics. Could they please provide these, e.g. it would already be informative if they report that all p-values are above a certain value.

      We are sorry for not providing the statistics about the Schoenfeld residual in our previous manuscript. We have supplemented the information in our revisions:

      “Schoenfeld residuals tests were used to evaluate the proportional hazards assumptions, while no violation of the assumption was detected (Supplementary Table S3).” (Page 7, Line 27 to Page 8, Line 1)

      (12) The authors would ideally extend their discussion around unmeasured confounding, e.g. using the DAGs provided in https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7832226/, in particular (but not limited to) around severity and not just presence/absence of comorbidities.

      Thank you for your insightful suggestion that the discussion about unmeasured confounding should be extended. We agree with the reviewer that, in addition to the comorbidities themselves, their severity could also have an important impact on the use of PPIs. We have added this discussion to the limitation section and cited the article (PMC7832226):

      “Moreover, patients with exacerbations of comorbid disorders (e.g., diabetes, asthma, COPD) might suffer from a wide range of gastrointestinal symptoms that lead to the use of PPIs [38] (Supplementary Figure S4). Due to the lack of data for respiratory severity and close follow-up for medication use, residual confounding might still exist due to the observational nature.” (Page 14, Line 23-27)

      (13) The UK biobank is known to be highly selected for a range of genetic, behavioural, cardiovascular, demographic, and anthropometric traits. The potential problems this might create in terms of collider stratification bias - as highlighted here for example: https://www.nature.com/articles/s41467-020-19478-2 - should be discussed in greater detail and also appreciated more when providing conclusions.

      We acknowledge the reviewer's point about the UK Biobank's highly selective nature potentially leading to collider stratification bias in the evaluation of COVID-19-related outcomes. We have discussed this in detail and are cautious when generating conclusions.

      “Furthermore, the highly selective nature of the UK Biobank might create collider stratification bias for the evaluation of COVID-19-related outcomes, and thus the conclusions should be interpreted with cautions [39].” (Page 15, Line 2-4)

      Reviewer #2 (Public Review):

      Summary:

      Zeng et al investigate in an observational population-based cohort study whether the use of proton pump inhibitors (PPIs) is associated with an increased risk of several respiratory infections among which are influenza, pneumonia, and COVID-19. They conclude that compared to non-users, people regularly taking PPIs have increased susceptibility to influenza, pneumonia, as well as COVID-19 severity and mortality. By performing several different statistical analyses, they try to reduce bias as much as possible, to end up with robust estimates of the association.

      Strengths:

      The study comprehensively adjusts for a variety of critical covariates and by using different statistical analyses, including propensity-score-matched analyses and quantitative bias analysis, the estimates of the associations can be considered robust.

      We are grateful to the reviewer for pointing out the merits of our articles, which include adjusting for a wide range of covariates, employing diverse statistical analyses, and using robust data. We will revise our manuscript further based on the reviewer's suggestions.

      Weaknesses:

      As it is an observational cohort study there still might be bias. Information on the dose or duration of acid suppressant use was not available, but might be of influence on the results. The outcome of interest was obtained from primary care data, suggesting that only infections as diagnosed by a physician are taken into account. Due to the self-limiting nature of the outcome, differences in health-seeking behavior might affect the results.

      Thank you for raising the issues of the dose and duration of acid suppressant use, the source of diagnosis, and the health-seeking behavior of participants. In the UK Biobank, the dose and duration of acid suppressant use are not available because this information was not collected at baseline or follow-up. In addition, the outcome of interest was also retrieved from the hospital ICD diagnosis. We apologize for not clarifying it in our previous manuscript. Moreover, we agree with the reviewer that health-seeking behavior could affect the analyses, but the relevant data are not available from the UK Biobank. We have discussed these points in the method and limitation sections:

      “Briefly, the first reported occurrences of respiratory system-related conditions within primary care data, and hospital inpatient data defined by the International Classification of Diseases (ICD)-10 codes were categorized by the UK Biobank.” (Page 5, Line 21-25)

      “Limitations exist in our study. Information on dose and duration of PPI use, discrimination between prescription and over-the-counter use of PPIs, health-seeking behavior, different types of pneumonia, and pneumococcus vaccination is currently not available from the UK Biobank.” (Page 14, Line 5-8)

      Reviewer #1 (Recommendations For The Authors):

      Analysis code should be made available.

      Thank you for your question. We have provided the sources of the analysis code we used for this study in our revised manuscript:

      “The codes used in this study can be found at: https://epirhandbook.com/en/ and https://cran.r-project.org/doc/contrib/Epicalc_Book.pdf.” (Page 16, Line 21-22)

      Reviewer #2 (Recommendations For The Authors):

      It might be interesting to study whether including self-reported infections changes the results, as people using PPI may more easily consult their GP even for a self-limiting disease such as influenza and therefore are more likely diagnosed/confirmed with such a respiratory infection.

      Thank you for your insightful suggestion to conduct analyses that include self-reported infections. Accordingly, we have included the self-reported cases as sensitivity analyses, and the results were not significantly altered, which confirms the robustness of our results:

      “Self-reported infections, except for COVID-19-related outcomes due to the lack of data, were also included for the outcomes as sensitivity analyses. The self-reported cases were reported at the baseline or subsequent UK Biobank assessment center visit.” (Page 8, Line 17-19)

      “Inclusion of the self-reported cases did not significantly alter the results (Supplementary Table S4).” (Page 9, Line 17-18)

      Moreover, to address the above-mentioned, sub-analyses differentiating between over-the-counter and prescribed medication might be interesting.

      Thank you for your questions on differentiating between over-the-counter and prescribed medication. We have thoroughly looked up the data provided by the UK Biobank, but it is a pity that they are not provided. We have discussed this in the limitation section:

      “Information on dose and duration of PPI use, discrimination between prescription and over-the-counter use of PPIs, health-seeking behavior, different types of pneumonia, and pneumococcus vaccination is currently not available from the UK Biobank.” (Page 14, Line 5-8)

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      In my opinion, the three most important controls (hopefully easy):

      (1) Include no ATR controls for optogenetic activation experiments (not all, just one or two, e.g., Figure 4B, C, or D, for the highest activation condition). The concern is that it can be quite hard to use light to both monitor neural responses while also using light to activate the function of other neurons.

      We thank the reviewer for the suggestions. We use a 2-photon 910-nm laser (which does not activate Chrimson) for imaging of GCaMP and a 624-nm LED (which does not activate GFP) for Chrimson activation. Calcium (GCaMP) signals are detected by PMT during Chrimson activation. With this setup, we are able to image GCaMP signals without crosstalk during activation of Chrimson.

      We performed calcium imaging in animals that were not fed ATR and found that SS04185 showed no response to LED stimulation at the strongest intensity (µW/mm) (New Figure 4 – figure supplement 1B).

      (2) Demonstrate that their RNAi constructs do indeed knock down the intended target gene. They showed nicely in Figure 5A that SeIN128 expresses GABA. Presumably, these neurons also express VGAT. Is it possible to check the expression of VGAT after RNAi knockdown? The concern is that using only a single RNAi introduces the possibility of off-target effects. Using multiple RNAi lines for VGAT or other parts of the pathway would also alleviate this (minor concern).

      We thank the reviewer for raising this point. We agree that using only one RNAi line (HMS02355) for VGAT in Figure 5A is a weakness. 

      Accordingly, we have performed additional experiments to quantify the effect of RNAi knockdown of VGAT using HMS02355 in all neurons, followed by subsequent immunostaining against GABA or VGAT. We found that both VGAT and GABA were significantly reduced in the neuropil (Figure 5 – figure supplement 1C and D). These data strongly suggest that HMS02355 knocks down VGAT and reduces GABA at axon terminals. We note that HMS02355 has been used previously for knocking down GABA signaling in the following studies.

      (1) Kallman BR, Kim H, Scott K (2015). Excitation and inhibition onto central courtship neurons biases Drosophila mate choice. eLife 4:e11188. https://doi.org/10.7554/eLife.11188

      (2) Zhao W, Zhou P, Gong C et al. (2019). A disinhibitory mechanism biases Drosophila innate light preference. Nat Commun 10, 124. https://doi.org/10.1038/s41467-018-07929-w

      (3) Yamagata N, Ezaki T, Takahashi T, Wu H, Tanimoto H (2021). Presynaptic inhibition of dopamine neurons controls optimistic bias. eLife 10:e64907. https://doi.org/10.7554/eLife.6490

      (3) Include genetic controls for their driver line.

      In Figure 1, it would be nice to see one half or the other half of their split GAL4 line in their manipulations. The concern is that perhaps the phenotype is coming from something unexpected in the genetic background.

      We thank the reviewer for the suggestion. We have added each half of the split GAL4 line (AD only or DBD only) as controls (New Figure 1 – figure supplement 2). We found that SS04185 reduced rolling, whereas the split-half controls (AD only or DBD only) did not.

      In the discussion:

      It seems that activation of SS04185 has additional effects beyond what the authors have quantified. Specifically, larvae do not appear to re-initiate rolling in the same manner as Basin activation alone. Also, there appears to be an off-response, turning.

      We appreciate the reviewer’s comments. We have included a section in the discussion to consider the different patterns of rolling observed during joint stimulation of Basins and SS04185 and during stimulation of Basins alone, as well as the increase in turning following the offset of joint stimulation of Basins and SS04185 compared with stimulation of Basins alone (lines 464 to 481). Although the reasons for these differences are beyond the scope of the paper, we have added Figure 2 – figure supplement 1K, which shows that co-activation of SS04185-MB and Basins is sufficient to evoke turning following the offset of stimulation, suggesting that the increased turning may be due to the activation of SS04185-MB neurons and independent of SS04185-DN neurons.

      The labeling of the Figure panels could be improved. In many places, it is not clear that Basins are being stimulated in the background, whereas in nearby panels, it is clearly labeled. This is confusing for the reader.

      We thank the reviewer for the constructive suggestions. We have modified all relevant figures to read “Basins>Chrimson” above the pink line indicating the period of optogenetic activation.

      Reviewer #2 (Recommendations For The Authors):

      Claims, rigorousness, repeatability, and accuracy of terms.

      (1) In line 254, the authors suggest that the slow response of SeIN128 neurons is due to the input they receive from SEZ, but in line 453, they suggest it is due to axo-axonal connections. However, their evidence does not support one factor over the other. Overall, only the axo-axonal connection was strongly suggested in the discussion. The authors could clarify that the delay of SeIN128 activity may also be caused by multisynaptic connections involving SEZ or other neurons in the last section of the Discussion.

      Although SeIN128 primarily receives inputs from the SEZ, it also receives inputs within the VNC from Basin-2 (Figure 4 – figure supplement 2). Specifically, in the VNC, the axons of SeIN128 make inhibitory synaptic contacts onto the axon of Basin-2, which in turn makes reciprocal excitatory contacts onto the axon of SeIN128, thereby forming a feedback loop. However, by the time we wrote the original discussion, we had inadvertently focused on the potential of the negative feedback loop formed by these axo-axonal synapses in the VNC to mediate the slow response of SeIN128, overlooking the possibility that other as yet unidentified pathways could convey Basin or A00c activity indirectly to SeIN128 dendrites in the SEZ. Therefore, we have revised the original text, which read “These data suggest that the main synaptic inputs onto SeIN128 neurons in the SEZ mediate the slow responses upon activation of Basins or A00c neurons” to “These data suggest that the delay of SeIN128 activity may be caused by multi-synaptic connections involving the SEZ or a feedback loop involving axo-axonal connections between SeIN128 and Basin-2 or A00c” (revised, Lines 259 and 261). Accordingly, we have also adjusted the relevant discussion section to be consistent with this change (Lines 460 and 466).

      (2) Please clarify the following: How does the algorithm define rolling and crawling? Healthy larvae complete 360° rolls, in each roll they rotate from dorsal up to dorsal up. It is possible that a larva rolls for an incomplete cycle and straightens up. Does the algorithm simply label individual frames as “roll”, “non-roll”, or “unknown”, and defines rolling by the existence of “roll” frames? If so, then larvae that rolled for 90° and straightened would be counted as “rolling” though they failed to complete a full rolling bout. Also, how were “hunch” “turn” and “back” identified? Lastly, is there any manual quality control involved? Address this and related issues in the methods:

      a)  Expand the description of the classifier algorithm.

      b)  How are rolling and non-rolling animals defined in the "rolling%" assay? Were all "rolling" animals able to do at least one 360° roll?

      c)  How are "rolling duration" and "end of 1st rolling" defined? Is the algorithm able to distinguish different rolling bouts? In these two assays, were the animals rolled for <1 second (in total or their "first roll") able to complete a 360° roll?

      The Multi-worm Tracker (MWT) records only the contours of animals (no real video image data). Thus, the data fed into the classifier algorithm only includes features based on contour time-series data. The algorithm uses movement perpendicular to the body axis, the characteristic feature of larval rolling, to classify rollers and non-rollers. Although the algorithm cannot determine whether a rolling event involves a rotation of more than 360 degrees, we ensure that rolling events are at least 360 degrees by removing any events that are shorter than 0.2 s (the minimum time to complete a 360-degree roll).

      We have accordingly revised the section of “Behavior detection” relating to the behavior classification algorithm in the methods section as follows (Lines 600 to 620).

      “After extracting behavioral parameters from Choreography, we used an unsupervised machine learning behavior classification algorithm to detect and quantify the following behaviors: hunching (Hunch), headbending (Turn), stopping (Stop), and peristaltic crawling (Crawl) as previously reported (Masson et al., 2020). Escape rolling (Roll) was detected with a classifier developed using the Janelia Automatic Animal Behavior Annotator (JAABA) platform (Kabra et al., 2013; Ohyama et al., 2015). JAABA transforms the MWT tracking data into a collection of ‘per-frame’ behavioral parameters and regenerates 2D dorsal-view videos of the tracked larvae. Based on such videos, we defined rolling as a rotation around the body while the larva maintains a C-shape, which results in a movement perpendicular to larval body axis (Supplementary videos 1 and 2). Using this definition, we trained the algorithm in the JAABA platform by labeling ~10,000 randomly chosen frames as rolling or non-rolling to develop the rolling classifier. If a larva did not curl into a C-shape or move sideways, it was labeled as a “non-roller.” Every animal with at least one rolling event longer than 0.2 s in a given period was labeled as a “roller” (i.e., it was assumed to have rolled at least 360 degrees), based on the observation that when the start and end of rolling events were precisely measured, the algorithm could identify rolling events completed in 0.2 s.

      The rejection of false positives, especially at the beginning and the end of each rolling bout, enhanced accuracy. The algorithm integrated these training labels and parameters generated with Choreography in a time series, such as speed, crabspeed, and body curvature, to generate a score for rolling detection. Above a certain threshold, the classifier labeled the frame as rolling. This classifier, which has false negative and false positive rates of 7.4% and 7.8%, respectively (n = 102), was utilized to detect rolling in this paper.”
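      The post-processing described above (thresholding a per-frame score, discarding rolling events shorter than 0.2 s, and labeling an animal a “roller” if at least one event survives) can be sketched as follows. This is only an illustration under stated assumptions, not the JAABA classifier or the authors' pipeline: the function names, the frame rate, and the score threshold of 0 are hypothetical, and the per-frame scores are invented.

```python
def rolling_bouts(scores, fps, threshold=0.0, min_duration=0.2):
    """Return (start_s, end_s) for each run of frames scoring above `threshold`
    that lasts at least `min_duration` seconds."""
    bouts = []
    start = None
    # A sentinel below any threshold closes a run that reaches the last frame.
    for i, score in enumerate(list(scores) + [float("-inf")]):
        if score > threshold and start is None:
            start = i
        elif score <= threshold and start is not None:
            duration = (i - start) / fps
            if duration >= min_duration:
                bouts.append((start / fps, i / fps))
            start = None
    return bouts


def is_roller(scores, fps, threshold=0.0, min_duration=0.2):
    """An animal counts as a roller if it has at least one qualifying bout."""
    return len(rolling_bouts(scores, fps, threshold, min_duration)) > 0


# Hypothetical per-frame scores at an assumed 10 frames per second.
example_scores = [-1, 1, -1, 1, 1, 1, -1, 1, 1]
bouts = rolling_bouts(example_scores, fps=10)
print(bouts)
print(is_roller(example_scores, fps=10))
```

      With these inputs the single-frame run is rejected as too short, while the three-frame and two-frame runs are kept, so quantities such as the end of the first rolling bout can be read from the first returned interval.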

      Readability of text

      (1) I suggest giving the SS04185 line and SeIN128 neuron common names that are easier to remember and follow (after mentioning their full name once).

      We acknowledge the reviewer’s concerns. However, because SS04185 was initially named using the Janelia split-line pipeline, and SeIN128 was named independently in a more recent study (Ohyama et al., 2015), we have retained these designations in the present manuscript.

      Figures and figure legends

      (1) It would help if the authors could put visual representations of rolling and crawling, such as a cartoon larva performing the rolling-crawling switch, and still frames of rolling and crawling of real larvae, especially in Figure 1. Also, please consider including a video of rolling and crawling in real larvae (preferably comparing control and experimental groups).

      We appreciate the reviewer’s suggestion. We have added a cartoon of the behavioral sequence in Figure 1A, as well as a Figure 1 supplement video based on MWT data, which shows rolling followed by crawling. 

      (2) To give the reader a take-home message, it would help if the authors could make a simplified version of Figure 4A and put it at the end of the paper.

      We thank the reviewer for the suggestion. To assist the reader, we have added schematics depicting how the circuit may function in panel I of Figure 8.

      (3) In Figure 1A, add the text "activation " after the neuron names.

      We have added “Chrimson” following “Basins>” to the new Figure 1B (old Figure 1A) and other figures (Figure 1C and D, Figure 5A, Figure 6A, and figure supplements).

      (4) Figure 1G: a data point is misaligned (at the top of the graph). 

      We have aligned the data point accordingly.

      (5) Figure 1B can benefit from a better design. If possible, please separate the crawling speed into an independent graph (or at least use a different line shape to code for crawling speed and indicate it on the in-graph legend). Is the speed of Basin/SS04185 co-activation studied?

      We appreciate the reviewer’s suggestion. We have separated the plots for rolling and crawling speed into different panels (Figure 1C and D). As shown in Figure 1D, the crawling speed observed during coactivation of Basins and SS04185 was similar to that during activation of Basins alone.

      (6) Figure S1 uses a different color-coding scheme from Figure 1. I suggest making the color coding consistent between figures.

      We are grateful for the reviewer’s suggestion. We have adjusted the color-coding scheme accordingly.

      (7) Line 692 (Figure 2 legend), "Killer Zipper" is misspelled as "Kipper Zipper". Out of curiosity, is there a way to remove or reduce SS04185-DN expression in the same manner as SS04185-MB reduction?

      We have corrected the text in the legend for Figure 2. As for the reviewer’s question, we did attempt to reduce or abolish SS04185-DN expression with tsh-LexA and LexAop-Kip+ but found no effect. Other identified LexA constructs with SeIN128 expression, however, all showed SS04185-MB expression. Consequently, we could not use these constructs because they inhibit both SeIN128 and SS04185-DN.

      (8) The color coding of Figure 2 (especially in D) makes it hard to distinguish between the brown and red groups.

      We thank the reviewer for the suggestion. Accordingly, we have changed the color for the brown group to orange.

      (9) In line 926 (Figure S2 legends), the description of F and G seems inverted.

      We appreciate the reviewer for pointing out the error. We have revised the text from “(F) has only SS04185-MB expression, and (G) has both SS04185-DN and SS04185-MB expression” to “(F) has both SS04185-DN and SS04185-MB expression, and (G) has only SS04185-MB expression.”

      (10) Figure 7B: which line does the top group of asterisks belong to?

      The top group of asterisks indicates that each experimental group differs significantly (p < 0.001) from the control group. We have revised the figure to clarify the comparisons indicated by the asterisks in Figure 7B, as well as the figure legend below (Line 890-894).

      “(B) Cumulative plot of rolling duration. Statistics: Kruskal-Wallis test: H = 69.52, p < 0.001; Bonferroni-corrected Mann-Whitney test, p < 0.001 between control and the GABA-B-R11, GABA-B-R12 and GABA-B-R2 RNAi groups, p < 0.001 between GABA-A-R and all other experimental RNAi groups. Sample size for the colored bars from top (control, black) to bottom (GABA-A-R, red); n = 520, 488, 387, 582, 306.”
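      For readers unfamiliar with the statistics named in this legend, the sketch below shows how a Kruskal-Wallis H statistic is built from pooled ranks and how a Bonferroni correction scales pairwise p-values. It is a plain-Python illustration with invented numbers, not the authors' analysis: the function names are hypothetical, the H statistic omits the tie correction that statistical packages normally apply, and the conversion of H to a p-value (via the chi-squared distribution) is left out.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H from average ranks (no tie correction)."""
    pooled = sorted((v, g) for g, values in enumerate(groups) for v in values)
    n_total = len(pooled)
    rank_sums = [0.0] * len(groups)
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j for tied values
        for k in range(i, j):
            rank_sums[pooled[k][1]] += avg_rank
        i = j
    weighted = sum(r * r / len(g) for r, g in zip(rank_sums, groups))
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3.0 * (n_total + 1)


def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Invented rolling durations (s) for three hypothetical groups.
toy_groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h_stat = kruskal_wallis_h(toy_groups)
adjusted = bonferroni([0.01, 0.04, 0.5])
print(h_stat, adjusted)
```

      The omnibus test asks whether any group differs; the corrected pairwise tests then identify which groups differ from the control.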

      (11) Figure S8 D and F: indicate Basin-2 or Basin-4 activation on graph.

      We have revised Figure 8 – figure supplement D and F accordingly.

      Reviewer #3 (Recommendations For The Authors):

      (1) Lines 86-87: Text needs to be rewritten for clarity. Also, include the genotype in the corresponding figure legend (Figure 1B).

      We thank the reviewer for pointing this out. We have clarified the text accordingly and included the genotype in the figure legend (lines 86 and 87). Specifically, we have revised Figure 1B (New Figure 1C and D) and adjusted the legend accordingly as follows. 

      Lines 86 and 87: Crawling speed during the activation of all Basins following rolling was ~1.5 times that of the crawling speed at baseline (Figure 1D).

      (2) Include the protocol for heat shock-FLP out experiments

      We have added the following paragraph to the Methods section describing the heat shock-FlpOut experiments (lines 537 to 546).

      “Heat shock FlpOut mosaic expression

      First instar Drosophila larvae were exposed to heat shock in a water bath at 37°C for 12 min as previously described (Nern et al., 2015). With precise temporal and thermal control of heat shock, larvae with genotype w+, hs(KDRT.stop)FLP/13xLexAop2-IVS-CsChrimson::tdTomato; R54B01-Gal4.AD/72F11LexA;20xUAS-(FRT.stop)-CsChrimson::mVenus/R46E07-Gal4.DBD showed sporadic CsChrimson::mVenus expression driven by SS04185 split GAL4. As a result, the ratio of the larvae with SS04185-DN and SS04185-MB expression to those with only SS04185-MB expression was 1:1. Each larva was individually examined with optogenetic stimulation and behavior analysis. After behavioral experiments, mVenus expression in CNS was confirmed under the fluorescence microscope.”

      (3) In the immunohistochemistry, the authors exclude the steps for washings. Recommend the authors to cite the previous literature. Similar to the other protocols detailed in the methods.

      We have added a brief description of the steps involved in washing (lines 641 and 648). We have also provided a citation with similar immunohistology protocols (Patel, 1994).

      (4) Keeping the same Y-axis scale for similar graphical representation would be helpful to compare across different experimental conditions and genotypes-for example, 2E and 2H for the start of the first crawl.

      As suggested by the reviewer, we have adjusted the y-axis scales for Figure 2E and H to be identical.

      (5) The color schematics used for the graph make it hard to visualize the data. The author might reconsider the better presentation of the data by avoiding darker colors.

      We thank the reviewer for the constructive suggestion. We have lightened the shading of all violin plots. We have also modified the shading for the middle group in Figure 2C and E from dark brown to orange.

      (6) Co-activation of the SS04185 and Basins in the figures represented as Basins+SS04185 (Figure 1A) and SS04185 (rest of the figures). Authors might reconsider this terminology to define and distinguish the coactivation of SS04185 and Basins neurons from the activation of SS04185 or Basins alone. It needs to be clarified in the figures.

      We have adjusted the terminology by including “Basins>Chrimson” in all panels in which Basin neurons are optogenetically activated to trigger rolling in the background for all groups. Additionally, we have labeled the control group as “Control” and the experimental group as “SS04185”.

      (7) Figure 4A, summarizes the synaptic connection and strength between different neurons - SeIN128, Basins, A00c and mdIV. However, the nature of these synaptic connections - excitatory and inhibitory- is not represented. Based on the previous and current studies, the authors consider providing the schematic for circuit mechanisms of escape behavior sequences in larvae. Also, discussing these findings in light of the downstream output circuit and motor regulation might be informative (See Cooney et al. 2023, PNAS).

      As the reviewer correctly points out, the diagram of the connectome shown in Figure 4A does not indicate whether the connections are excitatory or inhibitory. Accordingly, we have added a new summary panel (Figure 8I) based on the results of examining GABAergic synapses (Figure 5A). The schematics in Figure 8I depict how the joint activity of inhibitory and excitatory synapses (indicated by arrowheads and blunt ends, respectively) may lead to rolling or fast crawling.

      We have also added a section to the Discussion on the premotor circuits for crawling and rolling (Lines 512 – 519).

      (8) Percentage rolling present in figure 5B and 6A correspond to the control larvae 13xLexAop2-IVS-CsChrimson::mVenus; R72F11-lexA/+; HMS02355/+ and 13xLexAop2-IVS- Cs-Chrimson::mVenus; R72F11-lexA/+; UAS-TeTxLC.tnt/+. How does the author interpret the observed variability across the experiments? The author might consider discussing the genetic background effect on the observed behaviors, if any.

      As pointed out by the reviewer, we noticed that rolling probability varied depending on genetic background. We have revised the text accordingly (Lines 277 to 280).

      (9) Recheck the arrowheads in Figure 5A.

      We have confirmed the positions of the arrowheads in Figure 5A and modified the figures by outlining the cells with dotted lines.

      (10) Lines 295-298: Data presented in the supplementary figure and p-values in the text (p=0.11) suggest that the first crawl's onset is comparable to controls. Rewrite this text for clarity and include the statistical values in the supplemental figure 6.

      We have revised the text as follows (Lines 302 to 305).

      “Although the duration of each rolling bout, time to onset of the first rolling bout, and time to onset of the first crawling bout did not differ from those of controls (Figure 6–figure supplement 1D, E and G), the time to offset of the first rolling bout was delayed relative to controls (p = 0.013 for Figure 6–figure supplement 1F).”

      (11) Lines 263-264: The data provide evidence that SS04185 receives inputs from Basin-2 and A00c neurons. Whether SS04185 provides inputs to other neurons, specifically A00c neurons, still needs clarification.

      We have revised the text as follows (Lines 264 to 266).

      “The results thus far indicate that activation of SeIN128 neurons inhibits rolling (Figure 1A–C); SeIN128 neurons receive functional inputs from Basin-2 and A00c (Figure 4A–C); and SeIN128 neurons make anatomical connections onto Basin-2 and A00c (Figure 4A).”

      (12) In the table that lists the genotypes, instead of '-' or the blank space in the label column, the author might consider using 'control,' consistent with the figures.

      In accordance with the reviewer’s suggestion, we have replaced ‘-’ and blank entries with ‘control’ for all figures.

      (13) Check the typographical errors throughout the manuscript. Some below:

      We have revised the text accordingly as suggested below.

      a.  Lines 100, 142: SS4185 should be SS04185

      b.  Line 230: A00C should be A00c

      c.  Line 180: Expand VNC

      d.  10xUAS-IVS-mry::GFP should be 10xUAS-IVS-myr::GFP

      e.  Lines 444, 449: drosophila should be Drosophila

    1. Reviewer #1 (Public Review):

      Summary:

      This work aims to understand the role of the projection from the thalamic POm to the dorsolateral striatum (DLS) in learning a sensorimotor associative task. The authors first confirm that POm forms "en passant" synapses with some of the DLS neuronal subtypes. They then use a go/no-go associative task in which the mouse learns to discriminate between two different textures and to associate one of them with an action. During this task, they either record the activity of the POm-to-DLS axons using endoscopy or silence their activity. They report that POm axons in the DLS are activated around the sensory stimulus but that the activity is not modulated by the reward. Last, they show that silencing the POm axons at the level of the DLS slows down learning of the task.

      The authors show convincing evidence of projections from POm to DLS and that POm inputs to DLS code for whisking regardless of the outcome of the task. However, their results do not allow us to conclude whether more neurons are recruited during the learning process or whether the already active fibres become more strongly activated. Last, because POm fibres in the DLS also project to S1, silencing the POm fibres in the DLS could have affected inputs to S1 as well; therefore, the slowdown in acquiring the task is not necessarily specific to the POm-to-DLS pathway.

      Strengths:

      One of the main strengths of the paper is that it goes from slice electrophysiology to behaviour to provide an in-depth characterization of one pathway. The authors give a comprehensive description of the POm projections to the DLS, using transgenic mice to unambiguously identify the DLS neuronal populations. They also used a carefully designed sensorimotor association task, and they analysed the results in depth.

      It is a very nice effort to have measured the activity of the axons in the DLS not only after the mice have learned the task but throughout the learning process. It shows the progressive increase of activity of POm axons in the DLS, which could imply that there is a progressive strengthening of the pathway. The results show convincingly that POm axons in the DLS are not activated by the outcome of the task but by the whisker activity, and that this activity on average increases with learning.

      Weaknesses:

      One of the main striatal targets of thalamic input is the cholinergic neurons, which were not investigated here. Is there information that could be provided?

      It is interesting to know that the POm projects to all neuronal types in the DLS, but this information is not used further in the manuscript, so the only take-home message of Figure 1 is that the axons that the authors image or silence in the DLS are indeed connected to DLS neurons and are not just passing fibres. Along these lines, are these axons the same as the ones projecting to S1? If so, why would we expect the axon activity at the DLS level to behave differently from that at S1?

      The authors used endoscopy to measure the activity of POm axons in the DLS, which makes it impossible to know whether the progressive increase in the POm response is due to an increase in the activity of each individual neuron or to the progressive recruitment of new neurons.

      The picture presented in Figure 4 of the stimulation site is slightly concerning as there are hardly any fibres in neocortical layer 1 while there seems to be quite a lot of them in layer 4, suggesting that the animal here was injected in the VB. This is especially striking as the implantation and projection sites presented in Figures 1 and 2 are very clean and consistent with POm injection.