10,000 Matching Annotations
  1. Last 7 days
    1. California was flush with thousands of immigrants. Most expected it would become a free state, since the Mexican ban on slavery remained in force.

      I knew that California was flush with immigrants, but I did not know that most expected it to become a free state or that the Mexican ban on slavery remained in force.

    1. addbacks

      I think it would be safer/cleaner to re-compute daily visibility and then group by / count per country, day, (optionally taxo). It would also be nice to read this all in SQL CTEs. I don't trust myself to verify the logic on pre-computed tables, moving from SQL to Python, etc., but if someone else feels they can verify it, that's okay with me. :-)

    2. AND sj.TS::DATE = DATEADD('day', -1, m.first_match_date)

      For a given day, this only counts the jobs newly removed that day. But don't we want to add a job back on every day it should not have been removed?
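
      The difference between the two counting rules can be sketched in a few lines of Python. This is only an illustration: the job names, the dates, and the `hidden_days` mapping are hypothetical and are not taken from the query under review.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical data: for each job, the days on which the rule wrongly hid it.
hidden_days = {
    "job_a": [date(2024, 1, 1) + timedelta(days=i) for i in range(3)],  # Jan 1-3
    "job_b": [date(2024, 1, 2) + timedelta(days=i) for i in range(2)],  # Jan 2-3
}


def newly_removed_per_day(hidden):
    """Count each job once, on the first day it was hidden."""
    return Counter(min(days) for days in hidden.values())


def addbacks_per_day(hidden):
    """Count each job on every day it was hidden."""
    return Counter(day for days in hidden.values() for day in days)


first_only = newly_removed_per_day(hidden_days)
every_day = addbacks_per_day(hidden_days)
```

      With this toy input, the first rule reports nothing for the last day even though both jobs are still hidden, while the second rule counts both of them.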

    3. the rule changes the job’s visibility level from organic/jobalert to sponsored

      Same edge case as above, but does it explicitly have to change the visibility or does it just ensure that visibility on days 31+ is sponsored?

    4. before the rule fired

      Weird edge case related to my comment above: what if the job was not visible on days <30 for a different reason, then became eligible for visibility on day 31 but incurred rule 244435, and so stayed invisible on days 31+? We would want to add this back, right?

    5. LEFT JOIN other_rules ON fm.jobid = other_rules.jobid

      I think this should be joined on jobid + day, since it's possible those rules applied on day(s) <30. For example if job 123456 has this time series:

      • Day 1: [] --> organic viz
      • ...
      • Day 26: [] --> organic viz
      • Day 27: [236670] --> sponsored viz
      • Day 28: [236670] --> sponsored viz
      • Day 29: [] --> organic viz
      • Day 30: [] --> organic viz
      • Day 31: [244435] --> sponsored viz
      • Day 32: [244435] --> sponsored viz
      • ...

      ... then we would not add this job back, because we just join on distinct jobid. We should consider only cases where the other rules applied after day 30. I think the most robust way to do this is to add back per day. We can work on an alternative query to see how different the results are.
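
      A minimal Python sketch of the two join strategies, applied to the time series above. The rule semantics are an assumption made for illustration (a day counts as an add-back when rule 244435 fired and no other rule did), and the function names are hypothetical rather than part of the reviewed query.

```python
TARGET_RULE = 244435
OTHER_RULES = {236670}


def rules_for_day(day):
    """Rules that fired on a given day for the example job 123456."""
    if day in (27, 28):
        return {236670}
    if day >= 31:
        return {244435}
    return set()


# day -> set of rule ids, for days 1 through 35
series = {day: rules_for_day(day) for day in range(1, 36)}


def addback_days_join_on_jobid(series):
    """Join on jobid only: drop the job if another rule fired on any day."""
    if any(rules & OTHER_RULES for rules in series.values()):
        return []
    return [d for d, rules in sorted(series.items()) if TARGET_RULE in rules]


def addback_days_join_on_jobid_and_day(series):
    """Join on jobid + day: drop only the days on which another rule fired."""
    return [
        d for d, rules in sorted(series.items())
        if TARGET_RULE in rules and not (rules & OTHER_RULES)
    ]


by_jobid = addback_days_join_on_jobid(series)
by_jobid_and_day = addback_days_join_on_jobid_and_day(series)
```

      Under these assumptions the jobid-only join yields no add-back days for this job, because rule 236670 fired on days 27 and 28, while the per-day join keeps days 31 onward.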

    1. Literature reviews range from exhaustive searches to mere summaries of articles, but the fundamental objective is always the same—to establish the history of the problem investigated by summarizing the what, how, and why of the work that has already been done. Writing a literature review requires you to establish relationships among findings from other researchers and to condense many pages of published material into shorter segments. Your ability to assimilate material is critical.

      This stands out as the most important section needed in one's report. If the reader can tell you did extensive research on the history of the problem, they will begin to gauge how many people were affected by the issues. It can help provide details of what was done in the past to try to correct it, or whether the problem was repeatedly ignored. Once you've presented your solution, they will be less likely to question it, since you will have detailed reasons as to why the suggested solution will work.

    2. If the submittal letter, executive summary or abstract, and introduction strike you as repetitive, remember that readers do not necessarily start at the beginning of a report and read page by page to the end. They skip around—they may scan the table of contents, and then skim the executive summary for key facts and conclusions.

      This is an important note for me because I try to make every section of my writing look different from the others. This paragraph is saying that things should look similar because we have to keep our audience in mind. If certain details are important, they should be repeated throughout the report so the reader can catch them if they missed them in the first section.

    3. The recipient has not requested your report. With unsolicited reports, you must convince the recipient that a problem or need exists in addition to convincing them to accept your conclusions and/or implement change.

      This section stands out to me because it helps me understand that the report our group is choosing to do is identified as unsolicited. We found a problem and are trying to convince Montclair administrators of changes that can improve the situation at hand.

    1. The authors used a clinical prediction model to estimate the individual risk of developing breast cancer, but it is questionable whether models can do this. A recently updated description of these models reads (1): An individual’s risk of an event as estimated by a prediction model refers to the probability of that event in the subgroup of individuals with the same predictor values. Truly individual risks do not exist because of the ‘reference class problem’: an individual can belong to an infinite number of subgroups. By choosing a different set of predictors to condition on, the reference class changes, and so does the risk.

      It may seem paradoxical that using prediction models for decision making can benefit outcomes at a population level without having trustworthy risk estimates for individuals from that population. ... Clinical utility at the population level, for example in terms of cost-effectiveness, is sufficient to warrant the use of a decision strategy based on a model. We may need to accept that two models may have the same clinical utility yet very different risk estimates for the same individual, often to the extent that model A recommends treatment initiation, but model B does not.
      

      Another concluded that (2):

      clinical prediction models cannot, do not, and need not estimate individual risk.
      

      Since there is no evidence that individual risk exists (3) and clinical prediction models don't estimate individual risk, these models are better understood as risk stratifying a population. They separate a population into groups differing in risk, not individuals differing in risk, which can then be used to devise policies that efficiently allocate surveillance or preventive measures. This may make sense for the general population, but is unlikely to be useful for BRCA1 and BRCA2 pathogenic variant carriers. If predictors are added to a model, the risk distribution often broadens (detected by measures of discrimination), which may allow even more efficient allocations. This was nicely shown by the figure depicting the broader risk distribution resulting from addition of the polygenic risk score to age alone.

      Discordance in predictions for an individual across models is well documented and has been called "the multiverse of madness" (4), having been observed with breast cancer risk models (5) and polygenic risk models (6). This does not impact their use for population risk stratification, but should preclude their use for individual risk estimation.
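
      The reference class problem behind this discordance can be shown with a toy calculation. The sketch below uses an entirely hypothetical cohort (the counts are invented for illustration and come from none of the cited studies) and two made-up predictors; it computes the risk assigned to the same person depending on which predictors are conditioned on.

```python
# Hypothetical cohort: (age_group, family_history) -> (people, events).
cohort = {
    ("50+", True): (100, 20),
    ("50+", False): (300, 20),
    ("<50", True): (100, 10),
    ("<50", False): (500, 10),
}


def subgroup_risk(cohort, age_group=None, family_history=None):
    """Event proportion among people matching the predictors that are given.

    A predictor left as None is not conditioned on.
    """
    people = events = 0
    for (age, history), (n, k) in cohort.items():
        if age_group is not None and age != age_group:
            continue
        if family_history is not None and history != family_history:
            continue
        people += n
        events += k
    return events / people


# The same person (aged 50+, with a family history) under three reference classes.
risk_age_only = subgroup_risk(cohort, age_group="50+")
risk_history_only = subgroup_risk(cohort, family_history=True)
risk_both = subgroup_risk(cohort, age_group="50+", family_history=True)
```

      In this invented cohort the same person is assigned a risk of 10%, 15%, or 20% depending on the reference class, while each figure is a correct event proportion for its own subgroup.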

      Holmberg and Parascandola critiqued the use of breast cancer risk models, including the presentation of numbers as personalized risks (7): But we would argue that the more general problem here is with the promise of individualised risk information, which implies a sense of authority and certainty about the individual. Framing risk estimates as individualised may be misleading as it implies a high level of specificity for the individual. The use of individualised risk estimates suggests that a risk model says something personal about a specific individual rather than an entire group.

      Models including more or better predictors or using machine learning may provide greater discrimination but will not evade the reference class problem. So, this problem will persist, and geneticists using clinical prediction models need to understand this counterargument to the conventional narrative on individual risk. To date, the reference class problem has been a well-kept secret in medicine, with only 10 results in PubMed. A new points-to-consider statement from the ACMG (8) does caution that risk estimates cannot be directly applied to an individual and that discordant risks may be estimated. Nevertheless, personalized or individualized risk estimates remain the goal (with future model improvements expected to make this possible) and the discordance is not flagged, as it usually is, as a reason to question the use of these numbers. There is no discussion of the reference class problem and its implication that individual risks don't exist and that the continuum of risk applies to groups, not individuals, so models provide population risk stratification suitable for policy development but not for counseling.

      As geneticists considering counseling based on a clinical prediction model need to know about the reference class problem and its implications, this merits inclusion in the discussion of limitations when the research is eventually published.

      Ralph Stern, Division of Cardiovascular Medicine, Department of Internal Medicine, University of Michigan

      1. Barreñada L, Steyerberg EW, Timmerman D, Thomassen D, Wynants L, Van Calster B. The fundamental problem of risk prediction for individuals: health AI, uncertainty, and personalized medicine. arXiv. Preprint posted online 2025. doi:10.48550/ARXIV.2506.17141

      2. Stern RH. Accuracy of Preoperative Risk Assessment: Individuals versus Groups. Am J Med. 2025;138(6):926-927. doi:10.1016/j.amjmed.2025.02.007

      3. Dawid P. On individual risk. Synthese. 2017;194(9):3445-3474. doi:10.1007/s11229-015-0953-4

      4. Riley RD, Pate A, Dhiman P, Archer L, Martin GP, Collins GS. Clinical prediction models and the multiverse of madness. BMC Med. 2023;21(1):502. doi:10.1186/s12916-023-03212-y

      5. Paige JS, Lee CI, Wang PC, et al. Variability Among Breast Cancer Risk Classification Models When Applied at the Level of the Individual Woman. J Gen Intern Med. 2023;38(11):2584-2592. doi:10.1007/s11606-023-08043-4

      6. Clifton L, Collister JA, Liu X, Littlejohns TJ, Hunter DJ. Assessing agreement between different polygenic risk scores in the UK Biobank. Sci Rep. 2022;12(1):12812. doi:10.1038/s41598-022-17012-6

      7. Holmberg C, Parascandola M. Individualised risk estimation and the nature of prevention. Health, Risk & Society. 2010;12(5):441-452. doi:10.1080/13698575.2010.508835

      8. Pal T, Christopher J, Astiazaran-Symonds E, et al. Consideration of inherited cancer risk on a continuum: An international and multidisciplinary perspective: A points to consider statement of the American College of Medical Genetics and Genomics (ACMG). Genet Med. 2026;28(3):101659. doi:10.1016/j.gim.2025.101659
    1. eLife Assessment

      The authors present useful findings demonstrating that the RNA modification enzyme Mettl5 regulates sleep in Drosophila. Through transcriptome- and proteome-wide analyses, the authors identified downstream targets affected in heterozygous mutants and proposed that Mettl5 regulates the translation and degradation of clock genes to maintain normal sleep function. Through additional analyses, the authors provided solid evidence supporting this model.

    2. Reviewer #1 (Public review):

      Summary:

      Here the authors attempted to test whether the function of Mettl5 in sleep regulation was conserved in Drosophila, and if so, by which molecular mechanisms. To do so they performed sleep analysis, as well as RNA-seq and ribo-seq in order to identify the downstream targets. They found that the loss of one copy of Mettl5 affects sleep, and that its catalytic activity is important for this function. Transcriptional and proteomic analyses show that multiple pathways were altered, including the clock signaling pathway and the proteasome. Based on these changes the authors propose that Mettl5 modulates sleep through regulation of the clock genes, both at the level of their production and degradation, possibly by altering the usage of the aspartate codon.

      Comments on revisions:

      The authors addressed all my comments satisfactorily.

    3. Reviewer #3 (Public review):

      Xiaoyu Wu and colleagues examined a potential role in sleep of a Drosophila ribosomal RNA methyltransferase, mettl5. Based on sleep defects reported in CRISPR-generated mutants, the authors performed both RNA-seq and Ribo-seq analyses of head tissue from mutants and compared them to control animals collected at the same time point. A major conclusion was that the mutant showed altered expression of circadian clock genes, and that the altered expression of the period gene in particular accounted for the sleep defect reported in the mettl5 mutant. In this revision, the authors have added a more thorough analysis of clock gene expression and show that PER protein levels are increased relative to wild-type animals at specific times of day, indicating increased stability of the protein. Given that PER inhibits its own transcription, the per RNA is low in the mutants. The revised manuscript included efforts toward a more detailed understanding of how clock gene expression was altered in the mutants, as well as other clarification of sleep phenotypes.

      Comments on revisions:

      All critiques have been addressed by the authors; the manuscript is much improved from its original submission. Thank you.

    4. Author Response:

      The following is the authors’ response to the previous reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Here, the authors attempted to test whether the function of Mettl5 in sleep regulation was conserved in Drosophila, and if so, by which molecular mechanisms. To do so they performed sleep analysis, as well as RNA-seq and ribo-seq in order to identify the downstream targets. They found that the loss of one copy of Mettl5 affects sleep, and that its catalytic activity is important for this function. Transcriptional and proteomic analyses show that multiple pathways were altered, including the clock signaling pathway and the proteasome. Based on these changes the authors propose that Mettl5 modulates sleep through regulation of the clock genes, both at the level of their production and degradation, possibly by altering the usage of the aspartate codon.

      Comments on revised version:

      The authors satisfactorily addressed my comments, even though the precise mechanism by which Mettl5 regulates translation of clock genes remains to be firmly demonstrated.

      Reviewer #3 (Public review):

      Xiaoyu Wu and colleagues examined a potential role in sleep of a Drosophila ribosomal RNA methyltransferase, mettl5. Based on sleep defects reported in CRISPR-generated mutants, the authors performed both RNA-seq and Ribo-seq analyses of head tissue from mutants and compared them to control animals collected at the same time point. A major conclusion was that the mutant showed altered expression of circadian clock genes, and that the altered expression of the period gene in particular accounted for the sleep defect reported in the mettl5 mutant. In this revision, the authors have added a more thorough analysis of clock gene expression and show that PER protein levels are increased relative to wild-type animals at specific times of day, indicating increased stability of the protein. Given that PER inhibits its own transcription, the per RNA is low in the mutants. Efforts toward a more detailed understanding of how clock gene expression was altered in the mutants, as well as other clarification of sleep phenotypes throughout, are appreciated. As noted above, a strength of this work is its relevance to a human developmental disorder as well as the transcriptomic and ribosomal profiling of the mutant. However, there still remain some minor weaknesses in the manuscript. This reviewer is not in agreement with the interpretation of the epigenetic experiments. Specifically, co-expression of Clk[jrk] or per[01] with the mettl5 mutant recovered the nighttime sleep phenotype, but was additive to the daytime sleep phenotype such that double mutants showed higher sleep. This effect should be acknowledged and discussed. Overall, this is an interesting paper that indicates a molecular link between mettl5 and the circadian clock in regulation of sleep.

      Recommendations for the authors:

      Reviewer #3 (Recommendations for the authors):

      The authors misunderstood my original comment for Fig 1A. Please provide an explanation for the significance of the boxed region. There is little or no detail in the legend to help guide the reader.

      The information has been added to the figure legends for Figure 1A.

      Efforts toward improving analysis of circadian genes as well as sleep phenotypes (sleep onset time, rebound, etc.) are much appreciated, thank you. However, the Figure S1H and G panel labels are mixed up; please label them in the order that they appear and so that they correspond to the main text. Why is Figure S1H labeled "ZT 14"?

      Sleep latency is defined as the time from preparing to sleep to actually falling asleep. In this study, it specifically refers to the time taken for each individual fly to reach the sleep phenotype (i.e., 25 minutes of continuous sleep). We noted that this label was misleading, as the actual time to reach the sleep phenotype varied among individual flies. Therefore, in the revised figures, we have removed the ZT14 label. In addition, we have corrected the labeling of Figures S1G and S1H to ensure they appear in the correct order and correspond accurately to the descriptions in the main text.

      Unfortunately, based on Fig S1A-C, I am not convinced that mettl5 localizes to neurons, as there are no cells that show double labelling. This figure does not support the statements "we found expression in both neurons (colocalizing with ELAV staining: Figure S1A-C)" (lines 91-92) and "Mettl5-Gal4 is expressed in distinct neurons and glia that appear crucial for sleep regulation." (line 297). What "distinct" sleep-related neurons were labeled? The staining in Fig S1A shows a different distribution from that in Fig S1D, and so it's possible this was a technical issue. Is there a better example?

      Thank you for your careful review and valuable comments. We agree that the colocalization of METTL5 with the neuronal marker ELAV is relatively sparse. However, as indicated by the arrows in Fig S1A–C, we did observe a few cells showing clear double labeling. These examples support the presence of METTL5 expression in neurons, albeit at a low frequency.

      In Figure 4G-H, please indicate the time of day of tissue collection.

      In Figure 4G-H, the tissue was collected at ZT0. We have now indicated this time point in the figure and legend to clarify the experimental timing.

      As noted in the public comment, I remain in disagreement with the assessment that "the double mutant showed the similar phenotype as downstream genes". The striking significant increase in daytime sleep in the double mutants remains unexplained. No further experiments are necessary, but this should be acknowledged in the text. Instead of an epistatic effect, given that overall sleep is high in the double mutants, another possible explanation is that the flies are sick and so are less active and sleeping more.

      Thank you for your suggestion. This has been acknowledged in the text. “Genetic epistasis experiments further supported this model, with clock gene mutants modified Mettl5 mutant phenotypes that suggesting both Clock and Per downstream of Mettl5 (Figure 4I-N, Table 1). Secondary effect may exist for the significant increase in daytime sleep in the double mutants.”

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      The study by Amal et al. investigates how signaling cues regulate epithelial permeability using Drosophila oogenesis as a model system. During mid-oogenesis, a process known as patency occurs, in which tricellular junctions within the follicular epithelium transiently open, allowing yolk proteins to be transported from the hemolymph to the oocyte. The authors demonstrate that the spatial pattern of patency along the anterior-posterior axis of the egg chamber is inversely correlated with the activity gradient of TGF-β signaling. They further show that TGF-β signaling inhibits vertex opening and influences both actomyosin contractility and DE-cadherin levels. Importantly, although DE-cadherin is required for the TGF-β-dependent suppression of vertex opening, elevated actomyosin contractility itself does not appear to be required for this effect. Overall, this is a well-executed study that links a tissue patterning signal to the regulation of epithelial permeability. The experiments are clearly presented, and the quantification and statistical analyses are rigorous. I nevertheless have several points that should be addressed, either through additional experiments or through further discussion in the manuscript.

      Main Points

      1. Suppressing the effect of activated Tkv (TkvQD) by mad depletion is indeed good yet indirect evidence for the involvement of canonical (Mad-dependent) TGF-β signaling. I believe a more direct way to reach this conclusion would be the generation of anterior mad loss-of-function clones, which should mimic the tkv8 phenotypes.
      2. On a more general note, most of the results of the paper are based on the hyperactivation of the pathway using TkvQD overexpression. I find this limiting for two reasons. First, the levels of TGF-β signaling are abnormally high under these conditions. In this context, the interpretation of the contribution of TGF-β-induced MyoII and MyoII activity is unclear. The authors find that TGF-β signaling activates MyoII activity; however, inhibiting actomyosin contractility by various means did not restore vertex opening. This is, however, at levels of Tkv activity that are far beyond normal (TkvQD). At the same time, the same manipulations are sufficient to open vertices in cells that experience peak, endogenous levels of Tkv activity (anterior cells). Does endogenous Tkv signaling induce MyoII, MyoII activity, Rho1 in anterior cells? Addressing this in tkv8 mosaics would be helpful. I can imagine that, unlike Cadherin, which seems to be epistatic to TkvQD, it is very difficult to exclude a contribution of TGF-β-mediated actomyosin contractility, and there is probably not a good experiment to address this. However, I do not agree with the statement of line 174, "Although.... MyoII activity is dispensable for TGF-β-mediated inhibition of vertex opening...". I think it would be more appropriate to state that "MyoII is dispensable for the abnormally/experimentally high TGF-β signaling-mediated inhibition of vertex opening...". The explanation would be that under these conditions the exceptionally high TGF-β signaling bypasses the need for MyoII (maybe through exceptionally high adhesion). This is apparently not the case at physiological levels of TGF-β signaling in anterior cells. Second, high levels of TkvQD, a protein that has been found to localize at junctions in other systems, might have secondary effects on vertex opening, for example by affecting their structural integrity or even by affecting endocytosis.
      3. The effects of clonal manipulation of TGF-β signaling within the clones are clear and solid. Although this would not affect the statements of this paper, it would be good if the authors could comment on the effects at clone boundaries. What happens to "hybrid" TCJ when wild-type cells (at the respective position and patency status) meet a clone with elevated or reduced TGF-β signaling?
      4. From a TGF-β signaling-centric point of view: In this and other tissues, most of the TGF-β signaling effects are mediated through the transcriptional repressor Brinker. The pattern of Brk expression at the patency stage is inverse to the pMad/TGF-β signaling activity (pMad represses brk transcription) and would in principle be identical in its graded profile to the pattern of vertex opening. Did the authors try to manipulate levels of Brk? Is it possible to restore tkv8 phenotypes by simultaneously depleting brk?

      Minor points

      • Contrary to what is stated, not all egg chambers seem to be at stage 10A in Fig. 1. Are the eggs shown in C older?
      • The box in 2A is very hard to see.
      • It is hard to correlate the dad::GFP-nls staining of 2A with the intensity profile of 2B. Is the quantification really at the sub-apical region as stated in the legend?

      Significance

      The findings of this study are highly significant and likely to be of broad interest, as they establish a strong link between a signaling pathway (TGF-β signaling), best known for its role in gene expression and tissue patterning, and a highly dynamic cellular process-the remodeling of epithelial junctions that regulates epithelial permeability. While the involvement of TGF-β signaling in this process is not entirely new (see Row et al., iScience, 2021), the present study provides a more detailed analysis and offers a molecular explanation linking TGF-β signaling to epithelial junction patency.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Amal et al investigate how canonical TGF-β signaling regulates tricellular junction (TCJ) remodeling during follicular patency in the Drosophila ovarian follicular epithelium. Using genetic mosaics, quantitative imaging, and perturbations of signaling and cytoskeletal pathways, the authors show that TGF-β signaling suppresses patency in a cell-autonomous manner.

      The authors convincingly show that TGF-β signaling prevents remodeling of tricellular junctions (TCJs) during patency. The figures and quantitative analyses are of an excellent standard, and I commend the authors on the clarity of their data presentation. Previous work from this laboratory demonstrated that patency is regulated by actomyosin activity. In the present study, the authors show that although TGF-β signaling increases actomyosin contractility, perturbation of downstream effectors of actomyosin contractility does not rescue the patency defect caused by constitutively active TGF-β signaling. This is a surprising and interesting result.

      The authors then show that TGF-β regulates patency through effects on E-Cadherin. However, the mechanism by which TGF-β signaling regulates E-Cad remains somewhat unclear. Although the authors show that E-Cad levels appear elevated when TGF-β signaling is activated, E-Cad overexpression alone does not affect patency. The authors also test whether the effect reflects a broader change in adhesion proteins by examining Fas2 and N-Cad, which appear unchanged, suggesting that the effect is specific to E-Cad.

      The introduction and discussion are scholarly and cite the appropriate literature. Overall, the manuscript is rigorous, clearly presented, and ready for publication.

      The experimental approaches are described in sufficient detail to allow reproduction, and the statistical analysis and quantification appear appropriate. The experiments appear adequately replicated, and the presentation of the quantitative data is clear.

      Major comments:

      N numbers (cells/egg chambers) for the experiments appear to be missing. Please add these details.

      The single images in the supplementary figures showing no change in the localization of Fas2 and N-Cad are not convincing. The authors should quantify these data.

      Minor comments:

      Figure 2A: Instead of sagittal sections through egg chambers, it may be more informative to show the imaging plane that highlights the surrounding follicular epithelium, which would better illustrate the spatial organization of the follicle cells.

      Lines 73-85: Consider referring the reader to Figure 1A earlier in the text to help orient the reader to the architecture of the egg chamber.

      It would also be helpful to include the abbreviation CPFC in the schematic in Figure 1A to make the terminology consistent with the text.

      Significance

      This is an exceptionally well-written and well-presented manuscript. The story presented is logical and the work is carefully executed with top-level figures and quantification. The manuscript is logically organized and controls and statistical tests are appropriate. The authors provide convincing evidence through careful genetic manipulations that TGF-β signaling suppresses vertex opening primarily by reinforcing E-Cad-dependent adhesion rather than through actomyosin contractility.

      A particular strength of the study is the clear dissection of two potential downstream pathways of TGF-β-regulated patency (actomyosin contractility and E-Cad-mediated adhesion) and the demonstration that the suppression of patency depends primarily on E-Cad function. The manuscript represents a conceptual advance over the lab's previous work by demonstrating that patency is regulated by an upstream signaling pathway. Whereas earlier studies from this group established the cell biological mechanism of patency, this work shows that TGF-β signaling acts as a regulatory input controlling this process.

      The main limitation of the study is that the downstream molecular mechanism linking TGF-β signaling to stabilization of E-Cad at tricellular vertices remains only partially defined. While the authors show that TGF-β signaling increases E-Cad levels and promotes its retention at vertices, how this is achieved remains unclear. The data implicate p120-catenin as a possible contributor, but it does not appear to be required, leaving the mechanistic basis of E-Cad stabilization incompletely resolved.

      The primary advance of the study is conceptual and mechanistic, showing that morphogen signaling can control TCJ integrity by stabilizing cadherin-based adhesion independently of actomyosin contractility. The work therefore advances our understanding of how epithelial junction remodeling is regulated during development in the common model system of the Drosophila ovary.

      In my opinion, the manuscript is exceptionally well presented and appropriate for publication essentially as-is.

      The primary audience for this work will be researchers studying epithelial biology, morphogenesis and developmental cell biology, primarily those working in Drosophila. The manuscript will also be of interest to the broader cell and developmental biology community because it provides evidence for how signaling pathways and morphogen patterning regulate epithelial architecture and barrier function.

      My expertise lies in epithelial morphogenesis, cell-cell adhesion, junction dynamics, and developmental cell biology and I use the Drosophila ovary as a model system. I reviewed the previous paper from this lab that went to Current Biology.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript, the authors explore how TGF signaling inhibits patency in the follicular epithelia of the Drosophila ovary. In this setting, patency is the opening of the tricellular junctions within the follicular epithelium (FE) covering the ovary to allow the transfer of yolk proteins into the underlying ovary. The authors first demonstrate that there is an inverse correlation between levels of Dpp signaling (based on a Dad-GFP reporter) to both the vertex (tricellular junction) opening size and the "circularity" of the FE cells, with Dpp signaling being highest at the anterior end. They show that activated Dpp signaling (Dad-GFP signal) is highest in the most anterior FE as are the highest levels of F-actin and MyoII (mCherry reporter) and that ectopic activation of Dpp signaling (using an activated receptor) in posterior FE cells is sufficient to induce higher levels of RhoI, junctional F-actin and MyoII at the tricellular junctions. However, neither knockdown of RhoI nor expression of a dominant negative form of MyoII have any impact on whether Dpp signaling blocks patency. Thus, although activated by Dpp signaling, MyoII activation is not required for Dpp to block patency. They show that Ecad is not present in the patent tricellular junctions, although it is present earlier and that Dpp signaling is required for enhanced levels of Ecad in anterior FEs and is sufficient to induce Ecad transcription (based on a lacZ reporter in the Ecad gene) and to increase Ecad protein levels. They show that Ecad is required to block patency regardless of Dpp signaling. They show that MyoII activity is not required for Dpp enhancement of Ecad protein levels. They show that Dpp signaling can increase p120cat levels and that p120ctn can increase Ecad levels. However, knockdown of P120cat has no effect on patency in either WT or TKV activated FEs.

      The experiments are nicely done and illustrated, and the paper is well written.

      I think the authors are overstating what they can conclude in both the title and abstract.

      Significance

      I think some of the conclusions cannot be made with the data in hand. Overall, the authors have shown that Dpp signaling enhances levels of several proteins that would be thought to block patency (Rho1, MyoII, F-Actin, p120cat, and Ecad (transcriptionally)). They have shown that, except for Ecad, knockdown of most of these does not affect Dpp-dependent patency. However, showing that patency is severely enhanced in both WT and Dpp-activated cells with loss of Ecad is not sufficient evidence that Dpp signaling works through Ecad. Taking away Ecad is going to cause near or complete loss of AJs - thus, it is no surprise that patency is enormously increased everywhere. Importantly, overexpression of Ecad (or of p120cat, which increases Ecad levels) did not block patency. Indeed, it seems like the only manipulation that mimics the effects of Dpp activation on patency is blocking endocytosis - so this seems a likely mechanism (it could also explain the higher levels of p120cat and/or Ecad at junctions). Overall, I agree that the authors can conclude that the Rho1 activation of MyoII observed downstream of Dpp signaling does not impact repression of patency. However, since overexpression of Ecad had no impact on patency, I think they can only conclude that Ecad expression is enhanced downstream of Dpp signaling but that this increase in Ecad expression is insufficient to block patency on its own. Thus, the title and abstract should be modified to more accurately reflect the conclusions that can be made.

      Minor suggestions

      Figure 1G. Please clearly indicate where the clone of tkv8 null cells is located within the follicular epithelium.

      In my opinion, both supplemental figures should be included in the main body of the paper. They make important points relevant to the conclusion. Figure S1 should be included as part of Figure 3. Figure S2 should be included as a stand-alone figure, as there are currently only six figures in the manuscript and the panel in that figure showing that blocking endocytosis blocks patency is an interesting and potentially relevant finding.

      In its current state, the paper is most appropriate for a specialized reader in the field of Drosophila oogenesis. If the authors were to follow up on a potential link between Dpp signaling and endocytosis and find such a link, then I think it would be of more general interest.

      The time estimate below is based on not doing major experiments. If the authors were to follow up on the observation regarding endocytosis, it would be more in the 6 month range.

    1. : ‘Proletarian education needs first and foremost a framework, an objective space within which education can be located. The bourgeoisie, in contrast, requires an idea toward which education leads’ (Benjamin, 1999, p. 202)

      freire?

    1. eLife Assessment

      This study provides valuable insights into the role of MATR3 in oocyte maturation and folliculogenesis, using conditional knockout mice and in vitro follicle culture systems to show that MATR3 is required for oocyte growth and gene transcription, with downstream effects on follicle development. The strength of the evidence is incomplete, as key findings lack independent validation, methodological details are insufficient, and inconsistencies in data presentation reduce confidence in the conclusions. The work will be of interest to researchers in reproductive biology and fertility.

    2. Reviewer #1 (Public review):

      Summary:

      This study aims to clarify MATR3's function and molecular mechanism in oocyte growth and maturation, and to explore its association with OMA and its potential as a diagnostic and therapeutic target, using specific knockout mouse models, human OMA samples, and multi-omics technologies. It has fully achieved its preset objectives, with results strongly supporting the conclusions. Specifically, it addresses the gap in the synergistic mechanism of epigenetic and secretory signals regulated by RNA-binding proteins (RBPs) in oocyte growth and enriches the molecular etiological spectrum of oocyte maturation disorders. It is the first time the conserved function of MATR3 has been revealed in multiple species, providing a paradigm for cross-species research on RBPs in the field of reproductive biology. It also provides a new candidate target for OMA, a clinically refractory infertility disease, and is expected to promote the optimization of assisted reproductive technology and the development of precision medicine.

      Strengths:

      The strengths of this study are significant and prominent. First, the research system is comprehensive, integrating knockout mouse models, in vitro knockdown models, multi-species (mouse, porcine, and human) verification, combined with scRNA-seq, LACE-seq, CO-IP, and other multi-omics and molecular biology technologies, forming a complete and progressive evidence chain. Second, the mechanism analysis is in-depth, clarifying the dual molecular mechanisms of MATR3 regulating the transcriptional synthesis and secretion of GDF9 through "recruiting KDM3B to regulate H3K9me2 demethylation" and "directly binding to Rdx mRNA", with a clear logical closed loop. Third, the clinical relevance is strong: this is the first report of abnormal nuclear localization of MATR3 in oocytes of OMA patients, which provides new clues for clinical disease mechanism research, and the downstream function of GDF9 is verified through rescue experiments, effectively enhancing the translational value of the results.

      Weaknesses:

      This study included only one OMA patient's oocyte sample. Without clinical screening for MATR3 mutations or abnormal expression, establishing a causal relationship between MATR3 and OMA remains difficult.

    3. Reviewer #2 (Public review):

      Summary:

      This study investigates the role of MATR3 in oocyte development and folliculogenesis using conditional knockout mouse models together with in vitro follicle culture and molecular analyses. The authors aim to determine whether MATR3 regulates oocyte maturation and follicle development and to explore potential mechanisms linking MATR3 function to transcriptional and epigenetic regulation in growing oocytes.

      Strengths:

      A major strength of the work is the use of a conditional knockout mouse model combined with complementary in vitro follicle culture approaches, which together provide a useful framework for examining gene function during oocyte development. The study also attempts to integrate cellular phenotypes with molecular analyses of transcriptional activity and epigenetic markers.

      Weaknesses:

      Several weaknesses limit the strength of the conclusions. These include insufficient validation of key experimental manipulations (such as the efficiency of MATR3 knockdown in siRNA experiments), limited quantification or statistical analysis for some datasets, inconsistencies between the text and presented data in certain figures, and incomplete methodological descriptions that make it difficult to fully evaluate reproducibility.

    4. Reviewer #3 (Public review):

      Summary:

      The study aims to elucidate the dual molecular mechanisms of the RNA-binding protein MATR3 in oocyte growth and maturation. The authors propose that MATR3, highly expressed in growing oocytes (GOs), regulates oocyte quality through two pathways: epigenetically, by recruiting KDM3B to remove the repressive H3K9me2 mark at the Gdf9 locus to activate transcription; and post-transcriptionally, by binding Rdx mRNA to maintain microvillus structure for GDF9 secretion. This mechanism ensures oocyte-granulosa cell communication and female fertility. The study also explores the link between MATR3 and human oocyte maturation arrest (OMA).

      Strengths:

      The study proposes an innovative dual-mechanism model encompassing "epigenetic transcriptional activation and cytoskeletal regulation," which not only expands the functional understanding of RNA-binding proteins in chromatin regulation but also reveals the coordination between nuclear transcription and organelle structure. By integrating scRNA-seq and LACE-seq, the authors constructed a comprehensive regulatory network for MATR3, identifying both key targets and numerous potential molecules, thereby providing rich resources for future mechanistic studies. Furthermore, the inclusion of oocyte samples from human OMA patients directly links the basic findings to clinical reproductive disorders. Despite the limited sample size, this approach demonstrates strong translational potential.

      Weaknesses:

      The partial phenotypic improvement achieved by exogenous GDF9 supplementation suggests that the downstream effector pathways may involve a more complex network regulation, implying that the current interpretation of GDF9's central role could be further explored. Regarding the developmental abnormalities of granulosa cells in the conditional knockout model, their pathological origins require in-depth analysis to determine whether they represent primary alterations or secondary adaptive responses resulting from the loss of oocyte signaling.

    1. eLife Assessment

      This important study combines cryo-EM, biochemical, and cell-based assays to examine how Gβγ interacts with and potentiates PLCβ3. The authors present evidence for multiple Gβγ interaction surfaces and argue that Gβγ primarily enhances PLCβ3 activity after membrane recruitment rather than serving mainly as a membrane-recruitment factor. The evidence is solid overall, although uncertainty remains about the physiological relevance and precise arrangement of the proposed interfaces because the structural model relies on engineered crosslinking.

    2. Reviewer #1 (Public review):

      The manuscript by Fisher et al describes the molecular mechanism underlying how G beta gamma subunits engage with the beta 3 isoform of PLC. The paper used a combination of cryo EM, BRET assays, and biochemical assays of PLC beta activity. A key discovery is that G beta gamma is not sufficient to drive membrane binding by itself, and instead promotes G alpha activation. The work is important, but suffers slightly from some ambiguity in the actual interface that is present in their cryo EM model, as crosslinkers could stabilise a transient and non-native complex. This is somewhat mitigated by the careful mutational analysis, which shows that mutation of any of these three sites does somewhat block G beta gamma activation of PLC beta. However, there could be some improvement in the presentation of this data, as well as possible mutant selection. Overall, this paper is a nice complement to the Falzone et al paper showing the membrane-bound complex of PLCB3, and builds on that work, highlighting the importance this will have for our full understanding of PLC beta activation.

      Major concerns:

      My biggest concern is the potential that this interface is artefactual based on the crosslinking strategy utilised. Here are thoughts on how this could be better validated and presented in a more convincing way.

      (1) The authors' main claim is that there is a degree of plasticity of G beta gamma binding to the PLC beta 3 isoform, with three possible binding sites. The main complication of this is, of course, the possibility that the crosslinking stabilises a non-native complex, driven by a mutated cysteine.

      Because of this, any other additional details about this interface are going to be critical for the scientific audience to judge if this is accurate.

      What would greatly help Figure 1 is an evolutionary conservation analysis of the novel Gbg interface in PLC, to see how well this is conserved, and compare this to the conservation of the previously annotated sites. Conservation of these sites on both the G beta gamma and PLC side would help justify this as a native complex.

      This will also help orient the reader to the identity of the mutated residues assayed in Figure 3.

      (2) The G beta gamma orientation is also different from what I have observed in previous G beta gamma effector structures. Is there any precedent for this as an effector interface? A supplemental figure comparing this structure to G beta gamma interfaces from other enzymes, for example the recent Tesmer structure with PI3K, would be helpful.

      (3) The mutational analysis in Figure 2D-G seems to give some strange results, and I have some questions about why certain residues were chosen rather than others. Mutation of the Gbg side will be more complicated, as of course that can affect any of the three surfaces. My main question is that, from the way Figure 2A is oriented, the main salt bridge in their novel interface looks to me like R199-D228, with K183 being in the wrong orientation to E226, and D167 being far from any charged residues. Why did the authors not make the corresponding R199 to D or E mutation?

      (4) To help the reader's interpretation of Figure 2A, I would recommend a supplemental figure showing the density for interfacial residues, as that also would increase confidence in the interface.

    3. Reviewer #2 (Public review):

      In this manuscript, the authors dissect how Gβγ potentiates PLCβ3 signaling in cells. Using engineered crosslinking to stabilize a Gβγ-PLCβ3 complex, single particle cryo-EM, and cell-based functional assays, they identify and map multiple putative Gβγ interaction surfaces on PLCβ3, including a previously unrecognized binding mode. Structure-guided mutagenesis supports the functional relevance of these interactions and suggests that Gβγ potentiation is not primarily mediated by PLCβ3 membrane recruitment, but instead enhances PLCβ3 activity after the lipase is already at the membrane.

      Previous reconstitution work on the membrane surface (Falzone & MacKinnon, 2023) proposed a recruitment/partitioning-centric model in which Gβγ increases PLCβ3 output largely by elevating its membrane surface concentration, whereas Gαq primarily increases catalytic turnover; under those reconstitution conditions, the two inputs can combine approximately multiplicatively. In receptor-driven cellular signaling, however, PLCβ3 is robustly recruited to the plasma membrane upon Gαq activation, which raises the question of whether Gβγ contributes mainly through additional recruitment or through a post-recruitment mechanism once PLCβ3 is already at the membrane.

      This manuscript helps address that gap by using membrane-anchored PLCβ3 and complementary cellular readouts to separate "getting PLCβ3 to the membrane" from "boosting activity once PLCβ3 is already there." Their results argue that, in cells, membrane recruitment is largely dominated by Gαq·GTP, while Gβγ can further potentiate PIP2 hydrolysis after membrane association, consistent with a modulatory role at the membrane rather than primary recruitment.

      Overall, the work provides a structural and mechanistic framework for Gβγ-PLCβ3 cooperation and helps clarify the basis of Gq pathway amplification. The manuscript is generally strong, but some issues need to be addressed.

      Major comments:

      (1) BMOE/BM(PEG)2 crosslinking may enforce a non-native docking geometry, potentially compromising the physiological relevance and precision of the Gβγ-PLCβ3 interface as described. Although a >50% 1:1 crosslinked complex is formed and remains active, the solution maps show lower local resolution for Gβγ, consistent with a dynamic, potentially heterogeneous, interface. One interface is captured via a single engineered cysteine pair (PLCβ3 E60C-Gβ C271), which could potentially bias the pose. It would be helpful if the authors could provide additional orthogonal support (e.g., alternative crosslinked sites) and bolster the clarification of its uniqueness and relevance.

      (2) In the crosslinked structure, the authors report that GβD228 interacts with PLCβ3 R199 and K183. In Figure 2A, R199 appears closer to Gβ D228 than K183, yet only K183 is functionally tested. Testing R199 (e.g., R199E/R199A) would strengthen the structure-guided validation of this interface.

      (3) The mutagenesis strategy appears inconsistent across figures/assays, which makes it difficult to interpret phenotypes and directly link the functional data to the proposed interfaces. For example, in Figure 2E, we see R185L but R215E, while residue L40 is mutated to Gly in the IP accumulation assays but to Glu/Lys (L40E/K) in the BRET assays (Figures 3B/3D/3F). The authors should (i) clearly justify the rationale for each substitution (conservative vs charge-reversal, interface disruption, etc.) and (ii), where possible, test the same mutants across assays (or provide evidence that alternative substitutions yield consistent conclusions).

    4. Reviewer #3 (Public review):

      Summary:

      PLCβ3 is activated by both Gαq and Gβγ subunits. This paper follows previous solution and cryoEM studies of PLCβ3 / Gβγ, trying to understand the molecular details of activation using cellular BRET assays and cryoEM.

      Strengths:

      The authors find evidence for multiple binding sites on PLCβ3 for Gβγ and suggest that Gβγ is not a bona fide activator per se but enhances Gαq activation by positioning the catalytic site towards substrate, although this is not completely convincing. Although these sites may not naturally be operative, the authors might want to develop the potential role of these sites.

      The authors also find that this activation is not through recruitment of the enzyme to the membrane by Gβγ released upon G protein activation, in accord with other PLCβ enzymes, but not for PLCβ3, and again, the authors might want to develop this point further.

      Weaknesses:

      (1) I'm confused as to why the authors feel that their mechanism is distinct from the two-state enzyme, the synergistic activation proposed by Ross in 2011, using a primarily thermodynamic argument. As written, the authors appear to be very reliant on structural and BRET studies that do not give the details that would disprove this interpretation. The main issue is that the authors' mechanism does not fully explain how Gβγ activation occurs for PLCβ2 in reconstituted systems in the absence of Gαq subunits.

      (2) In a recent study, MacKinnon presents a model showing that Gαq and Gβγ activate PLCβ3 by two distinct pathways and that activation by Gβγ occurs through membrane recruitment. It is not surprising that the authors find that this is not true since the pelleting method used by MacKinnon is subject to error. The authors should directly address the limitations of this previous work and the changes in proteoliposomes with sedimentation that alter partition coefficients. However, the inability of Gβγ to drive membrane binding is in accord with the quantitative studies of Scarlata, showing that the affinity of PLCβ3 to Gβγ is fairly weak as compared to the intrinsic membrane partition coefficient.

      (3) It was proposed many years ago that in signaling complexes Gαq - Gβγ may not have to fully dissociate when binding PLCβ, but rather shift their relative orientation when binding to PLCβ to allow activation. Is their model consistent with this? Is it possible that PLCβ3 keeps Gβγ from diffusing to enhance the rate of Gq / Gβγ re-association?

      (4) The authors find that Gβγ binds multiple sites, and it is clear that the PH domain site is the primary one in accord with previous work. Could these weaker sites be an artifact of the elevated concentrations used in cryoEM and BRET assays?

      (5) Although their assays infer differences in binding affinities, it would strengthen the paper if the authors could estimate the association energies of these different binding sites. This estimation would also address the concern stated above.

    5. Author Response:

      Public Reviews:

      Reviewer #1 (Public review):

      The manuscript by Fisher et al describes the molecular mechanism underlying how G beta gamma subunits engage with the beta 3 isoform of PLC. The paper used a combination of cryo EM, BRET assays, and biochemical assays of PLC beta activity. A key discovery is that G beta gamma is not sufficient to drive membrane binding by itself, and instead promotes G alpha activation. The work is important, but suffers slightly from some ambiguity in the actual interface that is present in their cryo EM model, as crosslinkers could stabilise a transient and non-native complex. This is somewhat mitigated by the careful mutational analysis, which shows that mutation of any of these three sites does somewhat block G beta gamma activation of PLC beta. However, there could be some improvement in the presentation of this data, as well as possible mutant selection. Overall, this paper is a nice complement to the Falzone et al paper showing the membrane-bound complex of PLCB3, and builds on that work, highlighting the importance this will have for our full understanding of PLC beta activation.

      Thank you for the positive feedback.

      Major concerns:

      My biggest concern is the potential that this interface is artefactual based on the crosslinking strategy utilised. Here are thoughts on how this could be better validated and presented in a more convincing way.

      (1) The authors' main claim is that there is a degree of plasticity of G beta gamma binding to the PLC beta 3 isoform, with three possible binding sites. The main complication of this is, of course, the possibility that the crosslinking stabilises a non-native complex, driven by a mutated cysteine.

      Because of this, any other additional details about this interface are going to be critical for the scientific audience to judge if this is accurate.

      What would greatly help Figure 1 is an evolutionary conservation analysis of the novel Gbg interface in PLC, to see how well this is conserved, and compare this to the conservation of the previously annotated sites. Conservation of these sites on both the G beta gamma and PLC side would help justify this as a native complex.

      This will also help orient the reader to the identity of the mutated residues assayed in Figure 3.

      We agree that crosslinking can result in the capture of a non-physiologically relevant interface. However, we do not observe any crosslinking between Gbg and a PLCb3 variant that retains a cysteine in the disordered region of the X–Y linker, nor crosslinking between PLCb3 and any other cysteine present in the Gbg heterodimer. The evolutionary conservation analysis is a great suggestion and will be included in the revision for both Gbg and PLCb.

      (2) The G beta gamma orientation is also different from what I have observed in previous G beta gamma effector structures. Is there any precedent for this as an effector interface? A supplemental figure comparing this structure to G beta gamma interfaces from other enzymes, for example the recent Tesmer structure with PI3K, would be helpful.

      Yes, this is not the more typically observed Gbg–effector interaction, which is mediated by the narrow face of the Gbg toroid. We are not aware of other structures in which Gbg interacts with a binding partner in the same way. A supplemental figure comparing this Gbg–PLCb interaction to the Gbg–PI3K and Gbg–GRK2 structures will be included in the revision.

      (3) The mutational analysis in Figure 2D-G seems to give some strange results, and I have some questions about why certain residues were chosen rather than others. Mutation of the Gbg side will be more complicated, as of course that can affect any of the three surfaces. My main question is that, from the way Figure 2A is oriented, the main salt bridge in their novel interface looks to me like R199-D228, with K183 being in the wrong orientation to E226, and D167 being far from any charged residues. Why did the authors not make the corresponding R199 to D or E mutation?

      Thank you for pointing this out. We are in the process of testing the PLCb3 R199E mutant in our assays and will include the results in the revised manuscript.

      (4) To help the reader's interpretation of Figure 2A, I would recommend a supplemental figure showing the density for interfacial residues, as that also would increase confidence in the interface.

      Thank you for the suggestion, this will be included in the revised manuscript.

      Reviewer #2 (Public review):

      In this manuscript, the authors dissect how Gβγ potentiates PLCβ3 signaling in cells. Using engineered crosslinking to stabilize a Gβγ-PLCβ3 complex, single particle cryo-EM, and cell-based functional assays, they identify and map multiple putative Gβγ interaction surfaces on PLCβ3, including a previously unrecognized binding mode. Structure-guided mutagenesis supports the functional relevance of these interactions and suggests that Gβγ potentiation is not primarily mediated by PLCβ3 membrane recruitment, but instead enhances PLCβ3 activity after the lipase is already at the membrane.

      Previous reconstitution work on the membrane surface (Falzone & MacKinnon, 2023) proposed a recruitment/partitioning-centric model in which Gβγ increases PLCβ3 output largely by elevating its membrane surface concentration, whereas Gαq primarily increases catalytic turnover; under those reconstitution conditions, the two inputs can combine approximately multiplicatively. In receptor-driven cellular signaling, however, PLCβ3 is robustly recruited to the plasma membrane upon Gαq activation, which raises the question of whether Gβγ contributes mainly through additional recruitment or through a post-recruitment mechanism once PLCβ3 is already at the membrane.

      This manuscript helps address that gap by using membrane-anchored PLCβ3 and complementary cellular readouts to separate "getting PLCβ3 to the membrane" from "boosting activity once PLCβ3 is already there." Their results argue that, in cells, membrane recruitment is largely dominated by Gαq·GTP, while Gβγ can further potentiate PIP2 hydrolysis after membrane association, consistent with a modulatory role at the membrane rather than primary recruitment.

      Overall, the work provides a structural and mechanistic framework for Gβγ-PLCβ3 cooperation and helps clarify the basis of Gq pathway amplification. The manuscript is generally strong, but some issues need to be addressed.

      Thank you for the positive comments.

      Major comments:

      (1) BMOE/BM(PEG)2 crosslinking may enforce a non-native docking geometry, potentially compromising the physiological relevance and precision of the Gβγ-PLCβ3 interface as described. Although a >50% 1:1 crosslinked complex is formed and remains active, the solution maps show lower local resolution for Gβγ, consistent with a dynamic, potentially heterogeneous, interface. One interface is captured via a single engineered cysteine pair (PLCβ3 E60C-Gβ C271), which could potentially bias the pose. It would be helpful if the authors could provide additional orthogonal support (e.g., alternative crosslinked sites) and bolster the clarification of its uniqueness and relevance.

      We did attempt to isolate other crosslinked complexes. PLCb3-D892 self-crosslinked under all reaction conditions, while PLCb3-D892 XY<sub>Cys</sub>, which retains an endogenous cysteine within the X–Y linker (C516), did not result in any crosslinked product when incubated with Gbg. Only PLCb3-D892 E60C crosslinked to Gbg, as confirmed by SDS-PAGE and SEC. All experiments also used wild-type Gb, which contains two solvent-exposed cysteines in the effector binding site (C204 and C271). The greatest number of particles corresponds to crosslinking between Gb C271 and E60C in PLCb3-D892. Crosslinking between PLCb3-D892 E60C and other residues in Gbg is possible, but there are not sufficient particle numbers corresponding to these species for 2D classification and reconstruction. These observations, together with the high efficiency of crosslinking, are consistent with a stable and persistent interaction.

      (2) In the crosslinked structure, the authors report that GβD228 interacts with PLCβ3 R199 and K183. In Figure 2A, R199 appears closer to Gβ D228 than K183, yet only K183 is functionally tested. Testing R199 (e.g., R199E/R199A) would strengthen the structure-guided validation of this interface.

      We agree, and functional analysis of PLCb3 R199E will be included in the revision.

      (3) The mutagenesis strategy appears inconsistent across figures/assays, which makes it difficult to interpret phenotypes and directly link the functional data to the proposed interfaces. For example, in Figure 2E, we see R185L but R215E, while residue L40 is mutated to Gly in the IP accumulation assays but to Glu/Lys (L40E/K) in the BRET assays (Figures 3B/3D/3F). The authors should (i) clearly justify the rationale for each substitution (conservative vs charge-reversal, interface disruption, etc.) and (ii), where possible, test the same mutants across assays (or provide evidence that alternative substitutions yield consistent conclusions).

      The mutagenesis experiments were initially carried out independently in the Lambert and Lyon groups. As the study progressed, additional mutations were designed based on prior results. The L40G mutation is one such example. Given its modest impact on activity in the IP accumulation assay, the L40E and L40K mutants were designed to maximally disrupt the interface in the BRET experiments. The revision will include the rationale behind different substitutions and discussion of any potential differences.

      Reviewer #3 (Public review):

      Summary:

      PLCβ3 is activated by both Gαq and Gβγ subunits. This paper follows previous solution and cryoEM studies of PLCβ3 / Gβγ, trying to understand the molecular details of activation using cellular BRET assays and cryoEM.

      Strengths:

      The authors find evidence for multiple binding sites on PLCβ3 for Gβγ and suggest that Gβγ is not a bona fide activator per se but enhances Gαq activation by positioning the catalytic site towards substrate, although this is not completely convincing. Although these sites may not naturally be operative, the authors might want to develop the potential role of these sites.

      The authors also find that this activation is not through recruitment of the enzyme to the membrane by Gβγ released upon G protein activation, in accord with other PLCβ enzymes, but not for PLCβ3, and again, the authors might want to develop this point further.

      Thank you for the suggestions.

      Weaknesses:

      (1) I'm confused as to why the authors feel that their mechanism is distinct from the two-state enzyme, the synergistic activation proposed by Ross in 2011, using a primarily thermodynamic argument. As written, the authors appear to be very reliant on structural and BRET studies that do not give the details that would disprove this interpretation. The main issue is that the authors' mechanism does not fully explain how Gβγ activation occurs for PLCβ2 in reconstituted systems in the absence of Gαq subunits.

      The reconstitution experiments rely on nM-mM concentrations of purified proteins and liposomes that contain up to 30% PI(4,5)P2. PLCb2 and PLCb3 show dose-dependent increases in activity with increasing concentrations of Gbg. PLCb enzymes that interact with the liposomes would encounter liposome-tethered Gbg subunits, which would in turn bind the lipase, tethering it to the membrane and helping position the active site for catalysis. While there is not yet experimental evidence that Gbg binding can displace the Ha2’ helix, it could facilitate interfacial activation given the net negative charge of PI(4,5)P2. In addition, PLCb2 is fundamentally different from the other PLCb isoforms in its sensitivity to heterotrimeric G proteins. Given its decreased sensitivity to Ga<sub>q</sub> and increased basal activity, it is possible that autoinhibition by the proximal CTD is weaker. PLCb2 is also abundantly expressed in neutrophils, along with more Gi-coupled receptors. Thus, it is possible that Gbg directly activates PLCb2 in these cells, but future experiments are required to definitively answer this question.

      (2) In a recent study, MacKinnon presents a model showing that Gαq and Gβγ activate PLCβ3 by two distinct pathways and that activation by Gβγ occurs through membrane recruitment. It is not surprising that the authors find that this is not true since the pelleting method used by MacKinnon is subject to error. The authors should directly address the limitations of this previous work and the changes in proteoliposomes with sedimentation that alter partition coefficients. That said, the inability of Gβγ to drive membrane binding is in accord with the quantitative studies of Scarlata, which show that the affinity of PLCβ3 for Gβγ is fairly weak compared with the intrinsic membrane partition coefficient.

      Thank you for raising this point. The changes in composition, size, and structure when pelleting proteoliposomes may complicate data interpretation and will be discussed in the revision.

      (3) It was proposed many years ago that in signaling complexes Gαq - Gβγ may not have to fully dissociate when binding PLCβ, but rather shift their relative orientation when binding to PLCβ to allow activation. Is their model consistent with this? Is it possible that PLCβ3 keeps Gβγ from diffusing to enhance the rate of Gq / Gβγ re-association?

      The crosslinked complex is compatible with simultaneous binding of a Gbg –Gbg heterotrimer to the PLCb3 without disrupting the observed interface. It is possible that Gbg could interact with Gbg bound to the PH domain or the EF hands in the previously reported reconstruction. If so, the interaction would be mediated by the N-terminal helix of Gbg. Alternatively, the intrinsic GAP activity of PLCb3 may also prevent Gbg from diffusing to promote heterotrimer reassociation.

      (4) The authors find that Gβγ binds multiple sites, and it is clear that the PH domain site is the primary one in accord with previous work. Could these weaker sites be an artifact of the elevated concentrations used in cryoEM and BRET assays?

      Assuming the PH domain is the primary Gβγ binding site, it is possible that the secondary EF hand site observed by Falzone and MacKinnon reflects high protein concentrations. However, it seems unlikely that we would reach these concentrations within cells. Our functional data are also consistent with the Gβγ binding site in the EF hands playing a functional role in increasing PLCβ activity.

      (5) Although their assays infer differences in binding affinities, it would strengthen the paper if the authors could estimate the association energies of these different binding sites. This estimation would also address the concern stated above.

      We appreciate this suggestion and will keep it in mind as we complete the revision.

    1. eLife Assessment

      Kim et al. provide important findings explaining how T3SS assembly is regulated by a conserved genetic context. The evidence supporting the conclusions is compelling, with numerous experiments demonstrating the validity of the findings. The work will be of interest to molecular biologists, biochemists, and microbiologists working on secretion systems or similar complexes. Further studies revealing similar mechanisms in other systems would enhance the impact of the current study.

    2. Reviewer #1 (Public review):

      Summary:

      The authors set out to understand the complex regulation of the assembly of the Type 3 Secretion System of S. typhimurium. They found that the gene synteny as well as specific mRNA stem loops were important for the translational coupling of sctS and sctT. Without this regulation, SctT self-oligomerizes, which disrupts the export of effector proteins and leads to a decreased fitness of the pathogen. The work was done using a variety of convincing methods and leads to an updated picture of how T3SS assembly occurs. Since the same genetic synteny is found in a large majority of T3SS in different bacteria, it is likely that this is a general mechanism, but one that needs to be further experimentally validated.

      Strengths:

      The paper uses an impressive amount of experiments, with different techniques, to describe how they identified the genetic regulation of SctT production.

      Weaknesses:

      Only minor weaknesses are found.

      (1) Regarding the description of the complex as unique: it is not well explained what makes this complex unique.

      (2) The paper would benefit from a discussion regarding how regulation might work in the minority of bacterial strains where the T3SS gene synteny is largely different. One would expect that those bacteria would have a different way of regulating T3SS assembly, but that is not discussed at all by the authors.

    3. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Samuel Wagner and colleagues describe an elegant mechanism to prevent promiscuous assembly of a core virulence type III secretion system protein, SctS. Starting from a bioinformatic standpoint, they demonstrate that synteny is highly conserved, and sctT occurs immediately downstream of sctS. Secretion is greatly reduced when sctT is removed or scrambled from its genomic context, and sctT expression is accordingly reduced (sctS synteny is also important, though less so). The distance between sctS and sctT is crucial. An elegant series of genetic experiments leads the authors to pinpoint a stem loop structure that occludes the Shine-Dalgarno sequence of sctT. This property is independent of the actual gene preceding sctT. In sum, this means that SctS is already expressed before SctT is expressed, preventing SctT from forming cytotoxic homooligomers.

      Strengths:

      The manuscript is very well-written, easy to follow, and describes a substantial amount of genetic detective work to identify the underlying mechanism. I have only a number of textual suggestions, mainly for the Introduction text, which I believe could be revised for a flagellar and broader audience.

      Weaknesses:

      Major concern:

      While the work is rigorous and substantial, I am unsure as to whether its findings will appeal beyond a niche audience.

      Minor points:

      (1) Line 117: The number here seems to be very small. RefSeq has ~200,000 genomes. My guess is that at least 100,000 of these will be bacterial. Many (most?) bacteria have flagella, and some unflagellated strains have injectisomes, meaning I would have guessed that the authors would have ~50,000 genomes with SctRSTU. This estimate is error-prone, but not by too much. Can the authors explain the discrepancy between my estimate and their figure of almost two orders of magnitude? (SctRSTU/FliPQGFlhB should also be easy to pick up by sequence searches, so I don't think this is due to false negatives).

      (2) Discussion: I would appreciate some discussion of how species that do not conserve the synteny of sctS and sctT prevent problems of sctT oligomerisation? It doesn't need to be evidence-based at this stage, but I'm sure the authors have thought about this, and the Discussion is an appropriate place to share their speculations.

    4. Reviewer #3 (Public review):

      At the core of the bacterial type III secretion system (T3SS), a nanomachine used to inject effector proteins into eukaryotic cells, five highly conserved proteins, SctRSTUV, form the export apparatus, which is the actual gate for effector proteins. Not only are these proteins the most strongly conserved parts of the system, but also their gene order is conserved, which is not the case for most other components of the T3SS. Interestingly, this order does not completely recapitulate the assembly order, which is SctR5-T4-S-U-V. Looking into the reasons for the conserved synteny, the authors noted a stem-loop in the mRNA of the Salmonella SPI-1 sctS gene, which is present in many other T3SS as well (and in fact had been found in Yersinia before). They then use an array of clever gene permutations and modifications to discern the benefit of this order for the bacteria. The combination of thorough sequence analysis with different, partly quantitative, protein expression and secretion assays and growth curves, both in the native Salmonella background and in heterologous systems, provides strong evidence for the interpretation of the authors: The stem-loop in sctS prevents the premature expression of SctT, which can otherwise assemble into "futile multimers" that can lead to ion loss. The presence of stem-loops in many other sctS/T genes gives weight to this finding.

      This is a very nice and thorough study addressing an important point in the assembly of type III secretion systems. I only have a few suggestions.

      (1) Conserved gene orders have been shown for many complexes, and the findings presented in this manuscript might be applicable to other membrane complexes.

      The conservation of gene order and the presence of the stem loop give weight to the authors' findings. However, it is only mentioned quite late in the discussion that a similar stem loop was found in Yersinia upstream sctT earlier, and was interpreted differently. The authors' current discussion is somewhat evasive on this point. Why would these similar structures be used differently? Why would temperature not play a role in Salmonella SPI-1? And wouldn't the stem-loop also couple sctS and sctT expression in Yersinia? This should be addressed, if possible, by experiments (at least, the influence of temperature on the SPI-1 mRNA structure should be testable for the authors) and by a more detailed discussion (given the redundancy of RNA thermometers in the Yersinia T3SS, the interpretation in the current paper might well be the more compelling one).

      (2) A point that deserves more attention is that a similar finding in Yersinia has been interpreted differently before (as a temperature sensor rather than translational coupling) - are these systems really different? Testing the different interpretations in the respective other system (at least the influence of temperature in the Salmonella SPI-1 system used in this manuscript) would have made the interpretation even more compelling.

      (3) Another point that should be discussed in more detail is why this mechanism is present when replacement of the sctT ATG by weaker start codons and the simple omission of a separate SD sequence upstream sctT would achieve the same outcome. This could be tested in one of the nice heterologous systems, as used in Figure 4.

    1. eLife Assessment

      This valuable study presents a comparative dataset on crab locomotion to examine the evolution of sideways walking. The evidence supporting the authors' claims is largely convincing. This work will be of interest to researchers in animal locomotion.

    2. Reviewer #1 (Public review):

      Summary:

      This is an interesting and well-written manuscript in which the authors set out to answer a simple, old question with a modern toolkit: where in crab evolution did sideways walking arise, how often has it been lost or regained, and is it plausibly linked to the ecological and taxonomic success of true crabs. To do this, they record locomotion from 50 live species, convert each species' movements into a quantitative index that compares forward versus sideways bouts, and then map the resulting states onto a recent crab phylogeny to infer the most likely evolutionary history of locomotor direction.

      Strengths:

      The strongest part of the study is the dataset itself. Comparable behavioral measurements across dozens of crab species are rare. The authors have done the field and husbandry work needed to make this possible. The overall pattern they recover, that most true crabs are strongly biased toward sideways movement (while a smaller set of lineages move predominantly forward), is interesting and likely to be useful to others. The phylogenetic mapping is also a reasonable way to address the "how many times" question (although this is peripheral to my expertise). The manuscript makes a convincing case that sideways locomotion is not simply a trivial byproduct of a crab-like body plan.

      Weaknesses:

      Where I am less convinced is in how strongly the authors describe the discreteness of the behavioral categories and the absence of intermediates. The manuscript states that the Forward-Sideways Index shows a clear separation between two locomotor types with little evidence for intermediates, and it cites a statistical test rejecting a single peak in the distribution. However, the histogram in Figure 3 appears structured within each labeled category, with subclusters inside both the forward and sideways groups rather than a single tight peak per group. This matters because the index is built by first placing each movement bout into "forward" versus "sideways" bins using a fixed angle boundary and then collapsing the result into a single ratio. That approach is simple and transparent enough, but it can also hide mixed strategies. For example, a species that produces substantial amounts of both forward and sideways walking can still end up with a strongly positive or negative index, and therefore be classified as a pure "type," even though the underlying behavior is mixed. In that context, rejecting a single peak in the across-species distribution does not, by itself, justify the stronger claim that intermediates are rare or absent.

      Related to this, a key methodological choice is the use of 60 degrees as the cutoff between forward and sideways bouts. This boundary may be reasonable as a convention, but the paper does not explain why it is the right place to draw the line, and there is a plausible biological concern that a fixed angular cutoff does not mean the same thing across taxa.

      Crabs vary in body shape and in how the legs are arranged around the body. In my own comparative work, for example, some species show an elliptical stance pattern elongated along the preferred direction of travel, while others show a more circular leg arrangement, and the latter can express more mixed forward and sideways behavior. When limb arrangement and body geometry differ across species, the same measured angle can correspond to different underlying mechanics and different functional "degree of sidewaysness." The practical implication is that the reported binary separation may partly reflect the imposed classification rule, rather than a sharp biological divide.

      Another limitation that affects interpretation is the decision to use one individual per species. I understand the logistics, and for some questions, a single representative individual can be a reasonable first pass. But it is not strong support for negative claims about intermediates, especially in a group where individuals can change substantially with growth and allometry. Crabs can grow dramatically, often with pronounced allometric shifts in limb proportions that can alter the center of mass location. Size alone can alter the kinematics and choice of locomotor behaviors in crustaceans. In species where appendage proportions change with size, or where certain legs become disproportionately large (or calcified), it is plausible that locomotor direction and the distribution of movement angles shift across ontogeny. That makes it hard to treat a single individual as a complete description of a species-level strategy, particularly for species that fall closer to the boundary between categories.

      In sum, this is a valuable and useful behavioral comparative study with a dataset that many in the field will appreciate. The main conclusions about the likely evolutionary placement of sideways walking are plausible, but several of the stronger claims about discrete locomotor types, the absence of intermediates, and the relationship to diversification would be more convincing if the analysis were less dependent on a fixed angular cutoff and on single individuals per species, or if the manuscript framed those points more cautiously so the conclusions track the strength of the evidence.

    3. Reviewer #2 (Public review):

      Summary:

      The current work investigates the evolution of sideward locomotion in Brachyura in light of a single evolutionary origin. To this end, the authors first analysed the mode of locomotion in 50 crab species and observed mutually exclusive presence of sideways vs. forward movement. The phylogenetic analysis confirmed that there is indeed a single evolutionary origin for sideways movement, which was sometimes followed by several reversions to forward locomotion. In this way, the authors demonstrate how locomotor movement modes shape evolutionary diversification in animals by showing that species richness is much higher in sideways-moving crabs than in the nearest groups. This is an interesting work that integrates behavioural analysis and phylogenetic relations, capitalising largely on crabs. I have a few suggestions and questions.

      Firstly, I think the paper spends too much time on a straightforward analysis of the mode of locomotion. I was also wondering whether the phylogenetic analysis could be simply achieved by maximising an objective function in which the modes of movement are inversely coded for two putative groups, with all values calculated at all possible nodes.

      Unfortunately, I find that the authors did not sufficiently discuss differences in the ecological niches of species with forward vs. sideways locomotion modes (including challenges of locomotion and substrate).

      Likewise, what are the anatomic correlates of forward vs. sideways locomotion? For instance, how are the advantages assumed for sideways movement associated with a flattened body? Is it possible that the mode of motion is secondary to flattened/narrow body structure, which basically limits the distance between legs and thus makes the forward movement difficult - under this logic, the mode of movement would be a secondary phenomenon to body shape traits. How can one differentiate between this alternative and the one that puts the mode of movement in the centre of the story? On a related note, how do different modes of movement relate to the ability to fit into tight spaces - how does it relate to differences in leg joints?

      Is it possible that the sideways movement maximises the scanned visual field per unit time/displacement, which may be beneficial for mostly forward-moving predators?

      It is really difficult to decipher the information contained in the nodes (circles) in the printed black-and-white version of the manuscript.

      Briefly, although I find the study interesting, the presented complexity may not be necessary given the endpoints; it can be achieved much more simply. Furthermore, the degree to which the conceptual analysis of different modes of locomotion was exercised was limited. The general approach may serve as a good model for the evolutionary analysis of other traits. The demonstration of traceability of the relations in question is a major contribution of the work.

      Strengths:

      The research question and the novel combination of different data types.

      Weaknesses:

      The complexity of the methods used, along with a limited discussion of the potential dynamics that may underlie the evolution of the sideways movement mode.

    1. Our services will enable businesses as commercial users of digital platforms to exercise their data access rights under Art. 6(10) of Regulation (EU) 2022/1925 (Digital Markets Act) and make the obtained data available to authorised third parties. Commercial users can submit a formal data access request through our service and designate third parties authorised to receive their data. The request is transmitted to the respective gatekeeper.

      Their data intermediary services are aimed at enabling data access requests under the DMA. Their primary focus is LinkedIn.

    1. The guide points the reading system to the main content in the book. Typically, the guide includes the cover, the table of contents, and the beginning of the actual text. The guide section is deprecated in ePUB 3.0, but some vendors still use it.

      The guide section in the OPF points to the main content (cover, ToC, and beginning of text). Deprecated in EPUB 3.0.

    2. The spine is an ordered list of all of the contents of the book. It uses the item IDs you’ve created in the manifest. Each item gets an item ref, and you use the item id that you created in the manifest for the id ref.

      The spine lists the book content in the form <itemref idref="sameasinmanifest"/>

    3. The OPF file in an EPUB is the 'Open Packaging Format' file and lists the content of the EPUB: some metadata (title, date, creator, identifier, language, publisher); the manifest, which points to the needed resources (images, fonts, CSS, navigation file, and all XHTML pages); and the spine, which gives the reading order of the book (reusing the IDs from the manifest).
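
      The relationship between manifest and spine can be sketched in a few lines of Python. This is a minimal illustration, not a full EPUB reader: the sample package document, its file names, and the helper name `reading_order` are all made up for the example; only the OPF namespace URI and the `item`/`itemref` element names come from the format itself.

```python
import xml.etree.ElementTree as ET

# Namespace used by OPF package documents.
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}

# Hypothetical, stripped-down package document for illustration only.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="bookid">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:title>Sample Book</dc:title>
    <dc:language>en</dc:language>
  </metadata>
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="ch1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
    <item id="ch2" href="chapter2.xhtml" media-type="application/xhtml+xml"/>
    <item id="css" href="style.css" media-type="text/css"/>
  </manifest>
  <spine>
    <itemref idref="ch1"/>
    <itemref idref="ch2"/>
  </spine>
</package>
"""


def reading_order(opf_text):
    """Return the hrefs of the spine items, in reading order."""
    root = ET.fromstring(opf_text)
    # The manifest maps each item id to the resource it names.
    manifest = {
        item.get("id"): item.get("href")
        for item in root.findall("opf:manifest/opf:item", OPF_NS)
    }
    # The spine reuses those ids (as idref) to define the order.
    return [
        manifest[ref.get("idref")]
        for ref in root.findall("opf:spine/opf:itemref", OPF_NS)
    ]


if __name__ == "__main__":
    print(reading_order(SAMPLE_OPF))
```

      Note that the stylesheet and navigation file appear in the manifest but not in the spine: the manifest lists every resource, while the spine lists only what is read in sequence.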

    1. new forms of propaganda

      Algorithms that are coded to prioritize feeding you state propaganda regularly, despite your perceived interest in the topics, so as to control what information you hear about our government and ensure the general population is being fed the "information", viewpoints, and beliefs that result in trust and belief in the government, state, and police? A more realistic reality than you'd think. Every country does propaganda, but you know they're good at it when the majority of citizens don't think there is any.

    2. and decision making under uncertainty to cope with the complex and dynamic environment

      Decisions that are dictated by the rules and data it has been fed by a flawed and biased human creator.

    3. For some, AI is about artificial life-forms that can surpass human intelligence, and for others, almost any data processing technology can be called AI.

      This is something I find very interesting, as the term "AI" itself is very misleading. We haven't yet reached the point in LLMs, generative "AI", or any algorithms we have where they have gained sentience or a "mind of their own", as the term Artificial Intelligence suggests; having intelligence is something we consider human and sentient. AI is also a term commonly used in sci-fi and fiction storytelling to describe something that doesn't exist yet, because it always describes a specifically smart and sentient machine of some kind.

    4. these involve so called filter bubbles, echo-chambers, troll factories, fake news, and new forms of propaganda

      This is familiar to me since the TED Talk video covers this bias in algorithms, and this has been something I have been more cautious about in my day-to-day life, especially with news.

    5. Even AI researchers have no exact definition of AI. The field is rather being constantly redefined when some topics are classified as non-AI, and new topics emerge.

      I found this interesting and a little surprising: since AI is such a big part of today's world, I thought there would be a set definition for it. But after thinking about it, it makes sense that there isn't, since AI in today's world is constantly changing and being reimagined.

    6. However, in the context of AI, it is obvious that different AI systems cannot be compared on a single axis or dimension in terms of their intelligence.

      This is something new that I had not considered: AI systems can be radically different, and that impacts their use in our lives. It reminds me of the TED Talk video on biased views in algorithms because it correlates to AI's biases and information being different depending on which one you use.

    1. This essay introduces this special issue on virtue ethics in relation to military AI. It describes the current situation of military AI ethics as following that of AI ethics in general, caught between consequentialism and deontology.

      This statement presents the broad objective of the research.

    1. Over the past decade, the share of low-income, special education, homeless, and English learner students has grown.

      Requiring greater investment. Special education specifically requires greater parental involvement.

    2. only 18 states provide at least 10% more funding to high-poverty districts than low-poverty districts, and nearly one third provide less funding to high-poverty school districts than low-poverty districts.

      Important to tether this information with the percentage that states invest in the school. 45% of total funding goes here

    1. That number is a file descriptor, and it's the single abstraction that holds together files, sockets, pipes, terminals, timers, signals, and /dev/null. They are all just integers pointing into a kernel table

      An FD is basically an integer: an index into a kernel-managed table. Each process has its own file descriptor table.
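
      A small Python sketch can make the "just integers" point concrete. It assumes a POSIX-like system; the helper names `open_descriptors` and `close_descriptors` are invented for the example, while `os.open`, `os.pipe`, `socket.socketpair`, and `fileno()` are standard library calls. The exact numbers printed depend on what the process already has open, so none are shown here.

```python
import os
import socket


def open_descriptors():
    """Open a file, a pipe, and a socket pair; return their descriptors."""
    fds = {}
    # A regular "file" (here the null device) is an integer.
    fds["file"] = os.open(os.devnull, os.O_RDONLY)
    # A pipe is two integers: one for each end.
    fds["pipe_read"], fds["pipe_write"] = os.pipe()
    # A socket object wraps an integer too; fileno() exposes it.
    sock_a, sock_b = socket.socketpair()
    fds["socket"] = sock_a.fileno()
    return fds, (sock_a, sock_b)


def close_descriptors(fds, socks):
    """Release everything opened by open_descriptors."""
    for sock in socks:
        sock.close()
    for name in ("file", "pipe_read", "pipe_write"):
        os.close(fds[name])


if __name__ == "__main__":
    fds, socks = open_descriptors()
    for name, fd in fds.items():
        print(f"{name}: {fd}")
    close_descriptors(fds, socks)
```

      Whatever the underlying object is, the process only ever holds a small non-negative integer, and the kernel resolves it through that process's descriptor table.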

    1. students of color (15 percent) in Virginia attended a high-poverty school in the 2013-2014 school year, as did more than one out of every five (22 percent) Black students — compared to just 3 percent of White students.

      Disparity between minority and white opportunities in education

    1. lets you prove you made something without revealing what it is.

      That is a key affordance. IndyWeb has that too and more

      share reveal the top slice and provide the means for p2p inter persona engagements, conversations about and around progressively shared content with full provenance and recapitulable history only available to the participants

      let people grow their own(ed) private, secure networks of trust: recorded, verifiable, mutual sharing and co-laboration, mutual learnings and symmathesy. Trust but verify

    2. make it structurally difficult to extract data without user participation, preventing platform lock-in or sudden rule change

      step in the right direction

    1. IFT proposes that information-seeking behavior develops to maximize the rate of information gained per unit of time or effort invested. Note that the term information does not refer to the information-theoretic concept but to subjective interest; here, information means anything that users find interesting.

      sentence that mentions implicitly or explicitly a particular theory about computing or information

    2. MDP is a formalism that originates from studies of sequential decision-making in artificial intelligence and operations research. Instead of the choice between n actions, MDP deals with environments where rewards are delayed (or distal). This requires an ability to plan actions as part of sequences instead of one-shot choices.

      sentence that mentions implicitly or explicitly a particular theory about computing or information
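
      To illustrate why delayed rewards call for planning over sequences, here is a minimal value-iteration sketch in Python. The three-state chain, the action names `grab` and `walk`, the reward values, and the discount factor are all invented for the example; they are not taken from the annotated text.

```python
GAMMA = 0.9          # assumed discount factor
TERMINAL = "END"

# Hypothetical deterministic MDP: (state, action) -> (next_state, reward).
# "grab" pays a small reward now and ends the episode;
# "walk" pays nothing until the last step, where it pays a lot.
TRANSITIONS = {
    (0, "grab"): (TERMINAL, 1.0),
    (0, "walk"): (1, 0.0),
    (1, "grab"): (TERMINAL, 1.0),
    (1, "walk"): (2, 0.0),
    (2, "grab"): (TERMINAL, 1.0),
    (2, "walk"): (TERMINAL, 10.0),
}


def value_iteration(transitions, gamma, sweeps=50):
    """Estimate the discounted value of each state by repeated backups."""
    states = {state for state, _ in transitions}
    values = {state: 0.0 for state in states}
    values[TERMINAL] = 0.0
    for _ in range(sweeps):
        for state in states:
            values[state] = max(
                reward + gamma * values[nxt]
                for (s, _), (nxt, reward) in transitions.items()
                if s == state
            )
    return values


def greedy_policy(transitions, gamma, values):
    """Pick, per state, the action with the best reward-plus-future value."""
    best = {}
    for (state, action), (nxt, reward) in transitions.items():
        score = reward + gamma * values[nxt]
        if state not in best or score > best[state][1]:
            best[state] = (action, score)
    return {state: action for state, (action, _) in best.items()}


if __name__ == "__main__":
    values = value_iteration(TRANSITIONS, GAMMA)
    print("planned:", greedy_policy(TRANSITIONS, GAMMA, values))
    # With gamma = 0 the agent ignores the future: a one-shot chooser.
    print("one-shot:", greedy_policy(TRANSITIONS, 0.0, values))
```

      In this toy chain the one-shot chooser takes the small immediate reward in the early states, while the planning agent walks past it because the discounted later reward is worth more.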

    3. Visual statistical learning is a research topic in perception that studies how the statistical distribution of our environments affects the deployment of gaze.

      sentence that mentions implicitly or explicitly a particular concept relevant to HCI

    4. It assumes that human long-term memory evolved to help survival by anticipating organismically important events. It is evolutionarily important to remember things that are important for survival. Therefore, the expected value of remembering a thing in the future should affect the probability of recalling it.

      sentence that mentions implicitly or explicitly a particular theory about how humans think or act

    5. According to rational analysis, behavior is sensitive to the statistical distribution of rewards in the environment that a user has experienced. Users learn the way rewards are distributed through continued exposure to an environment and adapt their behavior accordingly. A user's behavior is rational because it is tuned to the distribution of rewards in the environment—the ecology.

      sentence that mentions implicitly or explicitly a particular theory about how humans think or act

    6. The theory assumes that users are 'computationally rational': When picking an action—or deciding how to get from the present state to a state with positive rewards—users are as rational as their cognition allows. Users act based on their often inaccurate and partial beliefs, which they have formed via experience.

      sentence that mentions implicitly or explicitly a particular theory about how humans think or act

    7. Rational analysis is a theory of rational behavior proposed by Anderson and Schooler [21]. It examines the distribution of rewards in the environment to explain how users adapt their behavior. According to rational analysis, behavior is sensitive to the statistical distribution of rewards in the environment that a user has experienced.

      sentence that mentions implicitly or explicitly a particular theory about how humans think or act

    8. These four theories differ in the factors they include and how the agent's decision-making problem is formulated. As such, the theories differ in how easily they help us find a solution to the user's decision-making problem.

      sentence that describes theories in the abstract

    9. The term satisficing is used to describe how users tend to behave when facing a complex decision-making problem. It refers to settling on a satisfactory but not optimal solution in the normative sense.

      sentence that mentions implicitly or explicitly a particular concept relevant to HCI

    10. The concept of rationality has its roots in economics, where it was developed to study how people should act in economic decision-making. In such settings, the idea is that people reach their goal, such as maximizing their return, by maximizing utility.
    1. Our design was motivated by two major goals for notation authoring. These goals followed from recent studies of notation augmentation [30, 71] and conversations with scientists who had experience writing notation in instructional materials and research communications (4 professors, 2 graduate students, R1–6).

      sentence that describes who the system is designed for

    2. We define the key projections as markup (in this case, LaTeX), an annotatable render, and a structure hierarchy view. Augmentations are made easy to invoke, and projections are kept synchronized and co-present so that authors can shift between representations as is expedient to them.

      sentence that describes the characteristics that define the proposed system

    3. the challenge of using these tools is that annotations are unmoored from the structure of the formula and must be redone whenever the formula changes. Authors must perform precision positioning and sizing operations that could be inferred from the coordinates of the augmented expressions.

      sentence that describes the obstacles that the proposed system is designed to help the intended user get around to reach their goals

    4. these markup languages can require cumbersome and error-prone editing, arising from the intermixing of annotation markup with the underlying formula. Participants in a study by Wu et al. [71] identified difficulty with debugging nested braces and locating markup to edit.

      sentence that describes the obstacles that the proposed system is designed to help the intended user get around to reach their goals

    5. lab study participants frequently made errors related to incorrectly matched braces when using a LaTeX baseline to augment formulas.

      sentence that describes the obstacles that the proposed system is designed to help the intended user get around to reach their goals

    6. Authors in Head et al. [30] described that "code gets horrible looking" as macros are added to it to specify augmentations.

      sentence that describes the obstacles that the proposed system is designed to help the intended user get around to reach their goals

    7. FreeForm, a projectional editor wherein authors can augment formulas—with color, labels, spacing, and more—across multiple synchronized representations. Augmentations are created graphically using direct selections and compact menus. Those augmentations propagate to LaTeX markup, which can itself be edited and easily exported.

      sentence that describes the characteristics that define the proposed system

    8. FreeForm is a projectional editor optimized for notation augmentation. This paper defines the key projections for the text: textual LaTeX, a formula render with tree-aware selections, and a property/hierarchy view.

      sentence that describes the characteristics that define the proposed system

    1. CASE ILLUSTRATION 2 Two days after sustaining minor injuries in a traffic accident, Jeff, a 17-year-old teenager, comes to the physician’s office complaining of left shoulder pain. He is accompanied by his mother, who is concerned because Jeff was also recently arrested for driving under the influence of alcohol. There is no history of medical or behavioral problems, although, on questioning, his mother describes a 12-month history of moodiness and falling school grades. Using the HEADSS format assessment, the physician assesses Jeff’s health risks:

      .

    2. CASE ILLUSTRATION 1 Lauren is a 15-year-old girl admitted to the hospital with an arm fracture requiring surgical repair. The fracture occurred during cheerleading practice while climbing a pyramid of other cheerleaders. She reported being distracted while climbing, lost concentration, and fell to the ground. During the admission history the patient was talkative and easily distracted. Although she did not report taking any medications to the admitting nurse, when asked “are you supposed to be taking” any medications, Lauren reported that she should be taking medication for ADHD (see Chapter 28). She had not taken medication for the last several days because she was staying with a friend and did not want her friend to know that she took medication.

      .

    3. A comprehensive health-risk assessment should cover issues dealing with home, education, activities, drug use, sexual practices, and suicidal ideation (HEADSS). Using the HEADSS format helps with organization and standardization

      .

    4. Although most teenagers want to receive health information and discuss personal behavior, these discussions must generally be initiated by the physician. Many teenagers are not accustomed to interacting in such participatory, nonjudgmental conversations with adults.

      .

    5. In 1992, the American Medical Association published Guidelines for Adolescent Preventive Services (GAPS), the first set of developmentally and behaviorally appropriate comprehensive health care guidelines for adolescents.

      .

    6. In psychosocial and behavioral terms, it is the time during which adult body image and sexual identity emerge; independent moral standards, intimate interpersonal relationships, vocational goals, and health behaviors develop; and the separation from parents takes place.

      .

    1. designing complex behavior can be a difficult programming task, and program representations in end-user programming tools may not be well-suited for heavy programs.

      sentence that describes the obstacles that the proposed system is designed to help the intended user get around to reach their goals

    2. Ply allows users to develop, test, and tweak program components, exploring possibilities for how data can be transformed and composed to discover and achieve goals. This style of programming can support many use cases, even those not traditionally considered in the trigger-action programming model.

      sentence that describes the goals of the intended user

    3. Through the combination of these features, Ply allows users to develop, test, and tweak program components, exploring possibilities for how data can be transformed and composed to discover and achieve goals.

      sentence that describes the goals of the intended user

    4. Frequently, code-generation systems focus on building and then refining a full working application, using visibility of the full underlying code as a fallback when users need to build understanding of the generated program.

      sentence that describes the obstacles that the proposed system is designed to help the intended user get around to reach their goals

    5. Each sensor is accompanied by a glanceable visualization of the sensor's output payloads on the Ply canvas. This visualization is specific to the sensor and its output type, showing the most critical information for evaluating whether the sensor is behaving as expected.

      sentence that describes the characteristics that define the proposed system

    6. Ply uses a server program written in TypeScript to make code generation requests to a large language model and to execute the resulting code, which passes messages to and from sensors and actuators.

      sentence that describes the characteristics that define the proposed system

    7. Each layer in Ply tracks its dependencies; sensors receive data from their dependencies, actuators push data to their dependencies, and linkages each refer to exactly one sensor and one actuator dependency. Collections of layers and linkages in Ply are isomorphic to node graphs in node-based programming languages.

      sentence that describes the characteristics that define the proposed system

    8. Code generation offered by large language models can serve to author this glue code for trigger-action programs, allowing for data from triggers to be mapped to input data for actions automatically even when their native data formats or intended functionality do not match exactly.

      sentence that describes the conditions for which the system is designed

    9. Ply allows users to develop, test, and tweak program components, exploring possibilities for how data can be transformed and composed to discover and achieve goals. This style of programming can support many use cases, even those not traditionally considered in the trigger-action programming model.

      sentence that describes who the system is designed for

    10. It encourages program decomposition into "layer" abstractions, it automatically creates visualizations of event payloads at layer boundaries to help users understand layer behavior without having to read the underlying generated code, and it constructs ad hoc parametrization interfaces that allow users to configure important dimensions of the behavior of each layer without having to re-author it.

      sentence that describes the characteristics that define the proposed system

    11. However, such LLM-authored code, especially when implementing nontrivial logic, can be difficult to specify, understand or debug. Users need appropriate tools and handles to understand and make changes to the computation that is being performed in such code.

      sentence that describes the obstacles that the proposed system is designed to help the intended user get around to reach their goals

    12. Trigger-action programming has been a success in end-user programming. Traditionally, the simplicity of links between triggers and actions limits the expressivity of such systems. LLM-based code generation promises to enable users to specify more complex behavior in natural language. However, users need appropriate ways to understand and control this added expressive power.

      sentence that describes the conditions for which the system is designed

    1. Abstract: Herbarium collections are a vast but underutilized resource for ancient DNA research, containing over 400 million specimens with detailed metadata and spanning centuries of global biodiversity. Understanding patterns of DNA preservation in natural collections is crucial for optimizing ancient DNA studies and informing future curation practices. We analysed genomic data for 573 herbarium specimens from six plant species from the genera Hordeum and Oryza collected from the Americas and Eurasia over 220 years. Using standardized laboratory protocols and shotgun sequencing, we quantified DNA degradation and elucidated factors that accelerate it. We find significant age-dependent DNA fragmentation rates, indicating temporal degradation processes not detected in prehistoric samples. In our analysis, DNA decay rates in herbarium specimens were almost eight times faster than in moa bones, reflecting fundamental differences in tissue composition and preservation environments. Environmental conditions at the time of specimen collection emerged as the major determinants of post-mortem damage rates, with the interaction term between temperature and genus being the dominant driver of cytosine deamination. We find no effect of sample storage on DNA damage and degradation. These findings provide insights into how climatic origin, preservation environment, taxonomic identity and age influence DNA preservation while highlighting opportunities for improving institutional preservation practices. Due to standardised preservation conditions, museum collections can provide better insights into DNA damage and degradation over time than archaeological and paleontological samples.

      This work has been peer reviewed in GigaScience (see https://doi.org/10.1093/gigascience/giag026), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer 3:

      I read this work with great interest, and I believe it represents an excellent contribution to our understanding of aDNA preservation, particularly welcome for plants, since most studies in this field are carried out on animal tissues, bones, and similar materials. The authors show that ancient DNA (aDNA) damage in herbarium specimens results from a combination of temporal, environmental, and biological factors, with storage conditions affecting decay rates. Their results indicate that DNA fragmentation in dry plant tissue increases with sample age, that it varies between genera, and that temperature is the main driver of cytosine deamination. I agree with these interpretations, but the discussion could place more emphasis on the roles of water and oxidation in DNA degradation. Rapid drying of herbarium specimens limits hydrolytic damage but may increase oxidative processes; in contrast, animal or arthropod specimens dry more slowly, and this allows different degradation dynamics. Considering these differences in the discussion could further clarify the mechanisms behind the observed patterns, especially across museum tissue types.

      In the study, the methodologies are solid. The approach used to estimate endogenous DNA content is appropriate, though applying a mapping quality threshold could strengthen the calculation. The methods for assessing DNA fragmentation, DNA damage, decay rates, and 5' C→T substitutions seem robust and optimal for validating aDNA authenticity. The climate analyses also appear sound, but I cannot provide a detailed evaluation of this part due to limited expertise in this area.

      The explanation for the correlation between fragment length and sample age seems logical. Unlike animals, where DNA decay occurs in two phases, plant tissue death is gradual and varies by tissue, which allows enzymatic and microbial degradation to continue over longer periods and contributes to the strong age-fragmentation relationship. Overall, the study highlights the importance of tissue type and storage conditions for DNA decay; however, discussing how hydrolytic and oxidative processes differ between herbarium plants and other specimen types (animal) would further strengthen the interpretation of the decay rates.

      Specific comments

      The terminology related to ancient DNA preservation (e.g., DNA damage, DNA degradation, DNA decay) should be clarified and used more consistently throughout the text. These terms describe distinct processes, and specifying the intended meaning for each will improve precision and avoid confusion for the reader. DNA damage refers to specific chemical lesions; DNA degradation describes the physical fragmentation of DNA molecules; and DNA decay refers to the temporal process or rate at which DNA deteriorates over time.

      "The two most prominent reactions associated with DNA degradation are deamination (resulting with spontaneous substitutions of cytosine residues to uracil) and depurination (breakage of the phosphodiester bond resulting in DNA backbone fragmentation)." In view of the comment above on terminology, I believe that this sentence conflates different processes: deamination is a form of DNA damage, whereas depurination leads to DNA degradation through strand fragmentation. I suggest that the terminology in the paper be modified to reflect this distinction. Even if the authors do not wish to adopt this terminology, I suggest that they define their terms clearly at the beginning.

      Line 106: …six plant species, spanning…

      Line 98, 105: In this context, it is not appropriate to refer to deamination-induced substitutions as "mutations," since they represent post-mortem chemical damage rather than random biological changes (mutations) that occur in vivo. In addition, introducing this new term further complicates the terminology discussed in the previous comments.

      Line 116-118: I wonder if the sampling coverage for Hordeum, with highest counts in arid and warm regions, may be incomplete, as certain regions, such as northern Europe (e.g., Scandinavia) or Russia are not represented. These species are cultivated in Russia, Denmark, southern Sweden, I believe. Should this limitation be acknowledged as it could affect the generality of the conclusions especially regarding temperatures?

      It is unclear why the study included only wild Oryza species (O. alta, O. grandiglumis, O. latifolia, O. rufipogon), whereas for Hordeum the cultivated Hordeum vulgare was used. Perhaps, including Oryza sativa can provide more information on DNA preservation in domesticated material and allow a more consistent comparison across genera?

      Table 1: Draw a line above the last row (Total)

      Line 140: Oryza should be in italics

      Line 140: Why 58 Oryza (30 O. latifolia, 18 O. rufipogon and 10 O. grandiglumis)? Why not all Oryza samples?

      From line 169, it appears that an additional 287 Oryza samples from different origins (KAUST) were used, but it is not clear (not explained) if these are herbarium specimens, and why this origin (KAUST) is not included in Table 1. Perhaps it would be better to explain at the beginning of this paragraph that there are two subsets of samples and to clarify the content of Table 1 more clearly.

      Line 143: it is not specified which part of the herbarium material was used. I assume leaves, but this should be clearly stated

      Line 149: Please clarify what "gDNA" refers to; genomic DNA? Since you spell out "genomic DNA" elsewhere in the paragraph, the abbreviation here seems unnecessary.

      Line 149: Why was only a subset used? Please explain and provide a rationale.

      Line 154: were the libraries constructed only on this subset as well?

      Line 162: Fragment size: The first letter of the sentence should be capitalized.

      Lines 165-169: It is not clear to me how the different subsets of samples were used in this study. Here it is stated that all barley samples (but how many exactly?) were sequenced on NovaSeq in a specific place, whereas only 40 rice samples (from the initial subset of how many?) were sequenced on another NovaSeq platform and at a different institute. Also, the 287 samples from KAUST were sequenced on a MiSeq, which has lower output than NovaSeq. It is necessary to explain how the initial 573 samples were selected and used for all analyses. Also, the 287 samples from KAUST were processed in an ancient DNA lab, but what about all the other samples? It would be strange if a specialized laboratory for ancient DNA analyses was not used for all samples. In this regard, it should also be noted that the issue of contamination is not mentioned in the manuscript, although it was certainly considered by the authors; for example, the authors could indicate whether negative controls (blank samples) were used and how they were processed. Certainly, the C>T signal ensures that we are dealing with authentic ancient sequences, but this should be highlighted and explained more clearly.

      Line 189: Why was it aligned to Oryza glumipatula (a new species not mentioned before?) and not against Oryza rufipogon? The authors report measuring gDNA fragment size distributions on a subset of 40 samples. It would be helpful if they could provide a motivation for why this subset was chosen, and how it is representative of the full dataset, to clarify the rationale behind not analyzing all samples.

    2. Abstract: Herbarium collections are a vast but underutilized resource for ancient DNA research, containing over 400 million specimens with detailed metadata and spanning centuries of global biodiversity. Understanding patterns of DNA preservation in natural collections is crucial for optimizing ancient DNA studies and informing future curation practices. We analysed genomic data for 573 herbarium specimens from six plant species from the genera Hordeum and Oryza collected from the Americas and Eurasia over 220 years. Using standardized laboratory protocols and shotgun sequencing, we quantified DNA degradation and elucidated factors that accelerate it. We find significant age-dependent DNA fragmentation rates, indicating temporal degradation processes not detected in prehistoric samples. In our analysis, DNA decay rates in herbarium specimens were almost eight times faster than in moa bones, reflecting fundamental differences in tissue composition and preservation environments. Environmental conditions at the time of specimen collection emerged as the major determinants of post-mortem damage rates, with the interaction term between temperature and genus being the dominant driver of cytosine deamination. We find no effect of sample storage on DNA damage and degradation. These findings provide insights into how climatic origin, preservation environment, taxonomic identity and age influence DNA preservation while highlighting opportunities for improving institutional preservation practices. Due to standardised preservation conditions, museum collections can provide better insights into DNA damage and degradation over time than archaeological and paleontological samples.

      This work has been peer reviewed in GigaScience (see https://doi.org/10.1093/gigascience/giag026), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer 2:

      Reproducibility report for: Patterns of aDNA Damage Through Time and Environments - lessons from herbarium specimens
      Journal: Gigascience
      ID number/DOI: GIGA-D-25-00447
      Reviewer(s):
      • Laura Caquelin, Department of Clinical Neuroscience, Karolinska Institutet, Sweden [Wrote the report and reproduced the results]
      • Gustav Nilsonne, Department of Clinical Neuroscience, Karolinska Institutet, Sweden [Reviewed the final report]


      1. Summary of the study

      The authors evaluated DNA preservation in herbarium collections by analyzing genomic data from 573 specimens of Hordeum and Oryza. They quantified DNA degradation and identified factors affecting decay, finding that specimen age and environmental conditions strongly influence DNA preservation.

      2. Scope of reproducibility

      According to our assessment the primary objective is: the regression analyses of aDNA damage metrics for Hordeum and Oryza.

      • Outcome: "Four metrics were selected to quantify patterns of aDNA damage: (i) the proportion of endogenous DNA content, (ii) the fragment length distribution, (iii) the damage fraction per site (λ), and (iv) the frequencies of 5' C>T substitutions." (lines 197-199)
      • Analysis method outcome: "The four metrics were analysed in linear models as a function of collection year and sample age using the 'lm' function in R" (lines 199-200)

      • Main result: The results of this outcome are presented in figure 2 "Regression analyses of aDNA damage metrics for Hordeum and Oryza" and in the related text lines 302 to 361 in the "Regression analysis" section: "Endogenous fraction […] The regression analyses revealed no statistically significant relationship between the proportion of endogenous DNA and the sample collection year in Hordeum (R² = 0.003, p = 0.451, N = 211), but a very weak yet significant relationship was observed in Oryza (R² = 0.04, p = 0.00167, N = 245; figure 2a).

      Fragment length […] We observed a statistically significant relationship between the log-mean fragment length and the sample collection year for both genera (figure 2b), with a stronger relationship for Hordeum (R² = 0.27, p = 5.33 × 10⁻¹⁶, N = 211) than Oryza (R² = 0.112, p = 8.58 × 10⁻⁸, N = 245).

      Damage fraction per site (λ) and DNA decay rate (k) […] We estimated the DNA decay rate per year (k) for Hordeum and Oryza from the slope of the linear relationship between λ and sample age (figure 2c). We observed a per nucleotide decay rate of k = 2.64 × 10⁻⁴ per year for Hordeum (R² = 0.208, p = 3.27 × 10⁻¹², N = 211), which was 1.5 times faster than the decay rate of Oryza of k = 1.79 × 10⁻⁴ per year (R² = 0.101, p = 3.65 × 10⁻⁷, N = 245) […].

      Nucleotide misincorporations […] (figure 2d), with Oryza starting from a higher baseline of damage when compared to Hordeum and displaying a stronger relationship (R² = 0.303, p = 8.62 × 10⁻²¹, N = 245 for Oryza, and R² = 0.207, p = 3.63 × 10⁻¹², N = 211 for Hordeum, respectively). […]"
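
      The quoted passage estimates the decay rate k as the slope of a linear fit of λ against sample age. As a minimal sketch of that idea, assuming synthetic, noise-free data rather than the manuscript's measurements, the slope can be recovered with ordinary least squares. The function name and the specimen ages are hypothetical; the manuscript itself reports using R's 'lm' function. The last lines check the ratio of the two quoted decay rates.

```python
# Illustrative only: synthetic (age, lambda) pairs, not the manuscript's data.

def ols_slope_intercept(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical specimens: age in years and damage fraction per site (lambda).
ages = [20, 60, 100, 140, 180, 220]
k_true = 2.64e-4           # Hordeum value quoted in the text, used to build the toy data
baseline_true = 0.01       # assumed intercept, chosen arbitrarily
lambdas = [baseline_true + k_true * a for a in ages]

# The fitted slope is the per-year decay rate k.
k_est, baseline = ols_slope_intercept(ages, lambdas)

# Ratio of the two decay rates quoted in the text.
k_hordeum = 2.64e-4
k_oryza = 1.79e-4
ratio = k_hordeum / k_oryza

print(f"k = {k_est:.3e} per year, Hordeum/Oryza ratio = {ratio:.2f}")
```

      With the quoted rates, the ratio evaluates to about 1.47, consistent with the "1.5 times faster" wording once rounded.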


      3. Availability of Materials

      a. Data
      • Data availability: Raw data are not yet publicly available but uploaded in NCBI database. Processed data are shared via the private journal dropbox, and the intermediate file is available on the GitHub repository.
      • Data completeness: Complete processed data and intermediate file (all data necessary to reproduce main results are available).
      • Access Method: Private journal dropbox and GitHub repository
      • Repository: https://github.com/Stefano-Porrelli/Herbaria_aDNA_Damage
      • Data quality: Structured

      b. Code
      • Code availability: Open
      • Programming Language(s): R and Bash
      • Repository link: https://github.com/Stefano-Porrelli/Herbaria_aDNA_Damage
      • License: MIT license
      • Repository status: Public
      • Documentation: Clear Readme file. Additional details may be required to run the Bash code.

      4. Computational environment of reproduction analysis

      • Operating system for reproduction: MacOS 15.7.2
      • Programming Language(s): R
      • Code implementation approach: Using shared code
      • Version environment for reproduction: R version 4.5.1/RStudio 2025.05.1

      5. Results

      5.1 Original study results - Results 1: See screenshot figure 2:

      5.2 Steps for reproduction

      -> Run 01_Plant_aDNA_screening_prep.sh
      - Issue 1: The reviewer link provided for the bioprojects on NCBI did not allow downloading.
      -- Partial resolution: An email was sent to the authors requesting access to the raw data or sharing processed data and intermediate files. Processed data were shared via the private journal dropbox, and the intermediate file (aDNA_damage_screening_MAIN.txt) was shared both on the dropbox and the GitHub repository.

      The authors contacted NCBI to enable downloading the raw data with the reviewer link, but no response has yet been received. As the review needed to be performed within a set timeframe, the computational reproducibility review was performed first using the processed data and then directly with the intermediate file.

      Note: The two bash scripts were not run. Additional guidelines would be helpful for running these scripts, especially regarding terminal commands and manual steps (changing the repository name or the link to the data for example).

      -> Run the analysis from the processed data shared
      --> Run code aDNA_Dmg_Script00_collate_screening_results.r
      - Issue 2: The code expects data organized in two sub-folders: 4_mapping and 5_aDNA_characteristics. Processed data were received in several species-specific folders, each containing 4_mapping and 5_aDNA_characteristics.
      -- Resolved: All data were merged manually into single 4_mapping and 5_aDNA_characteristics folders to match the script's requirements. This detail should be added to the readme file.
      - Issue 3: The sample_metadata.txt file was not correctly merged with the results dataframe. Many columns (Batch_no to X) in aDNA_damage_screening_MAIN.txt contained NA values.
      -- Resolved: A message was sent to the authors to resolve the issue. They updated both sample_metadata.txt and aDNA_damage_screening_MAIN.txt on GitHub.

      Author's response: "I have realised the problem stems from inconsistencies between sample naming conventions in the screening output directories and the sample identifiers in the metadata file. Specifically, for the Hordeum samples, the directories are named using library IDs rather than the short sample names, and some of the Oryza samples were missing their expected suffixes. This meant the left_join step failed to match metadata for those samples. Thank you for flagging this up. I have now corrected this by updating the "Sample" column in the metadata file to reflect the actual directory names used in the screening outputs. The original short names are preserved in a "Sample_ID" column. I have uploaded the corrected sample_metadata.txt file to the GitHub repository, and also updated the aDNA_damage_screening_MAIN.txt dataset on the GitHub repo to reflect these changes. I have re-run the pipeline and it now works correctly. Please let me know if you encounter any further issues, and thank you again for catching this."

      The reproduced aDNA_damage_screening_MAIN.txt file no longer contains NA values.
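
      Issue 3 is a left join whose keys did not match, so the metadata columns came back empty. The following is a minimal sketch of that failure mode and of a pre-join check that would surface it, written in plain Python rather than the R pipeline the authors used. The sample names (LIB001, H1, and so on) and the helper functions are hypothetical; only the column names "Sample", "Sample_ID", and "Batch_no" come from the report.

```python
def left_join(results, metadata, key="Sample"):
    """Keep every results row; fill metadata fields with None when no key matches."""
    by_key = {row[key]: row for row in metadata}
    meta_fields = [f for f in metadata[0] if f != key]
    joined = []
    for row in results:
        match = by_key.get(row[key])
        extra = {f: (match[f] if match else None) for f in meta_fields}
        joined.append({**row, **extra})
    return joined


def unmatched_keys(results, metadata, key="Sample"):
    """List result keys with no metadata entry; run this before joining."""
    known = {row[key] for row in metadata}
    return [row[key] for row in results if row[key] not in known]


# Hypothetical screening output, keyed by directory (library) name.
results = [
    {"Sample": "LIB001", "damage": 0.031},
    {"Sample": "LIB002", "damage": 0.044},
]

# Metadata keyed by short sample names: no key matches the results.
metadata_old = [
    {"Sample": "H1", "Batch_no": 1},
    {"Sample": "H2", "Batch_no": 1},
]

# Metadata after the fix described above: "Sample" holds the directory name,
# and the original short name moves to "Sample_ID".
metadata_fixed = [
    {"Sample": "LIB001", "Sample_ID": "H1", "Batch_no": 1},
    {"Sample": "LIB002", "Sample_ID": "H2", "Batch_no": 1},
]

missing_before = unmatched_keys(results, metadata_old)
missing_after = unmatched_keys(results, metadata_fixed)
joined_old = left_join(results, metadata_old)
joined_fixed = left_join(results, metadata_fixed)

print("unmatched before fix:", missing_before)
print("unmatched after fix:", missing_after)
```

      Failing loudly when the unmatched list is non-empty is one way to catch a naming mismatch before missing values propagate into the regression input.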

      --> Run code aDNA_Dmg_Script02_Regressions.r: The script was run without any issues.

      -> Run the analysis from the intermediate data file shared on Github
      --> Run code aDNA_Dmg_Script02_Regressions.r: Run the code after renaming the file to aDNA_damage_screening_MAIN_shared.txt.

      5.3 Statistical comparison Original vs Reproduced results
      - Reproduced results:
      -- Using the processed data and the reproduced aDNA_damage_screening_MAIN.txt, the results of Figure 2 were successfully reproduced (see screenshots below).
      -- Using the shared aDNA_damage_screening_MAIN.txt from GitHub, the results were also successfully reproduced (see screenshots below).

      • Comments: Supplementary Figure 1 was also reproduced using the same code. We confirmed that the reproduced values match the original results. Both the processed data and the intermediate data file reproduced Supplementary Figure 1 (see screenshots below).
      • Errors detected: One reporting error was detected in the "Fragment length" section (line 336): the p-value for Oryza should be 8.47 × 10⁻⁸, not 8.58 × 10⁻⁸ as reported in the text.
      • Statistical Consistency: All statistical results reproduced from both the processed data and the intermediate file are identical to those reported in the manuscript (see Comparison_reproduced_vs_original.csv and Comparison_two_reproductions.csv files attached with this report).
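
      A comparison of reported and reproduced statistics like the one described above can be sketched as a check at a fixed number of significant figures. This is an assumption about how such a check might be written, not the reviewers' actual procedure. The helper name is hypothetical, and the "reproduced" values are copied from the statements in this report rather than recomputed.

```python
def same_to_sig_figs(a, b, sig=3):
    """Compare two numbers after rounding both to `sig` significant figures."""
    fmt = f".{sig - 1}e"
    return format(a, fmt) == format(b, fmt)


# (reported in manuscript, reproduced) pairs for the fragment length p-values.
# The Hordeum value is listed as matching because the report describes it as identical.
comparisons = {
    "Hordeum fragment length p": (5.33e-16, 5.33e-16),
    "Oryza fragment length p": (8.58e-8, 8.47e-8),
}

mismatches = [
    name
    for name, (reported, reproduced) in comparisons.items()
    if not same_to_sig_figs(reported, reproduced)
]

print("mismatches:", mismatches)
```

      Comparing formatted values instead of raw floats avoids flagging differences that lie below the precision printed in the manuscript.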

      6. Conclusion

      • Summary of the computational reproducibility review: The computational reproducibility review shows that the results in Figure 2 and related text of the original study were fully reproducible using both the processed data and the intermediate data file shared (aDNA_damage_screening_MAIN.txt). The statistical results reproduced are identical to those presented in the manuscript. One minor reporting error was detected in the manuscript: the p-value for Oryza in the "Fragment length" section should be 8.47 × 10⁻⁸ instead of 8.58 × 10⁻⁸.

      • Recommendations for authors:
      -- Provide clearer instructions for running the Bash scripts, including terminal commands and any manual steps.
      -- Ensure consistent sample naming across metadata files and data directories to avoid merging issues for all analysis/scripts.
      -- Consider making raw data publicly available or provide clear guidance for reviewers to access it.
      -- Maintain clear documentation of file structure to facilitate future reproducibility.

    3. Abstract: Herbarium collections are a vast but underutilized resource for ancient DNA research, containing over 400 million specimens with detailed metadata and spanning centuries of global biodiversity. Understanding patterns of DNA preservation in natural collections is crucial for optimizing ancient DNA studies and informing future curation practices. We analysed genomic data for 573 herbarium specimens from six plant species from the genera Hordeum and Oryza collected from the Americas and Eurasia over 220 years. Using standardized laboratory protocols and shotgun sequencing, we quantified DNA degradation and elucidated factors that accelerate it. We find significant age-dependent DNA fragmentation rates, indicating temporal degradation processes not detected in prehistoric samples. In our analysis, DNA decay rates in herbarium specimens were almost eight times faster than in moa bones, reflecting fundamental differences in tissue composition and preservation environments. Environmental conditions at the time of specimen collection emerged as the major determinants of post-mortem damage rates, with the interaction term between temperature and genus being the dominant driver of cytosine deamination. We find no effect of sample storage on DNA damage and degradation. These findings provide insights into how climatic origin, preservation environment, taxonomic identity and age influence DNA preservation while highlighting opportunities for improving institutional preservation practices. Due to standardised preservation conditions, museum collections can provide better insights into DNA damage and degradation over time than archaeological and paleontological samples.

      This work has been peer reviewed in GigaScience (see https://doi.org/10.1093/gigascience/giag026), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer 1:

      The manuscript by Stefano Porrelli and colleagues makes a valuable contribution by scaling up previous work on DNA damage in plant herbarium specimens and by exploring how collection environments influence patterns of aDNA degradation. The authors present a large-scale analysis of DNA damage in 573 specimens from six Hordeum and Oryza species spanning ~220 years and diverse climates. Using standardized ancient DNA protocols, shotgun sequencing, and high-resolution climate data, they model the effects of specimen age, collection environment, genus, and herbarium of origin on DNA fragmentation, decay rates, and cytosine deamination.

      The study robustly confirms that DNA fragmentation and λ are strongly age-dependent, that herbarium specimens exhibit decay rates intermediate between bones and arthropods, and that environmental factors (particularly temperature) appear to correlate with 5′ C→T damage when all samples are analysed together. At the same time, some aspects of the temperature interpretation, especially in relation to genus-level structure, merit further clarification (as detailed below). Storage conditions (herbarium identity) seem to have comparatively minor influence.

      Overall, I enjoyed reading this research: the dataset is rich, the methodological framework is strong, and the work has significant potential to become a reference for understanding plant aDNA preservation in herbaria. I believe the paper merits publication, though several concerns should be addressed prior to its acceptance. Please find below several points that I hope will help strengthen and refine the manuscript.


      Major comments

      Definition and calculation of endogenous DNA fraction

      You define endogenous fraction as "the percentage of post-quality trimmed and merged reads for each sample mapped to its respective reference" (lines 203-206) and say it "was calculated with SAMtools 'flagstat'" (line 206). However, this is somewhat ambiguous:

      Is the denominator the number of merged reads after AdapterRemoval, the total raw reads, or only non-duplicate mapped reads?

      Do you include secondary/supplementary alignments (multi mappers), and how are PCR duplicates treated here?

      Given that endogenous fraction is one of your four key metrics (Methods, lines 197-200), it would be useful to make this completely explicit.
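      To illustrate why the choice of denominator matters, here is a minimal sketch in Python. The read counts and the three definitions are hypothetical and chosen only for illustration; they are not taken from the manuscript, and the function `endogenous_fraction` is an assumed helper, not part of SAMtools.

```python
def endogenous_fraction(mapped_reads: int, denominator_reads: int) -> float:
    """Return mapped reads as a percentage of the chosen denominator."""
    if denominator_reads <= 0:
        raise ValueError("denominator_reads must be positive")
    return 100.0 * mapped_reads / denominator_reads


# Hypothetical counts for a single library (illustrative only, not from the manuscript).
counts = {
    "raw_reads": 1_000_000,
    "merged_reads": 800_000,     # after quality trimming and merging
    "mapped_primary": 200_000,   # primary alignments, duplicates included
    "mapped_dedup": 150_000,     # primary alignments after duplicate removal
}

# Three plausible readings of "endogenous fraction" (assumed, for comparison).
definitions = {
    "mapped / merged": (counts["mapped_primary"], counts["merged_reads"]),
    "mapped / raw": (counts["mapped_primary"], counts["raw_reads"]),
    "dedup mapped / merged": (counts["mapped_dedup"], counts["merged_reads"]),
}

results = {
    name: endogenous_fraction(mapped, denom)
    for name, (mapped, denom) in definitions.items()
}

for name, value in results.items():
    print(f"{name}: {value:.2f}%")
```

      With these made-up counts the same library yields three different values depending on the definition, which is why an explicit statement of numerator and denominator would help readers compare studies.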

      Need a better explanation of the "month of collection" variable

      Lines 266-273: you state that monthly temperature and precipitation were extracted "to infer climatic conditions at the time of specimen collection" and that in the collection climate model variables were assigned "based on their location and month of collection." Later, in the Results you again refer to "collection climate" and "annual climate" models (lines ~438-441).

      However, it is not entirely clear whether month is explicitly included as a variable (e.g. as a categorical factor or via the corresponding monthly raster) or whether you simply used the CHELSA monthly layer corresponding to the recorded month? Please clarify in the Methods how the month of collection enters the model. Is there a variable "month" per se, or is the only effect that you choose the relevant tas_XX and pr_XX layer?

      This would make it much easier for readers to follow how "month" is used and what the collection climate actually represents.

      Need a clarification of "Collection Climate" vs. Herbarium Storage

      In the Methods (lines ~271-274), you describe a collection climate model where "monthly climatic variables (temperature and precipitation) were assigned to samples based on their location and month of collection," and an annual climate model based on annual means at the collection location. However, it is not clearly stated how this model relates to the actual time each specimen spent in the field vs. in herbarium storage. By definition, a 150-year-old specimen will have spent the majority of its lifetime in a collection, yet the climate used in the models is that of the collection locality at the time of sampling, not the climate of the herbarium building where it spent decades, despite the herbarium being included as a factor.

      Could you please clarify explicitly what period of a specimen's "life after death" you intend to capture with the collection climate model? Is it mainly the drying/early post-mortem period, or are you also considering longer-term storage conditions in the herbarium? Do you assume that most deamination and oxidative damage occur in the first days to months after collection, and that later storage in relatively stable herbarium conditions contributes little to further degradation?

      Need for the integration of non-deamination mismatch controls and baseline divergence

      Your analysis focuses on the aDNA-typical 5′ C→T misincorporations (Methods, lines 238-245; Results, lines 355-361). However, you do not show any other mismatch frequencies (e.g. A→G, G→A etc) as a "negative control" to demonstrate that the patterns you report (exponential decay, climate, age, genus effects) are specific to deamination rather than general elevation of error rates or mapping artefacts.

      On that specific point, Lines 622-624 and 651-653: You attribute the higher 5′ C>T frequencies in Oryza to greater susceptibility to post-mortem deamination, potentially linked to its tropical and sub-tropical distribution. However, because Oryza originates from consistently warmer regions while Hordeum is predominantly temperate, genus and temperature are strongly confounded in your dataset. This is also supported by your own variance partitioning analysis, where large shared variance fractions (temperature × genus) indicate that these two predictors are difficult to disentangle.

      Furthermore, Figure 6 shows that when analysing each genus separately, the relationship between either annual mean temperature or collection temperature and 5′ C>T frequencies is no longer significant. This suggests that the global temperature-damage correlation you report is largely driven by genus-level differences rather than by temperature acting independently, or am I wrong? Otherwise, could you add a bit of discussion on that point to explain why, if temperature does have an impact on deamination, we do not see this within each genus across different temperature values?

      While I agree that environmental conditions at the time of collection may influence DNA degradation, another factor that could contribute to the observed genus-specific patterns is reference-read divergence. Indeed, a recent unreviewed work (see preprint: https://doi.org/10.1101/2025.07.16.665190) showed that the percentage identity between the reference genome and the ancient reads can influence apparent damage estimates. Although divergence between the ancient Hordeum/Oryza reads and their respective references is unlikely to be extreme given that plants do not evolve as rapidly as microbial taxa, a sanity check (e.g., adding the percentage average identity of each species per genus in the model) would help confirm that reference mismatch is not inflating differences in estimated 5′ C>T frequencies between genera.


      Minor comments

      Title : "Patterns of aDNA Damage Through Time end Environments" → "Time and Environments."

      Line 95 - ex situ in italic.

      Line 140 and elsewhere: Oryza should be in italics whenever used as a genus (same for Hordeum).

      Line ~551: "extremally well-preserved samples" → "extremely well-preserved samples."

      It may help to add one sentence acknowledging that classical laboratory negative controls (blank extractions) are not relevant to the regression models, but that misincorporation spectra and MapDamage profiles effectively serve as authenticity checks (Methods, lines 176-187 and 238-247).

      Discussion lines 641-648 compare herbarium specimens to bones and arthropods. It might help the reader if you add one explicit sentence summarizing why age-fragmentation relationships are detectable in herbaria but not in bones (standardized post-collection environment, as you nicely explain in lines 595-603).

      In Figure 6, consider adding a brief note in the legend stating that the strong relationship in panels a-b is largely driven by contrasting climates and baseline damage between genera, and that it disappears within genera (c-d). This would remind readers of the confounding you discuss in the text.

      In the Methods you state that you used linear models (lm) for regressions and varpart + rda for variance partitioning (lines 197-201 and 269-281). While the overall approach is reasonable, it would help to briefly address whether model assumptions (normality, homoscedasticity) were checked for the linear regressions (e.g. on log-transformed variables).

      While the manuscript mentions storage effects in the discussion, it doesn't explore them in great detail. More focus on specific herbarium storage methods (e.g., temperature, humidity control) might help contextualize the minor storage effects observed. A brief section or discussion on institutional preservation practices and their variability could provide readers with more context about herbarium differences.

    1. *

      Yicong Wang: This Annotation refers to the allusions used in the commentary by Feng Zhenluan.

      • 魯男子 Man in the Lu State: First mentioned in The Book of Poetry, which tells the story of a man from the Lu State who had no lust for women and refused to stay with a woman on a night of rainstorm. The term was later commonly used in classical Chinese to refer to a man with no lust for women.
      • 北邙 Beimang Mountain: The mountain was the site of many premodern Chinese nobles' tombs; its name was commonly used in classical Chinese to refer to tombs.
    2. 💬

      Dan Minglun: She was clearly a beautiful woman, yet had a bluish-green face and jagged, saw-like teeth, only wearing a coloured-painted human skin? Those in the world who confuse people with gaudiness are actually those people who wear human skin, and paint them with coloured pen every day. Alas, that is so scary!

      但明倫:明明麗人也,而乃翠面鋸齒,徒披采繪之人皮者乎?世之以妖冶惑人者,固日日鋪人皮,執采筆而繪者也。吁!可畏矣!

    3. 💬

      Feng Zhenluan: Everyone would see her and call her a beauty, but I would see her as a fiendish ghost. If everyone had my eyes, they would all be men with no lust for women (Man in the Lu State).* My heart is like a dead tree, abstinent like the sage, deity, or Buddha. Otherwise, if the mind is deluded by desires, I will be dust in the grave (under the Beimang Mountain).

      馮鎮巒: 人見呼佳人,我見如獰鬼,人人如我眼,便是魯男子 。此心即枯木,聖賢仙佛矣,不然心眼迷,北邙 山下土。

    1. Abstract. Background: West Africa has high biodiversity that is relatively understudied, especially for insects. Studies of West African arthropod diversity can therefore help address important questions regarding conservation, ecosystem services, and insecticide use and other species-control interventions in agriculture and disease management. We intensively sampled arthropods in Ghana using complementary trapping methods, generated DNA barcodes, and classified sequences by Barcode Index Numbers (BINs, a species proxy). Using this dataset, we investigate assemblage composition, temporal activity patterns, and the state of regional biodiversity sampling. Results: Sequencing DNA from 95,996 individuals captured using Malaise, yellow pan, pitfall, Heath and Centre for Disease Control (CDC) traps, we identified 10,120 unique BINs. The rate of species accumulation did not approach an asymptote for any taxonomic group or trap type, indicating high biodiversity. The different trap types sampled different subsets of the local community, with greatest similarity between yellow pan and pitfall traps. More insects and species (BINs) were trapped during the day than at night. Our dataset shared more BINs in the Barcode of Life Database with South Africa than with any other country, although this predominantly reflects the limited sampling and DNA sequencing campaigns in Africa. Conclusions: This study more than doubles the published BINs for West Africa, offering insights into the biodiversity of an ecologically important but understudied taxon and region. Using multiple trap types allowed a more complete assessment of the local arthropod assemblage. The public release of these data will support and stimulate further taxonomic and ecological work in the region.

      This work has been peer reviewed in GigaScience (see https://doi.org/10.1093/gigascience/giag028), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer 3:

      This paper describes a massive DNA barcoding project of arthropods in Ghana, West Africa with a dataset of 95,996 individuals and 10,120 BINs (Barcode Index Numbers). The publication is a major contribution to characterizing biodiversity of tropical insects in a poorly studied area, answering methodological questions concerning trap complementarity and temporal activity, and is also an invaluable resource to the public. The research is well structured, analyses are favorable and the manuscript is well written. I recommend acceptance after minor revisions to address a few clarifications and technical points.

      1. The manuscript acknowledges that only a subset of individuals was sequenced due to logistical constraints, and for Heath traps, selection was based on wet mass. While the authors argue that sub-sorting aimed to maximize diversity, this could still introduce biases in abundance estimates and BIN accumulation curves. Please include a brief discussion of how this sub-sampling might affect the conclusions (e.g., richness estimates, trap comparisons) and consider adding a sensitivity analysis in the supplement if feasible.

      2. The finding that South Africa shares the most BINs with Ghana despite geographic distance is interesting and attributed to sampling effort. However, the regression model explains only 3% of variance (R2=0.03), suggesting other factors may be at play. Please discuss potential biogeographic or ecological reasons (e.g., similar habitats, historical connectivity) that might contribute to this pattern, even if sampling effort is the dominant driver.

      3. The use of BINs as a species proxy is appropriate for this study, but the manuscript should briefly acknowledge known limitations (e.g., BINs may over- or under-split species, particularly in poorly studied taxa). A sentence or two in the Discussion would suffice, noting that BINs are a pragmatic tool for biodiversity assessment but not a replacement for formal taxonomy.

      4. Line 381: "insFect" should be "insect".

      5. Table 1 and Table 2 are well-presented, but consider adding a footnote explaining that "BINs unique to trap type" means not found in other trap types in this study.

      6. Line 140: Specify the soap concentration used in pan and pitfall traps.

      7. Line 150: Clarify how "wet mass" was measured (precision, handling protocol).

      8. Line 156: Mention the success rate of PCR and sequencing (how many samples failed?).

      9. Line 360-379: The section on "Taxa of potential human importance" is interesting but could be strengthened by relating findings to local agricultural or health contexts. For example, what do the low numbers of crop pests or disease vectors imply for local management?

      10. Line 390-396: The conclusion could briefly highlight future directions, e.g., integrating morphological taxonomy with BINs, or using this dataset for metabarcoding studies.

      11. Line 228: "Neuroptera had the lowest completeness at 13.5%" - mention the sample size for this order.

      12. Line 302: "β = -1.92, p >0.05" - report the exact p-value.

    2. Abstract. Background: West Africa has high biodiversity that is relatively understudied, especially for insects. Studies of West African arthropod diversity can therefore help address important questions regarding conservation, ecosystem services, and insecticide use and other species-control interventions in agriculture and disease management. We intensively sampled arthropods in Ghana using complementary trapping methods, generated DNA barcodes, and classified sequences by Barcode Index Numbers (BINs, a species proxy). Using this dataset, we investigate assemblage composition, temporal activity patterns, and the state of regional biodiversity sampling. Results: Sequencing DNA from 95,996 individuals captured using Malaise, yellow pan, pitfall, Heath and Centre for Disease Control (CDC) traps, we identified 10,120 unique BINs. The rate of species accumulation did not approach an asymptote for any taxonomic group or trap type, indicating high biodiversity. The different trap types sampled different subsets of the local community, with greatest similarity between yellow pan and pitfall traps. More insects and species (BINs) were trapped during the day than at night. Our dataset shared more BINs in the Barcode of Life Database with South Africa than with any other country, although this predominantly reflects the limited sampling and DNA sequencing campaigns in Africa. Conclusions: This study more than doubles the published BINs for West Africa, offering insights into the biodiversity of an ecologically important but understudied taxon and region. Using multiple trap types allowed a more complete assessment of the local arthropod assemblage. The public release of these data will support and stimulate further taxonomic and ecological work in the region.

      This work has been peer reviewed in GigaScience (see https://doi.org/10.1093/gigascience/giag028), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer 2:

      General Comments:

      This manuscript presents an impressive and highly valuable study that significantly advances our understanding of tropical arthropod diversity in West Africa. The sampling effort is extraordinary (nearly 100,000 individuals sequenced), and the dataset generated more than doubles the number of Barcode Index Numbers (BINs) publicly available for the region. The study is well-designed, employing multiple complementary trap types to capture diverse components of the arthropod community. The analyses are generally robust and appropriate for the research questions. The public release of this large dataset is a major contribution that will undoubtedly stimulate further taxonomic and ecological research in understudied tropical regions. The manuscript is clearly written and well-structured. I am generally in favour of acceptance after minor revisions.

      Specific Comments and Suggestions for Revision:

      1. Visual Documentation of Methods. The manuscript would benefit from including representative photographs of each of the five trap types (Malaise, yellow pan, pitfall, Heath, CDC) as deployed in the field. This is particularly helpful for readers less familiar with entomological methods. Given potential space constraints in the main text, I recommend including these as a Supplementary Figure (e.g., a panel of five photos with concise captions). Please cite this figure in the Methods (Sampling) section.

      2. Robustness of Community Composition Analyses. The NMDS and PERMANOVA results convincingly show differences among trap types. However, the sequencing effort (and thus sample size) varied greatly among traps (e.g., Heath: 65,293 samples vs. CDC: 3,039 samples). Could the authors please clarify if the Bray-Curtis dissimilarity matrices used in these analyses were calculated on standardized or rarefied data to account for this large disparity in sample size? A brief note in the Methods (Data analyses) or figure legend would assure readers that the observed patterns are not primarily an artefact of sampling intensity.

      The finding of significantly higher diurnal catches (individuals and BINs) in Malaise traps is interesting. The discussion briefly mentions variance in thermal conditions. Could the authors expand the Discussion (Diurnal activity patterns) to include other potential ecological or methodological explanations? For example, might this reflect true peaks in flight activity for dominant taxa (Diptera, Hymenoptera), or could it be influenced by trap visibility or wind patterns differing between day and night? A sentence or two of speculation would enrich the interpretation.

      The authors transparently note that only 34 of 117 Malaise lots were fully sequenced and that spiders were removed from some analyses. In the Discussion, please add a short statement evaluating how these practical limitations might have influenced the key conclusions regarding trap complementarity and overall community completeness. For instance, does the high rate of BIN accumulation in Malaise traps (Supplementary Figure 6) suggest that sequencing the remaining lots might have yielded many additional unique BINs, potentially altering the estimated contribution of this trap type?

      3. Minor Editorial and Clarity Points:

      • Line 381: There is a typo: "more insFect individuals" should be "more insect individuals".
      • Figure 2 & 3 Citations in Text: The in-text citations for Figures 2 and 3 (e.g., lines 239, 274-277) are currently embedded in the legend descriptions copied from the PDF. These should be simplified to standard figure calls (e.g., "(Figure 2)", "(Figure 3A, B)") and the legend text removed from the main manuscript body.
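      The question above about standardized or rarefied Bray-Curtis matrices can be illustrated with a minimal Python sketch. It assumes simple rarefaction (random subsampling without replacement to a common depth) and uses invented BIN counts; it is not the authors' pipeline, and the helper names `rarefy` and `bray_curtis` are hypothetical.

```python
import random
from collections import Counter


def rarefy(counts: dict[str, int], depth: int, rng: random.Random) -> Counter:
    """Subsample `depth` individuals without replacement from a BIN count table."""
    pool = [bin_id for bin_id, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of individuals in the sample")
    return Counter(rng.sample(pool, depth))


def bray_curtis(a: dict[str, int], b: dict[str, int]) -> float:
    """Bray-Curtis dissimilarity: 1 - 2 * sum(min(a_i, b_i)) / (sum(a) + sum(b))."""
    shared = sum(min(a.get(k, 0), b.get(k, 0)) for k in set(a) | set(b))
    total = sum(a.values()) + sum(b.values())
    return 1.0 - 2.0 * shared / total


# Invented example: two traps with identical BIN proportions but a 20-fold
# difference in the number of sequenced individuals.
heath = {"BIN_A": 600, "BIN_B": 300, "BIN_C": 100}
cdc = {"BIN_A": 30, "BIN_B": 15, "BIN_C": 5}

rng = random.Random(42)  # fixed seed so the subsample is reproducible
depth = min(sum(heath.values()), sum(cdc.values()))

raw = bray_curtis(heath, cdc)
rarefied = bray_curtis(rarefy(heath, depth, rng), rarefy(cdc, depth, rng))

print(f"raw counts:      {raw:.3f}")
print(f"rarefied counts: {rarefied:.3f}")
```

      On raw counts the two invented traps look highly dissimilar even though their composition is proportionally identical, because Bray-Curtis is sensitive to total abundance. Rarefying to a common depth removes most of that effect, which is the kind of check the comment asks the authors to document.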

    3. Abstract. Background: West Africa has high biodiversity that is relatively understudied, especially for insects. Studies of West African arthropod diversity can therefore help address important questions regarding conservation, ecosystem services, and insecticide use and other species-control interventions in agriculture and disease management. We intensively sampled arthropods in Ghana using complementary trapping methods, generated DNA barcodes, and classified sequences by Barcode Index Numbers (BINs, a species proxy). Using this dataset, we investigate assemblage composition, temporal activity patterns, and the state of regional biodiversity sampling. Results: Sequencing DNA from 95,996 individuals captured using Malaise, yellow pan, pitfall, Heath and Centre for Disease Control (CDC) traps, we identified 10,120 unique BINs. The rate of species accumulation did not approach an asymptote for any taxonomic group or trap type, indicating high biodiversity. The different trap types sampled different subsets of the local community, with greatest similarity between yellow pan and pitfall traps. More insects and species (BINs) were trapped during the day than at night. Our dataset shared more BINs in the Barcode of Life Database with South Africa than with any other country, although this predominantly reflects the limited sampling and DNA sequencing campaigns in Africa. Conclusions: This study more than doubles the published BINs for West Africa, offering insights into the biodiversity of an ecologically important but understudied taxon and region. Using multiple trap types allowed a more complete assessment of the local arthropod assemblage. The public release of these data will support and stimulate further taxonomic and ecological work in the region.

      This work has been peer reviewed in GigaScience (see https://doi.org/10.1093/gigascience/giag028), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer 1:

      The manuscript "Characterising a species-rich and understudied tropical insect fauna using DNA Barcoding" by Hemprich-Bennett and co-authors provides DNA barcodes from 95,996 individuals sampled in Ghana using various trap systems. In total, 10,120 unique BINs were identified, including 4,939 that were newly generated. Most sampled taxa were Diptera, Coleoptera, and Lepidoptera. In addition, the authors compared the determined BINs with already published data at BOLD, revealing the greatest overlap in BIN sharing with South Africa. In my eyes, the topic of this manuscript is interesting and suitable for publication in "GigaScience", which focuses on "big data" research. The amount of new sequence data for arthropods, in particular insects, is awesome and represents an important step to assess the (molecular) biodiversity, or better species diversity, of a super diverse region which has hardly been studied so far. The authors use state-of-the-art methods to analyze their data including the BOLD database and BIN approach. However, there are some points that should be added or discussed in a broader context (see below). In addition, please find some specific comments made via sticky notes on the PDF file of the manuscript.

      I feel that the authors should provide some more references on various topics, especially in the introduction but also in the discussion.

      It would be nice to present some maps, photos of the collection sites, the sampling devices as well as the samples themselves as part of the main manuscript, documenting the efforts that were taken.

      A BIN does not per se represent a species, because the variability of the DNA barcode fragment and mitochondrial DNA in general can be affected by various effects, e.g., incomplete lineage sorting, Wolbachia infections (especially true for arthropods), phylogeographic events, hybridization, and others. As a consequence, BIN sharing and splitting can be observed, and in fact such effects are found more often than expected. It is fully clear that such analysis cannot be done for the given dataset, but a discussion of these effects is important and has been lacking thus far.

      What happened with the vouchers and DNA extracts? It is obvious that the collected specimens will include a high number of undescribed species, therefore the deposition of the voucher specimens is highly important.

      In my eyes it would be interesting to provide a summary of the lengths of the barcodes that were studied. How many barcodes were complete with a length of 658 base pairs? How many were about 300 bp etc.? I think such analysis can be easily done and visualized.
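      A summary of barcode lengths of the kind suggested above could be sketched as follows in Python. The length classes other than the 658 bp full-length value are illustrative assumptions, the example records are invented, and the helper names are hypothetical; real input would come from the authors' sequence files.

```python
from collections import Counter

FULL_LENGTH_BP = 658  # full barcode length mentioned in the comment above


def barcode_length(sequence: str) -> int:
    """Count unambiguous bases, ignoring gaps and N characters."""
    return sum(1 for base in sequence.upper() if base in "ACGT")


def length_class(n_bp: int) -> str:
    """Assign a length to a class (class boundaries are illustrative assumptions)."""
    if n_bp >= FULL_LENGTH_BP:
        return "full (>=658 bp)"
    if n_bp >= 500:
        return "long (500-657 bp)"
    if n_bp >= 300:
        return "medium (300-499 bp)"
    return "short (<300 bp)"


def summarise(sequences: dict[str, str]) -> Counter:
    """Tally how many barcodes fall into each length class."""
    return Counter(length_class(barcode_length(seq)) for seq in sequences.values())


# Invented records for illustration only.
example = {
    "rec1": "ACGT" * 164 + "AC",      # 658 unambiguous bases
    "rec2": "ACGT" * 100 + "N" * 50,  # 400 unambiguous bases plus Ns
    "rec3": "acgt" * 80 + "---",      # 320 bases, lower case, with gaps
    "rec4": "ACGT" * 50,              # 200 bases
}

summary = summarise(example)
for label, count in sorted(summary.items()):
    print(f"{label}: {count}")
```

      The resulting tally could be plotted as a simple bar chart or histogram to show what share of the barcodes reach full length.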

      Please find some other specific suggestions for corrections or additions made via notes on the document file of the manuscript.

    1. CASE ILLUSTRATION 6 (CONTD.) The primary care physician made an agreement with the girl and her parents that they would work together on the problem until the bullying was over. The girl would share her experiences with her parents, despite embarrassment, and her parents would take these events seriously and keep her from facing them alone. The parents would speak with the school administration and suggest a plan. Further history taking revealed that she was an excellent artist. The parents were encouraged to praise her for her artistic accomplishments as well as other achievements, and they encouraged school staff to do the same. They also sought new opportunities for their daughter to exhibit these strengths to herself and to others. The youngster’s parents kept a record of bullying episodes and communicated these with the school principal. Eventually, enrollment in an after-school art class helped this girl develop new friendships, which improved her self-esteem and made her less vulnerable to being bullied.

      .

    2. CASE ILLUSTRATION 6 During a routine health supervision visit of a 12-year-old girl, your customary questioning of social development reveals that this seventh grade student has been having problems with peers at school. She dislikes school and many of her classmates. Problems began about 3 months ago when another girl knocked an apple out of her hand and onto the cafeteria floor. Your patient tried to swat at the girl (but missed) and was reprimanded by the lunch monitor. Your patient broke into tears at that time and has since been the butt of jokes among a group of girls. False rumors about her have been spread at school and through social media.

      .

    3. CASE ILLUSTRATION 5 A 3-year-old girl has shown her genitals to a peer and has commented on her father’s genitals. One week prior to their visit to your office, the little girl began masturbating at home and occasionally in public. Not knowing how to react, her parents have been begging their daughter to stop. They are concerned that her sexualized behavior indicates something is wrong.

      .

    4. CASE ILLUSTRATION 4 (CONTD.) In the second case, the child was offered three wholesome meals and one snack at preset times of the day. After telling their daughter once about the meal, parents were not to engage in any discussion with their child about the volume eaten. No other foods in the house were made available to her during this behavioral management period. Between meals this girl was allowed an unlimited quantity of water, but nothing else. After a difficult period of 1½ days (thrown silverware, persistent crying, etc.), she began to nibble at new foods and to enjoy the positive attention for doing so. Although the child still enjoyed only a limited range of foods, parents were able to expand her repertoire to include broccoli, milk, and pasta.

      .

    5. CASE ILLUSTRATION 3 A 32-month-old boy refuses to go to bed on time. He prolongs bedtime rituals by making numerous requests (e.g., for water, use of bathroom, and adjusting the door). He repeatedly leaves his bed. On many nights he finally falls asleep in the living room or his parents’ bedroom while spending time with his parents.

      .

    6. CASE ILLUSTRATION 2 (CONTD.) In this case, the child’s tantrums began as a result of typical frustrations experienced by children his age. Over a period of months, as his parent became busier with other family needs, however, he discovered that expressing anger was an excellent way to get adult attention, and the frequency of these behaviors increased. As part of the management plan, his parent was instructed to ignore his anger and put him in his room for three minutes when he became physically violent toward others. She was also advised to increase time spent doing positive things with him, like playing games, going on walks, and having him help around the house. At day care he was given increased attention during times he was behaving well. Child care providers were asked to ignore him when he was aggressive toward other children and to shower a noticeable amount of attention on the other child. Within a couple of weeks he stopped biting and seemed happier. Although he still had tantrums, these strategies gave his mother the feeling that she had some control over the situation.

      .

    7. CASE ILLUSTRATION 2 The parent of a 3-year-old boy reports that her son throws himself on the floor, throws objects, and screams … usually when he does not get his own way. This seems to happen daily. At his child care center, he has begun to bite other children when he is angry, and other parents have begun to complain about him.

      .

    8. CASE ILLUSTRATION 1 (CONTD.) On further questioning in this case, the parents reported that their baby falls asleep immediately after daytime feeds and sleeps for 3–5 consecutive hours thereafter. Since this baby did not adapt to an acceptable day/night schedule, the doctor recommended waking the baby up after no more than 2–3 hours of daytime sleep. Parents were to occupy their infant’s daytime hours by walking around, talking, playing music, and other activities. It was recommended that nighttime feeds be made minimally stimulating: soften the lights, produce minimal noise, and avoid “fun” interactions at night. Although sleeping and feeding “on demand” does not need to be discouraged if parents find it acceptable, in this case the infant’s pattern was distressing to the parents. After 5–6 days of compliance with this schedule, it became easier for the parents to keep their daughter awake during the day, and they settled for a nighttime feed before they went to bed at 11 p.m. and another feed at 4 a.m.

      .

    1. Creating a List of References

       Each of the sources you cite in the body text will appear in a references list at the end of your paper. While in-text citations provide the most basic information about the source, your references section will include additional publication details. In general, you will include the following information:

       • The author’s last name followed by his or her first (and sometimes middle) initial
       • The year the source was published
       • The source title
       • For articles in periodicals, the full name of the periodical, along with the volume and issue number and the pages where the article appeared

       Additional information may be included for different types of sources, such as online sources.

      creating a list of references

    2. When you do choose to quote directly from a source, follow these guidelines:

       • Make sure you have transcribed the original statement accurately.
       • Represent the author’s ideas honestly. Quote enough of the original text to reflect the author’s point accurately.
       • Never use a stand-alone quotation. Always integrate the quoted material into your own sentence by creating a signal phrase.
       • Use ellipses (…) if you need to omit a word or phrase. Use brackets [ ] if you need to replace a word or phrase.
       • Make sure any omissions or changed words do not alter the meaning of the original text. Omit or replace words only when absolutely necessary to shorten the text or to make it grammatically correct within your sentence.
       • Write away from the quote. Create an original sentence following the quote that introduces the connection you are making between your argument and the quoted material.
       • Include correctly formatted citations that follow the assigned style guide.

      what to do when you quote directly from a source

    1. We also discuss the role of AI in science, including AI safety.

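      "We also discuss the role of AI in science, including AI safety." Appearing in a paper about AI doing scientific research autonomously, this is the most ironic sentence in the whole article. Sakana AI used AI to automatically generate a paper that discusses AI safety, and got it through human review. We have not yet worked out how to stop AI from cheating in scientific publications, and AI is already helping us think about how to stop AI from cheating in science. The self-reference is dizzying.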

    2. we discover a clear scaling law: as the underlying foundation models improve, the quality of the generated papers increases correspondingly.

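      The AI Scientist exhibits a "paper-quality scaling law": the stronger the underlying model, the higher the quality of the generated papers. The implication is chilling: as models such as GPT-5, Claude Opus 4.6, and Gemini 3.1 keep iterating, the quality of the papers the AI Scientist generates will improve automatically, with no additional engineering effort. AI accelerates research, and stronger AI in turn accelerates research on AI itself. This is the first evidence of a positive feedback loop that is backed by empirical data.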

    3. using Claude 3.5 Sonnet for the experimentation phase typically costs around $15–$20 per run.

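      For a scientific paper that passed peer review at an ICLR workshop, the AI experimentation phase cost roughly $15 to $20 per run. By comparison, training one PhD student costs more than $100,000, and publishing a top-conference paper takes months. This gap means that if the technology matures, the production cost of research papers would fall by a factor of thousands. Academic journals, the peer-review system, and the entire business model of academic publishing would all face pressure for fundamental restructuring.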

    4. we had predetermined that we would withdraw the paper prior to publication if accepted, which we did.

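      Withdrawing the paper voluntarily after it passed review: this decision is both reassuring and unsettling. Reassuring, because Sakana AI demonstrated responsible research ethics. Unsettling, because a less scrupulous team could have quietly slipped this AI-generated paper into the formally published academic literature. The peer-review system currently has almost no systematic defense against AI-generated content, which is a collective blind spot for the whole of academia.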

    5. external evaluations of the passing paper also uncovered hallucinations, faked results, and overestimated novelty

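      The paper passed peer review, yet independent evaluation found hallucinations, faked results, and overstated novelty. This detail is extremely important but often overlooked. It exposes a deep systemic vulnerability: AI has learned to "pass review" but has not learned to "do science honestly". To human reviewers these look like the same thing, but in an AI system's optimization objective they may be separate. This is what AI safety looks like concretely in science.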

    6. The AI Scientist-v2 eliminates the reliance on human-authored code templates

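      The most important leap from v1 to v2 is removing the dependence on human-written templates. v1 still needed humans to supply the initial code scaffold; v2 generates code and designs experiments autonomously from scratch. The deeper significance of this difference: v1 is "AI completing a task designed by humans", while v2 is "AI designing the task itself and then completing it". Once that line is crossed, AI's role in research shifts from tool to researcher.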

    7. one manuscript achieved high enough scores to exceed the average human acceptance threshold, marking the first instance of a fully AI-generated paper successfully navigating a peer review.

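      The first paper ever generated entirely autonomously by AI to pass peer review: this milestone is no less significant than AlphaFold folding proteins. Surprisingly, the paper outscored 55% of human-authored submissions (average score 6.33, above the average acceptance threshold for human submissions). For the first time, the centuries-old institution of peer review has been quietly passed through by an AI system.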

    1. Abstract: Strain-level metagenomic classification is essential for understanding microbial diversity and functional potential, but remains challenging, particularly in the absence of prior knowledge about the composition of the sample. In this paper we present MADRe, a modular and scalable pipeline for long-read strain-level metagenomic classification, enhanced with Metagenome Assembly-Driven Database Reduction. MADRe combines long-read metagenome assembly, contig-to-reference mapping reassignment based on an expectation-maximization algorithm for database reduction, and probabilistic read mapping reassignment to achieve sensitive and precise classification. We extensively evaluated MADRe on simulated datasets, mock communities, and a real anaerobic digester sludge metagenome, demonstrating that it consistently outperforms existing tools by achieving higher precision with reduced false positives. MADRe’s design allows users to apply either the database reduction or read classification step individually. Using only the read classification step shows results on par with other tested tools. MADRe is open source and publicly available at https://github.com/lbcb-sci/MADRe.

      This work has been peer reviewed in GigaScience (see https://doi.org/10.1093/gigascience/giag030), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer 2:

      This manuscript presents MADRe, a modular pipeline for strain-level metagenomic classification from long-read data, emphasizing an assembly-driven database reduction strategy coupled with probabilistic reassignment. The work is methodologically sound and well aligned with the scope of GigaScience. However, the study could benefit from the following revisions:

      1, The study's main contribution is engineering and integration, rather than a fundamentally new statistical model. The authors should therefore state this explicitly in the Abstract as well as in the Discussion.

      2, Although the comparisons are reasonable, the manuscript could do more to clarify how MADRe compares against state-of-the-art strain-resolved tools under identical parameter tuning, and whether performance gains are consistent across different strain divergence levels.

      3, When comparing with existing tools, improvements appear primarily in precision, while recall trade-offs are less emphasized. The authors should explicitly discuss precision-recall trade-offs and clarify in which biological scenarios MADRe is most advantageous.

      4, While database reduction is presented as efficient, the computational cost of assembly plus EM iterations is not deeply analyzed. The authors should include a concise runtime/memory comparison or at least a qualitative discussion of computational trade-offs.

      5, The approach implicitly assumes that metagenome assembly is sufficiently accurate and representative. However, in highly complex or low-coverage samples, assembly could be fragmented or biased. The authors should add a clearer discussion on the sensitivity to assembler choice and parameters.

    2. Abstract: Strain-level metagenomic classification is essential for understanding microbial diversity and functional potential, but remains challenging, particularly in the absence of prior knowledge about the composition of the sample. In this paper we present MADRe, a modular and scalable pipeline for long-read strain-level metagenomic classification, enhanced with Metagenome Assembly-Driven Database Reduction. MADRe combines long-read metagenome assembly, contig-to-reference mapping reassignment based on an expectation-maximization algorithm for database reduction, and probabilistic read mapping reassignment to achieve sensitive and precise classification. We extensively evaluated MADRe on simulated datasets, mock communities, and a real anaerobic digester sludge metagenome, demonstrating that it consistently outperforms existing tools by achieving higher precision with reduced false positives. MADRe’s design allows users to apply either the database reduction or read classification step individually. Using only the read classification step shows results on par with other tested tools. MADRe is open source and publicly available at https://github.com/lbcb-sci/MADRe.

      This work has been peer reviewed in GigaScience (see https://doi.org/10.1093/gigascience/giag030), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer 1:

      I have no significant concerns with the MADRe methodology, and the current datasets provide sufficient evidence of its strain-level performance. However, several issues still need to be addressed.

      The response states: "However, we observed a limitation when Centrifuger cannot confidently assign a read to a specific reference sequence (for example, when multiple chromosomes belong to the same strain). In such cases, it often classifies the read under the NCBI strain-level taxid, which in some instances is identical to the species-level taxid. This makes it impossible to directly and fairly compare those classifications with other tools that operate at the sequence level."

      Although I agree this issue may not substantially affect the overall conclusions, the current handling of strain-level evaluation for Centrifuger is not sufficiently rigorous. The underlying problem is that Centrifuger (and Kraken2) rely on nodes.dmp and names.dmp, where the lowest taxonomic rank is often species or subspecies. As a result, these tools cannot report strain-level abundances directly in their standard output. A more appropriate solution would be to assign custom, unique strain-level taxIDs for all reference genomes, allowing proper classification at the strain level. This approach has been discussed in https://github.com/mourisl/centrifuger/issues/18 and https://github.com/jenniferlu717/Bracken/issues/113. Additionally, Centrifuger has an extra program, centrifuger-quant, that uses the EM algorithm to estimate abundance. The read assignment results produced by Centrifuger do not apply the EM algorithm.

      In the similarity experiment, some strains exhibit extremely high similarity, which makes proportional read distribution practically impossible for MADRe. To better characterize the performance limits of MADRe for accurate strain classification and abundance estimation, I recommend including additional simple synthetic mixtures at different combinations of similarity and coverage depth. Because long reads vary widely in length, read counts alone can be misleading. I strongly encourage reporting strain abundances rather than raw read counts, as abundances are more relevant for downstream applications. Finally, the authors should clarify whether MADRe's limitation in detecting low-abundance strains (referring more to low coverage) is entirely determined by the performance of the assembly tool, or whether additional factors influence this limitation.

      In Figure 4, please specify the sequencing technology used for sim_high. "calculated usin fastANI" →"calculated using fastANI".
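
The reviewer's point that raw read counts can mislead when long reads vary widely in length can be illustrated with a small sketch. This is not MADRe's method; it is a generic comparison between the fraction of reads per strain and a depth-based relative abundance (assigned bases divided by genome length, then normalized). The function names, strain names, read lengths, and genome lengths are all hypothetical.

```python
from collections import defaultdict


def read_count_fractions(assignments):
    """Fraction of reads assigned to each strain, ignoring read length."""
    counts = defaultdict(int)
    for strain, _read_length in assignments:
        counts[strain] += 1
    total = sum(counts.values())
    return {strain: n / total for strain, n in counts.items()}


def depth_based_abundance(assignments, genome_lengths):
    """Relative abundance from mean depth (assigned bases / genome length),
    normalized so the values sum to 1."""
    bases = defaultdict(int)
    for strain, read_length in assignments:
        bases[strain] += read_length
    depth = {strain: b / genome_lengths[strain] for strain, b in bases.items()}
    total = sum(depth.values())
    return {strain: d / total for strain, d in depth.items()}


# Hypothetical toy data: (assigned strain, read length in bases).
# strain_A gets two long reads, strain_B gets four short reads.
assignments = [("strain_A", 20_000)] * 2 + [("strain_B", 2_000)] * 4
genome_lengths = {"strain_A": 5_000_000, "strain_B": 5_000_000}  # assumed equal

by_reads = read_count_fractions(assignments)
by_depth = depth_based_abundance(assignments, genome_lengths)
print("by read count:", by_reads)
print("by depth:     ", by_depth)
```

In this toy case strain_B looks dominant by read count, while strain_A dominates once read length is taken into account, which is the kind of discrepancy the reviewer is warning about.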

    1. We evaluate 452 tasks from the public APEX-Agents dataset spanning investment banking, management consulting, and corporate law

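      452 tasks spanning investment banking, management consulting, and corporate law: these three fields epitomize "knowledge-intensive work" worldwide and are among the white-collar professions hardest for AI to replace. APEX-Agents choosing them as a benchmark is itself a statement: AI is ready to challenge the professional work once considered "safest". The fact that the top score is only 33.3% is equally a statement: the challenge has only just begun.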

    2. Cost (USD) to run the evaluation: GPT-5.4 (xhigh): $1,110, Claude Opus 4.6 (max): $1,055

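      One run of the 452-task evaluation cost $1,110 for GPT-5.4 and $1,055 for Claude Opus 4.6, which averages roughly $2.30 to $2.50 per task. Gemini 3 Flash needed only $596 and scored 27.7% (versus 33.3% for the top model). This cost-performance figure is critical for model selection: if a business scenario can accept a success rate of about 27% instead of 33%, Gemini 3 Flash saves nearly half the cost. In large-scale deployments in financial services, that difference would be multiplied thousands of times over.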
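
The cost comparison in this annotation can be checked with a few lines of arithmetic. The sketch below uses only the totals and Pass@1 scores quoted in these annotations (452 tasks; $1,110 at 33.3%, $1,055 at 33.0%, $596 at 27.7%). The "cost per solved task" metric and the function name `cost_summary` are illustrative additions, not something the benchmark itself reports.

```python
TASKS = 452  # number of tasks in the evaluation, as quoted above

# model name -> (total evaluation cost in USD, Pass@1 in percent)
runs = {
    "GPT-5.4 (xhigh)": (1110, 33.3),
    "Claude Opus 4.6 (max)": (1055, 33.0),
    "Gemini 3 Flash": (596, 27.7),
}


def cost_summary(total_usd, pass_rate_pct, tasks=TASKS):
    """Return (cost per attempted task, cost per solved task) in USD."""
    per_task = total_usd / tasks
    solved = tasks * pass_rate_pct / 100  # expected number of solved tasks
    per_solved = total_usd / solved
    return per_task, per_solved


summary = {name: cost_summary(usd, rate) for name, (usd, rate) in runs.items()}
for name, (per_task, per_solved) in summary.items():
    print(f"{name}: ${per_task:.2f} per task, ${per_solved:.2f} per solved task")

# Relative saving of the cheapest run against the most expensive one.
saving = 1 - runs["Gemini 3 Flash"][0] / runs["GPT-5.4 (xhigh)"][0]
print(f"Gemini 3 Flash saves {saving:.0%} versus GPT-5.4 (xhigh)")
```

Dividing by solved tasks rather than attempted tasks is one way to fold the accuracy gap into the cost comparison; whether that is the right metric depends on how costly a failed task is in the deployment at hand.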

    3. Gemini 3 Flash achieves the highest score of 24.0%

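      In the original paper, Gemini 3 Flash ranked first with 24.0%; in the independent rerun by Artificial Analysis, it scored 27.7% and was overtaken by GPT-5.4 and Claude Opus. Two tests run at different times with different methodologies produced different rankings. This exposes a fundamental fragility in AI agent evaluation: the same benchmark yields different conclusions in the hands of different implementers. In AI evaluation, "who is first" is a fluid answer that shifts with time and methodology.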

    4. GPT-5.4 (xhigh) scores the highest on APEX-Agents-AA Pass@1 with a score of 33.3%, followed by Claude Opus 4.6 (Adaptive Reasoning, Max Effort) with a score of 33.0%, and Gemini 3.1 Pro Preview with a score of 32.0%

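      A striking number: even the world's strongest AI agents succeed on only about one third of professional tasks from investment banking, consulting, and law. More surprising still, the top three are nearly tied (GPT-5.4 at 33.3%, Claude Opus 4.6 at 33.0%, Gemini 3.1 Pro at 32.0%): the gap between the three leading labs on this professional-services agent evaluation has narrowed to the level of statistical noise. On this dimension, the question of "whose AI is stronger" no longer has a clear answer.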

    1. an agent does not care about the structure, unless you specifically ask it to. But even in this case you have to review the changes.

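      [Insight] "AI does not naturally care about structure unless you explicitly ask it to": this finding defines the least replaceable duty of human engineers in the AI era, which is to act as the "gatekeeper" of code structure. It echoes the insight from the Every article that "everyone is a manager": human work shifts from "executing code" to "reviewing code quality and setting standards for AI". The implication for engineering team culture is that code review is becoming more important, not less, because the volume of code that needs review is now ten times what it used to be.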

    2. LLMs are pretty good at picking up the style in your repo. So keeping it clean and organized already helps.

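      [Insight] "A clean codebase teaches AI to imitate its style": this is the starting point of a virtuous cycle. Good code → AI learns good style → AI generates better code → the codebase gets cleaner. The reverse also holds: bad code → AI learns bad style → more and more bad code. So the initial quality of a codebase is amplified by AI: the good gets better and the bad gets worse. In the AI era, the "interest" on technical debt compounds at a higher rate.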

    3. When you give a task to your agent, make sure you also explain how the code should be organized. Not only value, but also structure.

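      [Insight] This practical advice exposes a widely overlooked prompt blind spot: when most people give AI a programming task, they describe only "what to do" and never "how to organize it". That is like telling a new hire to "implement this feature" without ever telling them "what our coding conventions are". For anyone doing vibe coding, this advice should become part of the standard operating procedure: proactively include structural constraints in every task prompt.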

    4. Robert Martin in Clean Architecture talks about code as having two properties: value (it works, it's fast, etc.) and structure (how code is organised).

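      [Insight] Bringing Robert Martin's "value vs. structure" framework into the age of AI agents is a very clever theoretical graft. AI naturally cares only about "value" (the code runs, the task gets done) and tends to neglect "structure" (whether the code is clean and maintainable). So in AI-driven development workflows, "guarding structure" must become a core responsibility of human engineers: it is work AI will not do spontaneously, and that is exactly where irreplaceable human value lies.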

    5. poorly organized code means agents need to read, "understand", and make changes to more files than necessary - polluting their context and costing you tokens.

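      [Insight] Technical debt has gone from "slowly eroding maintainability" to "immediately hurting your bill". This is a new dimension for quantifying technical debt: it no longer has to be measured only in "future engineering hours", but can be computed in real time as "token overrun per AI call". It also provides a new, financially quantifiable argument for "persuading management to take code quality seriously": bad code is not just a technical problem, it directly incurs extra cost every time AI executes a task.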
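
The idea of measuring technical debt as token overrun per call can be sketched roughly as follows. Everything here is an assumption for illustration: the 4-characters-per-token heuristic, the file contents, and the price per million tokens are hypothetical, and real tokenizers and pricing will differ.

```python
CHARS_PER_TOKEN = 4  # rough heuristic (assumption); real tokenizers vary


def estimate_tokens(text):
    """Very rough token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN


def context_overhead(files_read, files_needed, usd_per_million_tokens):
    """Estimate tokens (and cost) spent on files the agent read but did not need.

    files_read:   dict mapping file name -> file contents the agent loaded
    files_needed: set of file names that were actually relevant to the change
    """
    total = sum(estimate_tokens(src) for src in files_read.values())
    required = sum(
        estimate_tokens(src) for name, src in files_read.items() if name in files_needed
    )
    extra_tokens = total - required
    extra_usd = extra_tokens * usd_per_million_tokens / 1_000_000
    return extra_tokens, extra_usd


# Hypothetical example: logic is scattered, so the agent reads three files
# even though only one of them is relevant to the task.
files_read = {
    "orders.py": "x" * 4_000,
    "utils.py": "y" * 8_000,
    "legacy_helpers.py": "z" * 400,
}
extra_tokens, extra_usd = context_overhead(
    files_read, files_needed={"orders.py"}, usd_per_million_tokens=3.0  # made-up price
)
print(f"extra tokens per call: {extra_tokens}, extra cost per call: ${extra_usd:.4f}")
```

The per-call figure is tiny in this toy case; the annotator's argument is about how it accumulates over many agent runs on a poorly organized codebase.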

    6. Context is basically how many things a machine can keep in its operational memory - it's not so different from the very human cognitive load.

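      [Insight] "Context window = cognitive load": this analogy is the most insightful line in the whole article. It seamlessly connects a technical concept (the context window) with a human experience (cognitive fatigue). The takeaway is that every coding practice that reduces cognitive load for humans (modularity, clear naming, single responsibility) now also reduces token consumption for AI. "Human-friendly code = AI-friendly code" is an equation that holds more thoroughly than we imagined.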

    7. their productivity is affected by the state of the codebase.

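      [Insight] The deeper significance of this sentence is that it places AI coding agents and human developers on the same evaluation dimension. The question is not "can AI replace people" but "is AI affected by code quality in the same way humans are". The answer is yes, which means the code-quality practices that software engineers have accumulated over decades are not made obsolete by the arrival of AI; they become more important because of it. Technical debt has gone from "slowly affecting people" to "immediately affecting AI's token consumption".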