  1. Dec 2023
    1. Author Response

      The following is the authors’ response to the original reviews.

      We thank the reviewers and editors for their constructive comments on the manuscript. We have extensively revised the manuscript based on these concerns and comments. Our specific responses follow.

      Public Reviews:

      Reviewer #1 (Public Review):

      In the manuscript "Long‐read single‐cell sequencing reveals expressions of hypermutation clusters of isoforms in human liver cancer cells", S. Liu et al present a protocol combining 10x Genomics single-cell assay with Element LoopSeq synthetic long-read sequencing to study single nucleotide variants (SNVs) and gene fusions in Hepatocellular carcinoma (HCC) at single‐cell level. The authors were the first to combine LoopSeq synthetic long‐read sequencing technology and 10x Genomics barcoding for single cell sequencing. For each cell and each somatic mutation, they obtain fractions of mutated transcripts per gene and per each transcript isoform. The manuscript states that these values (as well as gene fusion information) provide better features for tumor-normal classification than gene expression levels. The authors identified many SNVs in genes of the human major histocompatibility complex (HLA) with up to 25 SNVs in the same molecule of HLA‐DQB1 transcript. The analysis shows that most mutations occur in HLA genes and suggests evolution pathways that led to these hypermutation clusters. Yet, very little is said about novel isoforms and alternative splicing in HCC cells, differences in isoform ratio between cells carrying different mutations, or diversity of alternative isoforms across cells. While the manuscript by Liu et al. presents a promising combination of technologies, it lacks significant insights, a comprehensive introduction, and has significant problems with data description and presentation.

      Answer: Thank you for the valuable suggestion. Our long-read single-cell sequencing discovered an average of 442 novel isoform transcripts per benign liver cell and 450 novel isoform transcripts per HCC cell, based on SQANTI v1.2 analysis. These numbers are stated in the revised manuscript. Alternative splicing was detected through differential isoform expression, as demonstrated in Supplemental Figures 6 and 7 and Supplemental Tables 8-11. Examples of differences in isoform ratio between cells carrying different mutations are now shown for DOCK8 and STEAP4 (Figure 5 in the revised manuscript). A new section was added to the Results to discuss the mutation expression of these two genes. The diversity of isoforms of the selected genes is shown in Supplemental Figure 10.

      This study showed how mutations in the same allele evolved in liver cancer. In particular, HLA hypermutations were found to develop from some specific sites of the molecules into large clusters of mutations in the same molecules. A new paragraph of introduction was added about the role of mutations in human cancer development. We also revised the figures to present the information better. All the HLA genes expressed only one known isoform, as shown in Figure 4 and Supplemental Figure 3, regardless of mutations.

      Major comments:

      1. The introduction section is scarce. It lacks description of important previous works focused on clustered mutations in cancers (for example, PMID35140399), on deriving the process of cancer development through somatic evolution (PMID32025013, from single cell data PMID32807900). Moreover, some key concepts e.g. mutational gene expression and mutational isoform expression are not defined. The introduction and the abstract contain slang expressions e.g. "protein mutation', a combination of terms I teach my students not to use.

      Answer: We thank the reviewer for suggesting a more solid background introduction and clearer term definitions. We added a new paragraph to the introduction section describing the role of mutations and hypermutations in human cancers, and cited some important prior work. We added a new section to the "Methods" to define "mutation gene expression share" and "mutation isoform expression share". "Protein mutation" has been replaced by "genetic mutation".

      1. In the results section, to select the mutations of interest, the authors apply UMAP dimensionality reduction to the mutation isoforms expression and cluster samples in UMAP space, then select the mutations that are present only in one cluster, then apply UMAP to the selected mutations only and cluster the samples again. The motivation for such a procedure seems unclear, could it be replaced with a more straightforward feature selection?

      Answer: Thank you for raising this important question. The goal of the analysis is an unbiased classification of the cell populations in the samples. We found that, after removing mutated isoform expressions that were at similar levels across all cells, UMAP clustering clearly segregated the cells into three populations. When the unique mutated isoform expressions from each group were applied, the clustering generated 8 highly distinct groups of cells, each with a distinct mutation isoform expression pattern. Forcing prior knowledge into the analysis may introduce unwanted bias. Specifically, the first UMAP was performed in an unbiased way to cluster cells, while the second step is a supervised approach that selects the unique mutations in each cluster to identify the classifiers. The second UMAP matches the Benign/HCC labeling well.

      1. As I understand, the first "mutated isoform"-based UMAP clustering was built from expression levels of 205 "mutational isoforms". What was the purpose and outcome of the second "mutated isoform"-based UMAP clustering (Figure 2E)? In the manuscript the authors just describe the clusters and do not draw any conclusions or use the results of the clustering anywhere further.

      Answer: Thanks for pointing this out. Figure 2E was generated from unique mutation isoform expressions in groups A, B, and C from Figure 2D. The purpose of Figure 2E is to investigate whether these unique mutation isoforms can further classify the cell populations free of prior biological knowledge. We added a sentence in the revision to clarify the purpose of the clustering. The conclusion from this analysis, including Figure 2F and Figure 3 (which is an extension of Figure 2E), is that HLA mutation isoform expressions dominated the classifications of cell populations.

      1. The authors just cluster the data three times based on expression levels of different sets of "mutational isoforms" and describe the clusters. What do we need to gather from these clustering attempts besides the set of 113 mutations used for further analysis? What was the point of the reclusterings? Did the authors observe improvement of the classification at each step?

      Answer: Thanks for asking this important question. The improvement of re-clustering to classify cell populations is the obvious segregation of 8 different groups of cells without any manual classification through prior knowledge. The distances among groups were far apart in comparison to the first clustering (figure 2B). Detailed subclassifications were achieved on cell populations that otherwise could not be segregated based on the first clustering.

      1. The alignment of short reads generated from hypermutated transcriptomes is non-trivial. The proposed approach could address the issue without the need for whole genome sequencing and offer insights about the cancer development through somatic evolution. Why didn't the authors use modern phylogenetic approaches in the "Evolution of mutations in HLA molecules" section or at least utilize the already performed clustering to infer cell lineages?

      Answer: We appreciate the great question. For the mutation evolution of a single molecule, single-gene clustering may not produce a desirable and robust result. The simple evolution snowball chart in Figure 4B may be easier to understand.

      1. I am not sure I understood the definition of "mutated gene expression levels" and "mutated isoform expression levels" in the "Mutational gene expression and fusion transcript enhanced transcriptome clustering of benign hepatocytes and HCC" section. The authors mention that gene lists included all the isoforms within the same range of standard deviation. If I understand it correctly, they are equal if there is only one expressed transcript isoform. In that case, this overlap is not surprising at all.

      Answer: We thank the reviewer for the great question. Mutation gene expression level, mutation isoform expression level, and fusion gene expression level are now defined in the "Methods" section. For all HLA mutation transcripts, there were multiple transcripts, with or without mutations, of a single dominant isoform.

      1. "To investigate the roles of gene expression alterations that were not accompanied with isoform expression changes, UMAP analyses were performed based on the non‐overlapped genes." Venn diagrams (Sup Figure 8) show that there are much less "non-overlapped genes" than "genes that showed both gene and isoform level changes" for each SD threshold (for example, for SD>=0.8 59 vs 275). Could that be the reason why clustering based on the former group is worse i.e the cancer and normal cells are separated less clearly?

      Answer: The number of attributes (genes) could be a contributing factor in the segregation of cell populations. However, it is not the underlying reason for the worse performance of the gene-only classifier, because a much smaller set of overlapping isoforms/genes (22) at SD>=1 outperformed a larger set of genes (59) at SD>=0.8. This suggests that the 59-gene expression classifier is less efficient in segregating the cell populations. To address this concern, we took SD>=0.8 as an example: when we subsampled the 275 overlapping genes/isoforms to 59 (equal in number to the 59 non-overlapped genes), we still obtained better separation than with the 59 DEGs alone. We repeated this subsampling process three times and found similar results. The new data were inserted into Supplemental Figure 8.

      Reviewer #2 (Public Review):

      In the present study, Liu et al present an analysis of benign and HCC liver samples which were subjected to a new technology (LOOP-Seq) and paired WES. By integrating these data, the authors find isoforms, fusions and mutations which uniquely cluster within HCC samples, such as in the HLA locus, which serve as candidate leads for further investigation. The main appeal of the study is in the potential of LOOPSeq as a method to present isoform-resolved data without actually performing long-read sequencing. While this presents an exciting new method, the current study lacks systematic comparisons with other technologies/data to test the robustness, reproducibility and utility of LOOPSeq. Further, this study could be further improved by giving more physiologic context and examples from the analyses, thus providing a new resource to the HCC community. A few suggestions based on these are below:

      Answer: We thank the reviewer for raising these important questions and for the helpful suggestions. The LOOPseq technology was compared with Oxford Nanopore and PacBio long-read sequencing in our previous study. We have cited that analysis in the introduction section of the paper. The HLA mutation clusters in single molecules are our finding of major physiological significance, since these mutations may help liver cancer cells evade immune surveillance. We have extensively discussed the potential impact of these mutations on cancer development in the discussion. In addition, we added a new section on DOCK8 and STEAP4 mutation expressions to the results (page 11, new Figure 5); these are highly relevant to the pathogenesis of HCC.

      1. A primary consideration is that this seems to be the first implementation of LOOP-Seq, where the technology, while intriguing, has not been evaluated systematically. It seems like a standard 10x workflow is performed, where exons are selectively pulled down and amplified. Subsequent ultra-deep sequencing is assumed to give isoform-resolution of the sc-seq data. To demonstrate the utility of the approach it would benefit the study to compare the isoform-resolved results with studies where long-read sequencing was actually performed (ex: https://journals.lww.com/hep/Fulltext/2019/09000/Long_Read_RNA_Sequencing_Identifies_Alternative.19.aspx, https://www.jhep-reports.eu/article/S2589-5559(22)00021-0/fulltext, https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1010342). Presumably, a fair amount of overlap should occur to justify the usage.

      Answer: We have discussed the utility of the methodology in comparison with the previous studies by these three groups in the revision (results, page 12).

      1. Related to this point, the sc-seq cell types and benign vs HCC genes should be compared with the wealth of data available for HCC sc-seq (https://www.nature.com/articles/s41467-022-32283-3, https://www.nature.com/articles/s41598-021-84693-w). These seem to be important to benchmark the technology in order to demonstrate that the probe-based selection and subsequent amplification does not bias cell type definition and clustering. In particular, https://www.nature.com/articles/s41586-021-03974-6 seems quite relevant to compare mutational landscapes from the data.

      Answer: This is a great point. The consistency of probe-based analysis was demonstrated in our previous analyses and in the analyses mentioned in the comments. We discussed it further in the results section of the paper (page 12).

      1. From the initial UMAP clustering, it will be important to know what the identities are of the cells themselves. Presumably, there is quite a bit of immune cells and hepatocytes, but without giving identities, downstream mechanistic interpretation is difficult.

      Answer: When mutation analyses were combined with cell marker analysis, i.e., immune marker positive but negative in HLA mutation, we found only one bona fide immune cell in the HCC sample. Thus, immune cells may not be significant in the current analysis.

      1. In general, there are a fair amount of broad analyses, such as comparisons of hierarchical clustering of cell types, but very little physiologic interpretations of what these results mean. For example, among the cell clusters from Fig 6, knowing the pathways and cell annotations would help to contextualize these results. Without more biologically-meaningful aspects to highlight, most of the current appeal for the manuscript is dependent on the robustness of LOOP-seq and its implementation.

      Answer: To address this comment, a new pathway analysis was performed on the cluster results of Figure 6. A new supplemental table was generated. The results are now discussed on page 13.

      1. Many of the specific analyses are difficult and the methods are brief. Especially given that this technology is new and the dataset potentially useful, I would strongly recommend the authors set up a git repository, galaxy notebook or similar to maximize utility and reproducibility

      Answer: The script file has been uploaded to GIT to facilitate the reproducibility of the analysis. We also added a new pipeline description script in the methods (pages 19-20).

      1. The authors claim that clustering between benign and HCC samples was improved by including isoform & gene (Suppl fig 8). This seems like an important conclusion if true, especially to justify the use of long-read implementation. Given that the combination of isoform + gene presents ~double the number of variables on which to cluster, it would be important to show that the improved separation on UMAP distance is actually due to the isoforms themselves and not just sampling more variables from either gene or isoform.

      Answer: The number of attributes (genes) could be a contributing factor in the segregation of cell populations. However, it is not the underlying reason for the worse performance of the gene-only classifier, because a much smaller set of overlapping isoforms/genes (22) at SD>=1 outperformed a larger set of genes (58) at SD>=0.8. This suggests that the 58-gene expression classifier is less efficient in segregating the cell populations. To address this comment, we performed random subsampling to reduce the number of overlapping isoforms/genes, and similar results were obtained. A new supplemental figure was generated to reflect the new analyses.

      1. SQANTI implementation to identify fusions relevant for the HCC/benign comparison. How do the fusions compare with those already identified for HCC? These analyses can be quite messy when performed on WES alone so it seems that having such deep RNA-seq would improve the capacity to see which fused genes are strongly expressed/suppressed. This doesn't seem as evident from current analysis. There are quite a bit of WES datasets which could be compared: https://www.nature.com/articles/ng.3252, https://www.nature.com/articles/s41467-018-03276-y

      Answer: Exome sequencing is not an ideal tool to identify fusion genes. Very few fusion genes have been discovered based on RNA sequencing so far. The fusion genes discovered in the study appeared mostly novel. No exome sequencing was involved in the identification of fusion genes.

      1. Figure 4 is fairly unclear. The matrix graphs showing gene position mutations are tough to interpret and make out. Usually, gene track views with bars or lollipop graphs can make these results more readily interpretable. Also, how Figure 4 B infers causal directions from mutations is unclear.

      Answer: We thank the reviewer for pointing this out. We have revised the diagram in Figure 4A to reflect the proper distance between the mutations in HLA-DQB1 NM_002123. Since these are positions in the same alleles (protein), a gene track view or lollipop graph may not show that properly. The mutation clusters started from an isolated mutation, and a mutation did not revert to the wild-type sequence after occurring. Based on these two principles, we showed several mutation accumulation pathways leading to hypermutation clusters.

      Reviewer #3 (Public Review):

      The Liu, et al. manuscript focuses on the interesting topic of evaluating, on an almost genome-wide scale, the number of transcriptional isoforms and fusion genes present in single cells across the annotated protein coding genome. They also seek to determine the occurrences of single nucleotide variations/mutations (SNV) in the same isoform molecule emanating from the same gene expressed in normal and hepatocellular carcinoma (HCC) cells. This study has been accomplished using modified LoopSeq long‐read technology (developed by several of the authors) and single cell isolation (10X) technologies. While this effort addresses a timely and important biological question, the reader encounters several issues in their report that are problematic:

      1. Much of the analysis of the evolution of mutations results and the biological effects of the fusion genes is conjecture and is not supported by empirical data. While their conclusions leave the reader with a sense that the results obtained from the LoopSeq has substantive biological implications. However, they are extended interpretations of the data. For example: The fusion protein likely functions as a decoy interference protein that negatively impacts the microtubule organization activity of EML4.(pg 9)... and other statements presented in a similar fashion.

      Answer: We thank the reviewer for the helpful comment. The mutation results were experimentally validated by exome sequencing on the same samples. Furthermore, these mutations were filtered by requiring their presence in three different transcriptomes. The biological significance of these mutations is probably the subject of investigation in the next phase. Since a large number of HLA mutations did not occur overnight, the analysis of the accumulation pathways for these mutations was warranted, given the extensive evidence of such a process. The impact of mutations on HLA molecules appeared obvious and should be discussed. For ACTR2-EML4 fusion, we revised it as "The loss of microtubule binding domain may negatively impact the microtubule organization activity of EML4 domain of the fusion protein." We only discussed the obvious impact due to the loss of a large protein domain.

      2. LoopSeq has the advantage of using short read sequencing analyses to characterize the exome capture results and thus benefits from low error rate compared to standard long-read sequencing techniques. However, there is no evidence obtained from standard long read sequencing that the isoforms observed with LoopSeq are obtained with parallel technologies such as long read technologies. It is not made clear how much discordance there is between the LoopSeq results and either PacBio or ONT long read technologies.

      Answer: The comparative analyses among LOOPSeq, Oxford nanopore, and PacBio sequencing were performed in our previous study. We have cited the study in our introduction.

      1. There is no proteome evidence (empirically derived or present in proteome databases) from the HCC and normal samples that confirms the presence or importance of the identified novel isoforms, nor is there support that indicate that changes in levels HLA genes translate to effects observed at the protein level. Since the stability and transport differences of isoforms from the same gene are often regulated at the post-transcriptional level, the biological importance of the isoform variations is unclear.

      Answer: Given the transcriptome sequencing data, we can only focus on the isoform variation analysis but not directly link to the protein level variation because of the post-transcriptional level regulation. We discussed this in the revised manuscript (page 14).

      4. It is unclear why certain thresholds were chosen for standard deviation (SD) <0.4 (page 5), SD >1.0 (pg 11).

      Answer: The threshold is flexible and arbitrary. We showed different thresholds, and the same conclusion holds. We chose the thresholds that gave better separation and a reasonable number of genes/isoforms for the downstream analysis (Supplemental Figures 6-7 with different thresholds and Supplemental Tables 4-12).

      1. HLA is known to accumulate considerable somatic variation. Of the many non-immunological genes determined to have multiple isoforms what are the isoform specific mutation rates in the same isoform molecule? Are the HLA genes unique in the number of mutations occurring in the same isoform?

      Answer: We thank the reviewer for this important suggestion. We now show mutation expression patterns in isoforms of DOCK8 and STEAP4 in Figure 5. A new section was added to discuss the mutation expression of these two genes. As shown in Supplemental Figure 10, HLA-DQB1, HLA-DRB1, HLA-B, and HLA-C have only one known isoform detected.

      Editorial comments:

      The present study pairs single-cell seq with LoopSeq synthetic long-read sequencing on samples of HCC and benign liver to identify mutations and fusion transcripts specific to cancer cells. The authors present a potentially important resource; however the overall support remains incomplete.

      While the approach of evaluating isoform-specific changes at the cellular level to cancer seeks to address a timely and important topic, there is currently incomplete evidence in support of the major claims in the manuscript. In particular, major recommendations to provide stronger support for the combination of technologies and interpretation regarding cancer-associated genomic changes include: 1) systematic evaluation of UMAP-based clustering methods, to what subsets of data they are applied and subsequent interpretations, 2) direct comparisons of results with additional methods to quantify long-read sequencing data and those evaluating mutational consequences of HCC progression and 3) detailed expansion of the description of methods and rationale for selecting specific parameters and cell types for further analyses. Including these changes would significantly strengthen the support for utility of combining 10x single-cell with Loop-seq and provide compelling evidence for usage of this resource in dissecting HCC-associated molecular changes.

      Answer: We appreciate the frank and constructive comments. The goal of UMAP is to obtain biological knowledge through unbiased data selection. Systematically, we select classifiers without any prior knowledge (blind to the samples). In our case, classifiers with high standard deviation across all the cells were chosen. We stressed this in the result section. The comparison among LOOPSeq, PacBio, and Oxford nanopore was made in our previous study. We cited that analysis in this paper. Analysis detail and pipelines were added in the revised manuscript to improve the reproducibility. The mutation expression analysis was quite clear-cut. The clustering classified the HCC and benign liver cells by itself and identified a few cancer cells in the benign liver sample. All these were accomplished without applying any knowledge.

      Reviewer #1 (Recommendations For The Authors):

      Overall, there are numerous problems with data presentation and insufficient description, which authors could fix.

      1. Figure 4. A. It would be more clear if the figure showed the distribution of mutations in the molecule. Otherwise, it's hard to see if we see clusters of mutations or just 25 mutations spread uniformly across the transcript. B. It's unclear what the reader needs to take away from these columns of numbers.

      Answer: The mutation positions are now presented in proportion to their location in the molecule. Column B shows the distribution of the mutated molecules from the left panel across each cluster of cells (from Figure 3A) and their sample origin (HCC or benign liver). We clarified this further in the legend of Figure 4A.

      1. As a reader, I did not understand how "mutated gene expression levels" and "mutated isoform expression levels" were calculated in terms of sequenced long reads

      Answer: We defined the term and calculations in the methods section of the revised manuscript.

      1. Page 6 "genes involving antigen presentation"

      Answer: The full sentence of the subtitle is "Mutations of genes involving antigen presentation dominated the mutation expression landscape."

      1. Page 6 "These unique mutational isoforms" - how are these isoforms unique?

      Answer: We removed most of the "unique" adjectives used to describe the non-redundant mutations.

      1. Page 6. Unclear "All but one clusters contained cells co‐migrated with cells of their sources."

      "Among 113 mutation isoforms, the major histocompatibility complex (HLA) was the most prominent with 68 iterations (60.2%) (Supplemental Table 3, Figure 3B)" There is nothing about HLA in Figure 3B.

      Answer: We revised the sentence as "Cells in all but one clusters co-migrated with cells of their sources". The mutation isoform expressions were listed in supplemental Table 3. They are too small and become unreadable when put in the figure.

      1. Page 10 "genes or isoforms that across all samples had with expression standard deviations less than" - probably "with" should not be there.

      Answer: We corrected the error and thank the reviewer for the comment.

      1. Page 11 "UMAP analysis was performed using genes with standard deviations {greater than or equal to} 1.0 (182 wild‐type genes) and standard deviations >0.4 (282 mutated genes)". What do "wild-type" and "mutated" mean here?

      Answer: We edited the sentence to read "UMAP analysis was performed using gene expressions with standard deviations ≥ 1.0 (182 non-mutated genes) and gene mutation expression with standard deviations > 0.4 (282 mutated genes)."

      1. I could not find the description of Supplementary Tables.

      Answer: The supplemental table legends are added in the revised manuscript.

      1. In the Discussion section, the authors mention that mutations were mainly expressed in a specific isoform of a gene for a given cell. I suggest to emphasize this point in the Results section and illustrate it with a comparison of abundance of mutated and non-mutated isoforms

      Answer: For HLA molecules, their expression appeared to be restricted to one known isoform, regardless of mutation status. This sentence is removed in the revision. A new section of DOCK8 and STEAP4 mutation expression is added to the result.

      1. It is also mentioned that mutations may have an impact on the RNA splicing process. The authors should compare the observed isoform ratio to a prediction of the effect of variants on splicing by SpliceAI or similar tools

      Answer: This sentence was removed from the discussion.

      1. Figure 3c: triangles corresponding to HLA-positive cells are hard to distinguish

      Answer: We provide a larger representation of the triangle and circle in figure 3c in the revision.

      Reviewer #2 (Recommendations For The Authors):

      Many of my comments could be addressed by spending time to provide the code/data and a walkthrough of analyses so that other users would be able to answer these questions on their own.

      Answer: We have included a script section in the revision to ensure the reproducibility of the analysis. The raw data had been uploaded to GEO (see Methods).

    1. Running the code in a subprocess is much slower than running a thread, not because the computation is slower, but because of the overhead of copying and (de)serializing the data. So how do you avoid this overhead?

      Reducing the performance hit of copying data between processes:

      Option #1: Just use threads

      Processes have overhead, threads do not. And while it’s true that generic Python code won’t parallelize well when using multiple threads, that’s not necessarily true for your Python code. For example, NumPy releases the GIL for many of its operations, which means you can use multiple CPU cores even with threads.

      ```python
      # numpy_gil.py
      import numpy as np
      from time import time
      from multiprocessing.pool import ThreadPool

      arr = np.ones((1024, 1024, 1024))

      # Baseline: sum the array 10 times, one after another.
      start = time()
      for i in range(10):
          arr.sum()
      print("Sequential:", time() - start)

      expected = arr.sum()

      # Same work, spread across a pool of 4 threads.
      start = time()
      with ThreadPool(4) as pool:
          result = pool.map(np.sum, [arr] * 10)
          assert result == [expected] * 10
      print("4 threads:", time() - start)
      ```

      When run, we see that NumPy uses multiple cores just fine when using threads, at least for this operation:

      $ python numpy_gil.py
      Sequential: 4.253053188323975
      4 threads: 1.3854241371154785

      Pandas is built on NumPy, so many numeric operations will likely release the GIL as well. However, anything involving strings, or Python objects in general, will not. So another approach is to use a library like Polars, which is designed from the ground up for parallelism, to the point where you don’t have to think about it at all: it has an internal thread pool.
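The same pattern applies beyond NumPy. As a minimal sketch, assuming the work is hashing large byte strings: the standard library's `hashlib` releases the GIL while hashing buffers larger than about 2 KB, so a plain thread pool can spread the work across cores. The names `digest` and `digest_all` are illustrative and do not come from the original article.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def digest(chunk: bytes) -> str:
    """Hash one chunk; hashlib drops the GIL for large inputs."""
    return hashlib.sha256(chunk).hexdigest()


def digest_all(chunks, workers=4):
    """Hash every chunk using a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(digest, chunks))
```

No data is copied or pickled here: every thread reads the same `bytes` objects directly. Whether this actually runs faster than a sequential loop depends on chunk size and core count, so it is worth timing on your own data.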

      Option #2: Live with it

      If you’re stuck with using processes, you might just decide to live with the overhead of pickling. In particular, if you minimize how much data gets passed back and forth between processes, and the computation in each process is significant enough, the cost of copying and serializing data might not significantly impact your program’s runtime. Spending a few seconds on pickling doesn’t really matter if your subsequent computation takes 10 minutes.
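One way to judge whether the overhead is worth living with is to time a pickle round trip on representative data and compare it with the computation time. The helper below is a rough sketch: `pickling_overhead` is a hypothetical name, and it measures only serialization and deserialization, not the time spent moving the bytes between processes.

```python
import pickle
from time import perf_counter


def pickling_overhead(obj):
    """Time a pickle round trip for obj.

    Returns (seconds, payload_size_in_bytes, restored_object).
    """
    start = perf_counter()
    payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    restored = pickle.loads(payload)
    elapsed = perf_counter() - start
    return elapsed, len(payload), restored
```

If the measured time is a small fraction of what each worker spends computing, the simpler process-based design is probably fine as it is.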

      Option #3: Write the data to disk

      Instead of passing data directly, you can write the data to disk, and then pass the path to this file:

      * to the subprocess (as an argument)
      * to the parent process (as the return value of the function running in the worker process).

      The recipient process can then parse the file.

      ```
      import pandas as pd
      import multiprocessing as mp
      from pathlib import Path
      from tempfile import mkdtemp
      from time import time

      def noop(df: pd.DataFrame):
          # real code would process the dataframe here
          pass

      def noop_from_path(path: Path):
          df = pd.read_parquet(path, engine="fastparquet")
          # real code would process the dataframe here
          pass

      def main():
          df = pd.DataFrame({"column": list(range(10_000_000))})

          with mp.get_context("spawn").Pool(1) as pool:
              # Pass the DataFrame to the worker process
              # directly, via pickling:
              start = time()
              pool.apply(noop, (df,))
              print("Pickling-based:", time() - start)

              # Write the DataFrame to a file, pass the path to
              # the file to the worker process:
              start = time()
              path = Path(mkdtemp()) / "temp.parquet"
              df.to_parquet(
                  path,
                  engine="fastparquet",
                  # Run faster by skipping compression:
                  compression="uncompressed",
              )
              pool.apply(noop_from_path, (path,))
              print("Parquet-based:", time() - start)

      if __name__ == "__main__":
          main()
      ```

      Option #4: multiprocessing.shared_memory

      Because processes sometimes do want to share memory, operating systems typically provide facilities for explicitly creating shared memory between processes. Python wraps these facilities in the multiprocessing.shared_memory module.

      However, unlike threads, where the same memory address space allows trivially sharing Python objects, in this case you’re mostly limited to sharing arrays. And as we’ve seen, NumPy releases the GIL for expensive operations, which means you can just use threads, which is much simpler. Still, in case you ever need it, it’s worth knowing this module exists.

      Note: The module also includes ShareableList, which is a bit like a Python list but limited to int, float, bool, small str and bytes, and None. But this doesn’t help you cheaply share an arbitrary Python object.
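As a minimal sketch of how the module can be used, assuming a fixed layout of four doubles: the parent creates a named block, and the worker attaches to it by name and modifies it in place, so only the short name crosses the process boundary. The `FMT` layout and the helper names are hypothetical. With NumPy you would typically wrap `shm.buf` in an array via `np.ndarray(shape, dtype=..., buffer=shm.buf)` instead of using `struct`.

```python
import struct
from multiprocessing import get_context, shared_memory

FMT = "4d"  # hypothetical layout for this sketch: four C doubles


def write_values(name: str, values) -> None:
    """Attach to the named block and overwrite its four doubles."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        struct.pack_into(FMT, shm.buf, 0, *values)
    finally:
        shm.close()


def read_values(name: str) -> tuple:
    """Attach to the named block and read its four doubles."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        return struct.unpack_from(FMT, shm.buf, 0)
    finally:
        shm.close()


def double_in_place(name: str) -> None:
    """Worker: only the block's name is pickled, not the data itself."""
    write_values(name, [2 * v for v in read_values(name)])


if __name__ == "__main__":
    block = shared_memory.SharedMemory(create=True, size=struct.calcsize(FMT))
    try:
        write_values(block.name, [1.0, 2.0, 3.0, 4.0])
        worker = get_context("spawn").Process(
            target=double_in_place, args=(block.name,)
        )
        worker.start()
        worker.join()
        print(read_values(block.name))
    finally:
        block.close()
        block.unlink()  # the creator is responsible for freeing the block
```

Every process that attaches should call `close()`, and exactly one process should call `unlink()` once the block is no longer needed.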

      A bad option for Linux: the "fork" context

      You may have noticed we did multiprocessing.get_context("spawn").Pool() to create a process pool. This is because Python has multiple implementations of multiprocessing on some OSes. "spawn" is the only option on Windows, the only non-broken option on macOS, and available on Linux. When using "spawn", a completely new process is created, so you always have to copy data across.

      On Linux, the default is "fork": the new child process has a complete copy of the memory of the parent process at the time of the child process’ creation. This means any objects in the parent (arrays, giant dicts, whatever) that were created before the child process was created, and were stored somewhere helpful like a module, are accessible to the child. Which means you don’t need to pickle/unpickle to access them.

      Sounds useful, right? There’s only one problem: the "fork" context is super-broken, which is why it will stop being the default in Python 3.14.

      Consider the following program:

      ```
      import threading
      import sys
      from multiprocessing import Process

      def thread1():
          for i in range(1000):
              print("hello", file=sys.stderr)

      # Start a thread in the parent before creating the child process.
      threading.Thread(target=thread1).start()

      def foo():
          pass

      Process(target=foo).start()
      ```

      On my computer, this program consistently deadlocks: it freezes and never exits. Any time you have threads in the parent process, the "fork" context can cause deadlocks, or even corrupted memory, in the child process.

      You might think that you’re fine because you don’t start any threads. But many Python libraries start a thread pool on import, for example NumPy. If you’re using NumPy, Pandas, or any other library that depends on NumPy, you are running a threaded program, and therefore at risk of deadlocks, segfaults, or data corruption when using the "fork" multiprocessing context. For more details see this article on why multiprocessing’s default is broken on Linux.

      You’re just shooting yourself in the foot if you take this approach.

    2. Threads vs. processes

      Multiple threads let you run code in parallel, potentially on multiple CPUs. In Python, however, the global interpreter lock makes this parallelism harder to achieve.

      Multiple processes also let you run code in parallel—so what’s the difference between threads and processes?

      All the threads inside a single process share the same memory address space. If thread 1 in a process stores some memory at address 0x7f0cd1a88810, thread 2 can access the same memory at the same address. That means passing objects between threads is cheap: you just need to get the pointer to the memory address from one thread to the other. A memory address is 8 bytes: this is not a lot of data to move around.

      In contrast, processes do not share the same memory space. There are some shared memory facilities provided by the operating system, typically, and we’ll get to that later. But by default, no memory is shared. That means you can’t just share the address of your data across processes: you have to copy the data.
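The difference can be sketched without starting a process at all. In the illustration below, a thread mutates the caller's list directly, while a pickle round trip stands in for what multiprocessing does to an argument by default. The function names are hypothetical.

```python
import pickle
import threading


def append_marker(shared: list) -> None:
    """Mutate whatever list object the caller handed over."""
    shared.append("from worker")


def run_in_thread(data: list) -> list:
    """The thread receives a reference to the very same list object."""
    worker = threading.Thread(target=append_marker, args=(data,))
    worker.start()
    worker.join()
    return data


def simulate_process_handoff(data: list) -> list:
    """Mimic a process hand-off: the worker sees an unpickled copy."""
    copy = pickle.loads(pickle.dumps(data))
    append_marker(copy)
    return copy


if __name__ == "__main__":
    threaded = [1, 2, 3]
    run_in_thread(threaded)
    print("after thread:", threaded)  # the original list was modified

    original = [1, 2, 3]
    copied = simulate_process_handoff(original)
    print("after hand-off:", original, copied)  # the original is untouched
```

The copy is why a result computed in a worker process has to be sent back explicitly, which means pickling it a second time.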

    1. Successful perceivers, according to O’Regan and Noë, will have mastery, or tacit knowledge, of the sensorimotor contingencies particular to their own sensory apparatuses. But it is not just motion of the eye that reveals sensorimotor contingencies. Often these contingencies show themselves only when the entire organism is in motion, moving around objects, picking them up, seeing them against a background of other objects, etc. The actions an organism takes in its efforts to perceive display a kind of skill—a skill that in turn reflects familiarity with the sensorimotor contingencies that impose order on patterns of changing stimulation that would otherwise appear as an uncrackable cipher—a code waiting to be broken.

      They are defining a normative aspect of perception, which need not be intrinsic to perception itself: they are defining what it takes to perceive well and accurately, the accuracy conditions of perception, which is this tacit knowledge of the sensorimotor contingencies. But this knowledge is a skill because it requires continued and varied perceptual experiences. So, in order to perceive well, one first needs to perceive, and so knowing the conditions of perceiving well will not necessarily impart knowledge of perception as such. Note the terms: 'familiarity' and knowledge of 'contingencies', an imposition of order on 'what would otherwise be' 'patterns of changing stimulation'.

      I.E., the skill invokes precisely the kind of cognitive processing they are seeking to explain? Are they describing an aspect of cognition by describing sensory motions that involve cognitive processing? Is this circular?

    1. Reviewer #1 (Public Review):

      Summary:

      This study aims to provide imaging methods for users in the field of human layer-fMRI. This is an emerging field with 240 papers published so far. Contrary to what the manuscript implies, 3T is well represented among those papers; see, e.g., the papers listed below that are not cited in the manuscript. Thus, the claim about the impact of developing 3T methodology for wider dissemination is not justified, specifically because some of the previous papers perform whole-brain layer-fMRI (also at 3T) with more efficient and more established procedures.

      The authors implemented a sequence with lots of nice features, including their own SMS EPI, diffusion bipolar pulses, and eye-saturation bands, and they built their own reconstruction around it. This is not trivial. Only a few labs around the world have this level of engineering expertise. I applaud this technical achievement. However, I doubt that any of this is the right tool for layer-fMRI, or that it represents an advancement for the field. In the thermal-noise-dominated regime of sub-millimeter fMRI (especially at 3T), it is established practice to use 3D readouts rather than 2D (SMS) readouts. While it is not trivial to implement SMS, the vendor implementations (as well as the CMRR and MGH implementations) are already the most widely applied across the majority of current fMRI studies. The authors' work on this does not address any previous shortcoming in the field.

      The mechanism to use bi-polar gradients to increase the localization specificity is doubtful to me. In my understanding, killing the intra-vascular BOLD should make it less specific. Also, the empirical data do not suggest a higher localization specificity to me.

      Embedding this work in the literature of previous methods is incomplete. Recent trends of vessel signal manipulation with ABC or VAPER are not mentioned. Comparisons with VASO are outdated and incorrect.

      The reproducibility of the methods and the result is doubtful (see below).

      I don't think that this manuscript is in the top 50% of the 240 layer-fMRI papers out there.

      3T layer-fMRI papers that are not cited:

      Taso, M., Munsch, F., Zhao, L., Alsop, D.C., 2021. Regional and depth-dependence of cortical blood-flow assessed with high-resolution Arterial Spin Labeling (ASL). Journal of Cerebral Blood Flow and Metabolism. https://doi.org/10.1177/0271678X20982382

      Wu, P.Y., Chu, Y.H., Lin, J.F.L., Kuo, W.J., Lin, F.H., 2018. Feature-dependent intrinsic functional connectivity across cortical depths in the human auditory cortex. Scientific Reports 8, 1-14. https://doi.org/10.1038/s41598-018-31292-x

      Lifshits, S., Tomer, O., Shamir, I., Barazany, D., Tsarfaty, G., Rosset, S., Assaf, Y., 2018. Resolution considerations in imaging of the cortical layers. NeuroImage 164, 112-120. https://doi.org/10.1016/j.neuroimage.2017.02.086

      Puckett, A.M., Aquino, K.M., Robinson, P.A., Breakspear, M., Schira, M.M., 2016. The spatiotemporal hemodynamic response function for depth-dependent functional imaging of human cortex. NeuroImage 139, 240-248. https://doi.org/10.1016/j.neuroimage.2016.06.019

      Olman, C.A., Inati, S., Heeger, D.J., 2007. The effect of large veins on spatial localization with GE BOLD at 3 T: Displacement, not blurring. NeuroImage 34, 1126-1135. https://doi.org/10.1016/j.neuroimage.2006.08.045

      Ress, D., Glover, G.H., Liu, J., Wandell, B., 2007. Laminar profiles of functional activity in the human brain. NeuroImage 34, 74-84. https://doi.org/10.1016/j.neuroimage.2006.08.020

      Huber, L., Kronbichler, L., Stirnberg, R., Ehses, P., Stocker, T., Fernández-Cabello, S., Poser, B.A., Kronbichler, M., 2023. Evaluating the capabilities and challenges of layer-fMRI VASO at 3T. Aperture Neuro 3. https://doi.org/10.52294/001c.85117

      Scheeringa, R., Bonnefond, M., van Mourik, T., Jensen, O., Norris, D.G., Koopmans, P.J., 2022. Relating neural oscillations to laminar fMRI connectivity in visual cortex. Cerebral Cortex. https://doi.org/10.1093/cercor/bhac154

      Strengths:

      See above. The authors developed their own SMS sequence with many features. This is important to the field, and it does not leave sequence development work to a few isolated monopoly labs. This work democratises SMS.

      The questions addressed here are of high relevance to the field: getting tools with good sensitivity, user-friendly applicability, and locally specific brain activity mapping is an important topic in the field of layer-fMRI.

      Weaknesses:

      1. I feel the authors need to justify why flow-crushing helps localization specificity. There is an entire family of recent papers that aim to achieve higher localization specificity by doing the exact opposite. Namely, MT or ABC fMRI aims to increase the localization specificity by highlighting the intravascular BOLD by means of suppressing non-flowing tissue. To name a few:

      Priovoulos, N., de Oliveira, I.A.F., Poser, B.A., Norris, D.G., van der Zwaag, W., 2023. Combining arterial blood contrast with BOLD increases fMRI intracortical contrast. Human Brain Mapping hbm.26227. https://doi.org/10.1002/hbm.26227.

      Pfaffenrot, V., Koopmans, P.J., 2022. Magnetization Transfer weighted laminar fMRI with multi-echo FLASH. NeuroImage 119725. https://doi.org/10.1016/j.neuroimage.2022.119725

      Schulz, J., Fazal, Z., Metere, R., Marques, J.P., Norris, D.G., 2020. Arterial blood contrast ( ABC ) enabled by magnetization transfer ( MT ): a novel MRI technique for enhancing the measurement of brain activation changes. bioRxiv. https://doi.org/10.1101/2020.05.20.106666

      Based on this literature, it seems that the proposed method will make the vein problem worse, not better. The authors could make it clearer how they reason that making GE-BOLD signals more extra-vascular weighted should help to reduce large vein effects.

      The empirical evidence for the claim that flow crushing helps with the localization specificity should be made clearer. The response magnitude with and without flow crushing looks pretty much identical to me (see Fig. 6d).

      It's unclear to me what to look for in Fig. 5. I cannot discern any layer patterns in these maps. They are too noisy. The two maps of TE=43ms look like identical copies of each other. Maybe an editorial error?

      The authors discuss bipolar crushing with respect to SE-BOLD where it has been previously applied. For SE-BOLD at UHF, a substantial portion of the vein signal comes from the intravascular compartment. So I agree that for SE-BOLD, it makes sense to crush the intravascular signal. For GE-BOLD however, this reasoning does not hold. For GE-BOLD (even at 3T), most of the vein signal comes from extravascular dephasing around large unspecific veins, and the bipolar crushing is not expected to help with this.

      2. The bipolar crushing is limited to one single direction of flow. This introduces a lot of artificial variance across the cortical folding pattern. This is not mentioned in the manuscript. There is an entire family of papers that perform layer-fMRI with black-blood imaging that solves this with a 3D contrast preparation (VAPER) that is applied across a longer time period, thus killing the blood signal while it flows across all directions of the vascular tree. Here, the signal crushing is happening with a 2D readout as a "snap-shot" crushing. This does not allow the blood to flow in multiple directions.

      VAPER also accounts for BOLD contaminations of larger draining veins by means of a tag-control sampling. The proposed approach here does not account for this contamination.

      Chai, Y., Li, L., Huber, L., Poser, B.A., Bandettini, P.A., 2020. Integrated VASO and perfusion contrast: A new tool for laminar functional MRI. NeuroImage 207, 116358. https://doi.org/10.1016/j.neuroimage.2019.116358

      Chai, Y., Liu, T.T., Marrett, S., Li, L., Khojandi, A., Handwerker, D.A., Alink, A., Muckli, L., Bandettini, P.A., 2021. Topographical and laminar distribution of audiovisual processing within human planum temporale. Progress in Neurobiology 102121. https://doi.org/10.1016/j.pneurobio.2021.102121

      If I would recommend anyone to perform layer-fMRI with blood crushing, it seems that VAPER is the superior approach. The authors could make it clearer why users might want to use the unidirectional crushing instead.

      3. The comparison with VASO is misleading.

      The authors claim that previous VASO approaches were limited by TRs of 8.2s. The authors might be advised to check the literature of recent years.

      Koiso et al. performed whole brain layer-fMRI VASO at 0.8mm at 3.9 seconds (with reliable activation), 2.7 seconds (with unconvincing activation pattern, though), and 2.3 seconds (without activation).

      Also, whole brain layer-fMRI BOLD at 0.5mm and 0.7mm has been previously performed by the Juelich group at TRs of 3.5s (their TR definition is 'fishy' though).

      Koiso, K., Müller, A.K., Akamatsu, K., Dresbach, S., Gulban, O.F., Goebel, R., Miyawaki, Y., Poser, B.A., Huber, L., 2023. Acquisition and processing methods of whole-brain layer-fMRI VASO and BOLD: The Kenshu dataset. Aperture Neuro 34. https://doi.org/10.1101/2022.08.19.504502

      Yun, S.D., Pais‐Roldán, P., Palomero‐Gallagher, N., Shah, N.J., 2022. Mapping of whole‐cerebrum resting‐state networks using ultra‐high resolution acquisition protocols. Human Brain Mapping. https://doi.org/10.1002/hbm.25855

      Pais-Roldan, P., Yun, S.D., Palomero-Gallagher, N., Shah, N.J., 2023. Cortical depth-dependent human fMRI of resting-state networks using EPIK. Front. Neurosci. 17, 1151544. https://doi.org/10.3389/fnins.2023.1151544

      The authors are correct that VASO is not advised as a turn-key method for lower brain areas, incl. Hippocampus and subcortex. However, the authors use this word of caution that is intended for inexperienced "users" as a statement that this cannot be performed. This statement is taken out of context. This statement is not from the academic literature. It's advice for the 40+ user base that wants to perform layer-fMRI as a plug-and-play routine tool in neuroscience usage. In fact, sub-millimeter VASO is routinely being performed by MRI-physicists across all brain areas (including deep brain structures, hippocampus etc). E.g. see Koiso et al. and an overview lecture from a layer-fMRI workshop that I had recently attended: https://youtu.be/kzh-nWXd54s?si=hoIJjLLIxFUJ4g20&t=2401

      Thus, the authors could embed this phrasing into the context of their own method that they are proposing in the manuscript. E.g. the authors could state whether they think that their sequence has the potential to be disseminated across sites, considering that it requires slow offline reconstruction in Matlab?

      Do the authors think that the results shown in Fig. 6c are suggesting turn-key acquisition of a routine mapping tool? In my humble opinion, it looks like random noise, with most of the activation outside the ROI (in white matter).

      4. The repeatability of the results is questionable.

      The authors perform experiments about the robustness of the method (line 620). The corresponding results do not suggest any robustness to me. In fact, the layer profiles in Fig. 4c vs. Fig. 4d are completely opposite. The locations of peaks turn into locations of dips and vice versa.

      The methods are not described in enough detail to reproduce these results.

      The authors mention that their image reconstruction is done "using in-house MATLAB code" (line 634). They do not post a link to GitHub, nor do they say if they share this code.

      It is not trivial to get good phase data for fMRI. The authors do not mention how they perform the respective coil-combination.

      No data are shared for reproduction of the analysis.

      5. The application of NORDIC is not validated.

      Previous applications of NORDIC at 3T layer-fMRI have resulted in mixed success. When not adjusted for the right SNR regime it can result in artifactual reductions of beta scores, depending on the SNR across layers. The authors could validate their application of NORDIC and confirm that the average layer-profiles are unaffected by the application of NORDIC. Also, the NORDIC version should be explicitly mentioned in the manuscript.

      Akbari, A., Gati, J.S., Zeman, P., Liem, B., Menon, R.S., 2023. Layer Dependence of Monocular and Binocular Responses in Human Ocular Dominance Columns at 7T using VASO and BOLD (preprint). Neuroscience. https://doi.org/10.1101/2023.04.06.535924

      Knudsen, L., Guo, F., Huang, J., Blicher, J.U., Lund, T.E., Zhou, Y., Zhang, P., Yang, Y., 2023. The laminar pattern of proprioceptive activation in human primary motor cortex. bioRxiv. https://doi.org/10.1101/2023.10.29.564658

    1. Reviewer #2 (Public Review):

      Summary

      Song et al. investigate the role of the frontal eye field (FEF) and the intraparietal sulcus (IPS) in mediating the shift in ocular dominance (OD) observed after a period of dichoptic stimulation during which attention is selectively directed to one eye. This manipulation has been previously found to transiently shift OD in favor of the unattended eye, similar to the effect of short-term monocular deprivation. To this aim, the authors combine psychophysics, fMRI, and transcranial magnetic stimulation (TMS). In the first experiment, the authors determine the regions of interest (ROIs) based on the responses recorded by fMRI during either dichoptic or binocular stimulation, showing selective recruitment of the right FEF and IPS during the dichoptic condition, in line with the involvement of eye-based attention. In a second experiment, the authors investigate the causal role of these two ROIs in mediating the OD shift observed after a period of dichoptic stimulation by selectively inhibiting with TMS (using continuous theta burst stimulation, cTBS), before the adaptation period (50 min exposure to dichoptic stimulation). They show that, when cTBS is delivered on the FEF, but not the IPS or the vertex, the shift in OD induced by dichoptic stimulation is reduced, indicating a causal involvement of the FEF in mediating this form of short-term plasticity. A third control experiment rules out the possibility that TMS interferes with the OD task (binocular rivalry), rather than with the plasticity mechanisms. From this evidence, the authors conclude that the FEF is one of the areas mediating the OD shift induced by eye-selective attention.

      Strengths

      1. The experimental paradigm is sound and the authors have thoroughly investigated the neural correlates of an interesting form of short-term visual plasticity combining different techniques in an intelligent way.

      2. The results are solid and the appropriate controls have been performed to exclude potential confounds.

      3. The results are very interesting, providing new evidence both about the neural correlates of eye-based attention and the involvement of extra-striate areas in mediating short-term OD plasticity in humans, with potential relevance for clinical applications (especially in the field of amblyopia).

      Weaknesses

      1. Ethics: more details about the ethics need to be included in the manuscript. It is only mentioned for experiment 1 that participants "provided informed consent in accordance with the Declaration of Helsinki. This study was approved by the Institutional Review Board of the Institute of Psychology, Chinese Academy of Sciences". (Which version of the Declaration of Helsinki? The latest version requires the pre-registration of the study. The code of the approved protocol together with the code and date of the approval should be provided.) There is no mention of informed consent procedures or ethics approval for the TMS experiments. This is a huge concern, especially for brain stimulation experiments!

      2. Statistics: the methods section should include a sub-section describing in detail all the statistical analyses performed for the study. Moreover, in the results section, statistical details should be added to support the fMRI results. In the current version of the manuscript, the claims are not supported by statistical evidence.

      3. Interpretation of the results: the TMS results are very interesting and convincing regarding the involvement of the FEF in the build-up of the OD shift induced by dichoptic stimulation, however, I am not sure that the authors can claim that this effect is related to eye-based attention, as cTBS has no effect on the blob detection task during dichoptic stimulation. If the FEF were causally involved in eye-based attention, one would expect a change in performance in this task during dichoptic stimulation, perhaps a similar performance for the unattended and attended eye. The authors speculate that the sound could have an additional role in driving eye-based attention, which might explain the lack of effect for the blob discrimination task, however, this hypothesis has not been tested.

      4. Writing: in general, the manuscript is well written, but clarity should be improved in certain sections.

      a. fMRI results: the first sentence is difficult to understand at first read, but it is crucial to understand the results, please reformulate and clarify.

      b. Experiment 3: the rationale for experiment one should be straightforward, without a long premise explaining why it would not be necessary.

      c. Discussion: the language is a bit familiar here and there, a more straightforward style should be preferred (one example: p.19 second paragraph).

      5. Minor: the authors might consider using the term "participant" or "observer" instead of "subject" when referring to the volunteers who participated in the study.

    1. There exist explicit cut-and-dried algorithms for calculating the Hodge dual of B, especially if B is known in terms of components in some basis. See the discussion in reference 13, or see the actual code in reference 14.

      it's literally just iB, is it not? Why do I need a reference for that, and why isn't it just directly stated?

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      Haj Abdullah Alieh et al. describe re-analysis of an existing short read RNA-Seq dataset consisting of 3 replicates of 3 FAC sorted cell populations of the E14.5 Btg2::RFP/Tubb3::GFP mouse cortex: neural stem cells (NSC; RFP-/GFP-), neural precursors (NP; RFP+/GFP-) and neurons (N; GFP+), for the purpose of investigating alternative splicing isoform switching during neuronal cell-type specification. They generate a one replicate PacBio dataset of these same sorted cells, with the aim of identifying full-length transcript isoforms, which are difficult to discern with short-read data alone. The key conclusions are the discovery of ~50,000 novel transcript isoforms containing ~2,500 novel splice junctions; the discovery of isoform switches between NSC -> neuron that contain a high proportion of microexon inclusion events; and the finding that many of these switches are predicted by Alphafold2 to have a structural impact.

      The data is interesting and the bioinformatics approach of investigating potential impacts of splice variants on protein structure using Alphafold2 is also interesting, however at present the paper would be better presented as a resource, unless effort is undertaken to experimentally validate some potential biological findings. However, for the paper to be useful as a resource, links to newly generated data and analysis code need to be provided. The capacity for exploration of these newly identified splice isoforms, or further analysis using the new GTF, could then be one of the attractions of this work.

      Major comments:

      • Are the key conclusions convincing?
      • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?
      • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      Figure 1: the discovery of ~50,000 novel transcript isoforms containing ~2,500 novel splice junctions

      As far as I can see the description of novelty is based on them being not present in either Ensembl (GRCm38.p6), NCBI_RefSeq, or Gencode (vM10) - note here the numbers are genome assembly versions and do not refer to the GTF annotation versions compared against - these should be provided as they are frequently updated. The claim is that they are not present in these references because the unique cell samples have not been analysed before. For transcript isoforms to be included in these references they must have a good level of support. I have a couple of concerns about the support for these isoforms:

      The numbers in figure 1A do not add up. For long read sequencing two pipelines are used resulting in 76,077 and 80,782 isoforms - in the Venn diagram 1A the overlapping circle has two numbers of isoforms in it: 70,658 and 71,760, so it is unclear: are 70,658 isoforms found by both pipelines or 71,760? Then we are told the union of these transcripts is taken forward to the next Venn diagram. However this diagram is labelled with 82,046 transcript isoforms. Pipeline 1 has labelled 5,419 unique isoforms, pipeline 2 has 9,022 unique isoforms, so 5,419 + 9,022 + 70,658 (71,760) = 85,099 (86,201), not 82,046 - perhaps some extra filtering has occurred that should be labelled/described? Again the final number of transcripts at the end of everything is off - if the 82,046 transcripts from long read are combined with the 16,070 unique to the short read this equals 98,116, not 97,240.

      The authors decide to use long read sequencing to assemble the isoforms as short-read sequencing is unreliable for assembling full length isoforms - however for their final list they merge isoforms assembled by StringTie from short read data with the isoforms assembled from the PacBio long read data. It seems likely that the isoforms detected only by short-read StringTie assembly would be unreliable and shouldn't be included in the final total.

      The authors perform only one biological replicate of PacBio long read sequencing of three different samples, so it is not possible to easily determine the reproducibility of the findings. I appreciate PacBio is expensive; the authors could consider other ways to evaluate the reproducibility - perhaps by looking at the detection of transcripts expected to be uniformly expressed between the different conditions?

      The authors provide no quality information for their PacBio sequencing run - e.g. length distribution of reads, how many reads are left after quality filtering, quality across the length of reads. I.e. I do not know if most isoforms reported are supported by 5 full length isoform reads, or if it is rare in the dataset to get full length isoform reads, etc. Is the quality comparable across the three PacBio samples? How many of the novel isoforms are supported by both short read and long read data? How many of the novel isoforms are supported only by short reads? How many isoforms are found in all three PacBio samples? Does gene expression measured with the PacBio data match the previous results of measuring gene expression in the short read data? Adding these kinds of analyses would give more confidence in the results.

      This section of methods is confusing, I don't really understand what has been done or what part of the manuscript this refers to: "Events were assigned to an inclusion isoform if their coordinates overlapped, at least partially, with an exon or to an exclusion isoform if they were located within an intron. AS events without a corresponding inclusion or exclusion isoform were assigned to an Ensembl or NCBI_RefSeq isoform using the criteria above. Only AS events assigned to at least one inclusion and one exclusion isoform were considered for further analysis."

      VastDB is a splicing database created by Manuel Irimia/Ben Blencowe containing a lot of neural samples across development - how many of the 'novel' splice sites are present in VastDB? Similarly, how many of the 'novel' splice isoforms were previously detected by Zhang et al., 2016, Cell?

      Figure 2: over neuronal maturation the major splicing change is for cassette exons to become more included, 50% of those measured being microexons

      Overall this section is strongest, the conclusions are well supported.

      Figure 2D - there are no genome coordinates given to allow the reader to check the highlighted events out for themselves.

      Figure 2F is very confusing, consider an alternative way to present this.

      Figure 2G, the premise of this analysis is interesting! But I am confused by the numbers - in 2F it's shown that 226 exons become more included between both NSC->NP->N, so why are 441 exons plotted in 2G?

      Whilst I appreciate genes must be expressed in both NSCs and neurons to be able to calculate differential splicing, one thing not addressed is whether expression of a lot of these genes also goes up in neurons, i.e. could it be that when these genes are lowly expressed in NSCs their splicing is not particularly well regulated but it doesn't really matter because they are not really required in NSCs? This becomes relevant later where you start to address the functionality of isoform switches - if the gene is expressed to the same degree in NSC vs. N this would suggest that both isoforms are functional; if a gene is very lowly expressed in NSC but highly expressed in N, then maybe only the N isoform needs to be functional.

      Gene ontology methodology is not described in the methods. What were the spliced genes compared against? Given these are neural samples, lots of expressed genes will have neural functions, so is this really informing us about the alternatively spliced genes?

      The manuscript would benefit from integration of its data with other published datasets - especially with the microexons - how do these behave in other datasets of neuronal maturation (such as those from VastDB or Zhang 2016)?

      The authors could consider looking at motifs around regulated microexons to try and establish if any specific RBPs might be involved in this regulation, although this would benefit from follow up experiments.

      Figure 3: exon inclusion in neuronal specific transcripts confers different structures to translated proteins, suggesting these events are important functionally Here, Alphafold2 is used to predict the structures of switching isoforms, whilst an interesting approach to inform further experiments, presented alone, it remains hypothetical. Hook2 is highlighted as one example, where inclusion of a microexon introducing two amino acids to the translated protein is predicted to cause a structural change that will impact its binding to microtubules. It's hard to determine if this really will have a functional impact without doing experiments in the lab. For this manuscript to serve as a research (rather than resource) article, it would benefit from an example experiment expressing neuronal vs. NSC Hook2 isoform in a cell line and measuring co-localisation with microtubules via IF microscopy, or something similar to address the proposed function. In the second half of this figure, more subtle local structural changes are investigated and the example of an alpha-helix to beta-strand switch predicted in Kctd13 is presented. The figure would benefit from showing the splicing change at the RNA level and relating that to the change seen at the protein sequence level as it is a bit confusing - the region of deletion is labelled as 'AS REGION' however, two amino acids preceding this box are different between the two isoforms (KVEF vs. KVRG) - so presumably the splicing change starts earlier than denoted? In the discussion the authors state: "While these regions are long known to exist, their structural switch was assumed to be dependent on substantial changes in their structural and sequence contexts (Gendoo and Harrison, 2011; W. Li et al., 2015) as opposed to, as observed in our study, being triggered by small perturbations within nearly identical sequence contexts." 
It's not clear whether these small local predictions are accurate; validating them would require some additional structural data.

      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments. Suggestions of additional computational analysis are very realistic and shouldn't take longer than a month or two. The addition of experimental data to support Figure 3 would take considerable time and resources, potentially collaboration with other labs. Perhaps focusing on making this dataset an accessible resource would be a better route to publication.
      • Are the data and the methods presented in such a way that they can be reproduced? No, no source code, software versions or supplementary data/materials are provided.
      • Are the experiments adequately replicated and statistical analysis adequate? Having one replicate of the PacBio experiment is a bit concerning, but I am aware that it is expensive. Given they have three samples of different conditions with PacBio data, perhaps showing the quality control of the libraries, reproducibility of transcripts that don't change in the three conditions, etc. would give more confidence in the data.

      Minor comments:

      • Specific experimental issues that are easily addressable. Made above.
      • Are prior studies referenced appropriately? Yes. Except for this section of introduction: "While great effort is being made to overcome these limitations, capturing cell type-specific AS dynamics that is both quantitative and comprehensive of full-length transcript information currently requires combination of both SRS and LRS performed in parallel on the same cell pool. This was seldom attempted (Gupta et al., 2018; Joglekar et al., 2021) and, to the best of our knowledge, never for specific cell types of the developing mammalian brain. Even more limiting, systematic assessment of the consequences of AS on protein structure and putative function in cell fate commitment is entirely lacking. "

      LRS has allowed for whole transcriptome determination and quantification in a number of cases, especially in non-model organisms. Below I mention some examples from human and mouse:

      Nanopore use in GTEX + short reads: Glinos et al., 2022 Nature https://www.nature.com/articles/s41586-022-05035-y
      PacBio SMRT-Seq + short reads human and mouse cortex: Leung et al. Cell Reports 2021 https://www.cell.com/cell-reports/pdf/S2211-1247(21)01504-7.pdf
      PacBio IsoSeq + short reads in human and mouse sperm: Sun et al., 2021 Nature Communications https://www.nature.com/articles/s41467-021-21524-6

      Single cell long read RNA-Seq has also been described in several scenarios and is worth referencing in the introduction:

      Samples from various mouse and human sources: Tian et al., 2021 Genome Biology https://link.springer.com/article/10.1186/s13059-021-02525-6
      Differential isoform usage in myeloma cell lines: Phillpott et al., 2021, Nature Biotech https://www.nature.com/articles/s41587-021-00965-w
      Single cell long read isoform analysis in human immune cells: Volden and Vollmers, 2022, Genome Biology https://genomebiology.biomedcentral.com/articles/10.1186/s13059-022-02615-z

      • Are the text and figures clear and accurate? Mostly, I've highlighted where numbers in figures don't make sense to me. Generally the text could use some going over and tightening up (e.g. sentence on page 12 needs revising for clarity and typo "The fact that within this helical packing resides the protein domain essential for Hook2 function to bind microtubules, implies that such a negligible AS switch by two ammino acids may result in a completely altered function. ")
      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions? I have made suggestions above about figures that are unclear to me.

      Referees cross-commenting

      After reading the reviews of other reviewers, it seems we are much in agreement over the main concerns relating to this manuscript. Namely: concerns over the PacBio being single replicate, concerns over indiscriminately merging PacBio and SRS transcripts, concerns about lack of validation of the structural changes predicted by AlphaFold2. On the question of novelty and significance we also seem to be aligned.

      Significance

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      The main general findings of the work have been described elsewhere: that microexon inclusion increases in many transcripts during neuronal cell fate commitment has previously been described, and the suggestions of important isoform structural changes in Hook2 and Kctd13 are not backed up by any experimental data and so are not reliable. The description of a huge number of novel isoforms is not particularly useful because it's not clear if these have been found by other similar studies, because the data is not compared; furthermore we have no information about these isoforms to be able to pursue further research about them. The main output of the work would be the data and transcript annotations for other people to follow up on, but this is not provided in any accessible way. The paper might be better reframed as a resource, if it is not possible to follow up on the biological conclusions.

      • Place the work in the context of the existing literature (provide references, where appropriate).

      Previously, alternative splicing has been studied in purified cell types of the developing mouse cortex using short read sequencing, e.g. in Zhang 2016, Cell. In this previous study, VZ NPCs (EGFP−) and non-VZ cells (EGFP+) were isolated from E14.5 Tbr2-EGFP mouse cerebral cortex. The double reporter mouse model used in the present study allows for better cell sorting into NSC, NPC and neurons, and the long read sequencing allows for whole transcript identification; however, the present study has made no effort to compare the data, so it's not clear how much new biology this leads to. In Zhang 2016, the authors also predict disruption to protein domains caused by AS, but go further to perform experiments to validate the impact of some of these predictions.

      • State what audience might be interested in and influenced by the reported findings.

      Researchers of this cell fate transition might want to look at their favourite genes to see if there are novel isoforms reported (however this is currently not possible because this information is not provided). Researchers of Hook2 or Kctd13 may want to further explore the described predicted structural changes. Researchers generally studying alternative splicing may want to include the novel isoforms in their analyses (again currently not possible because they are not provided). Generally this paper would probably be best seen as a resource.

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      Bioinformatics, Splicing, RBP biology

    1. When to use Python Async

      Async only makes sense if you're doing IO.

      There's ZERO benefit in using async for stuff like this that is CPU-bound:

      ```
      import asyncio


      async def sum_two_numbers_async(n1: int, n2: int) -> int:
          # Pure CPU work: nothing here ever waits, so async adds only overhead
          return n1 + n2


      async def main():
          await sum_two_numbers_async(2, 2)
          await sum_two_numbers_async(4, 4)


      asyncio.run(main())
      ```

      Your code might even get slower by doing that due to the Event Loop.

      That's because Python async only optimizes IDLE time!
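
      By contrast, a minimal sketch of the case where async does help: several waits that overlap. Here `asyncio.sleep` stands in for a network or disk wait, and `fake_io`, `run_concurrently` and `run_sequentially` are hypothetical names chosen for illustration, not part of the original note.

      ```python
      import asyncio
      import time


      async def fake_io(delay: float) -> float:
          # asyncio.sleep stands in for idle time spent waiting on IO
          await asyncio.sleep(delay)
          return delay


      async def run_concurrently() -> float:
          start = time.perf_counter()
          # The three waits overlap, so total time is close to one delay
          await asyncio.gather(fake_io(0.2), fake_io(0.2), fake_io(0.2))
          return time.perf_counter() - start


      async def run_sequentially() -> float:
          start = time.perf_counter()
          # Each wait finishes before the next starts, so the delays add up
          for _ in range(3):
              await fake_io(0.2)
          return time.perf_counter() - start


      concurrent_time = asyncio.run(run_concurrently())
      sequential_time = asyncio.run(run_sequentially())
      print(f"concurrent: {concurrent_time:.2f}s, sequential: {sequential_time:.2f}s")
      ```

      The gain comes entirely from overlapping idle time; replacing the sleeps with arithmetic would remove it.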

    1. Reviewer #1 (Public Review):

      Summary:<br /> Huang and Luo investigated whether regularities between stimulus features can be exploited to facilitate the encoding of each set of stimuli in visual working memory, improving performance. They recorded both behavioural and neural (EEG) data from human participants during a sequential delayed response task involving three items with two properties: location and colour. In the key condition ('aligned trajectory'), the distance between locations of successively presented stimuli was identical to their 'distance' in colour space, permitting a compression strategy of encoding only the location and colour of the first stimulus and the relative distance of the second and third stimulus (as opposed to remembering 3 locations and 3 colours, this would only require remembering 1 location, 1 colour, and 2 distances). Participants recalled the location and colour of each item after a delay.

      Consistent with the compression account, participants' location and colour recall errors were correlated and were overall lower compared to a non-compressible condition ('misaligned trajectory'). Multivariate analysis of the neural data permitted decoding of the locations and colours during encoding. Crucially, the relative distance could also be decoded - a necessary ingredient for the compression strategy.

      Strengths:<br /> The main strength of this study is a novel experimental design that elegantly demonstrates how we exploit stimulus structure to overcome working memory capacity limits. The behavioural results are robust and support the main hypothesis of compressed encoding across a number of analyses. The simple and well-controlled design is suited to neuroimaging studies and paves the way for investigating the neural basis of how environmental structure is detected and represented in memory. Prior studies on this topic have primarily studied behaviour only (e.g., Brady & Tenenbaum, 2013).

      Weaknesses:<br /> The main weakness of the study is that the EEG results do not make a clear case for compression or demonstrate its neural basis. If the main aim of this strategy is to improve memory maintenance, it seems that it should be employed during the encoding phase. From then on, the neural representation in memory should be in the compressed format. The only positive evidence for this occurs in the late encoding phase (the re-activation of decoding of the distance between items 1 and 2, Fig. 5A), but the link to behaviour seems fairly weak (p=0.068). Stronger evidence would be showing decoding of the compressed code during memory maintenance or recall, but this is not presented. On the contrary, during location recall (after the majority of memory maintenance is already over), colour decoding re-emerges, but in the un-compressed item-by-item code (Fig. 4B). The authors suggest that compression is consolidated at this point, but its utility at this late stage is not obvious.

      Impact:<br /> This important study elegantly demonstrates that the use of shared structure can improve capacity-limited visual working memory. The paradigm and approach explicitly link this field to recent findings on the role of replay in structure learning and will therefore be of interest to neuroscientists studying both topics.

    1. I disagree. What is expressed is an attempt to solve X by making something that should maybe be agnostic of time asynchronous. The problem is related to design: time taints code. You have a choice: either you make the surface area of async code grow and grow or you treat it as impure code and you lift pure synchronous logic in an async context. Without more information on the surrounding algorithm, we don't know if the design decision to make SymbolTable async was the best decision and we can't propose an alternative. This question was handled superficially and carelessly by the community.

      superficially and carelessly?

    2. because the value isn't there yet. A promise is just a marker that it will be available at some point in the future. You cannot convert asynchronous code to synchronous, though. If you order a pizza, you get a receipt that tells you that you will have a pizza at some point in the future. You cannot treat that receipt as the pizza itself, though. When you get your number called you can "resolve" that receipt to a pizza. But what you're describing is trying to eat the receipt.
    1. Performance

      "Core performance" would probably describe this better, since this talks about the performance in the core competence dimension of a software developer. A generic name like "Performance" can be misleading since performance of a software engineer is much more than the quality of code.

    2. Myth: Productivity is all about developer activity

      In this case, the activity could be an input metric, like the # of hours a developer is working, or an output metric, like the # of lines of code, or the # of PRs and so on, both of which are tricky to be used as sole indicators of productivity.

      Adding to this, developer activity is often seen through a narrow lens of code. While that's true for junior engineers, software engineering is much more than writing code, especially in senior positions, and involves planning, documenting, communicating/collaborating, execution, monitoring and maintenance.

    1. For security, app access tokens should never be hard-coded into client-side code; doing so would give everyone who loaded your webpage or decompiled your app full access to your app secret, and therefore the ability to modify your app. This implies that most of the time, you will be using app access tokens only in server-to-server calls.
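
      One common way to follow this advice on the server side is to read the token from the process environment instead of embedding it in source. The sketch below assumes a hypothetical variable name, `APP_ACCESS_TOKEN`, and a hypothetical helper, `load_app_access_token`; neither comes from the quoted documentation.

      ```python
      import os


      def load_app_access_token(env_var: str = "APP_ACCESS_TOKEN") -> str:
          """Return the token from the server's environment, never from source code."""
          token = os.environ.get(env_var)
          if not token:
              # Fail loudly rather than falling back to a hard-coded value
              raise RuntimeError(f"{env_var} is not set")
          return token
      ```

      The token then lives only in the server's configuration and is never shipped to browsers or bundled into a distributed app.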
    1. small sample bias (see chapter 4: Small Samples).

      OK, in that case I would wait until the next chapter to introduce the R code. There isn't much value in showing someone how to get an answer in R that differs from your hand-worked example.

    1. Markdown is an excellent language to use for doc files stored next to code

      There are markdown documentation frameworks, but I still want to try sphinx.

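    1. Leon Huang: Recently, because of Benjamin van Rooij, I bought the book 《行為失控》 (the Chinese edition of The Behavioral Code). I have just read the first chapter, and I found that it is the translation that has truly lost control. It is shoddy and riddled with errors. I am not talking simply about the translation style seeming mainland-Chinese in flavour, but about the vocabulary, meaning, syntax, and even the clever touches of the original writing, which are omitted or mistranslated in the Chinese edition, so that the ingenuity is entirely lost. Let me give a few examples. 1. The title of the first chapter, "A Tale of Two Codes", is obviously the author's homage to Dickens's famous A Tale of Two Cities (《雙城記》). Seeing this takes neither much brainpower nor any profound literary training. With this most basic literary allusion in mind, translating it as 「雙『碼』記」 keeps the original meaning and also echoes the book's dual reference to the legal code, that is, 法典, which is a kind of "legal coding", and the behavioral code (行為準則), which is also a kind of "behavioral coding". That is exactly 「雙『碼』記」. How could it be translated as 「兩個密碼的故事」 ("The Story of Two Passwords")? Then, throughout the text, "legal code" is forced into 法律密碼 and "behavioral code" into 行為密碼. Just how many secrets are there? 編碼, 符碼, 組碼: all sorts of word combinations could convey the author's meaning. What does any of this necessarily have to do with 密碼 (cryptography)? 2. 「在這一切之中,律師扮演著重要的角色。律師身為立法者...」 ("In all of this, attorneys play an important role. Attorneys, as legislators...") (page 22). This wildly off-key translation evidently assumes that "lawyer" means 律師 (attorney), without knowing that "lawyer" very often refers broadly to 法律人 (legal professionals). Never mind fluency: the translated passage as a whole is hard to understand even as Chinese. 3. 「我們的法律傾向由公共意見形成的政治過程。」 (page 23). Can anyone tell me what this Chinese sentence is saying? I cannot understand it. And this is only the first chapter. The words and sentences I have marked as errors, or as Chinese that cannot be understood as Chinese, are already too many to count. I do not understand how a book that spans law and behavioral science, and that was well received in the United States, could be botched like this in the Chinese edition from such a large publisher. Do the translator and the editor in charge not feel responsible to readers? When the editor does not understand something, should they not bring in a specialist reviewer? Or do they think that nobody understands books about law anyway, so it does not matter?

      I would translate A Tale of Two Codes as 《雙典記》: code = 法典 (legal code), encode = 編成法典 (to codify).

      Setting aside how "code" should be translated: as the original poster, a lawyer, points out, the author clearly uses the word "lawyer" in a broad sense, yet the translator rigidly trusted decades-old, shallowly learned memory that "lawyer" must mean 「律師」 (attorney) and nothing else, and so produced a laughable meaning. For example, the original "lawyers ACT AS judges..." was translated as 「律師的舉止有如法官」 ("attorneys behave like judges"). Even though "act as" so clearly means "serve as" or "function as", it became "behave like". The sentence means "when legal professionals serve as judges...".

      In 「最終形成公共意見」, 「形成」 (form) should be 「形塑」 (shape, in the sense of influence or sway); the original is "shape", not "form" or "make up".

      That ungrammatical sentence, 「法律傾向...政治過程」, is truly baffling. The editor was simply asleep; one word of the original was left untranslated.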

    1. The code isn't that different from your typical asyncio script:

      ```
      import re
      import time

      import httpx
      import trio

      urls = [
          "https://www.bitecode.dev/p/relieving-your-python-packaging-pain",
          "https://www.bitecode.dev/p/hype-cycles",
          "https://www.bitecode.dev/p/why-not-tell-people-to-simply-use",
          "https://www.bitecode.dev/p/nobody-ever-paid-me-for-code",
          "https://www.bitecode.dev/p/python-cocktail-mix-a-context-manager",
          "https://www.bitecode.dev/p/the-costly-mistake-so-many-makes",
          "https://www.bitecode.dev/p/the-weirdest-python-keyword",
      ]

      # Non-greedy capture of the page title, tolerating attributes on the tag
      title_pattern = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE)

      user_agent = (
          "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/116.0"
      )


      async def fetch_url(url):
          start_time = time.time()

          async with httpx.AsyncClient() as client:
              headers = {"User-Agent": user_agent}
              response = await client.get(url, headers=headers)
              match = title_pattern.search(response.text)
              title = match.group(1) if match else "Unknown"
              print(f"URL: {url}\nTitle: {title}")

          end_time = time.time()
          elapsed_time = end_time - start_time
          print(f"Time taken for {url}: {elapsed_time:.4f} seconds\n")


      async def main():
          global_start_time = time.time()

          # That's the biggest API difference
          async with trio.open_nursery() as nursery:
              for url in urls:
                  nursery.start_soon(fetch_url, url)

          global_end_time = time.time()
          global_elapsed_time = global_end_time - global_start_time
          print(f"Total time taken for all URLs: {global_elapsed_time:.4f} seconds")


      if __name__ == "__main__":
          trio.run(main)
      ```

      Because it doesn't create nor schedule coroutines immediately (notice the nursery.start_soon(fetch_url, url) is not nursery.start_soon(fetch_url(url))), it will also consume less memory. But the most important part is the nursery:

      # That's the biggest API difference
      async with trio.open_nursery() as nursery:
          for url in urls:
              nursery.start_soon(fetch_url, url)

      The with block scopes all the tasks, meaning everything that is started inside that context manager is guaranteed to be finished (or terminated) when it exits. First, the API is better than expecting the user to wait manually like with asyncio.gather: you cannot start concurrent coroutines without a clear scope in trio, it doesn't rely on the coder's discipline. But under the hood, the design is also different. The whole bunch of coroutines you group and start can be canceled easily, because trio always knows where things begin and end.

      As soon as things get complicated, code with a curio-like design becomes radically simpler than code with an asyncio-like design.
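
      For contrast, here is a minimal sketch of the asyncio-style pattern the passage compares against, where waiting is the caller's responsibility. `asyncio.sleep` stands in for the HTTP request, and the names `fetch` and `results` are hypothetical, chosen only for this illustration.

      ```python
      import asyncio

      results = []


      async def fetch(name: str, delay: float) -> None:
          # asyncio.sleep stands in for the network call
          await asyncio.sleep(delay)
          results.append(name)


      async def main() -> None:
          # Tasks are scheduled as soon as they are created, with no enclosing scope
          tasks = [asyncio.create_task(fetch(name, 0.05)) for name in ("a", "b", "c")]
          # Nothing forces this line: leaving it out lets main() return while
          # the tasks are still pending, and asyncio.run() then cancels them
          await asyncio.gather(*tasks)


      asyncio.run(main())
      print(sorted(results))
      ```

      In the trio version there is no equivalent line to forget, because leaving the `async with` block is what waits for the tasks.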

    2. Because of the way gevent works, you can take a blocking script, and with very few modifications, make it async. Let's take the original stdlib one, and convert it to gevent:

      ```
      import re
      import time

      import gevent
      from gevent import monkey

      monkey.patch_all()  # THIS MUST BE DONE BEFORE IMPORTING URLLIB

      from urllib.request import Request, urlopen

      urls = [
          "https://www.bitecode.dev/p/relieving-your-python-packaging-pain",
          "https://www.bitecode.dev/p/hype-cycles",
          "https://www.bitecode.dev/p/why-not-tell-people-to-simply-use",
          "https://www.bitecode.dev/p/nobody-ever-paid-me-for-code",
          "https://www.bitecode.dev/p/python-cocktail-mix-a-context-manager",
          "https://www.bitecode.dev/p/the-costly-mistake-so-many-makes",
          "https://www.bitecode.dev/p/the-weirdest-python-keyword",
      ]

      # Non-greedy capture of the page title, tolerating attributes on the tag
      title_pattern = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE)

      user_agent = (
          "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/116.0"
      )


      # We move the fetching into a function so we can isolate it into a green thread
      def fetch_url(url):
          start_time = time.time()

          headers = {"User-Agent": user_agent}

          with urlopen(Request(url, headers=headers)) as response:
              html_content = response.read().decode("utf-8")
              match = title_pattern.search(html_content)
              title = match.group(1) if match else "Unknown"

              print(f"URL: {url}\nTitle: {title}")

          end_time = time.time()
          elapsed_time = end_time - start_time

          print(f"Time taken: {elapsed_time:.4f} seconds\n")


      def main():
          global_start_time = time.time()

          # Here is where we convert synchronous calls into async ones
          greenlets = [gevent.spawn(fetch_url, url) for url in urls]
          gevent.joinall(greenlets)

          global_end_time = time.time()
          global_elapsed_time = global_end_time - global_start_time

          print(f"Total time taken: {global_elapsed_time:.4f} seconds")


      main()
      ```

      No async, no await. No special lib except for gevent. In fact it would work with the requests lib just as well. Very few modifications are needed, for a net perf gain.

      The only danger is if you call gevent.monkey.patch_all() too late. You get a cryptic error that crashes your program.
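
      Why the order matters can be shown with a toy model of monkey patching that uses only the standard library. This is a sketch of the general mechanism, not gevent's actual implementation: `fake_net`, `blocking_fetch` and `cooperative_fetch` are hypothetical stand-ins for a networking module and its original and patched functions.

      ```python
      import types

      # A hand-built stand-in for a networking module (hypothetical)
      fake_net = types.ModuleType("fake_net")


      def blocking_fetch(url: str) -> str:
          return f"blocking:{url}"


      def cooperative_fetch(url: str) -> str:
          return f"cooperative:{url}"


      fake_net.fetch = blocking_fetch

      # "Imported too early": this name is bound to the original function object
      early_fetch = fake_net.fetch

      # The monkey patch: swap the module attribute at runtime
      fake_net.fetch = cooperative_fetch

      # "Imported after patching": the lookup now finds the replacement
      late_fetch = fake_net.fetch

      early_result = early_fetch("a")
      late_result = late_fetch("a")
      print(early_result)
      print(late_result)
      ```

      Any name bound before the patch keeps pointing at the blocking version, which is why the patch has to run before the modules that use networking are imported.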

    3. So what's the deal with asyncio, twisted, gevent, trio and all that stuff?

      asyncio

      asyncio is the modern module for asynchronous network programming provided with the python stdlib since 3.4. In other words, it's the default stuff at your disposal if you want to code something without waiting on the network.

      asyncio replaces the old deprecated asyncore module. It is quite low level, so while you can manually code most network-related things with it, you are still at the level of TCP or UDP. If you want higher-level protocols, like FTP, HTTP or SSH, you have to either code it yourself, or install a third party library or module.

      Because asyncio is the default solution, it has the biggest ecosystem of 3rd party libs, and pretty much everything async strives to be compatible with it directly, or through compatibility layers like anyio.
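
      To give a feel for the TCP level mentioned above, here is a minimal sketch using asyncio's stream helpers: a server that upper-cases one line, and a client that talks to it over the loopback interface. The line-based "protocol" and the names `handle_client` and `main` are assumptions made for this illustration.

      ```python
      import asyncio


      async def handle_client(reader, writer):
          # Read one line from the client and send it back upper-cased
          data = await reader.readline()
          writer.write(data.upper())
          await writer.drain()
          writer.close()
          await writer.wait_closed()


      async def main() -> bytes:
          # Port 0 asks the OS for any free port on the loopback interface
          server = await asyncio.start_server(handle_client, "127.0.0.1", 0)
          port = server.sockets[0].getsockname()[1]
          async with server:
              reader, writer = await asyncio.open_connection("127.0.0.1", port)
              writer.write(b"hello\n")
              await writer.drain()
              reply = await reader.readline()
              writer.close()
              await writer.wait_closed()
          return reply


      reply = asyncio.run(main())
      print(reply)
      ```

      Everything above the raw bytes, such as HTTP parsing, is left to you or to a third-party library, which is the point the passage makes.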

      Twisted

      20 years ago, there was no asyncio, there was no async/await, nodejs didn't exist and Python 3 was half a decade away. Yet, it was the .com bubble, everything needed to be connected now. And so was born twisted, the grandfather of all the asynchronous frameworks we have today. Twisted ecosystem grew to include everything, from mail to ssh.

      To this day, twisted is still a robust and versatile tool. But you do pay the price of its age. It doesn't follow PEP8 very well, and the design leans on the heavy side.

      Tornado

      Tornado was developed after Twisted, by FriendFeed, at this weird 2005-2015 web dev period where everything needed to be social web scale. It was like Twisted, but touted to be faster, and was higher level. Out of the box, the HTTP story is way nicer.

      Today, you are unlikely to use Tornado unless you work at Facebook or contribute to jupyter. After all, if you want to make async web things, the default tool is FastAPI in 2023.

      gevent

      Gevent came about in 2009, the same year as Tornado, but with a fundamentally different design. Instead of attempting to provide an asynchronous API, it decided to do black magic. When you use gevent, you call from gevent import monkey; monkey.patch_all() and it changes the underlying mechanism of Python networking, making everything non-blocking.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      This is an interesting, timely and informative article. The authors used publicly available data (made available by a funding agency) to examine some of the academic characteristics of the individual recipients of the National Institutes of Health (NIH) K99/R00 award program during the entire history of this funding mechanism (17 years, total ~ 4 billion US dollars (annual investment of ~230 million USD)). The analysis focuses on the pedigree and the NIH funding portfolio of the institutions hosting the K99 awardees as postdoctoral researchers and the institutions hiring these individuals. The authors also analyze the data by gender, by whether the R00 portion of the awards eventually gets activated and based on whether the awardees stayed/were hired as faculty at their K99 (postdoctoral) host institution or moved elsewhere. The authors further sought to examine the rates of funding for those in systematically marginalized groups by analyzing the patterns of receiving K99 awards and hiring K99 awardees at historically black colleges and universities.

      The goals and analysis are reasonable and the limitations of the data are described adequately. It is worth noting that some of the observed funding and hiring traits are in line with the Matthew effect in science (https://www.science.org/doi/10.1126/science.159.3810.56) and in science funding (https://www.pnas.org/doi/10.1073/pnas.1719557115). Overall, the article is a valuable addition to the research culture literature examining the academic funding and hiring traits in the United States. The findings can provide further insights for the leadership at funding and hiring institutions and science policy makers for individual and large-scale improvements that can benefit the scientific community.

      Thank you for these comments. We have incorporated the articles referenced on the Matthew effect into the first paragraph of the Discussion of our revised preprint.

      Reviewer #2 (Public Review):

      Early career funding success has an immense impact on later funding success and faculty persistence, as evidenced by well-documented "rich-get-richer" or "Matthew effect" phenomena in science (e.g., Bol et al. 2018, PNAS). Woitowich et al. examined publicly available data on the distribution of the National Institutes of Health's K99/R00 awards - an early career postdoc-to-faculty transition funding mechanism - and showed that although 85% of K99 awardees successfully transitioned into faculty, disparities in subsequent R01 grant obtainment emerged along three characteristics: researcher mobility, gender, and institution. Men who moved to a top-25 NIH funded institution in their postdoc-to-faculty transition experienced the shortest median time to receiving an R01 award, 4.6 years, in contrast to the median 7.4 years for women working at less well-funded schools who remained at their postdoc institutions. This result is consistent with prior evidence of funding disparities by gender and institution type. The finding that researcher mobility has the largest effect on subsequent funding success is key and novel, and enhances previous work showing the relationship between mobility and one's access to resources, collaborators, or research objects (e.g., Sugimoto and Larivière, 2023, Equity for Women in Science (Harvard University Press)).

      These results empirically demonstrate that even after receiving a prestigious early career grant, researchers with less mobility belonging to disadvantaged groups at less-resourced institutions continue to experience barriers that delay them from receiving their next major grant. This result has important policy implications aimed at reducing funding disparities - mainly that interventions that focus solely on early career or early stage investigator funding alone will not achieve the desired outcome of improving faculty diversity.

      The authors also highlight two incredible facts: No postdoc at a historically Black college or university (HBCU) has been awarded a K99 since the program's launch. And out of all 2,847 R00 awards given thus far, only two have been made to faculty at HBCUs. Given the track record of HBCUs for improving diversity in STEM contexts, this distribution of awards is a massive oversight that demands attention.

      At no fault of the authors, the analysis is limited to only examining K99 awardees and not those who applied but did not receive the award. This limitation is solely due to the lack of data made publicly available by the NIH. If this data were available, this study would have been able to compare the trajectory of winners versus losers and therefore could potentially quantify the impact of the award itself on later funding success, much like the landmark Bol et al. (2018) paper that followed the careers of winners of an early career grant scheme in the Netherlands. Such an analysis would also provide new insights that would inform policy.

      Although data on applications versus awards for the K99/R00 mechanism are limited, there exists data for applicant race and ethnicity for the 2007-2017 period, which were made available by a Freedom of Information Act request through the now defunct Rescuing Biomedical Research Initiative: https://web.archive.org/web/20180723171128/http://rescuingbiomedicalresearch.org/blog/examining-distribution-k99r00-awards-race/. These results are not presently discussed in the paper, but are highly relevant given the discussion of K99 award impacts on the sociodemographic composition of U.S. biomedical faculty. From 2007 to 2017, the K99 award rate for white applicants was 31.0% compared to 26.7% for Asian applicants and 16.2% for Black applicants. In terms of award totals, these funding rates amount to 1,384 awards to white applicants, 610 to Asian applicants, and 25 to Black applicants for the entire 2007-2017 period. And in terms of R00 awards, or successful faculty transitions: whereas 77.0% of white K99 awardees received an R00 award, the conversion rate for Asian and Black K99 awardees was lower, at 76.1% and 60.0%, respectively. Regarding this K99-to-R00 transition rate, Woitowich et al. found no difference by gender (Table 2). These results are consistent with a growing body of literature that shows that while there have been improvements to equity in funding outcomes by gender, similar improvements for achieving racial equity are lagging.

      The conclusions are well-supported by the data, and limitations of the data and the name-gender matching algorithm are described satisfactorily.

      One aspect that the authors should expand or comment on is the change in the rate of K99 to R00 conversions. Since 2016, while the absolute number of K99 and R00 awards has been increasing, the percentage of R00 conversions appears to be decreasing, especially in 2020 and 2021. This observation is not clearly stated or shown in Figure 1 but is an important point - if the effectiveness of the K99/R00 mechanism for postdoc-to-faculty transitions has been decreasing lately, then something is undermining the purpose of this mechanism. This result bears emphasis and potentially discussion for possible reasons for why this is happening.

      Thank you for these insightful comments. We now calculate a rolling conversion rate for K99 to R00 awards which shows there is not as much of a decline in conversion from K99 to R00 (Fig 1B). We still see a slight decline in 2021 and 2022. 468 K99 awards are from 2020 or later so they may still convert to the R00 phase. Thus it is difficult to draw conclusions about 2021/2022 yet. As more time passes, we may better be able to determine whether or not significant alteration from normal occurred in these years, presumably due to pressures from the Covid-19 pandemic. We also thank you for providing the details of the FOIA request. We have included a discussion of these data in the discussion.

      Reviewer #3 (Public Review):

      The researchers aim to add to the literature on faculty career pathways with particular attention to how gender disparities persist in the career and funding opportunities of researchers. The researchers also examine aspects of institutional prestige that can further amplify funding and career disparities. While some factors about individuals' pathways to faculty lines are known, including the prospects of certain K award recipients, the current study provides the only known examination of the K99/R00 awardees and their pathways.

      Strengths:

      The authors establish a clear overview of the institutional locations of K99 and R00 awardees, the pathways for K99-to-R00 researchers, and the gendered and institutional patterns of such pathways. For example, there is a clear institutional hierarchy of hiring for K99/R00 researchers that echoes previous research on the rigid faculty hiring networks across fields, and a pivotal difference in the time between awards that can impact faculty careers. Moreover, there are regional clusters of hiring in certain parts of the US where multiple research universities are located. In addition, documenting the pathways of HBCU faculty is an important extension of the Wapman et al. study (among others from that research group), and provides a more nuanced look at the pathways of faculty beyond the oft-discussed high status institutions. (However, there is a need for more refinement in this segment of the analyses, as discussed further below.) Also, the authors provide important caveats throughout the manuscript about the study's findings that show careful attention to the complexity of these patterns and an attempt to limit misinterpretation by readers.

      Weaknesses:

      The authors reference institutional prestige in relation to some of the findings, but there's no specific measure of institutional prestige included in the analyses. If being identified as a top 25 NIH-funded institution is the proximate measure for prestige in the study, then more justification of how that relates to previous studies' measures of institutional prestige and status is needed to further clarify the interpretations offered in the manuscript.

      The identification of institutional funding disparities impacting HBCUs is an important finding and highlights another aspect of how faculty at these institutions are under resourced and arguably undervalued in their research contributions. However, a lingering question exists: why compare HBCUs with Harvard? What are the theoretical and/or methodological justifications for such comparisons? This comparison lends itself to reifying the status hierarchy of institutions that perpetuate funding and career inequalities at the heart of the current manuscript. If aggregating all HBCU faculty together, then a comparable grouping for comparison is needed, not just one institution. Perhaps looking at the top 25 NIH funded institutions could be one way of providing a clearer comparison. Related to this point is the confusing inclusion of Gallaudet in Figure 6 as it is not an officially identified HBCU. Was this institution also included in the HBCU-related calculations?

      Thank you for this comment. We agree this comparison perpetuates the perception of the prestige hierarchy and is problematic. We now compare all institutions in the top 25 NIH funding category to all HBCUs. Thank you also for identifying our error in mis-coding Gallaudet as an HBCU. We have corrected this in the current version.

      There is a clear connection that is missed in the current iteration of the manuscript, derived from the work of Robert Merton and others about cumulative advantages in science and the "Matthew effect." While aspects of this connection are noted in the manuscript, such as well-resourced institutions (those with the most NIH funding in this circumstance) hiring each other's K99/R00 awardees, elaborating on these connections is important for readers to understand the central processes by which a rigid hierarchy of funding and career opportunities exists around these pathways. The work the authors build on from Daniel Larremore, Aaron Clauset, and their colleagues has also incorporated these important theoretical connections from the sociology of knowledge and science, and it would provide a more interdisciplinary lens and further depth to understanding the faculty career inequalities documented in the current study.

      Reviewer #1 (Recommendations For The Authors):

      Comments to authors:

      1. For the benefit of the general reader, it would be informative to mention the amount of annual NIH investment in the K99 funding mechanism in the text (230 awards representing a ~230 million US dollar investment).

      Thank you for this suggestion. We have added that this is ~$25 million investment annually.

      1. It is worth noting that some of the observed funding and hiring traits resemble the Matthew effect, discussed in: The Matthew effect in science: https://www.science.org/doi/10.1126/science.159.3810.56

      The Matthew effect in science funding: https://www.pnas.org/doi/10.1073/pnas.1719557115

      It would be of value to cite these for further context for the readers.

      Thank you for this suggestion. We have included these references and briefly discussed the Matthew effect in the first paragraph of the Discussion.

      1. Figs 3, 6 and Fig S1 are hard to read without zooming in due to their format and don't work great within a letter size page but can work if they are also linked to a zoomable web version. It would make sense to have an online navigable/searchable/selectable version. But when the reader zooms out, there are patterns that reflect what points the authors are making (though those could be illustrated differently). These figures are really made for online webapp visualization (such as Shiny in R).

      We agree with this comment and have used the “googleVis()” package in R to put together interactive Sankey diagrams. These can be found at: https://dantyrr.github.io/K99-R00-analysis/ and they are referenced in the manuscript.

      1. The abstract states 85% of awardees get R00 awards. That appears to come from 198/234 (page 6) though it's not explicitly stated, and other ratios give different answers (e.g., 1-304/3475 = 91%) but the 85% seems to be the right one. That first paragraph of the results could be clearer. Also, in the middle of page three the number given is 90% so something is inconsistent. For Figure 1A, given the methodology it should be possible to calculate a rolling conversion rate as "R00(t) / K99(t-1)" (and a similarly-calculated cumulative rate).

      Thank you for catching these errors. These were introduced because there are R00 awardees that did not have extramural K99 awards. These are intramural NIH K99 awardees but there is no public data on these awardees. The correct number is 78% of K99 awardees that transitioned to the R00 phase. We have also calculated the rolling conversion rate which is 89% if you exclude the first 2 years of the program (when the first awardees were within the 2-yr K99 period) and final 2 years (when most recent K99 awardees were still within their first 2 years of the K99 period).
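The rolling rate the reviewer proposes, R00(t) / K99(t-1), and its cumulative counterpart can be sketched in a few lines. This is only an illustration: the yearly counts below are invented and are not the study's data, and the function names are hypothetical.

```python
# Hypothetical yearly award counts, for illustration only (not the study's data).
k99_awards = {2015: 100, 2016: 110, 2017: 120, 2018: 125}
r00_awards = {2016: 80, 2017: 95, 2018: 100, 2019: 105}


def rolling_conversion(k99, r00, lag=1):
    """Return {year: R00(year) / K99(year - lag)} where both counts exist."""
    rates = {}
    for year in sorted(r00):
        k99_count = k99.get(year - lag)
        if k99_count:  # skip missing or zero denominators
            rates[year] = r00[year] / k99_count
    return rates


def cumulative_conversion(k99, r00, lag=1):
    """Return running R00 total divided by the lagged running K99 total."""
    rates = {}
    r00_total = 0
    k99_total = 0
    for year in sorted(r00):
        if (year - lag) not in k99:
            continue
        r00_total += r00[year]
        k99_total += k99[year - lag]
        rates[year] = r00_total / k99_total
    return rates


rates = rolling_conversion(k99_awards, r00_awards)
cumulative = cumulative_conversion(k99_awards, r00_awards)

for year in rates:
    print(year, f"rolling={rates[year]:.1%}", f"cumulative={cumulative[year]:.1%}")
```

A one-year lag follows the reviewer's formula. Because the K99 phase can last up to two years, a lag of 2 may be worth comparing, which is why `lag` is a parameter.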

      1. Assuming that 85% is the correct number, is there any information/insight into why ~1/6 of awardees do not continue to R00, which seems high given that only two years passes - that's a lot of awardees not getting R00 positions.

      We are unsure of why these don’t convert. In the revised version of the manuscript, we speculate on this in the 4th paragraph of the discussion:

      The factors that prevented the other 302 K99 awardees from 2019 and earlier from converting their K99-R00 grants are cause for concern within our greater academic community. Possible explanations include leaving the biomedical workforce, accepting tenure-track positions or other positions abroad, or simply not securing a tenable tenure-track offer.

      1. It looks like perhaps a non-zero number of K99s are just one year and not two (e.g., see 2006 in Fig 1A, which should not appear if all 2006 awards were 2 years). What is the typical percentage of K99s not activated for a second year, and is this a sizable % of the 15% not converting to R00?

      This is an interesting question. We didn’t originally look into this and the dataset that we originally downloaded from NIH reporter included a significant number of duplicates for the grants because year 1 of the K99 was listed on its own line and year 2 was listed on a different line. The first step in curating the data was to delete the duplicate values so we only had one entry per person. Unfortunately based on sorting of the data tables, sometimes the year 1 appeared above year 2 and at other times year 2 appeared before year 1. Because none of the data we were interested in are benchmarked to K99 start date, we removed the duplicate values non-specifically. With the dataset we currently have, we would not be able to tell which individuals dropped out (didn’t convert to R00) during the first or second year of the K99. In order to do this we would have to download the raw data from NIH reporter again and curate it again. We may do this in the future but for the purpose of publishing the current manuscript we prefer to focus our efforts on other aspects of the revision.

      1. Further down page 3, the authors state that "men typically experience 2-3% greater funding success rates" is ambiguous, as rates are themselves a percentage. So, is it 2-3% greater as in 23% vs 20%, or is it 2-3% greater as in 20.6% vs 20%? Please clarify the language.

      Thank you for asking for this clarification. We have updated the text here to reflect that we mean “23% vs 20%”.
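The ambiguity raised here is the difference between percentage points and relative percent. A minimal arithmetic sketch, using the 20% baseline from the reviewer's own example (the variable names are made up):

```python
baseline = 0.20  # 20% success rate, taken from the reviewer's example

# Reading 1: three percentage points higher (the meaning the authors confirm).
absolute_reading = baseline + 0.03

# Reading 2: three percent higher in relative terms.
relative_reading = baseline * 1.03

print(f"{absolute_reading:.1%} vs {relative_reading:.1%}")  # 23.0% vs 20.6%
```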

      1. Metrics such as time to first R01 are compared internally within the study set, which yields interesting insights, but more could be done to benchmark these metrics to non-K99 scientists.

      We agree with the reviewer that this would be ideal; however, we feel that it is out of the scope of this manuscript. We may examine this in the future.

      1. In the text, several times percentages are being referred to when the figures cited do not show percentages. For example (page 6) 'proportion of awardees that stayed at the same institution declined to about 20% where it has remained consistent (Fig 1B)' - Figure 1B does not show percentages, instead the reader would need to work out from the raw numbers what the pattern of percentages might look like. It's fine (great even) to provide the raw numbers, but would be great to show the percentages as well. This happened for multiple graphs.

      Thank you for this comment. We agree that showing the percentage would be beneficial so we have included the percentages in Figure 1 for the conversion rate. We also added a standalone figure panel for the rolling conversion rate for Figure 1. For Figure 4, we have also included a right Y-axis to better indicate the % women.

      1. Figure 4 - putting the %women on a 0-250 scale makes it difficult to see the changes in that curve. Please replot it as a separate graph with an appropriate scale (30-50%? 30-70%?)

      Thank you for this comment. We have made this edit.

      1. Figure 5 - The table appears inconsistent - the Moved/Stayed HR is 1.411 suggesting that moving is better for reducing time to R01, but then Woman/Man is 1.208, so one of these pairs needs to be written in the opposite order to have the table make sense (intended to be listed as 'better/worse'?)

      Thank you for noticing this. In the revised manuscript we have re-run the Cox proportional hazards model using the R package “survival” and the function “coxph()”. There were minor differences in the hazard ratios using this package instead of GraphPad Prism; however, the R package is much more widely used than Prism for these types of analysis. We present the new data in the table in Figure 5B in the revised manuscript. We now present the “detrimental” Cox hazard value for each variable (i.e. 0.7095 for the mobility [moved/stayed]). We also underlined the variable which was detrimental to receiving an R01 award earlier.

      1. Figure 5's graph appears strange. All the lines have an appearance of stochasticity but are actually multiples of each other, rising exactly in sync. Are these actually modeled lines? If so, why not instead actually draw the lines based on the real data from the real groups depicted, and give the n for each group?

      Thank you for picking this up. The software we originally used to plot the graphs did plot modeled lines instead of the actual data. We have re-run the Cox proportional hazards model using the R “survival” package v3.5-5 and the coxph() and survfit() functions. The updated data are in Figure 5 of the revised manuscript.

      1. Table 1 should note that each column sums to 100%.

      This is a good suggestion. In the revised manuscript, we have added a row to the table to indicate the column total N and %.

      1. The authors discuss how k99/R00 grant reviewing process may have to change but the k99 awards also impact the faculty hiring ecosystem as well. There are faculty hiring job ads explicitly requesting or indicating preference towards k99 holders and the results described in this article show that k99 awarding is biased towards particular demographics at select wealthy institutions. Of course, collective/central action is almost always more effective/impactful (especially in shorter time line) than individual elective action. In other words, NIH changing granting patterns would likely work better than encouraging faculty searches to change the weight they give to K99s, because there are many searches and just one NIH. But these are not mutually exclusive and individual action can still help when central action isn't done (if the NIH does not change the k99/R00 grant review process for more inclusive funding and does not increase the number of annual k99 awards hence the annual budget for this award mechanism) and it would be good to have this discussed in the manuscript.

      Thank you for this comment and thoughtful insights. We have included additional discussion on this in the final paragraph of the discussion.

      Reviewer #2 (Recommendations For The Authors):

      Thank you for conducting this important work. On top of some thoughts I have described in the public review (in particular, Chris Pickett's FOIA data on K99/R00 outcomes by applicant race and ethnicity), I only have a few comments for potential improvements to this paper:

      1. The comparison of K99-R00 transition rates by gender was interesting. However, I missed the analysis on the K99-R00 transition rates by institution (by type or by top-25 NIH funded institution versus not). I think this analysis may be buried somewhere in the more nuanced descriptions about faculty flows from one institution type to another, but I was not able to locate it. I wonder if the authors could consider dedicating a subsection to specifically describing the transition rate by institution type, creating a table equivalent to Table 2. This section would probably fit best somewhere before the authors dive into the nuances of self-hires and faculty flows.

      Said another way: As I was reading, I felt I was missing an answer to a simple question - are there differences in conversion rates by institution type (however you define institution type, as an MSI or non MSI, or top-25 NIH funded versus not)?

      Thank you for this suggestion. We have created the table (Table 3 and Table 4) in the revised manuscript. We also made a new figure (now figure 5 in the revised manuscript). This was an interesting way to look at the data and it is very clear that the number of K99 and R00 awards is heavily concentrated within the institutions that have the highest NIH funding. We have added a paragraph in the results in a new section entitled “K99 and R00 awards are concentrated within the highest funded institutions”.

      1. Regarding the comparison of HBCUs and Harvard: this analysis was elucidating, but I am not sure if the framing of this analysis as pertaining to "systematically marginalized groups" - see second sentence in the section, "Faculty doctorates differ between Harvard and HBCUs" is appropriate. While it is true that proportionally more faculty at HBCUs are from marginalized groups, there are also many faculty at HBCUs who are from privileged or advantaged backgrounds (e.g., white, men, educated at elite institutions). It would be more accurate to rephrase the second sentence to say something along the lines of, "We sought to examine the rates of funding for those at historically under-funded institutions." I recommend that the authors comb the paper for any other potential places in the text that conflate systemic marginalization with institution type, and rephrase as needed for accuracy.

      Thank you for pointing this out. This is an extremely important point and we have removed any instances we could find where we conflate systemically marginalized groups with institution type.

      1. I strongly recommend Sugimoto and Larivière (2023)'s new book, Equity for Women in Science, which has an entire section dedicated to previous work investigating how researcher mobility impacts access to resources, collaborations, et cetera (Chapter 5 on Mobility; other chapters on Funding are also relevant but I hone in on Mobility since this is such a key result of this work). I think this chapter would provide significant food-for-thought and background that could strengthen the Discussion section of the paper.

      Thank you for this suggestion. We have added some discussion of mobility in the first paragraph of the Discussion.

      1. I appreciated the subsection headings that described key results (e.g., "Institutions with the most NIH funding tend to hire K99/R00 awardees from other institutions with the most funding"; "K99/R00 awardee self-hires are more common at institutions with the top NIH funding.") This paper structure made it easier for me to ensure that I was getting the intended takeaway from a figure or section. But partway through the paper, the subheadings changed to being less declarative and therefore less informative (e.g., "Gender of K99/R00 awardees"; "Factors influencing K99/R00 awardee future funding success"). It would be great to rephrase these boilerplate subsection headers to be more declarative, like earlier subsection headings. For example, maybe say "Men receive the majority of K99 awards" or "No gender difference in the rate of conversion from K99 to R00" or something to that effect, depending on what result the authors wish to emphasize.

      Thank you for this comment. This is a very good point. We have re-worded the more generic headings in the revised version.

      1. Lastly, I would like to share a question that came to my mind that involves an additional analysis, but is work that is (probably) out-of-the-scope of this paper, but could instead be a separate paper or product. Circling back to Chris Pickett's FOIA-ed data on K99/R00 funding outcomes by applicant race and ethnicity (https://web.archive.org/web/20180723171128/http://rescuingbiomedicalresearch.org/blog/examining-distribution-k99r00-awards-race/): Given that Pickett's numbers provide incontrovertible information on the number of awards to various racial and ethnic groups, I wonder if it is possible to use this information as an "answer key" to (1) check the accuracy of an algorithm that assigns race based on name for applications in your analysis but for 2007-2017 period, and, (2) if the results are reasonable, then examine the dataset with race and ethnicity information. Some recent papers performing large-scale bibliometric analyses have applied such algorithms (e.g., see Kozlowski et al. 2022 PNAS Intersectional inequalities in science) and I wonder if they could be useful, or at least tested, here. Again, Pickett's data would serve as the benchmark to see if the algorithm produces numbers that are consistent with the actual funding outcomes; if they're not wildly off, or perhaps accurate for some groups but not others, there might be something here.

      This is a really insightful comment. We have discussed whether we could assign ethnicity based on an algorithm and check based on Chris Pickett’s data. We agree that it is beyond the scope of this article, but has potential for future research.

      Reviewer #3 (Recommendations For The Authors):

      -In the methods section, it would be helpful to provide an overview of the number of universities, departments, and faculty represented in the data analyzed in the study.

      Thank you for this comment. We agree with the reviewer. We have added a section to the results discussing the distribution of different types of institutions. We also added Table 3 and Table 4 and a new Figure 5 describing these. Regarding the faculty, we have discussed the demographics of the K99 and R00 awardees as best as we could. We do not have data on which faculty laboratories the K99 awardees were in when they received their awards. This information is not available through NIH reporter.

      -I would consider incorporating, or at least citing, Jeff Lockhart and colleagues' recent Nature Human Behaviour article, "Name-based demographic inference and the unequal distribution of misrecognition," to provide readers with an additional resource and more information about the likelihood of misattribution, along with general cautionary notes about using gender and race/ethnicity ascription/imputation approaches and tools for research.

      Thank you for bringing this reference to our attention. We have incorporated this into the methods section describing our name-based gender determination.

      -In the next to last sentence under the final paragraph of the methods section, there looks to be a typo as it should read "K99 or R00," not "K00" as currently written.

      Thank you for catching this. We have now corrected it.

      -Clarifying some of the data and measures used are necessary to limit confusion and misinterpretations of the study's findings.

      Thank you. We have significantly updated the revised manuscript and hope that it is more clear.

      -Elaborating more on the gender inequality notable in the Cox proportional hazard model would strengthen the authors' point about persistent gender inequalities within the K99/R00 funding mechanism and pathways. In its current iteration, the findings are somewhat buried by the discussion of institutional differences, but when we look at the findings and the plot associated with the model, we notice that men have more advantages than women in funding and institutional location.

      Thank you for highlighting this. This is true and we have elaborated on the gender inequality in the revised version of the manuscript.

      -Also for the Cox proportional hazard model, I would consider exploring the inclusion of data that can further clarify the biomedical research infrastructure of institutions. For example, in the conversation about the differences between Princeton and other universities including other Ivies, it's important to note that Princeton does not have a medical school. Moreover, other institutions do not operate or are affiliated with a hospital. Adding more data to the model that can better contextualize the research infrastructure around researchers with NIH awards beyond the size of the NIH portfolio can shed light on possibly other important institutional differences that undergird these inequalities.

      Thank you for this comment. We have added additional details about the institutional type; however, examining whether institutions are attached to a hospital (or are themselves a hospital, like MGH) or whether institutions include a medical school may be difficult. We would have to manually code these and then determine whether or not the award recipient was affiliated with a department within that entity. We believe that this is a fascinating question but that it is out of the scope of the present manuscript. This is something that we will look into for potential future publications.

      -Throughout the manuscript there's usage of "elite" and "prestigious" that are somewhat ambiguous regarding what exactly they are referring to about institutional characteristics. This is a common issue in the literature, but trying to clarify what these terms specifically mean for the current study and checking for consistent usage with limited interchangeability that can add confusion for readers about what is being referred to would give added strength to the conversation provided by the authors.

      Thank you for this suggestion. Based on these comments and those by the other reviewers, in the revised version of the manuscript, we have limited the use of “elite” and “prestigious” to describe institutions in order not to perpetuate biases toward certain institutions.

      -In relation to the discussion at the end of the manuscript of the longer time to award noted for researchers who stay at the same institutions, another possibility for the disparity could be their reliance for service work (e.g., hiring committees, departmental committees, supporting graduate students through mentoring and/or dissertation committee work, etc.) in their institutions given their knowledge of and experience within it.

      Thank you for this suggestion. We have added 2 sentences to the discussion reflecting this possibility.

      -Engaging with how STEM professional cultures can perpetuate these funding disparities and related hiring and career outcomes could enhance the contributions of the study. In relation to STEM professional cultures, engaging with the work of Mary Blair-Loy and Erin Cech in their recent book, Misconceiving Merit, could help provide additional insights for readers.

      Thank you for these comments. We have incorporated edits to the revised manuscript reflecting the work of Erin Cech and Mary Blair-Loy.

    1. Let us see an example.

      HTML

          <!DOCTYPE html>
          <html lang="en">
            <head>
              <meta charset="UTF-8" />
              <meta name="viewport" content="width=device-width, initial-scale=1.0" />
              <link rel="stylesheet" href="style.css" />
              <title>CSS text-decoration</title>
            </head>
            <body>
              <h1>Shorthand Text Decoration</h1>
              <p>
                We can <span>decorate</span> our text using CSS text-decoration property.
              </p>
            </body>
          </html>

      CSS

          /* shorthand property to set text-decoration*/
          p span {
            text-decoration: underline wavy red 2px;
          }

          /* This above code is equivalent to
          p span {
            text-decoration-line: underline;
            text-decoration-style: wavy;
            text-decoration-color: red;
            text-decoration-thickness: 2px;
          }
          */

      Browser Output

      Different section

    1. For instance, a binary search tree might have the invariant that for every node, the key of the node's left child is less than the node's own key. A correctly written insertion function for this tree will maintain that invariant. As you can tell, that's not the sort of thing you can store in a variable: it's more a statement about the program. By figuring out what sort of invariants your program should maintain, then reviewing your code to make sure that it actually maintains those invariants, you can avoid logical errors in your code.
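An invariant like this can be written down as an executable check. The sketch below is an illustration, not code from the quoted source: `Node`, `insert`, and `holds_invariant` are made-up names, and the check enforces the usual, slightly stronger binary-search-tree invariant (every key in a left subtree is smaller than the node's key, and every key in a right subtree is larger).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root, key):
    """Insert key and return the (possibly new) root; duplicates are ignored."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root


def holds_invariant(node, low=None, high=None):
    """Check that every key lies strictly between the bounds set by its ancestors."""
    if node is None:
        return True
    if low is not None and node.key <= low:
        return False
    if high is not None and node.key >= high:
        return False
    return (holds_invariant(node.left, low, node.key)
            and holds_invariant(node.right, node.key, high))


root = None
for key in [8, 3, 10, 1, 6, 14]:
    root = insert(root, key)

print(holds_invariant(root))  # True: insert() maintains the invariant
```

Running such a check in tests after each mutating operation is one way to review that the code actually maintains the invariant.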
    1. The true distinction: static vs. dynamic

      The true distinction that we should be teaching students is the difference between properties of languages that can be determined statically—that is, by just staring at the code without running it—and properties that can only be known dynamically, during runtime.

      Notice that I said “properties” and not “languages”. Every programming language chooses its own set of properties that can be determined either statically or dynamically, and taken together, this makes a language more “dynamic” or more “static”. Static versus dynamic is a spectrum, and yes, Python falls on the more dynamic end of the spectrum. A language like Java has far more static features than Python, but even Java includes things like reflection, which is inarguably a dynamic feature.
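Python itself shows both kinds of property. In this sketch, the built-in `compile()` performs the checks that need no execution, while `exec()` surfaces what is only known at run time. The source strings and variable names are invented for the example.

```python
# Statically detectable: malformed syntax is rejected before anything runs.
try:
    compile("result = = 1", "<static>", "exec")
    static_outcome = "compiled"
except SyntaxError:
    static_outcome = "SyntaxError at compile time"

# Dynamically detectable: this compiles without complaint, because whether a
# name exists is only resolved when the line actually executes.
code = compile("result = undefined_name + 1", "<dynamic>", "exec")
try:
    exec(code, {})
    dynamic_outcome = "ran"
except NameError:
    dynamic_outcome = "NameError at run time"

print(static_outcome)
print(dynamic_outcome)
```

A language with more static features would typically reject the second snippet before running it, which is the sense in which Python sits toward the dynamic end of the spectrum.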

    2. “Compiled vs. Interpreted” limits what we think is possible with programming languages

      For instance, JavaScript is commonly lumped into the “interpreted language” category. But for a while, JavaScript running in Google Chrome would never be interpreted—instead, JavaScript was compiled directly to machine code! As a result, JavaScript can keep pace with C++.

    3. When you run your Python program using [CPython], the code is parsed and converted to an internal bytecode format, which is then executed inside the VM. From the user’s perspective, this is clearly an interpreter—they run their program from source. But if you look under CPython’s scaly skin, you’ll see that there is definitely some compiling going on. The answer is that it is both. CPython is an interpreter, and it has a compiler.
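The compile step can be observed from inside Python with the standard `dis` module. A small sketch: `add_one` is a made-up function, and because opcode names vary between CPython versions, no particular output is assumed.

```python
import dis


def add_one(x):
    return x + 1


# Every function carries the bytecode that CPython's compiler produced for it.
raw_bytecode = add_one.__code__.co_code  # a bytes object

# dis decodes that bytecode into readable instruction names.
opnames = [instruction.opname for instruction in dis.get_instructions(add_one)]
print(opnames)
```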
    4. You can actually compile all of your Python code beforehand using the compileall module on the command line:

      $ python3 -m compileall .

      This will place the compiled bytecode of all Python files in the current directory in __pycache__/ and show you any compiler errors.
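The same module can be called from Python code. This sketch compiles a throwaway file in a temporary directory so that it leaves nothing behind; the file name `hello.py` is an arbitrary choice for the example.

```python
import compileall
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    source_file = pathlib.Path(tmp) / "hello.py"
    source_file.write_text("print('hello')\n")

    # compile_dir returns a true value when every file compiled successfully.
    ok = compileall.compile_dir(tmp, quiet=1)

    # The bytecode lands in __pycache__, tagged with the interpreter version.
    cache_dir = pathlib.Path(tmp) / "__pycache__"
    cached = sorted(path.name for path in cache_dir.glob("*.pyc"))

print(ok, cached)
```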

    5. Python is both a compiled and interpreted language

      The CPython interpreter really is an interpreter. But it also is a compiler. Python must go through a few stages before ever running the first line of code:

      1. scanning
      2. parsing

      Older versions of Python added an additional stage:

      1. scanning
      2. parsing
      3. checking for valid assignment targets

      Let’s compare this to the stages of compiling a C program:

      1. ~~preprocessing~~
      2. lexical analysis (another term for “scanning”)
      3. syntactic analysis (another term for “parsing”)
      4. ~~semantic analysis~~
      5. ~~linking~~
    6. next stage is parsing (also known as syntactic analysis) and the parser reports the first error in the source code. Parsing the whole file happens before running the first line of code which means that Python does not even see the error on line 1 and reports the syntax error on line 2.
    7. I haven’t done a deep dive into the source code of the CPython interpreter to verify this, but I think the reason that this is the first error detected is because one of the first steps that Python 3.12 does is scanning (also known as lexical analysis). The scanner converts the ENTIRE file into a series of tokens before continuing to the next stage. A missing quotation mark at the end of a string literal is an error that is detected by the scanner—the scanner wants to turn the ENTIRE string into one big token, but it can’t do that until it finds the closing quotation mark. The scanner runs first, before anything else in Python 3.12, hence why this is the first error message.
    8. Python reports only one error message at a time—so the game is which error message will be reported first?

      Here is the buggy program:

      1 / 0
      print() = None
      if False
      ñ = "hello

      Each line of code generates a different error message:

      • 1 / 0 will generate ZeroDivisionError: division by zero.
      • print() = None will generate SyntaxError: cannot assign to function call.
      • if False will generate SyntaxError: expected ':'.
      • ñ = "hello will generate SyntaxError: EOL while scanning string literal.

      The question is… which will be reported first?

      Spoilers: the specific version of Python matters (more than I thought it would) so keep that in mind if you see different results.

      The first error message detected is on the last line of source code. What this tells us is that Python must read the entire source code file before running the first line of code. If you have a definition in your head of an “interpreted language” that includes “interpreted languages run the code one line at a time”, then I want you to cross that out!
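
      The claim that the whole file is read before line 1 runs can be checked with the built-in `compile` and `exec`. This is a small sketch, not the article's experiment: the helper `first_error` and the `side_effects` list are hypothetical names used only to observe whether the first line ever executed.

```python
def first_error(source):
    """Return (error name, line number or None, side effects observed)."""
    side_effects = []
    try:
        # Compilation covers the whole source before anything executes.
        code = compile(source, "<demo>", "exec")
    except SyntaxError as exc:
        return ("SyntaxError", exc.lineno, side_effects)
    try:
        exec(code, {"side_effects": side_effects})
    except Exception as exc:
        return (type(exc).__name__, None, side_effects)
    return (None, None, side_effects)


# Line 1 is valid, line 2 is missing its colon: line 1 never runs.
syntax_case = first_error('side_effects.append("ran")\nif False\n')

# Both lines are valid syntax, so line 1 runs before line 2 fails at run time.
runtime_case = first_error('side_effects.append("ran")\n1 / 0\n')

print(syntax_case)
print(runtime_case)
```

      In the first case the side-effect list stays empty, which is the behaviour described above: a syntax error anywhere in the file prevents every line from running.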

    1. When the designer on the team, who also writes CSS, went to go make changes, it was a lot harder for them to implement them. They had to figure out which file to look in, open up command line, run a build step, check that it worked as expected, and then deploy the code.
    1. The tokenizer takes your source code and chunks it into “tokens”. Tokens are just small pieces of source code that you can identify in isolation. As examples, there will be tokens for numbers, mathematical operators, variable names, and keywords (like if or for). The parser will take that linear sequence of tokens and essentially reshape them into a tree structure (that's what the T in AST stands for: Tree). This tree is what gives meaning to your tokens, providing a nice structure that is easier to reason about and work on. As soon as we have that tree structure, our compiler can go over the tree and figure out what bytecode instructions represent the code in the tree. For example, if part of the tree represents a function, we may need a bytecode for the return statement of that function. Finally, the interpreter takes those bytecode instructions and executes them, producing the results of our original program.
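
      The four stages can be observed with CPython's own standard-library modules. This sketch uses `tokenize`, `ast`, `dis`, and the built-in `compile`/`exec` as stand-ins for the components described above; it is not the article's implementation, and the exact bytecode instruction names vary between Python versions.

```python
import ast
import dis
import io
import tokenize

source = "x = 1 + 2\n"

# 1. Tokenizer: source text -> linear sequence of tokens.
#    Whitespace-only tokens (NEWLINE, ENDMARKER) are filtered out here.
tokens = [
    tok.string
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.string.strip()
]

# 2. Parser: tokens -> tree (the built-in parser works from the source text).
tree = ast.parse(source)

# 3. Compiler: tree -> bytecode, wrapped in a code object.
code = compile(tree, "<demo>", "exec")
opnames = [ins.opname for ins in dis.get_instructions(code)]

# 4. Interpreter: execute the bytecode and collect the result.
namespace = {}
exec(code, namespace)

print(tokens)
print(type(tree.body[0]).__name__)
print(opnames)
print(namespace["x"])
```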
    2. Recap

      In this article you started implementing your own version of Python. To do so, you needed to create four main components:

      A tokenizer:

      • accepts strings as input (supposedly, source code);
      • chunks the input into atomic pieces called tokens;
      • produces tokens regardless of whether their sequence makes sense.

      A parser:

      • accepts tokens as input;
      • consumes the tokens one at a time, while making sure they come in an order that makes sense;
      • produces a tree that represents the syntax of the original code.

      A compiler:

      • accepts a tree as input;
      • traverses the tree to produce bytecode operations.

      An interpreter:

      • accepts bytecode as input;
      • traverses the bytecode and performs the operation that each one represents;
      • uses a stack to help with the computations.
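
      The last component, an interpreter that walks bytecode and uses a stack, can be sketched in a few lines. The instruction names `PUSH`, `ADD`, and `SUB` and the tuple encoding are assumptions made for this illustration; the article's own bytecode format may differ.

```python
def run(bytecode):
    """Execute a list of (operation, argument) pairs on a stack machine."""
    stack = []
    for op, arg in bytecode:
        if op == "PUSH":
            stack.append(arg)
        elif op in ("ADD", "SUB"):
            # The right operand was pushed last, so it is popped first.
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right if op == "ADD" else left - right)
        else:
            raise ValueError(f"unknown operation: {op!r}")
    # The value left on top of the stack is the result of the program.
    return stack.pop()


# Bytecode a compiler might emit for "3 + 5" and for "3 - 5".
sum_result = run([("PUSH", 3), ("PUSH", 5), ("ADD", None)])
diff_result = run([("PUSH", 3), ("PUSH", 5), ("SUB", None)])
print(sum_result, diff_result)
```

      Popping the right operand before the left one matters for non-commutative operations such as subtraction.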

    1. the senior defense official said.

      Many different sources are being used, since they really do not know much about what is going on.

    1. Reviewer #1 (Public Review):

      The brain's code is not static. Neuronal activity patterns change as a result of learning, aging, and disease. Reliable tracking of activity from individual neurons across long time periods would enable detailed studies of these important dynamics. For this reason, the authors' efforts to track electrophysiological activity across days without relying on matching neural receptive fields (which can change due to learning, aging, and disease) are very important.

      By utilizing the tightly-spaced electrodes on Neuropixels probes, they are able to measure the physical distance and the waveform shape 'distance' between sorted units recorded on different days. To tune the matching algorithm and validate the results, they used the visual receptive fields of neurons in the mouse visual cortex (which tend to change little over time) as ground truth. Their approach performs quite well, with a high proportion of neurons accurately matched across multiple weeks. This suggests that the method may be useable in other cases where the receptive fields can't be used as ground truth to validate the tracking. This potential extendibility to tougher applications is where this approach holds the most promise.

      The main caveat (and disappointment) is that this paper does not address generalizability to other experimental conditions. Because it only looks at one brain area (visual cortex), in one species (mouse), using one type of spike sorter (Kilosort), and one type of behavioral prep (head-fixed), it is not clear if this approach is overfit to those conditions or if it will perform equally well in other conditions. Most importantly, in brain areas where neuronal receptive fields are more dynamic and can't be used as a ground truth diagnostic, it isn't clear how to apply the technique outlined in this study, since many of the parameters are tuned to a very specific set of conditions using visual receptive fields as ground truth.

    1. Rebuild your images regularly

      If you want both the benefits of caching, and to get security updates within a reasonable amount of time, you will need two build processes:

      1. The normal image build process that happens whenever you release new code.
      2. Once a week, or every night, rebuild your Docker image from scratch using docker build --pull --no-cache to ensure you have security updates.
    1. Key recommendation 6: Add cousins to the definition of rapes and sexual assaults classified as incestuous. The law seems wary of the word incest, which it keeps at a distance from the penal code. It favors instead a terminology that largely masks the genealogical crime that incest is. Little by little the word is gaining a place in criminal law, but still not enough. Incest is defined by the family status of the aggressor in relation to the child victim. Rapes and sexual assaults are classified as incestuous when they are committed by 1° an ascendant; 2° a brother, a sister, an uncle, an aunt, a great-uncle, a great-aunt, a nephew, or a niece; 3° the spouse or cohabiting partner of one of the persons mentioned in 1° and 2°, or the partner bound by a civil solidarity pact to one of the persons mentioned in the same 1° and 2°, if he or she has legal or de facto authority over the victim (art. 222-22-3 CP). In many testimonies, victims reported that their aggressor was their cousin. Setting aside any consideration relating to the civil prohibition on marriage, the CIIVISE recommends that sexual violence be recognized as incestuous when it is committed by the victim's cousin.
    1. You'll find that each platform offers a unique blend of features, capabilities, limitations, and use cases.

      This might be a good place to put two notes on how the way we encounter these systems is changing. One trend is toward integrating AI into existing software like Google Docs or Word or the Office Suite. Another trend is toward multimodal, more agentlike systems that "decide" to do multiple kinds of things, so not a discrete text generator app but a system that will run code, search the web, and generate and analyze images as well as generating text. ChatGPT Premium is already like this.

    1. Inter-Worker communication

      Whether using sub interpreters or multiprocessing you cannot simply send existing Python objects to worker processes.

      Multiprocessing uses pickle by default. When you start a process or use a process pool, you can use pipes, queues and shared memory as mechanisms for sending data between the workers and the main process. These mechanisms revolve around pickling. Pickle is the built-in serialization library for Python that can convert most Python objects into a byte string and back into a Python object.

      Pickle is very flexible. You can serialize a lot of different types of Python objects (but not all) and Python objects can even define a method for how they can be serialized. It also handles nested objects and properties. However, with that flexibility comes a performance hit. Pickle is slow. So if you have a worker model that relies upon continuous inter-worker communication of complex pickled data you’ll likely see a bottleneck.
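
      A short round trip shows what pickling does and where it stops. This is a minimal sketch using only the standard-library `pickle` and `threading` modules; the `job` dictionary is invented sample data, and a lock is used as one well-known example of an object that cannot be pickled.

```python
import pickle
import threading

# Nested built-in containers serialize to bytes and back without extra work.
job = {"id": 7, "args": (1, 2.5, "three"), "options": {"retry": True, "tags": ["a", "b"]}}
blob = pickle.dumps(job)
restored = pickle.loads(blob)

# The worker receives an equal copy, never the original object itself.
is_copy = restored == job and restored is not job

# Not every object can be pickled: a thread lock is one that cannot.
try:
    pickle.dumps(threading.Lock())
    lock_pickled = True
except TypeError:
    lock_pickled = False

print(type(blob).__name__, is_copy, lock_pickled)
```

      Every message sent to a worker pays for one `dumps` and one `loads`, which is the overhead the passage above warns about for chatty worker models.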

      Sub interpreters can accept pickled data. They also have a second mechanism called shared data. Shared data is a high-speed shared memory space that interpreters can write to and share data with other interpreters. It supports only immutable types:

      • Strings
      • Byte Strings
      • Integers and Floats
      • Boolean and None
      • Tuples (and tuples of tuples)

      To share data with an interpreter, you can either set it as initialization data or you can send it through a channel.

    2. Python’s system architecture is roughly made up of three parts:
      • A Python process, which contains one or more interpreters
      • An interpreter, which contains a lock (the GIL) and one or more Python threads
      • A thread, which contains information about the currently executing code.
    1. The final project in my JOUR 352 class was to take an existing long story posted on a news website and reimagine the layout with HTML, CSS, and Bootstrap. HTML is text code. Each HTML file represents one page. CSS is code that connects with HTML documents. CSS changes how that text is presented with font, font size, color, etc. across all individual web pages on that site. Bootstrap is composed of CSS and JavaScript code for creating easy-to-use grid systems and responsive design. Website developers can take existing Bootstrap code and build off it. Responsive design involves making changes to make the website more functional on smaller screens like your smartphone or when shrinking your browser tab on a computer.

      This was a great way of introducing an outside audience to the technical terms. You made them seem more down to earth and familiar to the unfamiliar reader.

    1. if we want to allow code outside the class to change the value of an instance variable

      Getters and setters are typically named using the following conventions:

      • Getters start with the word "get" followed by the name of the variable.
      • Setters start with the word "set" followed by the name of the variable.

      For example, if you have a private variable named "name", you would create a getter method named "getName()" and a setter method named "setName()".

    1. Strategies for Addressing Freedom and Autonomy

      • Ensure humans are in control of decisions
      • Allow clinicians control of the tech
      • Universal code for patient-clinician relationship
      • Make AI Info comprehensible to patients

    Annotators

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      1. General Statements

      We express our gratitude to the reviewers for their time and insightful comments, which have significantly contributed to the enhancement of our manuscript. We believe that the thoughtful critiques and suggestions have substantially improved the overall quality of our work. The changes made in the revised manuscript were highlighted in red. Below, we provide a point-by-point response to each comment, addressing the concerns raised by the reviewers.

      2. Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      *Summary: *

      *In the current study, Li et al investigated how TGF-beta signaling is controlled by protein abundances. Computational modeling and experiments indicated that the abundance of TGFBR1 and TGFBR2 affects the signaling, and those with lower abundance affect the signaling more, resembling Liebig's law of the minimum. Specifically, they showed that by using multiple cell lines with a different abundance of receptors, modulation of expression of the less abundant receptor impacts the signaling, which is measured by SMAD2 nuclear-to-cytosol ratio and/or relative phospho-SMAD2 level. Also, by using a light-induced interaction system, they showed that the signaling is dependent on the concentration of receptor complex when both receptors are expressed at similar amounts. *

      *Major comments: *

      *Computational predictions support the authors' idea. The computation and the experiments are well-documented. And it would gain substantially if the authors fill the gap between the predictions and the experiments as follows. *

      *In Figure 4, the authors showed that perturbation on receptors with lower expression levels in each cell line changes the phospho-SMAD2 level. Although the data looks consistent with their claim, the result is only qualitative. The authors established a computational model in the former sections, thus it would be of great interest to assess if the experimental results quantitatively match the computational prediction. *

      Response: The reviewer suggests that our work could benefit from a quantitative comparison between computational predictions and experimental data shown in Figure 4. We appreciate this suggestion. Given the challenges in obtaining precise quantification of TGFBR1 protein due to antibody issues (see the response to comment #2 from reviewer 2), a direct quantitative comparison between model predictions and experimental results is difficult. Our model predictions about the control principle with Liebig's law of the minimum should be interpreted qualitatively, rather than a strict quantitative law. We have explicitly indicated in the revised manuscript that our siRNA knockdown experiments are to qualitatively test our model predictions.

      *In Figure 5, the authors computationally predicted that the expression level of receptors is correlated with SMAD2 N2C levels 1 hour after stimulation, and the strength of negative feedback with SMAD2 N2C levels 8 hours after stimulation. Because the authors employed iRFP-SMAD2 system, the prediction could be verified experimentally, at least the prediction on SMAD2 N2C 1 hour after stimulation could be checked. (In a sense, this is partially verified by the data in Figure 7, where both receptors are expressed at similar levels). It would gain substantially if the authors could verify the computational prediction in Figure 6. Since the authors stated in the introduction that "The same TGF-beta ligand can initiate different signaling responses depending on the cellular context, but the underlying control principle remains unclear...Together, these results revealed an effect of the minimum control in the TGF-beta pathway, which may be an important principle of control in signaling pathways with context-dependent outputs.", experimental verification of the prediction done in Figures 4-6 will be very important. Or the authors should stress that these points are only predicted by computational models. *

      Response: The reviewer recommends verifying the model predictions in Figure 6 experimentally, particularly regarding SMAD2 N2C levels 1 hour after stimulation. We appreciate this valuable suggestion, which was also raised by reviewer 2. In response, we conducted experiments as recommended by reviewer #2, in which imbalanced expression of TGFBR1 and TGFBR2 was achieved by transfecting optoTGFBR1 or optoTGFBR2 plasmids into optoTGFBRs-HeLa cells, which initially expressed similar levels of both receptors. Western blot analysis confirmed the desired imbalance (Figure S13).

      Consistent with the model predictions (Figure 6), the strong correlation between SMAD2 N2C fold change response at 1h and optoTGFBR2-tdTomato expression levels persisted in single cells when optoTGFBR1 was overexpressed (Figure 8A). Conversely, the high correlation between nuclear SMAD2 signaling and optoTGFBR2-tdTomato expression levels vanished at single cell level when optoTGFBR2 was overexpressed (Figure 8B). These experimental results validate our model predictions, confirming that the SMAD2 signaling is determined by the low abundance TGF-beta receptor in single cells. Incorporating these experimental validations enhances the quantitative support for our model predictions and clarifies the relationship between TGF-beta receptor abundance and signaling outcomes in single cells.

      *As written in the below "Significance" section, the result is, in a sense, obvious. It should be stated that because the study utilized a slightly high concentration of TGF-beta in the experiments, it might be natural that the low-abundance receptor becomes a bottleneck of the signaling. It would gain to assess how receptor abundance affects signaling with the stimulation of lower concentrations of TGF-beta, or to examine the computational model if the low abundance of a receptor becomes a bottleneck of signaling because of saturation. Also, it is highly recommended to discuss the physiological implication of the current study, taking into account the experimental conditions used. *

      Response: We appreciate the reviewer's insightful comments regarding the concentration of TGF-beta used in our experiments and the potential influence on the model predictions. In our experiments and model simulations, we utilized 100 pM TGF-beta, equivalent to 2.5 ng/mL (not 4.4 ng/mL as calculated by the reviewer). This concentration is a widely used dose in TGF-beta signaling studies. The reviewer's suggestion to explore how varying TGF-beta concentrations might influence the minimum control concept prompted us to extend our computational simulations. We used the extended model to perform simulations with lower TGF-beta concentrations (25 pM, equivalent to 0.625 ng/mL, and 10 pM, equivalent to 0.25 ng/mL). The results, depicted in Figure S7 of the revised manuscript, reaffirm that even at lower TGF-beta stimulations, a low abundance of a TGF-beta receptor acts as a bottleneck for SMAD2 signaling.
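
      The dose conversions quoted above can be reproduced with a one-line unit conversion. This is a sketch under a stated assumption: the molar mass of 25,000 g/mol is inferred from the response's own pairing of 100 pM with 2.5 ng/mL and is not given in the text, and the function name `pm_to_ng_per_ml` is hypothetical. By the same arithmetic, the reviewer's figure of 4.4 ng/mL would correspond to a molar mass of 44,000 g/mol.

```python
def pm_to_ng_per_ml(conc_pm, molar_mass_g_per_mol):
    """Convert a molar concentration in pM to a mass concentration in ng/mL."""
    mol_per_l = conc_pm * 1e-12           # pM -> mol/L
    g_per_l = mol_per_l * molar_mass_g_per_mol
    return g_per_l * 1e9 / 1e3            # g/L -> ng/L -> ng/mL


# Assumed molar mass, inferred from "100 pM, equivalent to 2.5 ng/mL".
ASSUMED_MOLAR_MASS = 25_000

doses_ng_per_ml = {pm: pm_to_ng_per_ml(pm, ASSUMED_MOLAR_MASS) for pm in (100, 25, 10)}
reviewer_value = pm_to_ng_per_ml(100, 44_000)

print(doses_ng_per_ml)
print(reviewer_value)
```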

      Following the reviewer’s suggestion, we have incorporated additional paragraphs to discuss the physiological implications and potential limitations of our study (Page 16-17 in the Main text).

      It is pertinent to note that while the concept of TGF-beta signaling response being dictated by the minimum abundance of TGF-beta receptors may seem intuitive or even obvious, theoretical and experimental validations are crucial. As demonstrated in Figure S1B, our new simulation results from the minimal model illustrate similar response profiles when a high binding affinity (K1) is set for ligand-receptor interactions (Figure S1A). However, with a small binding affinity (K1), the minimal model indicates that TGF-beta signal response remains proportional to the product of TGFBR1 and TGFBR2 abundance and can be sensitive to the change of high abundance receptor in some region (Figure S1B). This highlights that the observed response patterns aligning with Liebig's law of the minimum depend on the binding affinity of ligand-receptor interactions in our minimal model. Consequently, the intuitive idea about Liebig's law of the minimum is not necessarily true theoretically. Moreover, given the non-linearity of the TGF-beta network, this complexity introduces an additional layer of uncertainty regarding the applicability of the minimum control principle to TGF-beta responses. This uncertainty led us to develop an extended model, with parameter values either experimentally measured or estimated from time course experimental data. The extended model predicted a similar minimum control principle at the TGF-beta receptor level, inspiring us to validate this prediction through diverse experiments. While we acknowledge the intuitive nature of our findings, we believe it is important for the field to prove this expectation, as emphasized by reviewer 4.

      Reviewer #1 (Significance (Required)):

      *TGF-beta signaling is one of the most rigorously studied pathways both computationally and experimentally. As written in the introduction of the manuscript, it is still unknown how the variability of responses arises not only between cell types but also differences among cells of single cell type. Studies showed that protein abundance accounts at least partly for a source of cell variability in TGF-beta signaling. While former studies examined the variability in SMAD protein abundance, the uniqueness of this study is that it focused on the abundance of TGF-beta receptors. *

      *Given that both TGFBR1 and TGFBR2 are involved in the signaling, however, it's not difficult to imagine that a less abundant receptor affects the signaling more than the other, and serves as a bottleneck for the signaling. Specifically, because a slightly high concentration (100pM = 4.4 ng/mL of TGF-beta; other studies used much lower conc., e.g. 0, 0.03, 0.04, 0.07, and 2.4 ng/mL in Frick et al, PNAS, 2017, and 0, 1, 2.5, 5, 25, and 100 pM in Strasen et al, Mol Syst Biol, 2017) is used throughout the experiments to check cell-cell variability and the effect of receptor abundance in the current study, the formation of the receptor-ligand complex may be quite fast and be saturated at the level where the receptor with lower abundance is exhausted. In the reviewer's humble opinion, the authors' statement that this is Liebig's law of the minimum sounds a bit exaggerated. *

      Nevertheless, the study is of some value because it utilized both computational and experimental analysis to show it is indeed the case. Of note, the current study showed that the variability in the different proteins leads to the variability in different time points, namely, the variability in the receptor abundance leads to the variability 1 hour after stimulation, while that in negative feedback strength leads to the variability 8 hours after stimulation. If the authors fill a small gap between their computational analysis and experimental verification, the study will be of interest to the specialist in the field.

      Response: We are grateful for the valuable feedback provided by the reviewer. The concerns related to the TGF-beta dose have been thoroughly addressed in our responses to previous comments. Regarding the observation that the term "Liebig's law of the minimum" may sound a bit exaggerated, we acknowledge this consideration. We have refined the title to "Liebig’s Law of the Minimum in the TGF-β/SMAD Pathway," specifying its relevance to SMAD signaling exclusively, as non-SMAD signaling was not within the scope of this study. We appreciate the reviewer's constructive feedback and hope these adjustments enhance the specificity and accuracy of our manuscript.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Li et al. present an interesting and intuitive concept for the sensitivity and heterogeneity of biological networks: When two or more proteins form a functional complex, it is the limiting component with the lowest concentration that is most sensitive to perturbations and whose fluctuations dictate cell-to-cell variability of complex function. The authors apply this concept to the TGFb pathway and discuss sensitivity of SMAD signaling towards TGFb receptor I and II fluctuations. The paper is clearly written and convincing, but some improvements in the experimental validation would be beneficial as detailed further below.

      1) The authors claim that the ratio of TGFb receptor I and II is very different across cell lines (Fig. 1) and use this observation for the validation of their model in Fig. 4. However, the relative TGFb receptor expression levels are purely based on RNAseq data which does not necessarily imply similar behavior at the protein level, especially on the cell surface. To address this issue, the authors should ideally provide absolute Western blot measurements of TGFbRI at the protein level to complement their absolute quantification of TGFbRII (Fig. S2). At the very least they should show that the observed relative expression levels of TGFbRI and II at the protein level (Figure S7) are correlated to differences in RNA levels (Fig. 1) using protein quantification. They should also confirm that similar receptor ratios for these receptors at the RNA level are observed in other published RNAseq datasets of the same cell lines (e.g., ENCODE for HepG2 and published RNAseq studies in HaCaT). Furthermore, they might take into account published mass spec datasets for quantifications of TGFbR protein levels.

      Response: We appreciate the reviewer's thorough evaluation and constructive suggestions.

      (A) Absolute quantification of TGFBR1: We acknowledge the importance of obtaining absolute quantification of TGFBR1 protein, similar to what we did for TGFBR2 protein (Figure S2). Despite significant efforts, our attempts to achieve this were hindered by challenges with available TGFBR1 antibodies and recombinant TGFBR1 proteins. Many commercial antibodies failed negative controls with TGFBR1 knockdown samples, while other validated TGFBR1 antibodies could not recognize the available recombinant TGFBR1 protein standards.

      Although many mass spectrometry proteomics datasets are available for different cell lines, it is difficult to convert these MS quantitative values to absolute protein abundance, as mentioned in a recent publication (Nusinow et al., bioRxiv 2020.02.03.932384): “Importantly, these values are all relative values to the other values for that same protein and not absolute values. This means that comparing the levels of different proteins to each other without using something like a correlation to standardize values won’t produce meaningful results.”

      We share the reviewer's concern and fully agree that obtaining this absolute quantification is crucial. However, at the present stage, technical limitations prevent us from providing this information for TGFBR1. We commit to pursuing this aspect when feasible in the future.

      (B) Validation of relative TGF-beta receptor expression ratios: Following the reviewer's suggestion, we conducted additional analyses to validate the relative expression ratios of TGFBR1 and TGFBR2 using different RNA-Seq databases. The results, presented in Table S1, demonstrate consistent imbalances in TGFBR1-to-TGFBR2 ratios across HepG2 and RH30 cell lines from various data sources, reinforcing the reliability of our observations.

      (C) Correlation between RNA and protein expression: We appreciate the reviewer highlighting the challenges associated with correlating RNA and protein expression. Indeed, the correlations between RNA and protein levels vary widely, and direct comparisons can be challenging. To address this, we referenced a recent study (Nusinow et al., Cell 2020, 180:387), which reported that the protein data of TGFBR1 and TGFBR2 were highly correlated with the corresponding RNA data from the same cell line (Spearman’s correlation: 0.672 for TGFBR1, 0.771 for TGFBR2) based on quantitative proteomics and RNA expression data from 375 cancer cell lines.

      2) Figure 4: To better judge the reproducibility of the knockdown titration, it would be good to show the different siRNA concentrations as a color code. Alternatively, TGFBR expression could be plotted as a function of the siRNA concentration in a Supplemental Figure, showing the effects of individual replicates.

      Response: We thank the reviewer for the suggestion to enhance the clarity of the knockdown titration data. In response, we have now presented the quantified experimental data from three replicates with different colors in Figure 4. Additionally, we have created Figure S9, which plots the relative expression levels of TGFBR1 and TGFBR2 as a function of siRNA concentration, providing a more detailed view of the effects across individual replicates.

      3) The simulations in Figs. 5 and 6 show that SMAD signaling fluctuations are mainly determined by cell-to-cell variability of receptor levels when using the SMAD nucleocytoplasmic ratio as a readout, and this is especially true for early time points. For downstream cellular responses, the absolute concentration of phosphorylated SMAD (complexes) in the nucleus is likely more relevant. Based on the authors' work and evidence from the literature, I expect that this quantity will likely be heavily influenced by receptor levels as well, but fluctuations in SMAD expression will play an important role as well. The authors should discuss this issue, and clarify that normalized quantities like SMAD N2C and pSMAD/SMAD mostly characterize receptor-level fluctuations while filtering SMAD fluctuations.

      Response: We acknowledge the importance of discussing the relevance of different readouts in our study. In the revised manuscript, we have incorporated a discussion addressing this issue. Specifically, we highlight that while the SMAD nucleocytoplasmic ratio is sensitive to cell-to-cell variability in the levels of the low-abundance receptor, the absolute concentration of phosphorylated SMAD in the nucleus may be more relevant for downstream cellular responses (e.g., gene expression). We have cited the work by Lucarelli et al., which demonstrated that variations in SMAD abundance could modulate the balance of different SMAD complexes, thereby regulating heterogeneous gene expression in diverse cell types (Lucarelli et al., Cell Systems 2018).

      4) The single-cell measurements in Fig. 7 are interesting, but can only partially be seen as a direct validation of the model predictions, as it seems expected that varying the total input by introducing co-fluctuations in both receptors heavily influences the SMAD level. Wouldn't it be possible to design more specific validation experiments, in which the receptor co-expression construct (Fig. 7C) is used for baseline optoTGFBR expression and combined with an individual expression construct for one of the opto-receptors? This way, the authors could establish different regimes, in which one of the two receptors becomes dominant, and the impact of fluctuations could be analyzed in a larger receptor expression space. Of course, a full validation of all possible scenarios is not necessary, but it would, for instance, be valuable to see whether the strong dependency of SMAD signaling on TGFBR2 levels vanishes when TGFBR2 is expressed at a higher level than TGFBR1.

      Response: We appreciate the insightful comments and suggestions provided by the reviewer. Based on these recommendations, we have conducted additional experiments to further validate our model predictions. Reviewer 1 also raised this point, so we quote our earlier response here: “consistent with the model predictions (Figure 6), the strong correlation between SMAD2 N2C fold change response at 1h and optoTGFBR2-tdTomato expression levels persisted in single cells when optoTGFBR1 was overexpressed (Figure 8A). Conversely, the high correlation between nuclear SMAD2 signaling and optoTGFBR2 expression levels vanished at single cell level when optoTGFBR2 was overexpressed (Figure 8B). These experimental results validate our model predictions, confirming that the SMAD2 signaling is determined by the low abundance TGF-beta receptor in single cells. Incorporating these experimental validations enhances the quantitative support for our model predictions and clarifies the relationship between TGF-beta receptor abundance and signaling outcomes in single cells.”

      **Referees cross-commenting**

      Comments from R2: I agree with most comments of the other reviewers, and highlight the most important overlaps with my comments below.

      I agree with R1 that the model validation in Fig. 7 is incomplete and think that this will be a key point to improve the quality of the manuscript (see also my reviewer comment 4)

      In line with R3 and R4, I think that the SMAD N/C simulations do not necessarily imply effects on TGFb target gene expression, cell fate decisions or human pathologies. The significance of the results for cellular behavior should be discussed (see also my comment 3)

      Response: We are grateful for the reviewer's thoughtful comments. These comments have now been addressed (see our responses to the corresponding comments).

      Reviewer #2 (Significance (Required)):

      The manuscript presents an interesting and intuitive concept for the sensitivity and heterogeneity of biological networks. The authors apply this concept to the TGFb pathway and discuss sensitivity of SMAD signaling towards TGFb receptor I and II fluctuations.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      *Summary: *

      *This is an interesting study that examines the output of the TGF-Beta pathway and how abundance/dosage can determine the signaling response in single cells across multiple cell types. The study is primarily mathematical. The focus is on the Type 1 and 2 TGF-Beta receptors driving nuclear SMAD2 expression. The authors observe that SMAD2 phosphorylation is sensitive to variations in the lower levels of either receptor but robust at variations of high abundance of the receptor reflected through siRNA experiments shown in Figure 4. Their conclusion is that the feature is consistent with Liebig's law of the minimum, where in this case a low abundance of the receptor serves as the rate-limiting step in signaling for this pathway. *

      *Major comments: *

      *- While the data as presented are interesting, it is unclear as to whether the abundance regulates biological function. SMAD2 phosphorylation is shown with some nuclear translocation. However, TGF-Beta target gene activation is not shown, and this needs to be completed. *

      Response: We appreciate the reviewer's constructive comment. We have conducted new experiments and included quantitative real-time PCR data in the revised manuscript to evaluate the impact of TGFBR1 and TGFBR2 knockdown on the expression of TGF-beta target genes, such as SMAD7, PAI1, and JUNB. The results, presented in Figure S11, demonstrate differential sensitivity of these genes to the downregulation of TGFBR1 and TGFBR2 in various cell lines (HaCaT, HepG2, and RH30). Specifically, the expression of SMAD7, PAI1, and JUNB is sensitive to TGFBR2 knockdown in RH30 cells, while it is sensitive to TGFBR1 knockdown in HepG2 cells. HaCaT cells, expressing similar levels of both receptors, show comparable sensitivities to reductions in both TGFBR1 and TGFBR2. These findings provide additional insights into the regulatory role of TGF-beta receptor abundance on downstream target gene activation, complementing our study's focus on SMAD2 phosphorylation and nuclear translocation.

      *- In addition, it is unclear as to what happens to SMAD3 and SMAD4 which are expressed endogenously in this setting. How are these other TGF-Beta signaling molecules addressed by these observations? *

      Response: Thank you for bringing up this important point. In our study, the expression levels of endogenous SMAD2 and SMAD4 were found to be similar across HaCaT, RH30, and HepG2 cells. However, SMAD3 expression was notably lower in RH30 and HepG2 compared to HaCaT cells. The central conclusion of our study is based on the observed common control principle, which hinges on the relative expression levels of TGFBR1 and TGFBR2. Consequently, the applicability of this principle is more pertinent when comparing signal responses within the same cell type.

      We acknowledge the relevance of endogenous SMAD proteins, and in the revised manuscript (page 16 in main text) we have expanded our discussion of how differences in SMAD protein expression levels and potential mutations, as observed in certain cancers, could influence the formation of homo- and hetero-oligomeric SMAD complexes. These considerations contribute to a more comprehensive understanding of downstream gene expression responses, as discussed in the work of Lucarelli et al. (Cell Systems 2018).

      *-Specific biological readouts- cell differentiation etc. are not examined and would need to be provided and discussed. Therefore, the claims put forward while interesting require additional experiments examining SMAD2 target gene activation and biological readouts. *

      Response: We appreciate this valuable suggestion. While we acknowledge the importance of exploring long-term biological responses, including cell differentiation, it is crucial to note that specific biological readouts are not solely dependent on SMAD signaling; they also involve other non-SMAD signaling pathways. Additionally, these responses are highly cell type-specific. Undertaking extensive investigations into these responses would extend beyond the current scope of our work. Nevertheless, we have discussed this topic in the revised manuscript (page 16 in main text).

      Following the reviewers’ suggestion on examining TGF-beta target genes, we have performed experiments examining the expression of SMAD7, PAI1, and JUNB with respect to the changes of TGFBR1 and TGFBR2, respectively (see our response to the first major comment of this reviewer).

      *- Lastly, statistical analyses are not provided and would need to be provided. For instance, in Figure 4, how many experiments were replicated and statistical analysis performed for this Figure? *

      Response: In addressing this concern, we conducted three siRNA knockdown titration experiments for each cell line, as detailed in the figure legend. Due to batch effects, different percentages of TGF-beta receptors were knocked down in different experiments using the same concentration of siRNA. To transparently present the data, we utilized a scatter plot. Following the suggestion from reviewer 2, we have further enhanced the clarity of our data presentation by labeling the results of different experiments with a color code. In addition, we have performed a statistical analysis of the fold change in TGF-β receptor level that produces a 50% reduction in the pSMAD2 response relative to the non-targeting siRNA control group (EC50) during siRNA knockdown experiments (Figure S10). The results of this analysis reveal significant differences in the sensitivities of pSMAD2 responses to variations in TGFBR1 and TGFBR2 within RH30 and HepG2 cells.
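
      As a rough illustration of the EC50 quantity described above, the sketch below estimates the receptor level (as a fraction of the non-targeting control) at which the response falls to 50% of control, using linear interpolation between neighbouring titration points. The fitting procedure actually used for Figure S10 is not described in this response, so the interpolation approach, the function name, and all data values here are hypothetical.

```python
def half_response_level(receptor_levels, responses, threshold=0.5):
    """Receptor level at which the response first drops to `threshold`.

    Both inputs are fractions of the non-targeting control (control = 1.0).
    Uses linear interpolation between adjacent titration points and returns
    None if the response never reaches the threshold in the tested range.
    """
    # Walk from the control level down towards the strongest knockdown.
    points = sorted(zip(receptor_levels, responses), reverse=True)
    for (x_hi, y_hi), (x_lo, y_lo) in zip(points, points[1:]):
        if y_hi >= threshold >= y_lo and y_hi != y_lo:
            frac = (y_hi - threshold) / (y_hi - y_lo)
            return x_hi + frac * (x_lo - x_hi)
    return None


# Hypothetical titration data (not taken from the manuscript).
levels = [1.0, 0.8, 0.6, 0.4, 0.2]     # remaining receptor, fraction of control
limiting = [1.0, 0.9, 0.7, 0.4, 0.1]   # response when the scarce receptor is knocked down
abundant = [1.0, 1.0, 0.95, 0.9, 0.6]  # response when the abundant receptor is knocked down

ec50_limiting = half_response_level(levels, limiting)
ec50_abundant = half_response_level(levels, abundant)
print("EC50, limiting receptor:", ec50_limiting)
print("EC50, abundant receptor:", ec50_abundant)
```

      In this made-up example the response to knockdown of the scarce receptor crosses 50% within the tested range, whereas the response to knockdown of the abundant receptor does not. Comparing such estimates across replicates would still require a separate statistical test, which is not sketched here.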

      Reviewer #3 (Significance (Required)):

      *- Conceptually this is an important study because dosage is a prominent issue in TGF-Beta signaling. For instance, in my field of expertise- mouse models of TGF-beta signaling e.g. SMAD2 knockouts- the cancer phenotypes are evident in haploid animals. Yet how and why dosage plays such a large role in tumorigenesis remains unclear. *

      Response: We sincerely appreciate your recognition of the conceptual importance of our study in addressing the dosage-related complexities of TGF-beta signaling. Your insights into dosage effects in mouse models, particularly in haploid animals, highlight the relevance of our work to tumorigenesis. We have incorporated relevant citations and expanded our discussion in the revised manuscript, providing additional context to the importance of dosage in tumorigenesis (page 18 in main text).

      Reviewer #4 (Evidence, reproducibility and clarity (Required)):

      Summary: In this study, Li and co-workers combined computational modeling and experimental analysis to study the dependence of the output of the TGF-beta pathway on the abundance of signaling molecules in the pathway, mainly the most upstream regulators of SMAD2, TGFbeta type I and type II receptors. They showed by a combination of biochemical studies (mainly pSmad2 WB and type I/II receptor expression profiling) in HaCaT and HeLa cells as well as stable optogenetical receptor variants expressed by those cell lines, that TGF-beta receptor abundance influences signaling outputs using the concept of Liebig's law of the minimum, meaning that the output-modifying factor is the signaling protein that is most limited, to determine signaling responses across cell types and in single cells.

      *Major comments: *

      The study is very interesting, the combination of biochemistry and computational modeling to better understand the complexity of the TGFbeta pathway is very much required in the field and should stimulate others to further expand this approach.

      Response: Thank you for the positive evaluation of this work.

      *However, the authors must further explain that the model depicted here to explain pathway kinetics and dynamics lacks multiple crossroads and feedbacks and is until now oversimplified in the manuscript. They have mentioned receptor internalization and recycling, nuclear import and export of SMAD protein, and the feedback regulations e.g. by SMADs regulating receptor expression. Beyond, there is non-SMAD signaling (Derynck et al.; SMAD Linker regulation, deRobertis et al.), different receptor oligomerization modes (Ehrlich/Henis et al.) and heteromeric receptor complexes of TGFbeta receptors known (Hill et al.), that further diversify beyond these mentioned mechanisms. It is understandable that the mathematical model cannot include those considerations to date, however, they must be further explained and commented on to allow that this model can be expanded in the future. *

      Response: We acknowledge that multiple crossroads and feedbacks exist in the TGF-beta signaling pathway that have not been explicitly incorporated into our model. We appreciate the reviewer's understanding that the current model cannot include these considerations and his/her suggestions for potential future extensions. In the revised manuscript, we have mentioned one of the limitations of our model: non-Smad signaling and crosstalk with other signaling pathways were not considered for simplicity. We have also discussed how to expand this model by including these regulations when more quantitative data are available in the future (pages 16-17 in main text).

      *A myriad of research labs focus on these intricate fine-tuning of the TGFbeta pathway by those mechanisms which makes the difference between "good" TGFbeta signaling and "bad" TGFbeta signaling in different contexts and this complexity must be acknowledged by more introduction and discussion. *

      Response: In the revised manuscript, we have added an introduction and discussion about the dual role of TGF-beta signaling (page 4 and page 18 in main text).

      *The model here will be important to explain *

      *A: the mode of heterooligomeric TGFbeta/BMP receptor assemblies as e.g. found in pathological conditions and *

      *B: Can maybe explain the formation of mixed SMAD complexes as activated by lateral signaling comprising TGFbeta and BMP receptors once one receptor is of lower abundance to form a high affinity complex. *

      *It is therefore required to comment on these aspects at multiple points in the manuscript. *

      *It is very important that the visual model used in this manuscript depicts on the possibility, that a TGFbeta type I receptor can team up with e.g. another TGFbeta type I receptor together with two TGFbeta type II receptors but also with an activin type II receptor or that a BMP type I receptor (e.g. ALK1) can form heterooligomeric complexes with ALK5 (TGFbeta type I). *

      Response: Thank you for this comment. We cited the relevant work (Ramachandran et al, eLife 2018; Szilagyi et al, BMC Biology 2022) and added a discussion about the complexity of the mode of heterooligomeric TGFbeta/BMP receptor assemblies and its effect on the induction of mixed SMAD complexes (page 17 in the main text).

      *While the use of optogenetical TGFbeta receptor biosensors is highly interesting, their mode of oligomerization is not yet fully described. It is not known if those biosensors behave like wt receptors in terms of oligomerization and ligand binding. This should be mentioned somewhere. For this reason, the authors should also consider to draw the TGFbeta receptor complex in the cartoons with more detail towards the heterooligomeric assembly that is standard to the field. *

      Response: The reviewer is correct that the optogenetic TGF-beta receptors might behave differently from the natural TGF-beta receptor system in terms of ligand binding. We have added this point in the Discussion part to highlight the potential difference between the optogenetic TGF-beta systems and the wild-type system (page 16 in the main text).

      *While the general finding is not surprising (manipulating the receptor with the lowest abundance has the biggest impact on signaling output) the methods and models used here are very important to the field to prove that this expectation is actually true and can be experimentally addressed by a combination of bioinformatics and biochemistry. The model developed will be valuable to expand to much more complex and interesting questions in TGFbeta signaling and possibly also BMP signaling e.g. in pathological context (see below). *

      *Minor comments: *

      *The authors should discuss their findings in the context of: *

      • non-Smad signaling outputs (similar or different to the observations on pSMAD2)
      • What do these findings mean for e.g. human pathologies, where type I or type II receptor expression is altered?
      • Can those findings integrate into the "switch" in TGFbeta signaling?
      • How do these findings translate towards BMP SMAD 1/5/9 signaling?

      Response: First, we sincerely appreciate the reviewer’s recognition that our work is very important to the field in proving that manipulating the receptor with the lowest abundance has the biggest impact on signaling output. The reviewer’s suggestions about discussing our work in the context of non-Smad signaling, BMP SMAD1/5/9 branch, and the relevance to the dual role of TGF-beta signaling are all constructive. We have incorporated these suggestions and discussed them in the revised manuscript (page 17 in the main text).

      Reviewer #4 (Significance (Required)):

      *The manuscript is novel and interesting, particularly the combination of bioinformatical and biochemical approaches. The use of optogenetics is state-of-art while some more care should be given to interpretation of results with optogenetical TGFbeta receptor biosensors; it is not known if they really behave similarly in terms of receptor oligomerization and signaling. Also it is not shown how their interactome in terms of effector proteins looks like that can potentially influence SMAD signaling output (e.g. phosphatases to SMADs known to interact with wt receptors). *

      *The models drawn need to depict more accurately on the nature of type I and type II receptor complexes (heterotetrameric) and high affinity towards the ligand. The current versions are too oversimplified at this stage. The pathway crosstalks and feedbacks need to be more visible, in order for non experts to not draw too simple conclusions from the visual representations presented in this MS. Particularly the work by Hill and co-workers on receptor oligomerization and SMAD shuttling and feedback need to be included. *

      Overall, the manuscript is very significant to the field.

      Response: We would like to thank the reviewer again for his/her positive evaluation of the novelty and significance of our work. We have taken the reviewer's comments into consideration and made revisions to the manuscript. We now provide more information on the limitations of our current model and the optogenetic TGF-beta receptor biosensors in the Discussion section. We have also included more details about the receptor complex nature and the high affinity towards the ligand. The ligand-receptor complex in the model is now drawn as a heterotetrameric complex (one ligand dimer with two TGFBR1s and two TGFBR2s). Additionally, we have incorporated information about pathway crosstalks and feedbacks, giving a more comprehensive view for non-experts. The work by Hill and co-workers on receptor oligomerization, SMAD shuttling, and feedback has been included in the revised manuscript to provide a more complete and accurate representation of the current knowledge in the field.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Li et al. present an interesting and intuitive concept for the sensitivity and heterogeneity of biological networks: When two or more proteins form a functional complex, it is the limiting component with the lowest concentration that is most sensitive to perturbations and whose fluctuations dictate cell-to-cell variability of complex function. The authors apply this concept to the TGFb pathway and discuss sensitivity of SMAD signaling towards TGFb receptor I and II fluctuations. The paper is clearly written and convincing, but some improvements in the experimental validation would be beneficial as detailed further below.

      1. The authors claim that the ratio of TGFb receptor I and II is very different across cell lines (Fig. 1) and use this observation for the validation of their model in Fig. 4. However, the relative TGFb receptor expression levels are purely based on RNAseq data which does not necessarily imply similar behavior at the protein level, especially on the cell surface. To address this issue, the authors should ideally provide absolute Western blot measurements of TGFbRI at the protein level to complement their absolute quantification of TGFbRII (Fig. S2). At the very least they should show that the observed relative expression levels of TGFbRI and II at the protein level (Figure S7) are correlated to differences in RNA levels (Fig. 1) using protein quantification. They should also confirm that similar receptor ratios for these receptors at the RNA level are observed in other published RNAseq datasets of the same cell lines (e.g., ENCODE for HepG2 and published RNAseq studies in HaCaT). Furthermore, they might take into account published mass spec datasets for quantifications of TGFbR protein levels.
      2. Figure 4: To better judge the reproducibility of the knockdown titration, it would be good to show the different siRNA concentrations as a color code. Alternatively, TGFBR expression could be plotted as a function of the siRNA concentration in a Supplemental Figure, showing the effects of individual replicates.
      3. The simulations in Figs. 5 and 6 show that SMAD signaling fluctuations are mainly determined by cell-to-cell variability of receptor levels when using the SMAD nucleocytoplasmic ratio as a readout, and this is especially true for early time points. For downstream cellular responses, the absolute concentration of phosphorylated SMAD (complexes) in the nucleus is likely more relevant. Based on the authors' work and evidence from the literature, I expect that this quantity will likely be heavily influenced by receptor levels as well, but fluctuations in SMAD expression will play an important role as well. The authors should discuss this issue, and clarify that normalized quantities like SMAD N2C and pSMAD/SMAD mostly characterize receptor-level fluctuations while filtering SMAD fluctuations.
      4. The single-cell measurements in Fig. 7 are interesting, but can only partially be seen as a direct validation of the model predictions, as it seems expected that varying the total input by introducing co-fluctuations in both receptors heavily influences the SMAD level. Wouldn't it be possible to design more specific validation experiments, in which the receptor co-expression construct (Fig. 7C) is used for baseline optoTGFBR expression and combined with an individual expression construct for one of the opto-receptors? This way, the authors could establish different regimes, in which one of the two receptors becomes dominant, and the impact of fluctuations could be analyzed in a larger receptor expression space. Of course, a full validation of all possible scenarios is not necessary, but it would, for instance, be valuable to see whether the strong dependency of SMAD signaling on TGFBR2 levels vanishes when TGFBR2 is expressed at a higher level than TGFBR1.

      Referees cross-commenting

      Comments from R2: I agree with most comments of the other reviewers, and highlight the most important overlaps with my comments below.

      I agree with R1 that the model validation in Fig. 7 is incomplete and think that this will be a key point to improve the quality of the manuscript (see also my reviewer comment 4)

      In line with R3 and R4, I think that the SMAD N/C simulations do not necessarily imply effects on TGFb target gene expression, cell fate decisions or human pathologies. The significance of the results for cellular behavior should be discussed (see also my comment 3)

      Significance

      The manuscript presents an interesting and intuitive concept for the sensitivity and heterogeneity of biological networks. The authors apply this concept to the TGFb pathway and discuss sensitivity of SMAD signaling towards TGFb receptor I and II fluctuations.

    1. Author Response

      The following is the authors’ response to the original reviews.

      We are grateful to the reviewers for their remarks, which significantly improved the paper. We repeated the biochemical assay concerning SIRT6 activity on H3-K27Ac and quantified the results as requested. Please find our detailed answers below each recommendation of the reviewers.

      Major recommendations:

      1. Grammatical errors are still common; the authors may need to consider an external editing service if they intend to fix the problems as they indicate that they believe the errors have been removed. The Results section is relatively clean, but parts of the Abstract, Introduction, and Discussion are more difficult to understand, and errors are especially common in the Methods section and those parts of the manuscript that are new in this revision.

      We corrected the grammatical errors.

      1. The introduction doesn't mention the other structures published; this is considered to be a serious deficiency as it prevents the reader from understanding the context for the contributions described here. Withholding the comparison with (or mention of) the previously published work to the last sentence of the Discussion seems misleading and does not give the reader adequate ability to judge the novelty of the results presented in this manuscript.

      A paragraph comparing our paper to the other published structures appears at the end of the Discussion. We feel this is still the right place for such a paragraph.

      1. The addition of the assay for deacetylation is a significant improvement over the initial submission. This is important both for validating the importance of the acidic patch contacts and for helping to resolve the conflicting reports regarding activity on H3-K27Ac. Given the importance of this assay for the impact of the manuscript, it is not clear why the authors chose to 1) put the data in the supplement instead of in the main manuscript, and 2) provide only single samples without quantitation. These both seem to be significant limitations.

      We repeated the experiment and provided quantification of the results. We placed the figure in the main manuscript.

      1. The authors should add text or a table to the Methods section explaining which maps were used for each figure. By our count, there are 8 maps and 5 models (plus MD models) based on two datasets, but the relationships among them are not clearly stated, and the names of the maps (such as "Zn-finger focused" and "Rossman-Fold-Focused") might be changed to be more helpful to the reader (for example, the latter includes more than the Rossman fold and might be renamed "Sirt6-focused"). The authors should also explain how the maps were validated, which data were deposited in public repositories, and why some data were not deposited. For example, no statistics or methods regarding how particles were separated into integrated vs. non-integrated motion are provided for the CryoDRGN models. Further, the "two principle movements" described are depicted in 4 maps from two CryoDRGN runs using two separate sets of particles, but the relationships among them are not defined clearly. Finally, the connectivity of densities in Fig 8 is not obvious in the submitted maps. Until these points are addressed, the work is considered incomplete.

      AND

      1. The PDB model provided for review and submitted to the PDB database shows loosely bound DNA at the nucleosomal entry/exit points near the binding site of SIRT6, but the maps provided for review and submitted to the EMDB show stronger density for the canonical location of the DNA expected at these sites. The CryoDRGN maps support a more extended conformation, but these maps were not deposited or provided for review so their validity cannot be assessed.

      We added a section to the methods listing the different maps used for the figures. We deposited the map we used to trace the H2A N-terminal tail (EMD-18497). Unfortunately, we couldn’t deposit the cryoDRGN maps, as the deposition system accepts either composite maps, for which the consensus map must also be deposited, or experimental maps, for which the deposition of half maps is mandatory. Nevertheless, the cryoDRGN maps are available upon request. We also added a supplementary figure (Supplementary Fig 6) to show how the cryoDRGN analyses were performed.

      1. The orientation, angle and threshold used in Fig 1 make it difficult to see the multiple DNA orientations that are visible in the deposited consensus map. Examination of the map suggests that the DNA model submitted to PDB corresponds to a weaker DNA conformation than is present in the map where both DNA conformations are visible. The authors should consider modeling both conformations in their deposited model to provide a more complete, accurate representation of the data. It is concerning that a key conclusion of the manuscript is that the DNA conformation changes upon SIRT6 binding, but density for the canonical position is observable in Fig 8a.

      Figure 1 shows an overall representation of the SIRT6-bound nucleosome structure. We show the DNA linker orientations in the subsequent figure. Figure 8 (now Figure 9) shows the rearrangement of the SIRT6 Rossmann fold domain, not the DNA linker.

      1. Figure 4 needs a more complete legend, indicating that it is a hybrid of the consensus structure (one color) and the MD simulations (another color). In general, the colors used in the figure should be changed to make the main points more accessible.

      As there is a color code for the histones, changing colors might be confusing. The figure legend mentions that panels c, d and e are from MD simulations.

      Minor recommendations:

      1. Figures 2c, e, and f are not referenced in the text.

      We have now referenced all figure panels in the text.

      1. Consider moving Supp. 5C to Fig. 2 as the models in that figure come from the CryoDRGN maps and not the consensus map.

      Supplemental Figure 5c shows the DNA linker deviation upon SIRT6 binding from another angle. We prefer to keep it there.

      1.) Supp Fig 3 is labeled "ZnF-nucleosome" refinement, but this appears to come from Data Set #2 processing. The map might be labeled ZnF-nucleosome but then a mask should be shown that excludes the Rossman Fold. It is not clear if this is a focused refinement or just a 2.9 A map that was merged with the "Rossman-fold" map.

      We changed both supplemental figures accordingly.

      1. The orientations of Fig 2b and e do not show the differences in these models as well as panels c and f. Panels b and e could be replaced with the 4 CryoDRGN maps.

      The models reflect the cryoDRGN maps and panels c and f were added to clarify the movement.

      1. The MD description should emphasize that the H3 tails are moving with respect to the active site, as it currently suggests the active site is moving.

      In the results and in the discussion section we mention that we observe new conformations of the H3 tail, not of the active site.

      1. The authors refer to the "flexibility of the Rossmann fold domain," but the Rossman Fold domain isn't flexible, the linkage to the ZnF is flexible. Perhaps "observed conformational space" or "dynamic Rossman-fold domain position" are meant.

      The text was changed accordingly.

      1. The H2A C-terminal tail present in Fig 1 (bottom right) and Figure 3e is not present in the model in Fig 4a,b.

      The H2A tail conformation was not resolved in the cryoDRGN maps, so we did not model it.

      1. The crosslinking agent used is not specified.

      The crosslinking agent used is specified more clearly in the methods.

      1. Supp Table 1 and EM methods do not agree on the magnification for Dataset #1. Verify nominal versus binned magnification and reported pixel size.

      The magnification in the methods was changed.

      2. Fig 3F showing the difference between affinity for H2A and H2A.Z-containing nucleosomes would be more convincing with a titration rather than the current comparison of a single concentration.

      We agree with this remark; however, we find the single-concentration comparison convincing enough for the purposes of this paper, as it is not a central finding.

      1. Fig S1 legend; both the Zn-finger and helix bundle are stated to be shown in green.

      Figure S1 legend was changed.

    1. So what can we do about it? Well, we can start thinking about how we create more inclusive code and employ inclusive coding practices. It really starts with people.

      100% true. The real impact that can be made is on a people scale; with people putting in better policies and such, this CAN be prevented.

    2. I asked the developers what was going on, and it turned out we had used the same generic facial recognition software.

      I wonder what this code is and how easily it could be fixed to detect everyone's face. Also, what a coincidence.

    3. Hello, I'm Joy, a poet of code, on a mission to stop an unseen force that's rising

      A big metaphor for the implications of AI bias, which is what this video is about. It is interesting that her very first sentence is a metaphor like this, and I wonder if there is a good reason for it, like it being a hook or something. Additionally, I will say that it is kind of funny that her name is "Joy" and throughout the whole video she is happy even though she is talking about a serious issue.

    1. The underrepresentation of women and people of color in technology, and the under-sampling of these groups in the data that shapes AI, has led to the creation of technology that is optimized for a small portion of the world.

      This is important as it is one of the main reasons for the poor, biased decisions of the AI code. It is also institutional oppression, as it is the company deciding who to test for their AI data, even if it is an accident. I think with a bigger diversity of testers, AI will be more ethical and optimized for everyone to use. I feel hopeful.

    1. Sometimes you fuck up and lose your only copy of a GitHub secret that you can't replace easily, such as a Cachix signing key. However you lucked out and that key is actually saved in GitHub Actions secrets...which won't let you read the contents of that secret for understandable security reasons. Here's how you work around that.

      I remember that happening to me; a workaround involving some Python code and the Telegram Bot API worked for me.

    1. Author Response

      The authors wish to thank the Reviewers for valuable and constructive comments that will help us improve the paper’s quality.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This manuscript builds upon the authors' previous work on the cross-talk between transcription initiation and post-transcriptional events in yeast gene expression. These prior studies identified an mRNA 'imprinting' phenomenon linked to genes activated by the Rap1 transcription factor (TF), a surprising role for the Sfp1 TF in promoting RNA polymerase II (RNAPII) backtracking, and a role for the non-essential RNAPII subunits Rpb4/7 in the regulation of mRNA decay and translation. Here the authors aimed to extend these observations to provide a more coherent picture of the role of Sfp1 in transcription initiation and subsequent steps in gene expression. They provide evidence for (1) a physical interaction between Sfp1 and Rpb4, (2) Sfp1 binding and stabilization of mRNAs derived from genes whose promoters are bound by both Rap1 and Sfp1 and (3) an effect of Sfp1 on Rpb4 binding or conformation during transcription elongation.

      Strengths:

      This study provides evidence that a TF (yeast Sfp1), in addition to stimulating transcription initiation, can at some target genes interact with their mRNA transcripts and promote their stability. Sfp1 thus has a positive effect on two distinct regulatory steps. Furthermore, evidence is presented indicating that strong Sfp1 mRNA association requires both Rap1 and Sfp1 promoter binding and is increased at a sequence motif near the polyA track of many target mRNAs. Finally, they provide compelling evidence that Sfp1-bound mRNAs have higher levels of RNAPII backtracking and altered Rpb4 association or conformation compared to those not bound by Sfp1.

      Weaknesses:

      The Sfp1-Rpb4 association is supported only by a two-hybrid assay that is poorly described and lacks an important control. Furthermore, there is no evidence that this interaction is direct, nor are the interaction domains on either protein identified (or mutated to address function).

      Indeed, our two-hybrid, immunoprecipitation, and imaging results do not allow us to conclusively discern whether the interaction between Rpb4 and Sfp1 is direct or indirect. While the interaction holds significance, we consider the direct versus indirect distinction to be of secondary importance in the context of this paper. We intend to give more attention to this matter in our revised paper. In addition, we will make an effort to investigate an in vitro interaction between Sfp1 and Rpb4 by employing purified Sfp1 and Rpb4 proteins.

      The contention that Sfp1 nuclear export to the cytoplasm is transcription-dependent is not well supported by the experiments shown, which are not properly described in the text and are not accompanied by any primary data.

      We note that this assay was developed and published in prior research by Lee, M. S., M. Henry, and P. A. Silver (G&D, 1996) and has been reported in a number of subsequent papers. Reassuringly, our conclusion is supported by the observation that Sfp1 binds to Pol II transcripts co-transcriptionally, suggesting that Sfp1 is exported in the context of the mRNA.

      The presence of Sfp1 in P-bodies is of unclear relevance and the authors do not ask whether Sfp1-bound mRNAs are also present in these condensates.

      In the revised paper, we will indicate that we do not know whether RP mRNAs are present in the actual foci shown in Fig. 1B.

      Further analysis of Sfp1-bound mRNAs would be of interest, particularly to address the question of whether those from ribosomal protein genes and other growth-related genes that are known to display Sfp1 binding in their promoters are regulated (either stabilized or destabilized) by Sfp1.

      Fig. 4A, C and D show that RP mRNAs become destabilized in sfp1Δ cells.

      The authors need to discuss, and ideally address, the apparent paradox that their previous findings showed that Rap1 acts to destabilize its downstream transcripts, i.e. that it has the opposite effect of Sfp1 shown here.

      We would like to thank Reviewer 1 for this valuable comment. In the revised paper, we will delve into our hypothesis suggesting that Rap1 is likely responsible for regulating the imprinting of other proteins, such as Rpb4, that in turn lead to the destabilization of mRNAs.

      Finally, recent studies indicate that the drugs used here to measure mRNA stability induce a strong stress response accompanied by rapid and complex effects on transcription. Their relevance to mRNA stability in unstressed cells is questionable.

      Half-lives were determined mainly by the GRO analysis of optimally proliferating cells. This method does not require any drug or stressful treatment. The results obtained by this method were consistent with those obtained after thiolutin addition. Nevertheless, in our revised manuscript, we plan to supplement the half-life data with results obtained by subjecting cells to a temperature shift to 42°C, a natural method to block transcription in wild-type (WT) cells. This approach to determine half-lives has been previously reported in our publications, such as Lotan et al. (2005, 2007) and Goler Baron et al. (2008). This may rule out effects of the drug on half-life.

      Reviewer #2 (Public Review):

      Summary:

      The manuscript by Kelbert et al. presents results on the involvement of the yeast transcription factor Sfp1 in the stabilisation of transcripts whose synthesis it stimulates. Sfp1 is known to affect the synthesis of a number of important cellular transcripts, such as many of those that code for ribosomal proteins. The hypothesis that a transcription factor can remain bound to the nascent transcript and affect its cytoplasmic half-life is attractive, but the methods used to demonstrate the half-life effects and the association of Sfp1 with cytoplasmic transcripts remain to be fully validated, as explained in my comments on the results below:

      Comments on methodology and results:

      1. A two-hybrid-based assay for protein-protein interactions identified Sfp1, a transcription factor known for its effects on ribosomal protein gene expression, as interacting with Rpb4, a subunit of RNA polymerase II. Classical two-hybrid experiments depend on the presence of the tested proteins in the nucleus of yeast cells, suggesting that the observed interaction occurs in the nucleus. Unfortunately, the two-hybrid method cannot determine whether the interaction is direct or mediated by nucleic acids.

      Please see our response to comment 1 of Reviewer 1.

      1. Inactivation of nup49, a component of the nuclear pore complex, resulted in the redistribution of GFP-Sfp1 into the cytoplasm at the temperature non-permissive for the nup49-313 strain, suggesting that GFP-Sfp1 is a nucleo-cytoplasmic shuttling protein. This observation confirmed the dynamic nature of the nucleo-cytoplasmic distribution of Sfp1. For example, a similar redistribution to the cytoplasm was previously reported following rapamycin treatment and under starvation (Marion et al., PNAS 2004). In conjunction with the observation of an interaction with Rpb4, the authors observed slower nuclear import kinetics for GFP-Sfp1 in the absence of Rpb4 when cells were transferred to a glucose-containing medium after a period of starvation. Since the redistribution of GFP-Sfp1 was abolished in an rpb1-1/nup49-313 double mutant, the authors concluded that Sfp1 localisation to the cytoplasm depends on transcription. The double mutant yeast cells may show a variety of non-specific effects at the restrictive temperature, and whether transcription is required for Sfp1 cytoplasmic localisation remains incompletely demonstrated.

      We concur with Reviewer 2 that any heat inactivation of a temperature-sensitive (ts) protein can result in non-specific effects. In the instance of rpb1-1, these non-specific effects are anticipated because of the transcriptional arrest, which can eventually lead to a reduction in protein content. However, it is worth noting that this process takes some time, whereas the impact on export is more rapid. We note that this assay was developed and published in prior research by Pam Silver (op. cit.) and has been reported in a number of subsequent papers. Reassuringly, our conclusion is supported by the observation that Sfp1 binds to Pol II transcripts co-transcriptionally.

      1. Under starvation conditions, which led to the presence of Sfp1 in the cytoplasm and have previously been correlated with a decrease in the transcription of Sfp1 target genes, the authors observed that a plasmid-based expressed GFP-Sfp1 accumulated in cytoplasmic foci. These foci were also labelled by P-body markers such as Dcp2 and Lsm1. The quality of the microscopic images provided does not allow one to determine whether Rpb4-RFP colocalises with GFP-Sfp1.

      The submitted PDF figure is of low quality. We believe that a high-quality figure will be convincing.

      1. To understand to which RNA Sfp1 might bind, the authors used an N-terminally tagged fusion protein in a cross-linking and purification experiment. This method identified 264 transcripts for which the CRAC signal was considered positive and which mostly correspond to abundant mRNAs, including 74 ribosomal protein mRNAs or metabolic enzyme-abundant mRNAs such as PGK1. The authors did not provide evidence for the specificity of the observed CRAC signal, in particular, what would be the background of a similar experiment performed without UV cross-linking. In a validation experiment, the presence of several mRNAs in a purified SFP1 fraction was measured at levels that reflect the relative levels of RNA in a total RNA extract. Negative controls showing that abundant mRNAs not found in the CRAC experiment were clearly depleted from the purified fraction with Sfp1 would be crucial to assessing the specificity of the observed protein-RNA interactions. The CRAC-selected mRNAs were enriched for genes whose expression was previously shown to be upregulated upon Sfp1 overexpression (Albert et al., 2019). The presence of unspliced RPL30 pre-mRNA in the Sfp1 purification was interpreted as a sign of co-transcriptional assembly of Sfp1 into mRNA, but in the absence of valid negative controls, this hypothesis would require further experimental validation.

      We argue that the 264 CRAC+ genes represent a distinct group with many unique features. Moreover, many CRAC+ genes do not fall into the category of highly transcribed genes.

      The biological significance of the 264 CRAC+ mRNAs was demonstrated by various experiments, none of which can be explained by technical flaws. Some examples are:

      1. Fig. 2A and B show that most reads of CRAC+ mRNAs were mapped to a specific location, close to the pA sites.
      2. Fig. 2C shows that most reads of CRAC+ mRNAs were mapped to a specific RNA motif.

      3. Most RiBi CRAC+ promoters contain Rap1 binding sites (p = 1.9 × 10⁻²²), whereas the vast majority of RiBi CRAC- promoters do not contain a Rap1 binding site (Fig. 3C).

      4. Fig. 4A shows that RiBi CRAC+ mRNAs become destabilized due to Sfp1 deletion, whereas RiBi CRAC- mRNAs do not. Fig. 4B shows similar results due to

      5. Fig. 6B shows that the impact of Sfp1 on backtracking is substantially higher for CRAC+ than for CRAC- genes. This is most clearly visible in RiBi genes.

      6. Fig. 7A shows that the Sfp1-dependent changes along the transcription units are substantially more pronounced for CRAC+ than for CRAC- genes.

      7. Fig. S4B shows that the chromatin-binding profile of Sfp1 is different for CRAC+ and CRAC- genes.

      Moreover, only a portion of the RiBi mRNAs binds Sfp1, despite similar expression of all RiBi.

      Most importantly, these genes do not all fall into the category of highly transcribed genes. On the contrary, as depicted in Figure 6A (green dots), it is evident that CRAC+ genes exhibit a diverse range of Rpb3 ChIP and GRO signals. Furthermore, as illustrated in Figure 7A, when comparing CRAC+ to Q1 (the most highly transcribed genes), it becomes evident that the Rpb4/Rpb3 profile of CRAC+ genes is not a result of high transcription levels. In our revised paper, we will give increased attention to this matter in the Discussion section.

      1. To address the important question of whether co-transcriptional assembly of Sfp1 with transcripts could alter their stability, the authors first used a reporter system in which the RPL30 transcription unit is transferred to vectors under different transcriptional contexts, as previously described by the Choder laboratory (Bregman et al. 2011). While RPL30 expressed under an ACT1 promoter was barely detectable, the highest levels of RNA were observed in the context of the native upstream RPL30 sequence when Rap1 binding sites were also present. Sfp1 showed better association with reporter mRNAs containing Rap1 binding sites in the promoter region. However, removal of the Rap1 binding sites from the reporter vector also led to a drastic decrease in reporter mRNA levels. Whether the fraction of co-purified RNA is nuclear and co-transcriptional or not cannot be inferred from these results.

      The proposed co-transcriptional binding of Sfp1 is based on the findings presented in Figure 5C and Figure S2D, as well as the observed binding of Sfp1 to transcripts containing introns, as shown in Figures 2D and 3B. Our conclusion, which we still uphold, was drawn from the results presented in Figure 3. These results led us to the assertion that the "RNA-binding capacity of Sfp1 is regulated by Rap1-binding sites located at the promoter." We maintain our stance on this conclusion. Indeed, the Rap1 binding site does impact mRNA levels, as highlighted by Reviewer 2. However, "construct E," which possesses a promoter with a Rap1 binding site, exhibits lower transcript levels compared to "construct F," which lacks such a binding site in its promoter. Despite this difference in transcript levels, Sfp1 was able to pull down the former transcript but not the latter, even though expression of the former gene is relatively low. Thus, the results appear to be more reliant on the specific capacity of Sfp1 to interact with the transcript rather than on the transcript's expression level.

      1. To complement the biochemical data presented in the first part of the manuscript, the authors turned to the deletion or rapid depletion of SFP1 and used labelling experiments to assess changes in the rate of synthesis, abundance, and decay of mRNAs under these conditions. An important observation was that in the absence of Sfp1, mRNAs encoding ribosomal protein genes not only had a reduced synthesis rate but also an increased degradation rate. This important observation needs careful validation, as genomic run-on experiments were used to measure half-lives, and this particular method was found to give results that correlated poorly with other measures of half-life in yeast (e.g. Chappelboim et al., 2022 for a comparison). Similarly, the use of thiolutin to block transcription as a method of assessing mRNA half-life has been reported to be problematic, as thiolutin can specifically inhibit the degradation of ribosomal protein mRNA (Pelechano & Perez-Ortin, 2008). Specific repressible reporters, such as those used by Baudrimont et al. (2017), would need to be tested to validate the effect of Sfp1 on the half-life of specific mRNAs. Also, it would be very difficult to infer from the images presented whether the rate of deadenylation is altered by Sfp1.

      Various methods exist for assessing mRNA half-lives (HLs), and each of them carries its own set of challenges and biases. Consequently, it becomes problematic to directly compare HL values of a specific mRNA when different methods are employed. The superiority of one particular method over others remains unclear. However, they all exhibit a high degree of reliability when it comes to comparing different strains under identical conditions using a single method.

      Estimating half-lives through the GRO approach is a non-invasive method, applied on optimally proliferating cells, which has been employed in numerous publications. While no method is without its limitations, we consider this approach to be among the most dependable. Our HL determination using thiolutin to block transcription provided results that were consistent with the values obtained by the GRO approach.

      Nevertheless, in our revised manuscript, we plan to supplement the HL data obtained with thiolutin with results obtained by subjecting cells to a temperature shift to 42°C, a natural method to block transcription in wild-type (WT) cells. This approach to determine HLs has been previously reported in our publications, such as Lotan et al. (2005, 2007) and Goler Baron et al. (2008).

      1. The effects of SFP1 on transcription were investigated by chromatin purification with Rpb3, a subunit of RNA polymerase, and the results were compared with synthesis rates determined by genomic run-on experiments. The decrease in Pol II presence on transcripts in the absence of SFP1 was not accompanied by a marked decrease in transcript output, suggesting an effect of Sfp1 in ensuring robust transcription and avoiding RNA polymerase backtracking. To further investigate the phenotypes associated with the depletion or absence of Sfp1, the authors examined the presence of Rpb4 along transcription units compared to Rpb3. One effect of sfp1 deficiency was that this ratio, which decreased from the start of transcription towards the end of transcripts, increased slightly. The results presented are largely correlative and could arise from the focus on very specific types of mRNAs, such as those of ribosomal protein genes, which are sensitive to stress and are targeted by very active RNA degradation mechanisms activated, for example, under heat stress (Bresson et al., 2020).

      Figure 7A illustrates a significant reduction in Rpb4/Rpb3 ratios along the transcription unit in WT cells. This reduction is notably more pronounced in CRAC+ genes compared to the highly transcribed quartile (Q1), which includes all ribosomal protein (RP) genes, and it is completely absent in sfp1∆ cells. Furthermore, it's important to highlight that the CRAC+ gene group displays a wide range of transcription rates, as measured by either Rpb3 ChIP or GRO (Figure 6A). Given these observations, it is challenging to reconcile how the heightened sensitivity of RP mRNA degradation in response to stress could account for the more pronounced differences in the configuration of the Pol II elongation complex that are detected in CRAC+ genes under standard culture conditions in WT cells.

      Correlative studies are particularly informative when a gene mutation eliminates a correlation, and this is precisely the type of study depicted in Figure 7B-C. The configuration of elongating Pol II (as reflected by Rpb4/Rpb3 ratios) and the backtracking index are both transcriptional outputs. It is difficult to envision how stress-induced destabilization of RP mRNAs could explain the twofold higher correlation between these two parameters observed in CRAC+ genes under non-stressful conditions in WT cells (Figure 7B).

      Furthermore, it's worth noting that in WT cells, CRAC+ genes did not display any apparent unusual destabilization, but rather exhibited higher (not lower) mRNA stability compared to CRAC- genes (Figure 7C).

      Strengths:
      - Diversity of experimental approaches used
      - Validation of large-scale results with appropriate reporters

      Weaknesses:
      - Choice of evaluation method to test mRNA half-life
      - Lack of controls for the CRAC results

    1. Reviewer #1 (Public Review):

      The manuscript investigates the binding of PHD-BD, a tandem of reader domains in the C-terminus of BPTF, to modified histone tail peptides and nucleosomes. It focuses on the differences in binding affinity between peptide and nucleosome substrates for BPTF PHD-BD. Using the dCypher approach, they find that multi-modified peptide substrates (both acetylation and methylation) do not increase PHD-BD binding affinity. They argue that histone peptide substrates do not support the histone code model, which champions that multivalent engagement by PHD-BD with a multi-modified substrate would lead to stronger binding when compared to the engagement of each domain alone. In contrast, when using nucleosome substrates, even though the overall affinity is reduced, the affinity for H3K4me3triac (double modification) is tighter than either modification on its own. This is consistent with the histone code model.

      A strength of the manuscript is that it further delineates the contribution of each domain by again using dCypher to compare peptide and nucleosome binding of the PHD and BD domains alone, as well as tandem domain constructs where each domain has been inactivated by a point mutation (W2891A for the PHD and N3007A for the BD). PHD alone had a lower affinity for nucleosomes than peptides overall. With peptide substrates, PHD had the highest affinity for H3K4me3 and reduced affinity for H3K4me3triac; while with nucleosomes this trend was reversed. BD alone showed an affinity for acetylated H3 and H4 peptides but surprisingly was unable to bind nucleosomes. PHD requires the combination of H3K4 methylation and H3 tail acetylation for binding, and when partnered with BD, which is not able to bind nucleosomes alone, interestingly confers specificity for K14ac and K18ac. The in vivo relevance is argued using CUT&RUN analysis.

      NMR spectroscopy is further used to show that PHD-BD binds acetylated H3 in a multivalent manner while forming a unique complex with H3K4me3triac. Deleting the N-terminal A1 region of H3 abolishes the binding of PHD-BD, implying its importance for recognition. The authors also discuss a "fuzzy complex" that forms between H3 and DNA, as well as H4 and DNA, which explains the occlusion of histone tail accessibility in the nucleosome. By changing the sidechain charge, such as with PTMs, this interaction can be weakened and allow PHD in this case to bind to the modified H3 tail. Comparisons between spectra of the H4 tail, H4 tail with DNA, and the H4 tail in the nucleosome are made and used to argue for H4-DNA interactions in the nucleosome.

      The conclusions of the manuscript are very well-supported by the data and reveal a lot of insight into how the two reader domains of BPTF interact with modified nucleosomes. In many places, however, the manuscript is written more generally as if the conclusions apply in all cases (e.g. the title, abstract, and introduction) and this remains to be determined. It is also overstated that there is a belief that peptides perfectly recapitulate nucleosomes. It should also be pointed out that the nucleosomes are multi-valent and the data cannot discriminate binding of a single PHD-BD to single or multiple tails, and that the work is limited as it is using a construct of BPTF and in fact, there is at least one other reader domain involved.

    2. Reviewer #2 (Public Review):

      This manuscript by Musselman and coworkers uses a commercial library of modified histone peptides and mononucleosomes to probe the substrate specificity of the PHD-bromodomain combination of the BPTF protein. They arrive at the conclusion that BPTF preferably binds H3K4me3 and H3K18ac in the H3 tail. By using NMR with labeled H4 protein in nucleosomes they show that the H4 tail interacts with DNA, which may limit its ability to interact with BPTF. Finally, experiments in cells demonstrate that BPTF, H3K4me3, and H3K18ac occupy overlapping regions of chromatin. The authors suggest that recruitment of BPTF to specific regions of chromatin is driven by the co-binding of H3K4me3 and H3K18ac by BPTF. This study is of interest to readers interested in understanding the functions of the BPTF protein in cells.

      In this reviewer's opinion, the manuscript needs some revision and the inclusion of some missing information.

      1) The authors seem to have overlooked the fact that mononucleosome substrates have been in use for determining the substrate specificity and mechanisms of quite a few enzymes that simply do not act on peptide substrates. For example, Dot1L doesn't do anything with peptides nor does COMPASS/Set1, both of which require intact nucleosomal substrates to measure their activity in response to ubiquitylated H2B. Thus, the authors' refinement of the "histone code hypothesis" is unnecessary and overdone. I would suggest that they instead cite examples where nucleosome substrates have provided answers that cannot be obtained from peptide substrates alone. For example, extensive work from the Muir and Allis labs.

      2) Ruthenburg and Allis in Cell 2011 conducted similar experimentation and concluded that H3K4me3-H4K16ac is a modification state bound by BPTF in cells. They also showed co-localization in ChIP-seq experiments and demonstrated preferential pulldowns with BPTF and semisynthetic methylated and acetylated nucleosomes. The authors have entirely ignored these previous results in their own discussions. Readers would benefit from a side-by-side comparison of the two acetylation states to get a sense of which is a stronger interaction and why both seemingly correlate in CUTnRUN or ChIP-seq.

      3) The idea that electrostatics may modulate tail accessibility was reported by Musselman and coworkers for the H3 tail in eLife 2018. Yet the PHD domain of BPTF clearly binds H3K4me3 in nucleosomes. In light of this prior observation, the NMR experiments now with H4 tail seem repetitive and not informative regarding BPTF's bromodomain binding. Also missing is the effect of H4K16 acetylation on H4 tail dynamics, which would be pertinent to addressing the hypothesis regarding the BPTF bromodomain binding H4K16ac.

      4) The NMR experiments are all undertaken with 150mM KCl with no NaCl present. While NMR experimental constraints are understandable, the authors should avoid sweeping statements from NMR experiments regarding the dynamism of histone tails in chromatin, unless specific experiments are cited/conducted to demonstrate the same in cells. Many factors may contribute to the exclusion of BPTF from modified histone tails in cells, including the binding of other reader proteins, and the precise genomic localization of these modifications vis-a-vis BPTF. The important role of anchoring proteins must also be taken into account when considering binding/non-binding of substrates by CAPs. Thus, the NMR experiments presented in the manuscript do not report on whether BPTF binds H4K16ac in cells or indeed in vitro. If the PHD domain is capable of ultimately binding the H3 tail despite the tail's fuzzy interaction with DNA, the question remains as to why the bromodomain may not do so for acetylated H4 tails?

      This manuscript reports several interesting elements regarding BPTF regulation, but as presented it is missing some key comparisons with prior information that makes it hard for readers to assess the relevance of the results presented.

    1. we are certainly special I mean 00:02:57 no other animal reached the moon or knows how to build atom bombs so we are definitely quite different from chimpanzees and elephants and all the rest of the animals but we are still 00:03:09 animals you know many of our most basic emotions much of our society is still run on Stone Age code
      • for: stone age code, similar to - Ronald Wright - computer metaphor, evolutionary psychology - examples, evolutionary paradox of modernity, evolution - last mile link, major evolutionary transition - full spectrum in modern humans, example - MET - full spectrum embedded in modern humans

      • comment

      • insights

        • evolutionary paradox of modernity
          • modern humans, like all the living species we share the world with, are the last mile link of the evolution of life; we've made it to the present, so all species of the present are, in an evolutionary sense, winners of their respective evolutionary game
          • this means that all our present behaviors contain the full spectrum of the evolutionary history of 4 billion years of life
          • the modern human embodies all major evolutionary transitions of the past
          • so our behavior, at all levels of our being, is a complex and heterogeneous mixture of evolutionary adaptations from different time periods of the 4 billion years that life has taken to evolve.
          • Some behaviors may have originated billions of years ago, and others a hundred thousand years ago.
      • Examples: humans embody full spectrum of METs in our evolutionary past

        • fight and flight response
          • early hominids on African Savannah hundreds of thousands to millions of years ago when hominids were predated upon by wild predators
        • cancer
          • normative intercell communication breaks down and reverts to individual cell behavior from billions of years ago
            • see Michael Levin's research on how to make metastatic cancer cells return to normative collective, cooperative behavior
        • children afraid to sleep in the dark
          • evolutionary adaptation against dangerous animals that might have hidden in the dark - dangerous insects, snakes, etc., which in the past may have resulted in human fatalities
        • obesity
          • hunter gatherer hominid attraction to rich sources of fruit. Eating as much of it as we can and maybe harvesting as much as we can and carrying that with us.
            • like squirrels storing away for the winter.
    1. the declaration itself is Smalltalk code, indeed the message #subclass:instanceVariableNames:classVariableNames:... was sent to Magnitude to create this class.

      Gilad Bracha's criticism:

      Semi-tangent: source code seems the most obvious thing in the world; but traditional Smalltalk’s have no real syntax above the method level! Classes are defined via the evaluation of reflective expressions, which rely on the reflective API. This is very problematic: the API often varies from one implementation to another. By the way, this is one of the ways Newspeak differs from almost every Smalltalk (the late, great Resilient being the only exception I can recall). Newspeak has a true syntax. Furthermore, because Newspeak module declarations are fully parametric in all their external dependencies, they can be compiled at any time in any order - unlike code in most languages (say Java packages) where there are numerous constraints on compilation order (e.g., imports must be defined).

      See the full blog post An Image Problem.
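The contrast Bracha draws, between declaring a class with dedicated syntax and creating it by evaluating a reflective expression, can be sketched by analogy in Python. This is only an analogy, not Smalltalk or Newspeak: the class names `Celsius` and `CelsiusReflective` are hypothetical, and Python's built-in `type(name, bases, namespace)` call stands in for sending a message such as `#subclass:instanceVariableNames:...` to a superclass.

```python
# Declarative form: the language has syntax above the method level.
class Celsius:
    def __init__(self, degrees):
        self.degrees = degrees

    def __lt__(self, other):
        return self.degrees < other.degrees


# Reflective form: the class is the *result of evaluating an expression*
# that calls the reflective API (here, the built-in three-argument type()).
def _init(self, degrees):
    self.degrees = degrees


def _lt(self, other):
    return self.degrees < other.degrees


CelsiusReflective = type(
    "CelsiusReflective",                  # class name
    (object,),                            # superclasses
    {"__init__": _init, "__lt__": _lt},   # methods
)
```

Both forms yield a working class. The reflective one depends on the exact shape of the reflective API, which is the portability concern raised in the quote.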

    1. your remarks made me think of the work of [unclear in transcript: "e notel calves"], which in fact describes the lack of a code of laïcité, which 01:09:29 also sometimes keeps us from finding our bearings in the definition of that term. So I was wondering, since I found the compilations you had made interesting, in particular of the texts: 01:09:42 I was wondering whether, in the rulings of the Conseil d'État, particularly in school-related cases, this phrase "values of the Republic" was mentioned, even though I understand that it has not been 01:09:53 defined. Is it mentioned? Well, I will speak subject to correction by Matthieu Clouet, who will have a clearer answer than mine. In the rulings of the Conseil d'État I did find more 01:10:05 precise definitions when it came to expelling foreigners who undermined the values of the Republic. So in that case these are not education cases but rather cases concerning the right of asylum or the law on foreigners, where foreigners were sometimes 01:10:18 expelled because they were seen as problematic with respect to the values of the Republic. And so the Conseil d'État spelled out why it upheld or rejected the 01:10:29 OQTF decision (obligation to leave French territory) or the decision to refuse an asylum application. But in education, I think there have been studies or opinions from the Conseil d'État, but never decisions on the values 01:10:42 of the Republic in school matters. So, in matters of the law on foreigners, that is certain. I will let Matthieu Clouet perhaps clarify, I don't know. No, to my knowledge there is 01:10:54 no definition in the rulings of the Conseil d'État. What can sometimes be found there are clarifications about how it applies, notably in schools: the link with school attendance, 01:11:07 for example, that can be found.
  2. Nov 2023

    Annotators

    1. 20.3.1. Programming in English. Most programming languages are based in English, and there are very few non-English programming languages, and those that exist are rarely used. The reason few non-English programming languages exist is due to the network effect, which we mentioned last chapter. Once English became the standard language for programming, people who learn programming learn English (or enough to program with it). Attempts to create a non-English programming language face an uphill battle, since even those that know that language would still have to re-learn all their programming terms in the non-English language. Now, since many people do speak other languages, you can often find comments, variable names, and even sometimes coding libraries which use non-English languages, but the core coding terms (e.g., for, if, etc.), are still almost always in English. See also this academic paper: Non-Native English Speakers Learning Computer Programming: Barriers, Desires, and Design Opportunities

      That resonates with me deeply. While English is not my native language, I've come to appreciate the wisdom behind programming being conducted exclusively in English. It establishes a universal standard, allowing coders worldwide to share a common language. This not only promotes convenience but ensures that even if someone's English proficiency is not perfect, their code remains understandable to English speakers.

    2. Now, since many people do speak other languages, you can often find comments, variable names, and even sometimes coding libraries which use non-English languages, but the core coding terms (e.g., for, if, etc.), are still almost always in English.

      I think that even with people who speak in other languages, the "for" and "if" coding terms are just symbols to them that they interpret in their brains as something necessary to code. I think the universal code makes it easier to handle.
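The point made in the highlighted sentence can be illustrated with a small sketch. It assumes Python 3, which accepts non-ASCII identifiers; the function name `contar_pares` and the Spanish variable names are hypothetical, chosen only for illustration. Comments and names are in Spanish, while the keywords (`def`, `for`, `if`, `return`) remain English.

```python
# Cuenta cuántos números pares hay en la lista.
# (Counts how many even numbers are in the list.)
def contar_pares(números):
    cantidad = 0
    for número in números:      # "for" and "in" are fixed English keywords
        if número % 2 == 0:     # so is "if"
            cantidad += 1
    return cantidad


resultado = contar_pares([1, 2, 3, 4])
```

Only the names the programmer chooses can be localized; the core vocabulary of the language cannot, which matches the annotator's remark that `for` and `if` act as fixed symbols.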

    3. Most programming languages are based in English, and there are very few non-English programming languages, and those that exist are rarely used. The reason few non-English programming languages exist is due to the network effect, which we mentioned last chapter. Once English became the standard language for programming, people who learn programming learn English (or enough to program with it). Attempts to create a non-English programming language face an uphill battle, since even those that know that language would still have to re-learn all their programming terms in the non-English language. Now, since many people do speak other languages, you can often find comments, variable names, and even sometimes coding libraries which use non-English languages, but the core coding terms (e.g., for, if, etc.), are still almost always in English.

      As a child, I initially thought programming languages were translations of other languages, but I soon discovered that this wasn't the case when I began learning how to code. Given that English was the language of choice in the early development of programming languages, it makes sense that many programs were designed using the same foundation. However, I believe that to promote global learning and ensure efficient access to this skill, we should explore ways to make coding more accessible in languages other than English.

    1. the Education Code the right of pupils to pursue their schooling free from harassment, while expressing particular concern about cyberbullying, which is growing because of young people's use of social networks.

      N

    1. IPFS Connect Istanbul 2023: IPVM - Bringing Wasm-Based Edge Compute to IPFS

      [docdrop](IPFS Connect Istanbul 2023: IPVM - Bringing Wasm-Based Edge Compute to IPFS

      15 views 22 Nov 2023 The advent of TCP/IP and the web produced an explosion of innovation by radically lowering the barrier to entry to connect over the network. Thanks to new technical and social innovations, we now have the building blocks for the next generation of open services: location-free verifiable data and computation.

      Verifiable computation opens the door to content-addressed function invocations, results, and workflows. This radically lowers the complexity of historical architectures (e.g. the LAMP stack), networked, and distributed systems. Not only is this easier to reason about, it also (paradoxically) enables superlinear scaling: the more it gets used, the more efficient it becomes!

      This talk presents UCAN Invocations and the Interplanetary VM (IPVM). Code in this model can run anywhere (even offline), respects data privacy, and services interoperate seamlessly out of the box without pre-negotiation. Since computation doesn't happen in a vacuum, we will also describe how the workflow planner interacts with existing services and how to lift them into this seamless paradigm.)

    2. IPFS Connect Istanbul 2023: IPVM - Bringing Wasm-Based Edge Compute to IPFS

      15 views 22 Nov 2023 The advent of TCP/IP and the web produced an explosion of innovation by radically lowering the barrier to entry to connect over the network. Thanks to new technical and social innovations, we now have the building blocks for the next generation of open services: location-free verifiable data and computation.

      Verifiable computation opens the door to content-addressed function invocations, results, and workflows. This radically lowers the complexity of historical architectures (e.g. the LAMP stack), networked, and distributed systems. Not only is this easier to reason about, it also (paradoxically) enables superlinear scaling: the more it gets used, the more efficient it becomes!

      This talk presents UCAN Invocations and the Interplanetary VM (IPVM). Code in this model can run anywhere (even offline), respects data privacy, and services interoperate seamlessly out of the box without pre-negotiation. Since computation doesn't happen in a vacuum, we will also describe how the workflow planner interacts with existing services and how to lift them into this seamless paradigm.

    3. 15 views 22 Nov 2023 The advent of TCP/IP and the web produced an explosion of innovation by radically lowering the barrier to entry to connect over the network. Thanks to new technical and social innovations, we now have the building blocks for the next generation of open services: location-free verifiable data and computation.

      Verifiable computation opens the door to content-addressed function invocations, results, and workflows. This radically lowers the complexity of historical architectures (e.g. the LAMP stack), networked, and distributed systems. Not only is this easier to reason about, it also (paradoxically) enables superlinear scaling: the more it gets used, the more efficient it becomes!

      This talk presents UCAN Invocations and the Interplanetary VM (IPVM). Code in this model can run anywhere (even offline), respects data privacy, and services interoperate seamlessly out of the box without pre-negotiation. Since computation doesn't happen in a vacuum, we will also describe how the workflow planner interacts with existing services and how to lift them into this seamless paradigm.

    4. This talk presents UCAN Invocations and the Interplanetary VM (IPVM).

      This talk presents UCAN Invocations and the Interplanetary VM (IPVM). Code in this model can run anywhere (even offline), respects data privacy, and services interoperate seamlessly out of the box without pre-negotiation. Since computation doesn't happen in a vacuum, we will also describe how the workflow planner interacts with existing services and how to lift them into this seamless paradigm.

    1. Change: Complete re-write of code handling keyboard frames and avoidance. This addresses a few nagging issues with improper display around the on-screen keyboard on iPad, in particular. Fix: Keyboard layout issues in share extension on iPad.

      I do believe this might be the culprit for all the problems I've had with suddenly/inconsistently unresponsive keyboard shortcuts recently...

    1. ​We must start with end-programmer programming

      ​We must start with end-programmer programming - to achieve end-user programming,

      - where users can
      
      • spin up personalized apps
        • without knowing how to code.

      With end-programmer programming, software engineers can build - folk applications, integrations, and mini-apps - to customize - their experience interacting with third-party software.

    2. ​We must start with end-programmer programming to achieve end-user programming, where users can spin up personalized apps without knowing how to code. With end-programmer programming, software engineers can build folk applications, integrations, and mini-apps to customize their experience interacting with third-party software.

      Description

      local ipfs

    1. Anybody that writes code for some purpose (whether as a researcher, a software engineer, or in any other profession) will get to the point where others are relying on their code.

      Code is not written for yourself alone

    1. For organizations where passwordless authentication is not yet possible, the next best option is to use adaptive multi-factor authentication (Adaptive MFA) as a security measure. This approach monitors the user’s login behavior on the basis of location, device, network, and more to determine which authentication methods to use. If the risk factor is high, then the user would be asked to submit an additional identifying factor such as a TOTP code or a one-time password.

      adaptive multi-factor authentication
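The decision described in the highlighted passage (inspect login context, then require an extra factor when risk is high) can be sketched as follows. This is a minimal illustration, not any vendor's implementation: `LoginContext`, `risk_score`, `required_factors`, the three signals, and the threshold of one unfamiliar signal are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class LoginContext:
    """Signals observed at login time (hypothetical, simplified)."""
    known_device: bool
    usual_location: bool
    trusted_network: bool


def risk_score(ctx: LoginContext) -> int:
    """Count how many signals look unfamiliar; higher means riskier."""
    signals = (ctx.known_device, ctx.usual_location, ctx.trusted_network)
    return sum(1 for familiar in signals if not familiar)


def required_factors(ctx: LoginContext, threshold: int = 1) -> list:
    """Always require a password; add a TOTP step when risk reaches the threshold."""
    factors = ["password"]
    if risk_score(ctx) >= threshold:
        factors.append("totp")
    return factors


routine = required_factors(LoginContext(True, True, True))
unusual = required_factors(LoginContext(False, True, True))
```

A production system would presumably weight signals differently and learn them from history; the sketch only shows the shape of the adaptive step.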

    1. m. Daalderop et al. (1988a) predicted a large Kerr rotation of 5° for UNiSn, but did, unfortunately, not publish a test of the computational method on simpler systems like the elemental 3d metals.

      Did they make their code available for others to do the test?

    Annotators

    1. Never worry about stack traces again: The IDE should just get it, and auto-fix the code for you.

      Suggesting changes (diff) instead of "auto-changing" would be more in the comfort zone for most, even in 2024 (??)

    2. Pseudo-code mode: Edit an “outline” representation of your code and have the changes automatically applied at the source level.

      This is a whole other project :) With a baggage of structure editing, edit-AST->code generation

    3. Reader mode: Make code understanding effortless with docs at any level of specificity and a bot that guides you through the relevant code paths, explaining as-needed.

      Generating diagrams would be useful (e.g. mermaid), a la chatGPT plugins

      Overlay UI
        • for missing docstrings, top level docstring for package/module/namespace
        • line level explainers / exploratory comments

      General overview w/ jump to symbol links

      Quote documentation from relevant packages w/ links

    4. Time warp: Predict and display the cross-file code changes you’ll make in the next 15 minutes. One key command to accept all insertions/deletions.

      Needs:
        • accept all
        • accept granular changes
        • ability to ask for variants & iterate

      Interesting, real-time-ish as you type HUD

      (For that, also a high-level summary of the changes)

    1. Reviewer #1 (Public Review):

      This work seeks to understand how behaviour-related information is represented in the neural activity of the primate motor cortex. To this end, a statistical model of neural activity is presented that enables a non-linear separation of behaviour-related from unrelated activity. As a generative model, it enables the separate analysis of these two activity modes, here primarily done by assessing the decoding performance of hand movements the monkeys perform in the experiments. Several lines of analysis are presented to show that while the neurons with significant tuning to movements strongly contribute to the behaviourally-relevant activity subspace, less or un-tuned neurons also carry decodable information. It is further shown that the discovered subspaces enable linear decoding, leading the authors to conclude that motor cortex read-out can be linear.

      Strengths:

      In my opinion, using an expressive generative model to analyse neural state spaces is an interesting approach to understanding neural population coding. While potentially sacrificing interpretability, this approach allows capturing both redundancies and synergies in the code as done in this paper. The model presented here is a natural non-linear extension of a previous linear model (PSID) and

      Weaknesses:

      First, the model in the paper is almost identical to an existing VAE model (TNDM) that makes use of weak supervision with behaviour in the same way [1]. This paper should at least be referenced. If the authors wish they could compare their model to TNDM, which combines a state space model with smoothing similar to LFADS. Given that TNDM achieves very good behaviour reconstructions, it may be on par with this model without the need for a Kalman filter (and hence may achieve better separation of behaviour-related and unrelated dynamics).

      Second, in my opinion, the claims regarding identifiability are overstated - this matters as the results depend on this to some extent. Recent work shows that VAEs generally suffer from identifiability problems due to the Gaussian latent space [2]. This paper also hints that weak supervision may help to resolve such issues, so this model as well as TNDM and CEBRA may indeed benefit from this. In addition, however, it appears that the relative weight of the KL Divergence in the VAE objective is chosen very small compared to the likelihood (0.1%), so the influence of the prior is weak and the model may essentially learn the average neural trajectories while underestimating the noise in the latent variables. This, in turn, could mean that the model will not autoencode neural activity as well as it should; note that an average R2 in this case will still be high (I could not see how this is actually computed). At the same time, the behaviour R2 will be large simply because the different movement trajectories are very distinct. Since the paper makes claims about the roles of different neurons, it would be important to understand how well their single trial activities are reconstructed, which can perhaps best be investigated by comparing the Poisson likelihood (LFADS is a good baseline model). Taken together, while it certainly makes sense that well-tuned neurons contribute more to behaviour decoding, I worry that the very interesting claim that neurons with weak tuning contain behavioural signals is not well supported.
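The concern about a very small KL weight can be made concrete with a minimal sketch of a weighted VAE objective. It is not the reviewed model's code: the function names are hypothetical, and the standard-normal prior and Poisson likelihood are assumptions chosen for illustration (the review itself notes that the paper learns a non-parametric prior).

```python
import math


def gaussian_kl(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))


def poisson_nll(counts, rates):
    """Negative Poisson log-likelihood of spike counts given predicted rates."""
    return sum(r - k * math.log(r) + math.lgamma(k + 1) for k, r in zip(counts, rates))


def weighted_objective(counts, rates, mu, log_var, kl_weight):
    """Reconstruction term plus the KL term scaled by kl_weight."""
    return poisson_nll(counts, rates) + kl_weight * gaussian_kl(mu, log_var)


# A posterior far from the prior is barely penalised when kl_weight = 0.001.
kl_term = gaussian_kl([3.0, -3.0], [0.0, 0.0])                    # 9.0
loss_small = weighted_objective([2], [2.0], [3.0, -3.0], [0.0, 0.0], 0.001)
loss_full = weighted_objective([2], [2.0], [3.0, -3.0], [0.0, 0.0], 1.0)
```

With a weight of 0.001 the KL term contributes 0.009 to the loss instead of 9.0, so the optimiser has little incentive to keep the posterior near the prior. This is the mechanism behind the reviewer's remark that the influence of the prior is weak.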

      Third, and relating to this issue, I could not entirely follow the reasoning in the section arguing that behavioural information can be inferred from neurons with weak selectivity, but that it is not linearly decodable. It is right to test if weak supervision signals bleed into the irrelevant subspace, but I could not follow the explanations. Why, for instance, is the ANN decoder on raw data (I assume this is a decoder trained fully supervised) not equal in performance to the relevant distilled signals? Should a well-trained non-linear decoder not simply yield a performance ceiling? Next, if I understand correctly, distilled signals were obtained from the full model. How does a model perform trained only on the weakly tuned neurons? Is it possible that the subspaces obtained with the model are just not optimally aligned for decoding? This could be a result of limited identifiability or model specifics that bias reconstruction to averages (a well-known problem of VAEs). I, therefore, think this analysis should be complemented with tests that do not depend on the model.

      Finally, a more technical issue to note is related to the choice to learn a non-parametric prior instead of using a conventional Gaussian prior. How is this implemented? Is just a single sample taken during a forward pass? I worry this may be insufficient as this would not sample the prior well, and some other strategy such as importance sampling may be required (unless the prior is not relevant as it weakly contributed to the ELBO, in which case this choice seems not very relevant). Generally, it would be useful to see visualisations of the latent variables to see how information about behaviour is represented by the model.

      Summary:

      This paper presents a very interesting analysis, but I have several concerns as to how well the analysis supports the main conclusions. I think the work could benefit from an additional complementary analysis that seeks to confirm with another method if weakly tuned neurons indeed show an encoding that differs qualitatively from the strongly tuned ones.

      [1] Hurwitz, Cole, et al. "Targeted neural dynamical modeling." Advances in Neural Information Processing Systems 34 (2021): 29379-29392.
      [2] Hyvarinen, Aapo, Ilyes Khemakhem, and Hiroshi Morioka. "Nonlinear Independent Component Analysis for Principled Disentanglement in Unsupervised Deep Learning." arXiv preprint arXiv:2303.16535 (2023).

    1. Observe the text here is formatted to ease code understanding. It is possible to write the cascade of messages in one line, but it reduces the readability of the code:

      Use a Cascade to send several messages to the same receiver. Separate the messages with a semicolon. Put each message on its own line and indent one tab. Only use Cascades for messages with zero or one argument.

      —Kent Beck, Smalltalk Best Practice Patterns

    1. Lyle Bickley explains the PDP-1 (and we play the original Spacewar!) at the CuriousMarc YouTube channel

      From the description:

      Lyle Bickley, of the PDP-1 restoration team, gives us a tour of this amazing, early scientific interactive computer at the Computer History Museum. The first machine built by DEC in 1959, it features a superb graphics screen. DEC gave one to MIT, and some very bright students went wild. Gems such as Spacewar!, Snowflake, 4-voice music programs were all developed by moonlighting MIT students, unencumbered by its measly 12kW memory and pokey 100,000 instructions per second. Along with much more serious debugging and programming languages of course. You can come and see the real machine for yourself at the Computer History Museum in Mountain View, California: http://www.computerhistory.org/

      Also, Norbert Landsteiner made this incredible simulation of the PDP-1 that can run the original Spacewar! and Minskytron code in your browser: https://www.masswerk.at/spacewar/ https://www.masswerk.at/minskytron/ He also made a gate exact replica with Verilog code on github: https://www.youtube.com/watch?v=iymD9eysqXo

    1. This is where I showcase the various images I've created over the years pursuing my hobby of using maths to create patterns. Wherever possible I've tried to provide explanations of how the patterns are generated and in some cases I've provided source code that you can download and adapt.

      A website is both a hobby and a life's work

    1. Scarlet Harris (profile at University of Cambridge)

      Contact Information: sh2232@cam.ac.uk

      Dr Scarlet Harris is a Teaching Associate in the department of Sociology.

      She received her BA in Sociology from the University of Edinburgh and her MRes and PhD in Sociology from the University of Glasgow. Dr Harris has held various research posts at the University of Manchester, including with the Centre on the Dynamics of Ethnicity (CoDE).

      She is currently writing a book based on her doctoral research, entitled 'Islamophobia, anti-racism and the British left', which will be published by Manchester University Press.

      https://research.sociology.cam.ac.uk/profile/dr-scarlet-harris

      accessed:: 2023-11-25 17:10

    1. <?php $fullname = 'Mathieu Nebra'; echo 'Bonjour ' . $fullname . ' et bienvenue sur le site !'; // OK ?>

      Are the dots before and after the variable actually needed, or are they only there to make the code more readable?

    1. I genuinely can’t understand how anybody could look at the mess that’s Rust’s async and think that it was a good design for a language that already had the reputation of being very complicated to write. I tried to get it, I really did, but my god what a massive mess that is. And it contaminates everything it touches, too. I really love Rust and I do most of my coding in it these days, but every time I encounter async-heavy Rust code my jaw clenches and my vision blurs.

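      I can't understand why anyone would think that, for Rust, which is already known for its complexity, this async implementation could be called a good design. I tried to understand it but still can't accept it; it contaminates all the code that touches it. I like Rust a lot, and it is also the language I use most, but every time I write async-heavy code my jaw clenches and my vision blurs.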

    1. Reviewer #1 (Public Review):

      Summary:

      The goal of Pawel et al. is to provide a more rigorous and quantitative approach for judging whether or not an initial null finding (conventionally with p >= 0.05) has been replicated by a second similarly null finding. They discuss important objections to relying on the qualitative significant/non-significant dichotomy to make this judgement. They present two complementary methods (one frequentist and the other Bayesian) which provide a superior quantitative framework for assessing the replicability of null findings.

      Strengths:

      Clear presentation; illuminating examples drawn from the well-known Reproducibility Project: Cancer Biology data set; R-code that implements suggested analyses. Using both methods as suggested provides a superior procedure for judging the replicability of null findings.

      Weaknesses:

      The proposed frequentist and the Bayesian methods both rely on binary assessments of an original finding and its replication. I'm not sure if this is a weakness or is inherent to making binary decisions based on continuous data.

      For the frequentist method, a null finding is considered replicated if the original and replication 90% confidence intervals for the effects both fall within the equivalence range. According to this approach, a null finding would be considered replicated if the p-values of both equivalence tests (original and replication) were, say, 0.049, whereas it would not be considered replicated if, for example, the equivalence test of the original study had a p-value of 0.051 and the replication had a p-value of 0.001. Intuitively, the evidence for replication would seem to be stronger in the second instance. The recommended Bayesian approach similarly relies on a dichotomy (e.g. Bayes factor > 1).
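
      The contrast in this example can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' R code: it uses a normal approximation, and the helper names `tost_p_value` and `null_replicated`, the 0.05 level, and the inputs are hypothetical choices made for the sketch.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tost_p_value(estimate: float, se: float, margin: float) -> float:
    """Two one-sided tests (TOST) p-value for H0: |effect| >= margin.

    Hypothetical helper: assumes the estimate is approximately normal
    with known standard error `se`.
    """
    p_lower = 1.0 - normal_cdf((estimate + margin) / se)  # H0: effect <= -margin
    p_upper = normal_cdf((estimate - margin) / se)        # H0: effect >= +margin
    return max(p_lower, p_upper)


def null_replicated(p_orig: float, p_rep: float, alpha: float = 0.05) -> bool:
    """Dichotomous rule: both equivalence tests must be significant."""
    return p_orig < alpha and p_rep < alpha


# The reviewer's two scenarios, using the p-values quoted in the text.
print(null_replicated(0.049, 0.049))  # both just under the threshold
print(null_replicated(0.051, 0.001))  # original just over, replication far under
```

      Under this rule the first pair counts as a replicated null and the second does not, even though the second pair carries the smaller replication p-value, which is the tension the reviewer describes.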

    2. Reviewer #2 (Public Review):

      Summary:

      The study demonstrates how inconclusive replications of studies initially with p > 0.05 can be and employs equivalence tests and Bayesian factor approaches to illustrate this concept. Interestingly, the study reveals that achieving a success rate of 11 out of 15, or 73%, as was accomplished with the non-significance criterion from the RPCB (Reproducibility Project: Cancer Biology), requires unrealistic margins of Δ > 2 for equivalence testing.

      Strengths:

      The study uses reliable and sharable/open data to demonstrate its findings, sharing as well the code for statistical analysis. The study provides sensitivity analysis for different scenarios of equivalence margin and alpha level, as well as for different scenarios of standard deviations for the prior of Bayes factors and different thresholds to consider. All analysis and code of the work is open and can be replicated. As well, the study demonstrates on a case-by-case basis how the different criteria can diverge, regarding one sample of a field of science: preclinical cancer biology. It also explains clearly what Bayes factors and equivalence tests are.

      Weaknesses:

      It would be interesting to investigate whether using Bayes factors and equivalence tests in addition to p-values results in a clearer scenario when applied to replication data from other fields. As mentioned by the authors, the Reproducibility Project: Experimental Philosophy (RPEP) and the Reproducibility Project: Psychology (RPP) have data attempting to replicate some original studies with null results. While the RPCB analysis yielded a similar picture when using both criteria, it is worth exploring whether this holds true for RPP and RPEP. Considerations for further research in this direction are suggested. Even if the original null results were excluded in the calculation of an overall replicability rate based on significance, sensitivity analyses considering them could have been conducted. The present authors can demonstrate replication success using the significance criteria in these two projects with initially p < 0.05 studies, both positive and non-positive.

      Other comments:

      - Introduction: The study demonstrates how inconclusive replications of studies initially with p > 0.05 can be and employs equivalence tests and Bayesian factor approaches to illustrate this concept. Interestingly, the study reveals that achieving a success rate of 11 out of 15, or 73%, as was accomplished with the non-significance criterion from the RPCB (Reproducibility Project: Cancer Biology), requires unrealistic margins of Δ > 2 for equivalence testing.

      - Overall picture vs. case-by-case scenario: An interesting finding is that the authors observe that in most cases, there is no substantial evidence for either the absence or the presence of an effect, as evidenced by the equivalence tests. Thus, using both suggested criteria results in a picture similar to the one initially raised by the paper itself. The work done by the authors highlights additional criteria that can be used to further analyze replication success on a case-by-case basis, and I believe that this is where the paper's main contributions lie. Despite not changing the overall picture much, I agree that the p-value criterion by itself does not distinguish between (1) a situation where the original study had low statistical power, resulting in a highly inconclusive non-significant result that does not provide evidence for the absence of an effect and (2) a scenario where the original study was adequately powered, and a non-significant result may indeed provide some evidence for the absence of an effect when analyzed with appropriate methods. Equivalence testing and Bayesian factor approaches are valuable tools in both cases.

      Regarding the 0.05 threshold, the choice of the prior distribution for the SMD under the alternative 𝐻1 is debatable, and this also applies to the equivalence margin. Sensitivity analyses, as highlighted by the authors, are helpful in these scenarios.

    1. The response contains just a status code of 201 Created, which indicates that Salesforce successfully received the job data.

      What happens if I hit the endpoint multiple times?

    1. these commonly used artifacts are small code snippets that are entirely functional in nature and, therefore, when used in isolation, don't enjoy copyright protection at all.

      They are not "used" in isolation, though: it is the context of these code snippets, gleaned from tons of code, that makes them valuable to the ML scene in comparison to the static autocompletion of the prior generation.

    1. My assumption was and is that early voting is not absentee/mail-in.

      This assumption is flat-out incorrect. Virginia statute explicitly refers to "early voting" as "[a]bsentee voting in person." Va. Code Ann. § 24.2-701.1. If Mr. Dreyer had been familiar with the terminology used by election officials, the answer Mr. Dreyer should have provided the officer of election when asked whether he had voted absentee would have been "yes." That would have avoided all of the ensuing confusion. With this context, an "unsettling Election Day story," becomes nothing more than another example of the system working.

    1. One of the ways that, that ChatGPT is very powerful is that uh if you're sufficiently educated about computers and you want to make a computer program and you can instruct uh ChatGPT in what you want with enough specificity, it can write the code for you. It doesn't mean that every coder is going to be replaced by ChatGPT, but it means that a competent coder uh with an imagination can accomplish a lot more than she used to be able to, uh maybe she could do the work of five coders. Um So there's a dynamic where people who can master the technology can get a lot more done.

      ChatGPT augments, not replaces

      You have to know what you want to do before you can provide the prompt for the code generation.

    1. This is, quite obviously, a phenomenal outcome for Microsoft. The company already has a perpetual license to all OpenAI IP (short of artificial general intelligence), including source code and model weights; the question was whether it would have the talent to exploit that IP if OpenAI suffered the sort of talent drain that was threatened upon Altman and Brockman’s removal. Indeed they will, as a good portion of that talent seems likely to flow to Microsoft; you can make the case that Microsoft just acquired OpenAI for $0 and zero risk of an antitrust lawsuit.

      I was just telling someone that OpenAI must decide who they want to be. Do they want to be AWS/Azure or do they want to '"freely collaborate" with other institutions and researchers by making its patents and research open to the public'?

      Well it looks like the choice was made for them. They just changed the name from "OpenAI" to... "Microsoft":

    1. RECOMMENDATION 9: Ensure that children of families housed by the Samu Social have the same access to after-school and extracurricular activities as all children residing in the municipality, in particular by limiting the supporting documents required for their registration to those provided for in Articles L. 131-6 and D. 131-3-1 of the Code de l’éducation for school enrollment. Addressees: Mayors.
    1. Two-factor authentication, or two-step authentication, is a login process where the user is asked to provide two authentication points, such as a password and a code shared through a text message. Two-factor authentication enhances login security.

      Two-factor authentication is important to teach students, including the process of logging in to a website and the passwords or codes that can be sent by text.

    2. Two-factor authentication, or two-step authentication, is a login process where the user is asked to provide two authentication points, such as a password and a code shared through a text message. Two-factor authentication enhances login security.

      Working in IT, I can not tell you the sheer number of people who would do anything to avoid having to take the extra steps for security. Even when we explain to them the exact issues that come up, it seems there is a misunderstanding often about how dire a hacked account could become. This is part of why it is so necessary to make these security measures normalized as early as possible with students and technology!

    1. eLife assessment

      This work is a valuable presentation of sharp-wave-ripple reactivation of hippocampal neural ensemble activity recorded as animals explored two different environments. It attempts to use the fact that the ensemble code remaps between the two mazes to identify the best replay-detection procedures for analyzing this type of data. The reviewers found the evidence for a prescriptive conclusion inadequate, while still appreciating the concept of comparing maze-identity discrimination with replay.

    2. Reviewer #1 (Public Review):

      This work introduces a novel framework for evaluating the performance of statistical methods that identify replay events. This is challenging because hippocampal replay is a latent cognitive process, where the ground truth is inaccessible, so methods cannot be evaluated against a known answer. The framework consists of two elements:

      1. A replay sequence p-value, evaluated against shuffled permutations of the data, such as radon line fitting, rank-order correlation, or weighted correlation. This element determines how trajectory-like the spiking representation is. The p-value threshold for all accepted replay events is adjusted based on an empirical shuffled distribution to control for the false discovery rate.
      2. A trajectory discriminability score, also evaluated against shuffled permutations of the data. In this case, there are two different possible spatial environments that can be replayed, so the method compares the log odds of track 1 vs. track 2.

      The authors then use this framework (accepted number of replay events and trajectory discriminability) to study the performance of replay identification methods. They conclude that sharp wave ripple power is not a necessary criterion for identifying replay event candidates during awake run behavior if you have high multiunit activity, a higher number of permutations is better for identifying replay events, linear Bayesian decoding methods outperform rank-order correlation, and there is no evidence for pre-play.

      The authors tackle a difficult and important problem for those studying hippocampal replay (and indeed all latent cognitive processes in the brain) with spiking data: how do we understand how well our methods are doing when the ground truth is inaccessible? Additionally, systematically studying how the various methods for identifying replay perform is important for understanding the sometimes contradictory conclusions from replay papers. It helps consolidate the field around particular methods, leading to better reproducibility in the future. The authors' framework is also simple to implement and understand and the code has been provided, making it accessible to other neuroscientists. Testing for track discriminability, as well as the sequentiality of the replay event, is a sensible additional data point to eliminate "spurious" replay events.

      However, there are some concerns with the framework as well. The novelty of the framework is questionable as it consists of a log odds measure previously used in two prior papers (Carey et al. 2019 and the authors' own Tirole & Huelin Gorriz, et al., 2022) and a multiple comparisons correction, albeit a unique empirical multiple comparisons correction based on shuffled data.

      With respect to the log odds measure itself, as presented, it is reliant on having only two options to test between, limiting its general applicability. Even in the data used for the paper, there are sometimes three tracks, which could influence the conclusions of the paper about the validity of replay methods. This also highlights a weakness of the method in that it assumes that the true model (spatial track environment) is present in the set of options being tested. Furthermore, the log odds measure itself is sensitive to the defined ripple or multiunit start and end times, because it marginalizes over both position and time, so any inclusion of place cells that fire for the animal's stationary position could influence the discriminability of the track. Multiple track representations during a candidate replay event would also limit track discriminability. Finally, the authors call this measure "trajectory discriminability", which seems a misnomer as the time and position information are integrated out, so there is no notion of trajectory.

      The authors also fail to make the connection with the control of the false discovery rate via false positives on empirical shuffles with existing multiple comparison corrections that control for false discovery rates (such as the Benjamini and Hochberg procedure or Storey's q-value). Additionally, the particular type of shuffle used will influence the empirically determined p-value, making the procedure dependent on the defined null distribution. Shuffling the data is also considerably more computationally intensive than the existing multiple comparison corrections.
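
      For readers unfamiliar with the Benjamini and Hochberg procedure named above, a minimal sketch follows. It is a plain-Python illustration of the standard step-up rule, not code from the manuscript; the function name `benjamini_hochberg`, the level q = 0.05, and the example p-values are assumptions made for the sketch.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans, True where the hypothesis is rejected.

    Step-up rule: sort the m p-values, find the largest rank k with
    p_(k) <= (k / m) * q, and reject the k smallest p-values.
    """
    m = len(p_values)
    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Largest 1-based rank whose p-value falls under its threshold.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank

    # Reject every hypothesis ranked at or below k_max.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for four candidate replay events.
print(benjamini_hochberg([0.039, 0.001, 0.216, 0.008]))
```

      Unlike the shuffle-based correction discussed in the review, this rule needs only the list of p-values, which is why it is far cheaper to compute.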

      Overall, the authors make interesting conclusions with respect to hippocampal replay methods, but the utility of the method is limited in scope because of its reliance on having exactly two comparisons and having to specify the null distribution to control for the false discovery rate. This work will be of interest to electrophysiologists studying hippocampal replay in spiking data.

    1. Now, the prover commits to the rest of the columns (c and d), which may depend on the random values.

      I guess this explains what the interaction trace is about, and why there are interaction elements coming from verifier in the code base.

    1. if you're going to write a 00:35:14 plugin for an ide prepare for your hello world to be days of learning and pages of code just to do the hello world

      If you're going to write a plugin for an IDE, prepare for your hello-world to be days of learning and pages of code just to do the hello-world.

    2. now i would love someday to do a plug-in for intellij that understands all of the 00:33:01 custom stuff for my game code right you know i would love to but you know that's that's a project

      Now, I would love someday to do a plug-in for IntelliJ that understands all of the custom stuff for my game code. Right? You know, I would love to, but you know that's that's a project.

    1. Rule 38 - Right to Trial by Jury(a) Exercise of Right. Upon the filing of a demand and the simultaneous payment of the requisite jury fee by any party in actions wherein a trial by jury is provided by constitution or by statute, including actions for the recovery of specific real or personal property, with or without damages, or for money claimed as due on contract, or as damages for breach of contract, or for injuries to person or property, all issues of fact shall be tried by a jury. The jury fee is not refundable; however, a demanding party may waive that party's demand for trial by jury pursuant to section (e) of this rule.

      Fuck you haley...gonna have your fucking license

      "Upon filing of a demand by any party wherein a trial by jury is provided by statute, all issues of fact shall be tried by a jury"

      AND THIS STATEMENT FROM THE COLORADO JUDICIAL BRANCH (And mag. McLean heard from my mouth I wanted a trial): What is the court process in dependency and neglect cases? A dependency and neglect case begins with the filing of a petition by the county attorney or, in Denver, the city attorney. Parents who are listed in the D&N petition are referred to as “respondents.” You are required to appear in court and at that time, you may deny the allegations against you and demand that the case then be heard at trial by a jury of six people, by a judge, or by a juvenile magistrate. https://www.courts.state.co.us/userfiles/File/Media/Brochures/d&nweb.pdf

      See: https://hyp.is/go?url=https%3A%2F%2Fcasetext.com%2Fstatute%2Fcolorado-revised-statutes%2Ftitle-19-childrens-code%2Farticle-3-dependency-and-neglect%2Fpart-2-general-provisions%2Fsection-19-3-202-effective-until-112024-right-to-counsel-and-jury-trial&group=world

    1. The author states that they believe EventStores are more valuable because they maintain context, whereas Datomic records changes without context. Datomic arguably leaves other things out... it doesn't, for example, truly provide an entity kind or a firm consistent schema. It's up to application code to define more formal relationships of data kinds. Arguably you could also say it's up to the application to capture context changes. I do wonder if a hybrid model between these two would result in an overall better solution.

    1. (d) If it appears from the evidence that the child may have a mental health disorder

      Section 27-65-102 - Definitions (22) "Mental health disorder" includes one or more substantial disorders of the cognitive, volitional, or emotional processes that grossly impairs judgment or capacity to recognize reality or to control behavior. An intellectual or developmental disability is insufficient to either justify or exclude a finding of a mental health disorder pursuant to the provisions of this article 65.

      Section 19-3-506 - [Effective Until 7/1/2024] Child with a mental health disorder or an intellectual and developmental disability - procedure√

      (b) If it appears from the evidence presented at an adjudicatory hearing or otherwise that a child may have a mental health disorder, as defined in section 27-65-102,

      the court shall order a prescreening to determine whether the child requires further evaluation. The prescreening must be conducted as expeditiously as possible, and a prescreening report must be provided to the court within twenty-four hours

      (c) If the mental health professional finds, based upon a prescreening done pursuant to this section or section 19-3-403 (4), that the child may have a mental health disorder, as defined in section 27-65-102, the court shall review the prescreening report within twenty-four hours, excluding Saturdays, Sundays, and legal holidays, and order the child placed for an evaluation at a facility designated by the commissioner of the behavioral health administration

      (d) An evaluation conducted pursuant to this subsection (1) must be completed within seventy-two hours, excluding Saturdays, Sundays, and legal holidays. A county jail or a detention facility, as described in article 2.5 of this title 19, is not considered a suitable facility for evaluation,

    1. Reviewer #2 (Public Review):

      Summary:

      The manuscript by Wankowicz et al. describes updates to qFit, an algorithm for the characterization of conformational heterogeneity of protein molecules based on X-ray diffraction or cryo-EM data. The work provides a clear description of the algorithm used by qFit. The authors then proceed to validate the performance of qFit by comparing it to deposited X-ray entries in the PDB in the 1.2-1.5 Å resolution range as quantified by Rfree, Rwork-Rfree, detailed examination of the conformations introduced by qFit, and performance on stereochemical measures (MolProbity scores). To examine the effect of experimental resolution of X-ray diffraction data, they start from an ultra high-resolution structure (SARS-CoV2 Nsp3 macrodomain) to determine how the loss of resolution (introduced artificially) degrades the ability of qFit to correctly infer the nature and presence of alternate conformations. The authors observe a gradual loss of ability to correctly infer alternate conformations as resolution degrades past 2 Å. The authors repeat this analysis for a larger set of entries in a more automated fashion and again observe that qFit works well for structures with resolutions better than 2 Å, with a rapid loss of accuracy at lower resolution. Finally, the authors examine the performance of qFit on cryo-EM data. Despite a few prominent examples, the authors find only a handful (8) of datasets for which they can confirm a resolution better than 2.0 Å. The performance of qFit on these maps is encouraging and will be of much interest because cryo-EM maps will, presumably, continue to improve and because of the rapid increase in the availability of such data for many supramolecular biological assemblies. As the authors note, practices in cryo-EM analysis are far from uniform, hampering the development and assessment of tools like qFit.

      Strengths:

      qFit improves the quality of refined structures at resolutions better than 2.0 Å, in terms of reflecting true conformational heterogeneity and geometry. The algorithm is well designed and does not introduce spurious or unnecessary conformational heterogeneity. I was able to install and run the program without a problem within a computing cluster environment. The paper is well written and the validation thorough.

      I found the section on cryo-EM particularly enlightening, both because it demonstrates the potential for discovery of conformational heterogeneity from such data by qFit, and because it clearly explains the hurdles towards this becoming common practice, including lack of uniformity in reporting resolution, and differences in map and solvent treatment.

      Weaknesses:

      The authors begin the results section by claiming that they made "substantial improvement" relative to the previous iteration of qFit, "both algorithmically (e.g., scoring is improved by BIC, sampling of B factors is now included) and computationally (improving the efficiency and reliability of the code)" (bottom of page 3). However, the paper does not provide a comparison to previous iterations of the software or quantitation of the effects of these specific improvements, such as whether scoring is improved by the BIC, how the application of BIC has changed since the previous paper, whether sampling of B factors helps, and whether the code is faster. It would help the reader to understand what, if any, the significance of each of these improvements was.

      The exclusion of structures containing ligands and multichain protein models in the validation of qFit was puzzling since both are very common in the PDB. This may convey the impression that qFit cannot handle such use cases. (Although it seems that qFit has an algorithm dedicated to modeling ligand heterogeneity and seems to be able to handle multiple chains). The paper would be more effective if it explained how a user of the software would handle scenarios with ligands and multiple chains, and why these would be excluded from analysis here.

      It would be helpful to add some guidance on how/whether qFit models can be further refined afterwards in Coot, Phenix, ..., or whether these models are strictly intended as the terminal step in refinement.

      Appraisal & Discussion:

      Overall, the authors convincingly demonstrate that qFit provides a reliable means to detect and model conformational heterogeneity within high-resolution X-ray diffraction datasets and (based on a smaller sample) in cryo-EM density maps. This represents the state of the art in the field and will be of interest to any structural biologist or biochemist seeking to attain an understanding of the structural basis of the function of their system of interest, including potential allosteric mechanisms, an area where there are still few good solutions. That is, I expect qFit to find widespread use.

    1. This Ethics Code is intended to provide specific standards to cover most situations encountered by psychologists.

      The ethics code is a blueprint for navigating the different situations psychologists would encounter, but the phrase "cover most" leaves room for the possibility that these professionals would be faced with situations not addressed by the ethics code.

    1. Reviewer #2 (Public Review):

      Summary:

      This manuscript describes P. falciparum population structure in Zanzibar and mainland Tanzania. 282 samples were typed using molecular inversion probes. The manuscript is overall well-written and shows a clear population structure. It follows a similar manuscript published earlier this year, which typed a similar number of samples collected mostly in the same sites around the same time. The current manuscript extends this work by including a large number of samples from coastal Tanzania, and by including clinical samples, allowing for a comparison with asymptomatic samples.

      The two studies made overall very similar findings, including strong small-scale population structure, related infections on Zanzibar and the mainland, near-clonal expansion on Pemba, and frequency of markers of drug resistance. Despite these similarities, the previous study is mentioned a single time in the discussion (in contrast, the previous research from the authors of the current study is more thoroughly discussed). The authors missed an opportunity here to highlight the similar findings of the two studies.

      Strengths:

      The overall results show a clear pattern of population structure. The finding of highly related infections detected in close proximity shows local transmission and can possibly be leveraged for targeted control.

      Weaknesses:

      A number of points need clarification:

      It is overall quite challenging to keep track of the number of samples analyzed. I believe the number of samples used to study population structure was 282 (line 141), thus this number should be included in the abstract rather than 391. It is unclear where the number 232 on line 205 comes from; I failed to deduce this number from Supplementary Table 1.

      Also, Table 1 and Supplementary Table 1 should be swapped. It is more important for the reader to know the number of samples included in the analysis (as given in Supplementary Table 1) than the number collected. Possibly, the two tables could be combined in a clever way.

      Methods:

      The authors took the somewhat unusual decision to apply K-means clustering to GPS coordinates to determine how to combine their data into a cluster. There is an obvious cluster on Pemba Island and three clusters on Unguja. Based on the map, I assume that one of these three clusters is mostly urban, while the other two are more rural. It would be helpful to have a bit more information about that in the methods. See also comments on maps in Figures 1 and 2 below.

      Following this point, in Supplemental Figure 5 I fail to see an inflection point at K=4. If there is one, it will be so weak that it is hardly informative. I think selecting 4 clusters in Zanzibar is fine, but the justification based on this figure is unclear.

      For the drug resistance loci, it is stated that "we further removed SNPs with less than 0.005 population frequency." Was the denominator for this analysis the entire population, or were Zanzibar and mainland samples assessed separately? If the latter, as for all markers <200 samples were typed per site, there could not be a meaningful way of applying this threshold. Given data were available for 200-300 samples for each marker, does this simply mean that each SNP needed to be present twice?
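
      The arithmetic behind this question can be sketched as follows. This is a minimal illustration, not the authors' filtering code: it assumes that "population frequency" means the number of samples carrying the SNP divided by the number of samples typed at that marker, and the function name `min_count_to_keep` is hypothetical. Exact fractions are used to avoid floating-point rounding at the threshold.

```python
from fractions import Fraction
from math import ceil


def min_count_to_keep(n_samples: int, threshold: Fraction = Fraction(5, 1000)) -> int:
    """Smallest number of carrier samples for which count / n_samples >= threshold.

    Assumption: a SNP is removed when its frequency is strictly below the
    threshold, and frequency is carriers divided by samples typed.
    """
    return ceil(threshold * n_samples)


# Sample sizes in the range mentioned in the review (200-300 per marker).
for n in (200, 250, 300):
    print(n, min_count_to_keep(n))
```

      Under these assumptions, a single carrier suffices only when exactly 200 samples are typed; for any larger sample size up to 400, a SNP has to be observed in at least two samples to pass the filter.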

      Discussion:

      I was a bit surprised to read the following statement, given Zanzibar is one of the few places that has an effective reactive case detection program in place: "Thus, directly targeting local malaria transmission, including the asymptomatic reservoir which contributes to sustained transmission (Barry et al., 2021; Sumner et al., 2021), may be an important focus for ultimately achieving malaria control in the archipelago (Björkman & Morris, 2020)." I think the current RACD program should be mentioned and referenced. A number of studies have investigated this program.

      The discussion states that "In Zanzibar, we see this both within and between shehias, suggesting that parasite gene flow occurs over both short and long distances." I think the term 'long distances' should be better defined. Figure 4 shows that highly related infections rarely span beyond 20-30 km. In many epidemiological studies, this would still be considered short distances.

      Lines 330-331: "Polymorphisms associated with artemisinin resistance did not appear in this population." Do you refer to background mutations here? Otherwise, the sentence seems to repeat line 324. Please clarify.

      Line 344: The opinion paper by Bousema et al. in 2012 was followed by a field trial in Kenya (Bousema et al., 2016) that found that targeting hotspots did NOT have an impact beyond the actual hotspot. This and other more recent findings need to be considered when arguing for hotspot-targeted interventions in Zanzibar.

      Figures and Tables:

      Table 2: Why not enter '0' if a mutation was not detected? 'ND' is somewhat confusing, as the prevalence is indeed 0%.

      Figure 1: Panel A is very hard to read. I don't think there is a meaningful way to display a 3D-panel in 2D. Two panels showing PC1 vs. PC2 and PC1 vs. PC3 would be better. I also believe the legend 'PC2' is placed in the wrong position (along the Y-axis of panel 2).

      Supplementary Figure 2B suffers from the same issue.

      The maps for Figures 1 and 2 don't correspond. Assuming Kati represents cluster 4 in Figure 2, the name is put in the wrong position. If the grouping of shehias is different between the Figures, please add an explanation of why this is.

      Figure 2: In the main panel, please clarify what the lines indicate (median and quartiles?). It is very difficult to see anything except the outliers. I wonder whether another way of displaying these data would be clearer. Maybe a table with medians and confidence intervals would be better (or that data could be added to the plots). The current plots might be misleading as they are dominated by outliers.

      In the insert, the cluster number should not only be given as a color code but also added to the map. The current version will be impossible to read for people with color vision impairment, and it is confusing for any reader as the numbers don't appear to follow any logic (e.g. north to south).

      The legend for Figure 3 is difficult to follow. I do not understand what the difference in binning was in panels A and B compared to C.

      Font sizes for panel C differ, and it is not aligned with the other panels.

      Why is Kusini included in Supplemental Figure 4, but not in Figure 1?

      Supplemental Figures 6 and 7: What does the width of the line indicate?

      What was the motivation not to put these lines on the map, as in Figure 4A? This might make it easier to interpret the data.

    1. Reviewer #2 (Public Review):

      Summary:

      In this manuscript, Vöröslakos and colleagues describe a new behavioural testing apparatus called ThermoMaze, which should facilitate controlling when a mouse is exploring the environment vs. remaining immobile. The floor of the apparatus is tiled with 25 plates, which can be individually heated, whereas the rest of the environment is cooled. The mouse avoids cooled areas and stays immobile on a heated tile. The authors systematically changed the location of the heated tile to trigger the mouse's exploratory behaviours. The authors showed that if the same plate stays heated longer, the mouse falls into an NREM sleep state. The authors conclude their apparatus allows easy control of triggering behaviours such as running/exploration, immobility and NREM sleep. The authors also carried out single-unit recordings of CA1 hippocampal cells using various silicon probes. They show that the location of a mouse can be decoded with above-chance accuracy from cell activity during sharp wave ripples (SPW-Rs), which tend to occur when the mouse is immobile or asleep. The authors suggest that, consistent with some previous results, SPW-Rs encode the mouse's current location in addition to any other information usually associated with them (such as past and future locations).

      Strengths:

      Overall, the apparatus may open fruitful avenues for future research to uncover the physiology of transitions from different behavioural states such as locomotion, immobility, and sleep. The setup is compatible with neural recordings. No training is required.

      Weaknesses:

      I have a few concerns related to the authors' methodology and some limitations of the apparatus's current form. Although the authors suggest that switching between the plates forces animal behaviour into an exploratory mode, leading to a better sampling of the enclosure, their example position heat maps and trajectories suggest that the behaviour is still very stereotypical, restricted mostly to the trajectories along the walls or the diagonal ones (between two opposite corners). This may not be ideal for studying spatial responses known to be affected by the stereotypicity of the animal's trajectories. Moreover, given such stereotypicity of the trajectories mice take before and after reaching a specific plate, it may be that the stable SPW-R activity used for decoding different quadrants represents future and/or past trajectories rather than the current locations suggested by the authors. If this is the case, it may be confusing/misleading to call such activity 'place-selective firing', since it does not necessarily encode a given place per se (line 281).

      Another main study limitation is the reported instability of the location cells in the ThermoMaze. This may be related to the heating procedure, differences in stereotypical sampling of the enclosure, or the enclosure size (too small to properly reveal the place code). It would be helpful if the authors separated pyramidal cells into place and non-place cells to better understand how stable place cell activity is. This information may also help to disambiguate the SPW-R-related limitations outlined above and may help to solve the poor decoding problem reported by the authors (lines 218-221).

    1. Reviewer #1 (Public Review):

      Summary: The study introduces and validates the Cyclic Homogeneous Oscillation (CHO) detection method to precisely determine the duration, location, and fundamental frequency of non-sinusoidal neural oscillations. Traditional spectral analysis methods face challenges in distinguishing the fundamental frequency of non-sinusoidal oscillations from their harmonics, leading to potential inaccuracies. The authors implement an underexplored approach, using the auto-correlation structure to identify the characteristic frequency of an oscillation. By combining this strategy with existing time-frequency tools to identify when oscillations occur, the authors strive to solve outstanding challenges involving spurious harmonic peaks detected in time-frequency representations. Empirical tests using electrocorticographic (ECoG) and electroencephalographic (EEG) signals further support the efficacy of CHO in detecting neural oscillations.
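
      The auto-correlation idea summarized above can be illustrated with a toy example. This is a minimal sketch and not the CHO implementation: the function names, the 500 Hz sampling rate, and the noise-free test signal (a 10 Hz oscillation with a weaker 20 Hz harmonic) are all assumptions made for illustration. A power spectrum of this signal shows peaks at both 10 Hz and 20 Hz, whereas the first dominant auto-correlation peak sits at the period of the fundamental.

```python
import math


def autocorr(x, max_lag):
    """Biased autocorrelation: r[k] = sum over n of x[n] * x[n + k], k = 0..max_lag-1."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag)]


def fundamental_from_autocorr(x, fs):
    """Estimate the fundamental frequency from the largest non-zero-lag autocorrelation peak."""
    r = autocorr(x, len(x) // 2)
    # Skip the central peak around lag 0: start at the first lag where r turns negative.
    start = next(k for k, value in enumerate(r) if value < 0)
    # The biased estimate decays with lag, so the first full period gives the largest peak.
    best_lag = max(range(start, len(r)), key=lambda k: r[k])
    return fs / best_lag


fs = 500  # sampling rate in Hz (assumed)
t = [i / fs for i in range(2 * fs)]  # 2 seconds of samples
# Non-sinusoidal toy oscillation: 10 Hz fundamental plus a weaker 20 Hz harmonic.
signal = [math.sin(2 * math.pi * 10 * s) + 0.3 * math.sin(2 * math.pi * 20 * s) for s in t]

f0 = fundamental_from_autocorr(signal, fs)
print(f"estimated fundamental: {f0:.1f} Hz")
```

      The sketch deliberately omits the steps that make the problem hard in real recordings, namely 1/f background noise, transient bursts, and the time-frequency stage that locates when an oscillation occurs.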

      Strengths:

      1. The paper puts an important emphasis on the 'identity' question of oscillatory identification. The field primarily identifies oscillations through frequency, space (brain region), and time (length, and relative to task or rest). However, more tools that claim to further characterize oscillations by their defining/identifying traits are needed, in addition to data-driven studies about what the identifiable traits of neural oscillations are beyond frequency, location, and time. Such tools are useful for potentially distinguishing between circuit mechanistic generators underlying signals that may not otherwise be distinguished. This paper states this problem well and puts forth a new type of objective for neural signal processing methods.

      2. The paper uses synthetic data and multimodal recordings at multiple scales to validate the tool, suggesting CHO's robustness and applicability in various real-data scenarios. The figures illustratively demonstrate how CHO works on such synthetic and real examples, depicting in both time and frequency domains. The synthetic data are well-designed, and capable of producing transient oscillatory bursts with non-sinusoidal characteristics within 1/f noise. Using both non-invasive and invasive signals exposes CHO to conditions which may differ in extent and quality of the harmonic signal structure. An interesting followup question is whether the utility demonstrated here holds for MEG signals, as well as source-reconstructed signals from non-invasive recordings.

      3. This study is accompanied by open-source code and data for use by the community.

      Weaknesses:

      1. Due to the proliferation of neural signal processing techniques that have been designed to tackle issues such as harmonic activity, transient and event-like oscillations, and non-sinusoidal waveforms, it is naturally difficult for every introduction of a new tool to include exhaustive comparisons of all others. Here, some additional comparisons may be considered for the sake of context, a selection of which follows, biased by the previous exposure of this reviewer. One emerging approach that may be considered is known as state-space models with oscillatory and autoregressive components (Matsuda 2017, Beck 2022). State-space models such as autoregressive models have long been used to estimate the auto-correlation structure of a signal. State-space oscillators have recently been applied to transient oscillations such as sleep spindles (He 2023). Therefore, state-space oscillators extended with auto-regressive components may be able to perform the functions of the present tool through different means by circumventing the need to identify them in time-frequency. Another tool that should be mentioned is called PAPTO (Brady 2022). Although PAPTO does not address harmonics, it detects oscillatory events in the presence of 1/f background activity. Lastly, empirical mode decomposition (EMD) approaches have been studied in the context of neural harmonics and non-sinusoidal activity (Quinn 2021, Fabus 2022). EMD has an intrinsic relationship with extrema finding, in contrast with the present technique. In summary, the existence of methods such as PAPTO shows that researchers are converging on similar approaches to tackle similar problems. The existence of time-domain approaches such as state-space oscillators and EMD indicates that the field of time-series analysis may yield even more approaches that are conceptually distinct and may theoretically circumvent the methodology of this tool.

      2. The criteria that the authors use for neural oscillations embody some operating assumptions underlying their characteristics, perhaps informed by immediate use cases intended by the authors (e.g., hippocampal bursts). The extent to which these assumptions hold in all circumstances should be investigated. For instance, the notion of consistent auto-correlation breaks down in scenarios where instantaneous frequency fluctuates significantly at the scale of a few cycles. Imagine an alpha-beta complex without harmonics (Jones 2009). If oscillations change phase position within a timeframe of a few cycles, it would be difficult for a single peak in the auto-correlation structure to elucidate the complex time-varying peak frequency in a dynamic fashion. Likewise, it is unclear whether bounding boxes with a pre-specified overlap can capture complexes that maneuver across peak frequencies.

      3. Related to the last item, this method appears to lack implementation of statistical inferential techniques for estimating and interpreting auto-correlation and spectral structure. In standard practice, auto-correlation functions and spectral measures can be subjected to statistical inference to establish confidence intervals, often helping to determine the significance of the estimates. Doing so would be useful for expressing the likelihood that an oscillation and its harmonic have the same auto-correlation structure and fundamental frequency, or more robustly identifying harmonic peaks in the presence of spectral noise. Here, the authors appear to use auto-correlation and time-frequency decomposition more as a deterministic tool rather than an inferential one. Overall, an inferential approach would help differentiate between true effects and those that might spuriously occur due to the nature of the data. Ultimately, a more statistically principled approach might estimate harmonic structure in the presence of noise in a unified manner transmitted throughout the methodological steps.

      4. As with any signal processing method, hyperparameters and their ability to be tuned by the user need to be clearly acknowledged, as they impact the robustness and reproducibility of the method. Here, some of the hyperparameters appear to be: a) number of cycles around which to construct bounding boxes and b) overlap percentage of bounding boxes for grouping. Any others should be highlighted by the authors and clearly explained during the course of tool dissemination to the community, ideally in tutorial format through the GitHub repository.

      5. Most of the validation demonstrations in this paper depict the detection capabilities of CHO. For example, the authors demonstrate how to use this tool to reduce false detection of oscillations made up of harmonic activity and show in simulated examples how CHO performs compared to other methods in detection specificity, sensitivity, and accuracy. However, the detection problem is not the same as the 'identity' problem that the paper originally introduced CHO to solve. That is, detecting a non-sinusoidal oscillation well does not help define or characterize its non-sinusoidal 'fingerprint'. An example problem to set up this question is: if there are multiple oscillations at the same base frequency in a dataset, how can their differing harmonic structure be used to distinguish them from each other? To address this at a minimum, Figure 4 (or a followup to it) should simulate signals at similar levels of detectability with different 'identities' (i.e. different levels and/or manifestations of harmonic structure), and evaluate CHO's potential ability to distinguish or cluster them from each other. Then, does a real-world dataset or neuroscientific problem exist in which a similar sort of exercise can be conducted and validated in some way? If the "what" question is to be sufficiently addressed by this tool, then this type of task should be within the scope of its capabilities, and validation within this scenario should be demonstrated in the paper. This is the most fundamental limitation at the paper's current state.

      References:

      Beck AM, He M, Gutierrez R, Purdon PL. An iterative search algorithm to identify oscillatory dynamics in neurophysiological time series. bioRxiv. 2022. p. 2022.10.30.514422. doi:10.1101/2022.10.30.514422

      Brady B, Bardouille T. Periodic/Aperiodic parameterization of transient oscillations (PAPTO)-Implications for healthy ageing. Neuroimage. 2022;251: 118974.

      Fabus MS, Woolrich MW, Warnaby CW, Quinn AJ. Understanding Harmonic Structures Through Instantaneous Frequency. IEEE Open J Signal Process. 2022;3: 320-334.

      Jones SR, Pritchett DL, Sikora MA, Stufflebeam SM, Hämäläinen M, Moore CI. Quantitative analysis and biophysically realistic neural modeling of the MEG mu rhythm: rhythmogenesis and modulation of sensory-evoked responses. J Neurophysiol. 2009;102: 3554-3572.

      He M, Das P, Hotan G, Purdon PL. Switching state-space modeling of neural signal dynamics. PLoS Comput Biol. 2023;19: e1011395.

      Matsuda T, Komaki F. Time Series Decomposition into Oscillation Components and Phase Estimation. Neural Comput. 2017;29: 332-367.

      Quinn AJ, Lopes-Dos-Santos V, Huang N, Liang W-K, Juan C-H, Yeh J-R, et al. Within-cycle instantaneous frequency profiles report oscillatory waveform dynamics. J Neurophysiol. 2021;126: 1190-1208.

    1. The police, then, is subject neither to the Law as a literal code, nor to the precedents set by its own acts, but rather remains that “zone of indistinction” between potentiality and actuality,

      selective enforcement

    2. Government-generated blacklists, administrative injunctions, interdictions in the legal code and those resulting from juridical verdicts weave themselves into the lives of Freddi and his friends through various forms of mediation.
    3. With this, the code precludes the possibility that minor changes to a prohibited sign would place it outside of the law’s reach. But the principle of similarity also introduces a critical dimension of ambiguity into the process of adjudication

      more ambiguity

    4. Because of the constitutional clause against censorship, in themselves such signs cannot be placed under legal prohibition. Instead, their outlawing appears in the criminal code almost as a derivative function, as if it were a mere by-product, of a different set of legal restrictions whose concern is rather with placing limits on the constitutional freedom of association, and that regulates the criminalization of counterconstitutional organizations. What comes under their jurisdiction, then, are not particular symbols but rather more generally the dissemination or use of any sign associated with a banned organization in a way that could promote the organization or its goals

      signs themselves are not illegal but the use of them is associated with the organization and thus promotes its goals and therefore is illegal. Seems to be loopholes in most of these

    Annotators

    1. But I do question why lib and not something in app is the common suggestion for classes/modules that do not fall into the default set of folders (models, controllers, jobs, etc). Is it just because it's what we've been doing for so long? To me it feels like we're trying to shoehorn the lib folder into further being a kitchen sink (now holding rake tasks and miscellaneous classes), rather than just saying "your Ruby classes/modules go somewhere in app because they're application code".
    2. Everything has a place so do better and find it. There is a certain belief that everything within app should be organized into functionally-named directories and any files placed in app/lib actually belong in app/services or app/interactors or app/models or someplace if the developers just tried harder. The implication is that developers are bad developers if they don’t yet know what kind of constant they have and where its forever home should be. I reject this. Over the lifespan of an application, there will be constants that have not yet found their functional kin, if those kin ever come to exist at all; sometimes you simply need some code and a place to put it. app/lib can be the convention for where those constants can live temporarily or as long as necessary. Autoloading is really nice, let’s treat them to it.
    1. Author Response

      The following is the authors’ response to the original reviews.

      Thank you for the e-mail of 27th September that includes the eLife assessment and reviewers comments on manuscript eLife-RP-RA-2023-91861. We have considered these, added additional data and made various changes to the text as detailed below. We now submit a modified version that we would be happy to view as the ‘Version of Record’.

      We are very pleased to note the highly positive reports from the reviewers. The major change we have made is to alter the Introduction to include further consideration of the development of the ‘bar-code’ hypothesis. As highlighted by reviewer 2 the Lefkowitz/Duke University Group have been major proponents of this concept. However, as with many topics their views did not emerge in isolation. Indeed we (specifically Tobin) were developing similar ideas in the same period (see Tobin et al., (2008) Trends Pharmacol Sci 29, 413-420). Moreover, other groups, particularly that of Clark and collaborators at University of Texas, were developing similar ideas using the beta2-adrenoceptor as a model at least as early as this (e.g. Tran et al., (2004) Mol Pharmacol 65, 196-206). As such we have re-written parts of the Introduction to reflect these early studies whilst retaining information on more recent studies that have greatly expanded such early work. This has resulted in the addition of extra references and re-numbering of the Reference section. We have also provided statistical analysis of agonist-induced arrestin interactions with the receptor as requested by a reviewer and performed additional studies to assess the effect of the GRK2/3 inhibitor in agonist-regulation of phosphorylation of the hFFA2-DREADD receptor. This has led to an additional author (Aisha M. Abdelmalik) being added to the paper.

      To address first the ‘public reviews’

      Reviewer 1

      1. We agree that we do not at this point explore the implications of the tissue specific barcoding we observe and report. However, as noted by the reviewer these will be studies for the future.

      2. The question of why there are only 2 widely expressed arrestins and very many GPCRs is not one we attempt to address here, and groups using various arrestin ‘conformation’ sensors are probably much better placed to do so than we are.

      Reviewer 2

      1. It is difficult to address the potential low level of ‘background’ staining in some of the immunocytochemical images versus the ‘cleaner’ background in some of the immunoblotting images. The methods and techniques used are very distinct. However, it should be apparent that the immunoblotting studies are performed (both using cell lines and tissues) post-immunoprecipitation and this is likely to reduce such background to a minimum. This is obviously not the case in the immunocytochemical studies. It is also likely that, even though the antisera are immune-selected against the peptide target, there may be some level of immune-recognition that is not limited to the phosphorylated residues.

      2. Whilst this reviewer has commented in detail in the ‘recommendations’ section on the use of English, the other reviewers have not, and we do not find the manuscript challenging to follow or read.

      Reviewer 3

      1. We agree that the mass-spectrometry presented is not quantitative. The intention was for the mass spec to be a guide for the development of the antisera used in the study. We have re-written the initial part of the Results section (page 7) to state that phosphorylation of Ser297 was evident in the basal and agonist-stimulated receptor whilst phosphorylation of Ser296 was only evident following agonist addition.

      2. Immunoblotting is intrinsically variable, as parameters such as antiserum titre in re-used samples are not assessed, and although we are aware that FFA2 displays a degree of constitutive activity (see for example Hudson et al., (2012) J Biol Chem. 287(49):41195-209) we did not make any specific effort to suppress this by, for example, including an inverse agonist ligand. Agonist-regulation of phosphorylation of the receptor, as detected in cell lines by the anti-pThr306/pThr310 antiserum, is exceptionally clear cut in all the images displayed, and as we note for the pSer296/pSer297 antiserum this was always, in part, agonist-independent.

      The point about compound 101 not being tested directly in the immunoblotting studies performed on the cell line-expressed receptor is a good one. We have now performed such studies which are shown as Figure 2E. These illustrate that the GRK2/3 inhibitor compound 101 does not substantially reduce agonist-induced phosphorylation of the receptor, at least as detected by the pThr306/pThr310 antiserum or by the pSer296/pSer297 antiserum. Equally this compound had little effect on recognition of the receptor. As the PD2 mutations which correspond to the targets for the pThr306/pThr310 antiserum have no significant effect on recruitment of arrestin 3 in response to MOMBA (please see additional statistical analysis in modified Figure 2C) this is perhaps not surprising. Moreover, the PD1 mutations that correspond to the pSer296/pSer297 antiserum also, in isolation, only have a partial effect on MOMBA-induced interactions with arrestin 3.

      3. The use of phosphatase inhibitors is an integral part of these studies. As noted in Materials we used PhosSTOP (Roche, 4906837001). However, we failed to make it sufficiently clear that this reagent was present throughout sample preparation for both cell lines and tissue studies. This had been specified previously by two of us (SS, FN, see Fritzwanker S, Nagel F, Kliewer A, Stammer V, Schulz S. In situ visualization of opioid and cannabinoid drug effects using phosphosite-specific GPCR antibodies. Commun Biol. 6, 419 (2023)) but we agree this was insufficient and we now correct this oversight by making this explicit in Results.

      Recommendations

      Reviewer 1

      Competing interest: We apologise for this typographic error. It is now corrected.

      Figures: We have upgraded the figure images to 300 dpi and this markedly improves readability.

      Reviewer 2

      Revisiting writing: We thank the reviewer for their assessment of the text. However, we do not feel that ‘every sentence in the entire manuscript could be clarified’ is a reasonable statement. Neither of the other reviewers commented on this. Each of the authors read and approved the manuscript.

      Figures: see response to Reviewer 1. We have greatly enhanced image quality at this part of the process.

      Statistics on Figure 2: We apologise for this oversight. Although there were no significant differences in potency for MOMBA to promote interactions with arrestin-3 to each of the PD mutants versus wild type receptor, there were in terms of maximal effect. Statistical analysis was performed via one-way ANOVA followed by Dunnett’s multiple comparisons test. This is now detailed directly in Figure 2C and its associated legend. As noted by the reviewer there was indeed a highly significant effect of the GRK2/3 inhibitor compound 101 and this is now also noted in Figure 2D and its associated legend.

      Units on page 9: pEC50 is considered as Molar by default but we have now specified this. PD1-4: It would be cumbersome to write out (and to read) 8 mutations that make up PD1-4 and hence we think this is specified appropriately in the Figure.

      Reviewer 3

      1. Mass spec: Please see comment point 1 to reviewer 3.

      2. Immunoblotting and compound 101: We have done so.

      3. Phosphatase inhibition: see public comments, reviewer 3.

    2. Reviewer #3 (Public Review):

      Summary:

      The authors generate and characterize two phosphospecific antisera for FFA2 receptor and claim a "bar code" difference between white fat and Peyer's patches.

      Strengths:

      The question is interesting and the antibody characterization is convincing.

      Weaknesses:

      The mass spectrometry analysis is not convincing because the method is not quantitative (no SILAC, TMT, internal standards etc). Figure 1 shows single tryptic peptides with one and two phosphorylation fragmentations as claimed, but there is no data testing the abundance of these so the differences claimed between cell treatment conditions are not established.

      The blot analysis cannot distinguish 296/7 but it does convincingly show an agonist increase. Can the authors clarify why the amount of constitutive phosphorylation is much higher in the example blot in Figure 2 than in Figure 3? It would be helpful to quantify this across more than one example, like in Figures 4 and 5 for tissue.

      Compound 101 is shown in Figure 2 to block β-arrestin recruitment. I agree this suggests phosphorylation mediated by GRK2/3 but this is not tested. The new antibodies should be good for this so I don't understand why the indirect approach was used.

      The conditions used to inhibit dephosphorylation are not specified, the method only says "phosphatase inhibitors". How do the authors know that low P at 306/7 in white fat is not a result of dephosphorylation during sample preparation? If these sites are GRK2/3 dependent (see above) then does adipose tissue lack this GRK?

    1. you

      A cheat code for academic writing: instead of writing you one can replace it with one or someone depending on context. It sounds more like instructions and thus more academic. "Imagine a teacher sitting someone down for a test but the entire test is in French. One might be able to pick out words that sound similar, but the sentence structure wouldn't make sense right?" Although this strategy is way less personable so I only use it on my most official papers lmao.

    1. Reviewer #1 (Public Review):

      This work provides a new dataset of 71,688 images of different ape species across a variety of environmental and behavioral conditions, along with pose annotations per image. The authors demonstrate the value of their dataset by training pose estimation networks (HRNet-W48) on both their own dataset and other primate datasets (OpenMonkeyPose for monkeys, COCO for humans), ultimately showing that the model trained on their dataset had the best performance (performance measured by PCK and AUC). In addition to their ablation studies where they train pose estimation models with either specific species removed or a certain percentage of the images removed, they provide solid evidence that their large, specialized dataset is uniquely positioned to aid in the task of pose estimation for ape species.

      The diversity and size of the dataset make it particularly useful, as it covers a wide range of ape species and poses, making it particularly suitable for training off the shelf pose estimation networks or for contributing to the training of a large foundational pose estimation model. In conjunction with new tools focused on extracting behavioral dynamics from pose, this dataset can be especially useful in understanding the basis of ape behaviors using pose.

      Overall this work is a terrific contribution to the field, and is likely to have a significant impact on both computer vision and animal behavior.

      Strengths:

      - Open source dataset with excellent annotations on the format, as well as example code provided for working with it
      - Properties of the dataset are mostly well described
      - Comparison to pose estimation models trained on humans vs monkeys, finding that models trained on human data generalized better to apes than the ones trained on monkeys, in accordance with phylogenetic similarity. This provides evidence for an important consideration in the field: how well can we expect pose estimation models to generalize to new species when using data from closely or distantly related ones.
      - Sample efficiency experiments reflect an important property of pose estimation systems, which indicates how much data would be necessary to generate similar datasets in other species, as well as how much data may be required for fine tuning these types of models (also characterized via ablation experiments where some species are left out)
      - The sample efficiency experiments also reveal important insights about scaling properties of different model architectures, finding that HRNet saturates in performance improvements as a function of dataset size sooner than other architectures like CPMs (even though HRNets still perform better overall).

    1. Joint Public Review:

      Murphy, Fancy and Skene performed a reanalysis of snRNA-seq data from Alzheimer Disease (AD) patients and healthy controls published previously by Mathys et al. (2019), arriving at the conclusion that many of the transcriptional differences described in the original publication were false positives. This was achieved by revising the strategy for both quality control and differential expression analysis. With this re-analysis, the authors aim to raise awareness of the impact of data analysis choices for scRNA-seq data and to caution the AD research community against focusing on putatively misidentified genes. The revised manuscript has been improved by separating QC and DE analysis, which makes interpretation of both steps more straightforward.

      STRENGTHS:

      The authors demonstrate that the choice of data analysis strategy can have a vast impact on the results of a study, which in itself may not be obvious to many researchers.

      The authors apply a pseudobulk-based differential expression analysis strategy (essentially, adding up counts from all cells per individual and comparing those counts with standard RNA-seq differential expression tests), which is (a) in line with latest community recommendations, (b) different from the "default options" in most popular scRNA-seq analysis suites, and (c) explains the vastly different number of DEGs identified by the authors and the original publication. The recommendation of this approach together with a detailed assessment of the DEGs found by both methodologies could potentially be a useful finding for the research community. Unfortunately, it is currently not sufficiently substantiated.
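For illustration, the aggregation step behind the pseudobulk strategy described above (adding up counts from all cells per individual) can be sketched as follows. This is a minimal sketch and not the code used by either set of authors; the input layout, the function name `pseudobulk`, and the donor and gene labels are assumptions made for the example. The differential expression test that would follow on the summed counts is not shown.

```python
from collections import defaultdict


def pseudobulk(cells):
    """Sum per-gene counts over all cells that come from the same individual.

    `cells` is an iterable of (individual_id, {gene: count}) pairs. This
    input layout is an assumption made for the example.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for individual, counts in cells:
        for gene, count in counts.items():
            totals[individual][gene] += count
    # Convert the nested defaultdicts to plain dicts for easier inspection.
    return {ind: dict(genes) for ind, genes in totals.items()}


# Toy data: three cells from two hypothetical individuals.
cells = [
    ("donor_A", {"GENE1": 2, "GENE2": 0}),
    ("donor_A", {"GENE1": 1, "GENE2": 4}),
    ("donor_B", {"GENE1": 5}),
]
bulk = pseudobulk(cells)
```

After this step each individual contributes one count vector per cell type, so the number of observations in the test equals the number of individuals, not the number of cells.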

      All code and data used in this study are publicly available to the readers.

      WEAKNESSES:

      The authors interpret the fact that they found fewer DEGs with their method than the original paper as a good thing by making the assumption that all genes that were not found were false positives. However, they do not prove this, and it is likely that at least some genes were not found due to a lack of statistical power and not because they were actually "incorrect". The original paper also had performed independent validations of some genes that were not found here. I had raised this weakness in my first review, but it was not explicitly addressed and still pertains to the revised manuscript. The authors have added an analysis that shows that "pseudoreplication" is prone to false positive (FP) discoveries for high cell numbers (Fig. 1f), but this does not prove that all of Mathys' DEGs were wrong.

      I am concerned that almost all DEGs found by the authors are in the rare cell types, foremost the rare microglia (see Fig. 1e). Indeed, there is a weak negative correlation between cell counts and numbers of DEGs (Fig. 1e), if the correlation analysis is to be believed (see next point). It is unclear to me how many cells the pseudo-bulk counts were based on for these cell types, but it seems that (a) there were few and (b) there were quite few reads per cell. If both are the case, the pseudobulk counts for these cell populations might be rather noisy and the DEG results liable to outliers with extreme fold changes. Supp. Fig. 3b now shows three examples of DEGs, of which one (EGR1) looks like the DE call is indeed largely driven by four outliers, while Supp. Fig. 3a shows at least one gene (BEX1) that could be an FP of the pseudobulk approach due to insufficient statistical power. The authors go on to cite two papers (one is their own, published in a journal with suspected lack of appropriate quality assurance measures https://predatoryreports.org/the-predatory-journals-1), to support that the finding of DEGs in microglia "makes more sense" (l. 127). In summary, neither the presented examples nor the supporting literature are convincing. Lastly, the authors even show themselves that their approach is liable to FPs if applied with very low cell numbers in the range of those for microglia and OPCs (Fig. 1g).

      The correlation analysis between cell counts and number of DEGs found is weak. In all three cases (Fig. 1c, d, e) the correlation is largely driven by a single outlier data point.

      The authors claim they improved the quality control of the dataset but offer no objective metric to assess this putative improvement. The authors' QC procedure removes some 20k cells that had not been filtered out by Mathys et al. As the authors state themselves, this difference is mostly due to the removal of cells with a high mitochondrial read content. Murphy et al. use a fixed threshold for the mitochondrial percentage of reads, while the original paper had removed cell clusters with an "abnormally high" mitochondrial read fraction. That also seems reasonable, given that some cells might have a higher mitochondrial read content for reasons other than being "low quality". Simply stating that Mathys' approach was ineffective at removing cells with high mitochondrial read content is a self-fulfilling prophecy given the difference in approach, and itself not proof that the original QC procedure was inferior.

      Batch correction: "Dataset integration has become a common step in single-cell RNA-Seq protocols and is recommended to remove confounding sources of variation" (l. 38). While it is true that many authors now choose to perform an integration step as part of their analysis workflow, this is by no means uncontroversial as there is a risk of "over-integration" and loss of true biological differences. I had raised this point previously, but the authors chose not to address it (quoted text and line numbers updated). Given that there is controversy in the literature and "community opinion" on the topic of data integration, this is another example of the authors claiming superiority in analysis without showing proof.

      Due to a lack of comparison with other methods and due to the fact that the authors' methodology was only applied to a single dataset, the paper presents merely a case study, which could be useful but falls short of providing a general recommendation for a best practice workflow.

      APPRAISAL:

      The manuscript could help to increase awareness of data analysis choices in the community, but only if the superiority of the methodology was clearly demonstrated. However, the authors only show that there are differences but have no convincing (orthogonal) evidence that their methodology was indeed better. This applies to both QC and DE analysis.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      _We have underlined the important points in the reviewers' comments. All responses have been read and authorized by all authors of this manuscript. The authors would like to thank the reviewers and the editor for their valuable time. We believe that the comments and suggestions from both reviewers will significantly improve SMorph and the manuscript._

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      First of all, I want to apologize to the authors and editor for my delay. Secondly, for clarity, I want to disclose that I am the author of Fiji's 'Sholl Analysis' plugin, which the authors cite extensively (Ferreira et al, Nat Methods, 2014).

      In this study, Sethi et al introduce a software tool - SMorph - for bulk morphometric analysis of neurons and glia (astrocytes and microglia), based on the Sholl technique. The authors compare it to the state-of-the-art in a series of validation experiments (stab wound injury), to conclude that it is 1000 times faster than existing tools. Empowered by the tool, the authors show that chronic administration of a tricyclic antidepressant (DMI) leads to structural changes of astrocytes in the mouse hippocampus. The paper is well written, the description of the tool is clear, and the authors make all of the source code available, as well as most of the imagery analyzed in the manuscript. The latter on its own makes me really appreciative of the authors' work.

      We thank reviewer #1 for their careful reading of the manuscript and their comments.

      **Major comments:**

      A major strength of SMorph is that it leverages the Python ecosystem, which allows the authors to take advantage of powerful Python packages such as sklearn, without the need for external packages or tools. However, I have strong criticisms of the claims that are made in terms of speed and broad applicability of the software, including PCA.

      Speed:

      The 1000x speed gain assumes, for the most part, that the processing in Fiji cannot be automated. This is false. I read the source code of SMorph, and with the exception of the PCA analysis, all aspects of SMorph can be automated in Fiji, using any of Fiji's scripting languages to make direct calls to the Fiji and Sholl Analysis plugin APIs (see https://javadoc.scijava.org/). Now, perhaps the authors do not have experience with ImageJ scripting, or perhaps we Fiji developers failed to provide clear tutorials and examples on how to do so. Or perhaps there is something inherently cumbersome with Fiji scripting that makes this hard (e.g., there is a current limitation with the ImageJ2 version of 'Sholl Analysis' that does not make it macro recordable). If such limitations do exist, it is perfectly fine to mention them, but do contact us at https://forum.image.sc if something is unclear. We do strive to make our work as re-usable as possible. Unfortunately our own research does not always allow us the time required to do so. Case in point, our scripting examples (e.g., https://github.com/tferr/ASA/blob/master/scripting-examples/3D_Analysis_ImageStack.py) are not well advertised. That being said, I am still surprised that in their side-by-side comparisons the authors were not able to automate more of the processing steps (e.g., the ImageJ1 version of 'Sholl Analysis' remains fully functional and is macro recordable). If I misunderstood what was done, please provide the ImageJ macros you used. Also, I wanted to mention that i) semi-manual tracing with Simple Neurite Tracer (now "SNT") can also be scripted (see https://doi.org/10.1101/2020.07.13.179325); and that ii) Fiji commands and plugins can also be called in native Python using pyimagej (https://pypi.org/project/pyimagej/; see e.g., https://github.com/morphonets/SNT/tree/master/notebooks#snt-notebooks).

      Arguably, the fact that SMorph handles blob detection and skeletonization-based metrics directly is more advantageous from a user point of view. In Fiji, blob detection, skeletonization and Strahler analysis (https://imagej.net/Strahler_Analysis) of the skeleton are handled by different plugins. However, those are also fully scriptable, and interoperate well. The point that topographic skeletonization in Fiji can originate loops is valid; however, the authors should know that such cycles can be detected and pruned programmatically using e.g., pixel intensities (see https://imagej.net/AnalyzeSkeleton.html#Loop_detection_and_pruning and the original publication, https://pubmed.ncbi.nlm.nih.gov/20232465/).

      We completely agree with the reviewer’s assertion that most parts of the functionality of SMorph can be automated within ImageJ as well, and in such a comparison, the speed gains with SMorph will not be >1000X.

      However, automating the analysis in ImageJ is beyond the scope of the present manuscript. In fact, the ImageJ comparison was not a part of our original manuscript at all. Upon presubmission inquiry to one of the affiliate journals of Review Commons, we were specifically asked to include a side-by-side comparison with “already available” methods. So, we decided to use ImageJ as it is, and automation, if any, was limited to simple macros to run a series of commands sequentially on batches of images. Although it is true that this analysis could be done much more efficiently with additional scripting, it would not have met the definition of “already available” tools. The ImageJ analysis was performed in the way an average biologist with no programming experience would perform it, since that group will find SMorph most useful. In no way do we intend to imply that ImageJ analysis can’t be made more efficient and automated. Perhaps this was not clear from the way the text was framed in the initial version of the manuscript. We will add additional text to make this point clearer.

      On a side-note, in response to reviewer #2’s comments, we will perform the speed comparison on a per-image basis, so the speed gain (1080X) may change a little in the new comparison.

      Broad applicability:

      In our work, we made a significant effort to ensure that automated Sholl could be performed on any cell type: e.g., by supporting 2D and 3D images, by allowing repeated measures at each sampled distance, and by improving curve fitting. For linear profiles, we implemented the ability to perform polynomial fits of arbitrary degree, and implemented heuristics for 'best degree' determination. For normalized profiles, we implemented several normalizers, and alternatives for determining regression coefficients. We did not tackle segmentation of images directly (we did provide some accompanying scripts to aid users, see e.g. https://imagej.net/BAR) because in our case that is handled directly by ImageJ and Fiji's large collection of plugins. However, in SMorph, several of these parameters are hard-wired in the code. They may be suitable to the analyzed images, but they can hardly be generalized to other datasets. In detail: In terms of segmentation, SMorph is restricted to 2D images, scales data to a fixed 98th percentile, and uses a fixed auto-threshold method (Otsu). These settings are tethered to the authors' imagery. They will give poor results for someone else using a different imaging setup or staining method. In terms of curve fitting, the polynomial regression seems to be fixed at a 3rd order polynomial, which will not be suitable to different cell types (not even to all cells of 'radial morphology').

      We have indeed hard-coded the parameters that the reviewer mentions, and we agree that we can perhaps give all options to the end-users to choose from. The decision was made to hard-code the parameters so that SMorph becomes very easy and minimalistic to use for the end-users. But the reviewer is right to point out that this may compromise the broad applicability and accuracy. We will update the code in the revised version of the manuscript to give the users control over choosing these parameters.
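To make the point about user-controlled parameters concrete, the segmentation steps named in the exchange above (rescaling to an upper percentile, then Otsu auto-thresholding) can be sketched with the percentile exposed as an argument. This is a hedged, standard-library sketch and not SMorph's implementation; the function names `percentile`, `otsu_threshold` and `segment`, the `clip_percentile` parameter, and the flat list of pixel intensities are all assumptions made for the example.

```python
def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def otsu_threshold(levels, n_levels=256):
    """Otsu's method on integer intensities in [0, n_levels).

    Returns the level t that maximizes the between-class variance when
    pixels <= t are background and pixels > t are foreground.
    """
    hist = [0] * n_levels
    for v in levels:
        hist[v] += 1
    total = len(levels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(n_levels):
        w_bg += hist[t]
        if w_bg == 0:
            continue  # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # every pixel is already background
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def segment(pixels, clip_percentile=98.0, n_levels=256):
    """Clip at a user-chosen percentile, rescale, then apply Otsu."""
    cap = percentile(pixels, clip_percentile)
    if cap <= 0:
        return [False] * len(pixels)
    scaled = [int(min(p, cap) / cap * (n_levels - 1)) for p in pixels]
    threshold = otsu_threshold(scaled, n_levels)
    return [s > threshold for s in scaled]


# Toy "image": 50 dim background pixels and 50 bright foreground pixels.
pixels = [10] * 50 + [200] * 50
mask = segment(pixels, clip_percentile=98.0)
```

Exposing `clip_percentile` (and, in a fuller version, the choice of thresholding method) lets a user with a different imaging setup or stain adjust the segmentation without editing the source.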

      PCA:

      The idea of making PCA analysis of Sholl-based morphometry accessible to a broader user base has merit and is welcomed. However, it has to be done carefully and in a self-critical manner, as opposed to a black-box solution. E.g., in the text it is mentioned that 2 principal components are used; in the tutorial notebook, 3. Why not provide intuitive scree plots that empower users to critique that choice? Also, it would be useful for users to understand which metrics correlate with each other, and their variable weights.

      Reviewer #1’s suggestions would indeed make the PCA analysis more useful to the users. In the revised version of the code, we will provide additional data/plots to the user for making an informed choice of the significant principal components e.g. the elbow method, Ogive or Pareto plots, variable weights of different features in the principal components and correlation/covariance matrices.
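The quantity behind a scree or Pareto plot is the share of total variance captured by each principal component. The sketch below assumes the per-component variances have already been obtained from a PCA routine; the numbers, the function names, and the 85% target are hypothetical and chosen only for illustration, not taken from the manuscript.

```python
def explained_variance_ratio(component_variances):
    """Fraction of the total variance captured by each component."""
    total = sum(component_variances)
    return [v / total for v in component_variances]


def components_for(component_variances, target=0.9):
    """Smallest number of leading components whose cumulative share
    of the variance reaches `target`."""
    cumulative = 0.0
    ratios = explained_variance_ratio(component_variances)
    for k, ratio in enumerate(ratios, start=1):
        cumulative += ratio
        if cumulative >= target:
            return k
    return len(component_variances)


# Hypothetical variances of five principal components, largest first.
variances = [4.0, 2.0, 1.0, 0.5, 0.5]
ratios = explained_variance_ratio(variances)
n_keep = components_for(variances, target=0.85)

# A text-only stand-in for a scree plot: one bar per component.
for index, ratio in enumerate(ratios, start=1):
    print(f"PC{index} {'#' * int(ratio * 40)} {ratio:.3f}")
```

Showing the ratios next to the chosen number of components lets a reader judge whether keeping 2 or 3 components is justified for a given dataset.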

      When we showcased the utility of PCA to distinguish closely related morphology groups (as in Type-1 and Type-2 PV neurons), we had been unable to base the distinction on individual metrics, at least not in a robust manner (see Fig. S4 in Ferreira et al, 2014). A minor conundrum of the paper is that it does not directly highlight the advantages of "analyzes in a multidimensional space". The differences between groups in the stab wound and DMI assays are such that PCA is hardly needed: i.e., the differences depicted in Fig. 2F,G are already significant, and already convey changes in "size and branch complexity" (as per PC1). The same argument applies to Fig. 5. The paper would profit from having this discussed.

      PCA data is indeed not required to make any of the inferences we make in the paper and is superfluous. However, as mentioned in the discussion section of this manuscript, the low-dimensional PCA data can be used in the future for other applications, e.g., to cluster the astrocytes into morphometrically-defined subpopulations. SMorph can be further developed to perform real-time classification of these cells into morphometric clusters, which will allow researchers to investigate cluster-specific gene expression, electrophysiology etc. Preliminary results from our lab do suggest that such clusters are differentially altered by stress and antidepressant treatments. However, these results are preliminary and are a part of a long-term future study. The data are too premature to publish at this stage, since it will require a lot of experimentation to show that these astrocyte subpopulations are indeed physiologically and functionally different. Nevertheless, we think that the utility of SMorph for such analyses may help others to come up with additional innovative ways to use the PCA data. Hence, we do believe that the community will benefit from the current release of SMorph having PCA. PCA data was shown in the figures just to demonstrate the functionality of SMorph. We will add additional text to make these points clearer.

      Other:

      - All metrics and parameters should be expressed in physical units (e.g., "radii increasing by 3 pixels", axes in Figure 2, 3, 5, S2) so that readers can directly interpret them.

      In the revised manuscript, we will convert all units into actual physical distances.
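The conversion from pixel-based to physical measurements is a scaling by the image calibration. A minimal sketch follows; the calibration of 0.5 µm per pixel and the function names are hypothetical, and the real value would come from the microscope metadata.

```python
def radii_in_microns(n_shells, step_px, microns_per_pixel):
    """Radii of concentric Sholl shells, converted from pixels to microns."""
    return [i * step_px * microns_per_pixel for i in range(1, n_shells + 1)]


def area_in_square_microns(area_px, microns_per_pixel):
    """Areas scale with the square of the linear calibration."""
    return area_px * microns_per_pixel ** 2


# Hypothetical calibration: 0.5 µm per pixel, shells every 3 pixels.
radii = radii_in_microns(n_shells=4, step_px=3, microns_per_pixel=0.5)
area = area_in_square_microns(100, microns_per_pixel=0.5)
```

Lengths such as total skeleton length scale linearly with the calibration, while the 2D-projected area and convex hull area scale with its square.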

      - The paper would profit from the insights provided by Bird & Cuntz (https://pubmed.ncbi.nlm.nih.gov/31167149/)

      We thank the reviewer for suggesting this paper. We will include this in the discussion of the manuscript.

      **Minor comments:**

      - Usage of RGB images (8-bit per channel) seems hardly justifiable. Aren't you losing dynamic range of the GFAP signal?

      We agree that we could have captured the images at a higher dynamic range. However, for the changes we observe between treatment groups using GFAP immunoreactivity signal as presented in the manuscript, we do not see an advantage of using higher dynamic range. However, as the reviewer rightly pointed out, under certain conditions, imaging using a higher dynamic range may help and hence, we will include this recommendation in the materials and methods section.

      - Please explain how MaxAbsScaler "prevents sub-optimal results"

      Since morphometric features extracted from cell images either have different units or are scalar, we had to perform normalization before PCA. We will add further explanation in the methods section of the manuscript.
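As an illustration of why normalization matters before PCA, the sketch below applies max-abs scaling, the per-feature operation that scikit-learn's `MaxAbsScaler` documents: each feature is divided by its largest absolute value. It is written with the standard library only and is not SMorph's code; the feature values and the function name `max_abs_scale` are made up for the example.

```python
def max_abs_scale(rows):
    """Divide each column by its largest absolute value, so that every
    feature ends up in the range [-1, 1]."""
    n_cols = len(rows[0])
    # Fall back to 1.0 for an all-zero column to avoid dividing by zero.
    scales = [max(abs(row[j]) for row in rows) or 1.0 for j in range(n_cols)]
    return [[row[j] / scales[j] for j in range(n_cols)] for row in rows]


# Hypothetical features per cell: [skeleton length in µm, branch points].
cells = [[400.0, 8.0], [200.0, 16.0], [100.0, 4.0]]
scaled = max_abs_scale(cells)
```

Without such scaling, a feature measured in hundreds of microns would dominate the variance, and therefore the leading principal components, over a feature counted in single digits.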

      - The fact that automated batch processing can stall on a single bad 'contrast ratio' image seems rather cumbersome to deal with

      This problem has been resolved in the current version of SMorph, which will be uploaded with the revised version of the manuscript.
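One common way to keep a batch run from stalling on a single problematic image is to catch the failure, record it, and continue. The sketch below shows that general pattern; it is not a description of how SMorph resolved the issue, and `process_batch`, `toy_analyze` and the contrast cut-off of 10 are hypothetical stand-ins.

```python
def process_batch(items, analyze):
    """Apply `analyze` to every item, collecting failures instead of stopping."""
    results, failures = {}, {}
    for name, image in items:
        try:
            results[name] = analyze(image)
        except Exception as err:  # keep going; report the problem afterwards
            failures[name] = str(err)
    return results, failures


def toy_analyze(image):
    """Stand-in analysis: reject images whose intensity range is too small."""
    contrast = max(image) - min(image)
    if contrast < 10:
        raise ValueError("contrast too low")
    return contrast


# Three tiny hypothetical "images" given as flat lists of intensities.
batch = [
    ("cell_1", [0, 120, 255]),
    ("cell_2", [50, 52, 55]),
    ("cell_3", [5, 90]),
]
results, failures = process_batch(batch, toy_analyze)
```

Returning the failures alongside the results makes it explicit which cells were excluded, which matters if exclusion could differ between treatment groups.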

      - Please add a license to https://github.com/parulsethi/SMorph/. Without it, other projects may shy away from using SMorph

      We will add a GPLv3 license.

      - "mounted on stereotax" should be "mounted on a stereotaxis device"?

      We will make this change

      - Ensure Schoenen is capitalized

      We will make this change

      Reviewer #1 (Significance (Required)):

      I find the Desipramine results interesting. However, given the existing claims that DMI can modulate LTP, I regret that the authors did not look at structural modifications in hippocampal neurons (e.g., by performing the experiments in Thy1-M-eGFP animals). I understand, that doing so at this point would be a large undertaking.

      Another manuscript from our lab [1], as well as work from other labs, has shown that stress causes significant degenerative changes in hippocampal astrocytes [2,3]. In the light of these observations, we do believe that our observation of chronic antidepressant treatment inducing structural plasticity in astrocytes is significant. Structural alterations in neurons after DMI treatment are of interest. But in our experience, we have not seen gross morphological (dendritic arborization) changes in hippocampal neurons as a result of antidepressant drug treatments. Such changes are restricted to spine morphology and axonal varicosities, which is beyond the capabilities of SMorph.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      This paper addresses the challenge of automatic Sholl analysis of large dataset of multiple cell types such as neurons, astrocytes and microglia. The developed approach should improve the speed of morphology analysis compared to the state of the art without compromising on the accuracy. The authors present an interesting application of their tool to the morphological analysis of astrocytes following chronic antidepressant treatment. The paper is well written, and the tool presented could be beneficial for different applications and context. However, some major aspects should be addressed by the author concerning the description of the algorithms used and the quantification of the results.

      We thank reviewer #2 for their careful reading of the paper and their comments.

      **Major comments/Questions:**

      1. In the Results and/or Methods sections, the authors should better describe how their approach differs from state-of-the-art approaches in terms of the algorithms used and how these differences affect the speed and accuracy of the analysis.

      We will add these descriptions in the methods section in response to this comment as well as some comments from reviewer #1.

      Imaging was performed on a Zeiss LSM 880 airyscan confocal microscope. Is this method robust to other types of imaging techniques, other microscopes, variable levels of signal-to-noise? This should be tested and quantified.

      We will demonstrate the results obtained from images taken using different microscopes and imaging techniques, and quantify the outcome.

      Manual cropping of the cells with ImageJ was used. However, in the methods section, the authors mention that other machine learning tools could be used for this task. Why were these tools not implemented in this paper in order to propose a fully automated analysis approach in combination with SMorph?

      We have tried both the machine learning tools cited in this paper (one for DAB images and other for confocal images). However, in our experience, we do not get robust performance from these tools with our datasets, and these tools will perhaps need more optimization for broad applicability. We are developing an auto-cropping tool in-house, but that is beyond the scope of the current study. Another point is that these tools are tailor-made for astrocytes, and their integration into SMorph will restrict its applicability to just one cell type.

      In the methods section you state that cropped cells need to have a good contrast ratio for automated batch processing. Could you define what a good contrast ratio is and characterize the performance of your approach for different contrast ratio?

      In the revised manuscript, we will compare the images taken from multiple microscopes and quantify the outcome. We will change the text accordingly. As such, the comment on rejected cells referred to really poor quality images. In the revised manuscript, we will make specific recommendations on imaging parameters so that this should not be an issue at all.

      It is mentioned that the analysis routine can be interrupted by a cell with lower contrast ratio. This is a major drawback of the approach (but I think that it could be easily improved), as such interruptions may not be practicable for many applications that need to rely on automated processing.

      We have already rectified this problem and the updated version of SMorph will be uploaded with the revised manuscript.

      Also, you should specify how the contrast ratio should be enhanced without modifying raw data in order to be processed with your approach. You suggest removing cells with lower contrast ratio from the analysis, but can this affect the findings, especially if some treatments alter the detected fluorescence signal? Can you propose ways to improve the robustness of your approach to variable signal ratios?

      It is indeed possible that removing cells from analysis, may in certain cases, affect the results. To rectify this, we are testing the method on images obtained from different microscopes and under different imaging conditions. From these analyses, we will deduce minimum recommendations for imaging conditions so that images don’t have to be edited/altogether removed from analysis for the software to work. In the materials and methods section, we will add these recommendations to the users on the optimal range of imaging parameters. This way, rejection/modification of images should not be an issue.

      In the Results section, you describe the time necessary to perform different analysis. However, giving a total time in hours is not very informative as this will likely vary a lot depending on the size of the dataset, complexity of the images, etc. You should compare the average time per image for both methods and types of analysis.

      We compared the total time required for the entire dataset, since SMorph is meant for batch-processing all the images at once. However, we can change the comparisons to time taken per image. We can divide the total time taken by SMorph by the number of images analysed. However, in our opinion, the time taken to initiate SMorph will make these comparisons inaccurate.
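The per-image comparison discussed here is simple arithmetic, and the start-up overhead mentioned in the response can be separated out if it is measured. The sketch below shows one way to do so; all timings are invented for the example and are not measurements from the manuscript, and the function names are assumptions.

```python
def seconds_per_image(total_seconds, n_images, startup_seconds=0.0):
    """Average processing time per image, excluding a one-off start-up cost."""
    if n_images <= 0:
        raise ValueError("n_images must be positive")
    return (total_seconds - startup_seconds) / n_images


def speedup(reference_seconds_per_image, candidate_seconds_per_image):
    """How many times faster the candidate is than the reference."""
    return reference_seconds_per_image / candidate_seconds_per_image


# Invented numbers: 60 images in 130 s, of which 10 s is initialization.
automated = seconds_per_image(130.0, 60, startup_seconds=10.0)
ratio = speedup(200.0, automated)
```

Reporting the start-up time and the per-image time separately avoids the inaccuracy the authors are concerned about, since the fixed cost is then not spread over a dataset-dependent number of images.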

      You state that for the number of branch points, the lower value of the measured slope when comparing SMorph and ImageJ was related to a constant overestimation of this parameter with ImageJ. How was this quantified? I think you should place more emphasis on the comparison of both approaches with the manually annotated dataset.

      In the revised version of this manuscript, we will include some examples of skeletonized images that overestimate the number of forks. We have observed this to be a recurring problem with the skeletonization tools we have tried in ImageJ. This can be rectified in ImageJ itself as pointed out by reviewer #1. However, that’s beyond the scope of the present study and will not fit the definition of comparison with “already available” methods.

      How can you explain the differences in the 2D-projected Area, total skeleton length and convex hull between SMorph and ImageJ, which all show a slope around 0.83? Can you quantify the performance of both methods by comparing them with your manually annotated dataset?

      In the revised version, we will include the correlation data between completely manual and SMorph comparisons. We will discuss these comparisons further in the manuscript and make specific conclusions about the accuracy.

      In the introduction and discussion, you mention that you present a method that works on neurons, astrocytes and microglia. However, I don't see in the paper the comparison between the accuracy for all these cell types as you seem to have analyzed only the morphology of astrocytes.

      In the revised manuscript, we will include the Sholl analysis comparison (ImageJ vs SMorph) from images of neurons and microglia.

      You mention that your method is quite sensitive to variation in contrast ratio. You should quantify the contrast ratio throughout the experiments and ensure that this is not biasing the SMorph analysis for some of the treatments.

      We thank both reviewers for highlighting this issue in the initial version of SMorph. As mentioned in our response to point #6, we will perform additional analyses to make specific recommendations to the end users regarding imaging parameters so that SMorph can work on images as they are. As such, our comments on contrast ratio applied only to very poor quality images. If images are acquired conforming to the imaging parameters we will recommend in the revised manuscript, images can be analysed without any issues.

      **Minor Points :**

      1. Specify the exact inclusion and exclusion criteria for soma detection and rephrase: "The high-intensity blobs were detected as a position of soma..." & "Boundary blobs coming from adjacent cells...".

      We will add a complete explanation of blob detection and the exclusion criterion in the methods section.

      Throughout the text, make sure to always refer to an analysis time per image or per cell, and do not give absolute durations without reference to the task at hand (e.g. in the discussion: "SMorph took 40 seconds to complete the analysis..."; please state exactly which analysis you are referring to and, if applicable, whether it varies from cell to cell).

      We will change all comparisons to time taken per cell. Text will be added to mention which datasets were used when any claims of speed are made.

      When you state in the discussion that "Although some methods do allow Sholl analysis without manual neurite tracing, they still work on one cell at a time", please specify whether the only aspect missing from this type of analysis is batch processing (looping through the data) or whether there is a major obstacle to automating this technique. This is important, as SMorph also proceeds with the analysis one cell at a time but can work in a loop/batch.

      We will elaborate further on our assertion regarding the challenges of using ImageJ plugins for Sholl analysis in large batches of cells.

      Reviewer #2 (Significance (Required)):

      This tool could be very useful to researchers in the field of cellular neuroscience working with high-throughput analysis of microscopy data. The authors show some interesting improvements over existing methods. An improved quantitative characterization of the robustness of their approach would be of great importance to ensure the significance of this tool to a large community of researchers using different types of microscopes or studying different cell types.

      My expertise is in the field of optical microscopy and high-throughput (automated) image analysis for neuroscience. My expertise to evaluate the biological findings in this study is very limited.

      We thank reviewer #2 for their careful reading of the manuscript and their insightful comments. Growing evidence (clinical and preclinical) shows a significant reduction in astrocyte density in key limbic brain regions as a result of depression. We believe that the structural plasticity induced by chronic antidepressant treatment, as demonstrated in this manuscript, is an interesting novel plasticity mechanism that can negate deleterious effects of stress on astrocytes.

      The improvements suggested by both reviewers will help us to greatly improve SMorph in the revised version of this manuscript.

      References:

      1. Virmani, G., D’almeida, P., Nandi, A. & Marathe, S. Subfield-specific Effects of Chronic Mild Unpredictable Stress on Hippocampal Astrocytes. doi:10.1101/2020.02.07.938472.
      2. Czéh, B., Simon, M., Schmelting, B., Hiemke, C. & Fuchs, E. Astroglial plasticity in the hippocampus is affected by chronic psychosocial stress and concomitant fluoxetine treatment. Neuropsychopharmacology 31, 1616–1626 (2006).
      3. Musholt, K. et al. Neonatal separation stress reduces glial fibrillary acidic protein- and S100beta-immunoreactive astrocytes in the rat medial precentral cortex. Dev. Neurobiol. 69, 203–211 (2009).
    1. Assignment: A statement that assigns a value to a variable. Concatenate: To join two operands end to end. Comment: Information in a program that is meant for other programmers (or anyone reading the source code) and has no effect on the execution of the program. Evaluate: To simplify an expression by performing the operations in order to yield a single value. Expression: A combination of variables, operators, and values that represents a single result value. Floating Point: A type that represents numbers with fractional parts. Integer: A type that represents whole numbers. Keyword: A reserved word that is used by the compiler to parse a program; you cannot use keywords like if, def, and while as variable names. Mnemonic: A memory aid. We often give variables mnemonic names to help us remember what is stored in the variable. Modulus Operator: An operator, denoted with a percent sign (%), that works on integers and yields the remainder when one number is divided by another. Operand: One of the values on which an operator operates. Operator: A special symbol that represents a simple computation like addition, multiplication, or string concatenation. Rules of Precedence: The set of rules governing the order in which expressions involving multiple operators and operands are evaluated. Statement: A section of code that represents a command or action. So far, the statements we have seen are assignment statements and print expression statements. String: A type that represents sequences of characters. Type: A category of values. The types we have seen so far are integers (type int), floating-point numbers (type float), and strings (type str). Value: One of the basic units of data, like a number or string, that a program manipulates. Variable: A name that refers to a value.

      Assignment A statement that assigns a value to a variable.

      Concatenate To join two operands end to end.

      Comment Information in a program that is meant for other programmers (or anyone reading the source code) and has no effect on the execution of the program.

      Evaluate To simplify an expression by performing the operations in order to yield a single value.

      Expression A combination of variables, operators, and values that represents a single result value.

      Floating Point A type that represents numbers with fractional parts.

      Integer A type that represents whole numbers.

      Keyword A reserved word that is used by the compiler to parse a program; you cannot use keywords like if, def, and while as variable names.

      Mnemonic A memory aid. We often give variables mnemonic names to help us remember what is stored in the variable.

      Modulus Operator An operator, denoted with a percent sign (%), that works on integers and yields the remainder when one number is divided by another.

      Operand One of the values on which an operator operates.

      Operator A special symbol that represents a simple computation like addition, multiplication, or string concatenation.

      Rules of Precedence The set of rules governing the order in which expressions involving multiple operators and operands are evaluated.

      Statement A section of code that represents a command or action. So far, the statements we have seen are assignment statements and print expression statements.

      String A type that represents sequences of characters.

      Type A category of values. The types we have seen so far are integers (type int), floating-point numbers (type float), and strings (type str).

      Value One of the basic units of data, like a number or string, that a program manipulates.

      Variable A name that refers to a value.
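Several of the glossary terms above can be seen together in a few lines of Python. This is only an illustrative sketch; the variable names (`minutes`, `rate`, `greeting`, and so on) are made up for the example and do not come from the annotated text.

```python
# Assignment: each statement binds a value to a variable name.
minutes = 645            # integer (type int)
rate = 2.5               # floating point (type float)
greeting = "Hello, "     # string (type str)

# Modulus operator: the remainder when one integer is divided by another.
hours = minutes // 60
remainder = minutes % 60

# Concatenation: join two string operands end to end.
message = greeting + "world"

# Rules of precedence: multiplication is evaluated before addition,
# unless parentheses group the addition first.
total = 1 + 2 * 3
grouped = (1 + 2) * 3

print(hours, remainder, message, total, grouped)
```

The mnemonic names here are only for the reader; Python would evaluate the same expressions with any valid variable names.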

    1. Background: In recent years, three-dimensional (3D) spheroid models have become increasingly popular in scientific research as they provide a more physiologically relevant microenvironment that mimics in vivo conditions. The use of 3D spheroid assays has proven to be advantageous as it offers a better understanding of the cellular behavior, drug efficacy, and toxicity as compared to traditional two-dimensional cell culture methods. However, the use of 3D spheroid assays is impeded by the absence of automated and user-friendly tools for spheroid image analysis, which adversely affects the reproducibility and throughput of these assays. Results: To address these issues, we have developed a fully automated, web-based tool called SpheroScan, which uses the deep learning framework called Mask Regions with Convolutional Neural Networks (R-CNN) for image detection and segmentation. To develop a deep learning model that could be applied to spheroid images from a range of experimental conditions, we trained the model using spheroid images captured using IncuCyte Live-Cell Analysis System and a conventional microscope. Performance evaluation of the trained model using validation and test datasets shows promising results. Conclusion: SpheroScan allows for easy analysis of large numbers of images and provides interactive visualization features for a more in-depth understanding of the data. Our tool represents a significant advancement in the analysis of spheroid images and will facilitate the widespread adoption of 3D spheroid models in scientific research. The source code and a detailed tutorial for SpheroScan are available at https://github.com/FunctionalUrology/SpheroScan.

      This work has been peer reviewed in GigaScience (see https://doi.org/10.1093/gigascience/giad082 ), which carries out open, named peer-review. This review is published under a CC-BY 4.0 license:

      **Reviewer Name: Francesco Pampaloni **

      This study represents a significant contribution to the field of screening and analysis of three-dimensional cell cultures. The demand for reliable and user-friendly image processing tools to extract quantitative data from a large number of spheroids or other types of three-dimensional tissue models is substantial. The authors of this manuscript have developed a tool that aims to address this need by providing a straightforward method to extract the projected area and intensity of individual cellular spheroids imaged with bright-field microscopy. The tool is compatible with "Incucyte" microscopes or any other automated microscope capable of imaging multiple specimens, typically found in high-density multiwell plates.

      An admirable aspect of this work is the authors' decision to make all the code and pipeline openly available on GitHub. This openness allows other scientists to test and validate the code, promoting transparency and collaboration in the scientific community. However, several improvements should be made to the manuscript prior to publication.

      One important aspect that the authors should address in the manuscript is the suitability, rationale, and extent of using a neural network-based segmentation approach for the specific analysis described in the manuscript: segmentation of single bright-field images of spheroids.

      While neural networks are anticipated to play an increasingly important role in microscopy data segmentation in the coming years, they are not a universal solution. Although there may be segmentation tasks that are challenging to accomplish with traditional approaches, where neural networks can be highly effective, other segmentation tasks can be successfully performed using conventional strategies. For example, in our research group, we were able to reliably segment densely populated bright-field images containing numerous organoids in a single field of view using a pipeline based on the ImageJ plugin MorphoLibJ (see references: https://doi.org/10.1093/bioinformatics/btw413 and https://doi.org/10.1186/s12915-021-00958-w). Therefore, it would be informative and valuable for readers if the authors compared the results obtained from the neural network with those achieved by employing simple thresholding techniques (such as Otsu or Watershed) on the same dataset, as demonstrated in a similar study (reference: https://doi.org/10.1038/s41598-021-94217-1, Figure 5).
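As a rough illustration of the kind of simple thresholding baseline suggested above, Otsu's method can be sketched in plain NumPy. This is not the reviewer's MorphoLibJ pipeline or the authors' code; the function name `otsu_threshold` and the synthetic image (a dark disk on a bright background, standing in for a bright-field spheroid) are assumptions made for the example.

```python
import numpy as np

def otsu_threshold(image, bins=256):
    """Return the intensity that maximizes between-class variance (Otsu's method)."""
    hist, edges = np.histogram(image.ravel(), bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2.0
    weight_low = np.cumsum(hist)               # pixel count at or below each bin
    weight_high = weight_low[-1] - weight_low  # pixel count above each bin
    cum_intensity = np.cumsum(hist * centers)
    # Empty classes give 0/0 here; those entries are zeroed afterwards.
    with np.errstate(divide="ignore", invalid="ignore"):
        mean_low = cum_intensity / weight_low
        mean_high = (cum_intensity[-1] - cum_intensity) / weight_high
        between = weight_low * weight_high * (mean_low - mean_high) ** 2
    between = np.nan_to_num(between)
    return centers[np.argmax(between)]

# Synthetic bright-field-like image: dark disk (intensity ~80) on a bright
# background (intensity ~200), with Gaussian noise.
rng = np.random.default_rng(0)
yy, xx = np.mgrid[:64, :64]
disk = (yy - 32) ** 2 + (xx - 32) ** 2 <= 15 ** 2
image = np.where(disk, 80.0, 200.0) + rng.normal(0.0, 5.0, size=(64, 64))

threshold = otsu_threshold(image)
mask = image < threshold  # the spheroid is darker than the background
print(f"threshold={threshold:.1f}, segmented fraction={mask.mean():.3f}")
```

On clean, well-separated images like this one, a global threshold recovers the object almost exactly; the comparison the reviewer asks for would show how far that holds on real images with debris or uneven illumination.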

      Furthermore, to address the limitations of the model, the authors should provide specific examples (preferably in the supplementary material due to space constraints) of incorrect segmentations or artifacts that arise from applying the neural network to the data. For instance, it would be beneficial to explore scenarios where spheroids are surrounded by cellular debris or when multiple spheroids are present in the field of view. These real-life situations are common and it is important to provide insights into potential challenges that may arise when the images of the spheroids are not pristine.

    2. Background: In recent years, three-dimensional (3D) spheroid models have become increasingly popular in scientific research as they provide a more physiologically relevant microenvironment that mimics in vivo conditions. The use of 3D spheroid assays has proven to be advantageous as it offers a better understanding of the cellular behavior, drug efficacy, and toxicity as compared to traditional two-dimensional cell culture methods. However, the use of 3D spheroid assays is impeded by the absence of automated and user-friendly tools for spheroid image analysis, which adversely affects the reproducibility and throughput of these assays. Results: To address these issues, we have developed a fully automated, web-based tool called SpheroScan, which uses the deep learning framework called Mask Regions with Convolutional Neural Networks (R-CNN) for image detection and segmentation. To develop a deep learning model that could be applied to spheroid images from a range of experimental conditions, we trained the model using spheroid images captured using IncuCyte Live-Cell Analysis System and a conventional microscope. Performance evaluation of the trained model using validation and test datasets shows promising results. Conclusion: SpheroScan allows for easy analysis of large numbers of images and provides interactive visualization features for a more in-depth understanding of the data. Our tool represents a significant advancement in the analysis of spheroid images and will facilitate the widespread adoption of 3D spheroid models in scientific research. The source code and a detailed tutorial for SpheroScan are available at https://github.com/FunctionalUrology/SpheroScan

      This work has been peer reviewed in GigaScience (see https://doi.org/10.1093/gigascience/giad082 ), which carries out open, named peer-review. This review is published under a CC-BY 4.0 license:

      **Reviewer name: Kevin Tröndle **

      The authors present a "Technical Note" about an open-source web tool called SpheroScan. As input, users can upload (large batches of) spheroid images (brightfield, 2D). The tool delivers two outputs: (1) Prediction Module: creates a file with the area and intensity of detected spheroids (CSV), (2) Visualization Module: plots of the corresponding parameters (PNG). Performance was tested on 480 Incucyte images and 423 microscope images with 336 (70 %) and 265 for training, 144 (30 %) and 117 for validation, and 50 images for testing, respectively. The framework is based on Mask R-CNN and the Detectron2 library. The performance was tested in the range of 0.5 to 0.95 against manual annotation (VGG Annotator). As an evaluation measure, they used Intersection over Union (IoU), which quantifies the overlap between the predicted and ground-truth regions, and calculated Average Precision (AP) values for masking: 0.937 and 0.972 (Test), 0.927 and 0.97 (Validation), as well as AP for bounding box: 0.899 and 0.977 (Test), 0.89 and 0.944 (Validation). They show a linear runtime, demonstrated with different-sized datasets (1 s / image) for masking on a 16-core CPU, 64 GB RAM machine. The tool is available on GitHub and claimed to be available as a web tool on spheroscan.onrender.com.

      General evaluation: The concept of the tool serves some important needs of 3D cell culture-based assays: automated, standardized, high-throughput image analysis. As such, it represents value added for the research field.

      However, it remains open how high the impact, the reproducibility, and the chances of potential application by other researchers will be. This is due to some significant limitations in accessibility (i.e. non-permanent or non-functional web tool), as well as the (potential) restriction of input data (i.e. brightfield only, not validated with external data) and the limited options for analysis of the metadata (i.e. area and intensity only). The greatest value stems from the possibility to access a web interface, which is easy to use and will ideally be equipped with additional functionalities in the future.
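For readers unfamiliar with the IoU measure mentioned in the summary above, a minimal sketch follows. It assumes that the "0.5 to 0.95" range refers to IoU thresholds, as in COCO-style evaluation; the review does not say so explicitly. The function name `mask_iou` and the toy masks are hypothetical, and this is not the Detectron2 evaluation code.

```python
import numpy as np

def mask_iou(pred, truth):
    """Intersection over Union of two boolean masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        return 1.0  # both masks empty: treated as perfect agreement (a convention)
    return np.logical_and(pred, truth).sum() / union

# Toy example: two 6x6 squares offset by one pixel in each direction.
truth = np.zeros((10, 10), dtype=bool)
truth[2:8, 2:8] = True
pred = np.zeros((10, 10), dtype=bool)
pred[3:9, 3:9] = True

iou = mask_iou(pred, truth)

# A prediction counts as a match at a given threshold if its IoU reaches it.
thresholds = np.linspace(0.5, 0.95, 10)
matches = iou >= thresholds
print(f"IoU={iou:.3f}, matched at {matches.sum()} of {len(thresholds)} thresholds")
```

Average Precision is then computed from such matches across all predictions and averaged over the thresholds, so a high AP at strict thresholds indicates tight agreement with the manual annotation.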

      Comment 1 (minor): The presented tool uses the Mask R-CNN deep-learning model in its image processing pipeline. Several tools that perform image segmentation based on this or other models are well established and already implemented in commercial imaging devices. They allow for segmentation of cell-containing image areas, e.g. to determine confluency or cell migration in "wound healing assays", and are mainly optimized for 2D cultures but also applicable to 2D images of 3D spheroids. The concept of automated image segmentation is thus not novel and only meets the journal's input criterion as an "update or adaptation of existing" tools. The state of the art and preliminary work are not sufficiently referenced. Several similar and alternative (open-source) tools exist and should be mentioned in the manuscript, e.g. (Lacalle et al., 2021; Piccinini et al., 2023; Trossbach et al., 2023), to give only a few examples.

      Comment 2 (major): The authors claim to present a user-friendly open-source web tool. The Python project is available on GitHub, and on a demo server (https://spheroscan.onrender.com/) where the web interface can be accessed. Unfortunately, the mentioned web tool is not functional, i.e. it is stated on the website: "This is a demonstration server and the prediction module is not available for use. To utilize the prediction functionality, please run SpheroScan on your local machine." This significantly limits the applicability of the presented tool to users who are able to execute Python code on their local hardware. Therefore, the demo server should either present a functional user interface (recommended), or the statement should be removed from the manuscript, which would limit the impact of the submission significantly.

      Comment 3 (major): The presented algorithm was trained exclusively on internal data of brightfield images from "Incucyte and microscope platforms". Furthermore, two distinct models were generated, working with either Incucyte or microscope images. It remains unclear how the algorithm will perform on external data of prospective users. The fact that two distinct models had to be trained for different image sources (i.e. from two different platforms) indicates a limited robustness of the models in this regard. This is clearly a general problem of image processing algorithms, but one that will stand in the way of applicability by external users, who will certainly use other imaging techniques. Since the web tool interface is not functional at this point, the authors will also not be able to evaluate or improve on this after publication. At least one performance test with external data, ideally obtained from a blinded user, should be performed to elaborate on this further.

      Comment 4 (major): Many assays nowadays use fluorescent labels, for example to calculate cell ratios within 3D arrangements, e.g. for cell viability or the expression of certain proteins. The authors do not state whether the algorithm (or future iterations thereof) is or will be able to process multi-channel microscope images of spheroids. This is a significant limitation of the presented work and should at least be mentioned in the corresponding section. Furthermore, a proof-of-concept test run with fluorescent images could be performed to test the algorithm's performance and derive potentially necessary adaptations for future versions.

      Comment 5 (minor): The output of the tool is a list of detected spheroids with the corresponding area (2D) and average bright-field intensity within the area. The usability of these two parameters is limited to specific assays, such as the mentioned use case of collagen gel contraction assays. Several other parameters of interest could easily be derived from the metadata, such as roundness, volume estimation (assuming a spherical shape), or even cell count estimation. This should again be mentioned in the "limitations and considerations" section.
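To make the suggestion in Comment 5 concrete, two of the derived parameters can be sketched with standard geometric formulas. The function names are hypothetical, and the perimeter input is an assumption: the review only states that SpheroScan reports area and intensity, so a perimeter would have to be measured from the segmentation mask separately.

```python
import math

def circularity(area, perimeter):
    """Roundness as 4*pi*A / P^2: 1.0 for a perfect circle, lower otherwise."""
    return 4.0 * math.pi * area / perimeter ** 2

def sphere_volume_from_area(area):
    """Volume estimate assuming the projected area is the cross-section of a sphere."""
    radius = math.sqrt(area / math.pi)
    return 4.0 / 3.0 * math.pi * radius ** 3

# Example: a perfectly circular projection with a radius of 50 pixels.
radius = 50.0
area = math.pi * radius ** 2
perimeter = 2.0 * math.pi * radius

print(f"circularity = {circularity(area, perimeter):.3f}")
print(f"estimated volume = {sphere_volume_from_area(area):.0f} cubic pixels")
```

Both values are in pixel units; converting to physical units would need the pixel size of the imaging system.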

    1. Background: Genotyping-by-Sequencing (GBS) provides affordable methods for genotyping hundreds of individuals using millions of markers. However, this challenges bioinformatic procedures that must overcome possible artifacts such as the bias generated by PCR duplicates and sequencing errors. Genotyping errors lead to data that deviate from what is expected from regular meiosis. This, in turn, leads to difficulties in grouping and ordering markers resulting in inflated and incorrect linkage maps. Therefore, genotyping errors can be easily detected by linkage map quality evaluations. Results: We developed and used the Reads2Map workflow to build linkage maps with simulated and empirical GBS data of diploid outcrossing populations. The workflows run GATK, Stacks, TASSEL, and Freebayes for SNP calling and updog, polyRAD, and SuperMASSA for genotype calling, and OneMap and GUSMap to build linkage maps. Using simulated data, we observed which genotype call software fails in identifying common errors in GBS sequencing data and proposed specific filters to better handle them. We tested whether it is possible to overcome errors in a linkage map using genotype probabilities from each software or global error rates to estimate genetic distances with an updated version of OneMap. We also evaluated the impact of segregation distortion, contaminant samples, and haplotype-based multiallelic markers in the final linkage maps. Through our evaluations, we observed that some of the approaches produce different results depending on the dataset (dataset-dependent) and others produce consistent advantageous results among them (dataset-independent). Conclusions: We set as default in the Reads2Map workflows the approaches that proved to be dataset-independent for GBS datasets according to our results. This reduces the number of tests required to identify optimal pipelines and parameters for other empirical datasets. Using Reads2Map, users can select the pipeline and parameters that best fit their data context. The Reads2MapApp shiny app provides a graphical representation of the results to facilitate their interpretation.

      This work has been peer reviewed in GigaScience (see https://doi.org/10.1093/gigascience/giad092), which carries out open, named peer-review. This review is published under a CC-BY 4.0 license:

      **Reviewer Name: Zhenbin Hu **

      In this MS, the authors tried to develop a framework for using GBS data for downstream analysis and to reduce the impact of sequence errors caused by GBS. However, sequence error is an issue not specific to GBS; it also affects whole-genome sequences. Actually, I think the major issue for GBS is missing data. However, in this MS, the authors did not test the impact of missing data on downstream analysis.

      The authors also mentioned that sequencing errors may cause segregation distortion in linkage map construction; however, segregation distortion can also occur with correct genotyping data. It can be caused by individual selection during the construction of the population. So I don't think it is correct to use segregation distortion to correct sequence errors.

      The authors need to clarify the main question of this MS: in the abstract, the authors highlight the sequence errors, while in the introduction, the authors highlight the package for linkage map construction (the last paragraph). Actually, from the MS, the authors were assembling a framework for genotyping-by-sequencing data.

      The two major reduced-representation sequencing approaches, GBS and RADseq, have specific tools for genotype calling, such as TASSEL and Stacks. However, the authors used the GATK and Freebayes pipelines for variant calling; the authors need to present the reason they were not using TASSEL and Stacks.

      In genotyping-by-sequencing data, individuals are barcoded and mixed during sequencing. What package/code was used to split the individuals (demultiplex) from the fastq for the GATK and Freebayes pipelines?

      The maximum missing data allowed was 25% for marker data; what about the individual missing rate?

      On page 6, the authors mentioned 'sequence size of 350'; what does that mean?

    1. This snippet removes some of the empty a elements to make the headings anchors instead:

      ```javascript
      // Select each empty named anchor that sits directly before or after a heading.
      [...document.querySelectorAll("a[name] +h1, a[name] +h2, a[name] +h3, a[name] +h4, h1 +a[name], h2 +a[name], h3 +a[name], h4 +a[name]")].forEach((x) => {
        let link, heading;
        if (x instanceof HTMLHeadingElement) {
          link = x.previousElementSibling;
          heading = x;
        } else {
          link = x;
          heading = x.previousElementSibling;
        }
        // Remove the anchor and move its name onto the heading as an id.
        link.parentElement.removeChild(link);
        heading.setAttribute("id", link.name);
      });
      ```

    2. The HTML encoding of this document contains several errors, some of which substantially affect the way it's read. This fixes one of those problems in Appendix II:

      ```javascript
      [...document.querySelectorAll("op")].reverse().forEach((op) => {
        // Replace each bogus <op> element with the literal text "<OP>" followed by its children.
        const f = document.createDocumentFragment();
        f.append(document.createTextNode("<OP>"), ...op.childNodes);
        op.parentElement.replaceChild(f, op);
      });
      ```

      The problem should be apparent on what is, at the time of this writing, line 4437:

      ```html
      <code>IF ?w THEN ?x<OP>?y ELSE ?z<OP>?y</code>
      ```

      (The angle brackets around the occurrences of "OP" should be encoded as HTML entities. Because they aren't they end up getting parsed as HTML op elements (which isn't a thing) and screwing up the document tree.)

    1. I keep repeating this in the hopes that it sticks, because too much OO code is written like Java, and too many programmers believe that OO is defined by Java.

      This reads like a total non-sequitur at this point in the post.

    1. Three distinct rRNA sequences exist in the operon, encoding for the two subunits of the ribosome; the 5S and 23S sequences code for the large subunit, while the 16S sequence encodes for the small subunit.

      rRNA operon

    1. Shockingly, you get immediate access to all of it. Access to their gigantic monorepo with billions of lines of code covering almost all their products. Live statuses of their globe-covering data centers. Strategy documents spanning two decades of history. And direct access to legends.

      Biggest reason to work in big tech ---- to learn.

    1. Author Response

      Reviewer #1

      While the article clearly outlines the strengths of the chosen approach, it lacks an equally clear exposition of its limitations and a more thorough comparison to established approaches. Two examples of limitations that should be stated more clearly, in my opinion: models need to be small enough to fit on a single machine (in contrast to e.g. NEURON and NEST which support distributed computation via MPI), and only single-compartment models are supported; both limitations are mentioned in passing in the discussion, but would merit a more upfront mention.

      We agree that our paper could be improved by more clearly stating the limitations of our approach and comparing it to established approaches. We have revised the paper and added two new subsections in the Discussion section to address these specific concerns:

      1. The Limitations subsection (L448 - L491) acknowledges the restrictions of the BrainPy paradigm, which uses Python-based object-oriented programming. It describes three main categories of limitations: (a) approach limitations, (b) functionality limitations, (c) parallelization limitations. These are areas where BrainPy may require further development to improve its functionality, performance, and compatibility with different modeling approaches.

      2. The Future Work subsection (L493 - L526) outlines development enhancements we aim to pursue in the near term. It emphasizes the need for further development in order to meet the demands of the field. Three key areas requiring attention are highlighted: (a) multi-compartment neuron models, (b) ultra-large-scale brain simulations, (c) bridging with acceleration computing systems.

      In addition to these changes, we have also made a number of other minor changes to the paper to improve its clarity and readability.

      The study does not verify the accuracy of the presented framework. While its basic approach (time-step-based simulation, standard numerical integration algorithms) is sufficiently similar to other software to not expect major discrepancies, an explicit comparison would remove any doubt. Quantitative measures of accuracy are particularly important in the context of benchmarks (see below), since simulations can be made arbitrarily fast by sacrificing accuracy.

      We agree that an explicit comparison would help alleviate any doubts and provide a more comprehensive understanding of our framework's accuracy. We have revised our manuscript to include a dedicated section, particularly Appendix 11. In this section, we verified that all simulators generated consistent average firing rates for the given benchmark network models (figure 1 and figure 2 in Appendix 11). These verifications were performed under different network sizes (ranging from 4e3 to 4e5) and different computing platforms (CPU, GPU and TPU). We also qualitatively compared the overall network activity patterns produced by each simulator to ensure they exhibited the same dynamics (figure 3 and figure 4 in Appendix 11). While exact spike-to-spike reproducibility was not guaranteed between different simulator implementations, we confirmed that our simulator produced activity consistent with the reference simulators for both firing rates and network-level dynamics. Additionally, BrainPy did not sacrifice simulation accuracy for speed performance. Despite using single precision floating point, BrainPy was able to produce consistent firing rates and raster diagrams across all simulations (see figure 3 and figure 4 in Appendix 11).

      We hope these revisions can ensure that our manuscript provides a clear and robust validation of the accuracy of our simulator.

      Benchmarking against other software is obviously important, but also full of potential pitfalls. The current article does not state clearly whether the results are strictly comparable. In particular: are the benchmarks on the different simulators calculating results to the same accuracy (use of single or double precision, same integration algorithm, etc.)? Does each simulator use the fastest possible execution mode (e.g. number of threads/processes for NEST, C++ standalone mode in Brian2, etc.)? What is exactly measured (compilation time, network generation time, simulation execution time, ...) - these components will scale differently with network size and simulation duration, so summing them up makes the results difficult to interpret. Details are also missing for the comparison between the XLA operator customization in C++ vs. Python: was the C++ variant written by the authors or by someone else? Does the NUMBA→XLA mechanism also support GPUs/TPUs? This comparison also seems to be missing from the GitHub repository provided for reproducing the paper results.

      We have carefully considered these comments and addressed each of these concerns regarding the benchmarks and examples presented in our paper.

      1. Lack of Details in Examples: In the revised version of the paper, we provide additional information and any other pertinent details to enhance the clarity and replicability of our results. Particularly, in Appendix 9, we provide the mathematical description, the number of neurons, the connection density, and delay times used in our multi-scale spiking network; in Appendix 10, we provide the detail description of reservoir models, evaluation metrics, training algorithms, and their implementations in BrainPy; in Appendix 11, we elaborate the hardware and software specifications and experimental details for benchmark comparisons.

      2. Inadequate Description of Benchmarking Procedures: In the revised paper, in the section "Efficient performance of BrainPy" of the main text (L328-L329) and in Appendix 11, we elaborate on the integration methods, simulation time steps, and floating-point precision used in our experiments. We also ensure that these parameters are clearly stated and identical across all simulators involved in the benchmarking process; see "Accuracy evaluations" in Appendix 11 (L1550 - L1580).

      3. Clarification on Measured Time: In the revised paper, we state that all simulations only measured the model execution time, excluding model construction time, synapse creation time, and compilation time, see "Performance measurements" in Appendix 11 (L1539 - L1548).

      4. Consideration of Acceleration Modes: In the revised version, we report the simulation speed of the other brain simulators in different acceleration modes (see Figure 8). For instance, we use the fastest available option, the C++ standalone mode in Brian2, for speed evaluations. Furthermore, we have asked the developers of the comparison simulators to optimize the benchmark models, to ensure a fair and accurate comparison.

      5. Scaling Networks to Maintain Activity: In the revised manuscript, we adopt the suggestion of Reviewer #3 and apply the appropriate scaling techniques to maintain consistent network activity throughout our experiments. These details can be found in Appendix 11 (also see Appendix 11—figure 1 and Appendix 11—figure 2).
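      As an illustration of point 3 only, the following minimal sketch shows one way to time model construction, the first (potentially compiling) call, and steady-state execution separately. It is not the benchmark code used in the paper: `build_toy_network`, `run_toy_network`, and `time_phases` are hypothetical names, and the toy model is a plain-Python stand-in for a real simulator.

```python
import time


def build_toy_network(n_neurons=2000):
    """Hypothetical stand-in for model construction (not counted as execution)."""
    return [0.0] * n_neurons


def run_toy_network(state, n_steps=50):
    """Hypothetical stand-in for the simulation loop: one leaky update per step."""
    for _ in range(n_steps):
        state = [0.9 * v + 0.1 for v in state]
    return state


def time_phases(build, run, n_repeats=3):
    """Report construction, first-call, and steady-state execution times separately."""
    t0 = time.perf_counter()
    model = build()
    build_time = time.perf_counter() - t0

    # The first call is timed on its own: with a JIT compiler it would
    # include tracing and compilation, which the benchmark excludes.
    t0 = time.perf_counter()
    run(model)
    first_call_time = time.perf_counter() - t0

    # Steady-state execution: repeat and keep the best run to reduce noise.
    execution_times = []
    for _ in range(n_repeats):
        t0 = time.perf_counter()
        run(model)
        execution_times.append(time.perf_counter() - t0)

    return {
        "build": build_time,
        "first_call": first_call_time,
        "execution": min(execution_times),
    }


timings = time_phases(build_toy_network, run_toy_network)
print({name: f"{seconds:.6f} s" for name, seconds in timings.items()})
```

      Only the "execution" entry corresponds to the quantity described in point 3. With an asynchronous backend such as JAX, the timed call would additionally need to wait for the result (for example via `block_until_ready()`) before the timer is stopped.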

      Regarding the comparison between XLA operator customization in C++ and Python, we used our own C++ implementation, which is available in Appendix 8, Listing 2. At present, the NUMBA→XLA mechanism does not support GPUs/TPUs; however, we are working on extending this capability to other platforms. We have made this clarification in the revised manuscript as well (see L1278 - L1285).

      While the authors convincingly argue for the merits of their Python-based/object-oriented approach, in my opinion, they do not fully acknowledge the advantages of domain-specific languages (NMODL, NestML, equation syntax of ANNarchy and Brian2, ...). In particular, such languages aim at a strong decoupling of the mathematical model description from its implementation and other parts of the model. In contrast, models described with BrainPy's approach often need to refer to such details, e.g. be aware of differences between dense and sparse connectivity schemes, online, or batch mode, etc. It might also be worth mentioning descriptive approaches to synaptic connectivity as supported by other simulators (connection syntax in Brian2, Connection Set Algebra for NEST).

      We have made revisions to better acknowledge the merits of DSLs while providing a more comprehensive comparison. These revisions are incorporated in Discussion (L452 - L466) and Appendix 1 (L778 - L788).

      Reviewer #2

      While the results presented are impressive, publishing further details of the benchmarks in an appendix would be helpful for evaluating the claims and the overall conclusion would be more convincing if the performance benefits were demonstrated on a wider selection of test cases. Unsatisfyingly, the authors gave up on making a direct comparison to Brian running on GPUs with GeNN which would have been a fairer comparison than CPU-based simulations. The code for the chosen benchmarks is also likely to be highly optimised by the authors for running on BrainPy but less so for the other platforms - a fairer test would be to invite the authors of the other simulators to optimise the same models and re-evaluate the benchmarks.

      We have carefully considered these comments and addressed each of these concerns regarding the benchmarks and examples presented in our paper.

      1. Lack of Details in Examples: In the revised version of the paper, we provide additional information to enhance the clarity and replicability of our results. In particular, in Appendix 9, we provide the mathematical description, the number of neurons, the connection density, and the delay times used in our multi-scale spiking network; in Appendix 10, we provide a detailed description of the reservoir models, evaluation metrics, training algorithms, and their implementations in BrainPy; in Appendix 11, we elaborate on the hardware and software specifications and the experimental details of the benchmark comparisons.

      2. Inadequate Description of Benchmarking Procedures: In the revised paper, in the section "Efficient performance of BrainPy" of the main text (L328-L329) and in Appendix 11, we elaborate on the integration methods, simulation time steps, and floating-point precision used in our experiments. We also ensure that these parameters are clearly stated and identical across all simulators involved in the benchmarking process; see "Accuracy evaluations" in Appendix 11 (L1550 - L1580).

      3. Clarification on Measured Time: In the revised paper, we state that all simulations only measured the model execution time, excluding model construction time, synapse creation time, and compilation time, see "Performance measurements" in Appendix 11 (L1539 - L1548).

      4. Consideration of Acceleration Modes: In the revised version, we report the simulation speed of the other brain simulators in different acceleration modes (see Figure 8). For instance, we use the fastest available option, the C++ standalone mode in Brian2, for speed evaluations. Furthermore, we have asked the developers of the comparison simulators to optimize the benchmark models, to ensure a fair and accurate comparison.

      5. Scaling Networks to Maintain Activity: In the revised manuscript, we adopt the suggestion of Reviewer #3 and apply the appropriate scaling techniques to maintain consistent network activity throughout our experiments. These details can be found in Appendix 11 (also see Appendix 11—figure 1 and Appendix 11—figure 2).

      Regarding the wider selection of test cases, we understand the importance of demonstrating the performance benefits on a broader range of scenarios. Particularly, we have designed two kinds of benchmark models:

      • Sparse connection models. This category includes the COBA-LIF network and the COBA-HH network. The former is a standard E/I balanced network for comparing the simulation speed of brain simulators, while the latter uses the more complex and computationally expensive HH neuron model as its elements. Both models can effectively demonstrate the capability of a brain simulator for sparse and event-driven computation.

      • Dense connection models. The local circuits of a cortical column are usually densely connected (Science 366, 1093). In particular, we use the decision-making network proposed by Wang (2002) for evaluations.
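      To make the distinction between the two categories concrete, the sketch below contrasts a dense synaptic update with a sparse, event-driven one on a toy network. It is written in plain Python for illustration and is not BrainPy's operator implementation; the network sizes, the 5% spike probability, and all function names are assumptions chosen for the example.

```python
import math
import random


def dense_input(weights, spikes):
    """Dense update: each postsynaptic neuron sums over all presynaptic neurons."""
    n_pre, n_post = len(weights), len(weights[0])
    return [
        sum(weights[i][j] * spikes[i] for i in range(n_pre))
        for j in range(n_post)
    ]


def event_driven_input(targets, values, spikes, n_post):
    """Event-driven update: visit only the synapses of neurons that spiked."""
    current = [0.0] * n_post
    for i, spiked in enumerate(spikes):
        if not spiked:
            continue
        for j, w in zip(targets[i], values[i]):
            current[j] += w
    return current


rng = random.Random(0)
n_pre, n_post, n_out = 200, 100, 5  # toy sizes, chosen for illustration only

# Sparse storage: for each presynaptic neuron, its target indices and weights.
targets = [rng.sample(range(n_post), n_out) for _ in range(n_pre)]
values = [[rng.random() for _ in range(n_out)] for _ in range(n_pre)]

# Equivalent dense weight matrix, mostly zeros.
weights = [[0.0] * n_post for _ in range(n_pre)]
for i in range(n_pre):
    for j, w in zip(targets[i], values[i]):
        weights[i][j] = w

# Roughly 5% of presynaptic neurons spike in this time step.
spikes = [1 if rng.random() < 0.05 else 0 for _ in range(n_pre)]

dense = dense_input(weights, spikes)
sparse = event_driven_input(targets, values, spikes, n_post)
same = all(math.isclose(a, b, abs_tol=1e-12) for a, b in zip(dense, sparse))
print("results match:", same)
```

      Both functions compute the same postsynaptic input, but the event-driven version touches only the synapses of neurons that fired, which is why sparse, low-activity networks are a natural test of event-driven operators, whereas densely connected circuits are better served by matrix operations.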

      In the revised version, we include extensive experiments on these three test cases under different kinds of computing platforms (including CPU, GPU, and TPU) to strengthen the overall conclusion and provide a more comprehensive evaluation of our approach.

      Regarding the comparison to Brian running on GPUs with GeNN, we apologize for not including it in our initial submission. We have conducted the necessary experiments on all three benchmark models used in our evaluations and include these results in the revised version of the paper (see Figure 8). This addition enhances the credibility of our findings and allows for a more meaningful comparison between different simulation platforms. Furthermore, we have reached out to the authors of the other simulators and invited them to optimize the same models used in our benchmarks. We believe this collaborative approach will ensure a more equitable evaluation of the simulators and provide a more robust and convincing analysis of our work.

      Furthermore, the manuscript reads like an advertisement for the platform with very little discussion of its limitations, weaknesses, or directions for further improvement. A more frank and balanced perspective would strengthen the manuscript and give the reader greater confidence in the platform.

      We agree that our paper could be improved by more clearly stating the limitations of our approach and comparing it to established approaches. We have revised the paper and added two new subsections in the Discussion section to address these specific concerns:

      1. The Limitations subsection (L448 - L491) acknowledges the restrictions of the BrainPy paradigm, which uses Python-based object-oriented programming. It identifies three main categories of limitations: (a) approach limitations, (b) functionality limitations, and (c) parallelization limitations. These limitations indicate areas where BrainPy may require further development to improve its functionality, performance, and compatibility with different modeling approaches.

      2. The Future Work subsection (L493 - L526) outlines development enhancements we aim to pursue in the near term. It emphasizes the need for further development in order to meet the demands of the field. Three key areas requiring attention are highlighted: (a) multi-compartment neuron models, (b) ultra-large-scale brain simulations, (c) bridging with acceleration computing systems. In addition to these changes, we have also made a number of other minor changes to the paper to improve its clarity and readability.

      Since simulators wax and wane in popularity, it would be reassuring to see a roadmap for development with a proposed release cadence and a sustainable governance policy for the project. This would serve to both clearly indicate the areas of active development where community contributions would be most valuable and also to reassure potential users that the project is unlikely to be abandoned in the near future, ensuring that their time investment in learning to use the framework will not be wasted.

      We appreciate the reviewer raising the point about demonstrating the project's sustainability. In response to this feedback, we have made the following efforts.

      Firstly, we have added, and will maintain, a "Development roadmap" section on the BrainPy GitHub homepage (https://github.com/brainpy/BrainPy). This will give the community a clear understanding of the project's direction and the areas of active development. Additionally, the "Future work" section in our revised paper outlines a comprehensive roadmap for the next stages of BrainPy development.

      Secondly, to address the concern about the sustainability of our project and the potential risk of abandonment, we have added an ACKNOWLEDGMENTS.md file to the GitHub repository (https://github.com/brainpy/BrainPy/blob/master/ACKNOWLEDGMENTS.md) that outlines our sustainable funding support. This support demonstrates our commitment to the long-term maintenance and development of the project, and may help to dispel users' doubts about project abandonment.

      Similarly, a complex set of dependencies, which need to be modified for BrainPy, will likely make the project hard to maintain and so a similar plan to those given for the CI pipeline and documentation generation for automation of these modifications would be a good addition. It is also important to periodically reflect on whether it still makes sense to combine all the disparate tools into one framework as the codebase grows and starts to strain under modifications required to maintain its unification.

      We appreciate the reviewer's valuable suggestions on the BrainPy framework.

      First, BrainPy is a self-contained package designed specifically for brain dynamics programming. It boasts minimal dependencies, relying only on fundamental packages within the Python scientific computing ecosystem. In essence, BrainPy relies on numpy for array-based computations and utilizes jax and jaxlib for JIT compilation. While we currently utilize numba to customize dedicated operators, we can also remove this dependency by rewriting these operators with C++ code. We incorporate the use of brainpylib, a package developed by ourselves, which provides dedicated operators for CPUs and GPUs in the context of brain dynamics modeling. Additionally, BrainPy leverages mature solutions within the field for certain auxiliary functions. For instance, we integrate the use of tqdm to facilitate the display of a progress bar during model execution, and employ matplotlib for visualization purposes, capitalizing on its well-established capabilities in the scientific community.

      Second, we agree that there is a risk of overly complex dependencies and architectural strains. To mitigate this risk, we have taken the following measures:

      • We prioritize good software engineering practices such as loose coupling, high cohesion, and modularity in the framework design. This isolates dependencies and changes to specific components. For example, the brainpy.visualize module defines abstract visualization functions whose backend can be changed at any time.

      • We invest in automating aspects of the build, test, and release process to relieve manual maintenance burdens. We make heavy use of GitHub Actions for testing BrainPy code and building the documentation.

      • We document dependencies clearly and maintain backwards compatibility when possible. For each new API, we will clearly state the BrainPy version from which it is supported, and deprecated APIs will be phased out over multiple release cycles.

      • We continuously monitor code complexity metrics and refactor/simplify the architecture when needed.

      • When new tools have significantly different requirements, we will consider spinning them off into separate projects rather than forcing them into the core framework.
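      The staged deprecation policy mentioned above can be sketched with Python's standard `warnings` module. This is a generic illustration, not BrainPy code: the decorator `deprecated`, the functions `plot_raster` and `raster_plot`, and the version strings are all hypothetical.

```python
import functools
import warnings


def deprecated(since, removed_in, alternative):
    """Mark a function as deprecated while keeping it usable for several releases."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated since version {since} and "
                f"will be removed in version {removed_in}; "
                f"use {alternative} instead.",
                DeprecationWarning,
                stacklevel=2,  # point the warning at the caller, not the wrapper
            )
            return func(*args, **kwargs)
        return wrapper
    return decorator


def plot_raster(spike_times):
    """Hypothetical new API: here it simply returns the sorted spike times."""
    return sorted(spike_times)


@deprecated(since="A.B", removed_in="A.D", alternative="plot_raster")
def raster_plot(spike_times):
    """Hypothetical old API, kept as a thin wrapper during the transition."""
    return plot_raster(spike_times)


# Calling the old name still works but emits a DeprecationWarning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = raster_plot([3, 1, 2])

print(result, [str(w.message) for w in caught])
```

      Keeping the old name as a thin wrapper for a few releases lets existing user code continue to run while signalling clearly which replacement to adopt.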

      Finally, a live demonstration would be a very useful addition to the project. For example, a Jupyter notebook hosted on mybinder.org or similar, and a fully configured Docker image, would each enable potential users to quickly experiment with BrainPy without having to install a stack of dependencies and troubleshoot version conflicts with their pre-existing setup. This would greatly lower the barrier to adoption and help to convince a larger base of modellers of the potential merits of BrainPy, which could be major, both in terms of the computational speed-up and ease of development for a wide range of modelling paradigms.

      We appreciate the reviewer's valuable feedback and suggestion. We have hosted a Jupyter notebook and a fully configured Docker image on mybinder.org (https://mybinder.org/v2/gh/brainpy/BrainPy-binder/main). Users can easily experiment with BrainPy without the need to install multiple dependencies or troubleshoot version conflicts.

      Reviewer #3

      One potential issue is that the scope of the neuro-simulator is not very clearly explained and the target audience is not well defined: is BrainPy primarily intended for computational neuroscientists or for neuro-AI practitioners? The simulator offers very detailed neural models (HH, fractional order models), classical point-models (LIF, AdEx), rate-coded models (reservoirs), but also deep learning layers (Conv, MaxPool, BatchNorm, LSTM). Is there an advantage to using BrainPy rather than PyTorch for purely deep networks? Is it possible to build hybrid models combining rate-coded reservoirs or convnets with a network of HH neurons? Without such a hybrid approach, it is unclear why the deep learning layers are needed.

      We appreciate the reviewer's concern regarding the scope of BrainPy and the need for clarification regarding the target audience.

      BrainPy is designed to cater to both computational neuroscientists and neuro-AI practitioners by integrating detailed neural models, classical point models, rate-coded models, and deep learning models. The platform aims to provide a general-purpose programming framework for modeling brain dynamics, allowing users to explore the dynamics of brain or brain-inspired models that combine insights from biology and machine learning.

      In particular, brain dynamics models (provided in the brainpy.dyn module) and deep learning models (provided in the brainpy.dnn module) are closely integrated with each other in BrainPy. First, to build brain dynamics models, users should use the building blocks in the brainpy.dnn module to create synaptic projections.

      Second, to build brain-inspired computing models for machine learning, users can also take advantage of the neuronal and synaptic dynamics provided in the brainpy.dyn module.

      To that end, BrainPy provides building blocks of detailed conductance-based models like Hodgkin-Huxley, as well as common deep learning layers like convolutions.

      Regarding the advantage of using BrainPy over PyTorch for purely deep networks, we acknowledge that existing deep learning libraries, such as Flax in the JAX ecosystem, provide extensive tools and examples for constructing traditional deep neural networks. While BrainPy does implement standard deep learning layers, our primary focus is not to compete directly with those libraries. Instead, we provide these layers so that they integrate seamlessly with BrainPy's core modeling abstractions, including variables and dynamical systems. This integration allows researchers to incorporate common deep learning layers into their brain models. Additionally, the deep learning layers in BrainPy serve as examples for customization and facilitate the development of tailored layers for neuroscience research. Researchers can modify or extend the implementations to suit their specific needs.

      In summary, BrainPy's scope is general-purpose brain dynamics programming. The target audience includes computational neuroscientists who want to incorporate insights from machine learning, as well as ML researchers interested in integrating brain-like components.

      In terms of plasticity, only external training procedures are implemented (backpropagation, FORCE, surrogate gradients). No local plasticity mechanism (Hebbian learning for rate-coded networks, STDP and its variants for spiking networks) seems to be implemented, apart from STP. Is it a planned feature? Appendix 8 refers to bp.synplast.STDP(), but it is not present in the current code (https://github.com/brainpy/BrainPy/tree/master/brainpy/_src/dyn/synplast). Spiking networks without STDP are not going to be very useful to computational neuroscientists, so this suggests that the simulator targets primarily neuro-AI, i.e. AI researchers interested in using spiking models in a machine learning approach.

      We appreciate the reviewer raising the limitations of BrainPy in terms of local plasticity mechanisms, and we apologize for the delay in implementing STDP models in BrainPy. We now provide a general implementation of STDP that is compatible with any synaptic model (such as Exponential, Dual Exponential, AMPA, GABA, and NMDA dynamics) and with common connection patterns (such as Dense and Sparse connection patterns).

      bp.dyn.STDP_Song2001(pre, post, delay, syn, comm, out)

      It can also easily be combined with short-term plasticity models. The modular design of BrainPy's framework also makes plasticity components straightforward to implement and integrate into existing models.
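      For readers unfamiliar with the rule itself, the following plain-Python sketch illustrates a generic pair-based STDP update with exponentially decaying pre- and postsynaptic traces. It is a textbook-style illustration, not the implementation behind `bp.dyn.STDP_Song2001`; the function name `stdp_run` and all parameter values are assumptions made for the example.

```python
import math


def stdp_run(pre_spikes, post_spikes, w=0.5, dt=1.0,
             tau_pre=20.0, tau_post=20.0,
             a_plus=0.01, a_minus=0.012, w_min=0.0, w_max=1.0):
    """Apply pair-based STDP to one synapse over two aligned spike trains."""
    x = 0.0  # presynaptic trace: memory of recent pre spikes
    y = 0.0  # postsynaptic trace: memory of recent post spikes
    decay_pre = math.exp(-dt / tau_pre)
    decay_post = math.exp(-dt / tau_post)

    for pre, post in zip(pre_spikes, post_spikes):
        # Traces decay exponentially between spikes.
        x *= decay_pre
        y *= decay_post

        # Weight updates use the traces left by earlier spikes.
        if pre:
            w -= a_minus * y  # post fired before pre: depression
        if post:
            w += a_plus * x   # pre fired before post: potentiation

        # Spikes in this step are added to the traces afterwards, so
        # simultaneous pre/post spikes do not change the weight.
        if pre:
            x += 1.0
        if post:
            y += 1.0

        w = min(max(w, w_min), w_max)  # keep the weight within bounds
    return w


# Pre spike at step 0 followed by a post spike at step 5, and the reverse order.
causal = stdp_run([1, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 1])
anti_causal = stdp_run([0, 0, 0, 0, 0, 1], [1, 0, 0, 0, 0, 0])
print(causal, anti_causal)
```

      In this sketch, a presynaptic spike that precedes a postsynaptic spike strengthens the synapse, and the reverse order weakens it, with the magnitude decaying exponentially in the spike-time difference.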

      A second weakness of the paper concerns the demos and benchmarks used to demonstrate the versatility and performance of BrainPy, which are not sufficiently described. In Fig. 4, it is for example not explained how the reservoirs are trained (only the readout weights, or also the recurrent ones? Using BPTT only makes sense when the recurrent weights are also trained.), nor how many neurons they have, what the final performance is, etc. The comparison with NEURON, NEST, and Brian2 is hard to trust without detailed explanations. Why are different numbers of neurons used for COBA and COBAHH? How long is the simulation in each setting? Which time is measured: the total time including compilation and network creation, or just the simulation time? Are the same numerical methods used for all simulators? It would also be interesting to discuss why the only result involving TPUs (Fig 8c) shows that it is worse than the V100 GPU. What could be the reason? Are there biologically-realistic networks that would benefit from a TPU? As the support for TPUs is a major selling point of BrainPy, it would be important to investigate its usage further.

      We thank the reviewer for raising the important question about the demos and benchmarks used to demonstrate the versatility and performance of BrainPy. To address these concerns, we have added more details in the revised paper, including:

      • For Fig. 4, we explain in Appendix 10 how the reservoirs are trained: only the readout weights are trained, using backpropagation, FORCE learning, and ridge regression, respectively. We also specify the number of neurons in each reservoir (see L1397) and the final performance of the reservoirs on the task (see Figure 4).

      • To enable readers to better interpret the simulator comparisons in Fig. 8, we have also added more detailed explanations of the comparison with NEURON, NEST, and Brian2 in Appendix 11.

      • In the revised paper, we provide a comprehensive analysis of BrainPy's compatibility with different hardware platforms, including TPUs, and identify the specific conditions under which TPUs may offer advantages (see Figure 8 and Appendix 11—figure 7). We have also discussed the potential benefits of TPUs for biologically realistic networks (see L514 - L521). In particular, for biological networks with arbitrary sparsity, TPUs do not show an advantage over GPUs (see Appendix 11—figure 7). TPUs are best at exploiting certain kinds of structured sparsity, for example block sparsity.

    1. Author Response

      Reviewer #1 (Public Review):

      Due to complicated and often unpredictable idiosyncratic differences, comparing fMRI topography between subjects typically would require extra expensive scan time and extra laborious analysis steps, using specific functional localizer scan runs that contrast the fMRI responses of every subject to different stimulus categories. To overcome this challenge, hyperaligning tools have recently been developed (e.g., Guntupalli et al., 2016; Haxby et al., 2011) based on aligning, in a high-dimensional space of voxels, subjects' fMRI responses to watching a given movie. In the present study, Jiahui and colleagues propose a significantly improved version of hyperaligning functional brain topography between individuals. This new version, based on fMRI connectivity, works robustly on datasets when subjects watched different movies and were scanned with different parameters/scanners at different MRI centers.

      Robustness is the major strength of this study. Despite the fact that datasets from different subjects watching different movies at different MRI centers with different scan parameters were used, the results of functional brain topography from between-subject hyperalignment based on fMRI connectivity were comparable to the golden standard of within-subject functional localizations, and significantly better than regular surface anatomical alignments. These results also support the claim that the present approach is a useful improvement from previous hyperalignments based on time-locked fMRI voxel responses, which would require normative samples of subjects watching a same movie.

      We thank the reviewer for the appreciation of our work.

      Given the robustness, this new version of hyperalignment would provide much stronger statistical power for group-level comparisons with less cost in time and effort to collect and analyze data from large samples according to the current stringent standard, likely being useful to the whole research community of functional neuroimaging. That said, more discussion of the limits of the present hyperalignment approach would be helpful to potential eLife readers. For example, to what extent would the present hyperalignment approach be applicable to individuals with atypical functional brain topography, such as brain lesion patients with, e.g., acquired prosopagnosia? Even in typical populations, while bilateral fusiform face areas can be identified in the majority through functional localizer scans, the left fusiform face area sometimes cannot be found. Moreover, many top-down factors are known to modulate functional brain topography. Due to these factors, brain responses and functional connectivity may be different even when the same subject watched the same movie twice (e.g., Cui et al., 2021).

      We thank the reviewer for the suggestion and agree that it would be fascinating if the predictions can be made with high fidelity in neuropsychological populations. Although we are optimistic that our algorithm is able to generalize across diverse populations, to date, no previous literature has provided empirical evidence to illustrate the effectiveness, including optimizations and special applications beyond typical brains. Besides the neuropsychological population, it would also be valuable to study the generalization across a broad age range, for example, from infants to the elderly. The brain changes across age both anatomically and functionally, so it is a challenge to predict functional topographies based on a normative group that only includes young participants. With all these potential applications in mind, future research is needed to illustrate the efficacy, build the pipeline, and construct the representative normative groups to meet the requirements of accurate individualized predictions in diverse populations.

      In typical populations, participants show great individual variability in their functional topographies; for instance, some participants have distinguishable patches of activation in their left ventral temporal cortex while others don’t. Our algorithm successfully captured these individual differences in its predictions. The figure below shows, as an example, two individuals whose face-selective topographies differ markedly in the left ventral temporal cortex. The left participant has prominent face-selective areas in the left ventral temporal cortex that are similar in size to those on the right side, while the right participant has only a few scattered small face-selective spots on the left side. Regardless of what their face-selective areas look like, our algorithm accurately recovered the individualized locations, shapes, and sizes, retaining the individual variability in the functional topographies.

      Functional connectivity profiles based on naturalistic stimuli are very stable across the cortex, even when participants watch different movies. In Figure 4-figure supplement 9, the mean correlations of the fine-scale connectomes for most searchlights (r = 15mm) between The Grand Budapest Hotel and Raiders of the Lost Ark were generally around 0.8. The mean correlations were about 0.9 between the first and second halves of the same movie, although the stimulus content differed between the two halves. Thus, the fine-grained functional connectivity profiles remain highly stable and reliable across movie content, which contributes to the robustness of our algorithm's predictions across movies, time, and other parameters (e.g., scanner models, scanning parameters).

      We added a paragraph to the discussion section to address these concerns (pages 18-19):

      “This study successfully illustrated that accurate individualized predictions are both robust and applicable across a variety of conditions, including movie types, languages, scanning parameters, and scanner models. Importantly, the intricate connectivity profiles remain consistent even when participants view entirely different movies, as evidenced by Figure 4-figure supplement 9, reinforcing the prediction's stability in various scenarios. However, all four datasets in this study only included typical participants with anatomically intact brains. An unanswered question is whether individualized topographies of neuropsychological populations with atypical cortical function (e.g., developmental prosopagnosics) or with lesioned brains (e.g., acquired prosopagnosics) could also be accurately predicted using the hyperalignment-based methods. Up to now, as far as we know, no previous literature has investigated this question. Beyond neuropsychological groups, it is also valuable to investigate how well the predictions will be across a wide range of age, from infants to the elderly. Future research is essential to adapt our algorithms to diverse populations.”

      Reviewer #2 (Public Review):

      Guo and her colleagues develop a new approach to map the category-selective functional topographies in individual participants based on their movie-viewing fMRI data and functional localizer data from a normative sample. Connectivity hyperalignment is used to derive the transformation matrices between the participants according to their functional connectomes during movie watching. The transformation matrices are then used to project the localizer data from the normative sample into the new participant and create the idiosyncratic cortical topography for the participant. The authors demonstrate that a target participant's individualized category-selective topography can be accurately estimated using connectivity hyperalignment, regardless of whether different movies are used to calculate the connectome and regardless of other data collection parameters. The new approach allows researchers to estimate a broad range of functional topographies based on naturalistic movies and a normative database, making it possible to integrate datasets from laboratories worldwide to map functional areas for individuals. The topic is of broad interest for the neuroimaging community; the rationale of the study is straightforward and the experiments were well designed; the results are comprehensive. I have some concerns that the authors may want to address, particularly on the details of the pipeline used to map individual category-selective functional topographies.

      We thank the reviewer for the encouragement.

      1) How does the scan length of the movie-viewing fMRI affect the accuracy in predicting the idiosyncratic cortical topography? Similarly, how does the number of participants in the normative database affect the prediction of the category-selective topography? This information is important for researchers who are interested in using the approach in their studies.

      To investigate the influence of movie-viewing data length and the number of participants in the normative database on prediction performance, we systematically varied these parameters. Specifically, we altered the number of runs utilized in the analysis for both the normative and target data and experimented with varying the number of participants in the normative dataset using the Budapest and the Sraiders datasets. We have included a new Figure 4-figure supplement 5 to present a summary of these findings.

      The results reveal that both within-dataset and between-dataset prediction performances are positively correlated with the length of movie-viewing fMRI data used for both the normative and target groups. A similar trend was observed with respect to the number of participants included in the normative dataset. It is important to highlight, though, that even when analyzing as little as one run of movie-viewing data (roughly 10-15 minutes), our hyperalignment-based prediction performance was significantly higher than that achieved using traditional surface alignment. This held true even when the normative dataset included as few as five participants.

      In summary, our results show that prediction performance generally improves with longer movie-viewing sessions and larger normative datasets. However, it is noteworthy that even with minimal data—10 minutes of movie-viewing and a small number of participants in the normative dataset—our algorithm still outperforms traditional surface alignment methods significantly.

      We also added sentences in the discussion section (page 15):

      “We investigated the influence of naturalistic movie length and the size of the training group on the prediction accuracy of individualized functional topographies. By incrementally increasing both the number of movie runs in the training and target dataset and the participants in the training group in the Budapest and Sraiders dataset, we observed enhanced prediction accuracy (Figure 4-figure supplement 5). Notably, even with just one movie run in the training or target dataset, or with a mere five participants in the training group, our prediction performance (Pearson r) ranged from about 0.6 to 0.7. This accuracy significantly outperformed results obtained using surface-based alignment.”

      2) The data show that category-selective topography can be accurately estimated using connectivity hyperalignment, regardless of whether different movies are used to calculate the connectome and regardless of other data collection parameters. I'm wondering whether the functional connectome from resting state fMRI can do the same job as the movie-watching fMRI. If it is yes, it will expand the approach to broader data.

      We agree with the reviewer that demonstrating the applicability of the resting state data will expand the application scenarios of this approach. Most previous findings on resting state connectivity, including the comparison between the naturalistic and the resting state paradigms, focused on the macro-scale similarities and differences (e.g., Samara et al., 2023). Very few studies have investigated the fine-scaled connectome based on resting state data. The study on connectivity hyperalignment (Guntupalli et al., 2018) demonstrated a shared fine-scale connectivity structure among individuals that co-exists with the common coarse-scale structure and built the algorithm to successfully hyperalign individuals to the shared fine-scaled space. Another study from our lab (Feilong et al., 2021) revealed that the fine-scaled connectivity profiles in both resting and task states are highly predictive of general intelligence, indicating reliable and biologically relevant fine-scaled resting state connectome structures. Thus, it is highly plausible that our approach can be generalized to resting state data, generating significantly better predictions of individualized functional topographies than traditional surface alignment. However, the current datasets do not include resting state data, so we could not perform this analysis. We are in the process of collecting new data to explore this hypothesis in future work.

      We added sentences to the discussion section to discuss this idea (page 18):

      “Studies comparing movie-viewing and resting state functional connectivity have shown that both paradigms yield overlapping macroscale cortical organizations (29), though naturalistic viewing introduces unique modality-specific hierarchical gradients. However, there remains a gap in research comparing the fine-scaled connectomes of naturalistic and resting state paradigms. Guntupalli and colleagues (14) revealed a shared fine-scale structure that coexists with the coarse-scale structure, and connectivity hyperalignment successfully improved intersubject correlations across a wide variety of tasks. Feilong et al. (13) noted that the fine-scaled connectivity profiles in both resting and task states are highly predictive of general intelligence. This suggests a reliable and biologically relevant fine-scale resting state connectivity structure among individuals. Therefore, it is plausible that individualized functional topography could be effectively estimated using resting state functional connectivity, expanding the applicability of our approach. Future studies are needed to explore this direction.”

      3) The authors averaged the hyper-aligned functional localizer data from all of the subjects to predict individual category-selective topographies. As there is large spatial variability in the functional areas across subjects, averaging the data from many subjects may blur the boundaries of the functional areas. A better solution might be to average only those subjects whose connectomes are highly similar to that of the target subject.

      We appreciate the reviewer’s insightful question about optimizing prediction performance by selecting participants most similar in functional connectivity to the target individuals. This is a promising direction, and a difficult problem as well. Our approach uses fine-scale connectomes to hyperalign participants, so different groups of participants may be similar to the target participant in different searchlights. In addition, based on the results discussed in the response to Q1, the more participants included in the normative dataset, the better the prediction performance. Thus, there is a trade-off between the number of participants included in the normative dataset for the prediction and the overall similarity of those participants to the target participant.

      To quantitatively explore this idea, we used a searchlight in the right ventral temporal cortex, roughly at the location of the posterior fusiform face area (pFFA). We sorted participants by their connectome similarity to each target participant and then examined prediction performance based on either the top nine most similar participants or the bottom nine least similar participants. Our results, presented in Figure 4-figure supplement 8, reveal that hyperalignment consistently outperforms surface alignment regardless of the subset of participants used. Notably, using the nine most similar participants did not significantly alter prediction performance (Tukey Test, z = -0.09, p = 0.996), while using the least similar participants did negatively impact it (Tukey Test, z = 2.492, p = 0.034). Interestingly, the stability of hyperalignment-based predictions remained high even when only a subset of participants was used, contrasting with the variability observed in surface-alignment-based predictions.

      Overall, these findings suggest that while selecting functionally similar participants is a promising avenue for future optimization, the process will require nuanced, searchlight-specific criteria. Each searchlight may necessitate its own set of optimal participants to balance between the performance boost from having more participants and the fidelity gained from participant similarity.

      We added the following to the discussion in the manuscript (page 16):

      “In our study, we used fine-scale connectomes, noting that some participants are more similar to the target participant in specific searchlights. It is an interesting question whether predictions could be enhanced by exclusively selecting those more similar participants for the target participant. To explore this option, we examined a searchlight in the right ventral temporal cortex that was roughly at the location of the posterior fusiform face area (pFFA) using the top and bottom nine participants similar to each target participant measured by their fine-scale connectome similarities in the Budapest dataset. Generally, using all or part of the participants for the prediction generated similar results (Figure 4-figure supplement 8). Compared to using all the participants, using only the top nine participants who are the most similar to the target participants did not significantly improve the prediction (Tukey Test, z = -0.09, p = 0.996), but using only the bottom nine participants generated significantly lower prediction accuracies (Tukey Test, z = 2.492, p = 0.034). This suggests a trade-off between the number of participants included in the prediction and the similarity of the participants. Future studies are needed to explore the optimal threshold for the number of participants included for each searchlight to refine the algorithm.”
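
      The participant-selection analysis described in this response can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: each participant's connectome within the searchlight is assumed to be flattened into a vector, Pearson correlation is assumed as the similarity measure (the response does not name the measure), and the names `pearson_r`, `rank_by_similarity`, and `average_maps` are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rank_by_similarity(target, normative):
    """Return normative participant IDs, most similar to the target first."""
    sims = {pid: pearson_r(target, vec) for pid, vec in normative.items()}
    return sorted(sims, key=sims.get, reverse=True)


def average_maps(maps, ids):
    """Vertex-wise mean of the projected maps of the selected participants."""
    n_vertices = len(maps[ids[0]])
    return [sum(maps[i][v] for i in ids) / len(ids) for v in range(n_vertices)]


# Toy searchlight connectomes (hypothetical values, flattened to vectors).
target = [0.1, 0.5, 0.9, 0.3]
normative = {
    "s1": [0.2, 0.6, 1.0, 0.4],  # target shifted by a constant
    "s2": [0.9, 0.5, 0.1, 0.7],  # target mirrored
    "s3": [0.1, 0.4, 0.9, 0.5],  # roughly similar to the target
}
# Toy projected localizer maps for the same participants (two vertices each).
maps = {"s1": [1.0, 2.0], "s2": [5.0, 6.0], "s3": [3.0, 4.0]}

ranking = rank_by_similarity(target, normative)
top_k = ranking[:2]           # analogous to the "top nine" subset
bottom_k = ranking[-2:]       # analogous to the "bottom nine" subset
prediction_top = average_maps(maps, top_k)
```

      Because the ranking is computed per searchlight, a different subset could be selected in each searchlight, which is the searchlight-specific criterion the response refers to.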

      4) It is good to see that predictions made with hyperalignment were close to and sometimes even exceeded the reliability values measured by Cronbach's alpha. But, please clarify how the Cronbach's alpha is calculated.

      Cronbach’s alpha calculates the correlation score between localizer-based maps across the runs, and it reflects the amount of noise in maps based on individual localizer runs. Traditionally, the reliability was estimated based on split-half correlations. For example, Guntupalli et al. (2016) used correlations of category-selectivity maps between odd and even localizer runs as the measure of reliability. The odd/even split measure underestimated reliability and necessitated recalculation of correlations between maps for only half the data to provide valid comparisons. In contrast, Cronbach’s alpha involves all localizer runs and provides a more accurate statistical estimate of the reliability of the topographies estimated with localizer runs.

      Cronbach’s alpha has been used in many previously published works from our lab (e.g., Feilong et al., 2021; Jiahui et al., 2020, 2023). The code for implementing this metric is publicly accessible on the first author’s Github repository (https://github.com/GUOJiahui/face_DCNN/blob/main/code/cronbach_alpha.py).

      We added the detailed explanation above to the Material and Methods section (page 24):

      “Cronbach’s alpha calculates the correlation score between localizer-based maps across the runs, and it reflects the amount of noise in maps based on individual localizer runs. Traditionally, the reliability was estimated based on split-half correlations. The common odd/even split measure underestimated reliability and necessitated recalculation of correlations between maps for only half the data to provide valid comparisons. In contrast, Cronbach’s alpha involves all localizer runs and provides a more accurate statistical estimate of the reliability of the topographies estimated with localizer runs.”
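
      For readers who want a concrete picture of the reliability measure, a minimal sketch follows. It uses the textbook formula for Cronbach's alpha and assumes that localizer runs play the role of items and cortical vertices the role of observations; that mapping is an assumption made for illustration, and the authors' own implementation is the one in the repository linked above. The function name `cronbach_alpha` is chosen here and need not match theirs.

```python
from statistics import variance


def cronbach_alpha(run_maps):
    """Cronbach's alpha for a list of per-run maps (one value per vertex).

    alpha = k / (k - 1) * (1 - sum of per-run variances / variance of the summed map)
    """
    k = len(run_maps)
    if k < 2:
        raise ValueError("need at least two runs")
    item_vars = [variance(m) for m in run_maps]          # variance across vertices, per run
    totals = [sum(vals) for vals in zip(*run_maps)]      # vertex-wise sum across runs
    return k / (k - 1) * (1 - sum(item_vars) / variance(totals))


# Toy contrast maps from two localizer runs over four vertices (hypothetical values).
identical = cronbach_alpha([[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]])
noisy = cronbach_alpha([[1.0, 2.0, 3.0, 4.0], [2.0, 1.0, 4.0, 3.0]])
```

      Unlike an odd/even split, the formula uses every run at once, which matches the motivation given in the response.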

      5) Which algorithm was used to perform surface-based anatomical alignment? Can the state-of-the-art Multimodal Surface Matching (MSM) algorithm from HCP achieve better performance?

      We preprocessed our datasets using fMRIPrep, which employs algorithms from FreeSurfer’s recon-all for surface-based anatomical alignment. It is worth noting that different alignment methods can yield varying degrees of performance. For instance, a study by Coalson et al. (2018) compared the localization performance of multiple surface-based alignment methods, including Multimodal Surface Matching (MSM) and FreeSurfer. The study found that MSM outperformed FreeSurfer in terms of peak probabilities and spatial clustering, suggesting better overall localization.

      Additionally, Guntupalli et al. (2018) evaluated intersubject correlations (ISC) of functional connectivity from movie-viewing data using both Connectivity Hyperalignment (CHA) and MSM-All with the Human Connectome Project (HCP) dataset. The study showed that although MSM-All yielded marginally better ISC than traditional surface alignment, CHA’s performance was significantly superior.

      In summary, while using a more advanced alignment algorithm like MSM could marginally improve prediction performance, its advantages may not be substantial when compared to our CHA-based predictions. The combination of MSM and CHA represents an intriguing direction for future research, although it falls outside the scope of our current study.

      6) Is it necessary to project the time course of the functional localizer from the normative sample into the new participants? Does it work if we just project the contrast maps from the normative samples to the new subjects?

      It is an interesting and practical question whether the time series of the localizer runs are required to obtain reasonable predictions, as in some scenarios contrast maps may be the only accessible data. To quantitatively explore this possibility, we applied transformation matrices derived from the movie data to the training participants’ individual pre-calculated contrast maps of all four categories and evaluated the predictions. We found similar prediction performance between the two flavors within and across datasets (Figure 4-figure supplement 7). However, it is worth noting that applying transformation matrices directly to contrast maps did not yield as much improvement across the iterative steps of the advanced CHA as the time-series flavor did, perhaps due to the scale changes when multiple iterations were implemented and the difficulty of properly normalizing the t-maps compared to the regular time series.

      Overall, although our algorithm is originally designed to be used on the time course of the functional localizer runs, relatively comparable results can be generated even when the contrast maps are directly projected from the normative group to the target participant. However, to derive the best results with our approach, time series are recommended when the situation permits.

      We have also added the contents into the Discussion section (page 16):

      “Our original algorithm is designed to apply transformation matrices to the time series of localizer data of training participants before generating contrast maps. To explore whether directly applying these matrices to pre-calculated contrast maps yields comparable results, we conducted an additional analysis across the four categories. Our findings indicate that the prediction outcomes were indeed quite similar between the two approaches for both the within- and across-datasets predictions (Figure 4-figure supplement 7). However, it is worth noting that the improvements observed with enhanced CHA were not as pronounced when applied directly to the contrast maps as opposed to the time series.”
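
      One way to see why the two routes can give similar results is that applying a transformation matrix is a linear operation. The sketch below is an illustration under simplifying assumptions, not the authors' pipeline: the contrast is taken to be a plain weighted average over time points (a stand-in for a GLM contrast), and the matrices `Y`, `T`, and `w` are made-up toy values. Under those assumptions, projecting the time series and then computing the contrast gives the same map as computing the contrast first and then projecting it. A t-map additionally divides by a noise estimate, which is not linear, so the equivalence would no longer be exact; this may relate to the normalization difficulty mentioned in the response.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


# Localizer time series of a training participant: 4 time points x 3 source vertices.
Y = [
    [1.0, 2.0, 0.0],
    [3.0, 1.0, 1.0],
    [0.0, 0.0, 2.0],
    [2.0, 4.0, 1.0],
]
# Hypothetical transformation matrix: 3 source vertices x 2 target vertices.
T = [
    [0.5, 0.0],
    [0.5, 1.0],
    [0.0, 1.0],
]
# Contrast weights over time: mean of time points 0-1 minus mean of time points 2-3.
w = [[0.5, 0.5, -0.5, -0.5]]

# Route 1: project the time series, then compute the contrast.
via_time_series = matmul(w, matmul(Y, T))
# Route 2: compute the contrast map, then project it.
via_contrast_map = matmul(matmul(w, Y), T)
```

      Both routes yield the same two-vertex map in this toy case, because matrix multiplication is associative.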

      7) Saygin and her colleagues have demonstrated that structural connectivity fingerprints can predict cortical selectivity for multiple visual categories across cortex (Osher DE et al, 2016, Cerebral Cortex; Saygin et al, 2011, Nat. Neurosci). I think there's a connection between those studies and the current study. If the authors can discuss the connection between them, it may help us understand why CHA works so well.

      We thank the reviewer for raising this point, which gives us the chance to clarify how our approach differs from methods previously reported in the literature. The computational logic underlying our approach is that we derived the transformation matrices between the training and the target participants in the high-dimensional space based on functional connectivity calculated from the movie data. Then, we applied these transformation matrices to the training participant’s localizer data to accomplish the prediction. On the other hand, Saygin and colleagues directly used diffusion-weighted imaging (DWI) data and predicted participants’ functional responses based on the anatomical-functional correspondence. They evaluated the prediction by calculating the mean absolute errors (AE) of the difference between the actual and predicted contrast responses. Although AE scales linearly with the quality of the prediction, it is difficult to measure the prediction performance of the shape, size, and location of the functional areas precisely using this mean value. With our algorithm, we were able to predict the general location and size of the areas and recover the individualized shapes, generating more powerful predictions. We also used the searchlight analysis to evaluate the performance across the cortex systematically. In addition, Osher et al. (2016) and Saygin et al. (2012) consistently had a few participants who failed to show better predictions based on the connectivity than with the group-averaged method. Our algorithm is more stable, as all participants across all four datasets had better predicted performance using our algorithm than using the group average. However, although we did not directly use the anatomical-functional correspondence with DWI, the relationships between individual structural connectivity and cortical visual category selectivity could be one of the biological underpinnings that contribute to this robust and accurate prediction.

      The Connectivity-Based Shared Response Model (cSRM, Nastase et al., 2020) offers an alternative framework for aligning individuals through functional connectivity. While the overarching aims of cSRM and our methodology converge, substantial differences in implementation and application make our approach more suitable for predicting individualized topographies. The most significant difference is that, instead of focusing on within-individual connectivity profiles, cSRM used inter-subject functional connectivity (ISFC) in the initial step. This design requires all participants to have time-locked time series, making the algorithm unusable for cross-content prediction and incompatible with resting-state data. Our approach, on the other hand, does not require time-locked stimuli, thereby offering a more flexible framework that permits generalization across different types of stimuli and experimental settings and makes it possible to bring together data from laboratories across the world. Secondly, cSRM predominantly focuses on Region of Interest (ROI) analyses, whereas our model employs searchlight-based analyses designed to comprehensively cover the entire cortical sheet. Whole-brain coverage is needed to generate the topography that reflects the patterns across the cortex. Finally, with the optimized one-step method, our approach directly hyperaligns the training and target participants together, avoiding the accumulation of errors from the intermediate common space. cSRM, with an implementation similar to the classic connectivity hyperalignment, creates and hyperaligns all participants to a shared information space. In summary, while our approach and cSRM share a similar theoretical foundation, our approach has been specifically optimized to address the challenges and complexities in predicting individualized whole-brain functional topographies. Moreover, our approach demonstrates a remarkable ability to generalize across a variety of contexts and stimuli, offering a significant advantage in dealing with diverse experimental settings and datasets.

      We have added the contents to the discussion section (page 16-17):

      “By leveraging transformation matrices obtained from hyperaligning participants based on movie-viewing data, we successfully mapped these relationships to the training participants’ localizer data, enabling robust predictions. Prior work employing diffusion-weighted imaging (DWI) has underscored the link between anatomical connectivity and category selectivity across diverse visual fields (22, 23) and has established a notable congruence between structural and functional connectivities (24). These findings suggest that the unique anatomical connectivity patterns of individuals may serve as a foundational mechanism, contributing to the stable fine-scale functional connectome that underpins our approach. The connectivity-based Shared Response Model (cSRM) proposed by Nastase and colleagues (25) used connectivity to functionally align individuals similar to the connectivity hyperalignment algorithm. While both approaches share overarching goals, they diverge considerably in implementation and application. First and most important, cSRM used inter-subject functional connectivity (ISFC) rather than within-subject functional connectivity to initially estimate the connectome. As a result, cSRM requires participants to have time-locked fMRI time series. Therefore, unlike our algorithm, the cSRM approach does not support cross-content applications and also is not suitable for use with resting-state data. Second, cSRM is implemented based on a predefined cortical parcellation rather than the overlapping, regularly-spaced cortical searchlights applied in our method which are not constrained by areal borders. For the application, cSRM has mainly been used to do ROI analysis rather than the estimation of the whole-brain topography that requires broader coverage of the cortex with a searchlight analysis. Third, our method is specifically designed to work in each individual’s space, while cSRM decomposes data across subjects into shared and subject-specific transformations, focusing on a communal connectivity space. In summary, although cSRM presents a promising alternative for similar aims, its current implementation precludes it from fulfilling the range of applications for which our method is optimized.”

      Reviewer #3 (Public Review):

      In this paper, Jiahui and colleagues propose a new method for learning individual-specific functional magnetic resonance imaging (fMRI) patterns from naturalistic stimuli, extending existing hyperalignment methods. They evaluate this method - enhanced connectivity hyperalignment (CHA) - across four datasets, each comprising between nine (Raiders) and twenty (Budapest, Sraiders) participants.

      The work promises to address a significant need in existing functional alignment methods: while hyperalignment and related methods have been increasingly used in the field to compare participants scanned with overlapping stimuli (or lack thereof, in the case of resting state data), their use remains largely tied to naturalistic stimuli. In this case, having non-overlapping stimuli is a significant constraint on application, as many researchers may have access to only partially overlapping stimuli or wish to compare stimuli acquired under different protocols and at different sites.

      It is surprising, however, that the authors do not cite a paper that has already successfully demonstrated a functional alignment method that can address exactly this need: a connectivity-based Shared Response Model (cSRM; Nastase et al., 2020, NeuroImage). It would be relevant for the authors to consider the cSRM method in relation to their enhanced CHA method in detail. In particular, both the relative predictive performance as well as associated computational costs would be useful for researchers to understand in considering enhanced CHA for their applications.

      We thank the reviewer for raising this point, which gives us the chance to clarify how our approach differs from methods previously reported in the literature. The computational logic underlying our approach is that we derived the transformation matrices between the training and the target participants in the high-dimensional space based on functional connectivity calculated from the movie data. Then, we applied these transformation matrices to the training participant’s localizer data to accomplish the prediction. On the other hand, Saygin and colleagues directly used diffusion-weighted imaging (DWI) data and predicted participants’ functional responses based on the anatomical-functional correspondence. They evaluated the prediction by calculating the mean absolute errors (AE) of the difference between the actual and predicted contrast responses. Although AE scales linearly with the quality of the prediction, it is difficult to measure the prediction performance of the shape, size, and location of the functional areas precisely using this mean value. With our algorithm, we were able to predict the general location and size of the areas and recover the individualized shapes, generating more powerful predictions. We also used the searchlight analysis to evaluate the performance across the cortex systematically. In addition, Osher et al. (2016) and Saygin et al. (2012) consistently had a few participants who failed to show better predictions based on the connectivity than with the group-averaged method. Our algorithm is more stable, as all participants across all four datasets had better predicted performance using our algorithm than using the group average. However, although we did not directly use the anatomical-functional correspondence with DWI, the relationships between individual structural connectivity and cortical visual category selectivity could be one of the biological underpinnings that contribute to this robust and accurate prediction.

      The Connectivity-Based Shared Response Model (cSRM, Nastase et al., 2020) offers an alternative framework for aligning individuals through functional connectivity. While the overarching aims of cSRM and our methodology converge, substantial differences in implementation and application make our approach more suitable for predicting individualized topographies. The most significant difference is that, instead of focusing on within-individual connectivity profiles, cSRM used inter-subject functional connectivity (ISFC) in the initial step. This design requires all participants to have time-locked time series, making the algorithm unusable for cross-content prediction and incompatible with resting-state data. Our approach, on the other hand, does not require time-locked stimuli, thereby offering a more flexible framework that permits generalization across different types of stimuli and experimental settings and makes it possible to bring together data from laboratories across the world. Secondly, cSRM predominantly focuses on Region of Interest (ROI) analyses, whereas our model employs searchlight-based analyses designed to comprehensively cover the entire cortical sheet. Whole-brain coverage is needed to generate the topography that reflects the patterns across the cortex. Finally, with the optimized one-step method, our approach directly hyperaligns the training and target participants together, avoiding the accumulation of errors from the intermediate common space. cSRM, with an implementation similar to the classic connectivity hyperalignment, creates and hyperaligns all participants to a shared information space. In summary, while our approach and cSRM share a similar theoretical foundation, our approach has been specifically optimized to address the challenges and complexities in predicting individualized whole-brain functional topographies. Moreover, our approach demonstrates a remarkable ability to generalize across a variety of contexts and stimuli, offering a significant advantage in dealing with diverse experimental settings and datasets.

      We have added the contents to the discussion section (page 16-17):

      “By leveraging transformation matrices obtained from hyperaligning participants based on movie-viewing data, we successfully mapped these relationships to the training participants’ localizer data, enabling robust predictions. Prior work employing diffusion-weighted imaging (DWI) has underscored the link between anatomical connectivity and category selectivity across diverse visual fields (22, 23) and has established a notable congruence between structural and functional connectivities (24). These findings suggest that the unique anatomical connectivity patterns of individuals may serve as a foundational mechanism, contributing to the stable fine-scale functional connectome that underpins our approach. The connectivity-based Shared Response Model (cSRM) proposed by Nastase and colleagues (25) used connectivity to functionally align individuals similar to the connectivity hyperalignment algorithm. While both approaches share overarching goals, they diverge considerably in implementation and application. First and most important, cSRM used inter-subject functional connectivity (ISFC) rather than within-subject functional connectivity to initially estimate the connectome. As a result, cSRM requires participants to have time-locked fMRI time series. Therefore, unlike our algorithm, the cSRM approach does not support cross-content applications and also is not suitable for use with resting-state data. Second, cSRM is implemented based on a predefined cortical parcellation rather than the overlapping, regularly-spaced cortical searchlights applied in our method which are not constrained by areal borders. For the application, cSRM has mainly been used to do ROI analysis rather than the estimation of the whole-brain topography that requires broader coverage of the cortex with a searchlight analysis. Third, our method is specifically designed to work in each individual’s space, while cSRM decomposes data across subjects into shared and subject-specific transformations, focusing on a communal connectivity space. In summary, although cSRM presents a promising alternative for similar aims, its current implementation precludes it from fulfilling the range of applications for which our method is optimized.”

      With this in mind, I noted several current weaknesses in the paper:

      First, while the enhanced CHA method is a promising update on existing CHA techniques, it is unclear why this particular six-step, iterative approach was adopted. That is: why were six steps chosen over any other number? At present, it is not clear if there is an explicit loss function that the authors are minimizing over their iterations. The relative computational cost of six iterations is also likely significant, particularly compared to previous hyperalignment algorithms. A more detailed theoretical understanding of why six iterations are necessary, or whether other researchers could adopt a variable number according to the characteristics of their data, would significantly improve the transferability of this method.

      In the advanced connectivity hyperalignment implementation, we gradually increased the number of targets. The six steps were not intentionally chosen but were the result of the increase to the maximum number of fine-grained targets, namely single cortical vertices.

      Our datasets were resampled to a cortical mesh with 18,742 vertices across both hemispheres (approximately 3 mm vertex spacing; icoorder 5; 20,484 vertices before removing non-cortical vertices). Step 1 was the classic connectivity hyperalignment implementation based on the anatomically aligned data. Since using dense connectivity targets (e.g., using all 18,742 vertices on the surface) with anatomically aligned data generates poor functional correspondence across participants (Busch et al., 2021), we used 1,284 vertices (icoorder 3, before removing the medial wall) as connectivity targets in step 1. However, after the first iteration of connectivity hyperalignment, it is beneficial to include more targets for calculating connectivity patterns, and repeated iterations lead to a better solution by gradually aligning the information at finer scales. To better align across participants, we iterated the alignment two more times (steps 2 and 3) with the same 1,284 coarse connectivity targets to ensure improved alignment before increasing the number of targets in the later steps. In step 4, we increased the number of targets to 5,124 (icoorder 4, before removing the medial wall) and iterated with this number of targets twice in total (steps 4 and 5) before using all vertices as targets. In the final step (step 6), all vertices were used as connectivity targets.
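
      The target counts in this schedule follow from the geometry of an icosahedral mesh: a mesh of order n has 10 × 4^n + 2 vertices per hemisphere, which gives 1,284, 5,124, and 20,484 vertices across two hemispheres for orders 3, 4, and 5. The sketch below simply tabulates the six steps under that formula. The searchlight radii are the 13 mm and 7 mm values quoted from the Materials and Methods; the names `ico_vertices`, `SCHEDULE`, and `n_targets` are hypothetical and are not taken from the authors' code.

```python
def ico_vertices(order, hemispheres=2):
    """Vertex count of an icosahedral mesh of the given order, summed over hemispheres."""
    return hemispheres * (10 * 4 ** order + 2)


# From the text: vertices remaining after removing non-cortical vertices.
N_CORTICAL = 18742

# (step, ico order of the connectivity targets, searchlight radius in mm).
# None means every cortical vertex is its own target, with no searchlight averaging.
SCHEDULE = [
    (1, 3, 13),
    (2, 3, 13),
    (3, 3, 13),
    (4, 4, 7),
    (5, 4, 7),
    (6, None, None),
]


def n_targets(order):
    """Number of connectivity targets (before medial-wall removal for ico targets)."""
    return N_CORTICAL if order is None else ico_vertices(order)


for step, order, radius in SCHEDULE:
    print(step, n_targets(order), radius)
```

      A researcher with limited computing resources could shorten the list, for example by stopping after the coarse steps, in line with the authors' remark that the number of iterations is left to the user.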

      It is true that the multiple iteration steps substantially increase the computational cost compared to the classic connectivity hyperalignment, but the gain in prediction performance was steady across all datasets, and performance became comparable to that of response hyperalignment, which requires time-locked stimuli. We did not use an explicit loss function in the algorithm, but followed the natural progression of the number of potential connectivity targets in the implementation. On the other hand, the difference between the performance of the improved and the classic connectivity hyperalignment was relatively small (difference of r < 0.05), which indicates the effectiveness of our classic algorithm. Researchers are free to choose the number of iterations and the pace at which the number of targets increases at each step. If computational resources are limited or if a shorter total computational time is the primary priority, using the classic connectivity hyperalignment may be the best option to balance the trade-offs.

      The Materials and Methods section had the details of the implementation (page 22-23):

      “Using dense connectivity targets (e.g., using all 18742 vertices on the surface) with anatomically-aligned data usually generates poor functional correspondence across participants (33). It is, however, beneficial to include more targets for calculating connectivity patterns after the first iteration of connectivity hyperalignment and repeated iterations to lead to a better solution by gradually aligning the information at finer scales.

      We used six steps to further improve the connectivity hyperalignment method. Step 1 was the initial connectivity hyperalignment step as described above that was based on the raw anatomically aligned movie data. The resultant transformation matrices were applied to those movie runs, and the hyperaligned data were then used in step 2 to calculate new connectivity patterns and calculate new transformation matrices. We repeated this procedure iteratively six times and derived transformation matrices for each step. In steps 1, 2, and 3, 642 × 2 (icoorder3, before removing the medial wall) connectivity targets were defined with 13 mm searchlights. In step 4 and 5, 2562 × 2 (icoorder 4, before removing the medial wall) connectivity targets were used with 7 mm searchlights to calculate target mean time series. In the final step 6, all 18742 vertices were included as separate connectivity targets, using each vertex’s time series rather than calculating the mean in a searchlight. Each step of this advanced connectivity hyperalignment algorithm increased the prediction performance (Figure 4-figure supplement 2).”

      But to help the readers understand the logic of the advanced connectivity hyperalignment algorithm used in this study, we expanded the discussion section (page 15):

      “Because using dense connectivity targets (e.g., using all vertices as connectivity targets) with anatomically-alignment data often leads to suboptimal alignment across participants (33), we started with coarse connectivity targets and gradually increased the number of connectivity targets to form a denser representation of connectivity profiles. The iterations improved the prediction performance step by step, and at the final step (step 6, all vertices were used as connectivity targets) in this analysis, the enhanced CHA generated comparable performance with RHA (Figure 4-figure supplement 4).”

      Second, the existing evaluations for enhanced CHA appear to be entirely based on image-derived correlations. That is, the authors compare the predicted image from CHA with the ground-truth image using correlation. While this provides promising initial evidence, correlation-based measures are often difficult to interpret given their sensitivity to image characteristics such as smoothness. Including Cronbach's alpha reliability as a baseline does not address this concern, as it is similarly an image-based statistic. It would be useful to see additional predictive experiments using frameworks such as time-segment classification, intersubject decoding, or encoding models.

      We appreciate the reviewer’s concern regarding the stability of local correlations in relation to image characteristics. To address this, we conducted an additional analysis using different searchlight sizes (with radii of 10 mm, 15 mm, and 20 mm) to evaluate the predicted category-selective maps, focusing specifically on the Budapest dataset. The local correlations between the predicted category-selective maps (obtained using enhanced CHA) and participants’ own maps based on classic localizer runs were calculated for each searchlight. We averaged these correlations across participants and plotted the resulting maps, as shown in Figure 4-figure supplement 10. Although using a larger searchlight radius is similar to employing a larger smoothing kernel, the results remained relatively stable across different searchlight sizes, particularly in regions selectively responsive to the specific category. This stability suggests that while the evaluation may be influenced by image-related features, the conclusion would remain consistent under varying parameters.
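
      The searchlight evaluation described above can be illustrated with a minimal sketch: for every vertex, collect the vertices within a given radius and correlate the predicted map with the measured map inside that neighborhood. This is an illustration under stated assumptions, not the authors' implementation. It uses straight-line (Euclidean) distance between vertex coordinates as a stand-in for distance on the cortical surface, and all names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences (NaN if undefined)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def searchlight_correlations(coords, predicted, measured, radius_mm):
    """Local correlation between two maps in a searchlight around each vertex.

    coords: list of (x, y, z) vertex positions in mm.
    Distance is Euclidean here, which is an assumption of this sketch.
    """
    results = []
    for center in coords:
        idx = [i for i, c in enumerate(coords) if math.dist(center, c) <= radius_mm]
        if len(idx) < 3:  # too few vertices for a meaningful correlation
            results.append(float("nan"))
            continue
        results.append(
            pearson_r([predicted[i] for i in idx], [measured[i] for i in idx])
        )
    return results


# Toy example: five vertices on a line, 5 mm apart; the measured map is a
# scaled copy of the predicted map, so every local correlation is 1.
coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0),
          (15.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
predicted = [1.0, 2.0, 3.0, 4.0, 5.0]
measured = [2.0, 4.0, 6.0, 8.0, 10.0]
local_r = searchlight_correlations(coords, predicted, measured, radius_mm=10.0)
```

      Repeating the call with different values of `radius_mm` and averaging the per-vertex results across participants would correspond to the comparison across searchlight sizes described in the response.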

      As for the use of enhanced CHA, it serves as an optimized version of the classic CHA, specifically designed for predicting individualized functional topographies. Evaluating prediction performance in our study is based on t-value contrast maps for each participant. Given this, it is unclear how time-segment classification or other decoding/encoding models could be appropriately implemented for performance evaluation. However, prior research from our lab has already established the effectiveness of classic CHA. Specifically, Guntupalli et al. (2018) showed that classic CHA significantly improved intersubject correlations (ISC) of connectivity profiles across the cortex. They also revealed that CHA captured fine-scale variations in connectivity profiles for nearby cortical nodes across participants and led to improved between-subject multivariate pattern classification accuracies (bsMVPC) of movie segments. These findings serve as robust evidence for the effectiveness of classic CHA, laying the groundwork for our enhanced CHA approach.

      We added Figure 4-figure supplement 10 to the supplementary material.

      Addressing these concerns and considering cSRM as a comparison model would significantly strengthen the paper. There are also notable strengths that I would encourage the authors to further pursue. In particular, the authors have access to a unique dataset in which the same Raiders of the Lost Ark stimulus was scanned for participants within the Budapest (SRaiders) dataset as well as non-overlapping participants in the Raiders dataset. Exploring the relative performance for cross-movie prediction within a dataset as compared to a shared movie prediction across datasets is particularly interesting for methods development. I would encourage the authors to explicitly report results in this framework to highlight both this unique testing structure as well as the performance of their enhanced CHA method.

      We appreciate the reviewer's suggestion to examine a shared time-series but non-overlapping participants scenario using the Sraiders and Raiders datasets. However, there are significant differences between the two datasets that preclude such direct comparison. These differences include varying scanning parameters, MRI scanners, localizer types, and data collection procedures. Due to these methodological divergences, the datasets cannot be treated as identical time-series.

      Firstly, the scanning parameters vary considerably. Sraiders was scanned with TR = 1 s (TR/TE = 1000/33 ms, flip angle = 59°, resolution = 2.5 mm³ isotropic voxels, matrix size = 96 × 96, FoV = 240 × 240 mm, multiband acceleration factor = 4, and no in-plane acceleration), and Raiders was scanned with TR = 2.5 s (TE = 35 ms, flip angle = 90°, matrix size = 80 × 80, FoV = 240 × 240 mm, resolution = 0.938 mm × 0.938 mm × 1.0 mm).

      Secondly, participants in the Sraiders dataset were scanned with a 3 T Siemens Magnetom Prisma MRI scanner with a 32-channel head coil, whereas the Raiders dataset, collected more than 10 years ago, used a 3 T Philips Intera Achieva scanner with an eight-channel head coil.

      Thirdly, the stimulus presentations were different. In the Sraiders dataset, the movie Raiders of the Lost Ark was split into eight parts (~15 min each), and the first four parts were watched outside of the scanner prior to scanning (~56 min). The last four parts were watched in the scanner (57 min) with audio. In the Raiders dataset, the audio-visual movie was also split into eight parts (~15 min each), and participants watched all eight parts in the scanner with audio (one part per run).

      Fourthly and critically, the two datasets included different types of localizers. The Sraiders dataset included dynamic localizer runs, whereas the Raiders dataset contained only a static localizer designed similarly to the one in the Forrest dataset.

      Given these four points, it is not suitable to treat the two datasets as identical time-series. The difference in the localizer type is a further issue. The topographies generated from the two types of localizers are dissimilar in many ways. For all categories, the dynamic localizer elicited stronger and broader category-selective activations than the static localizer, and the searchlight analysis showed that the dynamic localizer had higher reliabilities across the cortex, especially in regions that were selectively responsive to the target category. Due to these differences, cross-dataset predictions yielded lower correlations than within-dataset predictions. This is not indicative of methodological failure but reflects diverging topographies activated by different localizers.

      In the manuscript, we have extensively analyzed cross-dataset predictions (Figure 2-figure supplement 1 through Figure 4-figure supplement 4, and Figure 4-figure supplement 6).

      ● Figure 2-figure supplement 1 demonstrates that, despite the limitations of cross-localizer-type evaluation, both R-to-S (Raiders to Sraiders) and S-to-R (Sraiders to Raiders) predictions significantly outperformed surface alignment methods across categories.

      ● Figure 2-figure supplement 2 confirms that the prediction performance remained stable across individual participants, underscoring the robustness of our methodology.

      ● Figure 3-figure supplement 1 & Figure 3-figure supplement 2 display contrast maps generated from both native and alternate localizers, revealing that the maps share similar topographies irrespective of the dataset origin.

      ● Figure 4-figure supplement 1 presents a correlation analysis of local similarities in R-to-S and S-to-R predictions, highlighting particularly strong correlations in the ventral face regions.

      ● Figure 4-figure supplement 2 employs histograms to showcase performance across major cortices and furnishes additional evidence regarding the influence of localizer types on the results.

      ● Figure 4-figure supplement 3 offers a searchlight analysis for other categories, enriching the scope of our investigation.

      ● Figure 4-figure supplement 4 affirms that the advanced CHA is effective in both R-to-S and S-to-R predictions.

      ● Figure 4-figure supplement 6 compares the efficacy of 1-step vs. 2-step prediction methods for R-to-S and S-to-R, showing a clear advantage for the 1-step approach.

      These analyses affirmed that our approach outperforms surface alignment methods. But the inherent limitations in data collection and localizer types preclude a direct exploration of the reviewer’s hypothesis. These complexities necessitate further research to fully validate the proposed scenario.

      Overall, I share the authors' enthusiasm for the potential of cross-movie, cross-dataset prediction, and I believe that methods such as enhanced CHA are likely to significantly improve our ability to make these comparisons in the near future. At present, however, I find that the theoretical and experimental support for enhanced CHA is incomplete. It is therefore difficult to assess how enhanced CHA meets its goals or how successfully other researchers would be able to adopt this method in their own experiments.

      We hope our new analysis and replies addressed the reviewer’s concerns.

    1. Fine-tuning takes a pre-trained LLM and further trains the model on a smaller dataset, often with data not previously used to train the LLM, to improve the LLM’s performance for a particular task.

      LLMs can be extended with both RAG and Fine-Tuning. Fine-tuning is appropriate when you want to customize an LLM to perform well in a particular domain using private data. For example, you can fine-tune an LLM to become better at producing Python programs by further training the LLM on high-quality Python source code.

      In contrast, you should use RAG when you are able to augment your LLM prompt with data that was not known to your LLM at the time of training, such as real-time data, personal (user) data, or context information useful for the prompt.
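
      To make the RAG side of this contrast concrete, here is a minimal sketch of prompt augmentation: pick the stored snippets most relevant to the question and prepend them to the prompt, leaving the model's weights untouched. Everything here is an illustrative assumption. The word-overlap scoring is a toy stand-in for the embedding-based retrieval a real system would likely use, the function names are hypothetical, and no LLM is actually called.

```python
def overlap_score(query: str, document: str) -> int:
    """Toy relevance score: number of lowercase words shared by both texts."""
    return len(set(query.lower().split()) & set(document.lower().split()))


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents with the highest overlap score."""
    ranked = sorted(documents, key=lambda d: overlap_score(query, d), reverse=True)
    return ranked[:k]


def build_prompt(query: str, documents: list[str], k: int = 2) -> str:
    """Augment the question with retrieved context before sending it to an LLM."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


# Data the model could not have seen at training time (e.g. today's hours).
documents = [
    "The store opens at 9 am today",
    "Python is a programming language",
    "Bananas are yellow",
]
query = "When does the store open today"
prompt = build_prompt(query, documents, k=1)
print(prompt)
```

      Fine-tuning, by contrast, would change the model itself by further training it on examples, so nothing would need to be added to the prompt at query time.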

    1. y

      does this somehow become an implicit argument? It seems so from later uses, but how do generalized variables decide when to generalize implicitly? More importantly, its type was earlier declared to be A, but it should be [ A ]∞ ?? Ok so it appears that the generalized vars can be instantiated to different things – I don't know if at a term- or module level. Ok so if you look at the next line, already they're instantiated to something else. So apparently you can do this even at the clause level. Btw the source code for this blog can be found at https://github.com/jespercockx/agda2scheme/blob/12ff5dcaebfe38dfbfdb48d4fb97bbbe2aa792d9/test/formalize-all-the-things.agda, there doesn't seem to be a lagda file though.

    1. The next article in this series, “Regular Expression Matching: the Virtual Machine Approach,” discusses NFA-based submatch extraction. The third article, “Regular Expression Matching in the Wild,” examines a production implementation. The fourth article, “Regular Expression Matching with a Trigram Index,” explains how Google Code Search was implemented.

      Russ's regular expression article series makes for a good example when demonstrating the Web's pseudomutability problem. It also works well to discuss forward references.

    1. Author Response

      We thank the reviewers for their suggestions for improving the manuscript. We are currently working on a formal revision and plan to submit a revised manuscript in the near future. However, we would be remiss if we did not address concerns regarding the conceptual merits of the paper. Below we speak to major points of note that address select reviewer comments and the eLife assessment of our manuscript.

      eLife assessment:

      However, the strength of evidence is incomplete due to the concern that larval contraction is a result of chilling the nervous system and muscles, which causes spreading depolarization and mechanical contraction of the body, rather than an active sensorimotor response to cold.

      Reviewer #3:

      The scientific premise is that a full body contraction in larvae that are exposed to noxious cold is a sensorimotor behavioral pathway. This premise is, to start with, questionable. A common definition of behavior is a set of "orderly movements with recognizable and repeatable patterns of activity produced by members of a species (Baker et al., 2001)." In the case of nociception behaviors, the patterns of movement are typically thought to play a protective role and to protect from potential tissue damage.

      Does noxious cold elicit a set of orderly movements with a recognizable and repeatable pattern in larvae? Can the patterns of movement that are stimulated by noxious cold allow the larvae to escape harm? Based on the available evidence, the answer to both questions is seemingly no.

      We thank the reviewer for their questions and clarify here. Exposure to cold temperatures does elicit a recognizable and repeatable pattern of behavior across multiple strains, including both wildtype and genetic control strains (w1118, Oregon R) and numerous control conditions that have been previously published (Himmel et al., 2021, Himmel et al., 2023, Patel et al., 2022, Turner et al., 2016, Turner et al., 2018, Tenedini et al., 2019). Our initial publication on Drosophila cold nociception demonstrated a variety of cold-evoked behavior responses including head and/or tail raising of the larva as well as contraction behavior. These behaviors were repeatedly observed in assays involving either local cold stimulation with a cold probe or global cold stimulation on a cold plate. Head and/or tail raise behaviors are consistent with behavior that displaces the larval body from the cold surface; however, exposure to increasingly colder temperatures leads to an increasing level of cold-evoked contraction (CT) responses, which result in a reduction of larval area (Turner et al., 2016). Presumably, increasing the level of CIII md neuron activation leads to greater activation of downstream circuitry. We previously performed optogenetic dose response assays to further clarify the increased prevalence of CT responses to strong noxious cold stimuli and investigated how CIII md neurons discriminate between innocuous touch and noxious cold stimuli. Here, we found that lower-level activation of CIII md neurons led to predominantly touch-evoked behaviors whereas high-level activation led predominantly to cold-evoked responses (Turner et al., 2016). These analyses were coupled with stimulus-evoked calcium imaging, which revealed that touch-evoked Ca2+ levels were significantly lower than cold-evoked Ca2+ levels (Turner et al., 2016).

      In this manuscript, we confirm our previously published findings that neural silencing of CIII md neurons, either by tetanus toxin expression or by impairing action potential propagation, results in impaired cold-evoked CT responses (Turner et al., 2016, Turner et al., 2018). However, neural silencing of CIII md neurons did not eliminate cold-evoked CT responses. We interpret this finding as evidence that some component of the cold-evoked CT response may be due to cold-induced muscle contraction. Furthermore, in this manuscript, we implicate the requirement of chordotonal (Ch) neurons in cold-evoked CT and demonstrate cold-evoked Ca2+ increases in Ch neurons. Moreover, neural silencing of multiple sensory neuron types (CIII + Ch or CIII + CII) resulted in greater deficits in cold-evoked behaviors (Turner et al., 2016). Thus, the noxious cold stimulus is detected by multiple peripheral sensory neurons and inhibiting neural activity in CIII md neurons alone cannot eliminate cold-evoked CT responses.

      In this manuscript and in several other publications, studies have shown that optogenetic activation of CIII md neurons, or CIII neurons plus CII neurons or Ch neurons elicits CT-like responses (Hwang et al., 2007, Shearin et al., 2013, Turner et al., 2016). Conversely, optogenetic stimulation of CIII md neurons knocked down for paralytic, the α-subunit of voltage-gated sodium channel, did not elicit blue light-evoked CT responses due to impaired action potential propagation. These analyses collectively indicate that CIII md neuron activation is sufficient for eliciting CT-like responses. Additionally, we have previously published electrophysiological recordings of CIII md neurons under cold exposure. To address potential confounds of cold-induced muscle contraction on cold-induced electrical activity of CIII md neurons, we performed these analyses on de-muscled fillets revealing that CIII neural activity is not dependent upon muscles in response to cold. Exposure to noxious cold stimuli results in temperature-dependent increases in CIII neuron firing pattern consisting of both bursting and tonic firing (Himmel et al., 2021, Himmel et al., 2023, Maksymchuk et al., 2022, Patel et al., 2022, Himmel et al., 2022, Maksymchuk et al., 2023).

      Reviewer #3:

      Can the patterns of movement that are stimulated by noxious cold allow the larvae to escape harm?

      We were similarly curious about the neuroethological and/or protective implications of cold-evoked behaviors. In Drosophila larvae, noxious mechanical stimuli-evoked body rolling allows for lateral escape from predatory wasps (Hwang et al., 2007). Reducing the overall surface area that is exposed to cold (e.g., huddling behavior) serves as a protective strategy in many species (Canals et al., 1997, Contreras, 1984, Gilbert et al., 2006, Vickery and Millar, 1984, Hayes et al., 1992). Low temperatures can be fatal to poikilotherms (e.g., insects); however, many species have evolved the ability to cold acclimate, thereby increasing their cold tolerance. To explore the potential evolutionary benefit of CIII-mediated contraction response to cold, we previously published work revealing a neural basis for cold acclimation in Drosophila larvae implicating these neurons (Himmel et al., 2021). We demonstrated that cold-evoked CT behavior is evolutionarily conserved across 11 different drosophilid species and that other cold-induced behaviors (e.g., tail raise) were also observed. Furthermore, drosophilid species adapted to rapid temperature swings were more likely to retain the ability to locomote even at lower temperatures (Himmel et al., 2021). Next, we elucidated the role of CIII md neurons in cold acclimation. Silencing CIII md neurons resulted in the inability to cold acclimate. We additionally investigated roles of Ch or CII md neurons, which alone did not inhibit the ability of larvae to cold acclimate. However, combinatorial silencing of CIII with CII or Ch neurons resulted in an inability to cold acclimate but did not obviously increase baseline cold tolerance. We explored how developmental exposure to noxious cold temperature impacts CIII md neuron cold-evoked firing pattern. Electrophysiological analyses revealed that cold acclimation results in hypersensitization in CIII md neurons (Himmel et al., 2021). Lastly, developmental optogenetic activation of CIII md neurons led to increased cold tolerance. Therefore, CIII md neurons are necessary and sufficient for cold tolerance, and our collective evidence demonstrates that CIII-mediated cold nociception constitutes a peripheral neural basis for Drosophila larval cold acclimation (Himmel et al., 2021).

      Reviewer #3:

      It should be noted that this actuator drives very strong activation, and other studies with milder optogenetic stimulation of Class III neurons have shown that these cells produce behavioral responses that resemble gentle touch responses (Tsubouchi et al 2012 and Yan et al 2013)…The latter makes the reported Calcium responses to cold difficult to interpret in light of the fact that the strong muscle contractions driven by cold may actually be driving mechanosensory responses in these cells (ie through deformation of the mechanosensitive dendrites)…. Are the cIII calcium signals still observed in a preparation where cold induced muscle contractions are prevented?

      We agree with the reviewer that mild activation of CIII md neurons results in gentle touch-like responses. In this manuscript, and other previously published work, it has been shown that optogenetic activation of CIII neurons, or CIII neurons and other sensory neurons, using a variety of optogenetic actuators (ChR2, ChETA, and CsChrimson) promotes bilateral contraction of the larval body along the anterior-posterior axis (Shearin et al., 2013, Hwang et al., 2007, Meloni et al., 2020, Turner et al., 2016, Patel and Cox, 2017, Patel et al., 2022, Himmel et al., 2023).

      As described above, in our initial publication documenting larval cold nociception in Drosophila, we investigated how CIII md neurons discriminate multimodal stimuli to elicit stimulus-relevant behavioral responses. We reported that increased activation of CIII md neurons results in cold-evoked behaviors, whereas lower activation results in touch-evoked behaviors. Subsequent calcium analyses revealed a greater stimulus-evoked calcium response to noxious cold and a milder calcium response to gentle touch (Turner et al., 2016).

      Though we have not performed cold-evoked Ca2+ imaging of CIII md neurons in larval preparations without muscles, we have recorded electrical responses of CIII md neurons in the absence of muscle contractions using de-muscled larval fillets to analyze cold-evoked firing patterns of CIII md neurons (Himmel et al., 2021, Himmel et al., 2022, Himmel et al., 2023, Patel et al., 2022, Maksymchuk et al., 2022, Maksymchuk et al., 2023). These studies demonstrate that cold-evoked CIII neural activity is not dependent upon muscles.

      Reviewer #3:

      A major weakness of the study is that none of the second or third order neurons (that are downstream of CIII neurons) are found to trigger the CT behavioral responses even when strongly activated with the ChETA actuator (Figure 2 Supplement 2). These findings raise major concerns for this and prior studies and it does not support the hypothesis that the CIII neurons drive the CT behaviors.

      We conducted extensive screening of interneuron populations post-synaptically connected to CIII neurons in an effort to identify post-synaptic partners that were sufficient to trigger CT response. Much to our surprise, we were unable to find any individual neuron type or driver line that was sufficient to elicit a CT response. However, we provide substantial supporting evidence for our co-activation experiments including neural silencing, EM connectivity and calcium imaging. We also report necessity for the reported second/third order neurons in cold-evoked behavioral responses, where inhibiting neural activity resulted in reduced cold-evoked behavior. Second/third order neurons also exhibit cold-evoked calcium responses. Lastly, we also report CIII-evoked (using optogenetics) increases in calcium response in downstream post-synaptic neurons.

      Previously published literature investigating CIV md neuron circuitry has implicated downstream neurons that are not sufficient to elicit rolling behavior upon activation. In CIV md neuron circuit dissection, select neurons are reported as acting downstream of CIV md neurons that require additional circuit components in order to execute rolling behavior. For example, A00c neuron activation alone does not lead to rolling behavior; however, co-activation of A00c and Basin-4 neurons facilitates the rolling response (Ohyama et al., 2015). Similarly, co-activation of Basin-1 and Basin-4 neurons significantly enhances rolling probability relative to Basin-4 alone (Ohyama et al., 2015). Further, DnB neurons require Goro command neuron activity to promote rolling behavior (Burgos et al., 2018). Thus, there is precedent for co-activation requirements to elicit robust behavioral output in sensorimotor circuits, and we employed a similar strategy after we discovered that activation of second or third order neurons alone did not elicit a CT response.

      Reviewer #3:

      Later experiments in the paper that investigate strong CIII activation (with ChETA) in combination with other second and third order neurons does support the idea activating those neurons can facilitate body-wide muscle contractions. But many of the co-activated cells in question are either repeated in each abdominal neuromere or they project to cells that are found all along the ventral nerve cord, so it is therefore unsurprising that their activation would contribute to what appears to be a non-specific body-wide activation of muscles along the AP axis. Also, if these neurons are already downstream of the CIII neurons the logic of this co-activation approach is not particularly clear.

      We agree with the reviewer’s comment that various cell-types that were investigated are repeated in every abdominal neuromere; however, only select post-synaptic neurons (Basin 1-4, DnB, mCSI, and Chair neurons) are segmentally repeated in every abdominal segment. Conversely, other projection and ascending neurons we investigated (A09e, A00c, A05q, Goro, TePn04/05, and A08n) are not segmentally repeated in every section. We used connectome evidence to guide our experiments on populations of neurons to explore in cold-evoked behavior and, as alluded to above, our co-activation approach was driven by the observation that an individual subpopulation of connected interneurons was not found to be sufficient to elicit CT behavior. That said, it does not change the findings that inhibition of neural activity in these subpopulations impairs cold-evoked behavior, nor does it change the observation that connected interneurons exhibit cold-evoked Ca2+ responses that can also be observed with optogenetic activation of CIII neurons.

      Reviewer #3:

      The authors' argument that the co-activation studies support "a population code" for cold nociception is a very optimistic interpretation of a brute force optogenetics approach that ultimately results in an enhancement of a relatively non-specific body-wide muscle convulsion.

      Many studies exploring circuit bases of behavior have applied large-scale optogenetic screens (including co-activation strategies) or silencing screens to identify circuit components involved in specific behaviors under investigation. We employed similar methods in our circuit-based dissection and our conclusions are not solely based upon optogenetic analyses.

      References:

      BURGOS, A., HONJO, K., OHYAMA, T., QIAN, C. S., SHIN, G. J.-E., GOHL, D. M., SILIES, M., TRACEY, W. D., ZLATIC, M., CARDONA, A. & GRUEBER, W. B. 2018. Nociceptive interneurons control modular motor pathways to promote escape behavior in Drosophila. eLife, 7:e26016.

      CANALS, M., ROSENMANN, M. & BOZINOVIC, F. 1997. Geometrical aspects of the energetic effectiveness of huddling in small mammals. Acta Theriologica, 42(3):321-328.

      CONTRERAS, L. C. 1984. Bioenergetics of Huddling: Test of a Psycho-Physiological Hypothesis. Journal of Mammalogy, 65, 256-262.

      GILBERT, C., ROBERTSON, G., LE MAHO, Y., NAITO, Y. & ANCEL, A. 2006. Huddling behavior in emperor penguins: Dynamics of huddling. Physiol Behav, 88, 479-88.

      HAYES, J. P., SPEAKMAN, J. R. & RACEY, P. A. 1992. The Contributions of Local Heating and Reducing Exposed Surface Area to the Energetic Benefits of Huddling by Short-Tailed Field Voles (Microtus agrestis). Physiological Zoology, 65, 742-762.

      HIMMEL, N. J., LETCHER, J. M., SAKURAI, A., GRAY, T. R., BENSON, M. N., DONALDSON, K. J. & COX, D. N. 2021. Identification of a neural basis for cold acclimation in Drosophila larvae. iScience, 24, 102657.

      HIMMEL, N. J., SAKURAI, A., DONALDSON, K. J. & COX, D. N. 2022. Protocols for measuring cold-evoked neural activity and cold tolerance in Drosophila larvae following fictive cold acclimation. STAR Protoc, 3, 101510.

      HIMMEL, N. J., SAKURAI, A., PATEL, A. A., BHATTACHARJEE, S., LETCHER, J. M., BENSON, M. N., GRAY, T. R., CYMBALYUK, G. S. & COX, D. N. 2023. Chloride-dependent mechanisms of multimodal sensory discrimination and nociceptive sensitization in Drosophila. eLife, 12:e76863.

      HWANG, R. Y., ZHONG, L., XU, Y., JOHNSON, T., ZHANG, F., DEISSEROTH, K. & TRACEY, W. D. 2007. Nociceptive Neurons Protect Drosophila Larvae from Parasitoid Wasps. Current Biology, 17, 2105-2116.

      MAKSYMCHUK, N., SAKURAI, A., COX, D. N. & CYMBALYUK, G. 2022. Transient and Steady-State Properties of Drosophila Sensory Neurons Coding Noxious Cold Temperature. Frontiers in Cellular Neuroscience, 16:831803.

      MAKSYMCHUK, N., SAKURAI, A., COX, D. N. & CYMBALYUK, G. S. 2023. Cold-Temperature Coding with Bursting and Spiking Based on TRP Channel Dynamics in Drosophila Larva Sensory Neurons. Int J Mol Sci, 24(19):14638.

      MELONI, I., SACHIDANANDAN, D., THUM, A. S., KITTEL, R. J. & MURAWSKI, C. 2020. Controlling the behaviour of Drosophila melanogaster via smartphone optogenetics. Scientific Reports, 10, 17614.

      OHYAMA, T., SCHNEIDER-MIZELL, C. M., FETTER, R. D., ALEMAN, J. V., FRANCONVILLE, R., RIVERA-ALBA, M., MENSH, B. D., BRANSON, K. M., SIMPSON, J. H., TRUMAN, J. W., CARDONA, A. & ZLATIC, M. 2015. A multilevel multimodal circuit enhances action selection in Drosophila. Nature, 520, 633-639.

      PATEL, A. & COX, D. 2017. Behavioral and Functional Assays for Investigating Mechanisms of Noxious Cold Detection and Multimodal Sensory Processing in Drosophila Larvae. BIO-PROTOCOL, 7(13):e2388.

      PATEL, A. A., SAKURAI, A., HIMMEL, N. J. & COX, D. N. 2022. Modality specific roles for metabotropic GABAergic signaling and calcium induced calcium release mechanisms in regulating cold nociception. Front Mol Neurosci, 15:942548.

      SHEARIN, H. K., DVARISHKIS, A. R., KOZELUH, C. D. & STOWERS, R. S. 2013. Expansion of the Gateway MultiSite Recombination Cloning Toolkit. PLoS ONE, 8, e77724.

      TENEDINI, F. M., SÁEZ GONZÁLEZ, M., HU, C., PEDERSEN, L. H., PETRUZZI, M. M., SPITZWECK, B., WANG, D., RICHTER, M., PETERSEN, M., SZPOTOWICZ, E., SCHWEIZER, M., SIGRIST, S. J., CALDERON DE ANDA, F. & SOBA, P. 2019. Maintenance of cell type-specific connectivity and circuit function requires Tao kinase. Nature Communications, 10, 3506.

      TURNER, H. N., ARMENGOL, K., PATEL, A. A., HIMMEL, N. J., SULLIVAN, L., IYER, S. C., BHATTACHARYA, S., IYER, E. P. R., LANDRY, C., GALKO, M. J. & COX, D. N. 2016. The TRP Channels Pkd2, NompC, and Trpm Act in Cold-Sensing Neurons to Mediate Unique Aversive Behaviors to Noxious Cold in Drosophila. Curr Biol, 26, 3116-3128.

      TURNER, H. N., PATEL, A. A., COX, D. N. & GALKO, M. J. 2018. Injury-induced cold sensitization in Drosophila larvae involves behavioral shifts that require the TRP channel Brv1. PLoS One, 13, e0209577.

      VICKERY, W. L. & MILLAR, J. S. 1984. The Energetics of Huddling by Endotherms. Oikos, 43, 88-93.

    2. Reviewer #3 (Public Review):

      Summary:

      The authors follow up on prior studies in which they have argued for the existence of cold nociception in Drosophila larvae. In the proposed pathway, mechanosensitive Class III multidendritic neurons are the noxious cold responding sensory cells. The current study attempts to explore the potential roles of second and third order neurons, based on information about the Class III neuron synaptic outputs that has been obtained from the larval connectome.

      Strengths:

      The major strength of the manuscript is the detailed discussion of the second and third order neurons that are downstream of the mechanosensory Class III multidendritic neurons. These will be useful in further studies of gentle touch mechanosensation and mechanonociception both of which rely on sensory input from these cells. Calcium imaging experiments on Class III activation with optogenetics support the wiring diagram.

      Weaknesses:

      The scientific premise is that a full body contraction in larvae that are exposed to noxious cold is a sensorimotor behavioral pathway. This premise is, to start with, questionable. A common definition of behavior is a set of "orderly movements with recognizable and repeatable patterns of activity produced by members of a species (Baker et al., 2001)." In the case of nociception behaviors, the patterns of movement are typically thought to play a protective role and to protect from potential tissue damage.

      Does noxious cold elicit a set of orderly movements with a recognizable and repeatable pattern in larvae? Can the patterns of movement that are stimulated by noxious cold allow the larvae to escape harm? Based on the available evidence, the answer to both questions is seemingly no. In response to noxious cold stimulation many, if not all, of the muscles in the larva, simultaneously contract (Turner et al., 2016), and as a result the larva becomes stationary. In response to cold, the larva is literally "frozen" in place and it is incapable of moving away. This incapacitation by cold is the antithesis of what one might expect from a behavior that protects the animals from harm.

      Extensive literature has investigated the physiological responses of insects to cold (reviewed in Overgaard and MacMillan, 2017). In numerous studies of insects across many genera (excluding cold adapted insects such as snow flies), exposure to very cold temperatures quickly incapacitates the animal and induces a state that is known as a chill coma. During a chill coma, the insect becomes immobilized by the cold exposure, but if the exposure to cold is very brief the insect can often be revived without apparent damage. Indeed, it is common practice for many laboratories that use adult Drosophila for studies of behavior to use a brief chilling on ice as a form of anesthesia because chilling is less disruptive to subsequent behaviors than the more commonly used carbon dioxide anesthesia. If flies were to perceive cold as a noxious nociceptive stimulus, then this "chill coma" procedure would likely be disruptive to behavioral studies, but it is not. Furthermore, there is no evidence to suggest that larval sensation of "noxious cold" is aversive.

      The insect chill coma literature has investigated the effects of extreme cold on the physiology of nerves and muscles and the consensus view of the field is that the paralysis that results from cold is due to complex and combined action of direct effects of cold on muscle and on nerves (Overgaard and MacMillan, 2017). Electrophysiological measurements of muscles and neurons find that they are initially depolarized by cold, and after prolonged cold exposure they are unable to maintain potassium homeostasis and this eventually inhibits the firing of action potentials (Overgaard and MacMillan, 2017). The very small thermal capacitance of a Drosophila larva means that its entire neuromuscular system will be quickly exposed to the effect of cold in the behavioral assays under consideration here. It would seem impossible to disentangle the emergent properties of a complex combination of effects on physiology (including neuronal, glial, and muscle homeostasis) on any proposed sensorimotor transformation pathway.

      Nevertheless, the manuscript before us makes a courageous attempt at this. A number of GAL4 drivers tested in the paper are found to affect parameters of contraction behavior (CT) in cold exposed larvae in silencing experiments. However, notably absent from all of the silencing experiments are measurements of larval mobility following cold exposure. Thus, it is not known from the study if these manipulations are truly protecting the larvae from paralysis following cold exposure, or if they are simply reducing the magnitude of the initial muscle contraction that occurs immediately following cold (i.e., reducing CT). The strongest effect of silencing occurs with the 19-12-GAL4 driver which targets Class III neurons (but is not completely specific to these cells).

      Optogenetic experiments for Class III neurons relying on the 19-12-GAL4 driver combined with a very strong optogenetic actuator (ChETA) show the CT behavior that was reported in prior studies. It should be noted that this actuator drives very strong activation, and other studies with milder optogenetic stimulation of Class III neurons have shown that these cells produce behavioral responses that resemble gentle touch responses (Tsubouchi et al., 2012 and Yan et al., 2013). As well, these neurons express mechanoreceptor ion channels such as NompC and Rpk that are required for gentle touch responses. The latter makes the reported calcium responses to cold difficult to interpret in light of the fact that the strong muscle contractions driven by cold may actually be driving mechanosensory responses in these cells (i.e., through deformation of the mechanosensitive dendrites). Are the cIII calcium signals still observed in a preparation where cold induced muscle contractions are prevented?

      A major weakness of the study is that none of the second or third order neurons (that are downstream of CIII neurons) are found to trigger the CT behavioral responses even when strongly activated with the ChETA actuator (Figure 2 Supplement 2). These findings raise major concerns for this and prior studies, and they do not support the hypothesis that the CIII neurons drive the CT behaviors.

      Later experiments in the paper that investigate strong CIII activation (with ChETA) in combination with other second and third order neurons do support the idea that activating those neurons can facilitate body-wide muscle contractions. But many of the co-activated cells in question are either repeated in each abdominal neuromere or they project to cells that are found all along the ventral nerve cord, so it is therefore unsurprising that their activation would contribute to what appears to be a non-specific body-wide activation of muscles along the AP axis. Also, if these neurons are already downstream of the CIII neurons the logic of this co-activation approach is not particularly clear. A more convincing experiment would be to silence the different classes of cells in the context of the optogenetic activation of CIII neurons to test for a block of the effects, a set of experiments that is notably absent from the study.

      The authors' argument that the co-activation studies support "a population code" for cold nociception is a very optimistic interpretation of a brute force optogenetics approach that ultimately results in an enhancement of a relatively non-specific body-wide muscle convulsion.

    1. Reviewer #1 (Public Review):

      Summary:

      The authors present a neural network (NN)-based approach to computationally cheaper emulation of simulations of biophysically relatively detailed cardiac cell models based on systems of ordinary differential equations. Relevant case studies are used to demonstrate the performance in the prediction of standard action potentials, as well as action potentials manifesting early depolarizations. Application to the "reverse problem" (inferring the effect of pharmacological compounds on ion channels based on action potential data before and after drug treatment) is also explored, which is a task of generally high interest.

      Strengths:

      This is a well-designed study, which explores an area that many in the cardiac simulation community will be interested in. The article is well written and I particularly commend the authors on transparency of methods description, code sharing, etc. - it feels rather exemplary in this regard and I only wish more authors of cardiac simulation studies took such an approach. The training speed of the network is encouraging and the technique is accessible to anyone with a reasonably strong GPU, not needing specialized equipment.

      Weaknesses:

      Below are several points that I consider to be weaknesses and/or uncertainties of the work:

      1. I am not convinced by the authors' premise that there is a great need for further acceleration of cellular cardiac simulations - it is easy to simulate tens of thousands of cells per day on a workstation computer, using simulation conditions similar to those of the authors. I do not really see an unsolved task in the field that would require further speedup of single-cell simulations.

      At the same time, simulations offer multiple advantages, such as the possibility to dissect mechanisms of the model behaviour, and the capability to test its behaviour in a wide array of protocols - whereas a NN is trained for a single purpose/protocol, and does not enable a deep investigation of mechanisms. Therefore, I am not sure the cost/benefit ratio is that strong for single-cell emulation currently.

      An area that is definitely in need of acceleration is simulations of whole ventricles or hearts, but it is not clear how much potential for speedup the presented technology would bring there. I can imagine interesting applications of rapid emulation in such a setting, some of which could be hybrid in nature (e.g. using simulation for the region around the wavefront of propagating electrical waves, while emulating the rest of the tissue, which is behaving more regularly/predictably, and is likely to be emulated well), but this is definitely beyond the scope of this article.

      2. The authors run a cell simulation for 1000 beats, training the NN emulator to mimic the last beat. It is reported that the simulation of a single cell takes 293 seconds, while emulation takes only milliseconds, implying a massive speedup. However, I consider the claimed speedup achieved by emulation to be highly context-dependent, and somewhat too flattering to the presented method of emulation. Two specific points below:

      First, it appears that a not overly efficient (fixed-step) numerical solver scheme is used for the simulation. On my (comparable, also a Threadripper) CPU, using the same model ("ToR-ORd-dyncl"), but a variable step solver ode15s in Matlab, a simulation of a cell for 1000 beats takes ca. 50 seconds, rather than the 293 seconds reported by the authors. This can be further sped up by parallelization when more cells than available cores are simulated: on 32 cores, this translates into ca. 2 seconds amortized time per cell simulation (I suspect that the NN-based approach cannot be parallelized in a similar way?). By amortization, I mean that if 32 models can be simulated at once, a simulation of X cells will not take X*50 seconds, but (X/32)*50 seconds (with only minor overhead, as this task scales well across cores).

      Second, and this is perhaps more important - the reported speed-up critically depends on the number of beats in the simulation - if I am reading the article correctly, the runtime compares a simulation of 1000 beats versus the emulation of a single beat. If I run a simulation of a single beat across multiple simulated cells (on a 32-core machine), the amortized runtime is around 20 ms per cell, which is only marginally slower than the NN emulation. On the other hand, if the model was simulated for aeons, comparing this to a fixed runtime of the NN, one can get an arbitrarily high speedup.

      Therefore, I'd probably emphasize the concrete speedup less in the abstract and I'd provide some background on the speedup calculation such as above, so that the readers understand the context-dependence. That said, I do think that a simulation for anywhere between 250 and 1000 beats is among the most reasonable points of comparison (long enough for reasonable stability, but not too long to beat an already stable horse; pun with stables was actually completely unintended, but here it is...). I.e., the speedup observed is still valuable and valid, albeit in (I believe) a somewhat limited sense.
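A minimal sketch of the amortization and speedup arithmetic described in point 2, assuming ideal scaling across cores with no overhead. The function names are hypothetical, and the emulation time used in the demo is a placeholder, since the review only says that emulation takes "milliseconds".

```python
import math


def wall_time_seconds(n_cells, seconds_per_cell, n_cores):
    """Wall-clock time when cells are simulated in parallel batches of n_cores."""
    batches = math.ceil(n_cells / n_cores)
    return batches * seconds_per_cell


def amortized_seconds_per_cell(seconds_per_cell, n_cores):
    """Idealized per-cell cost when all cores are busy: (X / n_cores) * t / X."""
    return seconds_per_cell / n_cores


def speedup(simulation_seconds, emulation_seconds):
    """Ratio of simulation cost to emulation cost for one cell."""
    return simulation_seconds / emulation_seconds


if __name__ == "__main__":
    n_cores = 32
    emulation_seconds = 0.005  # placeholder assumption, not a reported value

    # Timings quoted in the review: 293 s (fixed-step, 1000 beats),
    # 50 s (variable-step, 1000 beats), 0.020 s amortized (single beat).
    amortized_1000 = amortized_seconds_per_cell(50.0, n_cores)
    print("amortized 1000-beat cost per cell:", amortized_1000, "s")
    print("speedup vs 293 s:", speedup(293.0, emulation_seconds))
    print("speedup vs amortized 1000 beats:", speedup(amortized_1000, emulation_seconds))
    print("speedup vs amortized single beat:", speedup(0.020, emulation_seconds))
```

The amortized figure for 50 seconds on 32 cores comes out at about 1.6 seconds, consistent with the reviewer's "ca. 2 seconds". The three printed ratios differ by orders of magnitude for the same emulator, which is the context-dependence the reviewer asks the authors to make explicit.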

      3. It appears that the accuracy of emulation drops off relatively sharply with increasing real-world applicability/relevance of the tasks it is applied to. That said, the authors are to be commended on declaring this transparently, rather than withholding such analyses. I particularly enjoyed the discussion of the not-always-amazing results of the inverse problem on the experimental data. The point on low parameter identifiability is an important one and serves as a warning against overconfidence in our ability to infer cellular parameters from action potentials alone. On the other hand, I'm not that sure the difference between small tissue preps and single cells which the authors propose as another source of the discrepancy will be that vast beyond the AP peak potential (probably much of the tissue prep is affected by the pacing electrode?), but that is a subjective view only. The influence of coupling could be checked if the simulated data were generated from 2D tissue samples/fibres, e.g. using the Myokit software.

      Given the points above (particularly the uncertain need for further speedup compared to running single-cell simulations), I am not sure that the technology generated will be that broadly adopted in the near future. However, this does not make the study uninteresting in the slightest - on the contrary, it explores something that many of us are thinking about, and it is likely to stimulate further development in the direction of computationally efficient emulation of relatively complex simulations.

    2. Reviewer #3 (Public Review):

      Summary:

      Grandits and colleagues were trying to develop a new tool to accelerate pharmacological studies by using neural networks to emulate the human ventricular cardiomyocyte action potential (AP). The AP is a complex electrical signal that governs the heartbeat, and it is important to accurately model the effects of drugs on the AP to assess their safety and efficacy. Traditional biophysical simulations of the AP are computationally expensive and time-consuming. The authors hypothesized that neural network emulators could be trained to predict the AP with high accuracy and that these emulators could also be used to quickly and accurately predict the effects of drugs on the AP.

      Strengths:

      One of the study's major strengths is that the authors use a large and high-quality dataset to train their neural network emulator. The dataset includes a wide range of APs, including normal and abnormal APs exhibiting EADs. This ensures that the emulator is robust and can be used to predict the AP for a variety of different conditions.

      Another major strength of the study is that the authors demonstrate that their neural network emulator can be used to accelerate pharmacological studies. For example, they use the emulator to predict the effects of a set of known arrhythmogenic drugs on the AP. The emulator is able to predict the effects of these drugs, even though it had not been trained on these drugs specifically.

      Weaknesses:

      One weakness of the study concerns validation: neural network emulators need to be validated against experimental data to ensure that they are accurate and reliable. The authors do this to some extent, but further validation would be beneficial, in particular for the inverse problem, where the estimation of pharmacological parameters was very challenging and led to particularly large inaccuracies.

      Additional context:

      The work by Grandits et al. has the potential to revolutionize the way that pharmacological studies are conducted. Neural network emulation has the promise to reduce the time and cost of drug development and to improve the safety and efficacy of new drugs. The methods and data presented in the paper are useful to the community because they provide a starting point for other researchers to develop and improve neural network emulators for the human ventricular cardiomyocyte AP. The authors have made their code and data publicly available, which will facilitate further research in this area.

      It is important to note that neural network emulation is still a relatively new approach, and there are some challenges that need to be addressed before it can be widely adopted in the pharmaceutical industry. For example, neural network emulators need to be trained on large and high-quality datasets. Additionally, it is important to validate neural network emulators against experimental data to ensure that they are accurate and reliable. Despite these challenges, the potential benefits of neural network emulation for pharmacological studies are significant. As neural network emulation technology continues to develop, it is likely to become a valuable tool for drug discovery and development.

    1. Author Response

      The following is the authors’ response to the original reviews.

      The authors thank the reviewers for their thoughtful and constructive comments. We address each comment below and have uploaded a revised manuscript.

      Public Reviews

      1) One key point that could use further clarification is how to interpret densities in the reconstruction that do overlap with the template. If the omitted regions can be reliably reconstructed, and the density is smooth throughout, it implies the detected particles are not only (mostly) true positives but also their poses must be essentially correct. Therefore, why cannot the entire reconstruction be trusted, including portions overlapping with the template? In the "Future applications" section, the authors state that in order to obtain a reconstruction that is entirely devoid of template bias, it would be necessary to successively omit parts of the template structure through its entirety. I wonder if that is really necessary and if the presented approach of omitting template portions could be better framed as a "gold-standard" validation procedure.

      Our assumption is indeed that the entire reconstruction can be trusted if the omitted features are faithfully reproduced in the reconstruction. We have added a sentence in the discussion to clarify this. However, we think that assessing template bias will still require the omit test (see also our reply below). Also, as discussed in the manuscript, there is likely a little bias left, even if it is not directly visible in the reconstruction. Therefore, if the goal is an entirely unbiased reconstruction, the only way will be to successively omit parts of the template structure throughout the template.

      2) In other words, given the compelling evidence provided by the reconstructions in the omitted areas, I find it hard to imagine how the procedure would be "hallucinating" features in the rest of the structure, as the entire reconstruction depends on the same pose and defocus parameters. A possible experiment to test this hypothesis would be to go the opposite way, deliberately adding an unrealistic feature to the bait and checking whether it comes up in the reconstruction, while at the same time checking how it behaves in omitted parts.

      Template bias might be generated in different ways. A common situation is the presence of noise, which causes biased deviations of the best template match from their “true” match that would just align the target signal to the template. Another type of bias may occur when there is a mismatch between the template and the detected target. The target may still be detected if there is sufficient structural overlap with the template. Since there might not be a clear “correct” alignment of a mismatching target to the template, the best alignment may again be biased, generating artificial density in the reconstruction. This second case may produce bias that is more pronounced in the mismatching regions. The different origins of bias will have to be investigated more thoroughly in another study. For the present study, however, we maintain that unless there is some assessment of bias in a given location, one cannot completely rule out bias based on the absence of it elsewhere in the reconstruction.

      3) When assessing their approach to in situ data (the yeast ribosome), it is intriguing to see that the resolution downgraded from 3.1 to 8 Å when refinement of the particle poses against the current reconstruction was attempted. The authors do provide some possible explanations, such as the reduced signal of the reconstruction at high resolution and the crowded background, but it leaves one to wonder if this means that a 3.1 Å reconstruction could never be obtained from these data by conventional single-particle analysis procedures.

      The refinement results with our in situ data do indeed appear to be limited to low resolution when using the conventional single-particle pipeline and software. It might be possible to improve refinement by introducing certain priors, filters and masking functions that are optimized for the increased background and spectral properties of in situ data. Also, we have not tested all available software, and some might perform better than others. It is worth noting that in a different study using our data, by Cheng et al (2023) and cited in our manuscript, the resolution of the refined reconstruction using different software was ~7 Å, i.e., close to what we report here. Finally, refinement of the detected targets against a high-resolution template does work, but since it involves the template, we regard this as part of the template matching process.

      4) Furthermore, in the section "Quantifying template bias", the authors make the intriguing statement that there can still be some overfitting of noise even in true positives. I understand this overfitting would occur in the form of errors in the pose and defocus estimation, but a clarification would be helpful.

      We have added a sentence in the Discussion to clarify where this bias may come from.

      5) In the Discussion, the claim that "it is not necessary to use tomography to generate high-resolution reconstructions of macromolecular complexes in cells" is a misconception, at least in part. As demonstrated in works by the same group and others (https://doi.org/10.1016/j.xinn.2021.100166, https://doi.org/10.1038/s41467-023-36175-y, https://doi.org/10.1038/s41586-023-05831-0), 2D imaging of native cellular environments does offer a faster and better way to obtain high-resolution reconstructions compared to tomography. However, tomography provides the entire 3D context of the macromolecules, such as their localization to membranes and the cellular architecture, which can be readily visualized in a tomogram even at low resolution, so methods for structure determination from tilt series data such as subtomogram averaging remain of paramount importance. Most likely, a combination of 2D and 3D imaging approaches will be necessary to retrieve both the highest structural resolution and their cellular context to address biological questions.

      We agree and have modified our statement accordingly.

      6) The "Materials and Methods" section lacks a description of transmission electron microscopy data collection.

      We are sorry for this oversight and have added these details.

      7) Finally, the preprint version of this work posted on bioRxiv (https://doi.org/10.1101/2023.07.03.547552) contains the following competing interests statement, which is missing from the submitted version: "The authors are listed as inventors on a closely related patent application named "Methods and Systems for Imaging Interactions Between Particles and Fragments", filed on behalf of the University of Massachusetts."

      This is correct. The statement was missing in the first version of the uploaded manuscript and was added after consultation with the eLife editorial office.

      8) Quantification of the amount of model bias is then performed using omit maps, where every 20th residue is removed from the template and corresponding reconstructions are compared (for those residues) with the full-template reconstructions. As expected, model bias increases with lower thresholds for the picking. Some model bias (Omega=8%) remains even for very high thresholds. The authors state this may be due to overfitting of noise when template-matching true particles, instead of introducing false positives. Probably, that still represents some sort of problem. Especially because the authors then go on to show that their expectation of the number of false positives does not always match the correct number of false positives, probably due to inaccuracies in the noise model for more complicated images. This may warrant further in-depth discussion in a revised manuscript.

      We have added further thoughts regarding the mismatch between expected and actual number of false positives in the Discussion section. A full understanding of the issue likely requires further study, which is currently underway.
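To illustrate the omit procedure discussed above (removing every 20th residue from the template, and successively shifting the omitted set so that the whole template is eventually covered), here is a minimal sketch. It only handles residue bookkeeping with 0-based indices; the function name is hypothetical, and it is not the authors' actual tooling, which is not described in this response.

```python
def omit_partitions(n_residues, stride=20):
    """Return one (kept, omitted) pair of residue index lists per offset.

    For offset k, residues k, k + stride, k + 2 * stride, ... are omitted
    from the template; all other residues are kept.
    """
    partitions = []
    for offset in range(stride):
        omitted = list(range(offset, n_residues, stride))
        omitted_set = set(omitted)
        kept = [i for i in range(n_residues) if i not in omitted_set]
        partitions.append((kept, omitted))
    return partitions


if __name__ == "__main__":
    # Hypothetical 100-residue template, omitting every 20th residue.
    for offset, (kept, omitted) in enumerate(omit_partitions(100, stride=20)):
        print(offset, len(kept), omitted)
```

Running template matching and reconstruction once per offset would, under these assumptions, give every residue one reconstruction in which it was absent from the template, at the cost of `stride` separate searches.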

      9) The authors evaluate the effect of high-resolution 2D template matching on template bias in reconstructions, and provide a quantitative metric for overfitting. It is an interesting manuscript that made me reevaluate and correct some mistakes in my understanding of overfitting and template bias, and I'm sure it will be of great use to others in the field. However, its main point is to promote high-resolution 2D template matching (2DTM) as a more universal analysis method for in vitro and, more importantly, in situ data. While the experiments performed to that end are sound and well-executed in principle, I fail to make that specific conclusion from their results.

      We do not see 2DTM as a more universal analysis method for in vitro and in situ data, but simply as another method that can be used. We have added a sentence in the introduction to clarify this.

      10) The authors correctly point out that overfitting is largely enabled by the presence of false-positives in the data set. They go on to perform their in situ experiments with ribosomes, which provide an extremely favorable amount of signal that is unrealistic for the vast majority of the proteome. This seems cherry-picked to keep the number of false-positives and false-negatives low. The relationship between overfitting/false-positive rate and the picking threshold will remain the same for smaller proteins (which is a very useful piece of knowledge from this study). However, the false-negative rate will increase a lot compared to ribosomes if the same high picking threshold is maintained. This will limit the applicability of 2DTM, especially for less-abundant proteins.

      The reviewer is correct that the lower SNR of smaller targets poses a fundamental limit to 2DTM. We have stated this in previous studies and have added a sentence in the introduction of the current manuscript to clarify this.

      11) I would like to see an ablation study: Take significantly smaller segments of the ribosome (for which the authors already have particle positions from full-template matching, which are reasonably close to the ground-truth), e.g. 50 kDa, 100 kDa, 200 kDa etc., and calculate the false-negative rate for the same picking threshold. If the resulting number of particles does plummet, it would be very helpful to discuss how that affects the utility of 2DTM for non-ribosomes in situ.

      The suggested ablation study is a good idea and was reported by Rickgauer et al (2020), cited in our manuscript. We added our own analysis for this dataset in Figure 4-figure supplement 1 and show the proportion of LSUs detected as a function of template mass, indicating a detection limit of ~300 kDa. We also added a note in the Results section to explain that the threshold we use to limit false positives means that there are also false negatives, with a rate that depends on their molecular mass.

      12) Another point of concern is the dramatic resolution decrease to 8 Å after multiple iterations of refinement against experimental reconstructions described in line 159. Was this a local search from the poses provided by 2DTM, or something more global? While this is not a manifestation of overfitting as the authors have conclusively shown, I think it adds an important point to the ongoing "But do we really need tomograms, or can we just 2D everything?" debate in the field, which is also central to the 2D part of 2DTM. Reaching 8 Å with 12k ribosome particles would be considered a rather poor subtomogram averaging result these days. Being in the "we need tilt series to be less affected by non-Gaussian noise" camp myself, I wonder if this indicates 2D images are inherently worse for in situ samples. If they are, the same limitations would extend to template matching. In that case, shouldn't the authors advocate for 3DTM instead of 2DTM? It may not be needed for ribosomes, but could give smaller proteins the necessary edge.

      We have extensively discussed the advantages and disadvantages of both tomography and 2DTM (Lucas et al, 2021) and think it is not useful to talk in terms of “better” and “worse”. Instead, each technique has its areas of application, and we maintain that a combination of the two may give the best results. The limitation of 8 Å does not apply to reconstructions aligned against high-resolution templates, as demonstrated in the present study. Regarding noise models, there is also need for these in 3DTM, as explained in recent publications: Maurer et al (2023), bioRxiv, doi.org/10.1101/2023.09.06.556487; Cruz-León et al (2023), bioRxiv, doi.org/10.1101/2023.09.05.556310; Chaillet et al (2023), Int. J. Mol. Sci. 24, 13375.

      13) Right now, this study is also an invitation to practitioners who do not understand the picking threshold used here and cannot relate it to other template-matching programs to do a lot of questionable template matching and claim that the results are true because templates are "unoverfittable". I think such undesirable consequences should be discussed prominently.

      We have added a discussion of this point in the Discussion section.

      Recommendations for the authors

      1) Lines 58-59: What does "nominally untilted" mean? Has the lamella pre-tilt (milling angle) been taken into account or not? If yes, how?

      The lamella milling angle was not taken into account, so there is a tilt built into the sample of about 8° that was not compensated for by a counter-tilt of the microscope goniometer. We have added a note to explain this in the text of the manuscript.

      2) Lines 113-114: A brief explanation of the threshold calculation method from Rickgauer et al, 2017 to achieve an expected false positive rate of one per micrograph would be helpful here.

      We describe the equation for estimating the false discovery rate later in the manuscript. We have added a note in the text to point the reader to the relevant section of the manuscript.
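
      As an illustration of the kind of threshold calculation the reviewer asks about, the sketch below assumes that the search statistic at each position follows a standard normal distribution under pure noise, and picks the SNR threshold at which the expected number of false positives over all search positions equals one. The function names and the number of search positions are hypothetical, and this is not taken from the manuscript or from the cisTEM source.

```python
import math
from statistics import NormalDist


def snr_threshold(n_search_positions, expected_false_positives=1.0):
    """SNR threshold giving the requested expected number of false positives.

    Assumption: under pure noise, each search position yields an
    independent standard-normal score, so the one-sided tail probability
    at the threshold must equal expected_false_positives / n_search_positions.
    """
    p_tail = expected_false_positives / n_search_positions
    if not 0.0 < p_tail < 1.0:
        raise ValueError("tail probability must lie strictly between 0 and 1")
    # By symmetry of the normal distribution, the upper-tail quantile is
    # the negated lower-tail quantile, which is numerically stable for tiny p.
    return -NormalDist().inv_cdf(p_tail)


def expected_false_positive_count(threshold, n_search_positions):
    """Expected number of noise scores above the threshold (same assumption)."""
    tail = 0.5 * math.erfc(threshold / math.sqrt(2.0))
    return n_search_positions * tail


# Hypothetical search size: pixels x orientations x defocus planes.
n_search = 1e13
threshold = snr_threshold(n_search)
check = expected_false_positive_count(threshold, n_search)
```

      Under this assumption the threshold grows only slowly with the size of the search space, which is why a single value can be quoted for a whole micrograph.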

      3) For consistency, it would be interesting to include a plot of the SNR peaks found by 2DTM in the in situ dataset, that could be directly compared to Figure 1 - figure supplement 1B.

      We have added this to Figure 2 - figure supplement 1A-C, to directly compare to Figure 1 – figure supplement 1A-C.

      4) Showing model-map FSC curves between the density retrieved from the omitted areas and their respective models would provide further evidence not only that they are correct but to what extent.

      An FSC calculation would be challenging for small regions, such as side chains and drugs, due to masking artifacts. Moreover, the model was built into an in vitro determined map and was not fit into the in vivo map calculated here. Therefore, deviations between the map and model may reflect differences between the two conditions and may not reflect the agreement of the map to the in vivo structure.

      5) Lines 128-130: The figure references are wrong. Here, Figure 1B should probably be Figure 1A (or 1B), and Figure 1C clearly refers to Supplementary Figure 1F (FSC curve).

      We have corrected the incorrect figure references.

      6) Line 125: Wrong figure reference, Figure 1A here refers to Supplementary Figure 1B (cross-correlation peaks).

      We have corrected the incorrect figure references.

      7) I haven't been able to find mention of code availability in the manuscript. Given that it is a major outcome of the study, I think it should be provided.

      The code is available from the cisTEM repository, github.com/timothygrant80/cisTEM, and an executable version of the program measure_template_bias has been posted for download on the cisTEM webpage, cistem.org. We have added a note in the Methods section to point the readers to these resources.

      8) Line 50: "An additional complication of subtomogram averaging for in situ imaging is the selection of valid targets" - This is not specific to subtomogram averaging, but to in situ samples.

      We agree and have updated the text to reflect this.

      9) Line 77: "if this is true for high-resolution features, which are more susceptible to noise overfitting" - This is not intuitive to me. High-resolution features require more information to be overfitted with a constant set of model parameters, thus making their overfitting harder.

      The reviewer is correct that there is more information at high resolution, partially compensating for the low SNR. However, the overall refinement behavior is still dominated by overfitting at high resolution, as we have demonstrated in an earlier publication (Stewart & Grigorieff (2004), Ultramicroscopy 102, 67–84).

      10) Line 316: "Baited reconstruction is substantially faster and a more streamlined" - To back this and other similar statements, it would be helpful if the authors provided some time measurements for the execution of their potentially very computationally expensive search.

      The current implementation of 2DTM requires 45 GPU hours per template per K3 image to search 13 defocus planes. However, a fair comparison should also consider the manual work for annotation, as well as the additional processing to align and classify sub-tomograms to generate high-resolution averages. These are highly project-dependent and can exceed the time required for 3DTM many times over. We have clarified this in our Discussion section.
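
      To put the quoted figure in context, the sketch below scales the 45 GPU hours per template per image to a whole dataset. It assumes that the cost grows linearly with the number of templates and images and that the work divides evenly across GPUs. The function names and the example dataset size are hypothetical.

```python
# Figure quoted in the response: one template, one K3 image, 13 defocus planes.
GPU_HOURS_PER_SEARCH = 45.0


def search_cost_gpu_hours(n_templates, n_images,
                          gpu_hours_per_search=GPU_HOURS_PER_SEARCH):
    """Total GPU hours, assuming cost is linear in templates and images."""
    if n_templates < 0 or n_images < 0:
        raise ValueError("counts must be non-negative")
    return n_templates * n_images * gpu_hours_per_search


def wall_clock_hours(total_gpu_hours, n_gpus):
    """Elapsed time, assuming the searches parallelise perfectly across GPUs."""
    if n_gpus <= 0:
        raise ValueError("n_gpus must be positive")
    return total_gpu_hours / n_gpus


# Hypothetical dataset: 2 templates searched against 10 images on 8 GPUs.
total = search_cost_gpu_hours(n_templates=2, n_images=10)
elapsed = wall_clock_hours(total, n_gpus=8)
```

      Real throughput will depend on the hardware and on the angular and defocus sampling, so these numbers are only a rough planning aid.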

      11) Line 319: "We expect focused classification to identify sub-populations to further improve the resolution" - How would this work if refining the 2D data without a high-resolution template resulted in significantly worse resolution even for a ribosome? Or is this meant to be done with prior knowledge of every state?

      Classification can be done using existing single particle software. To avoid alignment errors, as described above, particle alignment angles and shifts are fixed during classification. This leaves only the particle occupancy per class to be refined, which appears to lead to good classification. We have added a brief note to explain this strategy. However, since this is not shown in this manuscript, we have not added a more extensive discussion of particle classification.

      12) Line 354: "without requiring manual intervention or expert knowledge" - Previous expert knowledge was arguably provided in the form of a high-resolution structure.

      We agree with the reviewer and have clarified our statement.

    1. Postbox is Thunderbird for Mac. TheRealKenJeong (2 yr. ago): This is a good app. It started off as a reskinned Thunderbird client but has branched off somewhat. It's different enough at this point that it no longer supports plug-ins, but over time, it's assumed most functionality of the more popular plug-ins anyway.

      If it really is based on Thunderbird code, then how are they able to sell it on https://www.postbox-inc.com/store/pricing and not make the source code available for free?